Daniel Himmelstein
Head of Data Integration at Related Sciences. Digital craftsman of the biodata revolution.
February 21, 2017
Pfizer, Cambridge, MA
By Daniel Himmelstein
@dhimmel
Slides at slides.com/dhimmel/pfizer
Visualizing Hetionet v1.0
Systematic integration of biomedical knowledge prioritizes drugs for repurposing
Daniel S Himmelstein, Antoine Lizee, Christine Hessler, Leo Brueggeman, Sabrina L Chen, Dexter Hadley, Ari Green, Pouya Khankhanian, Sergio E Baranzini
bioRxiv. 2016. DOI: 10.1101/087619
features = metapaths
observations = compound–disease pairs
positives = treatments
negatives = non-treatments
slide added after presentation
Figure: DWPC Δ AUROC (slide added after presentation)
Compound–causes–SideEffect–causes–Compound–treats–Disease
Compound–binds–Gene–binds–Compound–treats–Disease
Compound–binds–Gene–associates–Disease
Compound–binds–Gene–participates–Pathway–participates–Gene–associates–Disease
// Match paths from the compound through two genes that share a pathway to the disease
MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-
  (n2)-[:PARTICIPATES_GpPW]-(n3)-[:ASSOCIATES_DaG]-(n4:Disease)
// Planner hint: expand from both ends and join at the pathway node
USING JOIN ON n2
WHERE n0.name = 'Bupropion'
  AND n4.name = 'nicotine dependence'
  AND n1 <> n3
// Collect the degree of both endpoint nodes for each of the four edges
WITH
  [
    size((n0)-[:BINDS_CbG]-()),
    size(()-[:BINDS_CbG]-(n1)),
    size((n1)-[:PARTICIPATES_GpPW]-()),
    size(()-[:PARTICIPATES_GpPW]-(n2)),
    size((n2)-[:PARTICIPATES_GpPW]-()),
    size(()-[:PARTICIPATES_GpPW]-(n3)),
    size((n3)-[:ASSOCIATES_DaG]-()),
    size(()-[:ASSOCIATES_DaG]-(n4))
  ] AS degrees, path
// Path weight = product of the degrees, each raised to the power -0.4
RETURN
  path,
  reduce(pdp = 1.0, d IN degrees | pdp * d ^ -0.4) AS path_weight
ORDER BY path_weight DESC
LIMIT 10
Cypher query to find the top CbGpPWpGaD paths
(browse all predictions at het.io/repurpose)
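The path weight in the Bupropion query above is a product over the eight collected degrees, each raised to the power -0.4. A minimal Python sketch of that arithmetic follows. The path names and degree values are made up for illustration and are not Hetionet data, and the function name `path_weight` is hypothetical.

```python
from math import prod

# Damping exponent taken from the query (d ^ -0.4).
DAMPING = 0.4


def path_weight(degrees, damping=DAMPING):
    """Multiply every degree raised to -damping (a path degree product)."""
    return prod(d ** -damping for d in degrees)


# Hypothetical degree lists: two degrees per edge, four edges per path.
paths = {
    "path_a": [2, 10, 5, 40, 40, 8, 3, 12],
    "path_b": [2, 500, 60, 300, 300, 90, 700, 12],
}

# Mirror ORDER BY path_weight DESC: paths through low-degree nodes rank first.
ranked = sorted(paths, key=lambda name: path_weight(paths[name]), reverse=True)
```

Because the exponent is negative, a path through high-degree (hub) nodes is downweighted relative to a path through specific, low-degree nodes.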
Discuss at thinklab.com/d/224
Discuss at thinklab.com/d/224#5
Discuss at thinklab.com/d/230#14
MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-
  (n2)-[:PARTICIPATES_GpPW]-(n3)-[:ASSOCIATES_DaG]-(n4:Disease)
// Keep only paths whose two genes are both expressed in an anatomy where the disease localizes
MATCH (n4)-[:LOCALIZES_DlA]-(anatomy)
MATCH (n1)-[:EXPRESSES_AeG]-(anatomy)-[:EXPRESSES_AeG]-(n3)
WHERE n0.name = 'Enalapril'
  AND n4.name = 'coronary artery disease'
  AND n1 <> n3
// DISTINCT collapses paths matched through more than one anatomy
WITH DISTINCT
  path,
  n2 AS pathway,
  [
    size((n0)-[:BINDS_CbG]-()),
    size(()-[:BINDS_CbG]-(n1)),
    size((n1)-[:PARTICIPATES_GpPW]-()),
    size(()-[:PARTICIPATES_GpPW]-(n2)),
    size((n2)-[:PARTICIPATES_GpPW]-()),
    size(()-[:PARTICIPATES_GpPW]-(n3)),
    size((n3)-[:ASSOCIATES_DaG]-()),
    size(()-[:ASSOCIATES_DaG]-(n4))
  ] AS degrees
// Group by pathway: PC is the path count, DWPC the sum of path weights
RETURN
  pathway.identifier AS pathway_id,
  pathway.name AS pathway_name,
  count(*) AS PC,
  sum(reduce(pdp = 1.0, d IN degrees | pdp * d ^ -0.4)) AS DWPC
ORDER BY DWPC DESC, pathway_name
// Match compounds that cause the side effect and the genes those compounds bind
MATCH path = (n0:SideEffect)-[r1:CAUSES_CcSE]
  -(n1:Compound)-[r2:BINDS_CbG]-(n2:Gene)
WHERE n0.name = 'Cushingoid'
WITH
  [
    size((n0)-[:CAUSES_CcSE]-()),
    size(()-[:CAUSES_CcSE]-(n1)),
    size((n1)-[:BINDS_CbG]-()),
    size(()-[:BINDS_CbG]-(n2))
  ] AS degrees, path, n2
// Group by gene: PC is the path count, DWPC the sum of path weights
WITH
  n2,
  count(path) AS PC,
  sum(reduce(pdp = 1.0, d IN degrees | pdp * d ^ -0.4)) AS DWPC
RETURN
  n2.identifier AS gene_id,
  n2.name AS gene_symbol,
  n2.description AS gene_name,
  PC, DWPC
ORDER BY DWPC DESC, gene_symbol
Query from https://thinklab.com/d/220#6
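The Cushingoid query above groups paths by their terminal gene and reports two numbers per gene: the path count (PC) and the degree-weighted path count (DWPC). A minimal Python sketch of that aggregation follows. It assumes the paths have already been retrieved as (gene, degrees) pairs; the function names and the input shape are hypothetical, not part of Hetionet or Neo4j.

```python
from collections import defaultdict
from math import prod


def summarize_by_target(paths, damping=0.4):
    """Return {target: (PC, DWPC)} for an iterable of (target, degrees) pairs.

    PC counts the paths reaching each target; DWPC sums their weights,
    where a weight is the product of the degrees raised to -damping.
    """
    counts = defaultdict(int)
    dwpcs = defaultdict(float)
    for target, degrees in paths:
        counts[target] += 1
        dwpcs[target] += prod(d ** -damping for d in degrees)
    return {target: (counts[target], dwpcs[target]) for target in counts}


def rank_targets(summary):
    """Order targets like ORDER BY DWPC DESC, gene_symbol."""
    return sorted(summary, key=lambda target: (-summary[target][1], target))
```

PC treats every path equally, while DWPC lets paths through low-degree nodes contribute more, so the two rankings can differ for genes reached mainly through hub compounds.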
MATCH path = (n0:SideEffect)-[r1:CAUSES_CcSE]-(n1:Compound)-[r2:BINDS_CbG]-(n2:Gene)
WHERE n0.name = 'Cushingoid'
  AND n2.name = 'NR3C1'
RETURN path
Presentation to Pfizer in Cambridge, MA. These slides are released under a CC BY 4.0 License.