Daniel Himmelstein
Head of Data Integration at Related Sciences. Digital craftsman of the biodata revolution.
GraphConnect 2016
San Francisco
October 13, 2016
Market Foyer
2:35 pm – 2:50 pm
By Daniel Himmelstein
Slides at slides.com/dhimmel/graphconnect
Hetionet is a hetnet — a network with multiple node and relationship types. Version 1.0 contains 47,031 nodes of 11 types and 2,250,197 relationships of 24 types. Data was integrated from 29 public resources to connect compounds, diseases, genes, anatomies, pathways, biological processes, molecular functions, cellular components, perturbations, pharmacologic classes, drug side effects, and disease symptoms.
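To make the definition concrete, here is a minimal sketch of a hetnet as a plain data structure: every node and every relationship carries a type. The `Node` and `Edge` classes, the placeholder gene name, and the three toy edges are assumptions made for illustration; they are not Hetionet's schema or data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str  # node type, e.g. "Compound"
    name: str


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    kind: str  # relationship type, e.g. "binds"


# Toy nodes and edges (illustrative only, not taken from Hetionet).
compound = Node("Compound", "Bupropion")
gene = Node("Gene", "GENE_A")  # hypothetical placeholder gene
disease = Node("Disease", "nicotine dependence")

edges = [
    Edge(compound, gene, "binds"),
    Edge(gene, disease, "associates"),
    Edge(compound, disease, "treats"),
]
nodes = {node for edge in edges for node in (edge.source, edge.target)}

# A hetnet is summarized by how many nodes and relationships exist per type.
node_type_counts = Counter(node.kind for node in nodes)
edge_type_counts = Counter(edge.kind for edge in edges)

print(f"{len(nodes)} nodes of {len(node_type_counts)} types")
print(f"{len(edges)} relationships of {len(edge_type_counts)} types")
```

The same per-type summary, applied to the full network, gives the counts of node and relationship types quoted above.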
Hetionet was created as part of Project Rephetio, an open science project to systematically identify why drugs work and predict new uses for existing drugs. Using advanced Cypher queries, we quantified the network connectivity between compound–disease pairs along 1,206 types of paths. We then used machine learning to predict the probability of treatment for 209,168 compound–disease pairs.
Hetionet is available online as a public Neo4j database instance. The Hetionet Neo4j Browser includes an introductory guide as well as guides showing the most supportive paths for each of the 209,168 predictions. The Hetionet Browser uses Docker for Neo4j. Join us at GraphConnect to learn about how Neo4j is a powerful technology for human disease research.
A 2012 study identified 26 different names for this type of network:
multilayer network, multiplex network, multivariate network, multinetwork, multirelational network, multirelational data, multilayered network, multidimensional network, multislice network, multiplex of interdependent networks, hypernetwork, overlay network, composite network, multilevel network, multiweighted graph, heterogeneous network, multitype network, interconnected networks, interdependent networks, partially interdependent networks, network of networks, coupled networks, interconnecting networks, interacting networks, heterogeneous information network
One term for networks with multiple node or relationship types: hetnet
What's the best software for storing and querying hetnets?
dhimmel/hetio | neo4j/neo4j |
---|---|
86 | 42,498 |
5 | 3,071 |
2 | 1,007 |
GitHub stats from 2016-10-09
Visualizing Hetionet v1.0
Details at doi.org/brsc
Project online at thinklab.com/p/rephetio
Compound–causes–SideEffect–causes–Compound–treats–Disease
Compound–binds–Gene–binds–Compound–treats–Disease
Compound–binds–Gene–associates–Disease
Compound–binds–Gene–participates–Pathway–participates–Gene–associates–Disease
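Path types like these are often written in a compact form that concatenates node and relationship abbreviations. The sketch below shows one way that abbreviation could be computed. The abbreviations C, G, D, PW, b, p and a appear in the relationship types used in this talk (`BINDS_CbG`, `PARTICIPATES_GpPW`, `ASSOCIATES_DaG`); SE, t and c are assumptions, and the `abbreviate` helper is hypothetical.

```python
# Abbreviation tables. C, G, D, PW, b, p, a are taken from relationship
# types in this talk; SE, t, c are assumed for illustration.
NODE_ABBREV = {
    "Compound": "C",
    "Gene": "G",
    "Disease": "D",
    "Pathway": "PW",
    "SideEffect": "SE",
}
EDGE_ABBREV = {
    "binds": "b",
    "associates": "a",
    "participates": "p",
    "treats": "t",
    "causes": "c",
}


def abbreviate(path_type: str) -> str:
    """Abbreviate a path type written as Node–edge–Node–...–Node."""
    parts = path_type.replace("–", "-").split("-")
    if len(parts) % 2 == 0:
        raise ValueError("a path type must start and end with a node type")
    # Even positions are node types, odd positions are relationship types.
    return "".join(
        (NODE_ABBREV if i % 2 == 0 else EDGE_ABBREV)[part]
        for i, part in enumerate(parts)
    )


print(abbreviate("Compound–binds–Gene–associates–Disease"))
print(abbreviate("Compound–binds–Gene–binds–Compound–treats–Disease"))
```

Under these tables, the path type that passes from a compound through a gene, a pathway, and a second gene to a disease abbreviates to `CbGpPWpGaD`.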
See in the Neo4j Browser
    // Match paths: Compound - Gene - Pathway - Gene - Disease
    MATCH path = (n0:Compound)-[:BINDS_CbG]-(n1)-[:PARTICIPATES_GpPW]-
      (n2)-[:PARTICIPATES_GpPW]-(n3)-[:ASSOCIATES_DaG]-(n4:Disease)
    USING JOIN ON n2
    WHERE n0.name = 'Bupropion'
      AND n4.name = 'nicotine dependence'
      AND n1 <> n3  // the two genes on the path must differ
    WITH
      [
        // Degree of each node for each relationship along the path
        size((n0)-[:BINDS_CbG]-()),
        size(()-[:BINDS_CbG]-(n1)),
        size((n1)-[:PARTICIPATES_GpPW]-()),
        size(()-[:PARTICIPATES_GpPW]-(n2)),
        size((n2)-[:PARTICIPATES_GpPW]-()),
        size(()-[:PARTICIPATES_GpPW]-(n3)),
        size((n3)-[:ASSOCIATES_DaG]-()),
        size(()-[:ASSOCIATES_DaG]-(n4))
      ] AS degrees, path
    RETURN
      path,
      // Path weight: product of each degree raised to the power -0.4
      reduce(pdp = 1.0, d IN degrees | pdp * d ^ -0.4) AS path_weight
    ORDER BY path_weight DESC
    LIMIT 10
Cypher query to find the top CbGpPWpGaD paths
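The `reduce` expression in the query multiplies each degree raised to the power -0.4, so paths through highly connected nodes receive lower weights. The following sketch reproduces that arithmetic outside the database. The `path_weight` function and both degree lists are hypothetical, chosen only to show the effect of the exponent.

```python
import math


def path_weight(degrees, damping=0.4):
    """Product of each degree raised to -damping, as in the query's reduce().

    Every degree on a matched path is at least 1, because the path itself
    contributes one relationship of the relevant type to each node.
    """
    return math.prod(d ** -damping for d in degrees)


# Hypothetical degree lists for two paths of the same type (8 degrees each,
# two per relationship, in the order the query collects them).
specific_path = [2, 3, 3, 4, 4, 3, 5, 2]
hub_path = [d * 10 for d in specific_path]

print(f"specific path weight: {path_weight(specific_path):.4f}")
print(f"hub path weight:      {path_weight(hub_path):.6f}")
```

With these made-up inputs, the path through low-degree nodes gets the higher weight, which is why ordering by `path_weight DESC` favors specific connections over paths through hubs.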
Content | Type | URL |
---|---|---|
Hetionet Neo4j Browser | Neo4j Instance | neo4j.het.io |
Cypher Tutorial for Project Rephetio | GraphGist | goo.gl/nO7wbU |
Graphistania Podcast | Interview | goo.gl/yqVhZz |
Thinklab Project | Lab Notebook | thinklab.com/p/rephetio |
Hetionet GitHub | Repository | git.io/vPa98 |
More Cypher queries on Hetionet | Discussion | doi.org/brsd |
My PhD Thesis Seminar on Hetnets | Video | youtu.be/H8DfXop8K7g |
Lightning talk at GraphConnect 2016. See abstract at http://graphconnect.com/speaker/daniel-himmelstein/. This presentation is released under a CC BY 4.0 License.