Daniel Himmelstein
Head of Data Integration at Related Sciences. Digital craftsman of the biodata revolution.
most hetio project websites are offline
Systematic integration of biomedical knowledge prioritizes drugs for repurposing
Daniel S Himmelstein, Antoine Lizee, Christine Hessler, Leo Brueggeman, Sabrina L Chen, Dexter Hadley, Ari Green, Pouya Khankhanian, Sergio E Baranzini
eLife (2017) https://doi.org/cdfk
DOI: 10.1371/journal.pcbi.1004259
observations = compound–disease pairs
features = types of paths
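To make "features = types of paths" concrete, here is a minimal sketch of a degree-weighted path count (DWPC), the per-metapath feature this line of work uses for each compound–disease pair. The toy network, the node and metaedge names, and the helper functions are hypothetical. The damping exponent of 0.4 is an assumed default, not a value stated on the slide.

```python
from collections import defaultdict

# Hypothetical toy hetnet: (source, metaedge, target) triples, treated as undirected.
EDGES = [
    ("compound:A", "binds", "gene:G1"),
    ("compound:A", "binds", "gene:G2"),
    ("compound:B", "binds", "gene:G2"),
    ("gene:G1", "associates", "disease:D"),
    ("gene:G2", "associates", "disease:D"),
]


def build_adjacency(edges):
    """Map (node, metaedge) -> set of neighbors, in both directions."""
    adj = defaultdict(set)
    for source, metaedge, target in edges:
        adj[(source, metaedge)].add(target)
        adj[(target, metaedge)].add(source)
    return adj


def dwpc(adj, source, target, metapath, damping=0.4):
    """Sum, over paths that follow `metapath`, of the damped degree product.

    Each edge on a path contributes (degree of one end * degree of the other
    end) ** -damping, where degrees are counted along that edge's metaedge.
    """
    total = 0.0
    stack = [(source, 0, 1.0, (source,))]
    while stack:
        node, depth, weight, visited = stack.pop()
        if depth == len(metapath):
            if node == target:
                total += weight
            continue
        metaedge = metapath[depth]
        neighbors = adj.get((node, metaedge), set())
        for neighbor in neighbors:
            if neighbor in visited:  # paths may not revisit a node
                continue
            step = (len(neighbors) * len(adj[(neighbor, metaedge)])) ** -damping
            stack.append((neighbor, depth + 1, weight * step, visited + (neighbor,)))
    return total


adj = build_adjacency(EDGES)
feature = dwpc(adj, "compound:A", "disease:D", ["binds", "associates"])
```

Each metapath gives one column of the feature matrix and each compound–disease pair gives one row. Paths through high-degree nodes are down-weighted, so a path through a promiscuous gene counts for less than one through a specific gene.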
therapeutic crosspurposing for 209,168 compound–disease pairs
https://het.io/repurpose/
1,538 connected compounds
136 connected diseases
predictions can be decomposed into their component metapath and path contributions
Hetnet connectivity search provides rapid insights into how biomedical entities are related
Daniel Himmelstein, Michael Zietz, Vincent Rubinetti, Kyle Kloster, Benjamin Heil, Faisal Alquaddoomi, Dongbo Hu, David Nicholson, Yun Hao, Blair Sullivan, Michael Nagle, Casey Greene
GigaScience (2023) https://doi.org/gsd85n
The probability of edge existence due to node degree: a baseline for network-based predictions
Michael Zietz, Daniel Himmelstein, Kyle Kloster, Christopher Williams, Michael Nagle, Casey Greene
GigaScience (2024) https://doi.org/gtcbks
A Proteome-Scale Map of the Human Interactome Network
Thomas Rolland, Murat Taşan, Benoit Charloteaux, Samuel J Pevzner, Quan Zhong, Nidhi Sahni, Song Yi, Irma Lemmens, Celia Fontanillo, Roberto Mosca, … Marc Vidal
Cell (2014-11) https://doi.org/f3mn6x
Figure 4A: Adjacency matrices showing Lit-BM-13 (blue) and HI-II-14 (purple) interactions, with proteins in bins of ∼350 and ordered by number of publications along both axes. The color intensity of each square reflects the total number of interactions for the corresponding bins
compound × disease, both with 1 treatment: prior = 0.12%
methotrexate × hypertension = 80% prior probability of treatment
The prior predicted in-sample treatments with AUROC = 97.9% but under-performed on validations:
The edge prior was not able to predict the separate PPI network better than by random guessing (AUROC of roughly 0.5). Only slightly better was its performance in predicting the separate TF-TG network, at an AUROC of 0.59. We find superior performance in predicting the coauthorship relationships (AUROC 0.75), which was expected as the network being predicted shared roughly the same degree distribution as the network on which the edge prior was computed
For all biomedical networks we've seen, degree is highly predictive of whether an edge exists, but it rarely generalizes to independent validation.
empirical approximation of the edge prior
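An empirical edge prior can be sketched as follows: permute the network many times while preserving every node's degree, then record how often each pair ends up connected. This is a simplified illustration, not the authors' implementation. The toy treatment edges, function names, and parameter defaults are hypothetical, and the sketch reports a frequency per node pair rather than pooling pairs that share the same degrees.

```python
import random
from collections import Counter

# Hypothetical bipartite treatment edges: (compound, disease).
TOY_TREATMENTS = [
    ("c1", "d1"), ("c1", "d2"), ("c1", "d3"),
    ("c2", "d1"), ("c3", "d4"), ("c4", "d1"),
]


def xswap(edges, n_swaps, rng):
    """Degree-preserving permutation: repeatedly swap the targets of two edges."""
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_1, new_2 = (a, d), (c, b)
        # Skip swaps that would duplicate an existing edge (this also covers
        # the case where both edges share a source or a target).
        if new_1 in edge_set or new_2 in edge_set:
            continue
        edge_set -= {edges[i], edges[j]}
        edge_set |= {new_1, new_2}
        edges[i], edges[j] = new_1, new_2
    return edges


def empirical_edge_prior(edges, n_permutations=200, swaps_per_edge=10, seed=0):
    """Fraction of permuted networks in which each (source, target) pair is an edge."""
    rng = random.Random(seed)
    counts = Counter()
    current = list(edges)
    for _ in range(n_permutations):
        current = xswap(current, swaps_per_edge * len(current), rng)
        counts.update(current)
    return {pair: count / n_permutations for pair, count in counts.items()}


prior = empirical_edge_prior(TOY_TREATMENTS)
```

Pairs that never appear in any permutation are absent from the result and have an estimated prior of zero. Because the permutations keep degrees fixed and discard everything else, the resulting frequencies reflect degree alone.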
analytical approximation of the edge prior — Pᵢ,ⱼ
probability that an edge exists solely based on degree
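As a hedged illustration of the analytical approximation, the function below implements the closed form as it is understood from the cited Zietz et al. paper, with source degree u, target degree v, and total edge count m. The formula is reproduced from memory of that paper and should be checked against it before use. The function and argument names are hypothetical.

```python
import math


def approx_edge_prior(source_degree, target_degree, n_edges):
    """Approximate probability that an edge exists given only node degrees.

    Assumed closed form: u*v / sqrt((u*v)**2 + (m - u - v + 1)**2),
    where u and v are the two node degrees and m is the number of edges.
    """
    u, v, m = source_degree, target_degree, n_edges
    product = u * v
    return product / math.sqrt(product ** 2 + (m - u - v + 1) ** 2)


# Behavior at the extremes, using made-up network sizes:
certain = approx_edge_prior(1, 1, 1)          # a one-edge network: the edge must exist
sparse = approx_edge_prior(1, 1, 1000)        # two degree-1 nodes in a larger network
hub = approx_edge_prior(50, 40, 1000)         # two high-degree nodes in the same network
```

For low degrees in a large network the value is close to u·v/m, and it rises toward 1 as the degrees grow relative to the number of edges. That is the behavior expected of a probability driven solely by degree.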
Browse at het.io/repurpose/metapaths.html
DWPC Δ AUROC: performance of a metapath on the real network minus performance on permuted networks
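The Δ AUROC definition can be sketched in a few lines: score the pairs with one metapath's DWPC on the real network, do the same on each permuted network, and subtract. Averaging over the permuted networks is an assumption of this sketch, and the function names and toy scores are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def delta_auroc(real_run, permuted_runs):
    """AUROC on the real network minus the mean AUROC across permuted networks.

    Each run is a (scores, labels) pair for one metapath's DWPC feature.
    """
    real = auroc(*real_run)
    permuted = [auroc(scores, labels) for scores, labels in permuted_runs]
    return real - sum(permuted) / len(permuted)


# Toy example: the metapath separates treatments perfectly on the real
# network but only at chance level on a single permuted network.
real_run = ([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
permuted_runs = [([0.9, 0.1, 0.8, 0.2], [1, 1, 0, 0])]
delta = delta_auroc(real_run, permuted_runs)
```

A Δ AUROC near zero suggests the metapath's apparent performance is explained by degree, since degree is what the permuted networks retain. A large positive value indicates signal from the specific edges.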
By Daniel Himmelstein
Presentation as part of the CZI / Every Cure Data-Driven Drug Repurposing Virtual Workshop Series on 2024-02-12 with a session title: "deep dive into the challenges and potential applications of knowledge graphs." This presentation is released under a CC BY 4.0 License.