LINCS
Leo Brueggeman, Daniel Himmelstein, Sergio Baranzini
http://slides.com/leoo/lincs#/
Background
- NIH-funded consortium composed of ~10 centers
- Large, sparse dataset evaluating the effect of drugs on:
  - Cell morphology (e.g. nucleus size, relative amount of ER)
  - Protein levels (mass spectrometry-based assay that quantifies 100 probes; a further 1100 are imputed)
  - Gene expression (fluorescent probes quantify expression of 1000 genes; coverage is then expanded by imputation)
LINCS Perturbagens
http://www.lincscloud.org/
Central Question:
How best to define the set of genes affected by a drug?
Gene Expression
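[Diagram: a perturbagen has signatures across cell lines and time points (MCF7:6H, NEU:6H, MCF7:24H, NEU:24H). These are split into gold signatures (replicable and distinct) and non-gold signatures; the gold signatures are combined into a consensus signature.]
*A signature is an array of probe-wise z-scores.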
Creating Consensus Signatures
- When a drug has multiple gold signatures:
  - Create a Spearman (rank-based) correlation matrix of the signatures
[Figure: correlation matrix with signatures A through E on both axes]
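The correlation-matrix step can be sketched as follows. This is a minimal illustration rather than the pipeline used for the talk: the signature names and z-scores are made-up toy values, and the helper functions (`average_ranks`, `pearson`, `spearman_matrix`) are hypothetical names introduced here. Spearman correlation is computed as the Pearson correlation of the ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


def spearman_matrix(signatures):
    """Pairwise Spearman correlation: Pearson correlation of the ranks."""
    names = list(signatures)
    ranked = {name: average_ranks(signatures[name]) for name in names}
    return {a: {b: pearson(ranked[a], ranked[b]) for b in names} for a in names}


# Toy z-scores for three hypothetical gold signatures over five probes.
toy_signatures = {
    "A": [2.1, -0.3, 0.8, -1.5, 0.1],
    "B": [1.8, -0.1, 0.9, -2.0, 0.0],
    "C": [-1.0, 0.4, 0.6, 1.2, 0.3],
}
correlations = spearman_matrix(toy_signatures)
```

Because only ranks enter the calculation, a signature with a few extreme z-scores does not dominate the comparison. How the matrix is then turned into a consensus signature is not specified on the slide, so that step is left out here.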
Our LINCS Subset
- We mapped LINCS perturbagens to DrugBank drugs
- Matches were made to 1232 DrugBank drugs
- Of these, 899 are approved drugs
http://www.drugbank.ca/
Data Structure
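[Diagram: one DrugBank drug maps to Perturbagen A and Perturbagen B; each perturbagen contributes its gold signatures (Gold Sig's A, Gold Sig's B), which are combined into a single consensus signature for the drug]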
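One way to picture this hierarchy in code is a pair of small record types. The class and field names below (`Perturbagen`, `DrugBankDrug`, `all_gold_signatures`) are hypothetical and chosen for illustration only, and the ID and z-scores are placeholders, not real data.

```python
from dataclasses import dataclass, field


@dataclass
class Perturbagen:
    """A LINCS perturbagen and its gold signatures (each a list of z-scores)."""
    name: str
    gold_signatures: list = field(default_factory=list)


@dataclass
class DrugBankDrug:
    """A DrugBank drug with every LINCS perturbagen mapped to it."""
    drugbank_id: str
    perturbagens: list = field(default_factory=list)

    def all_gold_signatures(self):
        """Pool gold signatures from every perturbagen mapped to this drug."""
        return [sig for pert in self.perturbagens for sig in pert.gold_signatures]


drug = DrugBankDrug(
    drugbank_id="DB-example",  # placeholder, not a real DrugBank ID
    perturbagens=[
        Perturbagen("Perturbagen A", [[1.2, -0.4], [0.9, -0.1]]),
        Perturbagen("Perturbagen B", [[1.5, -0.6]]),
    ],
)
pooled = drug.all_gold_signatures()
```

Pooling the gold signatures at the drug level is what lets a single consensus signature be built per DrugBank drug, regardless of how many perturbagens map to it.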
Central Question:
How best to define the set of genes affected by a drug?
Variation within the data:
- Correlation between transcriptional profiles of samples treated with the same drug varies
- Number of gold signatures per drug varies
- There are 72 cell lines represented in the gold signatures
- Each signature represents the profile at either 6 hours or 24 hours post-treatment
- Number of LINCS perturbagens mapped to each DrugBank drug varies
Correlation between transcriptional profiles of samples treated with the same drug varies widely
Number of gold signatures per drug ranges from 0 to ~1000
Question: Do drugs with more gold signatures have fewer significant genes?
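To explore this question one first needs a working definition of "significant gene". The sketch below assumes one, which the slides do not state: a probe counts as significant when the absolute value of its consensus z-score reaches a threshold. The threshold of 2.0, the function names, and the toy data are all illustrative assumptions.

```python
def count_significant(consensus, threshold=2.0):
    """Count probes whose absolute consensus z-score meets the threshold.

    The threshold is an assumed cutoff, not one taken from the slides.
    """
    return sum(1 for z in consensus if abs(z) >= threshold)


def signatures_vs_significant(drugs, threshold=2.0):
    """Pair each drug's gold-signature count with its significant-gene count."""
    return [
        (name, info["n_gold"], count_significant(info["consensus"], threshold))
        for name, info in drugs.items()
    ]


# Toy data: hypothetical drugs with made-up consensus z-scores.
toy_drugs = {
    "drug_1": {"n_gold": 3, "consensus": [2.5, -0.3, -2.1, 1.9, 0.0, 3.2]},
    "drug_2": {"n_gold": 40, "consensus": [1.1, -0.2, -2.4, 0.7, 0.1, 0.9]},
}
summary = signatures_vs_significant(toy_drugs)
```

Plotting the second column of `summary` against the third across all drugs would show whether the two quantities are related.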
There are 72 cell lines represented in the gold signatures
MCF7: breast carcinoma (m)
VCAP: prostate carcinoma (m)
A549: non-small cell lung carcinoma
PC3: prostate carcinoma (m)
HA1E: immortalized normal kidney
HCC515: lung carcinoma
A375: melanoma
HT29: colon adenocarcinoma
HEPG2: liver carcinoma
[Figure: cell types represented in the gold signatures. Labels: prostate carcinoma (m) (two lines), immortalized normal kidney, melanoma, liver carcinoma, lung carcinoma, non-small cell lung carcinoma, breast carcinoma (m), colon adenocarcinoma, muscle myoblast, primary adipocyte stem cell, iPSC-differentiated neurons, iPSC-differentiated neural progenitors]
Number of LINCS perturbagens mapped to each DrugBank drug varies between 1 and 7
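Question: Where on the 'correlation continuum' do merged perturbagens lie?
[Diagram: correlation axis marked at 0, 0.15 and 0.3, with three cases]
- Unrelated perturbagens: DB07 (Pert D) and DB04 (Pert E). The Pert D signatures and the Pert E signatures each give a consensus; correlation between the Pert D and Pert E consensus signatures: [0.05]
- Merged perturbagens: DB01 (Pert A, B, C). Pairwise correlation matrix:
  [ 1,   0.5, 0.4 ]
  [ 0.5, 1,   0.1 ]
  [ 0.4, 0.1, 1   ]
- Identical perturbagens: DB09 (Pert F). The Pert F signatures are split into subset 1 and subset 2, each giving its own consensus (Pert F-1, Pert F-2); correlation between the two: [0.3]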
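To place a merged drug on this continuum, its pairwise matrix has to be reduced to a single number. The sketch below assumes the mean of the off-diagonal entries as that summary, which is one reasonable choice and not necessarily the one used in the talk; `mean_off_diagonal` is a hypothetical helper. The input is the DB01 matrix for Pert A, B and C shown on the slide.

```python
def mean_off_diagonal(matrix):
    """Average the correlations between distinct perturbagens.

    Uses the upper triangle only, so each pair is counted once and the
    diagonal (self-correlation of 1) is ignored.
    """
    size = len(matrix)
    if size < 2:
        raise ValueError("need at least two perturbagens to compare")
    pairs = [matrix[i][j] for i in range(size) for j in range(i + 1, size)]
    return sum(pairs) / len(pairs)


# DB01 example from the slide: pairwise correlations of Pert A, B and C.
db01 = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.1],
    [0.4, 0.1, 1.0],
]
merged_score = mean_off_diagonal(db01)
```

The resulting score can then be compared with the unrelated and identical reference correlations on the same axis.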
Number of LINCS perturbagens mapped to each DrugBank drug varies between 1 and 7
[Figure: axis label "Correlation"]
Application of LINCS Data
