LINCS

Leo Brueggeman, Daniel Himmelstein, Sergio Baranzini

http://slides.com/leoo/lincs#/

[Diagram labels: Drugs, Symptoms, Side Effects]

Background

  • NIH-funded consortium composed of ~10 centers
  • Large, sparse dataset evaluating the effect of drugs on:
    1. Cell morphology (e.g. nucleus size, relative amount of ER)
    2. Protein levels (mass spectrometry-based assay that quantifies 100 probes; a further 1100 are imputed)
    3. Gene expression (fluorescent probes quantify expression of 1000 genes, then expanded by imputation)

 

LINCS Perturbagens

http://www.lincscloud.org/

Central Question:

How best to define the set of genes affected by a drug?

Gene Expression

  • Perturbagen
    • Gold Signatures (replicable and distinct)
    • Non-gold Signatures
  • Example signatures (cell line:time point): MCF7:6H, NEU:6H, MCF7:24H, NEU:24H
  • Consensus Signature

*A signature is an array of probe-wise z-scores

Creating Consensus Signatures

  • In the event of multiple gold signatures for one drug:
    • Create Spearman (rank-based) correlation matrix

[Figure: correlation matrix over gold signatures A, B, C, D, E]

Per-signature values:

A = 0.59
B = 0.43
C = 0.40
D = 0.60
E = -0.20
------------
total = 1.82

Normalized weights (value / total), applied to each signature:

A = 0.59/1.82 = 0.32 x Sig A
B = 0.43/1.82 = 0.24 x Sig B
C = 0.40/1.82 = 0.22 x Sig C
D = 0.60/1.82 = 0.33 x Sig D
E = -0.20/1.82 = -0.11 x Sig E
-----------------
total = 1.0

Consensus Signature (sum of weighted signatures)
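
The weighting scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how each signature's value is derived from the correlation matrix, so the sketch assumes it is the mean Spearman correlation with the other signatures. The function names (`spearman_matrix`, `normalize_scores`, `consensus_signature`) are hypothetical, and ranks are computed without tie averaging, which is a simplification.

```python
import numpy as np


def spearman_matrix(signatures):
    """Spearman correlation between rows of a (signatures x probes) array.

    Simplification: ranks come from a double argsort, so tied values are
    ranked in order of appearance rather than averaged.
    """
    ranks = np.argsort(np.argsort(signatures, axis=1), axis=1).astype(float)
    return np.corrcoef(ranks)


def normalize_scores(scores):
    """Divide each score by the total so that the weights sum to 1."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum()
    if total == 0:
        raise ValueError("scores sum to zero; weights are undefined")
    return scores / total


def consensus_signature(signatures):
    """Return (consensus, weights) for a (signatures x probes) array."""
    signatures = np.asarray(signatures, dtype=float)
    n = signatures.shape[0]
    if n == 1:
        return signatures[0].copy(), np.array([1.0])
    corr = spearman_matrix(signatures)
    # ASSUMPTION: a signature's score is its mean correlation with the
    # other signatures (diagonal self-correlation of 1 is excluded).
    scores = (corr.sum(axis=1) - 1.0) / (n - 1)
    weights = normalize_scores(scores)
    # Consensus = sum of weighted signatures, probe by probe.
    return weights @ signatures, weights


# Weights for the five example values shown on the slide.
slide_weights = normalize_scores([0.59, 0.43, 0.40, 0.60, -0.20])
```

Note that a negatively correlated signature keeps a negative weight here, as in the slide's example for signature E; clipping such weights at zero would be a different design choice.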

Our LINCS Subset

  • We mapped LINCS perturbagens to DrugBank drugs
  • Matches were made to 1232 DrugBank drugs
  • Of these, 899 are approved drugs

http://www.drugbank.ca/

Data Structure

  • DrugBank Drug (Consensus Signature)
    • Perturbagen A (Gold Sigs A)
    • Perturbagen B (Gold Sigs B)
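
One way to picture this hierarchy in code is as nested records. The sketch below is an assumption about the layout, not the authors' actual schema: all class and field names (`GoldSignature`, `Perturbagen`, `DrugBankDrug`, `pert_id`, and so on) and the example identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GoldSignature:
    cell_line: str          # e.g. "MCF7"
    hours: int              # 6 or 24 hours post-treatment
    z_scores: List[float]   # probe-wise z-scores


@dataclass
class Perturbagen:
    pert_id: str            # hypothetical LINCS perturbagen identifier
    gold_signatures: List[GoldSignature] = field(default_factory=list)


@dataclass
class DrugBankDrug:
    drugbank_id: str
    perturbagens: List[Perturbagen] = field(default_factory=list)
    consensus: Optional[List[float]] = None  # filled in once computed

    def all_gold_signatures(self) -> List[GoldSignature]:
        """Pool the gold signatures of every mapped perturbagen."""
        return [s for p in self.perturbagens for s in p.gold_signatures]


# Toy example with made-up identifiers and values.
drug = DrugBankDrug(
    drugbank_id="DB-example",
    perturbagens=[
        Perturbagen("pert-A", [GoldSignature("MCF7", 6, [0.1, -1.2])]),
        Perturbagen("pert-B", [GoldSignature("MCF7", 24, [0.3, -0.8]),
                               GoldSignature("NEU", 6, [0.0, -1.0])]),
    ],
)
```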

Central Question:

How best to define the set of genes affected by a drug?

Variation within the data:

  1. Correlation between transcriptional profiles from treatment with the same drug varies
  2. Number of gold signatures per drug varies
  3. There are 72 cell lines represented in the gold signatures
  4. Signatures represent the profile either 6 or 24 hours post-treatment
  5. Number of LINCS perturbagens mapped to each DrugBank drug varies

Correlation between transcriptional profiles from treatment with the same drug varies widely

Number of gold signatures per drug varies between 0 and ~1000

Question: Do drugs with more gold signatures have fewer significant genes?
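
One possible way to probe this question is to count the genes whose consensus z-score passes a cutoff and rank-correlate that count with the number of gold signatures per drug. The sketch below is illustrative only: the cutoff of 2.0, the function names, and the toy numbers are all assumptions, not values from the slides.

```python
import numpy as np


def count_significant(consensus, threshold=2.0):
    """Count probes whose absolute z-score exceeds the (assumed) cutoff."""
    z = np.asarray(consensus, dtype=float)
    return int(np.sum(np.abs(z) > threshold))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


# Toy data (made up): gold signatures per drug and significant-gene counts.
n_signatures = [1, 5, 20, 100]
n_significant = [40, 30, 10, 2]
rho = spearman(n_signatures, n_significant)
```

A strongly negative `rho` on real data would be consistent with averaging over many signatures shrinking the consensus z-scores toward zero.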

There are 72 cell lines represented in the gold signatures

MCF7: breast carcinoma (m)

VCAP: prostate carcinoma (m)

A549: lung non-small cell carcinoma

PC3: prostate carcinoma (m)

HA1E: immortalized normal kidney

HCC515: lung carcinoma

A375: melanoma

HT29: colon adenocarcinoma

HEPG2: liver carcinoma

[Figure: gold signatures by cell type. Labels: prostate carcinoma (m), immortalized normal kidney, melanoma, prostate carcinoma (m), liver carcinoma, lung carcinoma, lung non-small cell carcinoma, breast carcinoma (m), colon adenocarcinoma, muscle myoblast, primary adipocyte stem cell, iPSC-differentiated neurons, iPSC-differentiated neural progenitors]

Number of LINCS perturbagens mapped to each DrugBank drug varies between 1 and 7

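Question: Where on the 'correlation continuum' do merged perturbagens lie?

[Figure: correlation axis with ticks at 0, 0.15, and 0.3, labeled Unrelated Perturbagens, Merged Perturbagens, Identical Perturbagens]

  • Unrelated perturbagens: DB07 (Pert D) vs. DB04 (Pert E)
    • Pert D Signatures give Pert D Consensus; Pert E Signatures give Pert E Consensus
    • Correlation: [0.05]
  • Merged perturbagens: DB01 (Pert A, B, C)
    • Correlation matrix:

            A     B     C
      A  [ 1,   0.5,  0.4 ]
      B  [ 0.5, 1,    0.1 ]
      C  [ 0.4, 0.1,  1   ]

  • Identical perturbagens: DB09 (Pert F)
    • Pert F Sig Subset 1 gives Pert F-1 Consensus; Pert F Sig Subset 2 gives Pert F-2 Consensus
    • Correlation: [0.3]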
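
To place a merged drug on this continuum, its perturbagen-by-perturbagen correlation matrix has to be reduced to a single number. The slides do not say how; the sketch below assumes the mean of the off-diagonal entries, and the function name `mean_off_diagonal` is hypothetical. It uses the DB01 example matrix for perturbagens A, B, and C.

```python
import numpy as np


def mean_off_diagonal(corr):
    """Average pairwise correlation, ignoring the diagonal of ones.

    ASSUMPTION: the mean is used as the summary; a median or minimum
    would be equally defensible choices.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    if corr.shape != (n, n) or n < 2:
        raise ValueError("need a square matrix with at least two rows")
    off_diagonal = ~np.eye(n, dtype=bool)
    return float(corr[off_diagonal].mean())


# Correlation matrix for perturbagens A, B, C mapped to DB01 (from the slide).
db01 = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.1],
    [0.4, 0.1, 1.0],
]
db01_summary = mean_off_diagonal(db01)
```

The resulting value can then be compared against the unrelated (0.05) and identical (0.3) reference points from the slide.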

Number of LINCS perturbagens mapped to each DrugBank drug varies between 1 and 7

[Figure: x-axis: Correlation]

Application of LINCS Data
