InvFBA (and more)
Looking for the microbial system goals
SAMOSA meeting, 29/01/2018
E. Delage, D.Eveillard, A.Larhlimi
Key Idea
Gain insight into the metabolic goals of the organism
Gene expression
Metabolic network
Can we compare the landscape of metabolic goals... :
- ... of individuals grown on different conditions ?
- ... across time ?
- etc...
Some questions of interest
Recap on metabolic network
- Set of all chemical reactions of metabolism
- Metabolic pathways : glycolysis, ...
Recap on metabolic network
1 | 2 | |
---|---|---|
A | -2 | 0 |
B | -1 | 0 |
C | 1 | -1 |
D | 0 | 1 |
Stoichiometric matrix
Bipartite weighted directed graph
R1 : 2A + B -> C
R2 : C <-> D
Recap on metabolic network
Gene expression
Enzyme
Reaction catalysis
Gene Protein Reaction (GPR) association :
rxn_1984 : (gene_b OR gene_c) AND gene_d
rxn_1983 : gene_a
FBA
Flux Balance Analysis
Calculates the flow of metabolites through the metabolic network, given :
- stoichiometric constraints
- bounds on fluxes
- an a priori defined objective (often biomass maximization)
- the steady state hypothesis
$$max \ cv$$
s.t.$$ Sv = 0 $$
$$L \leq v \leq U$$
Mathematical description
Simulation purpose
- gene deletion
- growth media optimization
- ...
steady state constraint
Biomass maximization
$$max \ v_{biomass}$$
s.t.$$ Sv = 0 $$
$$L \leq v \leq U$$
$$v=(v_1,...,v_n)$$$$c = (c_1,...,c_n)$$
$$L = (L_1,...,L_n)$$
$$U=(U_1,...,U_n)$$
$$S=(s_{j,i})_{1 \leq j \leq m, 1 \leq i \leq n} $$
m : number of metabolites
n : number of reactions
inv-FBA
FBA drawbacks :
- Inadequacy of the biomass function
- Organism goal may not be the biomass maximization
Motivation
Zhao Q. et al., Gen. Bio., 2016
inv-FBA
Plus of Lee et al. method:
- no a priori objective function
- no need for multiple transcriptomics dataset
- no need to define a threshold for genes high/low expression state
is not enough
inv-FBA limits :
- MATLAB implementation
- Works with fluxes as input
Find a way to go from gene expression to fluxes
Lee et al., BMC Systems Biology, 2012
No gold standard
Workflow
GENE EXPRESSION TO "REACTION EXPRESSION" (GPR association) |
---|
Gene expression
Metabolic network
Mean | SD | |
---|---|---|
reaction 1 | 8 | 2 |
reaction 2 | 4 | 1 |
REACTION EXPRESSION TO FLUX (Lee et al.) |
---|
Flux | |
---|---|
reaction 1 | 0,0017 |
reaction 2 | 0,02 |
InvFBA |
---|
METABOLIC GOALS
"Reaction expression"
Flux
RNA-Seq data :
- SAMOSA
- Normalized with Deseq2
- Conditions : Light, temperature, time
- Triplicates
Metabolic network built with
Focus on two conditions
- WH7803_LLCtT0
- WH7803_LLUVT6
Demonstration with synechococcus
Workflow
GENE EXPRESSION TO "REACTION EXPRESSION" (GPR association) |
---|
Gene expression
Metabolic network
Mean | SD | |
---|---|---|
reaction 1 | 8 | 2 |
reaction 2 | 4 | 1 |
"Reaction expression"
1 - Map gene expression onto reactions
Compute mean and standard deviation across samples for each gene
SD will be used as a confidence parameter when building fluxes
From gene expression to reaction expression
Sample 1 | Sample 2 | Sample 3 | |
---|---|---|---|
gene_1 | 8 | 7 | 9 |
gene_ 2 | 4 | 2 | 10 |
... |
Mean | SD | |
---|---|---|
gene_1 | 8 | 1 |
gene_ 2 | 5.3 | 4.16 |
... |
Evaluate reaction expression
reaction_1 : (B and C) or A
Abstract Syntax Tree (AST)
gene-protein-reaction (GPR) association
Gene name | Gene expression |
---|---|
A | 12 |
B | 3 |
C | 7 |
B and C : min(3,7) = 3
(B and C) or A : 3 + 12 = 15
reaction_1 : 15
Metabolic network :
- 1095 reactions (656 reversibles)
- 1221 metabolites
- 540 "known" genes
209/1095 reactions for which we can't compute an expression
RNA-Seq data :
- 3 replicates per condition
- 2544 genes (including the 540)
209 Gene-Protein-Reaction rules (GPR) untractable :
- 111 containing only "Unknown"
- 65 empty
- 33 mixing known and unknown genes
Demonstration with synechococcus
Workflow
GENE EXPRESSION TO "REACTION EXPRESSION" (GPR association) |
---|
Gene expression
Metabolic network
REACTION EXPRESSION TO FLUX (Lee et al.) |
---|
Flux | |
---|---|
reaction 1 | 0,0017 |
reaction 2 | 0,02 |
"Reaction expression"
Flux
2 - Build fluxes
Mean | SD | |
---|---|---|
reaction 1 | 8 | 2 |
reaction 2 | 4 | 1 |
Lee
$$Z = min\sum_i \frac 1 {\sigma_i} | v_i - d_i |$$
s.t.$$ Sv = 0 $$
$$L_i <= v_i <= U_i$$
$$d_i :$$
$$v_i :$$
reaction flux
reaction expression
$$\sigma_i :$$
reaction expression standard deviation
Problem with reversible reactions : flux can be positive or negative when expression is always positive
Maximize the correlation between the predicted fluxes and the corresponding gene expression data
$$V_{Irr} = \{r_1,r_2,r_3\}$$
$$V_{Rev} = \{r_4,r_5\}$$
$$Z_{Irr} = 2.5$$
$$1 \leq v_4 \leq 7$$
$$-2 \leq v_5 \leq 8$$
$$V_{Irr} = \{r_1,r_2,r_3,r_4\}$$
$$Z_{Irr} = 2.8$$
$$-2 \leq v_5 \leq 5$$
$$V_{Irr} = \{r_1,r_2,r_3,r_4\}$$
Demonstration with synechococcus
Only 61 reversible reactions remaining
Synechococcus
1095 reactions
656 reversible
439 irreversible
Before Lee
After Lee
Run a FBA without any objective to assign a flux to those reactions
Workflow
GENE EXPRESSION TO "REACTION EXPRESSION" (GPR association) |
---|
Gene expression
Metabolic network
REACTION EXPRESSION TO FLUX (Lee et al.) |
---|
Flux | |
---|---|
reaction 1 | 0,0017 |
reaction 2 | 0,02 |
InvFBA |
---|
METABOLIC GOALS
"Reaction expression"
Flux
3 - Get metabolic goals
Mean | SD | |
---|---|---|
reaction 1 | 8 | 2 |
reaction 2 | 4 | 1 |
Schematic representation of how FBA and invFBA work
Zhao Q. et al., Gen. Bio., 2016
OVA
Objective Variability Analysis
OVA determines the range each reaction coefficient can take
There exists possibly an infinity of invFBA solutions
$$ 0 \leq c^{min}_i \leq c_i \leq c^{max}_i \leq 1$$
Comparing objective function space
Technical
- Some adjustments required on Lee's method
- Return more solutions from Lee's method to give as input to invFBA
- Parallelization
Validation
- Apply the method to all SAMOSA experimental conditions
- Biological interpretation of coefficients values
- Test for statistical significance
Package the program using
Ongoing and future work
Thank you for your attention
InvFBA-SAMOSA
By edelage
InvFBA-SAMOSA
- 414