1.06k likes | 1.18k Views
Semantical and Algorithmic Aspects of the Living François Fages Constraint Programming project-team, INRIA Paris-Rocquencourt. To tackle the complexity of biological systems, investigate Programming Theory Concepts Formal Methods of Circuit and Program Verification Automated Reasoning Tools
E N D
Semantical and Algorithmic Aspects of the LivingFrançois FagesConstraint Programming project-team, INRIA Paris-Rocquencourt • To tackle the complexity of biological systems, investigate • Programming Theory Concepts • Formal Methods of Circuit and Program Verification • Automated Reasoning Tools • Prototype Implementation in the Biochemical Abstract Machine BIOCHAM • modeling environment available athttp://contraintes.inria.fr/BIOCHAM
Systems Biology ? • “Systems Biology aims at systems-level understanding [which] • requires a set of principles and methodologies that links the • behaviors of molecules to systems characteristics and functions.” • H. Kitano, ICSB 2000 • Analyze (post-)genomic data produced with high-throughput technologies (stored in databases like GO, KEGG, BioCyc, etc.); • Integrate heterogeneous data about a specific problem; • Understand and Predict behaviors or interactions in large networks of genes and proteins. • Systems Biology Markup Language (SBML) : exchange format for reaction models
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives :
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives : • 1) Models for representing knowledge : the more concrete the better
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives : • 1) Models for representing knowledge : the more concrete the better • 2) Models for making predictions : the more abstract the better !
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives : • 1) Models for representing knowledge : the more concrete the better • 2) Models for making predictions : the more abstract the better ! • These perspectives can be reconciled by organizing formalisms and models into hierarchies of abstractions. • To understand a system is not to know everything about it but to know • abstraction levels that are sufficient for answering questions about it
Semantics of Living Processes ? • Formally, “the” behavior of a system depends on our choice of observables. • ? ? Mitosis movie [Lodish et al. 03]
Boolean Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Presence/absence of molecules • Boolean transitions 0 1
Continuous Differential Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Concentrations of molecules • Rates of reactions x ý
Stochastic Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Numbers of molecules • Probabilities of reaction n
Temporal Logic Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Presence/absence of molecules • Temporal logic formulas F x F (x ^ F ( x ^ y)) FG (x v y) … F x
Constraint Temporal Logic Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Concentrations of molecules • Constraint LTL temporal formulas F (x >0.2) F (x >0.2 ^ F (x<0.1^ y>0.2)) FG (x>0.2 v y>0.2) … F x>1
Language-based Approaches to Cell Systems Biology • Qualitative models:from diagrammatic notation to • Boolean networks [Kaufman 69, Thomas 73] • Petri Nets [Reddy 93, Chaouiya 05] • Process algebra π–calculus[Regev-Silverman-Shapiro 99-01, Nagasali et al. 00] • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03] • Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02] • Reaction rules[Chabrier-Fages 03] [Chabrier-Chiaverini-Danos-Fages-Schachter 04] • Quantitative models: from ODEs and stochastic simulations to • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00] • Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] HCC [Bockmayr-Courtois 01] • Stochastic π–calculus [Priami et al. 03] [Cardelli et al. 06] • Reaction rules with continuous time dynamics[Fages-Soliman-Chabrier 04]
Overview of the Talk • Rule-based Modeling of Biochemical Systems • Syntax of molecules, compartments and reactions • Semantics at three abstraction levels: boolean, differential, stochastic • Cell cycle control models • Temporal Logic Language for Formalizing Biological Properties • CTL for the boolean semantics • Constraint LTL for the differential semantics • PCTL for the stochastic semantics • Automated Reasoning Tools • Inferring kinetic parameter values from Constraint-LTL specification • Inferring reaction rules from CTL specification • L. Calzone, N. Chabrier, F. Fages, S. Soliman. TCSB VI, LNBI 4220:68-94. 2006.
Molecules of the living • Small molecules: covalent bonds 50-200 kcal/mol • 70% water • 1% ions • 6% amino acids (20), nucleotides (5), • fats, sugars, ATP, ADP, … • Macromolecules: hydrogen bonds, ionic, hydrophobic, Waals 1-5 kcal/mol • 20% proteins (50-104 amino acids) • RNA (102-104 nucleotides AGCU) • DNA (102-106 nucleotides AGCT)
Structure Levels of Proteins • 1) Primary structure: word of n amino acids residues (20n possibilities) • linked with C-N bonds • Example: INRIA • Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
Structure Levels of Proteins • 1) Primary structure: word of n amino acids residues (20n possibilities) • linked with C-N bonds • Example: INRIA • Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine • 2) Secondary: word of m a-helix, b-strands, random coils,… (3m-10m) • stabilized by hydrogen bonds H---O
Structure Levels of Proteins • 1) Primary structure: word of n amino acids residues (20n possibilities) • linked with C-N bonds • Example: INRIA • Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine • 2) Secondary: word of m a-helix, b-strands, random coils,… (3m-10m) • stabilized by hydrogen bonds H---O • 3) Tertiary 3D structure: spatial folding • stabilized by • hydrophobic • interactions
Syntax of proteins • Cyclin dependent kinase 1 Cdk1 • (free, inactive) • Complex Cdk1-Cyclin B Cdk1–CycB • (low activity) • Phosphorylated form Cdk1~{thr161}-CycB • at site threonine 161 • (high activity) • mitosis promotion factor
BIOCHAM Syntax of Objects • E == compound | E-E | E~{p1,…,pn} • Compound: molecule, #gene binding site, abstract @process… • - : binding operator for protein complexes, gene binding sites, … • Associative and commutative. • ~{…}: modification operator for phosphorylated sites, … • Set of modified sites (Associative, Commutative, Idempotent). • O == E | E::location • Location: symbolic compartment (nucleus, cytoplasm, membrane, …) • S == _ | O+S • + : solution operator (Associative, Commutative, Neutral _)
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB • Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. • _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _ • (not for cycE-cdk2 which is stable)
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB • Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. • _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _ • (not for cycE-cdk2 which is stable) • Transport: A::L1 => A::L2 • Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format)
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format) • In BIOCHAM : three abstraction levels • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic)
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format) • In BIOCHAM : three abstraction levels • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic) • Differential Semantics: concentration • Ordinary Differential Equations or Hybrid system (deterministic)
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format) • In BIOCHAM : three abstraction levels • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic) • Differential Semantics: concentration • Ordinary Differential Equations or Hybrid system (deterministic) • Stochastic Semantics: number of molecules • Continuous time Markov chain
1. Differential Semantics • Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1 • volume of diffusion …
1. Differential Semantics • Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1 • volume of compartment • Compiles a set of rules{ ei for Si=>S’I }i=1,…,n (by default ei is MA(1)) • into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak} • dA/dt = Σni=1 ri(A)*ei - Σnj=1 lj(A)*ej • where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i) • multiplied by the volume ratio of the location of A.
1. Differential Semantics • Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1 • volume of compartment • Compiles a set of rules{ ei for Si=>S’I }i=1,…,n (by default ei is MA(1)) • into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak} • dA/dt = Σni=1 ri(A)*ei - Σnj=1 lj(A)*ej • where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i) • multiplied by the volume ratio of the location of A. • volume_ratio (15,n),(1,c). • mRNAcycA::n <=> mRNAcycA::c. • means 15*Vn = Vc and is equivalent to15*mRNAcycA::n <=> mRNAcycA::c.
Numerical Integration • Adaptive step size 4th order Runge-Kutta can be weak for stiff systems • Rosenbrock implicit method using the Jacobian matrix ∂x’i/∂xj • computes a (clever) discretization of time • and a time series of concentrations and their derivatives • (t0, X0, dX0/dt), (t1, X1, dX1/dt), …, (tn, Xn, dXn/dt), … • at discrete time points
2. Stochastic Semantics • Associate to each molecule its number |Ai| in its location of volume Vi
2. Stochastic Semantics • Associate to each molecule its number |Ai| in its location of volume Vi • Compile the rule set into a continuous time Markov chain • over vector states X=(|A1|,…, |Ak|) • and where the transition rate τi for the reaction ei for Si=>S’I • (giving probability after normalization) is obtained from ei by replacing concentrations by molecule numbers
2. Stochastic Semantics • Associate to each molecule its number |Ai| in its location of volume Vi • Compile the rule set into a continuous time Markov chain • over vector states X=(|A1|,…, |Ak|) • and where the transition rate τi for the reaction ei for Si=>S’I • (giving probability after normalization) is obtained from ei by replacing concentrations by molecule numbers • Stochastic simulation [Gillespie 76, Gibson 00] • computes realizations as time series (t0, X0), (t1, X1), …, (tn, Xn), …
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location • Compile the rule set into an asynchronous transition system
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location • Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants: • A+BA+B+C+D • A+BA+B +C+D • A+BA+B+C+D • A+BA+B+C+D
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location • Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants: • A+BA+B+C+D • A+BA+B +C+D • A+BA+B+C+D • A+BA+B+C+D • Necessary for over-approximating the possible behaviors under the stochastic/discrete semantics (abstraction N {zero, non-zero})
Hierarchy of Semantics abstraction Theory of abstract Interpretation [Cousot Cousot POPL’77] [Fages Soliman TCSc’07] Boolean model Discrete model Differential model Stochastic model Models for answering queries: The more abstract the better Optimal abstraction w.r.t. query Syntactical model concretization
Query: what are the stationary states ? Boolean circuit analysis abstraction abstraction Discrete circuit analysis Boolean model abstraction Jacobian circuit analysis Discrete model abstraction Differential model Positive circuits are a necessary condition for multistationarity [Thomas 73] [de Jong 02] [Soulé 03] [Remy Ruet Thieffry 05] Stochastic model Syntactical model concretization
Type Inference / Type Checking abstraction [Fages Soliman CMSB’06] Boolean model Discrete model Differential model Influence graph of proteins Protein functions (kinase, phosphatase,…) Compartments topology Stochastic model Syntactical model concretization
Type Inference / Type Checking abstraction [Fages Soliman CMSB’06] Boolean model Influence graph of proteins (activation/inhibition) Discrete model Differential model Influence graph of proteins Protein functions (kinase, phosphatase,…) Compartments topology Stochastic model Syntactical model concretization
Cell Cycle: G1 DNA Synthesis G2 Mitosis • G1: CdK4-CycD S: Cdk2-CycA G2,M: Cdk1-CycA • Cdk6-CycD Cdk1-CycB (MPF) • Cdk2-CycE
Example: Cell Cycle Control Model [Tyson 91] • MA(k1) for _ => Cyclin. • MA(k2) for Cyclin => _. • MA(K7) for Cyclin~{p1} => _. • MA(k8) for Cdc2 => Cdc2~{p1}. • MA(k9) for Cdc2~{p1} =>Cdc2. • MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}. • MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}. • k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for • Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}. • MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}. • MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.