1.06k likes | 1.09k Views
Explore formal methods and computational tools to study biological systems complexity, from high-throughput data analysis to Temporal Logic semantics. Learn about modeling hierarchy, quantitative and qualitative approaches, and automated reasoning tools. Discover how to reconcile models for representing knowledge and making predictions in Systems Biology. Gain insights into the diversity of biological molecules and their interactions.
E N D
Semantical and Algorithmic Aspects of the LivingFrançois FagesConstraint Programming project-team, INRIA Paris-Rocquencourt • To tackle the complexity of biological systems, investigate • Programming Theory Concepts • Formal Methods of Circuit and Program Verification • Automated Reasoning Tools • Prototype Implementation in the Biochemical Abstract Machine BIOCHAM • modeling environment available athttp://contraintes.inria.fr/BIOCHAM
Systems Biology ? • “Systems Biology aims at systems-level understanding [which] • requires a set of principles and methodologies that links the • behaviors of molecules to systems characteristics and functions.” • H. Kitano, ICSB 2000 • Analyze (post-)genomic data produced with high-throughput technologies (stored in databases like GO, KEGG, BioCyc, etc.); • Integrate heterogeneous data about a specific problem; • Understand and Predict behaviors or interactions in large networks of genes and proteins. • Systems Biology Markup Language (SBML) : exchange format for reaction models
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives :
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives : • 1) Models for representing knowledge : the more concrete the better
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives : • 1) Models for representing knowledge : the more concrete the better • 2) Models for making predictions : the more abstract the better !
Issue of Abstraction • Models are built in Systems Biology with two contradictory perspectives : • 1) Models for representing knowledge : the more concrete the better • 2) Models for making predictions : the more abstract the better ! • These perspectives can be reconciled by organizing formalisms and models into hierarchies of abstractions. • To understand a system is not to know everything about it but to know • abstraction levels that are sufficient for answering questions about it
Semantics of Living Processes ? • Formally, “the” behavior of a system depends on our choice of observables. • ? ? Mitosis movie [Lodish et al. 03]
Boolean Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Presence/absence of molecules • Boolean transitions 0 1
Continuous Differential Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Concentrations of molecules • Rates of reactions x ý
Stochastic Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Numbers of molecules • Probabilities of reaction n
Temporal Logic Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Presence/absence of molecules • Temporal logic formulas F x F (x ^ F ( x ^ y)) FG (x v y) … F x
Constraint Temporal Logic Semantics • Formally, “the” behavior of a system depends on our choice of observables. • Concentrations of molecules • Constraint LTL temporal formulas F (x >0.2) F (x >0.2 ^ F (x<0.1^ y>0.2)) FG (x>0.2 v y>0.2) … F x>1
Language-based Approaches to Cell Systems Biology • Qualitative models:from diagrammatic notation to • Boolean networks [Kaufman 69, Thomas 73] • Petri Nets [Reddy 93, Chaouiya 05] • Process algebra π–calculus[Regev-Silverman-Shapiro 99-01, Nagasali et al. 00] • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03] • Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02] • Reaction rules[Chabrier-Fages 03] [Chabrier-Chiaverini-Danos-Fages-Schachter 04] • Quantitative models: from ODEs and stochastic simulations to • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00] • Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] HCC [Bockmayr-Courtois 01] • Stochastic π–calculus [Priami et al. 03] [Cardelli et al. 06] • Reaction rules with continuous time dynamics[Fages-Soliman-Chabrier 04]
Overview of the Talk • Rule-based Modeling of Biochemical Systems • Syntax of molecules, compartments and reactions • Semantics at three abstraction levels: boolean, differential, stochastic • Cell cycle control models • Temporal Logic Language for Formalizing Biological Properties • CTL for the boolean semantics • Constraint LTL for the differential semantics • PCTL for the stochastic semantics • Automated Reasoning Tools • Inferring kinetic parameter values from Constraint-LTL specification • Inferring reaction rules from CTL specification • L. Calzone, N. Chabrier, F. Fages, S. Soliman. TCSB VI, LNBI 4220:68-94. 2006.
Molecules of the living • Small molecules: covalent bonds 50-200 kcal/mol • 70% water • 1% ions • 6% amino acids (20), nucleotides (5), • fats, sugars, ATP, ADP, … • Macromolecules: hydrogen bonds, ionic, hydrophobic, Waals 1-5 kcal/mol • 20% proteins (50-104 amino acids) • RNA (102-104 nucleotides AGCU) • DNA (102-106 nucleotides AGCT)
Structure Levels of Proteins • 1) Primary structure: word of n amino acids residues (20n possibilities) • linked with C-N bonds • Example: INRIA • Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
Structure Levels of Proteins • 1) Primary structure: word of n amino acids residues (20n possibilities) • linked with C-N bonds • Example: INRIA • Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine • 2) Secondary: word of m a-helix, b-strands, random coils,… (3m-10m) • stabilized by hydrogen bonds H---O
Structure Levels of Proteins • 1) Primary structure: word of n amino acids residues (20n possibilities) • linked with C-N bonds • Example: INRIA • Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine • 2) Secondary: word of m a-helix, b-strands, random coils,… (3m-10m) • stabilized by hydrogen bonds H---O • 3) Tertiary 3D structure: spatial folding • stabilized by • hydrophobic • interactions
Syntax of proteins • Cyclin dependent kinase 1 Cdk1 • (free, inactive) • Complex Cdk1-Cyclin B Cdk1–CycB • (low activity) • Phosphorylated form Cdk1~{thr161}-CycB • at site threonine 161 • (high activity) • mitosis promotion factor
BIOCHAM Syntax of Objects • E == compound | E-E | E~{p1,…,pn} • Compound: molecule, #gene binding site, abstract @process… • - : binding operator for protein complexes, gene binding sites, … • Associative and commutative. • ~{…}: modification operator for phosphorylated sites, … • Set of modified sites (Associative, Commutative, Idempotent). • O == E | E::location • Location: symbolic compartment (nucleus, cytoplasm, membrane, …) • S == _ | O+S • + : solution operator (Associative, Commutative, Neutral _)
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB • Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. • _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _ • (not for cycE-cdk2 which is stable)
Elementary Reaction Rules • Complexation: A + B => A-B Decomplexation A-B => A + B • cdk1+cycB => cdk1–cycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB • Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. • _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _ • (not for cycE-cdk2 which is stable) • Transport: A::L1 => A::L2 • Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format)
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format) • In BIOCHAM : three abstraction levels • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic)
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format) • In BIOCHAM : three abstraction levels • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic) • Differential Semantics: concentration • Ordinary Differential Equations or Hybrid system (deterministic)
From Syntax to Semantics • R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S • where A =[C]=> B stands for A+C => B+C • A <=> B stands for A=>B and B=>A, etc. • | kinetic for R (import/export SBML format) • In SBML : no semantics (exchange format) • In BIOCHAM : three abstraction levels • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic) • Differential Semantics: concentration • Ordinary Differential Equations or Hybrid system (deterministic) • Stochastic Semantics: number of molecules • Continuous time Markov chain
1. Differential Semantics • Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1 • volume of diffusion …
1. Differential Semantics • Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1 • volume of compartment • Compiles a set of rules{ ei for Si=>S’I }i=1,…,n (by default ei is MA(1)) • into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak} • dA/dt = Σni=1 ri(A)*ei - Σnj=1 lj(A)*ej • where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i) • multiplied by the volume ratio of the location of A.
1. Differential Semantics • Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1 • volume of compartment • Compiles a set of rules{ ei for Si=>S’I }i=1,…,n (by default ei is MA(1)) • into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak} • dA/dt = Σni=1 ri(A)*ei - Σnj=1 lj(A)*ej • where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i) • multiplied by the volume ratio of the location of A. • volume_ratio (15,n),(1,c). • mRNAcycA::n <=> mRNAcycA::c. • means 15*Vn = Vc and is equivalent to15*mRNAcycA::n <=> mRNAcycA::c.
Numerical Integration • Adaptive step size 4th order Runge-Kutta can be weak for stiff systems • Rosenbrock implicit method using the Jacobian matrix ∂x’i/∂xj • computes a (clever) discretization of time • and a time series of concentrations and their derivatives • (t0, X0, dX0/dt), (t1, X1, dX1/dt), …, (tn, Xn, dXn/dt), … • at discrete time points
2. Stochastic Semantics • Associate to each molecule its number |Ai| in its location of volume Vi
2. Stochastic Semantics • Associate to each molecule its number |Ai| in its location of volume Vi • Compile the rule set into a continuous time Markov chain • over vector states X=(|A1|,…, |Ak|) • and where the transition rate τi for the reaction ei for Si=>S’I • (giving probability after normalization) is obtained from ei by replacing concentrations by molecule numbers
2. Stochastic Semantics • Associate to each molecule its number |Ai| in its location of volume Vi • Compile the rule set into a continuous time Markov chain • over vector states X=(|A1|,…, |Ak|) • and where the transition rate τi for the reaction ei for Si=>S’I • (giving probability after normalization) is obtained from ei by replacing concentrations by molecule numbers • Stochastic simulation [Gillespie 76, Gibson 00] • computes realizations as time series (t0, X0), (t1, X1), …, (tn, Xn), …
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location • Compile the rule set into an asynchronous transition system
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location • Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants: • A+BA+B+C+D • A+BA+B +C+D • A+BA+B+C+D • A+BA+B+C+D
3. Boolean Semantics • Associate to each molecule a Boolean denoting its presence/absence in its location • Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants: • A+BA+B+C+D • A+BA+B +C+D • A+BA+B+C+D • A+BA+B+C+D • Necessary for over-approximating the possible behaviors under the stochastic/discrete semantics (abstraction N {zero, non-zero})
Hierarchy of Semantics abstraction Theory of abstract Interpretation [Cousot Cousot POPL’77] [Fages Soliman TCSc’07] Boolean model Discrete model Differential model Stochastic model Models for answering queries: The more abstract the better Optimal abstraction w.r.t. query Syntactical model concretization
Query: what are the stationary states ? Boolean circuit analysis abstraction abstraction Discrete circuit analysis Boolean model abstraction Jacobian circuit analysis Discrete model abstraction Differential model Positive circuits are a necessary condition for multistationarity [Thomas 73] [de Jong 02] [Soulé 03] [Remy Ruet Thieffry 05] Stochastic model Syntactical model concretization
Type Inference / Type Checking abstraction [Fages Soliman CMSB’06] Boolean model Discrete model Differential model Influence graph of proteins Protein functions (kinase, phosphatase,…) Compartments topology Stochastic model Syntactical model concretization
Type Inference / Type Checking abstraction [Fages Soliman CMSB’06] Boolean model Influence graph of proteins (activation/inhibition) Discrete model Differential model Influence graph of proteins Protein functions (kinase, phosphatase,…) Compartments topology Stochastic model Syntactical model concretization
Cell Cycle: G1 DNA Synthesis G2 Mitosis • G1: CdK4-CycD S: Cdk2-CycA G2,M: Cdk1-CycA • Cdk6-CycD Cdk1-CycB (MPF) • Cdk2-CycE
Example: Cell Cycle Control Model [Tyson 91] • MA(k1) for _ => Cyclin. • MA(k2) for Cyclin => _. • MA(K7) for Cyclin~{p1} => _. • MA(k8) for Cdc2 => Cdc2~{p1}. • MA(k9) for Cdc2~{p1} =>Cdc2. • MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}. • MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}. • k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for • Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}. • MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}. • MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.