Idea: apply Formal Methods of Program Verification to Systems Biology,

Formal Biology of the Cell: Modeling, Computing and Reasoning with Constraints. François Fages, Constraint Programming Group, INRIA Rocquencourt. mailto:Francois.Fages@inria.fr http://contraintes.inria.fr/



  1. Formal Biology of the Cell: Modeling, Computing and Reasoning with Constraints
     François Fages, Constraint Programming Group, INRIA Rocquencourt
     mailto:Francois.Fages@inria.fr http://contraintes.inria.fr/
     • Idea: apply
       • Formal Methods of Program Verification to Systems Biology,
       • Constraint Logic Programming and Constraint-based Model Checking.
     • In the course:
       • Learn bits of Biology through computational models,
       • Study new formalisms, languages and … implementations.

  2. Systems Biology
     • Multidisciplinary field aiming to get over the complexity walls in order to reason about biological processes at the system level.
     • Conferences: ICSB, CMSB, …; journal: TCSB.
     • Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments).
     • Bioinformatics: end of the 90s, genomic sequences → post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …).
     • Need for a strong effort on:
       - the formal representation of biological processes,
       - formal tools for modeling and reasoning about their global behavior.

  3. Language Approach to Cell Systems Biology
     • Qualitative models: from diagrammatic notation to
       • Boolean networks [Thomas 73]
       • Petri nets [Reddy 93]
       • Milner's π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00]
       • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
       • Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
       • Transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04]
       • Biochemical abstract machine BIOCHAM-1 [Chabrier-Fages 03]
     • Quantitative models: from differential equation systems to
       • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
       • Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
       • Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
       • Rules with continuous dynamics BIOCHAM-2 [Chabrier-Fages-Soliman 04]

  4. The Biochemical Abstract Machine BIOCHAM
     • Software environment based on two formal languages:
       • Biocham Rule Language for modeling biochemical systems
         • Syntax of molecules, compartments and reactions
         • Semantics at 3 abstraction levels: Boolean, concentrations, populations
       • Biocham Temporal Logic for formalizing biological properties
         • CTL for the Boolean semantics
         • Constraint LTL for the concentration semantics
     • Machine learning of rules and parameters from temporal properties
       • Learning reaction rules from a CTL specification
       • Learning kinetic parameter values from a Constraint-LTL specification
     • Internship topics: http://contraintes.inria.fr

  5. Overview of the Lectures
     • Introduction. Formal molecules and reactions in BIOCHAM.
     • Formal biological properties in temporal logic. Symbolic model-checking.
     • Continuous dynamics. Kinetics and transport models.
     • Computational models of the cell cycle control.
     • Abstract interpretation and typing of biochemical networks.
     • Machine learning reaction rules from temporal properties.
     • Constraint-based model checking. Learning kinetic parameter values.
     • Constraint Logic Programming approach to protein structure prediction.

  6. References
     • A wonderful textbook: Molecular Cell Biology. Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell. 5th Edition, 1100 pages + CD, Freeman Publ. Nov. 2003.
     • Modeling dynamic phenomena in molecular and cellular biology. Segel. Cambridge Univ. Press. 1987.
     • Modeling and querying bio-molecular interaction networks. Chabrier, Chiaverini, Danos, Fages, Schächter. Theoretical Computer Science 04.
     • The Biochemical Abstract Machine BIOCHAM. Chabrier, Fages, Soliman. http://contraintes.inria.fr/BIOCHAM

  7. Map of Course 1
     • Introduction
     • BIOCHAM syntax
       • Proteins: complexation and phosphorylation
       • DNA: replication and transcription
       • Reaction and transport rules
     • Boolean semantics: concurrent transition system, Kripke structure
       • States and transitions
     • Examples: RTK membrane receptors, MAPK signaling pathways

  8. 2. Syntax: a Simple Algebra of Cell Molecules
     • Small molecules: covalent bonds, 50-200 kcal/mol
       • 70% water
       • 1% ions
       • 6% amino acids (20), nucleotides (5), fats, sugars, ATP, ADP, …
     • Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals, 1-5 kcal/mol
       • Stability and bindings determined by the number of weak bonds: 3D shape
       • 20% proteins (50 to 10^4 amino acids)
       • RNA (10^2 to 10^4 nucleotides AGCU)
       • DNA (10^2 to 10^6 nucleotides AGCT)

  9. Structure Levels of Proteins
     • 1) Primary structure: word of n amino acid residues (20^n possibilities), linked with C-N bonds
       • Example: MPRI = Methionine-Proline-Arginine-Isoleucine
     • 2) Secondary structure: word of m α-helices, β-strands, random coils, … (3^m to 10^m possibilities), stabilized by hydrogen bonds H---O
     • 3) Tertiary 3D structure: spatial folding, stabilized by hydrophobic interactions
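
The "word over an alphabet" view of the primary structure can be made concrete with a small sketch. The dictionary below covers only the four residues of the MPRI example (it is an illustrative subset of the standard one-letter code), and the function names are hypothetical, not part of BIOCHAM.

```python
# Illustrative subset of the standard one-letter amino acid code:
# only the four residues used in the slide's MPRI example.
ONE_LETTER = {
    "M": "Methionine",
    "P": "Proline",
    "R": "Arginine",
    "I": "Isoleucine",
}


def spell_out(word: str) -> str:
    """Expand a primary structure written in one-letter code."""
    return "-".join(ONE_LETTER[c] for c in word)


def count_primary_structures(n: int, alphabet_size: int = 20) -> int:
    """Number of distinct words of n residues over the amino acid alphabet."""
    return alphabet_size ** n


print(spell_out("MPRI"))            # Methionine-Proline-Arginine-Isoleucine
print(count_primary_structures(4))  # 160000
```

Even a four-residue word already has 20^4 = 160000 possible primary structures, which gives a feel for the combinatorics behind the 20^n figure.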

  10. Formal proteins
      • Cyclin dependent kinase 1 (free, inactive): Cdk1
      • Complex Cdk1-Cyclin B (low activity): Cdk1-CycB
      • Phosphorylated form at site threonine 161 (high activity): Cdk1~{thr161}-CycB
      • The terms on the right are written in BIOCHAM syntax.

  11. Deoxyribonucleic Acid DNA
      • 1) Primary structure: word over 4 nucleotides: Adenine, Guanine, Cytosine, Thymine
      • 2) Secondary structure: double helix of pairs A--T and C---G, stabilized by hydrogen bonds
      • DNA replication: separation of the two helices and production of one complementary strand for each copy
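
The replication step described above can be sketched on strings. This is a deliberately simplified model: it ignores strand orientation (5' to 3') and all enzymatic machinery, and the function names are hypothetical.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the complementary strand, position by position."""
    return "".join(PAIRING[base] for base in strand)


def replicate(double_helix):
    """Separate the two strands and rebuild a complementary strand for each.

    double_helix is a pair (strand, complementary strand).
    Returns two double helices, each keeping one original strand.
    """
    top, bottom = double_helix
    return [(top, complement(top)), (complement(bottom), bottom)]


helix = ("ATGC", complement("ATGC"))
print(helix)             # ('ATGC', 'TACG')
print(replicate(helix))  # [('ATGC', 'TACG'), ('ATGC', 'TACG')]
```

Each copy keeps one original strand and receives one newly built complementary strand, so both copies are identical to the initial double helix.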

  12. DNA: Genome Size

  13. DNA: Genome Size
      • 3,200,000,000 pairs of nucleotides
      • Single nucleotide polymorphism: 1 / 2 kb

  14. Genome Size

  15. Genome Size

  16. Transcription: DNA → pre-mRNA → mRNA → Protein
      • Genes: parts of DNA
      • Activation: transcription factors bind to the regulatory region of the gene
      • Transcription: RNA polymerase copies the DNA from start to stop positions into a single-stranded premature messenger pRNA
      • (Alternative) splicing: non-coding regions of pRNA are removed, giving the mature messenger mRNA
      • Protein synthesis: mRNA moves to the cytoplasm and binds to a ribosome to assemble a protein
      • _ =[#E2-E2F13-DP12]=> pRNAcycA

  17. BIOCHAM Syntax of Objects
      • E ::= compound | E-E | E~{p1,…,pn}
        • Compound: molecule, #gene binding site, abstract @process, …
        • - : binding operator for protein complexes, gene binding sites, … Associative and commutative.
        • ~{…} : modification operator for phosphorylated sites, … Set of modified sites (associative, commutative, idempotent).
      • O ::= E | E::location
        • Location: symbolic compartment (nucleus, cytoplasm, membrane, …)
      • S ::= _ | O+S
        • + : solution operator (associative, commutative, neutral element _)
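
One way to read the algebraic laws above is as a choice of canonical representation. The sketch below is an assumption-laden illustration, not BIOCHAM's implementation: a molecule is a name with a frozenset of modified sites (which gives idempotence for free), and a complex is a sorted tuple of molecules (which makes binding associative and commutative). Modifications attached to a whole complex and locations are left out; all names are hypothetical.

```python
from typing import FrozenSet, Tuple

Molecule = Tuple[str, FrozenSet[str]]  # (name, set of modified sites)
Complex = Tuple[Molecule, ...]         # kept sorted, so binding order is irrelevant


def mol(name: str, *sites: str) -> Complex:
    """A single molecule, viewed as a complex with one component."""
    return ((name, frozenset(sites)),)


def bind(a: Complex, b: Complex) -> Complex:
    """The '-' operator: merge components and sort them into a canonical order."""
    return tuple(sorted(a + b, key=lambda m: (m[0], sorted(m[1]))))


def show(c: Complex) -> str:
    """Print a complex back in BIOCHAM-like notation."""
    parts = []
    for name, sites in c:
        suffix = "~{" + ",".join(sorted(sites)) + "}" if sites else ""
        parts.append(name + suffix)
    return "-".join(parts)


# Binding in either order yields the same canonical term.
print(show(bind(mol("CycB"), mol("Cdk1", "thr161"))))  # Cdk1~{thr161}-CycB
```

With this representation, equality of Python values coincides with equality of terms modulo the stated laws, which is convenient when states are later stored in sets.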

  18. Seven Fundamental Rule Schemas
      • Complexation: A + B => A-B. Decomplexation: A-B => A + B.
          cdk1 + cycB => cdk1-cycB
      • Phosphorylation: A =[C]=> A~{p}. Dephosphorylation: A~{p} =[C]=> A.
          Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
          Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
      • Synthesis: _ =[C]=> A. Degradation: A =[C]=> _.
          _ =[#Ge2-E2f13-Dp12]=> cycA
          cycE =[@UbiPro]=> _ (not for cycE-cdk2, which is stable)
      • Transport: A::L1 => A::L2.
          Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus

  19. BIOCHAM Syntax of Reaction Rules
      • R ::= S=>S | S=[O]=>S | S<=>S | S<=[O]=>S
        where A=[C]=>B stands for A+C=>B+C, A<=>B stands for A=>B and B=>A, etc.
      • N ::= kinetic for R (import/export in SBML format)
      • Three abstraction levels:
        • Boolean semantics: presence/absence of molecules. Concurrent transition system (asynchronous, non-deterministic).
        • Concentration semantics: number / volume of diffusion. Ordinary differential equations or hybrid system (deterministic).
        • Stochastic semantics: number of molecules. Continuous-time Markov chain.
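
The two abbreviations stated above (a catalyst written over the arrow, and a reversible arrow) can be unfolded mechanically. The sketch below represents a solution as a plain list of molecule names; the function `expand` and its signature are hypothetical and only illustrate the rewriting, not BIOCHAM's parser.

```python
def expand(left, arrow, right, catalyst=None):
    """Unfold an abbreviated rule into basic rules (reactants, products).

    left, right: lists of molecule names (a solution).
    arrow: "=>" or "<=>".
    catalyst: optional molecule written as =[catalyst]=> in the rule.
    """
    if catalyst is not None:
        # A =[C]=> B stands for A + C => B + C
        left = left + [catalyst]
        right = right + [catalyst]
    rules = [(left, right)]
    if arrow == "<=>":
        # A <=> B stands for A => B and B => A
        rules.append((right, left))
    return rules


print(expand(["A"], "=>", ["B"], catalyst="C"))
# [(['A', 'C'], ['B', 'C'])]
print(expand(["A"], "<=>", ["B"]))
# [(['A'], ['B']), (['B'], ['A'])]
```

Combining both abbreviations, S<=[O]=>S, simply yields two basic rules that each carry the catalyst on both sides.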

  20. The Actin-Myosin two-stroke Engine with ATP fuel
      • Myosin + ATP => Myosin-ATP
      • Myosin-ATP => Myosin + ADP
      • http://www.sci.sdsu.edu/movies

  23. The Actin-Myosin two-stroke Engine with ATP fuel
      • Myosin + ATP => Myosin-ATP
      • Myosin-ATP => Myosin + ADP
      • http://www.sci.sdsu.edu/movies
      • http://www-rocq.inria.fr/sosso/icema2
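
To connect the two reactions above with the concentration semantics, here is a minimal sketch that integrates them with explicit Euler steps. Everything numeric is an assumption made for illustration: mass-action kinetics, the rate constants k1 and k2, the initial concentrations and the step size are not taken from the slides.

```python
def simulate(k1=1.0, k2=0.5, dt=0.001, steps=10000):
    """Explicit Euler integration of the two-stroke engine under mass action.

    k1, k2, dt, steps and the initial concentrations are hypothetical values.
    """
    c = {"Myosin": 1.0, "ATP": 5.0, "Myosin-ATP": 0.0, "ADP": 0.0}
    for _ in range(steps):
        v1 = k1 * c["Myosin"] * c["ATP"]  # Myosin + ATP => Myosin-ATP
        v2 = k2 * c["Myosin-ATP"]         # Myosin-ATP => Myosin + ADP
        c["Myosin"] += dt * (v2 - v1)
        c["ATP"] += dt * (-v1)
        c["Myosin-ATP"] += dt * (v1 - v2)
        c["ADP"] += dt * v2
    return c


final = simulate()
# Myosin is recycled: free plus bound myosin stays at its initial total.
print(final["Myosin"] + final["Myosin-ATP"])
# ATP is the fuel: it is consumed and ends up as ADP.
print(final["ATP"], final["ADP"])
```

The conserved total of free and bound myosin reflects its role as an engine that is reused at each cycle, while the ATP pool can only decrease.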

  24. Cell to Cell Signaling by Hormones and Receptors
      • Signals: insulin, adrenaline, steroids, EGF, …, Delta, …, nutrients, light, pressure, …
      • Receptors: tyrosine kinases, G-protein coupled, Notch, …
      • L + R <=> L-R
      • RAS-GDP =[L-R]=> RAS-GTP

  25. Five MAP Kinase Pathways in Budding Yeast (Saccharomyces cerevisiae)

  26. MAPK Signaling Pathways
      • Input: RAF, activated by the receptor
          RAF-p14-3-3 + RAS-GTP => RAF + p14-3-3 + RAS-GDP
      • Output: MAPK~{T183,Y185}
        • moves to the nucleus
        • phosphorylates a transcription factor
        • which stimulates gene transcription

  27. MAPK Signaling Pathway in BIOCHAM
      • Pattern variables $P for phosphorylation sites and molecules, with constraints.
      • BIOCHAM rules are expanded into BIOCHAM-0 rules without patterns.
      • Rules:
          RAF + RAFK <=> RAF-RAFK.
          RAF-RAFK => RAFK + RAF~{p1}.
          RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
          RAF~{p1}-RAFPH => RAF + RAFPH.
          MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P.
          MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
          MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
          MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
          MEK~{p1}-MEKPH => MEK + MEKPH.
          MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
          MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2} where p2 not in $P.
          MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
          MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
          MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.
          MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
          MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
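
The expansion of a pattern rule into pattern-free rules can be illustrated on the MEK/RAF rule above. The sketch assumes that MEK has exactly the two phosphorylation sites p1 and p2 and that $P ranges over all subsets of those sites satisfying the constraint; the helper names are hypothetical and the string handling is a stand-in for BIOCHAM's actual expansion.

```python
from itertools import combinations


def subsets(sites):
    """All subsets of a collection of sites, smallest first."""
    sites = sorted(sites)
    for size in range(len(sites) + 1):
        for combo in combinations(sites, size):
            yield frozenset(combo)


def fmt(name, sites):
    """Write a molecule with its modified sites, e.g. MEK~{p1}."""
    return name + ("~{" + ",".join(sorted(sites)) + "}" if sites else "")


def expand_mek_raf(sites=("p1", "p2"), excluded="p2"):
    """Expand: MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P."""
    rules = []
    for p in subsets(sites):
        if excluded in p:
            continue  # the constraint 'p2 not in $P' rejects this instance
        mek = fmt("MEK", p)
        rules.append(f"{mek} + RAF~{{p1}} <=> {mek}-RAF~{{p1}}.")
    return rules


for rule in expand_mek_raf():
    print(rule)
# MEK + RAF~{p1} <=> MEK-RAF~{p1}.
# MEK~{p1} + RAF~{p1} <=> MEK~{p1}-RAF~{p1}.
```

The two instances produce exactly the complexes MEK-RAF~{p1} and MEK~{p1}-RAF~{p1} that appear on the left-hand side of the two phosphorylation rules listed after the pattern rule.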

  28. Bipartite Proteins-Reactions Graph of MAPK
      • GraphViz: http://www.research.att.co/sw/tools/graphviz

  29. Random Boolean Simulation of MAPK Signaling

  30. Numerical simulation of MAPK in BIOCHAM-2

  31. Boolean Semantics
      • Associate:
        • Boolean state variables to molecules, denoting the presence/absence of molecules in the cell or compartment;
        • a finite concurrent transition system [Shankar 93] to rules (asynchronous), over-approximating the set of all possible behaviors.
      • A reaction A+B=>C+D is translated into 4 transition rules, to account for the possibly complete consumption of each reactant:
          A+B → A+B+C+D
          A+B → ¬A+B+C+D
          A+B → A+¬B+C+D
          A+B → ¬A+¬B+C+D
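
The translation of one reaction into several Boolean transitions can be sketched as a successor function. A state is represented here as the frozenset of molecules that are present; this encoding and the function name are assumptions made for illustration.

```python
from itertools import product


def successors(state, reactants, products):
    """Boolean successors of a state under one reaction.

    state: frozenset of molecules currently present.
    Each reactant may or may not be completely consumed, so a reaction
    with n reactants yields up to 2**n successor states.
    """
    if not set(reactants) <= state:
        return set()  # the reaction is not enabled in this state
    result = set()
    for keep in product([True, False], repeat=len(reactants)):
        remaining = {r for r, kept in zip(reactants, keep) if kept}
        result.add(frozenset((state - set(reactants)) | remaining | set(products)))
    return result


for s in sorted(successors(frozenset("AB"), ["A", "B"], ["C", "D"]), key=sorted):
    print(sorted(s))
# ['A', 'B', 'C', 'D']
# ['A', 'C', 'D']
# ['B', 'C', 'D']
# ['C', 'D']
```

The four printed states correspond to the four transition rules for A+B=>C+D: the products always become present, and each reactant independently either remains or disappears.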

  32. Kripke Structure K=(S,R)
      • Given:
        • V, a set of state variables with domain D,
        • T, a set of transition rules between states.
      • Associate a Kripke structure (S,R) where:
        • S = D^V is the set of possible states, with variables ranging over domain D,
        • R ⊆ S×S is the total relation induced by T, that is:
          • (A,B) is in R if there exists a transition rule from state A to state B,
          • (A,A) is in R if there exists no transition from state A.
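
The construction of a total relation R from the transition rules T can be sketched for the Boolean domain D = {absent, present}. States are again encoded as frozensets of present molecules, and the toy transition function `a_to_b` (the Boolean reading of a reaction A => B) is a hypothetical example, not one taken from the slides.

```python
from itertools import combinations


def all_states(variables):
    """S = D^V for Boolean variables: every subset of present molecules."""
    vs = sorted(variables)
    return [frozenset(c) for size in range(len(vs) + 1) for c in combinations(vs, size)]


def kripke(variables, transition):
    """Build (S, R); transition(state) returns the successor states under T."""
    S = all_states(variables)
    R = set()
    for s in S:
        succ = set(transition(s))
        if not succ:
            succ = {s}  # no transition from s: add a self-loop so R stays total
        R |= {(s, t) for t in succ}
    return S, R


def a_to_b(state):
    """Toy rule A => B: B becomes present, A may or may not be consumed."""
    if "A" not in state:
        return set()
    return {state | {"B"}, (state - {"A"}) | {"B"}}


S, R = kripke({"A", "B"}, a_to_b)
print(len(S), len(R))  # 4 6
```

The self-loops added on states without outgoing transitions are what makes R total, so that every state has at least one successor and every path can be extended indefinitely.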
