Corrado Priami University of Trento Mobile Process Algebras in Systems Biology New Challenges and Opportunities
AGENDA • 1. What we can do • 2. Why we want to do it • 3. Where we are • 4. How we can do it • 5. The stochastic pi • 6. Its biochemical version • 7. The BioSPI tool • 8. A success story • 9. Concluding remarks
What we can do: an “In Silico” Virtual Distributed Lab for Systems Biology • Modeling the dynamic evolution of bio-systems: not only structures (genome), but functions • Simulation of time/space evolution: stochastic run-time of languages; parameter fitness and exploration • Analysis of their properties: causality, locality, concurrency, feedback loops • Comparison for similar/equivalent behavior: bisimulation-based equivalences; modular cell biology • Application of knowledge to similar classes of diseases: predicting behavior; looking at the computational space of models • Databases of (behavior) functionalities: programs as data + a run-time engine • Connection with high-throughput tools: specifications inferred from actual data
A possible architecture • We need biologists to use our tools, and this implies: • We must hide as many formal details as possible from the user • We must include in the framework all the tools they usually work with
Programming the cell: a man-on-the-moon vision • New computational paradigms, new primitives for programming, new software development tools, new (living) hardware. • New drug development, new genetic therapies, new cell-repairing tools, predictive, preventive, personalized medicine. • First step: complete understanding of living matter functions.
Why we want to do it (“Shapiro, Cardelli”) • Execution: perturbation of normal behavior • Result: interpretation of the new behavior / new states • High impact on: health and quality of life; environmental protection (reduction of in vivo and in vitro experiments); software development (new primitives and paradigms); social and economic models of evolution • E. coli: smaller than a Pentium gate, ~1M molecules, ~1M ROM, ~1M amino acids • PS Living devices: the machines are already there (bacteria, eukaryotic cells, etc.). Once we completely understand their physical layer, we only need a hierarchy of software on top of them • BUILDING A CELL COMPUTER is BUILDING A SOFTWARE INTERPRETATION
Systems Biology (DOE vision) • Gain a comprehensive and predictive understanding of the dynamic, interconnected processes underlying living systems • Goal 1: identify and characterize the molecular machines of life • Goal 2: characterize gene regulatory networks • Goal 3: characterize the functional repertoire of complex microbial communities in their natural environments at the molecular level • Goal 4: develop the computational capabilities to advance understanding of complex biological systems and predict their behavior • LONG-TERM IMPACT: predictive and preventive medicine, rational drug discovery and design, cell models and simulation, cell programming and repair, biocomputing and biocomputers
Where we are On the starting blocks, but … • we developed the first tool (BioSPI and Stochastic pi) • we applied it to a real case study (inflammatory processes in brain vessels)
What is Systems Biology • Leroy Hood (invented systems biology): building models of biological systems and then tuning/validating them via (high-throughput) experiments that provide feedback. Reductionism is replaced by hypothesis-driven investigation. • Robin Milner (invented mobile process algebras): computer science as an experimental science. Computer systems are first modeled (generation of hypotheses), then implemented and tested (experiments) to refine/validate the model (feedback loop). • Abstracting from experiments, Systems Biology is Computer Science in the applicative domain of life science.
From structures to functions in Biology: a new vision of biological systems • Bio-components as information and computational devices • Millions of simultaneous computational threads active (e.g., metabolic networks, gene regulatory networks, signaling pathways) • Component interaction changes the future behavior • Interactions occur only if components are correctly located (e.g., they are close enough or they are not divided by membranes) • Interpreting bio-components as processes: concurrent, distributed, mobile systems have the above characteristics.
“Meredith” Mobile process algebras
Formal models of Bio-Systems: Process Algebras for Mobility • Compositionality • Simple abstractions • Well-developed theory for analysis and verification • Tools already developed and available
Compositionality • Assign meaning to the basic graphical notations • Interpret them as process calculi primitives • Compose the processes to formally specify the whole system
Modeling paradigm of bio-components With the same principles specify chemistry, organic chemistry, enzymatic reactions, metabolic pathways, signal-transduction pathways… and ultimately the entire cell.
Molecule --- Processes; Compartments --- Private names and scope (“Shapiro”) • Example (ERK1): SYSTEM ::= … | ERK1 | ERK1 | … | MEK1 | MEK1 | … • ERK1 ::= (new internal_channels) (Nt_LOBE | CATALYTIC_CORE | Ct_LOBE) • Domains, molecules, systems ~ Processes • Compartments, membranes ~ Restriction
Interaction capability --- Global channels; Change of future interactions --- Mobility (“Shapiro”) • MEK1: ready to send p-tyr on tyr! • ERK1: ready to receive on tyr? • Before (Y): tyr![p-tyr].KINASE_ACTIVE_SITE + … | … + tyr?[tyr].T_LOOP • Actions consumed, alternatives discarded, p-tyr replaces tyr • After (pY): KINASE_ACTIVE_SITE | T_LOOP{p-tyr/tyr} • Molecular interaction and modification ~ Communication and change of channel names
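The communication step on this slide can be sketched in a few lines of Python. This is only an illustration of the substitution {p-tyr/tyr}, not BioSPI syntax; the `Residue` class and the helper names `substitute` and `communicate` are hypothetical, and the model assumes each process exposes a single named site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """Hypothetical stand-in for a process with one named interaction site."""
    name: str
    site: str


def substitute(proc: Residue, new: str, old: str) -> Residue:
    """Apply the substitution {new/old}: every use of `old` becomes `new`."""
    return Residue(proc.name, new if proc.site == old else proc.site)


def communicate(sent: str, placeholder: str, receiver: Residue) -> Residue:
    """One reduction step: the receiver continues with `sent` bound to `placeholder`."""
    return substitute(receiver, sent, placeholder)


# ERK1's T_LOOP receives p-tyr on channel tyr, so tyr is replaced by p-tyr.
t_loop = Residue("T_LOOP", "tyr")
after = communicate("p-tyr", "tyr", t_loop)
print(after)
```

The receiver's future interactions now use the received name, which is the sense in which communication models molecular modification.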
The stochastic pi-calculus • Biology is driven by quantities (e.g., energy, time, affinity, distance, amount of components). • A stochastic variant of process algebras must be considered. • Simulation techniques come into play.
Syntax and semantics • We associate the single parameter r in (0, ∞] of an exponential distribution to each prefix π; it describes the stochastic behavior of the activity. π.P is replaced by (π, r).P. • The delay of the activity (x, r) is a random variable with an exponential distribution. • The exponential distribution guarantees the memoryless property: the time at which a change of state occurs is independent of the time at which the last change of state occurred. • The race condition is defined in a probabilistic competitive context: all the activities that are enabled in a state compete and the fastest one succeeds. • Bang “!” is replaced by constant definitions, and the structural congruence is accordingly extended with A(y) ≡ P{y/x} if A(x) = P is the unique defining equation of constant A with x = fn(P).
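The race condition can be illustrated with a minimal Python sketch. The activity labels and rates below are invented for illustration, and the helper names `race` and `win_frequencies` are hypothetical. Each enabled activity draws an exponentially distributed delay from its rate and the smallest delay wins; for exponential delays, an activity with rate r_i wins with probability r_i divided by the sum of all enabled rates, which the empirical frequencies should approach.

```python
import random


def race(enabled, rng):
    """Sample an exponential delay per enabled activity; the fastest one succeeds.

    `enabled` maps an activity label to its rate r > 0.
    Returns (winning label, its delay).
    """
    delays = {label: rng.expovariate(rate) for label, rate in enabled.items()}
    winner = min(delays, key=delays.get)
    return winner, delays[winner]


def win_frequencies(enabled, trials=20000, seed=0):
    """Estimate how often each activity wins the race."""
    rng = random.Random(seed)
    counts = {label: 0 for label in enabled}
    for _ in range(trials):
        winner, _delay = race(enabled, rng)
        counts[winner] += 1
    return {label: n / trials for label, n in counts.items()}


# Illustrative rates only: "bind" is three times as fast as "unbind".
enabled = {"bind": 3.0, "unbind": 1.0}
freqs = win_frequencies(enabled)
print(freqs)
```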
Stochastic TS and CTMC: simple graph manipulation • A transition system (TS) is an oriented graph that connects the states through which a process can pass with arcs called transitions, possibly labeled with information on the activities that cause the state change. • A TS resembles a stochastic (Markov) process, except that a TS can have pairs of states connected by more than one transition. • [Figure: a TS with two transitions labeled (A, r) between the same pair of states, and the corresponding CTMC with a single transition labeled (A, 2r)]
Biochemical stochastic pi-calculus (“Shapiro”) • Gillespie (1977): accurate stochastic simulation of chemical reactions • Modification of the race condition and actual rate calculation according to biochemical principles • The actual rate of a reaction between two proteins is determined according to a basal rate and the concentrations or quantities of the reactants
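The graph manipulation from TS to CTMC can be sketched as follows, assuming transitions are given as (source, label, rate, target) tuples; this tuple layout and the function name `ts_to_ctmc` are illustrative choices. Parallel transitions between the same pair of states are merged into one arc whose rate is the sum of their rates, so two (A, r) transitions become a single arc with rate 2r.

```python
from collections import defaultdict


def ts_to_ctmc(transitions):
    """Collapse parallel transitions into single CTMC arcs by summing their rates."""
    rates = defaultdict(float)
    for source, _label, rate, target in transitions:
        rates[(source, target)] += rate
    return dict(rates)


r = 1.5  # illustrative rate
ts = [
    ("P", "A", r, "Q"),    # two distinct transitions with the same label and rate
    ("P", "A", r, "Q"),
    ("Q", "B", 0.5, "P"),
]
ctmc = ts_to_ctmc(ts)
print(ctmc)
```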
Biochemical stochastic pi-calculus Reduction Semantics
Biochemical stochastic pi-calculus: computing rates according to bio intuition • An auxiliary function inductively counts the number of receive operations enabled on the channel x.
The BioSPI system • Compiles (full) pi calculus to FCP/Logix • Incorporates Gillespie’s algorithm in the runtime engine
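A minimal sketch of the inductive count, under stated assumptions: processes are encoded as nested tuples (an encoding invented here), only unguarded prefixes count as enabled, and the actual rate is taken as basal rate × enabled receives × enabled sends on the channel. This simplified product ignores any correction for a send and a receive offered by the same choice, so it should be read as an approximation of the idea on the slides rather than the exact definition.

```python
NIL = ("nil",)


def count(proc, chan, kind):
    """Inductively count enabled `kind` ("in" or "out") prefixes on `chan`."""
    tag = proc[0]
    if tag == "nil":
        return 0
    if tag in ("in", "out"):
        # ("in"/"out", subject, continuation): the continuation is guarded, so skip it.
        return 1 if (tag == kind and proc[1] == chan) else 0
    if tag in ("sum", "par"):
        return count(proc[1], chan, kind) + count(proc[2], chan, kind)
    if tag == "new":
        # ("new", name, body): a restricted channel is invisible from outside.
        return 0 if proc[1] == chan else count(proc[2], chan, kind)
    raise ValueError(f"unknown process form: {tag!r}")


def actual_rate(system, chan, basal):
    """Simplified actual rate: basal rate times the numbers of receivers and senders."""
    return basal * count(system, chan, "in") * count(system, chan, "out")


# One sender and two receivers on channel tyr (illustrative quantities).
sender = ("out", "tyr", NIL)
receiver = ("in", "tyr", NIL)
system = ("par", sender, ("par", receiver, receiver))
print(count(system, "tyr", "in"), actual_rate(system, "tyr", 0.5))
```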
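For readers unfamiliar with Gillespie's algorithm, here is a minimal Python sketch of its direct method. It is not BioSPI code: the `gillespie` function, the species names, the rate constants, and the initial counts are all illustrative assumptions. At each step the waiting time is drawn from an exponential distribution whose rate is the total propensity, and the reaction that fires is chosen with probability proportional to its own propensity.

```python
import random


def gillespie(initial, reactions, t_end, rng):
    """Direct-method stochastic simulation.

    `reactions` is a list of (propensity_fn, change) pairs, where `change`
    maps a species name to the integer by which its count changes.
    Returns the trajectory as a list of (time, state) pairs.
    """
    state = dict(initial)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        propensities = [prop(state) for prop, _change in reactions]
        total = sum(propensities)
        if total <= 0:
            break  # nothing can fire any more
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            break
        # Pick the reaction with probability propensity / total.
        (index,) = rng.choices(range(len(reactions)), weights=propensities)
        for species, delta in reactions[index][1].items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Made-up example: reversible binding of two species into a dimer.
reactions = [
    (lambda s: 0.01 * s["cyclin"] * s["cdk"], {"cyclin": -1, "cdk": -1, "dimer": +1}),
    (lambda s: 0.1 * s["dimer"], {"cyclin": +1, "cdk": +1, "dimer": -1}),
]
start = {"cyclin": 50, "cdk": 50, "dimer": 0}
traj = gillespie(start, reactions, t_end=5.0, rng=random.Random(42))
print(len(traj), traj[-1])
```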
Eukaryotic cell cycle • Interphase • G1: growth phase, synthesis of organelles • S: synthesis of DNA (replication) • G2: growth; synthesis of proteins essential to cell division • Mitosis • prophase • metaphase • anaphase • telophase • [Figure: cycle duration in human liver cells]
Nasmyth’s model (1996) • Cycle with two states (G1 and S-G2-M) separated by two irreversible transitions, START and FINISH. • At START a cell confirms that internal and external conditions are favorable for a new round of DNA synthesis and division and commits itself to the process. • START is triggered by the activity of a protein kinase (CDK) associated with a cyclin subunit. • When DNA replication is complete and all the chromosomes are aligned, the second transition of the cycle (FINISH) drives the cell into anaphase. • FINISH is accomplished by proteolytic machinery (APC) that inhibits the activity of the cyclin/CDK dimer. • CDK = Cyclin-Dependent Kinase; APC = Anaphase-Promoting Complex
The molecular mechanism • [Figure labels: START, FINISH, degraded cyclin, degraded CKI] • CDK activity drives the cell through S phase, G2 phase and up to metaphase. • CDK and APC are antagonistic proteins: • APC destroys CDK activity by degrading cyclin, and • cyclin/CDK dimers inactivate APC by phosphorylating some of its subunits. • Moreover, cyclin/CDK dimers can also be put out of commission by stoichiometric binding with an inhibitor (CKI). • CDK = Cyclin-Dependent Kinase; APC = Anaphase-Promoting Complex; CKI = Cyclin-dependent Kinase Inhibitor
Fundamental antagonism • APC: 12 polypeptides + 2 auxiliary proteins, CDH1 and CDC20 • The APC extinguishes CDK activity by destroying its cyclin partners, whereas cyclin/CDK dimers inhibit APC activity by phosphorylating CDH1. • Two alternative stable steady states of the cell cycle: • G1 state with high CDH1/APC activity and low cyclin/CDK activity • S-G2-M state with high cyclin/CDK activity and low CDH1/APC activity.
BioSPI specification • SYSTEM = CYCLIN | CDK | CDH1 | CDC14 | CKI | CLOCK
BioSPI Simulations • [Plot: number of molecules vs. time (min); labeled series: CYCLIN_BOUND] • Fictitious values for the initial number of molecules
The RTK-MAPK pathway • [Diagram labels: GF, RTK, SHC, GRB2, SOS, RAS, GAP, RAF, MKK1, MP1, ERK1, MKP1, PP2A, IEP, J, F, IEG] • 16 molecular species • 24 domains; 15 sub-domains • Four cellular compartments • Binding, dimerization, phosphorylation, de-phosphorylation, conformational changes, translocation • ~100 literature articles • 250 lines of code
A success story • A simulation of extravasation in multiple sclerosis has highlighted a new behavior of leukocytes, proved in lab experiments a posteriori. • Lymphocyte in the hematic flow, at the endothelium: 1. Tethering and rolling 2. Firm arrest 3. Diapedesis • Selectins/Mucins: PSGL-1 / E- & P-Selectin • Integrins: α4β1/VCAM-1, LFA-1/ICAM-1 • Activation of G protein; activation of integrins
Results Prediction of rolling cells percentage as a function of vessel diameters
Recent evolutions • First attempt: lambda-calculus -- Buss, Fontana (no concurrency) • Second attempt: (stochastic) pi-calculus -- Priami, Regev, Shapiro, Silverman • Then: BioAmbients, Brane Calculi -- Cardelli et al. • Core Formal Biology, CCS-R -- Danos et al. • Beta binders -- Priami, Quaglia
Conclusions • We have a lot to do, but we are in a position to win the challenge, if: • we establish a P2P collaboration between BIO and IT • we find a common language and common expectations • we set up interdisciplinary curricula and carry out interdisciplinary research projects • Unique opportunity to change future life science, but also future computer science
Acknowledgements • Bioinformatics group at the University of Trento: Corrado Priami, Paola Quaglia, Daniel Errampalli, Katerina Pokozy, Federica Ciocchetta, Claudio Eccher, Paola Lecca, Radu Mardare, Davide Prandi, Debora Schuch da Rosa, Alex Vagin, Alessandro Romanel • www.dit.unitn.it/~bioinfo