Molecule as Computation Ehud Shapiro Weizmann Institute of Science Joint work with Aviv Regev and Bill Silverman In collaboration with Corrado Priami, Naama Barkai and Luca Cardelli
The talk has three parts: • Briefly introduce molecular biology • Computer-based consolidation of molecular biology • Our work on helping this happen
Pentium II vs. E. Coli
E. Coli: • 1 million macromolecules • 1 million bytes of static genetic memory • 1 million amino-acids per second
Pentium II: • 3 million transistors • 1/4 million bytes of memory • 80 million operations per second
Comparison courtesy of Erik Winfree
[Figure: Pentium II and E. Coli, each shown with a 1 micron scale bar]
Inside E. Coli (1Mbyte)
Ribosomes in operation Ribosomes translate RNA to Proteins RNA Polymerase transcribes DNA to RNA
Ribosomes in operation (= protein) Computationally: A stateless string transducer from the RNA alphabet of nucleic acids to the Protein alphabet of amino acids
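The string-transducer view of the ribosome can be sketched in a few lines of Python. This is an illustration added here, not code from the talk: the name `translate` and the compact way of building the codon table are assumptions of this sketch. The table is the standard genetic code, and reading simply starts at position 0 instead of searching for a start codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(rna: str) -> str:
    """Map an RNA string to a protein string, one codon (3 letters) at a time.

    No state is carried between codons other than the read position,
    which is what makes this a simple string transducer.
    """
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For instance, `translate("AUGUUUGGCUAA")` reads the codons AUG, UUU, GGC and then stops at UAA.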
Sequences and String Transducers Ribosomes translate RNA to Proteins RNA Polymerase transcribes DNA to RNA
Molecular Biology in One Slide • Sequence: Sequence of DNA and Proteins
What about “The Rest” of biology: the function, activity and interaction of molecular systems in cells?
The “New Biology” • The cell as an information processing device • Cellular information processing and passing are carried out by networks of interacting molecules • Ultimate understanding of the cell requires an information processing model • Which?
“We have no real ‘algebra’ for describing regulatory circuits across different systems...” - T. F. Smith (TIG 14:291-293, 1998) “The data are accumulating and the computers are humming, what we are lacking are the words, the grammar and the syntax of a new language…” - D. Bray (TIBS 22:325-326, 1997)
Our Proposal: Molecule as Computational Process A system of interacting molecular entities is described and modelled by a system of interacting computational entities. “Cellular Abstractions: Cells as Computation”, to appear in Nature, September 26th, 2002
Composition of two processes is a process, therefore: • Molecular ensembles as processes • Molecular networks as processes • Cells as processes (virtual cell) • Multi-cellular organisms as processes • Collections of organisms as processes
Towards “Molecule as Process” • Use the π-calculus process algebra as molecule description language
The π-calculus (Milner, Walker and Parrow 1989) • A program specifies a network of interacting processes • Processes are defined by their potential communication activities • Communication occurs on complementary channels, identified by names • Message content: Channel name
Na + Cl ⇌ Na+ + Cl-
Na | Na | … | Na | Cl | Cl | … | Cl
Na ::= e ! [] , Na_plus .
Na_plus ::= e ? [] , Na .
Cl ::= e ? [] , Cl_minus .
Cl_minus ::= e ! [] , Cl .
Processes, guarded communication, alternation between two states.
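The alternation between two states through guarded communication can be mimicked with a small Python sketch. It is an added illustration under stated assumptions, not BioSPI code: the names `PROCESSES` and `step` are hypothetical, and a "communication" is modelled simply as picking one process offering a send on channel e and one offering a receive, then moving both to their next state. As in the definitions above, there is a single channel e, so any sender may pair with any receiver.

```python
import random

# Each state offers exactly one action on channel e: (direction, next_state).
PROCESSES = {
    "Na":       ("send", "Na_plus"),
    "Na_plus":  ("recv", "Na"),
    "Cl":       ("recv", "Cl_minus"),
    "Cl_minus": ("send", "Cl"),
}

def step(soup, rng):
    """Perform one communication on channel e, updating `soup` in place.

    `soup` is a list of state names, one entry per process instance.
    Returns False if no sender/receiver pair is available.
    """
    senders = [i for i, s in enumerate(soup) if PROCESSES[s][0] == "send"]
    receivers = [i for i, s in enumerate(soup) if PROCESSES[s][0] == "recv"]
    if not senders or not receivers:
        return False
    i, j = rng.choice(senders), rng.choice(receivers)
    soup[i] = PROCESSES[soup[i]][1]  # sender moves to its next state
    soup[j] = PROCESSES[soup[j]][1]  # receiver moves to its next state
    return True
```

Starting from `["Na"] * 3 + ["Cl"] * 3` and calling `step` repeatedly, the number of sodium processes (Na plus Na_plus) and of chlorine processes (Cl plus Cl_minus) stays fixed, while individual processes flip between their two states.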
The RTK-MAPK pathway • 16 molecular species • 24 domains; 15 sub-domains • Four cellular compartments • Binding, dimerization, phosphorylation, de-phosphorylation, conformational changes, translocation • ~100 literature articles • 250 lines of code
[Figure: pathway diagram with labels GF, RTK, SHC, GRB2, SOS, RAS, GAP, RAF, MKK1, ERK1, MP1, MKP1, PP2A, IEG, IEP, J, F]
Molecular systems with π-calculus • Can express, qualitatively, the behavior of many complex molecular systems • Cannot express quantitative aspects
Towards “Molecule as Process” • Use the π-calculus process algebra as molecule description language • Provide a biochemistry-oriented stochastic extension (with Corrado Priami)
Stochastic π-Calculus (Priami 1995; Regev, Priami, Shapiro, Silverman 2000) • Every channel x has an attached base rate r • A global (external) clock is maintained • The clock is advanced and a communication is selected according to a race condition • Rate calculation and race condition adapted for chemical reactions: • Rate(A + B → C) = BaseRate × [A] × [B] • [A] = number of A’s willing to communicate with B’s • [B] = number of B’s willing to communicate with A’s
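The rate rule and the race between channels can be written out as a short Python sketch. The function names `reaction_rate` and `race_probabilities` are hypothetical, and the numbers in the usage note are illustrative only. The sketch assumes each channel fires after an exponentially distributed delay, in which case the chance that a channel wins the race is its rate divided by the sum of all rates.

```python
def reaction_rate(base_rate, n_a, n_b):
    """Rate(A + B -> C) = BaseRate * [A] * [B].

    n_a and n_b are counts of processes currently willing to
    communicate with each other, not concentrations.
    """
    return base_rate * n_a * n_b

def race_probabilities(rates):
    """Probability that each channel wins the race, given its current rate."""
    total = sum(rates.values())
    if total == 0:
        return {name: 0.0 for name in rates}
    return {name: rate / total for name, rate in rates.items()}
```

With a base rate of 100, five A's and four B's, `reaction_rate(100, 5, 4)` gives 2000. If a second channel currently has rate 500, `race_probabilities({"x": 2000, "y": 500})` assigns the first channel a probability of 0.8.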
BioSPI implementation: π-calculus + Gillespie’s algorithm • Gillespie (1977): Accurate stochastic simulation of chemical reactions • The BioSPI system: • Compiles (full) π-calculus • Runtime incorporates Gillespie’s algorithm
Na + Cl ⇌ Na+ + Cl-
global(e1(100),e2(10)).
Na ::= e1 ! [] , Na_plus .
Na_plus ::= e2 ? [] , Na .
Cl ::= e1 ? [] , Cl_minus .
Cl_minus ::= e2 ! [] , Cl .
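What the runtime does with this program can be approximated by a minimal Gillespie-style simulation in Python. This is a sketch under stated assumptions, not the BioSPI implementation: the function name `gillespie_nacl`, its parameters and the returned trajectory format are all hypothetical. The base rates 100 and 10 are taken from the `global` declaration above, with e1 driving the forward reaction and e2 the reverse one.

```python
import random

def gillespie_nacl(n_na, n_cl, k_fwd=100.0, k_rev=10.0, t_end=0.01, seed=0):
    """Simulate Na + Cl <-> Na+ + Cl- with Gillespie's direct method.

    Returns a list of (time, number of Na, number of Na+) tuples.
    """
    rng = random.Random(seed)
    t = 0.0
    na, cl, na_plus, cl_minus = n_na, n_cl, 0, 0
    trajectory = [(t, na, na_plus)]
    while True:
        a_fwd = k_fwd * na * cl              # channel e1: Na sends, Cl receives
        a_rev = k_rev * na_plus * cl_minus   # channel e2: Cl- sends, Na+ receives
        a_total = a_fwd + a_rev
        if a_total == 0:
            break
        t += rng.expovariate(a_total)        # advance the global clock
        if t > t_end:
            break
        if rng.random() * a_total < a_fwd:   # race: pick the winning channel
            na, cl, na_plus, cl_minus = na - 1, cl - 1, na_plus + 1, cl_minus + 1
        else:
            na, cl, na_plus, cl_minus = na + 1, cl + 1, na_plus - 1, cl_minus - 1
        trajectory.append((t, na, na_plus))
    return trajectory
```

Calling `gillespie_nacl(20, 20)` yields a trajectory in which the Na count first drops and then fluctuates, since the forward base rate is ten times the reverse one. The default `t_end` is kept small because the event rate is high at these base rates.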
Programming Experience with Stochastic Pi Calculus • Taught a semester-long M.Sc. course (available online) with lots of examples, exercises and final projects • Textbook examples from chemistry, organic chemistry, enzymatic reactions, metabolic pathways, signal-transduction pathways…
Circadian Clocks J. Dunlap, Science (1998) 280 1548-9
The circadian clock machinery (Barkai and Leibler, Nature 2000) Differential rates: Very fast, fast and slow
[Figure: for each of A and R, a gene (A_GENE, R_GENE) with promoter (PA, PR) is transcribed to RNA (A_RNA, R_RNA), which is translated via its UTR (UTRA, UTRR) to protein (A, R); RNA and proteins are degraded]
The machinery in π-calculus: “A” molecules

A_Gene:
A_GENE ::= PROMOTED_A + BASAL_A
PROMOTED_A ::= pA ? {e} . ACTIVATED_TRANSCRIPTION_A(e)
BASAL_A ::= bA ? [] . (A_GENE | A_RNA)
ACTIVATED_TRANSCRIPTION_A ::= t1 . (ACTIVATED_TRANSCRIPTION_A | A_RNA) + e ? [] . A_GENE

A_RNA:
A_RNA ::= TRANSLATION_A + DEGRADATION_mA
TRANSLATION_A ::= utrA ? [] . (A_RNA | A_PROTEIN)
DEGRADATION_mA ::= degmA ? [] . 0

A_protein:
A_PROTEIN ::= (new e1,e2,e3) PROMOTION_A-R + BINDING_R + DEGRADATION_A
PROMOTION_A-R ::= pA ! {e2} . e2 ! [] . A_PROTEIN + pR ! {e3} . e3 ! [] . A_PROTEIN
BINDING_R ::= rbs ! {e1} . BOUND_A_PROTEIN
BOUND_A_PROTEIN ::= e1 ? [] . A_PROTEIN + degpA ? [] . e1 ! [] . 0
DEGRADATION_A ::= degpA ? [] . 0
The machinery in π-calculus: “R” molecules

R_Gene:
R_GENE ::= PROMOTED_R + BASAL_R
PROMOTED_R ::= pR ? {e} . ACTIVATED_TRANSCRIPTION_R(e)
BASAL_R ::= bR ? [] . (R_GENE | R_RNA)
ACTIVATED_TRANSCRIPTION_R ::= t2 . (ACTIVATED_TRANSCRIPTION_R | R_RNA) + e ? [] . R_GENE

R_RNA:
R_RNA ::= TRANSLATION_R + DEGRADATION_mR
TRANSLATION_R ::= utrR ? [] . (R_RNA | R_PROTEIN)
DEGRADATION_mR ::= degmR ? [] . 0

R_protein:
R_PROTEIN ::= BINDING_A + DEGRADATION_R
BINDING_A ::= rbs ? {e} . BOUND_R_PROTEIN
BOUND_R_PROTEIN ::= e1 ? [] . A_PROTEIN + degpR ? [] . e1 ! [] . 0
DEGRADATION_R ::= degpR ? [] . 0
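BioSPI simulation [Figure: simulated levels of A and R] Robust to random perturbations
The A hysteresis module • The entire population of A molecules (gene, RNA, and protein) behaves as one bi-stable module [Figure: A switches between ON and OFF states, with fast transitions, and interacts with R]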
Hysteresis module

ON:
ON_H_MODULE(CA) ::= {CA<=T1} . OFF_H_MODULE(CA) + {CA>T1} . (rbs ! {e1} . ON_DECREASE + e1 ! [] . ON_H_MODULE + pR ! {e2} . (e2 ! [] . 0 | ON_H_MODULE) + t1 . ON_INCREASE)
ON_INCREASE ::= {CA++} . ON_H_MODULE
ON_DECREASE ::= {CA--} . ON_H_MODULE

OFF:
OFF_H_MODULE(CA) ::= {CA>T2} . ON_H_MODULE(CA) + {CA<=T2} . (rbs ! {e1} . OFF_DECREASE + e1 ! [] . OFF_H_MODULE + t2 . OFF_INCREASE)
OFF_INCREASE ::= {CA++} . OFF_H_MODULE
OFF_DECREASE ::= {CA--} . OFF_H_MODULE