From Molecular Computing to Molecular Programming. 萩谷 昌己 Masami Hagiya. JSPS Project on Molecular Computing. Funded by Japan Society for Promotion of Science Research for the Future Program biocomputing field ( 生命情報 ) chaired by Prof. Anzai molecular computing
From Molecular Computing to Molecular Programming 萩谷 昌己 Masami Hagiya
JSPS Project on Molecular Computing • Funded by Japan Society for Promotion of Science • Research for the Future Program • biocomputing field (生命情報) chaired by Prof. Anzai • molecular computing • artificial cell (chemical IC) • evolutionary computation • signal transduction • complex systems • October 1996 - March 2001
JSPS Project on Molecular Computing • Project Leader - Masami Hagiya (Computer Science) • Members • Takashi Yokomori (Computer Science) • Masayuki Yamamura (Computer Science) • Masanori Arita (Genome Informatics) • Akira Suyama (Biophysics) • Yuzuru Husimi (Biophysics) • Kensaku Sakamoto (Biochemistry) • Shigeyuki Yokoyama (Biochemistry)
Goals of Molecular Computing • Analyses and Applications of Computational Power of Biomolecules • Understanding Life from the Viewpoint of Computation • computational mystery of life • origin of life, wet artificial life • Engineering Applications (not restricted to computation) • combinatorial optimization • gene expression analysis • nanotechnology, nanomachine • cryptography • medical and pharmaceutical applications in the future • New Computational Model, New Simulation Technology
Major Achievements • Suyama’s Dynamic Programming DNA Computers • reduction of molecules by breadth-first search • automation by robots • Sakamoto’s Hairpin Engines • Whiplash PCR and SAT Engine • molecular computation by hairpin formation • autonomous molecular computation • Theoretical Studies by Yokomori’s Group • Nishikawa’s Simulator for DNA computations • Arita’s New Tool for Code Design • Husimi’s 3SR-Based Evolutionary Reactor • Yamamura’s Aqueous Computing (with Head)
Adleman-Lipton Paradigm • Adleman (Science, 1994) • Solving Hamiltonian Path Problem by DNA • Lipton, et al. • Solving SAT Problem by DNA • Massively Parallel Computation by Molecules • Mainly for Combinatorial Optimization • Random Generation by Self-Assembly • solution candidate = DNA molecule • Selection by Molecular Biology Experiments • Scaling Up ⇒ Efforts to increase yields and reduce errors • Robot and Chemical IC
Dynamic Programming DNA Computer • “counting” (Ogihara and Ray) • “dynamic programming” (Suyama) • Iteration of Generation and Selection • Generation of Candidates of Partial Solutions • Selection of Solutions • The order of computational complexity does not decrease, but the number of molecules required is drastically reduced. • 3-SAT
Brute Force vs. Dynamic Programming DNA Computers • Brute Force: large pool size, low reaction rate; generating the WHOLE solution space AT ONCE ⇒ Selection ⇒ Solution • Dynamic Programming: small pool size, high reaction rate; generating the PARTIAL solution space STEP BY STEP ⇒ Selection ⇒ Solution
Basic Operations for Dynamic Programming DNA Computers • get (T, +s), get (T, -s): get DNA molecules with a subsequence s (without s) in a tube T • append (T, s, e): append a subsequence s at the end of DNA molecules in a tube T, using a splint e • merge (T, T1, T2, …, Tn): merge DNA molecules in tubes T1, T2, …, Tn into a tube T • amplify (T, T1, T2, …, Tn): amplify DNA molecules in a tube T and divide them into tubes T1, T2, …, Tn • detect (T): detect DNA molecules in a tube T
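The five tube operations can be mimicked in software to check an algorithm before it goes to the bench. The following is a minimal sketch, not the project's implementation: it assumes a tube is a Python list of strands and a strand is a tuple of subsequence labels, and the `keep_with` flag and the copy count `n` are hypothetical conveniences.

```python
# Hypothetical in-silico model: a tube is a list of strands,
# and a strand is a tuple of subsequence labels such as ("X1T", "X2F").

def get(tube, s, keep_with=True):
    """get(T, +s) when keep_with is True, get(T, -s) otherwise."""
    return [strand for strand in tube if (s in strand) == keep_with]

def append(tube, s, e=None):
    """append(T, s, e): add s to the end of every strand.
    The splint e only matters chemically, so it is ignored here."""
    return [strand + (s,) for strand in tube]

def merge(*tubes):
    """merge(T, T1, ..., Tn): pour all tubes into one."""
    return [strand for tube in tubes for strand in tube]

def amplify(tube, n):
    """amplify(T, T1, ..., Tn): copy the contents into n new tubes."""
    return [list(tube) for _ in range(n)]

def detect(tube):
    """detect(T): report whether any strand is present."""
    return len(tube) > 0

# Extend a one-variable pool to two variables.
t_true, t_false = amplify([("X1T",), ("X1F",)], 2)
pool = merge(append(t_true, "X2T"), append(t_false, "X2F"))
print(get(pool, "X1T"))          # strands that contain X1T
print(detect(get(pool, "X3T")))  # False: no strand carries X3T yet
```

The model ignores yields, errors, and the fact that a real get splits one tube into two fractions; it only captures the logical effect of each operation.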
Implementation of Basic Operations • append (T, s, e): annealing and ligation (Taq DNA ligase) ⇒ immobilization and cold wash ⇒ hot wash • amplify (T, T1, T2, …, Tn): annealing (PCR) ⇒ immobilization and cold wash ⇒ hot wash and divide into T1, T2, …, Tn • get (T, +s), get (T, -s): annealing ⇒ immobilization ⇒ cold wash yields get (T, -s) ⇒ hot wash yields get (T, +s)
DP algorithm for 3CNF-SAT • k's loop: k ranges over variable indices • j's loop: j ranges over clause indices • if xk is the 3rd literal of the j-th clause, then remove those assignments which satisfy neither the 1st nor the 2nd literal, and append XkF to the remaining assignments (do similarly, appending XkT, if ¬xk is the 3rd literal) • [Figure: example step for k = 4 with clauses over x2, x3, x4 (negation signs not recoverable from the extraction); strands shown include X1F X2T X3T, X1F X2F X3T, X1T X2T X3F, and X1T X2F X3F, with X4F appended]
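The loop structure above can be sketched in software. This is an illustration under stated assumptions rather than the laboratory protocol: a literal is a hypothetical `(variable_index, polarity)` pair, each clause mentions three distinct variables, and the "3rd literal" of a clause is taken to be the one with the largest variable index. The example formula is made up.

```python
def satisfies(assignment, literal):
    """True if the partial assignment makes the literal true."""
    var, polarity = literal
    return assignment[var - 1] == polarity

def earlier_ok(assignment, earlier):
    """True if the assignment satisfies the 1st or the 2nd literal."""
    return any(satisfies(assignment, lit) for lit in earlier)

def dp_sat(num_vars, clauses):
    """Return all satisfying assignments, grown one variable at a time."""
    pool = {()}  # a single empty partial assignment
    for k in range(1, num_vars + 1):
        pool_true = set(pool)   # strands that will receive XkT
        pool_false = set(pool)  # strands that will receive XkF
        for clause in clauses:
            *earlier, (var, polarity) = sorted(clause)
            if var != k:
                continue  # this clause is checked at another step
            if polarity:
                # xk is the 3rd literal: XkF is allowed only if an
                # earlier literal already satisfies the clause.
                pool_false = {a for a in pool_false if earlier_ok(a, earlier)}
            else:
                # ¬xk is the 3rd literal: the symmetric case for XkT.
                pool_true = {a for a in pool_true if earlier_ok(a, earlier)}
        pool = {a + (True,) for a in pool_true} | {a + (False,) for a in pool_false}
    return pool

# Made-up formula: (x1 ∨ x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x4) ∧ (¬x2 ∨ x3 ∨ ¬x4)
clauses = [
    [(1, True), (2, True), (3, False)],
    [(1, False), (2, True), (4, True)],
    [(2, False), (3, True), (4, False)],
]
solutions = dp_sat(4, clauses)
print(len(solutions), "satisfying assignments")
```

Because every clause is checked exactly once, at the step where its highest-indexed variable is appended, the pool never contains a partial assignment that already violates a clause. That is the source of the reduction in molecules; the number of steps still grows with the formula.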
Confirmation of the Solution by PCR (k=4) [Figure: gel image; lane labels M, M, S, no S]
Amount of DNA for Computation: Brute Force vs. Dynamic Programming (100-variable 3-CNF SAT) • Adleman-Lipton’s Brute Force DNA Computers: 2×10¹² g of dsDNA (1×10¹² g of ssDNA) • Dynamic Programming DNA Computers: 2×10⁻³ g of dsDNA (1×10⁻³ g of ssDNA)
Number of Variables vs. Number of Molecules [Figure: plot; labels include 4×10¹⁶, 1×10¹⁵, 4×10¹³, and n = 100]
On Scaling Up the Size of Computations • The size of random pools currently used: 2¹⁰ … 3¹⁰, rapidly increasing. • The number of molecules in a test tube: 10¹⁰ … 10¹² • We will soon reach the limit on the number of molecules in a test tube. That is, we will fully utilize the parallelism of molecules in a single tube. ⇒ Multiple Tubes, Chemical IC, Cells, etc. • But reaching the limit is the current goal.
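A short calculation shows how close those pool sizes are to the tube limit. The sketch below assumes, optimistically, that a single molecule per candidate is enough; real experiments need many copies of each candidate, so the practical limits are lower. The function name is hypothetical.

```python
def max_exponent(base, capacity):
    """Largest n with base**n <= capacity, using exact integer arithmetic."""
    n = 0
    while base ** (n + 1) <= capacity:
        n += 1
    return n

# Tube capacities quoted on the slide: 10^10 to 10^12 molecules.
for capacity in (10**10, 10**12):
    print(capacity, max_exponent(2, capacity), max_exponent(3, capacity))
```

Under this assumption a tube of 10¹⁰ molecules holds at most a 2³³ or 3²⁰ pool, and a tube of 10¹² molecules at most a 2³⁹ or 3²⁵ pool.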
Automation of DNA Computations • Robot for DNA Computing Based on MAGTRATION™ • Automation of the Get Command: annealing ⇒ immobilization ⇒ cold wash yields get (T, -s) ⇒ hot wash yields get (T, +s)
Programming in DNA Computer • protocol-level • script-level • Pascal/C-level
[Instrument]
[Reset Counter] 0
[Home Position] 0
[MJ-Open Lid]
...
[Get1(0)]
[Get2(1)]
[Append(2)]
...
[Exit]
(1-1-4)
[MJ-Open Lid]
Do 2
  _SEND "LID OPEN"
  Do 10
    _SEND "LID?"
    Wait_msec 500
    _CMP_GSTR "OPEN"
    IF_Goto EQ 0 ;open
    Wait_msec 1000
  Loop
Loop
; Time out
End
;open
Autonomous Molecular Computing • Adleman-Lipton Paradigm • Generation of Candidates = Autonomous Reaction • Selection of Solutions = Operations from Outside • One-Pot Reaction ⇒ Autonomous Computation • Computation by Successive Autonomous Reactions by Molecules • Winfree’s DNA Tile • Sakamoto’s Hairpin Engines • Whiplash PCR and SAT Engine
Hairpin Engines • Molecular Computation by Hairpin Formation • Hairpin --- Typical Secondary Structure • Whiplash PCR • DNA Automaton: State Machine by DNA • 5 Transitions in a Control Experiment • SAT Engine • Selection by Hairpin Structures of DNA • 3‐SAT: 6-Variable 10-Clause Formula
SAT Engine • Sakamoto et al., Science, May 19, 2000. • Selection by Hairpin Structures of DNA • digestion by restriction enzyme • exclusive PCR • 3-SAT • ssDNA consisting of literals, each selected from a clause • complementary literal = complementary sequence • detection of inconsistency ⇒ hairpin • The essential part of the SAT computation is done by hairpin formation. • Autonomous Molecular Computation
(a∨b∨c)∧(¬d∨e∨¬f)∧ … ∧(¬c∨¬b∨a)∧ ... [Figure: a literal string containing both b and ¬b folds into a hairpin, which is then removed by digestion by restriction enzyme and exclusive PCR]
Selection by Hairpin Structures • Digestion by Restriction Enzyme • Hairpins are cut at the restriction site inserted in each literal sequence. • Exclusive PCR • PCR is inefficient for hairpins. • In exclusive PCR, the solution is diluted in each cycle to maintain the difference in amplification. • The number of steps is independent of the number of variables or clauses.
Generation of Random Pool • Example: (a∨b∨c)∧(d∨e∨f)∧(g∨h∨i)∧(j∨k∨l) • The literal sequences a, b, c, …, l are chemically synthesized. • Each strand in the pool takes one literal from each clause. • [Figure: strand layout with BstXI and BstNI restriction sites; segment length labels 4 9 5 8 4 4 4 5 4 4 4 5 4 30]
6-Variable 10-Clause Formula (a∨b∨!c)∧(a∨c∨d)∧(a∨!c∨!d)∧(!a∨!c∨d)∧ (a∨!c∨e)∧(a∨d∨!f)∧(!a∨c∨d)∧(a∨c∨!d)∧ (!a∨!c∨!d)∧(!a∨c∨!d) ! = ¬
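The selection principle of the SAT Engine can be imitated on this formula. The sketch below is a logical model only, with hypothetical helper names: a strand is one literal picked from each clause, and a strand is assumed to form a hairpin exactly when it carries some literal together with its complement. Restriction digestion and exclusive PCR are reduced to a filter.

```python
from itertools import product

# The 6-variable, 10-clause formula from the slide; "!" marks negation.
FORMULA = [
    ("a", "b", "!c"), ("a", "c", "d"), ("a", "!c", "!d"), ("!a", "!c", "d"),
    ("a", "!c", "e"), ("a", "d", "!f"), ("!a", "c", "d"), ("a", "c", "!d"),
    ("!a", "!c", "!d"), ("!a", "c", "!d"),
]

def complement(literal):
    """Return the complementary literal: a <-> !a."""
    return literal[1:] if literal.startswith("!") else "!" + literal

def forms_hairpin(strand):
    """A strand folds if it carries a literal together with its complement."""
    present = set(strand)
    return any(complement(lit) in present for lit in strand)

# Random pool: every way of picking one literal per clause (3**10 strands).
pool = list(product(*FORMULA))

# Selection: discard every strand that would form a hairpin.
survivors = [strand for strand in pool if not forms_hairpin(strand)]

print(len(pool), len(survivors))
print(sorted(set().union(*survivors)))  # literals used by surviving strands
```

In this model 24 of the 59,049 strands survive, and together they use only the literals !a, b, c, !d, e, !f. That corresponds to the single satisfying assignment a = false, b = true, c = true, d = false, e = true, f = false.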
Whiplash PCR • DNA Automaton: State Machine by DNA • Polymerization of a Hairpin Form • Polymerization Stop • Autonomous SIMD Computation of Boolean μ-formulas • Solving NP-Complete Problems in O(1)-Step • e.g., vertex cover • vertex cover candidate = transition table = ssDNA • vertex cover = transition table that reaches the final state • 5 Transitions in a Control Experiment
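The state-machine view of Whiplash PCR can be sketched as follows. This is a simplified, deterministic model and not the chemistry: the transition table encoded on the strand is assumed to be a Python dict from current state to next state, and each loop iteration stands for one cycle in which the 3' end anneals to its rule and is extended until polymerization stops. The function name and `max_steps` are hypothetical.

```python
def whiplash_run(transitions, start, final, max_steps=10):
    """Follow transitions from `start` and return the list of visited states."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == final or state not in transitions:
            break  # final state reached, or no rule matches the 3' end
        state = transitions[state]  # one anneal-and-extend cycle
        path.append(state)
    return path

table = {"A": "B", "B": "C"}          # rules A -> B and B -> C
print(whiplash_run(table, "A", "C"))  # ['A', 'B', 'C']
```

A candidate counts as a solution when its run ends in the final state, which mirrors the slide's reading of a vertex cover as a transition table that reaches the final state. Since every strand runs its own table in the same tube, all candidates are evaluated in parallel.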
Whiplash PCR [Figure: four animation frames of a hairpin-forming strand with segments labeled A, B, C, and x, to which a, b, and c are added in successive frames; a final panel carries the labels 7 6 5 4 3 2 1 0]
A Perspective on Molecular Computing • Structure Formation = Computation • probabilistic process • governed by thermodynamics and kinetics • Molecular Computation as • Probabilistic (Randomized) Computation • It should be analyzed by • complexity theory (for random algorithms) • thermodynamics and kinetics. • cf. Winfree • Computational Mystery of Life • Why is life so efficient computationally? • protein folding • gene regulation, signal transduction, etc.
Molecular Programming • Designing and controlling biomolecular reactions • Biomolecules (DNA,RNA,protein) --- combinatorial complexity • Molecular program --- two parts • the part encoded in molecules themselves (e.g., DNA seq.) • We should go beyond simple code design. • the part implemented by a sequence of lab. operations • Molecular programming ... • controls conformational change and self-assembly by coordinating the two parts. • Various applications • gene expression analysis • nanotechnology and nanomachine • combinatorial chemistry
Simple Example of Molecular Programming • PCR (Polymerase Chain Reaction) • primers • various parameters • high temperature and period • low temperature and period • polymerase (enzyme) • Molecular programming in PCR ... • Designing primer sequences • Setting various parameters • Selecting polymerase
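The two parts of a molecular program can be made concrete for PCR. In the sketch below the template sequence, the primer length of 16, and all cycling values are made-up placeholders, not recommendations. The melting temperature uses the Wallace rule of thumb, 2 °C per A/T plus 4 °C per G/C, which is a rough estimate for short primers and is not taken from the slides.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def wallace_tm(primer):
    """Rough melting temperature in °C: 2*(A+T) + 4*(G+C)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

# Part 1: what is encoded in the molecules (made-up template).
template = "ATGGCTAGCTAGGATCCGATTACGCTAGCTTAA"
forward = template[:16]
reverse = reverse_complement(template[-16:])

# Part 2: what is implemented by lab operations (placeholder values).
program = {
    "high_temperature_c": 94, "high_period_s": 30,
    "low_temperature_c": min(wallace_tm(forward), wallace_tm(reverse)) - 5,
    "low_period_s": 30,
    "polymerase": "Taq",
}
print(forward, wallace_tm(forward))
print(reverse, wallace_tm(reverse))
print(program)
```

The low temperature is derived from the primers, which shows the coordination between the sequence part and the operation part that the slides call molecular programming.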