A few ideas about DNA computing Enrique Blanco (2006)
1. Definition
DNA computing, or molecular computing, can be defined as the use of biological molecules, primarily DNA (or RNA), to solve computational problems that are adapted to this new biological format.
2. Bioinformatics, Biocomputing and DNA computing
• Bioinformatics: Data mining on biological (sequence) data
• Biocomputing: Design of algorithms based on evolutionary laws such as selection or mutation events
• DNA computing: Use of biochemical processes based on DNA to solve mathematical problems
3. Computers vs. DNA computing (I)
1010101011
GATCGACTAC
5. Why do we investigate "other" computers?
• Certain types of problems (learning, pattern recognition, fault-tolerant systems, large set searches, cost optimization) are intrinsically very difficult to solve with current computers and algorithms
• NP-complete problems: We do not know any algorithm that solves them in polynomial time; all of the current exact solutions run in an amount of time that grows exponentially with the size of the problem
• Exponential cost can be tackled with massive parallelism: an exponential number of processors running in parallel could do it
6. Massive parallel machines (potential)
• 6.022 × 10^23 molecules/mole
Massive parallel searches:
• Desktop PC: 10^9 ops/sec
• Supercomputer: 10^12 ops/sec
• 1 µmol of DNA: 10^26 reactions
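The gap between these figures can be made concrete with a little arithmetic. The sketch below takes the slide's numbers at face value (10^9 and 10^12 operations per second, 10^26 reactions) and estimates how long a single serial machine would need to perform the same number of operations. The function name and the year length are illustrative assumptions.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years

def serial_seconds(total_ops: int, ops_per_second: int) -> int:
    """Seconds one machine needs to perform total_ops operations in sequence."""
    return total_ops // ops_per_second

reactions = 10**26  # figure quoted on the slide for 1 µmol of DNA
machines = {"desktop PC": 10**9, "supercomputer": 10**12}

for name, rate in machines.items():
    seconds = serial_seconds(reactions, rate)
    years = seconds / SECONDS_PER_YEAR
    print(f"{name}: {seconds:.0e} s, roughly {years:.1e} years")
```

Integer arithmetic is used for the division so that the powers of ten stay exact.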
7. Advantages of DNA computing:
• Storage capacity: 1 bit per cubic nanometer (1 g of DNA = 1 billion CDs)
• Massive production of DNA molecules with specific properties
• Great energy efficiency (about 10 orders of magnitude more operations per joule)
• Natural chemical interactions between DNA molecules, which follow defined rules to produce new molecules
• Well-known lab techniques for the isolation/identification of product molecules with specific properties: PCR, ligation, gel electrophoresis,...
8. DNA memory:
A DNA string can be viewed as a memory resource to save info:
• 4 types of units (A, C, G, T): numbers in base 4
• Complementary units: A-T, C-G
• Double-stranded strings
ATGGATCAGCTGA
TACCTAGTCGACT
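A minimal sketch of the "DNA as base-4 memory" idea follows. The digit assignment (A=0, C=1, G=2, T=3) and the function names are assumptions made for illustration; any fixed assignment would do. The complement function reproduces the double-stranded example from the slide.

```python
# Assumed mapping of bases to base-4 digits (not taken from the slides).
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def encode(number: int, length: int) -> str:
    """Write a non-negative integer as a fixed-length base-4 DNA string."""
    if number < 0 or number >= 4 ** length:
        raise ValueError("number does not fit in the requested length")
    bases = []
    for _ in range(length):
        number, digit = divmod(number, 4)
        bases.append(DIGIT_TO_BASE[digit])
    return "".join(reversed(bases))  # most significant digit first

def decode(strand: str) -> int:
    """Read a DNA string back as a base-4 integer."""
    value = 0
    for base in strand:
        value = value * 4 + BASE_TO_DIGIT[base]
    return value

def complement(strand: str) -> str:
    """Base-by-base complement, written in the same order as the input."""
    return "".join(COMPLEMENT[base] for base in strand)

print(encode(27, 4))                # ACGT
print(decode("ACGT"))               # 27
print(complement("ATGGATCAGCTGA"))  # TACCTAGTCGACT
```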
9. DNA operators: Lab technology
• Hybridization
• Ligation
• Polymerase Chain Reaction (PCR)
• Gel Electrophoresis
• Affinity Separation
• Restriction Enzymes
10. Hybridization and ligation
• Hybridization: Base-pairing between 2 complementary single-stranded molecules to form a double-stranded DNA molecule
• Ligation: Joining DNA molecules together
11. PCR
• Amplifies (makes identical copies of) selected double-stranded DNA molecules
• 2^n copies after n steps
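The doubling rule can be sketched in a few lines. This is an idealised model that assumes every molecule is copied in every cycle, which real reactions only approximate; the function names are hypothetical.

```python
def copies_after(cycles: int, templates: int = 1) -> int:
    """Ideal PCR yield: each cycle doubles the number of molecules."""
    return templates * 2 ** cycles

def cycles_needed(target_copies: int, templates: int = 1) -> int:
    """Smallest number of ideal cycles that reaches at least target_copies."""
    cycles = 0
    while copies_after(cycles, templates) < target_copies:
        cycles += 1
    return cycles

print(copies_after(10))       # 1024
print(cycles_needed(10**9))   # 30
```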
12. Gel electrophoresis
• Molecular size fractionation technique: detection of specific DNA by length
13. Affinity Separation
• An iron bead is attached to a fragment complementary to a substring
• A magnetic field is then used to pull out all of the DNA fragments containing such a sequence
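As a rough software analogy, affinity separation behaves like a filter on a pool of strands. In the sketch below, the probe fixed to the bead is the complement of the target substring, and a strand is captured when it contains a region that pairs with the probe. The function names and the example pool are assumptions for illustration; strand orientation is ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Base-by-base complement, written in the same order as the input."""
    return "".join(COMPLEMENT[base] for base in strand)

def affinity_separate(pool, target):
    """Split pool into (captured, washed_away) using a bead-bound probe."""
    probe = complement(target)  # fragment attached to the bead
    captured, washed_away = [], []
    for strand in pool:
        # A strand sticks to the bead if some region of it pairs with the probe.
        if complement(probe) in strand:
            captured.append(strand)
        else:
            washed_away.append(strand)
    return captured, washed_away

pool = ["ACTTGCAGTCGGACTG", "GGCTATGTCCGAGCAA"]
captured, washed_away = affinity_separate(pool, target="TCGGACTG")
print(captured)     # ['ACTTGCAGTCGGACTG']
print(washed_away)  # ['GGCTATGTCCGAGCAA']
```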
14. Restriction enzymes
• Cut the DNA at a specific sequence site
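A simplified, single-strand sketch of a restriction digest is shown below. The default site is the EcoRI recognition sequence GAATTC, which is cut between G and AATTC; the function name, the example strand and the single-strand simplification are assumptions made for illustration.

```python
def digest(strand: str, site: str = "GAATTC", cut_offset: int = 1):
    """Cut strand at every occurrence of site, cut_offset bases into the site."""
    fragments = []
    start = 0
    position = strand.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(strand[start:cut])
        start = cut
        position = strand.find(site, position + 1)
    fragments.append(strand[start:])  # remainder after the last cut
    return fragments

print(digest("TTGAATTCAAGAATTCGG"))  # ['TTG', 'AATTCAAG', 'AATTCGG']
```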
15. An example of an NP-complete problem: the Traveling Salesman Problem
• A Hamiltonian path in a graph is a path that visits each node exactly once, starting and ending at given nodes
16. An example of an NP-complete problem: the Traveling Salesman Problem (II)
• TSP: A salesman must go from city A to city Z, visiting the other cities on the way. Some of the cities are linked by plane. Is there any path from A to Z that visits each city exactly once?
• A=ATLANTA Z=DETROIT, YES
• A=BOSTON Z=DETROIT, NO
17. An example of an NP-complete problem: the Traveling Salesman Problem (III)
• Code each city (node) as an 8-unit DNA string
• Code each permitted link as an 8-unit DNA string
• Generate random paths between N cities (exponential)
• Identify the paths starting at A and ending at Z
• Keep only the correct paths (size, Hamiltonian)
18. Coding the paths
• Hybridization and ligation between city molecules and intercity link molecules

Atlanta – Boston:
ACTTGCAGTCGGACTG
    ||||||||
    CGTCAGCC                R: (GCAGTCGG)

(A+B)+Chicago:
ACTTGCAGTCGGACTGGGCTATGT
            ||||||||
            TGACCCGA        R: (ACTGGGCT)

Solution A+B+C+D:
ACTTGCAGTCGGACTGGGCTATGTCCGAGCAA
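The coding scheme on this slide can be reproduced in a short sketch. The city codes below are read off the 32-base solution strand in blocks of 8, which is an inference from the slide rather than something it states; calling the fourth city Detroit (the slide only writes "D") and the function names are likewise assumptions. Each link strand is the complement of the last 4 bases of the source city followed by the first 4 bases of the destination city.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Assumed 8-base city codes, split from the slide's solution strand.
CITY = {
    "Atlanta": "ACTTGCAG",
    "Boston":  "TCGGACTG",
    "Chicago": "GGCTATGT",
    "Detroit": "CCGAGCAA",  # the slide labels this city only as "D"
}

def complement(strand: str) -> str:
    """Base-by-base complement, written in the same order as the input."""
    return "".join(COMPLEMENT[base] for base in strand)

def link(src: str, dst: str) -> str:
    """Link strand bridging the end of src and the start of dst."""
    return complement(CITY[src][-4:] + CITY[dst][:4])

def assemble(path, available_links):
    """Join city strands along path, requiring a link strand at every junction."""
    strand = CITY[path[0]]
    for src, dst in zip(path, path[1:]):
        if link(src, dst) not in available_links:
            raise ValueError(f"no link strand for {src} -> {dst}")
        strand += CITY[dst]
    return strand

links = {link("Atlanta", "Boston"), link("Boston", "Chicago"),
         link("Chicago", "Detroit")}
print(link("Atlanta", "Boston"))   # CGTCAGCC
print(link("Boston", "Chicago"))   # TGACCCGA
print(assemble(["Atlanta", "Boston", "Chicago", "Detroit"], links))
# ACTTGCAGTCGGACTGGGCTATGTCCGAGCAA
```

The two printed link strands match the ones drawn under the city strands on the slide, and the assembled strand matches the slide's solution.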
19. Filter the correct solutions
1. Identify the paths starting at A and ending at Z
• PCR to select the sequences starting with the last nucleotides of A and ending at the first nucleotides of Z
2. Keep only the paths with N cities (N = number of cities)
• Gel electrophoresis
3. Keep only those paths containing all of the cities (once)
• Antibody bead separation for each vertex (city)
The sequences passing all of the steps are the solutions
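The three filtering steps can be mimicked in software as successive filters over a pool of candidate strands. This is only an analogy under stated assumptions: the flight map is hypothetical (the slides do not list the links), the 8-base city codes are inferred from the solution strand on the previous slide, and an exhaustive enumeration of walks stands in for the random ligation that happens in the test tube.

```python
# Assumed 8-base city codes and a hypothetical set of one-way flights.
CITY = {"Atlanta": "ACTTGCAG", "Boston": "TCGGACTG",
        "Chicago": "GGCTATGT", "Detroit": "CCGAGCAA"}
FLIGHTS = {("Atlanta", "Boston"), ("Atlanta", "Detroit"),
           ("Boston", "Atlanta"), ("Boston", "Chicago"),
           ("Boston", "Detroit"), ("Chicago", "Detroit")}

def candidate_strands(max_cities: int):
    """Strands for every walk along FLIGHTS with at most max_cities stops."""
    strands = []
    walks = [[city] for city in CITY]
    while walks:
        walk = walks.pop()
        strands.append("".join(CITY[city] for city in walk))
        if len(walk) < max_cities:
            walks.extend(walk + [dst] for src, dst in FLIGHTS if src == walk[-1])
    return strands

def solve(start: str, end: str):
    n = len(CITY)
    pool = candidate_strands(max_cities=n + 2)  # includes over-long walks
    # Step 1 (PCR analogue): keep strands with the right first and last city.
    pool = [s for s in pool if s.startswith(CITY[start]) and s.endswith(CITY[end])]
    # Step 2 (gel electrophoresis analogue): keep strands of exactly N cities.
    pool = [s for s in pool if len(s) == 8 * n]
    # Step 3 (bead separation analogue): one pass per city, keep strands containing it.
    for code in CITY.values():
        pool = [s for s in pool if code in s]
    return pool

print(solve("Atlanta", "Detroit"))  # ['ACTTGCAGTCGGACTGGGCTATGTCCGAGCAA']
print(solve("Boston", "Detroit"))   # []
```

With this assumed flight map the results agree with the YES/NO answers given earlier for Atlanta to Detroit and Boston to Detroit.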
20. Other classical problems already approached
• The SAT problem (satisfiability of Boolean clauses)
• Breaking the Data Encryption Standard (DES)
• The maximum clique problem
• The knights problem (RNA)
• General-purpose DNA computers?