The Secret of Life: Exploring the Meaning and Code of DNA

This website explores the study of life through computer science and biology, focusing on the role of DNA as a programming language and the encoding of proteins. Discover the fascinating connections between information, computation, and life itself.

Presentation Transcript


  1. Class 37: Secret of Life David Evans http://www.cs.virginia.edu/~evans CS200: Computer Science University of Virginia Computer Science

  2. From Lecture 15: Liberal Arts
     Trivium
     • Grammar: study of meaning in written expression. BNF replacement rules for describing languages, rules of evaluation for meaning
     • Rhetoric: comprehension of verbal and written discourse. (Lecture 15: "Not yet…") Interfaces between components, program and user. Your PS8 web sites are a discourse between user and server.
     • Logic: argumentative discourse for discovering truth. Rules of evaluation, if, recursive definitions
     Quadrivium
     • Arithmetic: understanding numbers. (Lecture 15: "Not much yet… wait until April") Learned to count in Lambda Calculus
     • Geometry: quantification of space. Curves as procedures, fractals
     • Music: number in time. Yes, even if we can’t figure out how to play “Hey Jude!”
     • Astronomy. Yes: Neil deGrasse Tyson says so
     CS 200 Spring 2003

  3. Today is the 50th anniversary of the announcement of the most important scientific discovery of the 20th century!

  4. Eagle Pub, Cambridge UK “Watson, we have discovered the meaning of life!” Francis Crick, 28 February 1953 “Watson, come here, I want to see you.” Alexander Graham Bell, 10 March 1876

  5. “Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid”, Nature, 25 April 1953: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.” http://www.nature.com/genomics/human/watson-crick/watson_crick.pdf

  6. Brief History of Biology
     • Eras (timeline marked at 1850, 1950, 2000): Life is about magic (“vitalism”). Life is about chemistry. Life is about information. Life is about computation.
     • Aristotle (~300 BC): genera and species. Most biologists work on classification.
     • Descartes (1641): explain life mechanically
     • Schrödinger (1944): life is information
     • Watson and Crick (1953): DNA stores the information
     • Next: crack the information code

  7. DNA • Sequence of nucleotides: adenine (A), guanine (G), cytosine (C), and thymine (T) • Two strands: A must attach to T, and G must attach to C

  8. Central Dogma of Biology: DNA → (transcription) → RNA → (translation) → Protein • RNA makes copies of DNA segments • RNA describes sequences of amino acids • Chains of amino acids make proteins. Image from http://www.umich.edu/~protein/

  9. Encoding Proteins • There are 4 nucleotides: adenine (A), guanine (G), cytosine (C), and thymine (T) (replaced with uracil (U) in RNA) • There are 20 different amino acids, and a stop marker (to separate proteins), so 21 things to encode • How many nucleotides are needed to encode one amino acid? With 2, could encode 16 things: 4 * 4. With 3, could encode 64 things: 4 * 4 * 4.
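
The counting argument on this slide can be checked with a minimal sketch. The function name `min_codon_length` is hypothetical and chosen only for illustration; the figures (4 nucleotides, 20 amino acids plus a stop marker) come from the slide.

```python
def min_codon_length(symbols_to_encode: int, alphabet_size: int = 4) -> int:
    """Return the smallest k such that alphabet_size ** k >= symbols_to_encode."""
    k = 0
    while alphabet_size ** k < symbols_to_encode:
        k += 1
    return k


# 20 amino acids plus one stop marker, as stated on the slide.
NEEDED = 20 + 1

for k in (1, 2, 3):
    combos = 4 ** k
    print(f"{k} nucleotide(s): {combos} combinations, enough: {combos >= NEEDED}")

print("minimum codon length:", min_codon_length(NEEDED))
```

Two nucleotides give only 16 combinations, which falls short of 21, so three is the smallest length that works.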

  10. Codons • Three nucleotides encode an amino acid • But there are 64 possible codons and only 20 amino acids, so there may be several different ways to encode the same one. From http://web.mit.edu/esgbio/www/dogma/dogma.html
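
To make the redundancy concrete, here is a sketch that builds the standard RNA codon table and counts how many codons map to each amino acid. The table itself is not on the slide; it is the standard genetic code written with one-letter amino acid abbreviations, and the names `CODON_TABLE` and `translate` are illustrative choices.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# UCAG order (UUU, UUC, UUA, UUG, UCU, ...). '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def translate(rna: str) -> str:
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# How many different codons encode each amino acid (or stop)?
redundancy = Counter(CODON_TABLE.values())
print(len(CODON_TABLE), "codons,", len(redundancy), "distinct meanings")
print("codons for leucine (L):", redundancy["L"])
print(translate("AUGUUUUAA"))
```

Four different codons (GCU, GCC, GCA, GCG) all translate to alanine, so `translate("GCUGCCGCAGCG")` yields the same amino acid four times.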

  11. How Big is the Make-a-Human Program? • 3 Billion Base Pairs • Each nucleotide is 2 bits (4 possibilities) • 3 B pairs * 1 byte/4 pairs = 750 MB • Every sequence of 3 base pairs encodes one of 20 amino acids (or a stop codon) • Only 21 distinct meanings, but 4^3 = 64 possible codons • So, really only 750 MB * (21/64) ~ 250 MB
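
The arithmetic on this slide can be reproduced with a short sketch. It assumes 1 MB = 1,000,000 bytes, which is what the slide's round numbers imply; the function names are hypothetical.

```python
BASE_PAIRS = 3_000_000_000   # 3 billion base pairs (from the slide)
BITS_PER_BASE = 2            # 4 possible nucleotides -> 2 bits
BYTES_PER_MB = 1_000_000     # assumption: decimal megabytes


def raw_genome_mb() -> float:
    """Size of the genome at 2 bits per base pair."""
    return BASE_PAIRS * BITS_PER_BASE / 8 / BYTES_PER_MB


def adjusted_genome_mb() -> float:
    """Scale by 21/64, the slide's rough allowance for codon redundancy."""
    return raw_genome_mb() * 21 / 64


print(f"raw: {raw_genome_mb():.0f} MB")
print(f"adjusted: {adjusted_genome_mb():.1f} MB")
```

The adjusted figure comes out just under 250 MB, matching the slide's rounded estimate.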

  12. 1 CD ~ 650 MB

  13. People are almost all the Same • Genetic code for 2 humans differs in only 2.1 million bases • 2.1 million bases * 2 bits ~ 4 million bits = 0.5 MB • How big is 0.5 MB? • 1/3 of a floppy disk • ~22 times the size of the PS6 adventure game code
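
A quick check of these numbers, as a sketch. The 1.44 MB floppy capacity is an assumption (the slide does not say which floppy it means), and megabytes are taken as 1,000,000 bytes.

```python
DIFFERING_BASES = 2_100_000  # from the slide
BITS_PER_BASE = 2
BYTES_PER_MB = 1_000_000     # assumption: decimal megabytes
FLOPPY_MB = 1.44             # assumption: a standard 3.5-inch floppy


def difference_bits() -> int:
    return DIFFERING_BASES * BITS_PER_BASE


def difference_mb() -> float:
    return difference_bits() / 8 / BYTES_PER_MB


def fraction_of_floppy() -> float:
    return difference_mb() / FLOPPY_MB


print(difference_bits(), "bits")
print(f"{difference_mb():.3f} MB")
print(f"{fraction_of_floppy():.2f} of a floppy")
```

Under these assumptions the difference is a little over 4 million bits, about half a megabyte, or roughly a third of a floppy disk.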

  14. Is DNA Really a Programming Language?

  15. Stuff Programming Languages are Made Of • Primitives: codons (sequence of 3 nucleotides that encodes an amino acid) • Means of Combination: ?? Morphogenesis? Not well understood (by anyone). This is where most of the expressiveness comes from! • Means of Abstraction: DNA itself (separates proteins from their encoding); Genes (group DNA by function, sort of); Chromosomes (package genes together); Organisms (packages for reproducing genes)

  16. My Research Group • Build robust, survivable systems from unreliable components • Learn from biological systems that do this • Cell-Based Programming Model • Genes turn on and off: state changes • Emit different chemicals depending on state, sense chemicals in surroundings • Cells can divide asymmetrically • Lots of simplifications: not simulating reality

  17. Example
      state A
        emits (alive, 1) diffuses (radius, 10)
        transitions
          (alive < 1) from any direction -> (A, B) in same direction;
          -> (A);
      state B
        emits (alive, 1)
        transitions
          (alive < 1) from any direction & (radius > 1) -> (B, B) in same direction;
          (alive > 0) from any direction -> (B);
          -> (radius);
      [State diagram: states A and B, with transitions labeled "alive < 1", "alive > 0", and "alive < 1 & radius > 1"]

  18. Simulating Program [State diagram: states A and B, with transitions labeled "alive < 1", "alive > 0", and "alive < 1 & radius > 1"] Simulation by Selvin George

  19. Simulation by Selvin George

  20. Complexity Molecular map of colon cancer cell from http://www.gnsbiotech.com/applications.shtml

  21. Computing with DNA Leonard Adleman (Mathematical Consultant for Sneakers), 1995

  22. Hamiltonian Path Problem • Input: a graph, start vertex and end vertex • Output: either a path from start to end that touches each vertex in the graph exactly once, or false indicating no such path exists • Example (graph figure): vertices CHO, RIC, IAD, BWI; start: CHO, end: BWI • How hard is the Hamiltonian path problem?
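
As a sketch of the problem statement, the function below checks whether a proposed path is a valid Hamiltonian path. The edge list is a hypothetical directed graph over the four airports named on the slide, not necessarily the graph drawn in the lecture, and `is_hamiltonian_path` is an illustrative name.

```python
# Hypothetical directed edges between the four airports (an assumption).
VERTICES = {"CHO", "RIC", "IAD", "BWI"}
EDGES = {
    ("CHO", "RIC"),
    ("RIC", "CHO"),
    ("CHO", "IAD"),
    ("IAD", "RIC"),
    ("RIC", "BWI"),
}


def is_hamiltonian_path(path, start, end, vertices=VERTICES, edges=EDGES) -> bool:
    """True if path runs from start to end, visits every vertex exactly
    once, and follows an edge at every step."""
    if not path or path[0] != start or path[-1] != end:
        return False
    if sorted(path) != sorted(vertices):
        return False
    return all((a, b) in edges for a, b in zip(path, path[1:]))


print(is_hamiltonian_path(["CHO", "IAD", "RIC", "BWI"], "CHO", "BWI"))
print(is_hamiltonian_path(["CHO", "RIC", "BWI"], "CHO", "BWI"))
```

Checking a given path takes time proportional to its length. Finding one is the hard part: the Hamiltonian path problem is NP-complete, so no polynomial-time algorithm for it is known.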

  23. Encoding The Graph • Make up two random 4-nucleotide sequences for each city:
      CHO: CHO1 = ACTT, CHO2 = gcag
      RIC: RIC1 = TCGG, RIC2 = actg
      IAD: IAD1 = GGCT, IAD2 = atgt
      BWI: BWI1 = GATC, BWI2 = tcca
      • If there is a link between two cities (A → B), create a nucleotide sequence: A2B1
      CHO → RIC: gcagTCGG
      RIC → CHO: actgACTT
      Based on Fred Hapgood’s notes on Adleman’s talk http://www.mitre.org/research/nanotech/hapgood_on_dna.html
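
The encoding rule can be sketched in a few lines. The city sequences are the ones given on the slide; the names `CITY_CODES`, `city_strand`, and `link_strand` are hypothetical.

```python
# (first half, second half) for each city, as listed on the slide.
CITY_CODES = {
    "CHO": ("ACTT", "gcag"),
    "RIC": ("TCGG", "actg"),
    "IAD": ("GGCT", "atgt"),
    "BWI": ("GATC", "tcca"),
}


def city_strand(city: str) -> str:
    """Full 8-nucleotide strand for a city: first half then second half."""
    first, second = CITY_CODES[city]
    return first + second


def link_strand(src: str, dst: str) -> str:
    """Strand for a link src -> dst: second half of src, first half of dst."""
    return CITY_CODES[src][1] + CITY_CODES[dst][0]


print(city_strand("CHO"))
print(link_strand("CHO", "RIC"))
print(link_strand("RIC", "CHO"))
```

Because a link uses the second half of its source and the first half of its destination, the two directions between the same pair of cities produce different strands.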

  24. Encoding The Problem • Each city nucleotide sequence binds with its complement (A ↔ T, G ↔ C):
      CHO: ACTTgcag (CHO1 = ACTT, CHO2 = gcag)   CHO’: TGAAcgtc
      RIC: TCGGactg   RIC’: AGCCtgac
      IAD: GGCTatgt   IAD’: CCGAtaca
      BWI: GATCtcca   BWI’: CTAGaggt
      • Mix up all the link and complement DNA strands: they will bind to show a path!
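
A minimal sketch of the complement rule, preserving the upper/lower case convention the slide uses to mark the two halves of each city. Note the assumption: this pairs bases position by position, as the slide does, and does not reverse the strand the way a real antiparallel DNA complement would. The name `complement` is illustrative.

```python
# A pairs with T and G pairs with C; case is preserved.
PAIRS = str.maketrans("ATGCatgc", "TACGtacg")


def complement(strand: str) -> str:
    """Position-by-position complement of a strand."""
    return strand.translate(PAIRS)


CITY_STRANDS = {
    "CHO": "ACTTgcag",
    "RIC": "TCGGactg",
    "IAD": "GGCTatgt",
    "BWI": "GATCtcca",
}

for city, strand in CITY_STRANDS.items():
    print(f"{city}: {strand}   {city}': {complement(strand)}")
```

Applying `complement` twice returns the original strand, which is why a city strand and its complement bind to each other.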

  25. Path Binding
      City strands: CHO ACTTgcag, IAD GGCTatgt, RIC TCGGactg, BWI GATCtcca
      Complement strands: CHO’ TGAAcgtc, IAD’ CCGAtaca, RIC’ AGCCtgac, BWI’ CTAGaggt
      Link strands: CHO → IAD gcagGGCT, IAD → RIC atgtTCGG, RIC → BWI actgGATC

  26. Getting the Solution • Extract DNA strands starting with CHO and ending with BWI • Easy way is to remove all strands that do not start with CHO, and then remove all strands that do not end with BWI • Measure remaining strands to find ones with the right weight (7 * 8 nucleotides) • Read the sequence from one of these strands
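
The extraction steps can be imitated in software as a generate-and-filter sketch: random chaining of links stands in for strands binding in the test tube, and each filter mirrors one bullet above. Several things here are assumptions: the edge list is taken from the link strands that appear in the slides, length is measured in cities rather than nucleotides, the 0.2 stopping probability and trial count are arbitrary, and the final "every city appears" filter is an added check that the slide does not list.

```python
import random

CITIES = ["CHO", "RIC", "IAD", "BWI"]
# Assumed edges, taken from the link strands shown in the slides.
EDGES = [("CHO", "RIC"), ("RIC", "CHO"), ("CHO", "IAD"),
         ("IAD", "RIC"), ("RIC", "BWI")]


def random_strand(rng: random.Random, max_len: int) -> tuple:
    """Grow one 'strand' by chaining randomly chosen compatible links."""
    path = [rng.choice(CITIES)]
    while len(path) < max_len:
        options = [dst for src, dst in EDGES if src == path[-1]]
        if not options or rng.random() < 0.2:  # strand stops growing
            break
        path.append(rng.choice(options))
    return tuple(path)


def solve(start: str, end: str, trials: int = 10_000, seed: int = 0) -> set:
    rng = random.Random(seed)
    soup = {random_strand(rng, len(CITIES)) for _ in range(trials)}
    soup = {p for p in soup if p[0] == start}            # keep strands starting at start
    soup = {p for p in soup if p[-1] == end}             # keep strands ending at end
    soup = {p for p in soup if len(p) == len(CITIES)}    # keep the right "weight"
    soup = {p for p in soup if set(p) == set(CITIES)}    # added check: every city once
    return soup


print(solve("CHO", "BWI"))
```

With these assumed edges the only surviving strand is CHO, IAD, RIC, BWI. The sketch also hints at why the approach scales poorly: it depends on the random soup containing the right strand at all, and the number of candidate strands grows very quickly with the number of cities.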

  27. Why don’t we use DNA computers? • Speed: shaking up the DNA strands does 10^14 operations per second ($400M supercomputer does 10^10) • Memory: we can store information in DNA at 1 bit per cubic nanometer • How much DNA would you need? • Volume of DNA needed grows exponentially with input size • To solve ~45 vertices, you need ~20M gallons

  28. DNA-Enhanced PC

  29. Biology is (becoming) a subfield of Computer Science • Biological mechanisms are mostly understood (proteomics still has a way to go) • What is not understood is how those are combined to create meaning

  30. PS8 • Before 10:55am Monday: • Submit a zip file of all your code using a form linked from the CS200 web site • If you want to use a few PowerPoint slides in your presentation, you may submit those also • You only have 3 or 5 minutes: use them wisely • Figure out beforehand what you will do • Recommend: one team member drive web browser, one (or two) talk • Talk about what users should know about your website, not about how you built it (unless there is something especially interesting)

  31. McIntire Symposium Talk: Daniel Kahneman (Psychologist, Nobel Prize in Economics) • When you are 99% sure, how often are you actually right? • 85-90% of the time • Some of you will get a sticker on your Exam 2 that will make you 99.5% sure of the lowest grade you could receive in CS200 (the 0.5% is because you still need to do PS8 well) • Humans are overly optimistic and excessively risk averse • No risk in taking the final: it cannot lower your grade • You should be optimistic that it can help your grade

  32. Final • Out Monday, due Monday, May 5 (4:55pm) • You have 8 days, but should not spend more than 4 hours on the exam • Will include: • A small programming problem (like a PS) • Some questions about computability and complexity

  33. Graduation Photo
