
CS273 Algorithms for Structure and Motion in Biology

Instructors: Serafim Batzoglou and Jean-Claude Latombe. Teaching Assistant: Sam Gross. Emails: | serafim | latombe | ssgross | @ cs.stanford.edu. Spring 2006 – http://www.stanford.edu/class/cs273/.


Presentation Transcript


  1. CS273 Algorithms for Structure and Motion in Biology • Instructors: Serafim Batzoglou and Jean-Claude Latombe • Teaching Assistant: Sam Gross • Emails: | serafim | latombe | ssgross | @ cs.stanford.edu • Spring 2006 – http://www.stanford.edu/class/cs273/

  2. Need a Scribe!!

  3. Range of Bio-CS Interaction: enormous range over space and time • Body system: robotic surgery • Tissue/Organs: soft-tissue simulation and surgical training • Cells: simulation of cell interaction • Molecules: molecular structures, similarities and motions • Gene: sequence alignment

  4. Focus on Proteins • Proteins are the workhorses of all living organisms • They perform many vital functions, e.g.: • Catalysis of reactions • Transport of molecules • Building blocks of muscles • Storage of energy • Transmission of signals • Defense against intruders

  5. Proteins are also of great interest from a computational viewpoint • They are large molecules (few 100s to several 1000s of atoms) • They are made of building blocks (amino acids) drawn from a small “library” of 20 amino-acids • They have an unusual kinematic structure: long serial linkage (backbone) with short side-chains

  6. Proteins are associated with many challenging problems • Predict folded structures and motion pathways • Understand why some proteins misfold or partially fold, causing such diseases as cystic fibrosis, Parkinson's disease, and Creutzfeldt-Jakob (mad cow) • Find structural similarities among proteins and classify proteins • Find functional structural motifs in proteins • Predict how proteins bind to other proteins and smaller molecules • Design new drugs • Engineer and design proteins and protein-like structures (polymers)

  7. Central Dogma of Molecular Biology

  8. Central Dogma of Molecular Biology [figure labels: transcription, translation]
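
As a rough illustration of the two steps labeled on this slide, transcription (DNA to mRNA) and translation (mRNA to protein), here is a minimal Python sketch. The function names, the example sequence, and the deliberately partial codon table are assumptions made for illustration; they are not part of the course material.

```python
# Illustrative sketch only: all names here are hypothetical.
# A full genetic code has 64 codons; this partial table lists a handful.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
    "UAG": "*",  # stop
    "UGA": "*",  # stop
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        # "X" marks a codon that is missing from the partial table above.
        amino_acid = CODON_TABLE.get(codon, "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCTAA")
protein = translate(mrna)
```

With the made-up input above, `mrna` is `"AUGUUUGGCUAA"` and `protein` is `"MFG"`: translation stops at the `UAA` codon.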

  9. Protein Sequence [figure: backbone N and O atoms, with residue i-1 labeled] • Long sequence of amino-acids (dozens to thousands), also called residues • Dictionary of 20 amino-acids (several billion years old)

  10. Protein Sequence [figure: backbone N and O atoms] • Peptide bond (partial double bond character)
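
Because a protein sequence is a string over a dictionary of 20 amino acids, it can be handled with ordinary string tools. The sketch below uses the standard one-letter amino-acid codes; the helper names and the example strings are hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def invalid_residues(sequence: str) -> set:
    """Return the set of characters that are not standard amino-acid codes."""
    return set(sequence.upper()) - AMINO_ACIDS


def residue_counts(sequence: str) -> Counter:
    """Count how often each residue occurs in the sequence."""
    return Counter(sequence.upper())


# Made-up example sequence, used only to exercise the helpers.
example = "MKTAYIAKQR"
assert not invalid_residues(example)
counts = residue_counts(example)
```

A check like `invalid_residues` is a cheap first step before running any heavier analysis on sequence data.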

  11. Central Dogma of Molecular Biology Physiological conditions: aqueous solution, 37°C, pH 7, atmospheric pressure

  12. Levels of Protein Structures • Quaternary: hemoglobin (4 polypeptide chains)

  13. Mostly α-helices • Mostly β-sheets • Mixed

  14. Folding • Unfolded (denatured) state • Intermediate states • Many pathways • Folded (native) state

  15. How (we think) a protein folds ... ΔG = ΔH - TΔS http://www-shakh.harvard.edu/ProFold2.html

  16. How (we think) a protein folds ... ΔG = ΔH - TΔS http://www-shakh.harvard.edu/ProFold2.html

  17. How (we think) a protein folds ... ΔG = ΔH - TΔS http://www-shakh.harvard.edu/ProFold2.html

  18. How (we think) a protein folds ... ΔG = ΔH - TΔS http://www-shakh.harvard.edu/ProFold2.html

  19. How (we think) a protein folds ... ΔG = ΔH - TΔS http://www-shakh.harvard.edu/ProFold2.html
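
The free-energy relation on these slides, ΔG = ΔH - TΔS, can be worked through numerically: a process such as folding is thermodynamically favorable when ΔG is negative. The sketch below evaluates the formula at 37°C. The enthalpy and entropy values are made-up numbers chosen only to show the arithmetic, not measurements for any real protein.

```python
def gibbs_free_energy(delta_h: float, temperature: float, delta_s: float) -> float:
    """Return delta_G = delta_H - T * delta_S (units must be consistent)."""
    return delta_h - temperature * delta_s


# Hypothetical values, for illustration only.
delta_h = -200.0      # enthalpy change, kJ/mol
delta_s = -0.6        # entropy change, kJ/(mol*K)
temperature = 310.15  # 37 degrees Celsius expressed in kelvin

delta_g = gibbs_free_energy(delta_h, temperature, delta_s)
is_favorable = delta_g < 0
```

With these numbers, ΔG = -200 + 186.09, or about -13.9 kJ/mol: the favorable enthalpy term narrowly outweighs the entropy lost on folding.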

  20. Motion of Proteins in Folded State HIV-1 protease

  21. Structural variability of the overall ensemble of native ubiquitin structures [Shehu, Kavraki, Clementi, 2005]

  22. Flexible Loop Loop 7 Amylosucrase

  23. Central Dogma of Molecular Biology

  24. Binding • Inhibitor binding to HIV protease • Ligand-protein binding • Protein-protein binding

  25. Binding of Pyruvate to LDH (reduction of pyruvate to lactate) [figure labels: pyruvate; NADH, nicotinamide adenine dinucleotide (coenzyme); lactate dehydrogenase environment; loop; residues GLN-101, ARG-106, ASP-166, ARG-169, HIS-193, ASP-195, THR-245]

  26. What is CS273 about? • Algorithms and computational schemes for molecular biology problems • Molecular biology seen by computer scientists

  27. The Shock of Two Cultures • y = f(x) • Biologists like experiments, specifics and classifications. They would rather know many (xi, yi), i.e., facts, and classify them than know f • Computer scientists like simulation, abstractions, and general algorithms. They want to know f, the explanation of the facts, and efficient ways to compute it, but rarely care about any particular (xi, yi) • One challenge of Computational Biology is to fuse these two cultures

  28. Two Views of a BioComputation Class • Where IT resources for biology are available and how to use them • How to design efficient data structures and algorithms for biology

  29. Main Ideas Behind CS273 • The information is in the sequence • Sequence → Structure (shape) → Function • Sequence similarity → Structural/functional similarity • Sequences are related by evolution

  30. Main Ideas Behind CS273 • The information is in the sequence • Sequence → Structure (shape) → Function • Sequence similarity → Structural/functional similarity • Sequences are related by evolution • Biomolecules move and bind to achieve their functions • Deformation → folded structures of proteins • Motion + deformation → multi-molecule complexes • One cannot just “jump” from sequence to function [figures: ligand-protein binding; protein folding]

  31. Sequence → Structure → Function [figure labels: sequence similarity, structure similarity]

  32. Main Ideas Behind CS273 • The information is in the sequence • Sequence → Structure (shape) → Function • Sequence similarity → Structural/functional similarity • Sequences are related by evolution • Biomolecules move and bind to achieve their functions • Deformation → folded structures of proteins • Motion + deformation → multi-molecule complexes • One cannot just “jump” from sequence to function • CS273 is about algorithms for sequence, structure and motion: - Finding sequence and shape similarities - Relating structure to function - Extracting structure from experimental data - Computing and analyzing motion pathways
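
To make "finding sequence similarities" concrete, one classic option is global alignment by dynamic programming (the Needleman-Wunsch recurrence). The slides do not name a specific algorithm, so this is a stand-in chosen for illustration. The function name and the scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions, far simpler than the substitution matrices used for real proteins.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Return the best global alignment score of a and b.

    score[i][j] holds the best score for aligning a[:i] with b[:j].
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]
```

For example, `global_alignment_score("ACGT", "AGT")` returns 2 under these scores: three matches and one gap. The table has (len(a)+1) x (len(b)+1) entries, so time and memory both grow with the product of the two lengths.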

  33. Vision Underlying CS273 • Goal of computational biology: low-cost, high-bandwidth in-silico biology • Requirements: reliable models, efficient algorithms • Algorithmic efficiency by exploiting properties of molecules and processes: • Proteins are long kinematic chains • Atoms cannot bunch up together • Forces have relatively short ranges • Computational Biology is more than applying computers to biological problems or mimicking nature (e.g., performing MD simulation)
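
Two of the properties listed on this slide, that atoms cannot bunch up together and that forces have relatively short ranges, are what make grid-based neighbor search pay off: each atom only has to be compared with atoms in adjacent grid cells, and each cell can hold only a bounded number of atoms. The sketch below is a generic cell-list technique, offered as an assumption about the kind of algorithm the slide has in mind; the function name and the coordinates are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def pairs_within_cutoff(points, cutoff):
    """Return sorted index pairs (i, j), i < j, with distance <= cutoff."""
    # Bucket each 3D point into a cubic cell whose side equals the cutoff,
    # so any two points within the cutoff lie in the same or adjacent cells.
    cells = defaultdict(list)
    for index, point in enumerate(points):
        key = tuple(floor(coordinate / cutoff) for coordinate in point)
        cells[key].append(index)

    offsets = list(product((-1, 0, 1), repeat=3))  # the 27 neighboring cells
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            neighbours = cells.get((cx + dx, cy + dy, cz + dz), [])
            for i in members:
                for j in neighbours:
                    if i < j and dist(points[i], points[j]) <= cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


# Made-up coordinates, for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0),
         (5.5, 5.0, 5.0), (-0.5, 0.0, 0.0)]
close_pairs = pairs_within_cutoff(atoms, cutoff=2.0)
```

When atom density is bounded, the work per atom is roughly constant, so the total cost grows about linearly with the number of atoms instead of quadratically as in an all-pairs comparison.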

  34. Tentative Schedule

  35. Instructors and TAs • Instructors: • Serafim Batzoglou • Jean-Claude Latombe • TA: • Sam Gross • Emails: | serafim | latombe | ssgross | @ cs.stanford.edu • Class website: http://cs273.stanford.edu

  36. Expected Work • Regular attendance at lectures and active participation • Class scribing (assignments will depend on # of students) • Exciting programming project: http://www.stanford.edu/class/cs273/project/project.html - Structure prediction - Clustering and distance metrics - Protein design - Something else

  37. Questions?
