
Welcome to CS374 Algorithms in Biology



Presentation Transcript


  1. Welcome to CS374: Algorithms in Biology

  2. Overview • Administrivia • Molecular Biology and Computation • DNA, proteins, cells, evolution • Some examples of CS in biology • Computer Scientists vs Biologists

  3. CS374: Algorithms in Biology (cs374.stanford.edu) • Attendance • At most 2 classes missed without affecting grade • Lectures • Most important requirement • Select an available topic & day, send email to Serafim and George • Read papers, meet with Serafim 1-2 weeks before lecture • Ask George any questions on papers while preparing presentation • Schedule a long (2 hr) meeting with Serafim the day before lecture • Slides due at noon before lecture

  4. CS374: Algorithms in Biology (cs374.stanford.edu) • Scribing • Please sign up on a first-come, first-served basis • Due 1 week after lecture, edited & distributed 2 weeks after lecture • George will help you edit • Summaries • Select 1 lecture among the first 10, 1 lecture among the rest • Find one relevant paper • Write a 1-page summary of the paper • Paper reference • Abstract • Discussion • Ask George for questions/feedback • Have fun!

  5. DNA • Structure of DNA double helix • Figure labels: Nitrogenous Base, Phosphate Group, Sugar • Bases: A, C, G, T • Base pairs shown: A-T, G-C, C-G, G-C, A-T, C-G, A-T, G-C • Figure labels: Physicist, Ornithologist

  6. DNA to RNA, and genes • DNA: ~3×10^9 nucleotides long in humans; contains ~22,000 genes • RNA: carries the “message” for “translating”, or “expressing”, one gene (shown: A G C G A C U G) • Steps shown: transcription, translation, folding
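The transcription step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function names `transcribe` and `reverse_complement` are made up for the example, and transcription is modelled simply as replacing T with U on the coding strand, ignoring all of the cellular machinery.

```python
# Hypothetical helper names; a simplified model of the slide's "transcription" step.

def transcribe(coding_strand: str) -> str:
    """Return the RNA message for a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand of the double helix, read 5' to 3'."""
    pairs = str.maketrans("ACGT", "TGCA")  # A pairs with T, C pairs with G
    return dna.upper().translate(pairs)[::-1]


if __name__ == "__main__":
    dna = "AGCGACTG"
    print(transcribe(dna))          # RNA letters as drawn on the slide
    print(reverse_complement(dna))  # the complementary DNA strand
```

With the input `AGCGACTG`, `transcribe` gives the RNA letters drawn on the slide (A G C G A C U G).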

  7. Structure of proteins • Composed of a chain of amino acids • General amino acid: H2N-CH(R)-COOH, where R is one of 20 possible groups • Sequence of amino acids folds to form a complex 3-D structure • The structure of a protein is intimately connected to its function

  8. All living organisms are composed of cells

  9. Genetics in the 20th Century

  10. 21st Century • (slide filled with repeated raw DNA sequence) AGTAGCACAGACTACGACGAGACGATCGTGCGAGCGACGGCGTAGTGTGCTGTACTGTCGTGTGTGTGTACTCTCCTCTCTCTAGTCTACGTGCTGTATGCGTTAGTGTCGTCGTCTAGTAGTCGCGATGCTCTGATGTTAGAGGATGCACGATGCTGCTGCTACTAGCGTGCTGCTGCGATGTAGCTGTCGTACGTGTAGTGTGCTGTAAGTCGAGTGTAGCTGGCGATGTATCGTGGT

  11. Computational Biology AGTAGCACAGACTACGACGAGACGATCGTGCGAGCGACGGCGTAGTGTGCTGTACTGTCGTGTGTGTGTACTCTCCTCTCTCTAGTCTACGTGCTGTATGCGTTAGTGTCGTCGTCTAGTAGTCGCGATGCTCTGATGTTAGAGGATGCACGATGCTGCTGCTACTAGCGTGCTGCTGCGATGTAGCTGTCGTACGTGTAGTGTGCTGTAAGTCGAGTGTAGCTGGCGATGTATCGTGGT • Organize & analyze massive amounts of biological data • Enable biologists to use data • Form testable hypotheses • Discover new biology

  12. DNA to RNA, and genes (diagram of slide 6 repeated, with marker “1” added)

  13. Some examples of the central role of CS • 1. Sequencing • Genome: 3×10^9 nucleotides • Fragments of ~500 nucleotides • Sequence shown: AGTAGCACAGACTACGACGAGACGATCGTGCGAGCGACGGCGTAGTGTGCTGTACTGTCGTGTGTGTGTACTCTCCT

  14. Some examples of the central role of CS • 1. Sequencing • Genome: 3×10^9 nucleotides • A big puzzle: ~60 million pieces • Computational Fragment Assembly • Introduced ~1980 • 1995: assemble DNA pieces up to 1,000,000 nucleotides long • 2000: assemble the whole human genome
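To make the "big puzzle" concrete, here is a toy sketch of fragment assembly by greedy overlap merging. It is an assumption-laden illustration rather than the method used by any real assembler: the names `overlap` and `greedy_assemble` and the `min_len` threshold are invented for the example, reads are assumed error-free and from one strand, and the quadratic search would not scale to millions of pieces.

```python
# Toy greedy assembler: repeatedly merge the two reads with the longest overlap.
# Assumes error-free reads from a single strand (real data has neither property).

def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Merge reads greedily; returns the list of contigs that remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    # Three overlapping pieces cut from the first 28 letters of the slide's sequence.
    pieces = ["ACGACGAGACGATCG", "AGTAGCACAGACT", "CAGACTACGACGAG"]
    print(greedy_assemble(pieces))
```

Taking the slide's figures at face value, 60 million pieces of ~500 nucleotides amount to 3×10^10 sequenced nucleotides, or about 10 times the 3×10^9-nucleotide genome; this redundancy is what gives neighbouring pieces overlaps to merge on.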

  15. Complete genomes today More than 300 complete genomes have been sequenced

  16. DNA to RNA, and genes (diagram of slide 6 repeated, with markers “1” and “2” added)

  17. 2. Gene Finding • Where are the genes? • In humans: ~22,000 genes • ~1.5% of human DNA

  18. atg caggtg ggtgag cagatg ggtgag cagttg ggtgag caggcc ggtgag tga

  19. 2. Gene Finding • Diagram, 5’ to 3’: Start codon ATG, Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, Stop codon TAG/TGA/TAA, with splice sites at the exon-intron boundaries • Topics in CS374: • Finding noncoding RNA genes • Finding short words that regulate the expression of genes
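The start and stop codons named on this slide are enough for a very naive gene finder: scan each reading frame for an ATG followed, in frame, by TAG, TGA, or TAA. The sketch below is only that simplification. The function name `find_orfs` and the `min_codons` parameter are hypothetical, only the forward strand is scanned, and introns and splice sites, which are the hard part of gene finding in human DNA, are ignored entirely.

```python
# Naive open-reading-frame scan using the slide's start and stop codons.
# Ignores introns, splice sites, and the reverse strand.

STOP_CODONS = {"TAG", "TGA", "TAA"}


def find_orfs(dna: str, min_codons: int = 2):
    """Return (start, end) index pairs of ATG...stop stretches, end exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate gene
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon in this frame
    return orfs


if __name__ == "__main__":
    seq = "CCATGAAATTTTAGGG"
    for begin, end in find_orfs(seq):
        print(begin, end, seq[begin:end])
```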

  20. DNA to RNA, and genes (diagram of slide 6 repeated, with markers “1”, “2”, and “3” and the label “easy” added)

  21. 3. Protein Folding • The amino-acid sequence of a protein determines the 3D fold • The 3D fold of a protein determines its function • Can we predict the 3D fold of a protein given its amino-acid sequence? • Holy grail of compbio, a 35-year-old problem • Molecular dynamics, robotics, machine learning, computational geometry • Topics on Proteins in CS374 • 1. Protein Structure: Protein Structure Comparison; Evolution of Protein Domains; Molecular Dynamics & Drug Targets; Protein Classification; Protein Folding Dynamics; Protein Kinetics • 2. Protein Comparison: Latest multiple alignment tools; Selecting parameters for alignment; Phylogenetic trees

  22. Complete Genomes More than 200 complete genomes have been sequenced

  23. Evolution

  24. Evolution at the DNA level • Diagram labels: next generation; OK, OK, OK, X, X; Still OK?

  25. 4. Sequence Comparison: Sequence conservation implies function • Sequence comparison is key to • Finding genes • Determining function • Uncovering the evolutionary processes

  26. Sequence Comparison: Alignment • Sequences (query, DB): AGGCTATCACCTGACCTCCAGGCCGATGCCC, TAGCTATCACGACCGCGGTCGATTTGCCCGAC • Alignment:

        -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
         || |||||||  ||||x|  || |||  |||||
        TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC

      • BLAST • Sequence Alignment introduced ~1970 • BLAST: 1990, most cited paper in history • Still very active area of research
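An alignment like the one on this slide can be computed exactly with dynamic programming. The sketch below uses the Needleman-Wunsch global alignment algorithm, which is not BLAST (BLAST is a faster heuristic for database search). The scoring values (+1 match, -1 mismatch, -1 per gap letter) are arbitrary assumptions chosen for illustration, so the output need not reproduce the slide's alignment column for column.

```python
# Needleman-Wunsch global alignment with a simple linear gap penalty.
# Scoring parameters are illustrative assumptions, not values from the slides.

def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    query = "AGGCTATCACCTGACCTCCAGGCCGATGCCC"
    db = "TAGCTATCACGACCGCGGTCGATTTGCCCGAC"
    total, top, bottom = needleman_wunsch(query, db)
    print(total)
    print(top)
    print(bottom)
```

The table has one cell per pair of positions, so time and memory grow with the product of the two lengths, which is why database-scale search relies on heuristics instead.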

  27. Comparison of Human, Mouse, and Rat • Topics on Genomics in CS374 • Indexing Large Databases • Newest BLAST techniques • Repeat Detection • Genomic Rearrangements • Finding the order of shuffles between two genomes

  28. 5. Clustering of Microarrays: Clinical prediction of Leukemia type • 2 types • Acute lymphoid (ALL) • Acute myeloid (AML) • Different treatment & outcomes • Predict type before treatment? • Bone marrow samples: ALL vs AML • Measure amount of each gene

  29. 6. Protein networks • Newer research area • Construct networks from multiple data sources • Navigate networks • Compare networks across organisms • Statistics • Machine learning • Graph algorithms • Databases • Topics on Protein Networks in CS374 • 1. Integration: Build networks from multiple sources • 2. Alignment: Compare networks across species • Mathematical properties: Modular, scale free

  30. 7. Human evolution • (figure: rows of DNA sequence letters) • Topics on Human Population Genetics in CS374 • 1. Evolution: Finding fast-evolving genes in human populations • 2. Migration: Tracing the migration of humans out of Africa by genetic studies

  31. 8. Building circuits from cells

  32. The abstract submission deadline is 11:59 pm, Sunday, October 1, 2006.

  33. Computer Scientists vs Biologists

  34. Computer scientists vs Biologists • (almost) Nothing is ever true or false in Biology • Everything is true or false in computer science

  35. Computer scientists vs Biologists • Biologists strive to understand the complicated, messy natural world • Computer scientists seek to build their own clean and organized virtual worlds

  36. Computer scientists vs Biologists • Biologists are obsessed with being the first to discover something • Computer scientists are obsessed with being the first to invent or prove something

  37. Computer scientists vs Biologists • Biologists are comfortable with the idea that all data have errors • Computer scientists are not

  38. Computer scientists vs Biologists • Computer scientists get high-paid jobs after graduation • Biologists typically have to complete one or more 5-year post-docs...

  39. Computer Science is to Biology what Mathematics is to Physics • “Antedisciplinary” Science • What is computational biology? • http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pcbi.0010006
