
Next Generation Sequencing Technologies

Rob Mitra. Lecture 02/17/09.


Presentation Transcript


  1. Next Generation Sequencing Technologies Rob Mitra Lecture 02/17/09

  2. Forward Genetics Genotype Phenotype Hypothesis Test Hypothesis By Genetic Manipulation

  3. Forward Genetics Mutation in APC Gene Two groups: 1. Develop Colorectal cancer At Young Age 2. Do not Genotype Phenotype Hypothesis APC is a Tumor Suppressor Gene Test Hypothesis By Genetic Manipulation Delete APC in Mouse Control: Isogenic APC+

  4. The Cycle of Forward Genetics In 2005 $9 million/genome Not feasible ?Sequencing? Genotype Observation Thinking Phenotype Hypothesis Test Hypothesis By Genetic Manipulation Gene Deletion/Replacement Recombinant Technology

  5. End Runs • Linkage Studies (Humans, Model Organisms) • Association Studies (GWAS) BUT, these end runs have a cost! 1. Linkage requires a large family (many crosses in model organisms); very difficult to analyze multi-factorial traits 2. Association studies are limited to common variants But, these end runs will not be needed in 5-10 years. Why?

  6. The Problem with Forward Genetics Currently $60,000 /genome Cost is rapidly dropping Sequencing Genotype Observation Thinking Phenotype Hypothesis Test Hypothesis By Genetic Manipulation Gene Deletion/Replacement Recombinant Technology

  7. Bp/US dollar: increases exponentially with time Adapted from Shendure et al 2004

  8. Two questions: • How was this dramatic acceleration achieved? • What will it mean?

  9. How was this achieved? • Integration (Think about sequencing pipeline) • Parallelization • Miniaturization The same concepts that revolutionized integrated circuits Plus one additional insight

  10. Read Length is Not As Important For Resequencing Jay Shendure

  11. Two Short Read Technologies • Illumina GA • ABI SOLID

  12. Technology Overview: Solexa/Illumina Sequencing http://www.illumina.com/

  13. Immobilize DNA to Surface Source: www.illumina.com

  14. Technology Overview: Solexa Sequencing

  15. Sequence Colonies

  16. Sequence Colonies

  17. Call Sequence

  18. ABI Solid Dressman 2003

  19. Sequencing By Ligation Shendure et al

  20. ABI SOLID

  21. ABI SOLID

  22. ABI SOLID

  23. ABI SOLID

  24. ABI SOLID This allows for error correction: See board Raw error rate = ~3% Corrected error rate = ~0.1%
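A minimal sketch of the error-correction idea on slide 24, assuming "this" refers to SOLiD's two-base (color-space) encoding, in which each color reports on a pair of adjacent bases. The 2-bit base codes and the function names are illustrative choices, not taken from the lecture. Under this encoding a true single-base change alters two adjacent colors in a mutually consistent way, so an isolated color mismatch is more likely a measurement error than a real variant.

```python
# Illustrative 2-bit base codes (an assumption for this sketch).
# XOR of two adjacent base codes yields one of four colors (0-3).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colors(seq):
    """Encode a base sequence as colors, one color per adjacent base pair."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def classify_read(ref_seq, read_colors):
    """Compare a color-space read against the reference's colors.

    Toy rule set: mismatches at the very ends of the read are ambiguous
    in this simplified version and are treated like interior mismatches.
    """
    ref_colors = to_colors(ref_seq)
    diffs = [i for i, (r, c) in enumerate(zip(ref_colors, read_colors)) if r != c]
    if not diffs:
        return "match"
    if len(diffs) == 1:
        # A real single-base change would have altered two adjacent colors.
        return "likely measurement error"
    if len(diffs) == 2 and diffs[1] == diffs[0] + 1:
        i, j = diffs
        # For a true SNP the flanking bases are unchanged, so the XOR of
        # the two affected colors must equal that of the reference.
        if ref_colors[i] ^ ref_colors[j] == read_colors[i] ^ read_colors[j]:
            return "candidate SNP"
    return "unresolved"


if __name__ == "__main__":
    ref = "ACGTAC"
    print(to_colors(ref))
    print(classify_read(ref, to_colors("ACATAC")))  # G->A at the third base
    print(classify_read(ref, [1, 3, 1, 0, 1]))      # one color misread
```

The sketch only shows why single color errors are detectable; it does not attempt to reproduce the ~3% and ~0.1% error rates quoted on the slide.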

  25. Paired End Reads are Important! Known Distance Read 1 Read 2 Repetitive DNA Unique DNA Paired read maps uniquely Single read maps to multiple positions
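The point of slide 25 can be sketched with a toy example: a read that falls in repetitive DNA matches several positions, but its mate in unique DNA, together with the known distance between the two reads, selects a single placement. The toy genome, the exact-match search, and the function names are assumptions for illustration; real aligners also handle mismatches, strand, and reverse complements, which are ignored here.

```python
def find_all(genome, read):
    """Return every start position where `read` matches `genome` exactly."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits


def place_pair(genome, read1, read2, distance, tolerance=0):
    """Keep only placements where read2 starts `distance` bases after read1."""
    return [
        (p1, p2)
        for p1 in find_all(genome, read1)
        for p2 in find_all(genome, read2)
        if abs((p2 - p1) - distance) <= tolerance
    ]


if __name__ == "__main__":
    repeat = "GATTACA"    # occurs twice: stands in for repetitive DNA
    unique = "CGCGGCAT"   # occurs once: stands in for unique DNA
    genome = "TTTT" + repeat + "CTCTCT" + repeat + "AGAG" + unique + "TT"

    print(find_all(genome, repeat))                # single read: two candidates
    print(place_pair(genome, repeat, unique, 11))  # paired read: one placement
```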

  26. Paired Ends are Important Part 2 Deletion Insertion Inversion Shendure et al 2005
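A rough sketch of how paired ends can flag the three rearrangements named on slide 26, assuming the usual reasoning: when the pair is mapped back to the reference, a deletion in the sample makes the mates land farther apart than the known distance, an insertion makes them land closer together, and an inversion makes one mate map in an unexpected orientation. The function name, arguments, and tolerance are hypothetical, and real callers require many supporting pairs rather than one.

```python
def classify_pair(mapped_distance, orientation_ok, expected_distance, tolerance):
    """Label one read pair by comparing its mapping to the library's known distance.

    mapped_distance   -- distance between the mates on the reference
    orientation_ok    -- True if the mates map in the expected relative orientation
    expected_distance -- known distance from library construction
    tolerance         -- allowed deviation before a pair counts as discordant
    """
    if not orientation_ok:
        return "possible inversion"
    if mapped_distance > expected_distance + tolerance:
        # Sample is missing sequence that the reference still contains.
        return "possible deletion"
    if mapped_distance < expected_distance - tolerance:
        # Sample contains extra sequence absent from the reference.
        return "possible insertion"
    return "concordant"


if __name__ == "__main__":
    for dist, ok in [(1000, True), (1800, True), (400, True), (1000, False)]:
        print(dist, ok, classify_pair(dist, ok, expected_distance=1000, tolerance=100))
```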

  27. How can we generate paired end reads? • Amplify Large Fragments and Sequence From Each End (some trickery required – see board) • Length is limited (150bp – 1kb). • Jumping Library

  28. Jumping Library Construction From Shendure et al

  29. Other Second Generation Technologies • 454 • Emulsion PCR • Polymerase • Natural Nucleotides • 20-100Mb for 5-15k • 1% error rate • Homopolymers

  30. Helicos • No Amplification Single molecule detection • Homopolymer (solved) • Expensive Detection

  31. Pacific Biosciences: A Third Generation Sequencing Technology Eid et al 2008

  32. How did they do? • 150 bp circular template • ~93% raw accuracy • 15x coverage 99.3% accuracy • Still early days

  33. Where are they going • Phi29 so long read lengths possible • Ease of sample prep • Camera costs

  34. Summary • Sequencing will become very inexpensive in 3-5 years • So now what?

  35. Areas of Broad Impact • Understanding Common Diseases • Cancer

  36. Why don’t we understand common traits or diseases? • GWAS is relatively new • But, this method can only analyze common variants • If rare variants play a significant role in common traits then we need to sequence. (Board) SO DO THEY?

  37. Studies on human height • Heritability of height is 0.8 (80% of variation in height is due to genetic factors) • 3 studies genotyped 63,000 individuals at 500,000 loci (biggest cohort analyzed to date) • 54 loci explain ~4% of the variance. WHAT!?

  38. Do rare variants matter? • What is the genetic basis of variation in blood pressure? • Lifton and colleagues sequenced 1000 individuals at these 3 loci (SLC12A3, SLC12A1, and KCNJ1) and correlated the observed genetic variation with blood pressure measurements. • 20 individuals had heterozygous, rare mutations that caused a significant decrease in blood pressure. Each rare mutation had a relatively large effect, and these mutations also protected individuals against developing clinical hypertension. • Although only about 2% of the population has a functional mutation in one of these three genes, Lifton and colleagues hypothesize: “Because these three genes comprise only a small fraction of those in which mutations are known to alter blood pressure, and because there are likely to be many more genes yet to be discovered, it seems probable that the combined effects of rare independent mutations will account for a substantial fraction of blood pressure variation in the population.” Ji et al 2008

  39. Conclusions • CDCV may not hold for many common traits • Rare variants may cumulatively play a big role in common traits, but sequencing candidate genes isn’t getting it done. • Whole genome sequencing.

  40. Cancer and Whole Genome Sequencing • Cancer is a disease of the genome • Acquisition of somatic mutation • The genome records a history of disease

  41. Complete genome sequence of AML genome • 32.7 fold haploid coverage • 14 fold coverage of normal skin • Remove SNPs, check for non-synonymous somatic mutations in coding DNA • 10 mutations found (2 known to be involved in cancer progression)

  42. We need more genomes! • Complete Genomics ($5000) • ABI ($10,000) • Illumina (?) • Intelligent Biosystems (<$1000)

  43. Sequencing coverage calculations • Let’s say you need a base to be sequenced 5x for an accurate base call • How much average coverage do you need to ensure that 95% of the genome is sequenced at least 5 times?

  44. Poisson Distribution Originally derived for events occurring in time. Average coverage = λ Probability that a base is covered by k reads, given the average coverage λ: P(k) = λ^k e^(-λ) / k!

  45. Example • Average coverage = 5x • Probability of a given base being sequenced 10 times is: 5^10 e^(-5) / 10! = 0.018, or about 2% of bases will have exactly 10x coverage.

  46. What about our question?
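One way to work through the question from slide 43 is sketched below, assuming per-base read depth follows the Poisson distribution of slide 44 with mean equal to the average coverage. The function names and the 0.1x search step are illustrative choices, not part of the lecture.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a base is covered by exactly k reads at average coverage lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def fraction_at_least(min_reads, lam):
    """Fraction of bases expected to be covered by at least min_reads reads."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(min_reads))


def min_average_coverage(min_reads=5, target=0.95, step=0.1):
    """Smallest average coverage (on a grid of `step`) that meets the target fraction."""
    n = 1
    while fraction_at_least(min_reads, n * step) < target:
        n += 1
    return round(n * step, 10)


if __name__ == "__main__":
    # Reproduces the example on slide 45: exactly 10 reads at 5x average coverage.
    print(round(poisson_pmf(10, 5), 3))
    # Fraction of bases with at least 5 reads at 5x average coverage.
    print(round(fraction_at_least(5, 5), 3))
    # Average coverage needed so that 95% of bases have at least 5 reads.
    print(min_average_coverage())
```

Under these assumptions, 5x average coverage leaves only a little over half of the bases with at least 5 reads, and the 95% target is reached at an average coverage slightly above 9x. Real data tend to be more variable than Poisson, so this is best read as a lower bound.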
