Quality of assemblies—mouse

Quality of assemblies: mouse. Terminology: N50 contig length. If we sort contigs from largest to smallest and start covering the genome in that order, N50 is the length of the contig that just covers the 50th percentile. 7.7X sequence coverage. Quality of assemblies: dog. 7.5X

iman

Presentation Transcript


  1. Quality of assemblies: mouse. Terminology: N50 contig length. If we sort contigs from largest to smallest and start covering the genome in that order, N50 is the length of the contig that just covers the 50th percentile. 7.7X sequence coverage.
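The N50 definition on slide 1 can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `n50`, the optional `genome_size` parameter, and the example contig lengths are all assumptions. When no genome size is given, the sketch falls back to the total assembled length as the quantity being covered, which is the common convention.

```python
from typing import Iterable, Optional


def n50(contig_lengths: Iterable[int], genome_size: Optional[int] = None) -> Optional[int]:
    """Return the N50 contig length, or None if the contigs never cover half the target."""
    # Sort contigs from largest to smallest, as in the slide's definition.
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    # Assumption: without a genome size, cover half of the total assembled length.
    total = genome_size if genome_size is not None else sum(lengths)
    half = total / 2
    covered = 0
    for length in lengths:
        covered += length
        # The first contig that pushes coverage to the 50% mark defines N50.
        if covered >= half:
            return length
    return None


# Hypothetical contig lengths in bp, for illustration only.
example = [80, 70, 50, 40, 30, 20]
print(n50(example))                   # 70: 80 + 70 = 150 covers half of 290
print(n50(example, genome_size=400))  # 50: 80 + 70 + 50 = 200 covers half of 400
```

Passing an explicit `genome_size` makes the result depend on the genome rather than on how much was assembled, so two assemblies of the same genome can be compared on equal terms.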

  2. Quality of assemblies: dog. 7.5X sequence coverage.

  3. Quality of assemblies: chimp. 3.6X sequence coverage. Assisted assembly.

  4. History of WGA • 1982: λ-virus, 48,502 bp • 1995: H. influenzae, 1 Mbp • 2000: fly, 100 Mbp • 2001 – present: human (3 Gbp), mouse (2.5 Gbp), rat*, chicken, dog, chimpanzee, several fungal genomes • 1997: “Let’s sequence the human genome with the shotgun strategy” (Gene Myers); “That is impossible, and a bad idea anyway” (Phil Green)

  5. $985 deCODEme (November 2007); $399 Personal Genome Service (November 2007); $2,500 Health Compass service (April 2008); Genetic Information Nondiscrimination Act (May 2008); $350,000 whole-genome sequencing (November 2007)

  6. Applications: whole-genome sequencing; comparative genomics; genome resequencing; structural variation analysis; polymorphism discovery; metagenomics; environmental sequencing; gene expression profiling; genotyping; population genetics; migration studies; ancestry inference; relationship inference; genetic screening; drug targeting; forensics

  7. Sequencing applications (diagram labels): new sequencing applications; increase in sequencing data output; demand for more sequencing; sequencing technology improvement

  8. Sequencing technology: Sanger sequencing (Fred Sanger). Chart: cost per finished bp (axis: $10.00, $1.00, $0.10, $0.01) by year (1975, 1980, 1990, 2000, 2008). Read length: 15 – 200 bp; 500 – 1,000 bp. Throughput: “grad-student years”; 2 × 10^6 bp/day

  9. Sequencing technology: Sanger sequencing. Genome size: 3 × 10^9 bp at 1x coverage. Time: (10x coverage × 3 × 10^9 bp) / (2 × 10^6 bp/day) = 15,000 days ≈ 40 years. Cost: 10x coverage × 3 × 10^9 bp × $0.001/bp = $30 million.
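The time and cost arithmetic on slide 9 can be checked with a short Python sketch. The function name `sequencing_estimate` and its parameter names are hypothetical; the numbers passed in are the ones on the slide (3 × 10^9 bp genome, 10x coverage, 2 × 10^6 bp/day, $0.001/bp). The sketch assumes a single sequencing source running every day at the stated throughput.

```python
def sequencing_estimate(genome_bp, coverage, bp_per_day, cost_per_bp):
    """Return (days, years, cost) for sequencing a genome at a given coverage."""
    # Total bases to sequence is genome size times coverage depth.
    total_bp = genome_bp * coverage
    # Assumption: constant throughput with no downtime.
    days = total_bp / bp_per_day
    years = days / 365
    cost = total_bp * cost_per_bp
    return days, years, cost


# Values taken from the slide.
days, years, cost = sequencing_estimate(
    genome_bp=3e9, coverage=10, bp_per_day=2e6, cost_per_bp=0.001
)
print(f"{days:,.0f} days, about {years:.0f} years, ${cost:,.0f}")
```

The result is 15,000 days, or roughly 41 years, which the slide rounds to 40, at a cost of about $30 million. Swapping in the throughput and cost figures from the later slides gives a quick way to compare platforms on the same basis.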

  10. Pyrosequencing on a chip • Mostafa Ronaghi, Stanford Genome Technologies Center • 454 Life Sciences

  11. Sequencing technology: next-generation sequencing, “short reads” (Genome Sequencer / FLX). Read length: 250 bp. Throughput: 300 Mb/day. Cost: ~ 10,000 bp/$. De novo: yes.

  12. Single Molecule Array for Genotyping—Solexa

  13. Sequencing technology: next-generation sequencing, “microreads” (Genome Analyzer, SOLiD Analyzer). Read length: ~ 35 bp. Throughput: 300 – 500 Mb/day. Cost: ~ 100,000 bp/$. De novo: yes.

  14. Sequencing technology: next-generation sequencing, reads (Genome Analyzer, SOLiD Analyzer). Read length: ~ 50 – 150 bp. Throughput: 3 Gb/day. Cost: ~ 3,000,000 bp/$. De novo: yes.

  15. Illumina Projections

  16. Complete Genomics • $5,000 this summer • Quality?... • 1,000 genomes in 2009 • 20,000 genomes in 2010

  17. Pacific Biosciences

  18. So, how fast is cost going down? • 2006: $10 million • 2008: $100,000 • 2009: $10,000 • ? $1,000 • ??? $100

  19. Molecular Inversion Probes

  20. Illumina Genotype Arrays

  21. Sequencing technology: next-generation sequencing, “SNP chips” (Infinium Assay, GeneChip Array), genotypes. Read length: 1 bp. Throughput: 1 – 2 Mb/day. Cost: 5,000 bp/$. De novo: no.

  22. Nanopore Sequencing http://www.mcb.harvard.edu/branton/index.htm

  23. Sequencing technology Next-generation sequencing

  24. Sequencing technology

  25. Multiple Sequence Alignment

  26. Evolution at the DNA level. Sequence edits (deletion, mutation): …ACGGTGCAGTTACCA… and …AC----CAGTCCACCA…. Rearrangements: inversion, translocation, duplication.

  27. Evolutionary Rates (diagram labels: next generation; OK, OK, OK, X, X, Still OK?)

  28. Orthology, Paralogy, Inparalogs, Outparalogs
