Next Now-Generation Genomics: methods and applications for modern disease research
Aaron J. Mackey, Ph.D.
amackey@virginia.edu
Center for Public Health Genomics
Wednesday October 7th, 2009
BIMS 853 Special Topics in Cardiovascular Research
“omic” Disease Research
source: Francis Ouellette, OICR
Basics of the “old” technology
• Clone the DNA.
• Generate a ladder of labeled (colored) molecules that differ in length by 1 nucleotide.
• Separate the mixture on some matrix.
• Detect fluorochrome by laser.
• Interpret peaks as a string of DNA.
• Strings are 500 to 1,000 letters long.
• 1 machine generates 57,000 nucleotides/run.
• Assemble all strings into a genome.
source: Francis Ouellette, OICR
Basics of the “new” technology
• Get DNA.
• Attach it to something.
• Extend and amplify signal with some color scheme.
• Detect fluorochrome by microscopy.
• Interpret series of spots as short strings of DNA.
• Strings are 30-300 letters long.
• Multiple images are interpreted as 0.4 to 1.2 Gb/run (up to 1,200,000,000 letters/day).
• Map or align strings to one or many genomes.
source: Francis Ouellette, OICR
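A rough back-of-the-envelope comparison of the two throughput figures quoted above (57,000 nucleotides per run versus up to 1.2 billion letters per run). This is only a sketch: the ~3 Gb genome size and the helper name `runs_for_coverage` are assumptions for illustration, and it counts raw bases only, ignoring read overlap, errors, and run time.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000  # assumed ~3 Gb human genome


def runs_for_coverage(bases_per_run, genome_bp=HUMAN_GENOME_BP, coverage=1.0):
    """Number of runs needed to produce `coverage` x `genome_bp` raw bases."""
    return math.ceil(coverage * genome_bp / bases_per_run)


# "old" technology: 57,000 nucleotides per run
old_runs = runs_for_coverage(57_000)
# "new" technology: upper figure of 1.2 billion letters per run
new_runs = runs_for_coverage(1_200_000_000)

print(f"old: {old_runs} runs, new: {new_runs} runs for 1x raw coverage")
```

Real projects need many-fold coverage, so both numbers scale up with the `coverage` argument, but the ratio between them stays the same.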
Differences between platforms:
• Nanotechnology used.
• Resolution of the image analysis.
• Chemistry and enzymology.
• Signal-to-noise detection in the software.
• Software/images/file size/pipeline.
• Cost $$$
source: Francis Ouellette, OICR
3 Gb == [figure not preserved]
Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome” http://tinyurl.com/5f3alk
source: Francis Ouellette, OICR
NGS technologies
• Roche/454 Life Sciences
• Illumina (Solexa)
• ABI SOLiD
• Helicos
• Complete Genomics
• Pacific Biosciences
• Polonator
454 flowgram
454 has difficulty quantizing luminescence of long homopolymers; the problem gets worse with homopolymer length.
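A minimal sketch of why homopolymers are hard to call from a flowgram. It assumes that nucleotides are flowed in a fixed cyclic order (`"TACG"` here is an assumption), that each normalized signal is roughly proportional to the number of identical bases incorporated in that flow, and that the caller simply rounds to the nearest integer. The signal values are made up, and real base callers are more sophisticated than this.

```python
def call_flowgram(signals, flow_order="TACG"):
    """Convert normalized flow signals into a base string.

    Each signal is rounded to the nearest integer, which is taken as the
    homopolymer length for the nucleotide flowed at that step.
    """
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        n = int(signal + 0.5)  # round half up (signals assumed >= 0)
        bases.append(base * n)
    return "".join(bases)


# Hypothetical signals, one per flow: T, A, C, G, T, A, C, G
signals = [1.02, 0.05, 2.10, 0.97, 0.03, 4.55, 1.01, 0.00]
print(call_flowgram(signals))
```

A signal of 1.02 is unambiguously one base, but 4.55 sits almost exactly between a run of four and a run of five A's. The same absolute noise matters more as the run gets longer, which matches the slide's point.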
Roche/454
• first commercially available NGS platform
• long reads (most 100-500 bp; soon 1000 bp)
• paired-end module available
• relatively expensive runs
• homopolymer error rate is high
• common uses: metagenomics, bacterial genome (re)sequencing
• James Watson’s genome done entirely on 454
• UVA Biology Dept. has one (Martin Wu)
Illumina (Solexa)
• 75 bp reads, PE
• 150-250 bp fragments
• 8 lanes per flowcell
• ~3 Gbp per lane
• < 5% error rate
• available at UVA BRF DNA Core
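The per-lane figures above imply a few derived numbers that are handy when planning a run. The sketch below only multiplies out the slide's values; the ~3 Gb human genome size is an assumption, and the coverage figure is raw yield, before any filtering or alignment losses.

```python
READ_LENGTH_BP = 75              # from the slide: 75 bp reads
LANES_PER_FLOWCELL = 8           # from the slide
BASES_PER_LANE = 3_000_000_000   # from the slide: ~3 Gbp per lane
HUMAN_GENOME_BP = 3_000_000_000  # assumption: ~3 Gb human genome

# Approximate number of reads produced by one lane
reads_per_lane = BASES_PER_LANE // READ_LENGTH_BP

# Raw yield of a full flowcell and the mean fold-coverage it represents
bases_per_flowcell = LANES_PER_FLOWCELL * BASES_PER_LANE
mean_coverage = bases_per_flowcell / HUMAN_GENOME_BP

print(f"{reads_per_lane:,} reads per lane")
print(f"{bases_per_flowcell:,} bases per flowcell")
print(f"~{mean_coverage:.0f}x raw coverage of a human genome per flowcell")
```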
ABI SOLiD
• short reads (~35 bp)
• cheapest cost/base
• high fidelity reads (easy to detect errors)
• common uses: SNP discovery
• 1000 Genomes Project
• with PET libraries, all applications within reach …
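The slide does not say why SOLiD errors are easy to detect. One commonly cited reason is the platform's two-base ("color space") encoding, in which each color reports on a pair of adjacent bases. The sketch below assumes the usual four-color scheme (equivalent to XOR-ing 2-bit base codes); the function names and sequences are hypothetical. It shows that a real SNP changes two adjacent colors relative to the reference, whereas an isolated measurement error changes only one.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Return the first base plus one color (0-3) per adjacent base pair."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence from the first base and the color calls."""
    bases = [first_base]
    for c in colors:
        bases.append(BASE[CODE[bases[-1]] ^ c])
    return "".join(bases)


def color_mismatches(ref, colors):
    """Positions where observed colors differ from the reference's colors."""
    _, ref_colors = encode(ref)
    return [i for i, (r, c) in enumerate(zip(ref_colors, colors)) if r != c]


ref = "ATGGCA"
_, ref_colors = encode(ref)

# A true SNP (G -> T at base index 3) alters two adjacent colors
_, snp_colors = encode("ATGTCA")
print(color_mismatches(ref, snp_colors))

# A single miscalled color alters only one position
error_colors = list(ref_colors)
error_colors[2] = 1
print(color_mismatches(ref, error_colors))
```

Because a lone color change is inconsistent with any single-base substitution, an aligner working in color space can flag it as a likely sequencing error rather than a variant.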
Comparing Sequencers
source: Stefan Bekiranov, UVA
Other NGS platforms
• Helicos (Stephen Quake, Stanford)
  • single molecules on slide
  • like Illumina, but no PCR, greater density
• Complete Genomics
  • sequencing factory
  • 10K human genomes/year, $10K each
• Pacific Biosciences – SMRT
  • DNA polymerase bound to laser/camera hookup
  • records a movie of DNA replication with fluorescent dNTPs as single strand moves through nanopore
• Polonator (Shendure and Church)
  • homebrew, $200K flowcell+laser machine
  • allows custom chemistry protocols
NGS applications
• genome (re)sequencing
• de novo genomes: 454 in Bact, small Euks
• SNP discovery and genotyping (barcoded pools)
• targeted, “deep” gene resequencing
• metagenomics
• structural/copy-number variation
• Tumor genome SV/CNV: Illumina/PET
• epigenomics – last week’s seminar
• RNA-seq: now-generation transcriptomics
• ChIP-seq: now-generation DNA-binding
RNA-seq
• “unbiased” digital measure of abundance
  • residual PCR artifacts? Helicos says “yes”
• larger dynamic range than microarray
  • depends on sequencing depth (cost)
• ability to see alt./edited transcripts
  • multiple AS sites confounded; 454?
• Total RNA vs. cDNA
  • 3’ end bias of cDNA
  • non-polyA transcripts in total RNA
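To make the "digital measure of abundance" concrete: raw read counts are usually normalized for transcript length and sequencing depth before genes are compared. The slide names no particular normalization; the sketch below uses RPKM (reads per kilobase of transcript per million mapped reads) as one common choice, with made-up gene names and counts.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical data: gene -> (mapped read count, transcript length in bp)
counts = {"geneA": (500, 2000), "geneB": (500, 500)}
total_mapped_reads = 10_000_000

for gene, (n_reads, length_bp) in counts.items():
    print(gene, round(rpkm(n_reads, length_bp, total_mapped_reads), 2))
```

Both hypothetical genes have the same raw count, but the shorter one comes out four times more abundant once length is accounted for. The dependence on `total_mapped_reads` also shows why dynamic range is tied to sequencing depth: a rare transcript only yields reads at all if the run is deep enough.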
Some things I didn’t get to talk about much:
• personal genome sequencing/medicine
• microbial metagenomics
• ENCODE/modENCODE projects
• HapMap project
• human 1000 Genomes Project (1KGP)
• targeted- and/or deep-resequencing
• microRNAs, piRNAs, ncRNAs, …
• SVs and CNVs (cancer)
• read alignment issues (“mapability”)
Questions? amackey@virginia.edu