Genome sequencing MUPGRET Workshop Joe Polacco
Size of human genome • 23 pairs of chromosomes • 3.1 billion bp • If the code were written out in NYC phone books, the stack would reach the top of the Washington Monument.
Human Genome Project • Began as an academic effort • Initially involved 5 research centers in the US and England. • Soon joined by Celera, a spin-off company.
Some surprises • Initial estimate 100,000 to 150,000 genes but found to be 35,000 to 50,000. (C. elegans ~19,000 genes) • Mass of genome that codes for protein originally estimated as 5% but found to be 1.5%.
Some completely sequenced genomes • Mycoplasma genitalium: 578,000 bp, 400 genes • Haemophilus influenzae: 1,830,138 bp, 1738 genes • E. coli: 4,639,221 bp, 4377 genes • S. cerevisiae: 12 x 10^6 bp, 5885 genes
More genomes • C. elegans: 95.5 x 10^6 bp, 19,820 genes • D. melanogaster: 1.8 x 10^8 bp, 13,601 genes • A. thaliana: 1.17 x 10^8 bp, 25,498 genes
More genomes • M. musculus: 3 x 10^9 bp, ~30,000 genes • H. sapiens: 3.3 x 10^9 bp, 30-50,000 genes • O. sativa: 4.3 x 10^8 bp, 30-63,000 genes
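The genome tables above can be compared more easily as gene density (genes per million bp). The following is a minimal Python sketch using only the sizes and gene counts listed on the slides; where a slide gives a range of gene counts, the lower bound is used, which is an arbitrary choice made for this illustration. The function names are hypothetical.

```python
# Genome sizes (bp) and gene counts as listed on the slides.
# For ranges of gene counts, the lower bound is used (an assumption).
GENOMES = {
    "M. genitalium": (578_000, 400),
    "H. influenzae": (1_830_138, 1_738),
    "E. coli": (4_639_221, 4_377),
    "S. cerevisiae": (12_000_000, 5_885),
    "C. elegans": (95_500_000, 19_820),
    "D. melanogaster": (180_000_000, 13_601),
    "A. thaliana": (117_000_000, 25_498),
    "M. musculus": (3_000_000_000, 30_000),
    "H. sapiens": (3_300_000_000, 30_000),
    "O. sativa": (430_000_000, 30_000),
}


def genes_per_mb(size_bp, genes):
    """Gene density in genes per million base pairs."""
    return genes / (size_bp / 1_000_000)


def density_table(genomes=GENOMES):
    """Return (organism, genes per Mb) pairs, densest genome first."""
    rows = [(name, genes_per_mb(size, genes)) for name, (size, genes) in genomes.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, density in density_table():
        print(f"{name:16s} {density:8.1f} genes/Mb")
```

With these figures the bacterial genomes come out hundreds of times denser in genes than the mouse and human genomes, which fits the small protein-coding fraction noted under "Some surprises".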
The beginning • Human genome project initially discussed at a UC-Santa Cruz meeting in 1985.
What were the concerns? • What will it do to biology? • How will we pay for it? • Is this really science? • Why bother to sequence it all? • All vs. just the genes (skim sequencing)
Dept. of Energy • Initially funded project in 1987. • $5.3 million • Study radiation-induced mutations, repair and effect on humans.
NIH • Joined in 1988. • James Watson leader • 3% of research budget devoted to examining the ethical, legal, and social implications of gene research (ELSI)
Other genomes • Parallel sequencing of E. coli, S. cerevisiae, C. elegans, D. melanogaster, and M. musculus • Why • Work out the technology and methods
Watson’s vision • Sequence it all, not just genes. • Use genetic maps and markers to help assemble the pieces.
Academic players • Wash U • Baylor • Whitehead • Wellcome Trust • Joint Genome Institute—DOE Center
From $1 to 10 cents a finished bp • automated processing of cloned DNA • automated DNA sequencing • computer system to support sequence data • algorithms to assess sequence fidelity, assemble sequences, and “find” genes.
Maps • Thomas Hunt Morgan (early 1900s): low-resolution maps from phenotypic markers • 1970s: restriction maps • 1980s: RFLPs • 1989: Maynard Olson, Leroy Hood, Charles Cantor, and David Botstein propose that the sequence itself is a marker! (STS, sequence-tagged site)
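To see why the per-base cost mattered, here is a rough back-of-the-envelope sketch in Python. It takes the two per-bp prices in the slide title at face value and multiplies them by the 3.1 billion bp genome size quoted earlier; it assumes every base is finished exactly once and ignores all other costs, so the totals are illustrative only. The function name is hypothetical.

```python
GENOME_BP = 3_100_000_000  # human genome size quoted earlier in the slides


def project_cost_dollars(cents_per_bp, genome_bp=GENOME_BP):
    """Total cost in dollars at a given price per finished bp.

    Works in integer cents to avoid floating-point rounding.
    """
    return genome_bp * cents_per_bp // 100


if __name__ == "__main__":
    for cents in (100, 10):
        print(f"{cents:3d} cents/bp -> ${project_cost_dollars(cents):,}")
```

Under these assumptions the drop from $1 to 10 cents per bp takes the sequencing bill from $3.1 billion to $310 million.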
PCR • Polymerase Chain Reaction • http://www.dnai.org/b/index.html • Techniques • Amplifying • Making copies of DNA
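The amplifying power of PCR comes from doubling: in the ideal case every cycle copies every molecule present. A minimal Python sketch of that arithmetic follows. The `efficiency` parameter (the fraction of molecules copied per cycle) is an assumption added for illustration; the slides do not discuss it.

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Expected number of DNA copies after a number of PCR cycles.

    efficiency = 1.0 is the ideal case (perfect doubling each cycle);
    lower values model cycles in which only some molecules are copied.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(f"{n} cycles: {pcr_copies(1, n):,.0f} copies from one molecule")
```

In the ideal case a single template molecule becomes about a thousand copies after 10 cycles and about a billion after 30.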
The PCR revolution • 1985 • Kary Mullis-Cetus Corporation • No need to send clones back and forth • Allowed automated DNA sequencing • No need for large clone repository for all human genes • Unrestricted access to genes via public sequence databases.
Kary Mullis talks about PCR • http://www.dnai.org/b/index.html • Techniques • Amplifying • Interviews • Making DNA copies • Naming PCR
Sequencing-the old way • Maxam and Gilbert or Sanger methods • http://www.dnai.org/b/index.html • Techniques • Sorting and Amplifying • Early DNA sequencing • http://www.dnai.org/b/index.html • Techniques • Sorting and Amplifying • Interviews • Dideoxy method of sequencing
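The logic of the dideoxy (Sanger) method can be sketched in a few lines of Python: each of four reactions stops chain growth at one kind of base, the fragments are sorted by length, and the sequence is read from shortest to longest. This is a simplified, deterministic illustration, not a model of the chemistry. It assumes every possible termination product is present and counts fragment lengths from the end of the primer; all names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template):
    """Strand built by the polymerase: reverse complement of the template, 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def termination_lengths(template):
    """For each ddNTP reaction (lane), the fragment lengths at which chains stop."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(synthesized_strand(template), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence by ordering all bands from shortest to longest fragment."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    lanes = termination_lengths("ATGC")
    print(lanes)
    print(read_gel(lanes))
```

Reading the four lanes together recovers the newly synthesized strand, which is the reverse complement of the template.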
Automated Sequencing • Automation made possible by new dye chemistry developed by Leroy Hood and Lloyd Smith at Cal. Inst. Tech. in 1986. • http://www.dnai.org/b/index.html • Techniques • Sorting and Amplifying • Cycle Sequencing
Inside the automated sequencer • Collaboration with ABI produced first automated sequencer. • Laser detection of each bp. • http://www.dnai.org/b/index.html • Techniques • Sorting and Amplifying • Interviews • Making sequencing automated • Inside an automated sequencer
Sequencing • Detecting all four nucleotides in one lane quadrupled the output of a single sequencing gel. • DuPont dye terminators: a distinct dye attached to each of the four terminating nucleotides, so all four could be used in the same sequencing reaction. • Capillary electrophoresis eliminated the need to cast gels.
Sequencing the Genome an Overview • Show sequencing.exe file containing movie about sequencing the human genome.
Two approaches to sequence the genome • Hierarchical shotgun clone libraries • Use map to pick pieces of genome in order, break them, sequence and reassemble. (Watson) • Whole genome shotgun • Break up genomic DNA randomly, sequence several genome equivalents, and reassemble. (Venter)
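"Several genome equivalents" can be made concrete with the usual coverage arithmetic: coverage is total bases sequenced divided by genome length, and if reads land at random the fraction of the genome never sampled is roughly e^(-coverage) (a Poisson approximation). The Python sketch below illustrates this; the 500 bp read length is an assumption chosen for the example, not a figure from the slides.

```python
import math


def coverage(num_reads, read_len, genome_len):
    """Average number of times each base is sequenced (genome equivalents)."""
    return num_reads * read_len / genome_len


def fraction_uncovered(cov):
    """Expected fraction of the genome hit by no read (Poisson approximation)."""
    return math.exp(-cov)


def reads_needed(target_cov, read_len, genome_len):
    """Number of reads required to reach a target average coverage."""
    return math.ceil(target_cov * genome_len / read_len)


if __name__ == "__main__":
    genome_len = 3_100_000_000  # genome size quoted earlier in the slides
    read_len = 500              # assumed read length, for illustration only
    for cov in (1, 5, 10):
        n = reads_needed(cov, read_len, genome_len)
        print(f"{cov:2d}x: {n:,} reads, ~{fraction_uncovered(cov):.4%} of bases unsampled")
```

Random sampling leaves gaps at low coverage, which is why both strategies sequence the same DNA many times over.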
Hierarchical Shotgun Clone Libraries • Top-down strategy • Ordered library of clones based on large scale maps. • Subclone larger inserts into sequencing vector. • Reassemble sequence. • Based on order.
ESTs • Expressed sequence tags • Reverse transcribe mRNA and sequence. • Venter used nonspecific primer to randomly amplify 150-400 bp fragments of genes.
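The EST idea can be illustrated with simple string operations in Python: reverse transcription produces a DNA strand complementary to the mRNA, and an EST is a short stretch read from that cDNA. This is only a sketch of the base-pairing logic. The function names and the example coordinates are hypothetical, and the random priming described on the slide is replaced by an explicit start position.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """First-strand cDNA made by reverse transcriptase, written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def coding_strand_dna(mrna):
    """DNA version of the mRNA sequence (U replaced by T)."""
    return mrna.upper().replace("U", "T")


def est_fragment(mrna, start, length):
    """A short tag of the expressed gene: `length` bases starting at `start`."""
    return coding_strand_dna(mrna)[start:start + length]


if __name__ == "__main__":
    mrna = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
    print(first_strand_cdna(mrna))
    print(est_fragment(mrna, 3, 12))
```

In practice the tags are a few hundred bases long, as on the slide (150-400 bp); the short strings here just keep the example readable.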
Patent controversy • NIH announced it would seek a patent on Venter’s ESTs. • Very controversial since the sequences had no known function. • More appropriate to a private company. • Watson called it “sheer lunacy” and resigned after a conflict with NIH director Bernadine Healy.
More patent • Many biotech companies arose at the time to mine ESTs and applied for patents on the genes for diagnostics and pharmaceuticals. • NIH withdrew patent application. • ESTs must be novel to be patented. • ESTs must be useful to be patented.
The result • No patents granted thus far on genes without known function.
Whole genome shotgun • Break the genome into many pieces, often by mechanical shearing. • Sequence the pieces and reassemble. • Weber (Marshfield Medical Research Foundation) and Myers (U of AZ) proposed the method as a way to speed sequencing. • 1998: Venter heads the newly formed Celera and promises to sequence the human genome in 3 years for $300 million.
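The "break, sequence, reassemble" idea can be sketched as a toy program in Python. Random reads are drawn from a sequence, and a greedy procedure repeatedly merges the two pieces with the longest suffix-prefix overlap. Greedy merging is a classroom stand-in chosen for brevity, not the assembler used by any genome project, and it is easily fooled by repeated sequence. All names and parameters are hypothetical.

```python
import random


def shotgun_reads(genome, read_len, n_reads, seed=0):
    """Draw n_reads random fixed-length fragments from the genome."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = (rng.randint(0, last_start) for _ in range(n_reads))
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlaps of min_len remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # remaining contigs do not overlap: gaps in coverage
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs = [c for c in rest if c not in merged] + [merged]
    return contigs


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"]))
    print(greedy_assemble(shotgun_reads("ATGCGTACGTTAGC", 8, 12)))
```

If the reads leave part of the sequence unsampled, the procedure stops with several contigs rather than one, which is the toy version of a gap in a draft assembly.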
Accelerated the public project. • Whole genome method was tested by sequencing 120 Mbp of Drosophila genome.