Human Genome Project • Seminal achievement. • Scientific milestone. • Scientific implications. • Social implications.
HGP: Background • International Human Genome Sequencing Consortium: • Proposed 1985, endorsed in 1988. • 20 governmental groups. • “Public project.” • Craig Venter & Celera Genomics: • Founded 1998. • Aimed to sequence the genome in 3 years. • Technology: automation, computers. • Had access to the public project’s data. • Race ends in a tie, Feb. 2001: the public consortium publishes in Nature, Celera in Science.
International Human Genome Sequencing Consortium • Approach was conservative and methodical. • Had to wait for technology. • First produced a clone-based physical map of the genome to serve as a scaffold for the later sequence data: • Break the genome into chunks of DNA whose chromosomal positions are known from maps, and clone them into bacteria using BACs (bacterial artificial chromosomes). • Digest each BAC insert into small fragments. • Sequence the small fragments. • Stitch the fragment sequences together to assemble each BAC clone’s sequence. • Assemble the genome sequence from the BAC clone sequences, using the clone-based physical map.
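As a rough illustration of why the physical map matters, the sketch below places already-sequenced clones onto a chromosome using their known map positions. The function name, the toy sequences, and the use of "N" for uncovered positions are assumptions made for this example, not part of the consortium's actual pipeline.

```python
def assemble_from_map(clones):
    """Merge sequenced clones into one sequence using known map positions.

    `clones` is a list of (start_position, sequence) pairs. Positions are
    0-based offsets on the chromosome, taken from the physical map.
    Positions not covered by any clone are left as 'N' (a gap).
    """
    if not clones:
        return ""
    length = max(start + len(seq) for start, seq in clones)
    genome = ["N"] * length
    # Place clones in map order; overlapping clones write over shared positions.
    for start, seq in sorted(clones):
        for offset, base in enumerate(seq):
            genome[start + offset] = base
    return "".join(genome)


# Toy data (hypothetical): two overlapping clones and one isolated clone.
clones = [(0, "ATGCG"), (3, "CGTTA"), (10, "GGC")]
assembled = assemble_from_map(clones)
print(assembled)
```

Because every clone's position is known in advance, no overlap search is needed, and uncovered stretches show up explicitly as gaps.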
Celera • Used "shotgun sequencing" (no organized map). • Shreds the genome randomly into small fragments, with no record of where they are physically located. • Clones and sequences the fragments. • Uses a computer to stitch the genome together by matching the overlapping ends of sequenced fragments.
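The overlap-matching step can be sketched with a simple greedy merge: repeatedly join the two fragments that share the longest suffix-to-prefix overlap. This is a toy illustration only. The function names, the minimum overlap of 3 bases, and the example fragments are assumptions, and real assemblers must also cope with sequencing errors, repeats, and both DNA strands.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise by longest overlap until none remain."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Toy fragments (hypothetical) with 3-base overlaps between neighbours.
contigs = greedy_assemble(["ATGCGT", "CGTTAC", "TACGGA"])
print(contigs)
```

Unlike the map-based approach, the order of fragments here is inferred entirely from the sequence overlaps themselves.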
Timeline • Genome sequencing driven by technology. • 1985: 500 base pairs per day by hand. • 1985-86: PCR and automated DNA sequencing. • 1992: BACs. • 2000: 1000 bases per second.
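To put the two throughput figures in perspective, the short calculation below estimates how long a single pass over the genome would take at each rate. It assumes the approximate genome size of 3.2 Gb quoted in this presentation, one sequencing effort running continuously, and no redundant coverage, all of which are simplifications.

```python
GENOME_BP = 3_200_000_000      # approximate human genome size (~3.2 Gb)
BP_PER_DAY_1985 = 500          # manual sequencing, 1985
BP_PER_SECOND_2000 = 1000      # automated sequencing, 2000
SECONDS_PER_DAY = 86_400


def days_to_read(genome_bp, bp_per_day):
    """Days needed for one pass over the genome at a given daily rate."""
    return genome_bp / bp_per_day


days_1985 = days_to_read(GENOME_BP, BP_PER_DAY_1985)
days_2000 = days_to_read(GENOME_BP, BP_PER_SECOND_2000 * SECONDS_PER_DAY)

print(f"1985 rate: {days_1985 / 365.25:,.0f} years")
print(f"2000 rate: {days_2000:.0f} days")
```

Under these assumptions the 1985 rate corresponds to thousands of years for a single pass, while the 2000 rate brings it down to a matter of weeks.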
Waiting for Technology • Eyes on the human genome. • While waiting for technology, other genomes were sequenced.
Current Status • Human genome ~3.2 Gb. • “Rough draft” sequence of the human genome. • Have sequenced 90% of the 2.5 Gb of gene-rich (euchromatic) DNA. • What is considered finished? • Fewer than 1 base in 10,000 is incorrectly assigned. • More than 95% of the euchromatic regions are assigned. • Each gap is smaller than 150 kb.
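The three "finished" criteria can be written as a single check. The function below is a minimal sketch with hypothetical parameter names and made-up input values. It only encodes the thresholds listed on the slide.

```python
def is_finished(error_rate, euchromatic_fraction_assigned, gap_sizes_bp):
    """Apply the three 'finished sequence' thresholds from the slide.

    error_rate: fraction of bases incorrectly assigned (e.g. 1/20000).
    euchromatic_fraction_assigned: fraction of euchromatic DNA assigned (0-1).
    gap_sizes_bp: sizes of the remaining gaps, in base pairs.
    """
    return (
        error_rate < 1 / 10_000
        and euchromatic_fraction_assigned > 0.95
        and all(gap < 150_000 for gap in gap_sizes_bp)
    )


# Hypothetical values for illustration only.
print(is_finished(1 / 20_000, 0.97, [50_000, 120_000]))  # meets all three
print(is_finished(1 / 20_000, 0.90, [50_000, 120_000]))  # coverage too low
```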
Access to Information • All public project data on the Internet. • NCBI Website: www.ncbi.nlm.nih.gov. • Human genome database. • Sequence and mapping tools.
Database Search Example • The genome database has many tools to locate a gene of interest or search for potential traits of the gene. • Example: chromosomal map search result for the "breast cancer–causing gene" BRCA2 (figure not shown).
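A search like this can also be expressed programmatically. The sketch below only builds a query URL and does not contact the server. It assumes NCBI's E-utilities `esearch` endpoint and a `[Gene Name]`/`[Organism]` query syntax, neither of which is mentioned on the slides, so both should be checked against NCBI's current documentation before use. The helper name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities search service (not named in the slides).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(symbol, organism="Homo sapiens"):
    """Build (but do not send) a gene-database search URL for a gene symbol."""
    params = {
        "db": "gene",
        # Field tags are an assumption; verify against NCBI's search help.
        "term": f"{symbol}[Gene Name] AND {organism}[Organism]",
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_gene_search_url("BRCA2")
print(url)
```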
Early Statistics • Only 28% of the genome is transcribed into RNA. • Only 1.1%-1.4% of the genome actually encodes protein (about 4-5% of the transcribed portion). • Surprises: • More junk DNA than expected. • Fewer genes than expected.
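The relationship between the two percentages is simple division. The snippet below reproduces it from the slide's own figures, and the variable names are illustrative.

```python
transcribed = 0.28                        # fraction of genome transcribed into RNA
coding_low, coding_high = 0.011, 0.014    # fraction of genome encoding protein

# Protein-coding share of the transcribed portion.
share_low = coding_low / transcribed
share_high = coding_high / transcribed

print(f"{share_low:.1%} to {share_high:.1%} of transcribed sequence encodes protein")
```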
Junk DNA • No apparent direct biological function. • Long stretches of repeated sequence. • Hot area of investigation. • Human genome has far more repeat DNA than any other organism sequenced so far (over half of the genome). • Parasitic elements: about 45% of the genome derives from selfish, parasitic DNA: • Transposable elements. • May play a role in evolution.
Gene Count • Many fewer genes than expected (about half): • Only 35,000-45,000 genes vs. the previously predicted 100,000. • Only about twice as many as a nematode or a fruit fly. • Twice the genes does not mean twice the complexity. • Alternative splicing: vertebrates are more innovative in how they assemble gene products. • Protein domains are mixed more creatively and in larger numbers by vertebrates. • Genes remain elusive.
Genetic Variation • The International Single Nucleotide Polymorphism (SNP) Map. • Compiled 1.4 million SNPs (single-base-pair differences between individuals). • Used to investigate: • Disease resistance. • Response to therapeutics. • Evolution. • Natural selection. • Individual traits.
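At its simplest, a SNP is a position where two aligned sequences differ by one base. The sketch below finds such positions in two toy sequences. The function name and the sequences are made up for illustration, and real SNP discovery works on aligned reads from many individuals with quality filtering.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for each single-base difference.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Hypothetical aligned sequences from two individuals.
person_1 = "ATGCCTGAAG"
person_2 = "ATGCTTGAAG"
snps = find_snps(person_1, person_2)
print(snps)
```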
Gene Variation Example • Mutations in the "breast cancer gene" BRCA2. • Chromosomal location and beginning sequence with one of the mapped variations.
Future Directions • Fill gaps (refinement). • Bioinformatics. • Sequence additional genomes. • For comparison. • Upcoming: mouse, fish, dogs, kangaroo, chimpanzee (most valuable). • Proteomics. • Gene and Protein Chips (Microarrays).