Chapter 24 topics: Genomics, Proteomics, Bioinformatics

Presentation Transcript


  1. Chapter 24 topics: Genomics, Proteomics, Bioinformatics Student learning outcomes: • Describe tools to obtain DNA sequences of genomes • Explain how microarrays analyze the transcriptome • Describe how proteomics studies proteins of cells • Define how bioinformatics manages vast stores of DNA data Figures: 1, 3-13, 16, 17, 19, 20, 23, 24, 27, 28, 30; Tables 1, 2, 3 Problems: 1, 2*, 3-7, 9, 12*, 15, 17, 18, 20*, 22, 23*, 24, AQ3*, 4

  2. 24.1 Positional Cloning Positional cloning: discover genes for genetic traits • Mapping studies to roughly locate gene of interest to relatively small region of DNA on chromosome • Physical landmarks - relate to gene position: • Restriction Fragment Length Polymorphisms (RFLP): lengths of restriction fragments from a specific enzyme vary among individuals • CpG Islands: DNA with unmethylated CpG is often actively expressed; find with methylation-sensitive restriction enzymes (HpaII vs. MspI for CCGG)

  3. Southern blots detect RFLPs Fig. 1 People differ in presence of particular HindIII site
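
The RFLP idea on the slides above can be sketched in a few lines of Python. This is an illustration only: the two short "alleles" are made-up sequences, and `fragment_lengths` is a hypothetical helper name. The one fixed fact it relies on is that HindIII recognizes AAGCTT and cuts after the first A.

```python
def fragment_lengths(dna, site="AAGCTT", cut_offset=1):
    """Return the fragment lengths produced by cutting a linear DNA at every site.

    cut_offset=1 places the cut after the first base of the site (A^AGCTT for HindIII).
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Two invented alleles that differ by one base inside the HindIII site.
allele_with_site = "CCCC" + "AAGCTT" + "GGGGGG"
allele_without_site = "CCCC" + "AAGCTA" + "GGGGGG"

print(fragment_lengths(allele_with_site))     # [5, 11]
print(fragment_lengths(allele_without_site))  # [16]
```

A single base change removes the site, so one person yields two fragments and another yields one longer fragment. That difference in fragment length is what the Southern blot detects.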

  4. Classic example: Identifying the Gene Mutated in Human Huntington’s Disease (HD) • Dominant disease, late onset, degenerative • Wexler and Gusella used RFLPs in very large families with the disease to map the HD gene near the end of chromosome 4 • Mutation causing disease is expansion of a CAG repeat from the normal range of 11-34 copies to an abnormal range of > 38 copies (triplet expansion) • Extra repeats -> extra Gln inserted into huntingtin, product of HD gene • Huntingtin has normal role in brain: interferes with transcription factor SP1 binding TAF130 • Mouse knockout: heterozygotes have neurological problems; homozygous null mutants die
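
As a rough illustration of the triplet-expansion bullet, the sketch below counts the longest uninterrupted run of CAG in a sequence and compares it with the two ranges quoted on the slide (11-34 normal, > 38 abnormal). The function names and test sequences are hypothetical, and copy numbers that fall outside both quoted ranges are reported as such rather than guessed at.

```python
import re


def longest_cag_run(dna):
    """Return the number of copies in the longest uninterrupted CAG repeat."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat(copies):
    """Compare a copy number with the ranges quoted on the slide."""
    if 11 <= copies <= 34:
        return "normal range"
    if copies > 38:
        return "expanded (disease range)"
    return "outside both ranges quoted on the slide"


# Invented test sequences: a start codon, a CAG tract, and a trailing codon.
normal_like = "ATG" + "CAG" * 20 + "CCG"
expanded_like = "ATG" + "CAG" * 45 + "CCG"

print(longest_cag_run(normal_like), classify_repeat(longest_cag_run(normal_like)))
print(longest_cag_run(expanded_like), classify_repeat(longest_cag_run(expanded_like)))
```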

  5. RFLPs helped locate Huntington’s disease gene Fig. 3 Combinations of RFLP distinguish 4 possible haplotypes Fig. 4 Southern blot defines haplotype genotypes of members

  6. HD gene identified from studies of large families Pedigree studies, molecular studies of haplotypes, and correlation with disease: led to cloning of gene and prediction for disease (variable age of onset) Fig. 24.5 Haplotype C is associated with disease - predictive

  7. 24.2 Sequencing Genomes • Information from genome sequences: • Location of exact coding regions for all genes • Spatial relationships among genes, exact distances between them in bp • Sanger dideoxy sequencing 1977 (φX174 phage) • How is coding region recognized? • Contains an ORF long enough to code for protein • ORF (open reading frame) must • Start with ATG triplet • End with stop codon • Phage or bacterial ORF same as coding region • Eukaryotic ORF definition is more difficult: introns
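
The ORF rules listed above (start with ATG, end with a stop codon, be long enough to code for a protein) can be sketched as a simple scan. This is a minimal version under stated assumptions: it looks only at the three forward reading frames, ignores introns (so it suits the phage or bacterial case), and uses an arbitrary `min_codons` cutoff chosen for the toy sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) index pairs of ORFs in the three forward frames.

    An ORF runs from the first ATG in a frame to the next in-frame stop codon;
    end is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Invented sequence: 2 bp of leader, ATG, four GCT codons, TAA, 2 bp of trailer.
toy_sequence = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(toy_sequence))  # [(2, 20)]
```

Real gene finders also scan the reverse strand and use much longer length cutoffs; the short cutoff here only keeps the example readable.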

  8. Genome Results (Table 1 examples) Numerous RNA or DNA sequences of genomes of viruses and organisms have been obtained: • Phages, viruses • Bacteria • Animals • Plants • Human, Neanderthal Comparison of related genomes (close or distant) sheds light on evolution of species: phylogeny from combination of traditional and molecular data

  10. Human Genome Project (3 x 10^9 bp haploid) A. Original plan systematic and conservative: (1990) • Funded by NIH, Dept. of Energy • Prepare genetic, physical maps with markers: then piece DNA sequences together in proper order • Plan most sequencing after mapping complete • [Also many model organisms sequenced to compare] • Celera, a private, for-profit company (J.C. Venter) vowed to complete rough draft of genome by 2000 B. Celera method was shotgun sequencing: • Whole genome chopped up and cloned • Clones sequenced randomly • Sequences pieced together by computer programs

  11. Vectors for Large-Scale Genome Projects BAC YAC Figs. 7, 8 • Two high-capacity vectors for Human Genome Project • Mapping mostly used yeast artificial chromosomes (YACs), which accept about a million base pairs • Sequencing used bacterial artificial chromosomes (BACs), which accept about 300,000 bp • BACs are more stable, easier to work with than YACs

  12. A. Clone-by-Clone Strategy • Mapping requires set of physical landmarks to relate positions of cloned genes, then sequence • Some markers are genes; many are nameless stretches of DNA (must organize it all) • RFLPs– want polymorphic regions • Ideally different pattern for people with disease vs. normal people locates disease genes (like HD) • VNTRs, variable number tandem repeats of small seq. • Mini-satellite, Highly polymorphic, useful for forensics • STSs, sequence-tagged unique sites, expressed-sequence tags and microsatellites

  13. Sequence-Tagged Sites- physical maps • STSs unique sequences • 60-1000 bp long • Detectable by PCR • Need sequence information for primers; • Need not be in a gene • Design short primers • Hybridize few hundred bp apart • Amplify predictable length of DNA – see on gel Fig. 9
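
To make the "amplify predictable length of DNA" point concrete, here is a small in-silico PCR sketch. The template and primers are invented, `pcr_product_length` is a hypothetical helper, and only exact primer matches on a single strand orientation are considered.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def pcr_product_length(template, forward_primer, reverse_primer):
    """Return the expected product length in bp, or None if a primer has no exact match."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is what appears on the template strand given here.
    site = template.find(reverse_complement(reverse_primer), start + len(forward_primer))
    if site == -1:
        return None
    return site + len(reverse_primer) - start


# Invented STS: two primer sites separated by 20 bp of filler.
template = "TTTT" + "ACGTAC" + "G" * 20 + "CCATGA" + "AAAA"
print(pcr_product_length(template, "ACGTAC", "TCATGG"))  # 32
```

A clone that contains the STS gives a band of the predicted size on a gel; a clone that lacks it gives no product, which corresponds to the `None` case here.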

  14. Sequence-Tagged Sites - Physical Maps Align cloned sequences to form contigs (contiguous overlapping DNA sequences) Fig. 10
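
Building a contig from overlapping sequences can be illustrated with a toy greedy merge. This is a simplified stand-in, not the method used by either genome project: it assumes error-free reads, a single strand orientation, and exact suffix-to-prefix overlaps of at least `min_len` bases. All names and reads are hypothetical.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_reads(reads, min_len=3):
    """Greedily merge the pair with the longest overlap until none remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    k = overlap(left, right, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the leftover reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three invented reads that tile one 16-bp region.
toy_reads = ["ATGCGTAC", "GTACCTGA", "CTGAAGGT"]
print(merge_reads(toy_reads))  # ['ATGCGTACCTGAAGGT']
```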

  15. Shotgun-Sequencing Method used by Celera Fig. 11: Connect overlapping BAC clones by identification of STCs, sequence-tagged connectors

  16. Human Genome Project • Working draft (2001) reported by Venter (Celera) and NIH/DOE consortium: • Estimated genome contained fewer genes than anticipated – 25,000 to 30,000 • 2007 completed version • About half of genome from action of transposons • Bacteria also donated dozens of genes • Provides information about human evolution: chimpanzee, Neanderthal, many other genomes

  17. Findings from Chromosome 22 – first human chromosome sequenced 679 annotated genes: • 247 Known genes, previously identified • 150 Related genes, homologous to known genes • 134 Pseudogenes, sequences homologous to known genes, but defects preclude proper expression Coding regions of genes only a tiny fraction • Annotated genes 39% of total length • Exons only 3% • Repeat sequences (Alu, LINEs, etc.) are 41% Large chunks of human chromosome 22q conserved in several different mouse chromosomes

  18. Homologs • Orthologs: homologous genes in different species evolved from common ancestor: • 8 regions to 7 mouse chromosomes • Paralogs: homologous genes that evolved by gene duplication within a species • Homologs: any kind of homologous genes, both orthologs and paralogs Fig. 13 Large chunks of human chromosome 22q conserved in several different mouse chromosomes (113 genes)

  19. Chromosome 21 • Relatively few genes • 225 genes • 59 pseudogenes • All 24 genes shared with mouse chromosome 10 are in same order in both chromosomes • Disease genes associated with chromosome 21: • Down syndrome is caused by an extra copy of the chromosome • Alzheimer’s, ALS (Lou Gehrig’s disease) genes

  20. The X Chromosome • Sequence of 151 Mb of human X chromosome: - 1098 protein-encoding genes • 168 genes governing X-linked phenotype • Genes for 173 noncoding RNAs • Lot of genes identified for human disease (sex-linked) • Chromosome rich in LINE1 repetitive elements • Involved in X inactivation mechanism in female cells • XIST RNA (X-inactivation specific) • 32-kb RNA responsible for X-inactivation, heterochromatin X (and partner Y) evolved from ancestral autosomes

  21. Other Vertebrate Genomes • Comparing human genome with other vertebrates: • helped identify many human genes • help identify defective genes for human genetic diseases • Closely related species (mouse) identify when and where genes are expressed; predict when and where human genes likely expressed Fig. 14 Mouse, human

  22. The Minimal Genome – J. Craig Venter • Define essential gene set of simple organism • Mutate one gene at a time; see which required for life • In theory, could define minimal genome: set of genes required for life • Minimum genome likely larger than essential gene set • Sequence a small genome, then delete genes Mycoplasma genitalium, 580 kb (480 protein-coding genes) • No cell wall, intracellular parasite, only glycolysis • 2010 placed synthetic minimal genome (1 x 10^6 bp) into Mycoplasma cell lacking genes: • new life form that can live and reproduce under lab conditions – controversial approaches

  23. The Barcode of Life • CBOL (Consortium for the Barcode of Life): plan to create barcode to identify any species of life on earth • First such barcode - sequence of 648-bp piece of mitochondrial COI gene from each organism • Cytochrome c oxidase • Isolate mitochondrial DNA, sequence • Sequence can uniquely identify most organisms • Other sequences needed for plants and bacteria, since less variation among their COI genes
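
The barcode idea amounts to matching an unknown sequence against a reference library. The sketch below assumes pre-aligned sequences of equal length and uses invented 10-bp "barcodes" in place of the 648-bp COI region; the species names and the `best_match` helper are hypothetical.

```python
def identity(seq_a, seq_b):
    """Fraction of positions that match between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def best_match(query, references):
    """Return the name of the reference barcode most similar to the query."""
    return max(references, key=lambda name: identity(query, references[name]))


# Invented reference library; real barcodes are about 648 bp long.
reference_barcodes = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTCGTTC",
}
unknown_sample = "ACGTACGTAA"

print(best_match(unknown_sample, reference_barcodes))  # species_A
```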

  24. 24.3 Applications: Functional Genomics • Functional genomics deals with function or expression of genomes • Transcriptome: all transcripts an organism makes at any given time • Genomic functional profiling: use of genomic information to block expression systematically • Proteomics: study structures and functions of protein products of genomes

  25. Transcriptomics • Study all transcripts organism makes • Create DNA microarrays (microchips) that hold 1000s of cDNAs or oligos • Hybridize labeled RNAs (cDNAs) from cells to chips • Intensity of hybridization at each spot reveals the extent of expression of corresponding gene • Arrays measure expression of many genes at once • Clustered expression of genes in time and space suggests products of these genes collaborate in some process -> function • Affymetrix makes chips, 25-mer unique sequences

  26. DNA chips: Oligonucleotides on a Glass Substrate Fig. 16 Serum-starved human cells cDNA (labeled green); serum-fed cells cDNA (red) Equal expression of mRNA = yellow Fig. 17
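
The red/green/yellow readout described above can be expressed as a log ratio of the two channel intensities. In this sketch the twofold cutoff (`threshold=1.0` in log2 units) is an assumption chosen for illustration, the intensity values are invented, and `spot_color` is a hypothetical helper.

```python
import math


def spot_color(red_intensity, green_intensity, threshold=1.0):
    """Classify a spot by log2(red/green).

    Red channel = serum-fed cells, green channel = serum-starved cells,
    following the labeling on the slide. Intensities must be positive.
    """
    log_ratio = math.log2(red_intensity / green_intensity)
    if log_ratio >= threshold:
        return "red"      # higher expression in serum-fed cells
    if log_ratio <= -threshold:
        return "green"    # higher expression in serum-starved cells
    return "yellow"       # roughly equal expression


print(spot_color(800, 200))  # red
print(spot_color(100, 400))  # green
print(spot_color(500, 450))  # yellow
```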

  27. Genomic Functional Profiling • Deletion analysis - mutants created by replacing genes with antibiotic resistance gene flanked by oligomers serving as barcode for that mutant • Functional profile can be obtained by growing whole group of mutants together under various conditions to see which mutants disappear most rapidly Fig. 21 Growth of yeast mutants on galactose C source
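
One way to picture "which mutants disappear most rapidly" is to compare barcode counts before and after pooled growth. The counts and mutant names below are invented, the pseudocount is an assumption to avoid dividing by zero, and normalization for total read depth is left out to keep the sketch short.

```python
import math


def depletion_ranking(counts_start, counts_end, pseudocount=1):
    """Rank mutants from most depleted to least, by log2 fold change of barcode counts."""
    scores = {
        mutant: math.log2(
            (counts_end.get(mutant, 0) + pseudocount) / (counts_start[mutant] + pseudocount)
        )
        for mutant in counts_start
    }
    return sorted(scores, key=scores.get)


# Invented barcode counts for a pool of three deletion mutants.
before_growth = {"mutant_A": 1000, "mutant_B": 1000, "mutant_C": 1000}
after_growth = {"mutant_A": 10, "mutant_B": 900, "mutant_C": 1100}

print(depletion_ranking(before_growth, after_growth))
# ['mutant_A', 'mutant_B', 'mutant_C']
```

In this made-up pool, mutant_A drops out fastest, which would suggest that its deleted gene matters most under the tested growth condition.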

  28. RNAi Analysis • Genomic functional analysis: RNAi inactivates genes • Ex. genes involved in early embryogenesis in C. elegans: • 661 important genes (early embryo defect) • 326 involved in embryogenesis Fig. 22: initial screen showed which genes were mutated with RNAi; Then see which stage of embryogenesis affected

  29. * Locating Target Sites for Transcription Factors (ChIP-chip) • Chromatin immunoprecipitation (ChIP) followed by DNA microarray analysis can identify DNA-binding sites for activators and other proteins • Small genome organisms - all intergenic regions can be included in microarray • If genome is large, not practical • To narrow areas of interest, can use CpG islands • Non-methylated CpG associated with gene control region • If timing/conditions of activator’s activity are known, control regions of genes known to be activated at those times, or under those conditions, can be used

  30. ChIP-chip assays locate target sites for specific transcription factors • ChIP with specific antibody • PCR adding generic primer to all • fluorescent label • microarray • See Fig. 25 • Yeast Gal4 protein binding sites Fig. 24

  31. In Situ Expression Analysis: ‘Mouse blots’ • Mouse as human surrogate in large-scale expression studies (ethically impossible in humans) • Studied expression of almost all mouse orthologs of genes on human chromosome 21 • Followed stages of embryonic development (E) • Catalogued embryonic tissues in which genes expressed Fig. 26

  32. Single-Nucleotide Polymorphisms; pharmacogenomics • Single-nucleotide polymorphisms (SNPs) are single bp differences between people; account for many genetic conditions caused by single genes, even multiple genes • Might be able to predict response to a drug • New focus for therapeutics • Haplotype map with > 1 million SNPs: sort out important SNPs from those with no effect

  33. 24.4 Proteomics • Proteome: all proteins produced by an organism • Proteomics: Study of all proteins, or subsets • More accurate picture of gene expression than transcriptomics studies: • Sometimes mRNA is degraded, not translated • First separate proteins, often on massive scale • 2-D gel electrophoresis is good tool • After separation, identify proteins • Digest proteins with proteases • Identify peptides by mass spectrometry
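
The "digest proteins with proteases" step can be sketched in silico. The slide does not name a protease; trypsin is assumed here because it is widely used for this purpose, with the usual simplified rule of cutting after K or R unless the next residue is P. The protein sequence is invented.

```python
def trypsin_digest(protein):
    """Split a protein into peptides, cutting after K or R except before P."""
    peptides = []
    current = ""
    for i, residue in enumerate(protein):
        current += residue
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)  # C-terminal peptide without a trailing K/R
    return peptides


toy_protein = "MKTAYRPLKGG"
print(trypsin_digest(toy_protein))  # ['MK', 'TAYRPLK', 'GG']
```

Each resulting peptide has a mass that can be measured and matched against masses predicted from database sequences.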

  34. MALDI-TOF Mass Spectrometry Matrix-assisted laser desorption ionization – time of flight Peptides ionized; time to reach detector accurately reflects mass Fig. 27
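
Why flight time reflects mass follows from energy conservation: an ion of charge z accelerated through voltage V gains kinetic energy zeV = (1/2)mv^2, so its time over a drift tube of length L is t = L * sqrt(m / (2zeV)). The sketch below evaluates that relation; the 20 kV voltage and 1 m tube length are placeholder values, not instrument specifications from the slides.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge=1, voltage=20000.0, tube_length=1.0):
    """Idealized time (seconds) for an ion to cross a field-free drift tube."""
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return tube_length / velocity


# Quadrupling the mass doubles the flight time (t is proportional to sqrt(m/z)).
print(flight_time(4000) / flight_time(1000))
```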

  35. Detecting Protein-Protein Interactions Epitope tag on one protein (from gene level) permits isolation of complex containing that protein using affinity resins Common epitopes: His6-tag, HA-tag, Flag-tag, TAP-tag ** In future, microchips with antibodies may allow analysis of proteins in complex mixtures without separation Fig. 28

  36. Identifying Protein Interactions, networks • Most proteins function with other proteins • Yeast two-hybrid analysis • Protein microarrays • Immunoaffinity chromatography with mass spectrometry Fig. 29. Identifying proteins binding kinases using Flag-tagged Kss1 or Cdc28

  37. 24.5 Bioinformatics • Bioinformatics: building and using biological databases • DNA sequences of genomes • mining massive amounts of biological data for meaningful knowledge about gene structure and expression • National Center for Biotechnology Information (NCBI) website: vast store of biological information (genomic and proteomic) • Start with DNA sequence, discover gene, then compare that sequence with that of similar genes or organisms • View 3D protein structures on computer

  38. Review questions 2. What kind of mutation gave rise to Huntington disease? 12. Compare/contrast the clone-by-clone sequencing strategy with the shotgun sequencing strategy for large genomes. 15. The pufferfish genome is nine times smaller than the human genome, but contains as many genes. How can that be? 20. Describe a hypothetical experiment using a DNA microarray to measure transcription from SV40 viral genes at two stages of infection of cells by the virus. Show example results.