Explore chromosome mutations, duplication events, gene and species trees, pseudogenes, orthology, and more to comprehend gene conservation, function, and rearrangements in the genome. Discover the significance of ultraconserved elements and how to measure genomic conservation effectively.
MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean http://cs273a.stanford.edu [Bejerano Aut08/09]
Lecture 8 • Chains & Nets cont’d • Conservation and Function • Gene Regulation
Chromosome Mutations Five types exist: deletion, inversion, translocation, nondisjunction, duplication.
Gene tree vs. Species tree A gene tree evolves with respect to a species tree. Events: speciation, duplication, loss (deletion).
Chains & Nets
Net highlights rearrangements A large gap in the top level of the net is filled by an inversion containing two genes. Numerous smaller gaps are filled in by local duplications and processed pseudogenes.
Useful in finding pseudogenes Ensembl and Fgenesh++ automatic gene predictions confounded by numerous processed pseudogenes. Domain structure of resulting predicted protein must be interesting!
And Retrogenes
Conservation Track Documentation
A Rearrangement Hot Spot Rearrangements are not evenly distributed. Roughly 5% of the genome is in hot spots of rearrangements such as this one. This 350,000 base region is between two very long chains on chromosome 7.
Cautionary Note 1
Cautionary Note 2
Same Region… same in all the other fish
Orthology vs. Paralogy
Meet Your Genome contd. [Human Molecular Genetics, 3rd Edition]
Sequence Conservation implies Function • (but which function/s?...)
Comparative Genomics of Distantly related species: functional region!
human            ...CTTTGCGA-TGAGTAGCATCTACTATTT...
another species  ...ACGTGGGACTGACTA-CATCGACTACGA...
(both descend from a common ancestor)
Note: the inverse, “no conservation, no function”, is a much weaker statement given current knowledge.
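As a rough illustration of how conservation between two aligned sequences can be quantified, the sketch below computes percent identity over the columns where neither sequence has a gap. The function name `percent_identity` and the choice to skip gapped columns are assumptions made for this example, not something the lecture specifies. The two strings are the fragments shown on the slide; because the slide layout was flattened, their column register here may not match the original figure.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns where both sequences agree.

    Assumes seq_a and seq_b are rows of the same pairwise alignment,
    so they must have equal length. Columns with a gap in either row
    are skipped (one convention among several).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Fragments copied from the slide (illustrative only).
human = "CTTTGCGA-TGAGTAGCATCTACTATTT"
other = "ACGTGGGACTGACTA-CATCGACTACGA"
print(f"{percent_identity(human, other):.1f}% identity")
```

Real conservation tracks use phylogenetic models over many species rather than a raw pairwise identity, so this is only the simplest possible proxy.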
Our Place in the Tree of Life Which species to compare to? Too close and purifying selection will be largely indistinguishable from the neutral rate. Too far and many functional orthologs will diverge beyond our ability to accurately align them. you are here [Human Molecular Genetics, 3rd Edition]
Metazoans (multi-cellular organisms) you are here [Human Molecular Genetics, 3rd Edition]
Vertebrates: what to sequence? Stickleback, Lizard, Opossum. [Figure labels: too far, sweet spot, too close, you are here] [Human Molecular Genetics, 3rd Edition]
The Dawn of Whole Genome Comparative Genomics 2001 2002 40% DNA alignable 95% coding genes shared
More Species Have Joined Since
How They Measured Compare two distributions: all human-mouse alignments vs. human-mouse ancestral repeats alignment. Difference: 5% of Human Genome [Mouse consortium, Nature 2002]
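The comparison above can be read as a two-part mixture: if ancestral repeats approximate neutral evolution, then whatever part of the genome-wide conservation distribution cannot be explained by a scaled copy of the neutral distribution is a lower bound on the fraction under selection. The sketch below is a toy version of that idea. It is not the consortium's actual procedure, and the histogram values and the function name `min_selected_fraction` are invented for illustration.

```python
def min_selected_fraction(genome_hist, neutral_hist):
    """Lower bound on the fraction of windows that are not neutral.

    Both arguments are histograms (counts per conservation-score bin,
    same binning). We find the largest weight p0 such that
    p0 * neutral_density <= genome_density in every bin, and return
    1 - p0, the part the neutral model cannot account for.
    """
    if len(genome_hist) != len(neutral_hist):
        raise ValueError("histograms must use the same bins")
    g_total = sum(genome_hist)
    n_total = sum(neutral_hist)
    genome = [c / g_total for c in genome_hist]
    neutral = [c / n_total for c in neutral_hist]
    # Only bins where the neutral density is non-zero constrain p0.
    ratios = [g / n for g, n in zip(genome, neutral) if n > 0]
    p0 = min(min(ratios), 1.0)
    return 1.0 - p0


# Toy histograms (hypothetical numbers): bins run from low to high conservation.
neutral_counts = [50, 30, 20, 0]      # ancestral repeats only
genome_counts = [475, 285, 190, 50]   # all aligned windows
print(f"at least {min_selected_fraction(genome_counts, neutral_counts):.0%} under selection")
```

With these invented counts the neutral shape explains 95% of the genome-wide histogram, leaving a 5% excess in the most conserved bin.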
Conserved elements in the Human Genome all human-mouse alignments vs. human-mouse ancestral repeats alignment; the excess reflects selection. Difference: 5% of Human Genome. Ultraconservation. 85%id on average [Mouse consortium, Nature 2002]
Ultraconserved Elements 481 elements perfectly conserved (100%id) over 200bp or more between human, mouse and rat. [Figure label: fish] [Bejerano et al., Science 2004]
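The definition above (100% identity over 200bp or more in all three species) can be sketched as a scan for long runs of identical columns in a multiple alignment. This is a minimal sketch and not the pipeline used in the 2004 paper: the function name `identical_runs`, the treatment of gap columns as run breakers, and the tiny demo sequences are all assumptions for illustration.

```python
def identical_runs(aligned, min_len=200, gap="-"):
    """Return (start, end) column ranges, end exclusive, where every
    sequence in the alignment carries the same non-gap character for
    at least min_len consecutive columns."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    runs = []
    start = None
    # Iterate one past the end so a run touching the last column is closed.
    for i in range(length + 1):
        same = (
            i < length
            and aligned[0][i] != gap
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if same:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Tiny made-up alignment; a real scan would use min_len=200.
human = "ACGTACGTAA"
mouse = "ACGTACCTAA"
rat   = "ACGTACGTAA"
print(identical_runs([human, mouse, rat], min_len=4))  # [(0, 6)]
```

On whole genomes the same logic would be run over alignment blocks rather than in-memory strings, which this sketch does not attempt.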
Ultraconserved Elements: Why? Hundreds of long substrings are identical between amniotes, so they must have rejected many different changes. But... all functions we understand in our genome are encoded using redundant codes. Coding: 3 DNA letters → 1 Protein letter. E.g. Protein Coding Genes: DNA: 108 letters over alphabet of 4. Protein: 102 letters over alphabet of 20. [Bejerano et al., Science 2004]
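To make the redundancy argument concrete, the sketch below builds the standard genetic code (64 codons mapping to 20 amino acids plus stop) and counts how many distinct DNA sequences encode the same short peptide. The helper names `CODON_TABLE`, `SYNONYMS`, and `dna_encodings`, and the example peptides, are chosen for this illustration only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid ('*' marks stop codons).
SYNONYMS = Counter(CODON_TABLE.values())


def dna_encodings(peptide: str) -> int:
    """Count the distinct DNA sequences that translate to this peptide."""
    total = 1
    for aa in peptide:
        total *= SYNONYMS[aa]
    return total


print(len(CODON_TABLE), "codons for", len(set(AMINO_ACIDS) - {"*"}), "amino acids")
print("Leucine has", SYNONYMS["L"], "codons; Methionine has", SYNONYMS["M"])
print("DNA sequences encoding 'MLS':", dna_encodings("MLS"))
```

Because most amino acids have several codons, a protein can stay unchanged while its DNA drifts, which is why perfect DNA identity over hundreds of bases is more than protein coding alone would demand.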
No known function requires this much conservation ? CDS ncRNA TFBS seq.
Other Ways to Measure [Lunter et al, 2006] [Cooper et al., 2005]
Phylogenetic Shadowing “too close” can actually be a boon if you have enough closely related genomes. Stickleback, Lizard, Opossum. [Figure labels: too close, you are here] [Human Molecular Genetics, 3rd Edition]
What They Found Human Genome: 3×10^9 letters. 1.5% known function. Compare to other species: >5% of the human genome is functional; >50% junk. 3x more functional DNA than known! ~10^6 substrings do not code for protein. What do they do then? [Science 2004 Breakthrough of the Year, 5th runner up]
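The "3x more functional DNA than known" figure follows from the two percentages on the slide. The short calculation below just spells out that arithmetic, treating the slide's ">5%" as exactly 5% and using a round genome size of 3×10^9; the variable names are arbitrary.

```python
GENOME_BP = 3 * 10**9                   # human genome size from the slide

known_bp = GENOME_BP * 15 // 1000       # 1.5% with known function
functional_bp = GENOME_BP * 5 // 100    # 5% estimated functional (a lower bound)

fold = functional_bp / known_bp         # how much more is functional than known

print(f"known function: {known_bp:,} bp")
print(f"estimated functional: {functional_bp:,} bp")
print(f"ratio: about {fold:.1f}x")
```

Since 5% is a lower bound, the true ratio may be larger than the roughly threefold figure this gives.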
Gene number does not correlate with Complexity Pre-genomic era: “100,000 genes to the human genome”; last count down to 20,500 human genes. Gene families are important. Many are surprisingly old. [Figure: # genes for fly, worm, human, weed, fish, rice; organisms range from 10^3 cells to 10^13 cells]
Gene regulation = when/where to make protein • gene (how to) • control region (when & where) DNA ~10^3 letters
Vertebrate Gene Regulation • gene (how to) • control region (when & where) proximal: in 10^3 letters; distal: in 10^6 letters. [Figure labels: DNA, DNA binding proteins]
Unicellular vs. Multicellular unicellular multicellular
Most Non-Coding Elements are likely cis-regulatory “IRX1 is a member of the Iroquois homeobox gene family. Members of this family appear to play multiple roles during pattern formation of vertebrate embryos.” gene deserts regulatory jungles 9Mb
Transient Transgenic Enhancer Assay Construct: Conserved Element, Minimal Promoter, Reporter Gene. Construct is injected into 1 cell embryos. Taken out at embryonic day 10.5-14.5. Assayed for reporter gene activity. [Figure labels: in situ, transgenic]
Enhancer verification Matched staining in dorsal apical ectodermal ridge (part of limb bud). Matched staining in genital eminence.