Human Genome Resources
Chiki Gupta
November 21st, 2005
Biophysics 101
Background on Two Resources
• ENCODE: Encyclopedia of DNA Elements
• OMIM: Online Mendelian Inheritance in Man
What is ENCODE?
• Purpose: to identify all functional elements in the human genome sequence
• What are functional elements?
  • Protein-coding genes, regulators, enhancers, DNA sequences that regulate chromosome folding, etc.
• Launched in 2003 by the National Human Genome Research Institute (NHGRI); the data are browsable through the University of California, Santa Cruz (UCSC) Genome Browser
• Open consortium
  • Academic, government, and private-sector scientists are encouraged to contribute to and use the online information
The 3 Phases of ENCODE
• Phases 1 & 2: Identify a suite of approaches for comprehensive identification of functional elements
  • Scope: only 30 Mb (about 1%) of the human genome
  • Determination of the target regions:
    • 50% of the 30 Mb selected manually, based on the presence of well-studied genes and the existence of a substantial amount of comparative sequence data
    • Remaining 50% selected randomly, using a stratified random-sampling strategy based on gene density and level of non-exonic conservation
• Phase 3: Expand the methodology to identify all functional elements
  • Scope: entire genome
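The stratified random-sampling idea on this slide can be illustrated with a small sketch. It is not the ENCODE selection procedure itself: the window records, the `gene_density` and `conservation` fields, the split into low/medium/high terciles, and the number of picks per stratum are all assumptions made for illustration, and the data are randomly generated toy values.

```python
import random
from collections import defaultdict


def tercile_cutoffs(values):
    """Return the two cut points that split the values into rough thirds."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def tercile_label(value, cutoffs):
    """Label a value 'low', 'medium', or 'high' relative to the two cut points."""
    low_cut, high_cut = cutoffs
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


def stratified_sample(windows, per_stratum, seed=0):
    """Pick up to `per_stratum` windows at random from each
    (gene density, conservation) stratum.

    `windows` is a list of dicts with hypothetical keys
    'id', 'gene_density', and 'conservation'.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    gene_cuts = tercile_cutoffs([w["gene_density"] for w in windows])
    cons_cuts = tercile_cutoffs([w["conservation"] for w in windows])

    # Group windows by their pair of labels, e.g. ('high', 'low').
    strata = defaultdict(list)
    for w in windows:
        key = (tercile_label(w["gene_density"], gene_cuts),
               tercile_label(w["conservation"], cons_cuts))
        strata[key].append(w)

    # Draw independently inside each stratum.
    picked = []
    for key in sorted(strata):
        members = strata[key]
        picked.extend(rng.sample(members, min(per_stratum, len(members))))
    return picked, strata


def make_toy_windows(n=600, seed=1):
    """Generate toy genome windows with random scores (illustration only)."""
    rng = random.Random(seed)
    return [{"id": i,
             "gene_density": rng.random(),
             "conservation": rng.random()} for i in range(n)]


if __name__ == "__main__":
    picked, strata = stratified_sample(make_toy_windows(), per_stratum=3)
    for key in sorted(strata):
        print(key, len(strata[key]), "windows in stratum")
    print(len(picked), "windows picked in total")
```

Sampling within each stratum, rather than from the genome as a whole, keeps gene-poor or weakly conserved regions from being crowded out of the target set.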
The 3 Phases of ENCODE
• Pilot Project Phase: analysis of a set of representative regions
• Technology Development Phase: develop new high-throughput methods to identify functional elements in the target regions
• Planned Production Phase: scale up to analyze the entire human genome and find gaps in our ability to identify functional elements in genomic sequence
ENCODE-HapMap Coordination
• International HapMap Project: focuses on 10 ENCODE random regions for an in-depth study of human genetic variation
  • Goal: these data will serve as the “gold standard” data set because of the high density of SNP coverage
• Methodology of ENCODE data production:
  • Generate sequence information from a number of different genomes
  • Perform comparative analysis to extract the maximum amount of information about the human genome
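As a rough illustration of what comparative analysis across genomes can look like, the sketch below scores each column of a toy alignment by how many other species match the human base, then reports fully conserved runs. The sequences are invented, the species labels are placeholders, and the scoring rule and thresholds are assumptions for illustration, not the method used by ENCODE.

```python
def column_conservation(aligned):
    """Score each alignment column as the fraction of non-reference
    sequences that match the reference (first) sequence.

    Gap columns in the reference ('-') score 0.0.
    """
    ref, others = aligned[0], aligned[1:]
    scores = []
    for i, base in enumerate(ref):
        if base == "-":
            scores.append(0.0)
            continue
        matches = sum(1 for seq in others if seq[i] == base)
        scores.append(matches / len(others))
    return scores


def conserved_runs(scores, threshold=1.0, min_len=4):
    """Return (start, end) half-open intervals where the score stays at or
    above `threshold` for at least `min_len` consecutive columns."""
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(scores) - start >= min_len:
        runs.append((start, len(scores)))
    return runs


# Toy, invented alignment: reference first, then three other species.
TOY_ALIGNMENT = [
    "ACGTACGATTGCA",  # human (reference)
    "ACGTACGATTGCA",  # chimp
    "ACGTACTATCGCA",  # dog
    "ACGTAGTCTAGGA",  # chicken
]

if __name__ == "__main__":
    scores = column_conservation(TOY_ALIGNMENT)
    print(conserved_runs(scores))  # [(0, 5)]
```

Only the first five columns are identical in all four toy sequences, so they form the single reported run.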
Example with ENCODE
• http://genome.ucsc.edu/ENCODE/encode.hg17.html
• Lots of linked and annotated information!
• Of particular interest to us: SNPs, recombination hotspots, repeats, introns, splice sites, and sequences conserved in chimp, dog, chicken, etc.
What is OMIM?
• What is it?
  • Catalog of human genes and genetic disorders
  • Developed at Johns Hopkins University
• Strength:
  • Information on diseases and the affected proteins is linked and readily accessible
• Weakness: less useful for our project
  • We’re more concerned with comparing genetic sequences than with the details of various human diseases (especially if we model using a bacterial genome)
Example with OMIM
• Alzheimer’s Disease
  • Gene for microtubule affinity-regulating kinase
• http://www.ncbi.nlm.nih.gov/Omim/getmap.cgi?chromosome=alzheimer&first=+Find+&start=0
• Clicking on OMIM under “Summary of Maps” provides detailed information regarding the function of the specified gene region.
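To see what the OMIM link on this slide actually asks for, the query string can be taken apart with Python's standard library. This is only a sketch: it parses the URL text and never contacts the server, and the link may no longer resolve. The constant `OMIM_MAP_URL` and the helper `describe_query` are names introduced here for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the slide; it is only parsed here, never fetched.
OMIM_MAP_URL = ("http://www.ncbi.nlm.nih.gov/Omim/getmap.cgi"
                "?chromosome=alzheimer&first=+Find+&start=0")


def describe_query(url):
    """Split a URL into host, script path, and a flat dict of query
    parameters (first value only for each parameter)."""
    parts = urlsplit(url)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, parts.path, params


if __name__ == "__main__":
    host, path, params = describe_query(OMIM_MAP_URL)
    print(host)    # www.ncbi.nlm.nih.gov
    print(path)    # /Omim/getmap.cgi
    print(params)  # {'chromosome': 'alzheimer', 'first': ' Find ', 'start': '0'}
```

The search term `alzheimer` travels in the `chromosome` field, `first` appears to carry the label of the Find button (each `+` decodes to a space), and `start=0` looks like a result offset. The last two readings are inferred from the URL alone.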