Working in Real Time: Building Ontologies While Annotating the Mouse from Genotype to Phenotype



  1. Working in Real Time: Building Ontologies While Annotating the Mouse from Genotype to Phenotype Judith Blake, Ph.D. Mouse Genome Informatics The Jackson Laboratory Bar Harbor, ME 04609

  2. Mouse Genome Informatics Genotype Expression Phenotype • Mouse Genome Database Project (MGD) • Genes and Gene Products • Comparative Analysis • Alleles and Phenotypes • Gene Expression DB Project (GXD) • Embryonic gene expression • Extensive experimental data • Mouse Genome Sequence Project (MGS) • Connecting sequence & biology Objective: Facilitate the use of the mouse as a model for human biology by furthering our understanding of the relationship between genotype and phenotype.

  3. MGI Integration Efforts • Integrated experimental and consensus views • Mapping, molecular, alleles, expression, phenotypes • Gene to GO associations • Canonical gene and sequence • Collaborations with SWISS-PROT and LocusLink • Nomenclature standards, gene groupings • Curated mammalian orthologies • used in collaborations with RatDB, NCBI and others • Index of primary literature • Share knowledge from mouse disease models with medical informatics resources All data associations supported with evidence and citation

  4. Common Issues for Model Organism Databases • Data Integration • From Genotype to Phenotype • Experimental and Consensus Views • Incorporation of large datasets • Whole genome annotation pipelines • Large scale mutagenesis projects • Computational vs. Literature-based data collection and evaluation • Data Mining…extraction of new knowledge

  5. Challenges • Genotype • Mouse and Human genome sequences • Integrating genes/models with existing biological information • Updates, emerging knowledge • Phenotype • Mega-mutagenesis programs • Phenome project / baselines • Standard screens • Integration of mutant information, targeted mutations, transgenes, expression arrays jblake-Manchester BioInform Wk

  6. Numbers (20 March 2002) • No. of References: 70,874 • No. of Genes: 35,404 • No. of Markers: 54,834 • Genes w/ NT Seq: 31,386 • Genes w/ AA Seq: 12,875 • Genes w/ Orthologs: 7,051 • Genes Mapped: 19,058

  7. Genes and Markers Mammalian Homology Sequences and Maps Strains and Polymorphisms Embryonic Expression mouse BLAST, molecular segments References, AccID, Access to MGI resources Alleles and Phenotypes

  8. Enable Complex Queries “Show me all genes with their human orthologs located between cM 5 and 7 on Chr. 3 whose gene products localize to the mitochondrial membrane and whose associated mutant phenotypes include ‘skeletal dysmorphology’”
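
A query like the one on this slide combines map position, orthology, a GO cellular component, and a phenotype term in a single filter. The following is a minimal Python sketch under stated assumptions: the `GeneRecord` fields, the `complex_query` function, and the sample genes are all hypothetical and invented for illustration; they are not MGD's schema or data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class GeneRecord:
    """Hypothetical, flattened view of one gene (not the MGD schema)."""
    symbol: str
    chromosome: str
    cm: float                               # genetic map position in centimorgans
    human_ortholog: Optional[str] = None    # None when no ortholog is recorded
    go_components: Set[str] = field(default_factory=set)
    phenotype_terms: Set[str] = field(default_factory=set)


def complex_query(genes: List[GeneRecord], chromosome: str, cm_min: float,
                  cm_max: float, component: str, phenotype: str) -> List[GeneRecord]:
    """Return genes that satisfy every criterion of the slide's example query."""
    return [
        g for g in genes
        if g.chromosome == chromosome
        and cm_min <= g.cm <= cm_max
        and g.human_ortholog is not None
        and component in g.go_components
        and phenotype in g.phenotype_terms
    ]


# Invented sample records; only GeneA meets all five conditions.
genes = [
    GeneRecord("GeneA", "3", 5.5, "HUMAN_A", {"mitochondrial membrane"}, {"skeletal dysmorphology"}),
    GeneRecord("GeneB", "3", 6.0, None, {"mitochondrial membrane"}, {"skeletal dysmorphology"}),
    GeneRecord("GeneC", "3", 9.0, "HUMAN_C", {"mitochondrial membrane"}, {"skeletal dysmorphology"}),
    GeneRecord("GeneD", "3", 6.5, "HUMAN_D", {"nucleus"}, {"skeletal dysmorphology"}),
]

hits = complex_query(genes, "3", 5, 7, "mitochondrial membrane", "skeletal dysmorphology")
print([g.symbol for g in hits])
```

The sketch matches terms by exact string. A query against structured vocabularies would presumably also accept genes annotated to more specific child terms, which is one reason the vocabularies need defined relationships.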

  9. GOannotations Gene detail page in MGD for the vitamin D receptor gene, Vdr

  10. Sets of Orthologs Data associations supported by evidence and citation Orthologs of Vdr

  11. Gene/Marker Type Allele Type Assay Type Expression Mapping Molecular Mutation Inheritance Mode Nomenclature Evidence Codes Tissue Cell Lines Units Cytogenetic Molecular ES Cell Line Strain Multiple Keyword Sets

  12. Allele Query Form Controlled Vocabularies for Describing Alleles

  13. Structured Vocabularies and Ontologies • Anatomy • GO: • Molecular function, • Biological process, • Cellular component • Phenotypes • Disease Models

  14. Anatomical Dictionary Theiler stage 10 (7 dpc) http://genex.hgu.mrc.ac.uk/Databases/Anatomy/ Collaboration with MRC / Edinburgh 3D-Atlas project

  15. Links between anatomical structures at successive stages of mouse development enable the analysis of differentiation pathways

  16. Alternative anatomical hierarchies • Describe and view anatomy from different anatomical, physiological, and disease perspectives (not just ‘geographical location’, but systems, e.g. circulatory, that ‘span geography’) • Integrated analysis of expression and phenotype / disease data

  17. Consolidated Anatomical Dictionary (94 lines)
      | heart
      | %cardiogenic plate
      | %primitive heart tube
      | | <myocardium
      | | <endocardium
      | | <cardiac jelly
      | <aortic sinus
      | <atrio-ventricular canal (ependymal canal)
      | <atrio-ventricular cushion tissue (bulbar cushion, ependymal cushion tissue)
      | <atrium
      | | %primitive atrium
      | | %common atrial chamber
      | | | <common atrial chamber bulbous cordis
      | | | <common atrial chamber, left part
      | | | | <common atrial chamber, left part, cardiac muscle (myocardium)
      | | | | <common atrial chamber, left part, endocardial lining
      | | | | <common atrial chamber, left part, cardiac jelly
      | | | <common atrial chamber, right part
      | | | | <common atrial chamber, right part, cardiac muscle (myocardium)
      | | | | <common atrial chamber, right part, endocardial lining
      | | | | <common atrial chamber, right part, cardiac jelly
      | | <left atrium
      | | | <left atrium auricular region
      | | | | <left atrium auricular region cardiac muscle (myocardium)
      | | | | <left atrium auricular region endocardial lining
      | | | <left atrium cardiac muscle (myocardium)
      | | | <left atrium endocardial lining
      | | <right atrium
      | | | <right atrium auricular region
      | | | | <right atrium auricular region cardiac muscle (myocardium)
      | | | | <right atrium auricular region endocardial lining
      | | | <right atrium cardiac muscle (myocardium)
      | | | <right atrium endocardial lining
      | | | <right atrium valve
      | | | | %right atrium venous valve
      | | <interatrial septum
      | | | <foramen ovale
      | | | <septum primum
      | | | | <foramen primum (ostium primum)
      | | | | <foramen secundum (ostium secundum)
      | | | <septum secundum
      | <endocardial tissue
      | | <endocardial cushion tissue (bulbar cushion)
      | | <bulboventricular groove
      | | <bulbus cordis
      | | | <bulbus cordis caudal half (myocardium)
      | | | | <bulbus cordis caudal half cardiac muscle (myocardium)
      | | | | <bulbus cordis caudal half endocardial lining
      | | | | <bulbus cordis caudal half cardiac jelly
      | | | <bulbus cordis rostral half (conotruncus)
      | | | | <bulbus cordis rostral half cardiac muscle (myocardium)
      | | | | <bulbus cordis rostral half endocardial lining
      | | | | <bulbus cordis rostral half cardiac jelly
      | <heart mesentery
      | | <dorsal mesocardium (dorsal mesentery of heart)
      | | | <dorsal mesocardium transverse pericardial sinus
      | <outflow tract
      | | <outflow tract aortic component
      | | <outflow tract aortico-pulmonary spiral septum
      | | | <outflow tract future ascending aorta
      | | <outflow tract pulmonary component
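
The dictionary listing above encodes a hierarchy with leading `|` characters and a `%` or `<` marker before each name. A minimal Python sketch of reading that notation follows. It assumes that the number of leading `|` gives the nesting depth and that `%` and `<` mark two different relationship types (in the Gene Ontology flat files of that period these meant is-a and part-of; whether the dictionary uses them identically is an assumption). The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def parse_entry(line: str) -> Tuple[int, Optional[str], str]:
    """Split one listing line into (depth, relation marker, term name)."""
    tokens = line.split()
    depth = 0
    while depth < len(tokens) and tokens[depth] == "|":
        depth += 1                          # assumed: one '|' per nesting level
    rest = " ".join(tokens[depth:])
    relation = None
    if rest[:1] in ("%", "<"):              # assumed relationship markers
        relation, rest = rest[0], rest[1:].strip()
    return depth, relation, rest


def build_parents(lines: List[str]) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each term to (parent term, relation), using depth to find the parent."""
    stack: List[Tuple[int, str]] = []       # open ancestors as (depth, name)
    parents: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for line in lines:
        if not line.strip():
            continue
        depth, relation, name = parse_entry(line)
        while stack and stack[-1][0] >= depth:
            stack.pop()                     # close siblings and deeper branches
        parent = stack[-1][1] if stack else None
        parents[name] = (parent, relation)
        stack.append((depth, name))
    return parents


# Short excerpt in the same notation as the listing.
excerpt = [
    "| <atrium",
    "| | %primitive atrium",
    "| | %common atrial chamber",
    "| | | <common atrial chamber, left part",
    "| | <left atrium",
]
parents = build_parents(excerpt)
for name, (parent, relation) in parents.items():
    print(name, "->", parent, relation)
```

Keeping the relation marker alongside the parent preserves the distinction that the slide on successive developmental stages relies on, where some links describe derivation and others describe containment.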

  18. Biol. Process Phenotype Anatomy Gene expression

  19. Mouse Heart Development From The Heart by Margaret Kirby in “Embryos, Genes and Birth Defects”. Edited by Peter Thorogood Beyond mouse • Data integration depends on indexing to defined sets of objects. • Speaking the same language • ‘Development’ • ‘Heart’ • Comparisons between model organisms

  20. http://www.geneontology.org

  21. Goals of the Consortium • Develop structured vocabularies (ontologies) • Unique ID, Definition, Defined relationships • Annotate genes /gene products to vocabularies • Evidence and citation • Support common data resource for integrated queries across multiple organisms
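
The consortium goals translate naturally into two record types: a term with a unique ID, a definition, and defined relationships, and an annotation that ties a gene product to a term with evidence and a citation. The sketch below is a hedged Python illustration. The class names, the `EX:` identifiers, the gene names, and the `REF:` citations are hypothetical; only the evidence codes `IDA` and `TAS` are real GO evidence codes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Term:
    term_id: str                            # unique ID
    name: str
    definition: str
    # Defined relationships to parent terms, as (relationship, parent_id).
    parents: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Annotation:
    gene: str
    term_id: str
    evidence: str                           # evidence code, e.g. IDA or TAS
    citation: str                           # supporting reference


def descendants(terms: Dict[str, Term], root_id: str) -> Set[str]:
    """Collect every term reachable from root_id by following child links."""
    found: Set[str] = set()
    frontier = [root_id]
    while frontier:
        current = frontier.pop()
        for term in terms.values():
            is_child = any(parent_id == current for _, parent_id in term.parents)
            if is_child and term.term_id not in found:
                found.add(term.term_id)
                frontier.append(term.term_id)
    return found


def genes_annotated_under(terms: Dict[str, Term], annotations: List[Annotation],
                          root_id: str) -> List[str]:
    """Genes annotated to root_id or to any of its descendant terms."""
    wanted = descendants(terms, root_id) | {root_id}
    return sorted({a.gene for a in annotations if a.term_id in wanted})


# Invented mini-vocabulary and annotations, for illustration only.
terms = {
    "EX:1": Term("EX:1", "membrane", "placeholder definition"),
    "EX:2": Term("EX:2", "mitochondrial membrane", "placeholder definition", [("is_a", "EX:1")]),
    "EX:3": Term("EX:3", "mitochondrial inner membrane", "placeholder definition", [("part_of", "EX:2")]),
}
annotations = [
    Annotation("GeneA", "EX:3", "IDA", "REF:0001"),
    Annotation("GeneB", "EX:1", "TAS", "REF:0002"),
    Annotation("GeneC", "EX:2", "IDA", "REF:0003"),
]
print(genes_annotated_under(terms, annotations, "EX:2"))
```

Because every annotation carries its own evidence code and citation, a query can return not only the genes but also the support for each association.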

  22. Opens browser

  23. Search returns children

  24. Returns annotated terms

  25. First-Pass Phenotype Set

  26. Query: genes with mutants classified with term ‘eye dysmorphology’ Ey

  27. Genotype/Phenotype A genotype consists of zero, one or more allele pairs on a defined genetic background. The genetic background may be an inbred strain, or it may be unknown.
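
The genotype definition on this slide can be written down as a small data structure. This is a sketch only: the class and field names are hypothetical, the allele symbol is taken from the example genotype in this talk, and `StrainX` is a placeholder background, not a statement about any real stock.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AllelePair:
    allele_1: str
    allele_2: str

    def zygosity(self) -> str:
        return "homozygous" if self.allele_1 == self.allele_2 else "heterozygous"


@dataclass(frozen=True)
class Genotype:
    # Zero, one or more allele pairs ...
    allele_pairs: Tuple[AllelePair, ...] = ()
    # ... on a genetic background; None stands for "unknown".
    background: Optional[str] = None

    def describe(self) -> str:
        pairs = ", ".join(f"{p.allele_1}/{p.allele_2}" for p in self.allele_pairs)
        return f"{pairs or 'no allele pairs'} on {self.background or 'unknown background'}"


mutant = Genotype((AllelePair("Leprdb-3J", "Leprdb-3J"),), "StrainX")
baseline = Genotype()
print(mutant.describe())
print(baseline.describe())
```

Allowing an empty tuple of allele pairs lets the same structure describe an unmodified strain, which is useful when baseline measurements are recorded alongside mutant phenotypes.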

  28. Some Definitions • Trait: measurable characteristic of individual or population • Blood pressure, coat color, % body fat • May be associated with anatomical structure, e.g., an immune response with its site of action • Phenotype: name for a group of traits, syndrome, condition • e.g., type II diabetes, obesity, lymphocytic leukemia

  29. A phenotype can be characterized by many traits, and a trait can help characterize many phenotypes. Leprdb-3J/Leprdb-3J Phenotype a Phenotype b Phenotype c Trait 1 Trait 2 ….. Trait n
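
The many-to-many relationship between phenotypes and traits can be stored in one direction and inverted on demand. The Python sketch below uses the placeholder labels from the slide; the particular assignment of traits to phenotypes is invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set

# Invented assignment of the slide's placeholder traits to its phenotypes.
phenotype_traits: Dict[str, Set[str]] = {
    "Phenotype a": {"Trait 1", "Trait 2"},
    "Phenotype b": {"Trait 2"},
    "Phenotype c": {"Trait 2", "Trait n"},
}


def invert(mapping: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Turn phenotype -> traits into trait -> phenotypes."""
    inverted: Dict[str, Set[str]] = defaultdict(set)
    for phenotype, traits in mapping.items():
        for trait in traits:
            inverted[trait].add(phenotype)
    return dict(inverted)


trait_phenotypes = invert(phenotype_traits)
print(sorted(trait_phenotypes["Trait 2"]))
```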

  30. Developing structured descriptors for traits • Use existing and develop new controlled vocabularies that cover orthogonal concepts • Combine terms from these vocabularies to describe traits • Assign phenotype (disease) terms for nomenclature ease Joel Richardson, Michael Ashburner, Martin Ringwald

  31. Concept Examples • System: immune system, cardiovascular system • Tissue: heart, lung, liver, eye, skin • Cell type: epithelial, fibroblast, myoblast, melanocyte • Age: E15, P25 • Biol. Process: apoptosis, growth, cell differentiation, behavior • Metabolite: glucose, calcium • Qualifier: abnormal, absent, enlarged, increased, disrupted • DCS = dolichostenomelia = disproportionately long limbs, due to long bone overgrowth
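
Combining terms from orthogonal vocabularies to describe a trait can be sketched as a validated set of (vocabulary, term) pairs. The Python sketch below uses the example terms from this slide. The dictionary keys, the `describe_trait` function, and the sample combinations are assumptions made for illustration, not the descriptor format actually adopted.

```python
from typing import Dict, Set, Tuple

# Example terms taken from the slide; the key names are hypothetical.
VOCABULARIES: Dict[str, Set[str]] = {
    "system": {"immune system", "cardiovascular system"},
    "tissue": {"heart", "lung", "liver", "eye", "skin"},
    "cell_type": {"epithelial", "fibroblast", "myoblast", "melanocyte"},
    "age": {"E15", "P25"},
    "process": {"apoptosis", "growth", "cell differentiation", "behavior"},
    "metabolite": {"glucose", "calcium"},
    "qualifier": {"abnormal", "absent", "enlarged", "increased", "disrupted"},
}


def describe_trait(**slots: str) -> Tuple[Tuple[str, str], ...]:
    """Build a trait descriptor, one term per vocabulary, rejecting unknown terms."""
    for vocabulary, term in slots.items():
        if vocabulary not in VOCABULARIES:
            raise KeyError(f"unknown vocabulary: {vocabulary}")
        if term not in VOCABULARIES[vocabulary]:
            raise ValueError(f"{term!r} is not a term in {vocabulary}")
    # Sorting gives the same descriptor regardless of argument order.
    return tuple(sorted(slots.items()))


# Illustrative combinations only, not curated observations.
enlarged_heart = describe_trait(tissue="heart", qualifier="enlarged", age="P25")
raised_glucose = describe_trait(metabolite="glucose", qualifier="increased")
print(enlarged_heart)
print(raised_glucose)
```

Because each vocabulary covers a separate concept, a new trait usually needs only a new combination of existing terms rather than a new term, which keeps the individual vocabularies small.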

  32. Relationships of Mouse Models to Human Diseases • Mouse gene ortholog, same mutation • Same phenotype • Different phenotype • Mouse gene ortholog, different or unknown mutations • Same or different phenotypes • Mouse phenotype same as human • Mouse gene ortholog • Another mouse gene • Gene unknown • Mouse phenotype similar • Unknown genetic component • Gene same or different

  33. Relationship to human genes and disease

  34. Goal: Query Mouse Data by Human Disease Test Results • 1676 disease listings in OMIM • 382 have phenotype reports • 3187 notated mouse/human orthologs • 958 correspond to OMIM entries • 305 have phenotype reports • 8535 listings in MESH disease tree • 709 correspond to orthologs • 237 have phenotype reports

  35. Summary • Integration • Requires both manual and computational approaches • Attention to data modeling, object identity, data migration issues • Ontologies and standardized vocabularies • Integral component of integration effort • Essential for extracting knowledge • Parallel development • ontology representations • data acquisition and integration efforts

  36. Acknowledgments - MGI Carol Bult Ben King Richard Baldarelli Dirck Bradt Sridhar Ramachandran Deborah Reed Diane Dahman Sophia Zhu Donnie Qi LongLong Yang Pat Grant Nancy Butler Janan Eppig Joel Richardson Martin Ringwald Jim Kadin Lois Maltais Louise McKenzie Harold Drabkin Tom Weigers Jon Beal Lori Corbani Cathy Lutz Cynthia Smith Teresa Chu Sharon Cousins Donna Burkart Ira Lu Li Ni Carroll Goldsmith Moyha Lennon-Pierce Antonio Planchart www.informatics.jax.org David Hill Dale Begley Terry Hayamizu Ingeborg McCright Connie Smith Matt, Mike, Leslie, Jeff, Prita, Jill, Diane, DebbieK, Dieter, Lucette, Janice,

  37. Mouse Genome Informatics http://www.informatics.jax.org Gene Ontology http://www.geneontology.org
