“Ontology” Group Report: Summary Xiaoshu, John, Vinay, Duncan, Robert, Amit, Alfredo, Vipul - An attempt to summarize and organize …
Outline
• Use Cases
• What is an “ontology”?
• What knowledge can/should they represent?
• How should they represent knowledge?
• How are ontologies created?
• How are ontologies maintained?
• How is their quality evaluated?
• How may they be used? In which applications?
Use Cases
• A clinical researcher wants to explore Mass Spectrometry data and the results of its analysis for clinical use
• Ability to share patient labs, test results, observations, etc. across various systems
• The ability of a researcher to reuse gene information across various data sets (URI/identifier mapping)
• A pathologist proposes a rating of a diagnostic test. These ratings need to be reused by:
  • Billing, for revenue generation
  • Cancer Registry, for registering patients for trials, support groups, etc.
  • Surgeon, for determining the course of treatment
• Brain Atlas project: data related to the brain model of a male mouse is stored and annotated. A sleep disorder researcher wants to use these results to propose cures for sleepwalking
• A bioinformaticist discovers an uncharacterized DNA sequence. Which web services will help get information related to characterizing the DNA sequence, say for clinical significance…
• A high-throughput experiment using microarray data is performed for environmental reasons. When using terms from an environmental ontology, I discover that I need terms from the toxicology ontology. I discover that some terms are missing, make suggestions for them, and request that they be included in the ontology
• A glycomic researcher wants experimental data annotated and combined with other annotated data generated at a different center
• A diabetic patient visits a clinic in an emergency. Sugar and insulin levels need to be measured. To speed up the process, we want the patient to be able to provide information
  • Need a patient-centric ontology, a clinical ontology, and mappings between them
• Chemotherapy at home
What is an ontology?
• Model of use vs. model of meaning…
• Need to respect and assimilate current usage models…
• Some “ontologies” out there…
  • Thesauri, Controlled Vocabularies
  • Taxonomies
  • Database Schemas
  • Metadata models
  • Ontologies
  • First Order Theories …
• Need to look at the current W3C definition of an ontology (does it have one?) and “specialize” it for HCLS…
Need for a common shared vision
• HCLSIG
• HL7
• CDISC
• NLM
• FDA
• …
• Take pointers from the “Ecosystem” Group …
What knowledge should they represent?
• Terminologies: SNOMED, GO
• Information Models
  • Various Genomic Artifacts: Genes, Proteins, Variants, Clinical Significances, Gene Test Result Reporting Templates
  • Various Clinical Artifacts: Documentation Templates, Clinical Decision Support Rules, …
• Process Models
  • Pathways
  • Clinical Guidelines
  • Clinical Care Protocols
  • Clinical/Genomic Research Protocols
• Web Services Annotation Models
  • Web Services for Ontologies vs. Ontologies for Web Services
• Namespaces
• Mappings to underlying heterogeneous database schemas?
• ID/Value Mappings?
  • Gene X has ID1 in GenBank and ID2 in NCBI OMIM
• Identifier mapping algorithms?
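The ID/Value mapping bullet (Gene X has ID1 in one database and ID2 in another) can be illustrated with a small lookup structure. This is only a minimal sketch: the class name `IdentifierMap`, the entity key `GeneX`, and the placeholder IDs `ID1` and `ID2` are illustrative assumptions, not real accession numbers or an existing API.

```python
from collections import defaultdict


class IdentifierMap:
    """Groups identifiers from different databases that denote the same entity."""

    def __init__(self):
        self._entity_of = {}              # (database, local_id) -> entity key
        self._ids_of = defaultdict(dict)  # entity key -> {database: local_id}

    def add(self, entity, database, local_id):
        """Record that `local_id` in `database` denotes `entity`."""
        self._entity_of[(database, local_id)] = entity
        self._ids_of[entity][database] = local_id

    def translate(self, database, local_id, target_database):
        """Return the matching ID in target_database, or None if unmapped."""
        entity = self._entity_of.get((database, local_id))
        if entity is None:
            return None
        return self._ids_of[entity].get(target_database)


# Placeholder IDs taken from the slide; real accession numbers would go here.
id_map = IdentifierMap()
id_map.add("GeneX", "GenBank", "ID1")
id_map.add("GeneX", "OMIM", "ID2")

print(id_map.translate("GenBank", "ID1", "OMIM"))  # ID2
```

A real identifier mapping algorithm would also have to handle one-to-many and retired identifiers, which this sketch ignores.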
How should the knowledge be represented?
• Best practices related to use of RDF, OWL, SWRL … and any other relevant information
• Probabilistic Information
  • Uncertainty in data, e.g., uncertainty in genotyping data from an Affymetrix chip
  • Uncertainty in evidence
  • Uncertainty in hypotheses
• Quality/Value judgements/Trust… e.g., I trust HCM results from Lab X more than from Lab Y
• Should we propose OWL/RDF extensions for these?
• Or can the current standards accommodate these issues?
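One way to picture the uncertainty and trust bullets is to attach a confidence value and a source to each statement, then weight the confidence by how much the source is trusted. The sketch below is an assumption-laden illustration, not a proposed OWL/RDF extension: the `Assertion` fields, the predicate name `hasTestResult`, the trust values, and the choice to multiply confidence by trust are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    obj: str
    source: str        # who produced the statement, e.g. a lab
    confidence: float  # uncertainty in the data itself, in [0, 1]


# Hypothetical trust levels: "I trust results from Lab X more than from Lab Y".
SOURCE_TRUST = {"LabX": 0.9, "LabY": 0.6}


def weighted_confidence(assertion, trust=SOURCE_TRUST, default_trust=0.5):
    """Naive combination (an assumption): data confidence times source trust."""
    return assertion.confidence * trust.get(assertion.source, default_trust)


results = [
    Assertion("patient1", "hasTestResult", "positive", "LabX", 0.8),
    Assertion("patient1", "hasTestResult", "negative", "LabY", 0.9),
]

# Pick the statement with the highest trust-weighted confidence.
best = max(results, key=weighted_confidence)
print(best.obj, round(weighted_confidence(best), 2))  # positive 0.72
```

Whether such weights belong in the ontology language itself or in an application layer is exactly the open question the slide raises.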
How should ontologies be created?
• Collaborative Ontology Development
  • In the context of a specific use case? (Application Requirement)
  • Reasoning/Inferencing Requirements
  • E.g., interleaving the process of annotating data with the process of creating the ontology… (typically these are independent?)
• Need to distinguish between various actors:
  • Subject Matter Experts (create “knowledge”)
  • Information Modelers (create “models” or “ontologies”)
  • Consumers (evaluate “goodness” of ontologies indirectly via how well the application performs)
• Enable and facilitate collaboration processes
  • Community vs. Collaborative Ontology Development
  • Sociological issues, Spheres of Influence
• NLP, Data Mining approaches to create ontologies
• Best practice guidelines
  • Recommendations for namespaces, identifiers?
  • Human language descriptions of various pieces of knowledge
  • When to use RDF/OWL/SWRL, etc.
  • Provide quality guidance?
  • Provide guidance related to modularity
• Building blocks and templates for HCLS?
  • E.g., foundational biomedical relations by Barry Smith
  • We could be the Q/A testing group for these?
• Ontology Registries
• Identifier Registries
How should Ontologies be maintained…
• Evolution
  • Use of old data against a new ontology
  • Use of new data against an old ontology
  • Evolution of Mappings…
• Versioning
  • History/Diffs
  • Merging/Partitioning
• Provenance
  • Reason for the ontology
• Dependency Propagation
• Ontology Lifecycles
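The History/Diffs and "old data against a new ontology" bullets can be made concrete with a small comparison of two ontology versions. This is a simplified sketch: an ontology version is reduced to a dictionary of term ID to label, and the term IDs, labels, and sample names are invented for illustration.

```python
def diff_versions(old_terms, new_terms):
    """Compare two versions given as {term_id: label} dictionaries."""
    old_ids, new_ids = set(old_terms), set(new_terms)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "relabeled": sorted(
            t for t in old_ids & new_ids if old_terms[t] != new_terms[t]
        ),
    }


def unresolved_annotations(annotations, terms):
    """Annotations that reference a term ID absent from the given version."""
    return [(record, term) for record, term in annotations if term not in terms]


# Hypothetical versions and annotated data.
v1 = {"T1": "Sleep disorder", "T2": "Sleep walking", "T3": "Insomnia"}
v2 = {"T1": "Sleep disorder", "T2": "Somnambulism", "T4": "Narcolepsy"}
old_data = [("sample-a", "T2"), ("sample-b", "T3")]

print(diff_versions(v1, v2))
# {'added': ['T4'], 'removed': ['T3'], 'relabeled': ['T2']}
print(unresolved_annotations(old_data, v2))
# [('sample-b', 'T3')]
```

Checking new data against the old version is the same call with the arguments swapped; relationship changes, merges, and splits would need a richer model than this.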
How should ontologies be evaluated?
• General Principles of
  • Sound ontology design (from KR literature)
  • Taxonomy design (from Library Sciences)
• Quality of Ontologies?
  • Content
  • Application performance (indirect)
• Quality of Mappings?
• Can this be used to provide guidance to the ontology development process?
How should ontologies be used?
• Scalability of ontologies and applications using them… ontologies with 100,000s of concepts and relationships
• Used in tools, exposed as web services
• Web Services for Ontologies vs. Ontologies for Web Services
• Ontologies for Data Mining
• Ontologies for creating Social Communication Structures
• What’s special with HCLS?
  • Specific vs. exclusive
  • We have problems in “spades”: rapidly changing knowledge
  • Legacy of ontology development and use… (e.g., Linnaean classification) … Better chances of adoption/acceptance
Deliverables
• Best Practice Guidelines
• Use Cases
• Solution design for a particular use case
  • Conversion of a subset of SNOMED + GO + MedDRA into OWL
  • Creation of mappings of the subset ontology to well-known databases (GenBank, SwissProt) and some clinical data…?
  • Design some queries against these data sources
• Prototype?
• Collaborative Ontology Development Wiki?
• Wiki of Wikis that could include:
  • HCLSIG Wiki
  • BioPortal
  • … other Wikis …
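For the "conversion of a subset into OWL" deliverable, the sketch below shows roughly what emitting OWL classes in Turtle syntax could look like. It is a toy under stated assumptions: the `ex:` namespace, the term IDs `T1` and `T2`, and the labels are hypothetical and are not taken from SNOMED, GO, or MedDRA; a real conversion would use the source terminologies' own identifiers and an RDF library.

```python
# Standard OWL and RDFS namespaces; the ex: namespace is a made-up example.
PREFIXES = """@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/hcls#> .
"""


def term_to_turtle(term_id, label, parent_id=None):
    """Render one terminology entry as an OWL class in Turtle."""
    lines = [f"ex:{term_id} a owl:Class ;"]
    if parent_id is not None:
        lines.append(f"    rdfs:subClassOf ex:{parent_id} ;")
    # Escape backslashes and double quotes inside the label literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    lines.append(f'    rdfs:label "{escaped}" .')
    return "\n".join(lines)


# Hypothetical (term_id, label, parent_id) rows standing in for a real subset.
terms = [
    ("T1", "Biological process", None),
    ("T2", "Cell cycle", "T1"),
]

document = PREFIXES + "\n" + "\n\n".join(term_to_turtle(*t) for t in terms) + "\n"
print(document)
```

The output only captures a class hierarchy with labels; synonyms, cross-references, and richer axioms from the source terminologies are left out.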