
Ontology Evolution

Mark A. Musen, Stanford University


Presentation Transcript


  1. Ontology Evolution Mark A. Musen Stanford University

  2. Ontology Design Criteria (after Gruber) • Clarity: Definitions should be objective and complete • Coherence: The ontology should sanction those inferences consistent with the definitions • Extendibility: An ontology should anticipate future uses • Minimal encoding bias: No assumptions about knowledge representation • Minimal ontological commitment

  3. Trade-offs in Ontology Design • Minimizing ontological commitment requires specifying a weak theory • Making definitions precise requires increasing ontological commitment • Anticipating various uses of the ontology may require increasing the number of concepts represented • Making an ontology maximally general may make it useless for any specific application

  4. Common problems when people build ontologies • Classes are not defined at useful levels of abstraction (e.g., is it necessary to distinguish among mammals or terriers?) • Class definitions are overloaded (e.g., is it helpful to have the class red bicycle?) • Hierarchical relationships are not uniformly taxonomic (e.g., amino acid is a subclass of protein) • The world (or our perception of it) changes
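The point about non-taxonomic hierarchies can be illustrated with a minimal sketch. The two relation tables and the function names below are hypothetical, chosen only to mirror the slide's examples; the idea is that inheritance should follow is-a links only, so a part-of link such as amino acid to protein must be stored separately.

```python
# Hypothetical toy relations; the terms come from the slide's examples.
IS_A = {"terrier": "dog", "dog": "mammal"}
PART_OF = {"amino acid": "protein"}  # a part-whole link, not a subclass link


def ancestors(term, relation):
    """Follow one relation upward and return every term reached, nearest first."""
    found = []
    while term in relation:
        term = relation[term]
        found.append(term)
    return found


def inherits_from(term, candidate):
    """Property inheritance is sanctioned only along is-a links."""
    return candidate in ancestors(term, IS_A)
```

Under this split, a terrier inherits from mammal, while an amino acid is related to protein without being treated as a kind of protein.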

  5. The world does change! • What happened to the ether? To phlogiston? • What happened to diseases such as dropsy, consumption, neurasthenia, “gay lymph node syndrome”? • What happened to HTLV III? • When did scurvy become a curable disease? • When did the central dogma of biology first break down? • When did Poland begin to exist?

  6. Suggested Upper Merged Ontology (SUMO)

  7. Part of the CYC Upper Ontology

  8. A Parable: Protocol-Based Advisories

  9. Protocol-Based Advisories

  10. The ONCOCIN system (ca. 1986)

  11. ONCOCIN: Object structure drives inference

  12. OPAL first elicited the overall algorithm for the protocol

  13. Clicking on “VAM” in the graph brought up a form for entering the constituent drugs

  14. We construed ONCOCIN’s PSM as Episodic Skeletal Plan Refinement (ESPR) • Planning entities form skeletal plan • Task-level actions modify customary execution of the plan • Input data predicate actions

  15. PROTÉGÉ-1 (ca. 1987) • A meta-level knowledge-entry system for generating knowledge-entry systems like OPAL • Assumed an ontology of the ESPR method (but we didn’t call it that, since no one other than Barry knew about ontologies in 1987) • Demonstrated in domains of oncology and hypertension clinical trials, allowing rapid generation of custom-tailored knowledge-entry tools

  16. Attributes of a Planning Entity

  17. What was PROTÉGÉ-1 doing? • The system started with an ontology of the kinds of data on which the ESPR method operates • Developers subclassed the entities in that ESPR ontology to define the domain entities that relate to skeletal planning in a particular application area (e.g., oncology, hypertension) • The system used the subclasses to generate UIs, both for entry of instance-level knowledge (e.g., that of particular clinical protocols) and for creating the electronic spreadsheet for interacting with clinical users
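The subclass-then-generate pattern described on this slide can be sketched roughly as follows. This is not PROTÉGÉ-1's actual implementation; the class names, attributes, and the `form_fields` helper are all hypothetical, and the sketch only shows how a form layout could be derived mechanically from a subclass of a method-level entity.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanningEntity:
    """Hypothetical root entity of a method ontology (attributes are assumptions)."""
    name: str = ""
    duration_days: int = 0


@dataclass
class Chemotherapy(PlanningEntity):
    """Hypothetical domain subclass that a developer might define for oncology."""
    drugs: str = ""


def form_fields(entity_class):
    """Derive (label, type name) pairs for an entry form from a class definition."""
    return [(f.name, f.type.__name__) for f in fields(entity_class)]
```

Calling `form_fields(Chemotherapy)` lists the inherited attributes first and the domain-specific one last, which is the sense in which everything said about the domain is tied to the method ontology.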

  18. PROTÉGÉ-1 asked users to subclass the ESPR method ontology • Planning entities form skeletal plan • Task-level actions modify customary execution of the plan • Input data predicate actions

  19. “Ontology development as subclassing” was not sustainable • Subclassing entities in the ESPR “method ontology” did ensure that anything we said about the domain immediately had an operational semantics • There were lots of things that we wanted to say about the domain unrelated to the ESPR method

  20. The Parable Continues: Therapy Helper, ca. 1992

  21. Mapping domain ontologies to problem-solving methods (Therapy Helper: Protocol-Based Care for HIV/AIDS) • ESPR Method • Method Output Ontology (e.g., fully formed plan) • Input Ontology (e.g., skeletal plan, input data) • Domain Ontology (e.g., clinical data, treatment history)
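One way to picture the mapping step is as an explicit table that renames domain slots into the slots the method expects. The sketch below is an assumption-laden illustration: the slot names, the `MAPPING` table, and `to_method_input` are hypothetical and are not taken from Therapy Helper.

```python
# Hypothetical mapping from domain-ontology slots to method-input slots.
MAPPING = {
    "current_regimen": "skeletal_plan",
    "lab_results": "input_data",
}


def to_method_input(domain_record, mapping=MAPPING):
    """Rename mapped domain slots; slots the method does not need are left behind."""
    return {mapping[slot]: value
            for slot, value in domain_record.items()
            if slot in mapping}
```

Because the mapping is declared separately, the domain ontology can say things the method never uses (here, any slot missing from `MAPPING`) without changing the method ontology.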

  22. ESPR Method Ontology

  23. T-Helper Application Ontology

  24. Mapping domain ontologies to problem-solving methods (Therapy Helper: Protocol-Based Care for HIV/AIDS) • ESPR Method • Method Output Ontology (e.g., fully formed plan) • Input Ontology (e.g., skeletal plan, input data) • Domain Ontology (e.g., clinical data, treatment history)

  25. EON: Middleware that abstracts from T-Helper • The debut of Protégé/Win

  26. Protégé/Win KA tool

  27. The EON ontology continued to evolve • Support for concurrent actions • Coordination of processes • Data abstraction from the primary inputs (electronic medical record) • Temporal data abstraction from primary data • Contextualization of actions into “scenarios of care”

  28. All the ontology changes took place at the “macro” level • Major shifts in distinctions made about the world (e.g., stereotypic “scenarios”) • Major new capabilities of underlying systems (e.g., ability to drive reasoning from large numbers of automatically acquired data) • While all this was happening: Countless changes in small, individual modeling decisions

  29. In the real world, ontologies change all the time • The number of distinctions that we can make about the world is practically infinite • We have to start somewhere! • We constantly must make new distinctions because our needs change, our view of reality changes, or we finally get around to it …

  30. Porphyry’s depiction of Aristotle’s Categories • Supreme genus: SUBSTANCE • Differentiae: material, immaterial • Subordinate genera: BODY, SPIRIT • Differentiae: animate, inanimate • Subordinate genera: LIVING, MINERAL • Differentiae: sensitive, insensitive • Proximate genera: ANIMAL, PLANT • Differentiae: rational, irrational • Species: HUMAN, BEAST • Individuals: Socrates, Plato, Aristotle, …
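The tree on this slide defines each category by its genus plus a differentia, which can be sketched as a small lookup table. The pairing of each differentia with a genus (for example, material with BODY) follows the left-to-right order on the slide and is an assumption of this sketch, as are the names `TREE` and `definition`.

```python
# child category -> (parent genus, differentia), transcribed from the slide.
TREE = {
    "BODY": ("SUBSTANCE", "material"),
    "SPIRIT": ("SUBSTANCE", "immaterial"),
    "LIVING": ("BODY", "animate"),
    "MINERAL": ("BODY", "inanimate"),
    "ANIMAL": ("LIVING", "sensitive"),
    "PLANT": ("LIVING", "insensitive"),
    "HUMAN": ("ANIMAL", "rational"),
    "BEAST": ("ANIMAL", "irrational"),
}


def definition(category):
    """Return the supreme genus and the differentiae, most general first."""
    differentiae = []
    while category in TREE:
        category, differentia = TREE[category]
        differentiae.append(differentia)
    return category, list(reversed(differentiae))
```

Read this way, `definition("HUMAN")` spells out the classical definition: a substance that is material, animate, sensitive, and rational.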

  31. Locus of control for group ontology development • Centralized (as in the NCI Thesaurus) • Decentralized (as in the Open Directory Project)

  32. NCI Enterprise Vocabulary Services • 1997: R. Klausner, Director of NCI, wanted a “science management system” • Know about everything funded by NCI • Goals and results: “bench to bedside” • Thereby improve and speed translation of research • Approach: create integrative terminology; evolve terminology scope from supporting grants management to supporting science; build Web-accessible infrastructure (caCORE)

  33. The NCI Thesaurus

  34. NCI Thesaurus Guidelines • Develop content model • Leverage existing sources as appropriate (MeSH, VA NDF-RT, MedDRA, …) • Develop unique content where needed (cancer genes, gene products, cancer diagnoses, drugs, chemotherapies, molecular abnormalities, etc., and relationships among them) • Link to other standards using URLs where possible (OMIM, Swissprot, GO)

  35. NCI uses a Centralized Process

  36. Open Directory Project • Started in 1998 as a volunteer effort to develop an open-content directory of Web pages • In its first year, 4500 editors had indexed 100K Web sites • By July 2005, 69K editors had indexed 4.6M sites using 580K categories • On average, between 9K and 10K volunteer editors are working on ODP at any given time

  37. Dimensions for Ontology Change Management • Centralized vs. decentralized control • Continuous editing vs. periodic archiving • Curation vs. no curation • Monitored editing vs. nonmonitored editing

  38. Monitored editing in Protégé

  39. History of Changes is Stored in a “Change Ontology”
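The idea of storing change history as data can be sketched with a small log of change records that is replayed to rebuild any version. This is not Protégé's change ontology or API; the `Change` record, the two change kinds, and the helper names are hypothetical, and the example classes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One hypothetical change record: what was done, to what, and by whom."""
    kind: str    # "add_class" or "rename_class" (illustrative kinds only)
    target: str  # class the change applies to
    detail: str  # parent class for add_class, new name for rename_class
    author: str


def apply_change(classes, change):
    """Return a new class -> parent map with one change applied."""
    updated = dict(classes)
    if change.kind == "add_class":
        updated[change.target] = change.detail
    elif change.kind == "rename_class":
        updated[change.detail] = updated.pop(change.target)
        # Children of the renamed class must point at the new name.
        updated = {c: (change.detail if p == change.target else p)
                   for c, p in updated.items()}
    else:
        raise ValueError(f"unknown change kind: {change.kind}")
    return updated


def replay(log):
    """Rebuild the class hierarchy by applying every logged change in order."""
    classes = {}
    for change in log:
        classes = apply_change(classes, change)
    return classes
```

Replaying a prefix of the log yields an earlier version of the ontology, and the `author` field is what a curator would inspect when accepting or rejecting changes under monitored editing.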

  40. Workflow for Change Management

  41. The Goal: To Streamline NCI’s Cumbersome Process

  42. Why Most Ontologies Stagnate • It is tedious to evaluate ontological soundness by inspection • It is impossible to evaluate ontological coverage by inspection • It is often plain difficult to determine what an ontology is good for by inspection

  43. A Portion of the OBO Library

  44. Ontologies are not like journal articles • It is difficult to judge methodological soundness simply by inspection • We may wish to use an ontology even though some portions are not well designed or make distinctions that are different from those that we might want
