Who will Classify the Classifications? Differing Views of Biomedical Ontology Mark A. Musen Stanford Medical Informatics Stanford University
Porphyry’s depiction of Aristotle’s Categories • Supreme genus: SUBSTANCE • Differentiae: material, immaterial • Subordinate genera: BODY, SPIRIT • Differentiae: animate, inanimate • Subordinate genera: LIVING, MINERAL • Differentiae: sensitive, insensitive • Proximate genera: ANIMAL, PLANT • Differentiae: rational, irrational • Species: HUMAN, BEAST • Individuals: Socrates, Plato, Aristotle, …
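The tree on this slide can be read as a small data structure: each category has one parent genus and the differentia that separates it from its sibling. The sketch below is only an illustration of that reading. The table contents come from the slide, but the names `PORPHYRY_TREE`, `genus_chain`, and `definition` are made up here.

```python
# Porphyry's tree as a child -> (parent genus, differentia) table.
# The categories and differentiae are taken from the slide; the
# Python names are illustrative assumptions.
PORPHYRY_TREE = {
    "BODY": ("SUBSTANCE", "material"),
    "SPIRIT": ("SUBSTANCE", "immaterial"),
    "LIVING": ("BODY", "animate"),
    "MINERAL": ("BODY", "inanimate"),
    "ANIMAL": ("LIVING", "sensitive"),
    "PLANT": ("LIVING", "insensitive"),
    "HUMAN": ("ANIMAL", "rational"),
    "BEAST": ("ANIMAL", "irrational"),
}


def genus_chain(category):
    """Return the path from a category up to the supreme genus."""
    chain = [category]
    while chain[-1] in PORPHYRY_TREE:
        parent, _ = PORPHYRY_TREE[chain[-1]]
        chain.append(parent)
    return chain


def definition(category):
    """Define a category as the supreme genus plus its accumulated differentiae."""
    differentiae = []
    node = category
    while node in PORPHYRY_TREE:
        parent, differentia = PORPHYRY_TREE[node]
        differentiae.append(differentia)
        node = parent
    # node is now the supreme genus; list the differentiae from the top down
    return " ".join(list(reversed(differentiae)) + [node])


print(genus_chain("HUMAN"))
print(definition("HUMAN"))
```

Read this way, a definition is nothing more than the path of distinctions from the root, which is the classical genus-plus-differentia pattern that many later classifications imitate.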
This talk is meant to be descriptive in nature • We accept that an “ontology” is anything that its developers call an “ontology” • Our goal is to survey how biomedical “ontologies” are being used in practice • Ultimately, this survey will result in principles for development of all classes of “ontologies”
What has happened to the “o” word in informatics? • 10 years ago, it was the term that would not speak its name • 7 years ago, the Gene Ontology made the “o” word fashionable • Suddenly, a thousand flowers are blooming, and everyone has his own ontology • Ontology development has become a widespread cottage industry
“An ontology is a specification of a conceptualization” (T. Gruber) • A conceptualization is the way we think about a domain • A specification provides a formal way of writing it down
Creating ontologies has become a widespread cottage industry • Professional Societies • MGED: Microarray Gene Expression Data Society • HUPO: Human Proteome Organization • Government • NCI Thesaurus • NIST: Process Specification Language • Open Biological Ontologies • GO • Three dozen (and growing) other ontologies
Lots of ontology builders are not very good philosophers • Nearly always, ontologies are created to address pressing professional needs • The people who have the most insight into professional knowledge may have little appreciation for metaphysics, principles of knowledge representation, or computational logic • There simply aren’t enough good philosophers to go around
Goals of Biomedical Ontologies • to provide a classification of biomedical entities • to summarize and annotate data • to mediate among different social groups • to mediate among different software components • to simplify the engineering of complex software systems • to provide a formal specification of biomedical knowledge
Classification of biomedical entities • To classify is human … • But biomedicine did not really get into the act until Linnaeus and the advent of the ICD • The classifications that drive much of health care do not describe natural kinds: LOINC, CPT, ICD, DSM • Many classifications have huge societal implications • “Premenstrual syndrome” and “Homosexuality” as disorders in DSM • “Menopause” as a disease in ICD
Racial classifications under apartheid reflected biological “truths” • Europeans • Asiatics • Persons of mixed race (coloureds) • Bantus • Xhosa • Zulu • and six other groups …
Many classifications • Enforce or preserve existing social conventions • Are motivated because making particular distinctions is to the advantage of some social group • Reinforce or even create a perceived “reality” by legitimizing certain distinctions
The Proliferation of Nursing Vocabularies • International Classification of Nursing Practice (ICNP) • Nursing Intervention Lexicon and Taxonomy • The Omaha System: Nursing diagnoses, interventions, and clinical outcomes • Nursing Interventions Classifications
Some classes from the Nursing Interventions Classification • Cultural Brokerage: Bridging, negotiating, or linking the orthodox health care system with a patient and family of a different culture • Spiritual Support: Assisting the patient to feel balance and connection with a greater power • Humor: Facilitating the patient to perceive, appreciate, and express what is funny, amusing, or ludicrous in order to establish relationships, relieve tension, release anger, facilitate learning, or cope with personal feelings
The ICD keeps doctors in business • 724 Unspecified disorders of the back • 724.0 Spinal stenosis, other than cervical • 724.00 Spinal stenosis, unspecified region • 724.01 Spinal stenosis, thoracic region • 724.02 Spinal stenosis, lumbar region • 724.09 Spinal stenosis, other • 724.1 Pain in thoracic spine • 724.2 Lumbago • 724.3 Sciatica • 724.4 Thoracic or lumbosacral neuritis • 724.5 Backache, unspecified • 724.6 Disorders of sacrum • 724.7 Disorders of coccyx • 724.70 Unspecified disorder of coccyx • 724.71 Hypermobility of coccyx • 724.79 Coccygodynia • 724.8 Other symptoms referable to back • 724.9 Other unspecified back disorders
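In the codes on this slide, the hierarchy is carried by the code string itself: dropping the last digit of a code gives the next-broader category. The following is a minimal sketch of that convention. It assumes only the layout visible on the slide (a three-digit category, optionally followed by a period and one or two digits), and the function names are illustrative.

```python
# A few labels copied from the slide, used only to print readable output.
LABELS = {
    "724": "Unspecified disorders of the back",
    "724.0": "Spinal stenosis, other than cervical",
    "724.02": "Spinal stenosis, lumbar region",
}


def parent_code(code):
    """Return the next-broader code, or None for a three-digit category.

    Assumes the layout shown on the slide: 'NNN', 'NNN.N', or 'NNN.NN'.
    """
    if "." not in code:
        return None
    category, detail = code.split(".")
    if len(detail) == 1:
        return category
    return f"{category}.{detail[:-1]}"


def ancestors(code):
    """Return all broader codes, nearest first."""
    chain = []
    current = parent_code(code)
    while current is not None:
        chain.append(current)
        current = parent_code(current)
    return chain


for code in ["724.02"] + ancestors("724.02"):
    print(code, LABELS[code])
```

The design point is that no explicit parent pointers are stored. The classification is positional, so a new code can only be added where the numbering scheme leaves room for it.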
Summarization and annotation of data • Biologists don’t care about modeling reality beyond their data • Biologists care about • Making sense of terabytes of data • Accessing and indexing data • Comparing data sets with one another • The goal is to create annotations that make distinctions about the data, not about the world
SOFG Anatomy Entry List (about half of 121 terms) • SAEL:0 @part_of@abdomen ; SAEL:1 @part_of@adipose tissue ; SAEL:2 @part_of@adrenal gland ; SAEL:3 @part_of@amygdala ; SAEL:4 @part_of@anal canal ; SAEL:5 @part_of@aorta ; SAEL:6 @part_of@appendix ; SAEL:7 @part_of@blood ; SAEL:8 @part_of@blood vessel ; SAEL:9 @part_of@bone ; SAEL:10 @part_of@bone marrow ; SAEL:11 @part_of@brain ; SAEL:12 @part_of@brainstem ; SAEL:13 @part_of@bronchus ; SAEL:14 @part_of@caecum ; SAEL:15 @part_of@cardiovascular system ; SAEL:16 @part_of@cartilage ; SAEL:17 @part_of@cerebellum ; SAEL:18 @part_of@cerebral cortex ; SAEL:19 @part_of@cerebral hemisphere ; SAEL:20 @part_of@cochlea ; SAEL:21 @part_of@colon ; SAEL:22 @part_of@connective tissue ; SAEL:23 @part_of@corpus callosum ; SAEL:24 @part_of@decidua ; SAEL:25 @part_of@definitive endoderm ; SAEL:26 @part_of@dermis ; SAEL:27 @part_of@digestive system ; SAEL:28 @part_of@digit ; SAEL:29 @part_of@dorsal root ganglion ; SAEL:30 @part_of@duodenum ; SAEL:31 @part_of@ectoderm ; SAEL:32 @part_of@endocrine system ; SAEL:33 @part_of@endoderm ; SAEL:34 @part_of@epidermis ; SAEL:35 @part_of@epididymis ; SAEL:36 @part_of@exocrine system ; SAEL:37 @part_of@external ear ; SAEL:38 @part_of@extra-embryonic structures ; SAEL:39 @part_of@eyeball ; SAEL:40 @part_of@foot ; SAEL:41 @part_of@fore limb ; SAEL:42 @part_of@forebrain ; SAEL:43 @part_of@gall bladder ; SAEL:44 @part_of@hand ; SAEL:45 @part_of@head ; SAEL:46 @part_of@heart ; SAEL:47 @part_of@hematolymphoid system ; SAEL:48 @part_of@hind limb ; SAEL:49 @part_of@hindbrain ; SAEL:50 @part_of@hippocampus ; SAEL:51 @part_of@hypothalamus ; SAEL:52 @part_of@ileum ; SAEL:53 @part_of@inner ear ; SAEL:54 @part_of@intestine ; SAEL:55 @part_of@jejunum ;
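An entry list like this one is meant for annotating data, so the natural thing to do with it is load it into a lookup table. The sketch below parses entries in the form they appear on the slide (identifier, a relation tag between `@` signs, a term label, then `;`). The regular expression and function names are assumptions made for illustration, and the code does not interpret what the relation tag points to.

```python
import re

# One entry as it appears on the slide: "SAEL:46 @part_of@heart ;"
ENTRY = re.compile(r"(SAEL:\d+)\s+@(\w+)@([^;]+?)\s*;")

# A short excerpt copied from the slide.
sample = "SAEL:0 @part_of@abdomen ; SAEL:1 @part_of@adipose tissue ; SAEL:46 @part_of@heart ;"


def parse_entries(text):
    """Return (identifier, relation tag, term label) tuples in order of appearance."""
    return [match.groups() for match in ENTRY.finditer(text)]


def index_by_term(entries):
    """Map each term label to its identifier, for annotating data sets."""
    return {term: identifier for identifier, _, term in entries}


entries = parse_entries(sample)
lookup = index_by_term(entries)
print(entries)
print(lookup["heart"])
```

A flat lookup of this kind is enough to tag two data sets with the same identifiers and compare them, which is the use the slide before this one describes.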
Mediation among different social groups • We all know how ICD and CPT codes propagate from clinicians to health-care organizations to payors to epidemiologists to policy makers • Within institutions, ontologies provide the basis for getting our work done • No coded lab test results, no treatment • No ICD code, no reimbursement
Mediation among different software components • HL7 was developed simply to get individual departmental information systems to talk to one another • The implied ontology is one of messages, not of entities in the real world • Builds on longstanding work on machine interoperability
An ontology for CAD/CAM: STEP • Provides an international standard for interacting computer-aided design and manufacturing applications • Defines over 1300 classes of objects, addressing areas such as • Geometry and topology • Product configuration • Form features • Tolerances
Sample STEP Class Definition
ENTITY part_model SUBTYPE OF (design_model);
  nominal_shape: shape_model;
  model_units: units;
  part_features: OPTIONAL LIST (0:#) OF form_features;
  part_tolerances: OPTIONAL LIST (0:#) OF shape_tolerances;
  equivalents: OPTIONAL LIST (0:#) OF part_model_structure;
WHERE
  NOT (part_model IN equivalents.model_element);
END_ENTITY;
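For readers more used to object-oriented code than to STEP's notation, the entity above maps fairly directly onto a class with typed attributes and one validity rule. The sketch below is a loose Python rendering, not a STEP implementation. The supporting types are reduced to strings or empty stand-ins, `ModelStructure` and `check_where_rule` are hypothetical names, and the reading of the WHERE clause (a part model may not appear among its own equivalents) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DesignModel:
    """Stand-in for the design_model supertype; its attributes are not shown on the slide."""


@dataclass
class ModelStructure:
    """Hypothetical stand-in for part_model_structure: wraps the model it refers to."""
    model_element: DesignModel


@dataclass
class PartModel(DesignModel):
    nominal_shape: str   # shape_model on the slide; a plain string here for brevity
    model_units: str     # units on the slide
    # The three OPTIONAL LIST attributes default to empty lists.
    part_features: List[str] = field(default_factory=list)
    part_tolerances: List[str] = field(default_factory=list)
    equivalents: List[ModelStructure] = field(default_factory=list)

    def check_where_rule(self):
        """Assumed reading of the WHERE clause: no equivalent may point back at this model."""
        return all(e.model_element is not self for e in self.equivalents)


bracket = PartModel(nominal_shape="L-bracket", model_units="mm")
plate = PartModel(nominal_shape="plate", model_units="mm")

bracket.equivalents.append(ModelStructure(model_element=plate))
print(bracket.check_where_rule())  # another model as equivalent: rule holds

bracket.equivalents.append(ModelStructure(model_element=bracket))
print(bracket.check_where_rule())  # model listed as its own equivalent: rule fails
```

The class describes what an exchanged record must look like rather than what a part is, which matches the point made for HL7: the implied ontology is one of messages between applications.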
Engineering of complex software systems • Object-oriented design and programming is well entrenched in current software-engineering practices • OOP owes a considerable debt to the frame systems developed in AI in the 1970s • Ontologies are now (at least syntactically) at the core of advanced software engineering
BioSTORM Data Flow [Figure: data-flow diagram. Labels: Data Source Ontology; Mapping Ontology; Control Structure; Data Broker; Data Mapper; Heterogeneous Input Data; Semantically Uniform Data; Customized Output Data; Data Sources; Data Regularization Middleware; Epidemic Detection Problem Solvers]
Formal specification of biomedical knowledge • In the “information society,” there will be increasing motivation for representing human knowledge in machine-processable form • Ontologies of professional knowledge are being seen as having value even for their own sake
American Board of Family Practice Ontology of Clinical Care [Figure: concept diagram. Labels as extracted: Interacts with; Agents of Change; Identity; Use; Exposed to; Related to; Lead to; Population; Record; Health States; Courses of Action; Treat; Has; Associated with; Contacts; Alter; Exhibits; Reveal; Evaluate; Findings; Link to]
There’s good news and bad news • We’ve been incredibly successful: Everyone is building “ontologies” • The ontologies that people are building • Model all kinds of realities • Make distinctions for a variety of political, social, engineering, and metaphysical reasons • Are of varying semantic “quality” and “robustness” • Are not going to go away easily
We have burgeoning opportunities to teach ontology developers about • Principled knowledge-representation techniques • Use of lexical corpora to inform ontology development • Methods for validating ontology design • Tools and methodologies to aid in the development process
We have to keep in mind • Ontologies are built with purposes in mind • These purposes reflect political, social, economic, and engineering goals that have little to do with metaphysics • Making these additional considerations explicit will lead to “purer” and more useful ontologies, at the risk of exposing issues that developers might rather leave buried