W3C Life Science Ontology Issues: Session on Triples and Ontologies
Why the Tower of Babel Exists
• Biomedical science is a bottom-up enterprise
  • Efficiency of competitive systems
  • Multiple independent discovery
  • External enabling technology and knowledge
• Tension between dissemination and control
  • Fundamental desire to be cited
  • Fundamental need to control intellectual property
• Implicit citation through nomenclature
  • If you are using my name, you are citing my discovery
Name Space Collisions
• Molecular biology has an extraordinarily complex vocabulary
  • Many terms with highly specific meanings that are used rarely
    • Plasmid, pUC13, M13, cosmid, fosmid, YAC, BAC, PAC, …
    • All cloning vectors, each with specific properties and uses
  • High information content per word
• Compression through acronyms => collisions across domains
  • PCR
    • Polymerase Chain Reaction
    • Historically, MeSH indexed PCR as an abbreviation for “premature contraction” in cardiology
    • Phosphocreatine in metabolism and physiology
• Specific definitions with high information content
  • Association
    • Generally a rather vague relationship
    • In statistical genetics, a precisely defined criterion implying that specific tests for significance have been met
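The PCR collision above can be made concrete with a small lookup keyed on both the acronym and the domain. This is only a minimal sketch: the `SENSES` table and the helper names are hypothetical, and the label "molecular biology" for the polymerase chain reaction sense is an assumption (the slide names domains only for the other two senses).

```python
# Hypothetical sense inventory: (acronym, domain) -> expansion.
# The three PCR senses come from the slide; the domain label
# "molecular biology" is an assumed label for illustration.
SENSES = {
    ("PCR", "molecular biology"): "polymerase chain reaction",
    ("PCR", "cardiology"): "premature contraction",
    ("PCR", "metabolism"): "phosphocreatine",
}


def expansions(acronym):
    """Return every domain -> expansion pair recorded for an acronym."""
    return {domain: text for (acr, domain), text in SENSES.items() if acr == acronym}


def has_collision(acronym):
    """An acronym collides when more than one domain claims it."""
    return len(expansions(acronym)) > 1


def resolve(acronym, domain):
    """Resolve an acronym within a domain; None if that domain has no recorded sense."""
    return SENSES.get((acronym, domain))


if __name__ == "__main__":
    print(has_collision("PCR"))
    print(resolve("PCR", "cardiology"))
```

The point of the design is that the acronym alone is never a sufficient key: a consumer has to supply the domain, which is exactly the information that is lost when vocabularies are merged across fields.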
Biomedical Text is Not “Well Classifiable”
• Classifiable domain
  • Well-defined, robust classes
  • Class definitions roughly robust to the choice of algorithms and metrics
• Poorly classifiable domains
  • Class boundaries not clear, class definitions not robust
  • Really just saying the best classification is one big class
• Biomedical text is a web, not a collection of well-defined, domain-specific corpora
  • Is an article about p53 molecular biology, gene expression regulation, or cancer biology?
Probabilistic Nature of Biomedical Knowledge
• Bayes' rule
  • I know what I have observed
  • I can only probabilistically rank hypotheses
  • Understanding evolves as more data become available
• Language links to understanding
  • As the understanding evolves, the meaning of the language evolves
  • Ask 3 biologists to define a gene and you will get 5 definitions and 2 dissenting opinions
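The Bayes' rule bullets above can be illustrated with a short update loop: observations are fixed, hypotheses are only ranked, and the ranking shifts as data accumulate. This is a sketch under stated assumptions; the hypothesis names, priors, and likelihoods are invented for illustration and do not come from the talk.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(H|D) = P(D|H) * P(H) / sum over H' of P(D|H') * P(H')."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: p / evidence for h, p in unnormalized.items()}


def rank(beliefs):
    """Order hypotheses from most to least probable."""
    return sorted(beliefs, key=beliefs.get, reverse=True)


# Hypothetical priors over three competing hypotheses (illustrative values).
beliefs = {"H1": 0.5, "H2": 0.3, "H3": 0.2}
initial_ranking = rank(beliefs)

# Hypothetical likelihoods P(observation | hypothesis) for two successive observations.
observations = [
    {"H1": 0.2, "H2": 0.6, "H3": 0.2},
    {"H1": 0.1, "H2": 0.7, "H3": 0.2},
]

# Each posterior becomes the prior for the next observation.
for likelihoods in observations:
    beliefs = posterior(beliefs, likelihoods)

final_ranking = rank(beliefs)

if __name__ == "__main__":
    print(initial_ranking, final_ranking)
```

With these made-up numbers the initially favored hypothesis loses first place after two observations, and no hypothesis is ever ruled in or out with certainty.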
Questions for Ontologies Session
• How do we represent probabilistic concepts and meanings with logically precise standards?
• How do we associate the appropriate domain-specific ontology(ies) with the text we are analyzing?
• How do we create sustainable merges across evolving domain-specific ontologies?