Explore how adding instance recognition semantics to OWL can automate web data annotation, improving scalability and compatibility with standards. Learn about declarative semantics and implementation using OWL-AA.
OWL-AA: Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation
2006 Spring Research Conference
Yihong Ding
Semantic Web and Automated Semantic Annotation
• Semantic Web: the web containing machine-processable web data
• Semantic Annotation: adds formal metadata to web pages
  • Metadata links data in a web page to defined concepts in an ontology
  • Annotated data becomes machine-processable
• Annotation needs automation to be scalable
“Main Drawback” of Current Automated Semantic Annotation
• Problem: “post-processing and mapping of the IE [information extraction] results to an ontology” [Kiryakov 2004]
  • Needs human intervention
  • Decreases system automation and scalability
• Solution: “use ontolog[ies] more directly during the process of extraction” [Kiryakov 2004]
  • Does work (as our ontology-based annotation shows)
  • But …
A Hidden Problem: Compatibility with Standards
• A solution should be compatible with semantic web standards
  • OWL (Web Ontology Language): standard
  • Solutions must be OWL-compatible
• Current Solution
  • OSMX (Object-oriented Systems Model in XML): not a standard, not OWL-compatible
• Declarative instance recognition semantics
  • Needed by automated annotation process
  • Lacking in OWL
Instance Recognition Semantics in Extraction Ontologies
• Instance recognition semantics: machine-processable recognizers of instances that belong to the extension of a concept in a specified domain
• Examples in extraction ontologies
  • External Representation
    • Price: \d+|\d?\d?\d,\d\d\d
    • Make: CarMake.lexicon
  • Contextual Representation
    • Context phrases (left, right), e.g. \$?
    • Context keywords, e.g. price | obo | neg(\.|otiable)
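
As a rough illustration of how such recognizers might be applied, the sketch below combines the Price value pattern and context keywords from the slide with a tiny in-memory lexicon. The function names, the token-based matching strategy, and the three-entry make list are assumptions made for this example (the contents of CarMake.lexicon are not given in the slides); this is not the extraction-ontology engine itself.

```python
import re

# Value pattern and context keywords copied from the slide.
PRICE_VALUE = re.compile(r"\d+|\d?\d?\d,\d\d\d")
PRICE_KEYWORD = re.compile(r"price|obo|neg(\.|otiable)", re.IGNORECASE)

# Hypothetical stand-in for CarMake.lexicon; the real lexicon is not shown.
CAR_MAKES = {"ford", "honda", "toyota"}


def recognize_prices(text):
    """Return tokens that look like a Price and have supporting context.

    A candidate is accepted when it is prefixed by "$" (left context phrase)
    or when a neighboring token is one of the context keywords.
    """
    tokens = text.split()
    prices = []
    for i, token in enumerate(tokens):
        has_dollar = token.startswith("$")
        value = token.lstrip("$").rstrip(".,;")
        if not PRICE_VALUE.fullmatch(value):
            continue
        neighbors = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
        has_keyword = any(
            PRICE_KEYWORD.fullmatch(n.strip(",;")) for n in neighbors
        )
        if has_dollar or has_keyword:
            prices.append(value)
    return prices


def recognize_makes(text):
    """Return words that appear in the (hypothetical) make lexicon."""
    words = re.findall(r"[A-Za-z]+", text)
    return [w for w in words if w.lower() in CAR_MAKES]
```

With these assumptions, a bare number such as a model year is rejected as a Price unless a context phrase or keyword supports it, which is the role the contextual representation plays on the slide.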
OWL: Lacks Instance Recognition Semantics
• In general, OWL
  • Declares classes, properties, hierarchical relationships, and restrictions
  • Declares instantiations
  • Does not support declaration of “instance recognition”
• Consequently,
  • Not enough declarative semantics in OWL directly usable by automated annotation
  • Mixture of knowledge declaration and knowledge processing
    • Domain experts must know program implementation;
    • Or, program developers must be domain experts.
  • No annotation integrity checking
    • <carad:Make>Taurus</carad:Make> is legal, though it is incorrect (Taurus is a car model, not a make);
    • And, machines cannot catch this error.
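
To make the integrity-checking point concrete, here is a minimal sketch of what a machine could do once recognizers are declared alongside concepts: validate each annotated value against the recognizer for its concept. The namespace URI, the `RECOGNIZERS` table, the make list, and the `check_annotations` function are all hypothetical and are not part of OWL, OWL-AA, or OSMX.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the "carad" prefix used on the slide.
CARAD_NS = "http://example.org/carad#"

# Hypothetical recognizers keyed by concept name.
RECOGNIZERS = {
    "Make": lambda v: v.lower() in {"ford", "honda", "toyota"},
    "Price": lambda v: re.fullmatch(r"\d+|\d?\d?\d,\d\d\d", v) is not None,
}


def check_annotations(xml_text):
    """Return (concept, value) pairs whose value fails its recognizer."""
    root = ET.fromstring(xml_text)
    errors = []
    for elem in root.iter():
        # Only inspect elements in the carad namespace.
        if not elem.tag.startswith("{" + CARAD_NS + "}"):
            continue
        concept = elem.tag.split("}", 1)[1]
        recognizer = RECOGNIZERS.get(concept)
        value = (elem.text or "").strip()
        if recognizer is not None and not recognizer(value):
            errors.append((concept, value))
    return errors


doc = (
    '<ad xmlns:carad="http://example.org/carad#">'
    "<carad:Make>Taurus</carad:Make>"
    "<carad:Price>12,500</carad:Price>"
    "</ad>"
)
print(check_annotations(doc))
```

Under these assumptions the Make annotation is flagged because "Taurus" is not in the make list, while the Price annotation passes.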
Implementation
• Jena API converts OWL-AA ontologies to OSMX ontologies
• Use OSMX ontologies to do automated annotation
Conclusion
• OWL-AA is a way to extend OWL to provide for automated semantic annotation.
• OWL-AA overcomes the “main drawback” of automated semantic annotation.
• OWL-AA allows us to separate the creation of domain knowledge from the implementation of a processor to use domain knowledge for the purpose of annotating web pages.
• OWL-AA provides for annotation integrity checking.
Instantiation vs. Instantiation Declaration [figure]
Instance Recognition Semantics
• Machine-processable recognizers of instances that belong to the extension of a concept in a specified domain
• IRecS of Right Concept: no line in eye
• IRecS of Left Concept: has line in eye