Towards Linguistically Grounded Ontologies
Paul Buitelaar, Philipp Cimiano, Peter Haase, and Michael Sintek
Proceedings of the 6th European Semantic Web Conference (ESWC’09), Heraklion, Greece, May/June 2009, 111-125
1 Introduction
• Ontologies need linguistic grounding because:
  • Easier for human developers
  • Automatic information extraction is easier
  • Helps in “verbalizing” an ontology
• RDFS, OWL, and SKOS (W3C standards) are not adequate
• Present a unified model, LexInfo, based on:
  • LingInfo
  • LexOnto
  • Lexical Markup Framework (LMF), an ISO standard
• Basis for future Semantic Web standardization
2 Motivation
• Separation between Linguistic and Ontological Level
• Flexible Coupling of the Ontological and Language Systems
• Subcategorization and Predicate-Argument Structure
• Why Related Work is Not Enough
Separation of Levels
• rdfs:label is not good enough:
    <rdfs:Class about="#Cat">
      <rdfs:label xml:lang="en">cat</rdfs:label>
      <rdfs:label xml:lang="en">cats</rdfs:label>
      <rdfs:label xml:lang="de">Katze</rdfs:label>
      <rdfs:label xml:lang="de">Katzen</rdfs:label>
    </rdfs:Class>
• Fails to capture linguistic relationships
• Linguistic data does not belong in the domain ontology
• Capture it in a separate linguistic model: a lexicon
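The flat labels above do not say that "cat" and "cats" are singular and plural of the same word. A minimal Python sketch of the alternative, a lexicon kept apart from the ontology and linked to it by reference, is shown here. The class `LexicalEntry`, its fields, and the helper `lookup` are hypothetical names chosen for illustration; they are not the LexInfo or LMF schema.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    """One lexicon entry (hypothetical structure, not the LexInfo/LMF schema)."""
    lemma: str
    language: str
    part_of_speech: str
    # Maps grammatical number to the written form, e.g. "plural" -> "cats".
    forms: dict = field(default_factory=dict)
    # Identifier of the ontology element this entry refers to.
    reference: str = ""


# The lexicon lives outside the domain ontology and only points into it.
LEXICON = [
    LexicalEntry("cat", "en", "noun",
                 {"singular": "cat", "plural": "cats"}, "#Cat"),
    LexicalEntry("Katze", "de", "noun",
                 {"singular": "Katze", "plural": "Katzen"}, "#Cat"),
]


def lookup(surface_form):
    """Return (ontology reference, language, number) for each matching form."""
    hits = []
    for entry in LEXICON:
        for number, form in entry.forms.items():
            if form == surface_form:
                hits.append((entry.reference, entry.language, number))
    return hits
```

With this layout, `lookup("Katzen")` identifies the form as the German plural of an entry that refers to `#Cat`, which four unrelated `rdfs:label` strings cannot express.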
Flexible Coupling of Layers
• Options for ‘Schweineschnitzel’ (pork cutlet):
  • ‘Schweineschnitzel’ => class Schweineschnitzel
  • ‘Schweineschnitzel’ => decomposed into components:
    • ‘schnitzel’ => class schnitzel
    • ‘schnitzel’ => class schnitzel and ‘Schweine’ => pork
• Need flexibility in how linguistic and ontological elements are related
• The two levels need not be “fully synchronized”
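The three coupling options for ‘Schweineschnitzel’ can be sketched as data. This is only an illustration under assumed names: `Component`, `CompoundEntry`, `linked_references`, and the identifiers `#Schweineschnitzel`, `#Schnitzel`, and `#Pork` are hypothetical and do not come from the paper's model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """One part of a compound term (hypothetical structure)."""
    surface: str
    reference: Optional[str] = None  # ontology element, if any


@dataclass(frozen=True)
class CompoundEntry:
    """A term that may link to the ontology as a whole or through its parts."""
    surface: str
    reference: Optional[str] = None
    components: tuple = ()


# Option 1: the whole term maps to one class.
whole = CompoundEntry("Schweineschnitzel", "#Schweineschnitzel")

# Option 2: only the head 'schnitzel' is linked to a class.
head_only = CompoundEntry(
    "Schweineschnitzel",
    components=(Component("Schweine"), Component("schnitzel", "#Schnitzel")),
)

# Option 3: both components are linked.
both = CompoundEntry(
    "Schweineschnitzel",
    components=(Component("Schweine", "#Pork"),
                Component("schnitzel", "#Schnitzel")),
)


def linked_references(entry):
    """Collect every ontology element the entry is coupled to."""
    refs = [entry.reference] if entry.reference else []
    refs += [c.reference for c in entry.components if c.reference]
    return refs
```

The same surface term yields a different set of links in each case, which is the sense in which the lexicon and the ontology do not have to be fully synchronized.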
Subcategorization and Predicate Arguments
• Part-of-speech information is essential:
  • (Germany, capital, Berlin): capital is a noun
• Need subcategorization frames:
  • (Rhein, flowsThrough, Karlsruhe): flow is intransitive and requires a through-phrase; flow => flows
• Must capture variation of expression:
  • locatedAt: passes by, connects, goes through
• Map verb arguments to predicate arguments:
  • [The A8: subject] connects [Karlsruhe: direct object] => (Karlsruhe, locatedAt, A8)
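The mapping from verb arguments to predicate arguments can be illustrated with a small Python sketch. It assumes the sentence has already been parsed into syntactic roles; the `Frame` class, the role names `subject`, `direct_object`, and `pp_through`, and the function `to_triple` are hypothetical and are not part of LexInfo or LexOnto.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A subcategorization frame tied to an ontology property (hypothetical)."""
    verb: str        # verb lemma
    predicate: str   # ontology property the verb expresses
    arg_map: tuple   # pairs of (syntactic role, triple position)


FRAMES = {
    # "The A8 connects Karlsruhe": the syntactic subject becomes the triple's object.
    "connect": Frame("connect", "locatedAt",
                     (("subject", "object"), ("direct_object", "subject"))),
    # "The Rhein flows through Karlsruhe": intransitive verb plus a through-phrase.
    "flow": Frame("flow", "flowsThrough",
                  (("subject", "subject"), ("pp_through", "object"))),
}


def to_triple(verb_lemma, syntactic_args):
    """Turn parsed verb arguments into a (subject, predicate, object) triple."""
    frame = FRAMES[verb_lemma]
    slots = {}
    for role, position in frame.arg_map:
        slots[position] = syntactic_args[role]
    return (slots["subject"], frame.predicate, slots["object"])
```

For example, `to_triple("connect", {"subject": "A8", "direct_object": "Karlsruhe"})` produces the triple from the slide, with the argument order reversed relative to the sentence. Covering the other expressions of locatedAt would mean adding one frame per verb.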
Why Related Work is Not Enough
• More expressive models are needed:
  • Capture morphology separately
  • Represent decomposition and linking of components
  • Model complex linguistic patterns, e.g. subcategorization frames
  • Specify meaning with respect to a domain ontology
  • Clearly separate linguistic and ontological levels
• SKOS, LMF, LexOnto, NLP frameworks, and LWF each fail to meet some of these requirements
3 Towards an Ontological and Linguistic Joint Model
• Previous Work
  • LingInfo: direct connection of linguistic information to classes and properties
  • LexOnto: subcategorization frames and their relation to properties
  • Lexical Markup Framework (LMF): core package plus extensions for morphology, syntax, and semantics
• The LexInfo Model: built on LMF, integrates the LingInfo and LexOnto models
The LexInfo Model
• Req. 1: Morphology Relations
  • Already done in LMF
• Req. 2: Decomposition of Complex Terms
  • ListOfComponents extends LMF morphology
  • Make owl:Entity a subclass of lmf:Sense
• Req. 3: Subcategorization Frames
  • Link lmf:SyntacticBehavior to lmf:PredicativeRepresentation
  • Additional subclasses for LMF classes
• Req. 4: Relate to Domain Ontologies
  • Automatic by linking to domain ontologies
• Req. 5: Separation Between Linguistics and Ontologies
  • Fully separate, related by OWL2 meta-ontology
4 Conclusions
• Language/knowledge interface too complex for RDFS/OWL/SKOS alone
• LingInfo allows publishing reusable models
• Other models fall short of requirements
• LexInfo integrates LingInfo and LexOnto models using LMF as the “glue”
• Ontologies and Java API available on the Web
• Intend to continue development and to work with the LMF working group
• Basis for further standardization