Integrating lexical units, synsets and ontology in the Cornetto Database

Piek Vossen (1, 2), Isa Maks (1), Roxane Segers (1), Hennie van der Vliet (1). 1: Faculty of Arts, Vrije Universiteit Amsterdam; 2: Irion Technologies, Delft. Project Cornetto, financed by NTU (Dutch Language Union).


Presentation Transcript


  1. Integrating lexical units, synsets and ontology in the Cornetto Database
     Piek Vossen (1, 2), Isa Maks (1), Roxane Segers (1), Hennie van der Vliet (1)
     1: Faculty of Arts, Vrije Universiteit Amsterdam
     2: Irion Technologies, Delft

  2. Project Cornetto
     Financed by NTU (Dutch Language Union)
     STEVIN: Dutch-Flemish Research Programme for Dutch Language and Speech Technology (2004-2011)
     Consortium partners:
     • VUA (Vrije Universiteit Amsterdam, General Linguistics Department)
     • UvA (University of Amsterdam, Informatics Institute)
     • K.U. Leuven (Katholieke Universiteit Leuven, Department of Computer Science)
     • Irion Technologies BV, Delft
     LREC conference, Marrakech, 28-29-30 May 2008

  3. Overview
     • Goals of the project
     • What's in the Cornetto database?
     • Integrating the ontology: SUMO terms and new axioms

  4. Goals of the Cornetto project
     • Cornetto: COmbinatorial Relational NEtwork voor Taal TOepassingen (Combinatorial Relational Network for Language Applications)
     • Goal: to develop a lexical semantic database for Dutch:
       • 40K entries: the generic and central part of the language
       • Rich horizontal and vertical semantic relations
       • Combinatoric information
       • Ontological information

  5. Approach
     • Combine the information from two existing Dutch lexical resources:
       • The Dutch wordnet (DWN): synsets and lexical semantic relations
       • The Referentiebestand Nederlands (RBN): morpho-syntactic information, semantic information, pragmatic information, frame structures, lexical functions and combinatorics
     • Link to English WordNet
     • Link to WordNet Domains
     • Link to SUMO

  6. Project overview
     [Diagram. Labels: Referentie Bestand, Dutch Wordnet, English Wordnet, WN-DOMAINS, SUMO (KIF), DOLCE (KIF); Ontology: Dolce, Sumo; Align/Merge; Cornetto; Editing; Acquisition Toolkit; Validation; Corpus.]
     Fields listed for Cornetto: Entry, LU/Synset, Pos, DWN data, RBN data, SUMO-pointer, PWN-pointer, Domain
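
The field labels on the project overview slide (Entry, LU/Synset, Pos, DWN data, RBN data, SUMO-pointer, PWN-pointer, Domain) can be pictured as one record per entry. The following is only a minimal sketch: the class name `CornettoRecord`, the field types, the identifier formats and the sample values are assumptions made for illustration, not the actual Cornetto schema.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class CornettoRecord:
    """Hypothetical record mirroring the field labels on the overview slide."""
    entry: str                                     # Entry: the word form
    lu_id: str                                     # LU identifier (format assumed)
    synset_id: str                                 # Synset identifier (format assumed)
    pos: str                                       # Pos: part of speech
    dwn_data: dict = field(default_factory=dict)   # data taken over from DWN
    rbn_data: dict = field(default_factory=dict)   # data taken over from RBN
    sumo_pointer: Optional[Tuple[str, str]] = None  # (relation, SUMO term)
    pwn_pointer: Optional[str] = None              # link to Princeton WordNet
    domain: Optional[str] = None                   # WordNet Domains label

    def is_fully_linked(self) -> bool:
        """True when both the SUMO and the PWN pointers are filled in."""
        return self.sumo_pointer is not None and self.pwn_pointer is not None


# Illustrative values only; the identifiers are made up.
record = CornettoRecord(
    entry="thee",
    lu_id="lu-0001",
    synset_id="syn-0001",
    pos="noun",
    sumo_pointer=("+", "Tea"),
)
print(record.is_fully_linked())  # False: no PWN pointer has been set
```

Keeping the DWN and RBN payloads as separate fields reflects the slide's distinction between the two source resources; how the real database stores them is not stated here.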

  7. Data Organization
     [Diagram]
     • Lexical Unit (LU): corresponds to a word-meaning pair; form, morphology, syntax, semantics, pragmatics, usage examples
     • Synset: synonyms; models meaning relations
     • SUMO, MILO: collection of terms and axioms
     • Other labels: Internal relations; Princeton Wordnet, Wordnet Domains; Czech, German, Korean, Spanish, Arabic and French Wordnet

  8. Integrating the ontology: SUMO terms and new axioms

  9. Rationale for an ontological layer
     • Formal and fundamental model of meaning
     • Detection of inconsistencies
     • Formal reasoning
     • Global semantic grid

  10. SUMO/MILO as ontological framework
      • Chosen on pragmatic grounds:
        • availability, size, coverage
        • linking to English Wordnet
        • mapping to other Wordnet-like projects

  11. KIF Expressions vs triplets
      • Axioms in SUMO are written in SUO-KIF
      • Cornetto: replaced by triplets, based on first-order logic
      SUMO                          Cornetto triplet
      (and (exists ?L ?W)
           (instance, ?W, Water)    (instance, 0, Water)
           (instance, ?L, Liquid)   (instance, 1, Liquid)
           (Attribute, ?L, ?W))     (Attribute, 1, 0)
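
The rewriting on this slide, from variable-based conjuncts to numbered triplets, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `to_triplets` is hypothetical, and the numbering rule (each variable gets the next integer in order of first appearance) is inferred from the single example on the slide, not taken from the Cornetto tooling.

```python
def to_triplets(conjuncts):
    """Replace ?variables by integers, numbered in order of first appearance.

    `conjuncts` is a list of (relation, arg1, arg2) tuples. The numbering
    rule is an assumption inferred from the slide's example.
    """
    index = {}      # variable name -> assigned integer
    triplets = []
    for relation, *args in conjuncts:
        new_args = []
        for arg in args:
            if isinstance(arg, str) and arg.startswith("?"):
                if arg not in index:
                    index[arg] = len(index)
                new_args.append(index[arg])
            else:
                new_args.append(arg)   # constants such as Water stay as-is
        triplets.append((relation, *new_args))
    return triplets


# The conjuncts of the SUMO expression shown on the slide.
sumo_conjuncts = [
    ("instance", "?W", "Water"),
    ("instance", "?L", "Liquid"),
    ("Attribute", "?L", "?W"),
]

for triplet in to_triplets(sumo_conjuncts):
    print(triplet)
# ('instance', 0, 'Water')
# ('instance', 1, 'Liquid')
# ('Attribute', 1, 0)
```

The existential quantifier is dropped in this sketch: every numbered variable is implicitly existentially bound, which matches the flat triplet list on the slide.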

  12. Mapping to SUMO
      • Subsumption, equivalence, instance:
        • tea (drink): (+,, Tea)
        • tea (shrub): (+,, FloweringPlant)
        • date (fruit): (=,, Datefruit)
        • Marrakech: (instance,, City)
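
The three mapping relations on this slide can be read mechanically from the notation. The sketch below assumes that `+` stands for subsumption and `=` for equivalence, following the order of the slide's heading and its examples; the function name `parse_mapping` is hypothetical, and the empty middle slot is accepted as written without being interpreted.

```python
import re

# Symbol-to-relation reading assumed from the slide's examples.
RELATIONS = {"+": "subsumption", "=": "equivalence", "instance": "instance"}

# Matches "(symbol,, Term)" with an empty middle slot, as on the slide.
MAPPING = re.compile(r"^\(\s*([^,\s]+)\s*,\s*,\s*([^)\s]+)\s*\)$")


def parse_mapping(text):
    """Return (relation name, SUMO term) for a mapping such as '(+,, Tea)'."""
    match = MAPPING.match(text.strip())
    if match is None:
        raise ValueError(f"not a mapping expression: {text!r}")
    symbol, term = match.groups()
    if symbol not in RELATIONS:
        raise ValueError(f"unknown relation symbol: {symbol!r}")
    return RELATIONS[symbol], term


examples = {
    "tea (drink)": "(+,, Tea)",
    "tea (shrub)": "(+,, FloweringPlant)",
    "date (fruit)": "(=,, Datefruit)",
    "Marrakech": "(instance,, City)",
}

for synset, mapping in examples.items():
    print(synset, "->", parse_mapping(mapping))
# tea (drink) -> ('subsumption', 'Tea')
# tea (shrub) -> ('subsumption', 'FloweringPlant')
# date (fruit) -> ('equivalence', 'Datefruit')
# Marrakech -> ('instance', 'City')
```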

  13. Ontology mapping: female/male variants
      • Teacher (a person whose occupation is teaching); SUMO: equivalent to Teacher
      • In Dutch: no neutral form
        • leraar (male teacher): (+,, Teacher), (instance,, Man)
        • lerares (female teacher): (+,, Teacher), (instance,, Woman)

  14. Synsets versus Ontology Types
      • Many synsets are lexicalizations that can name instances of the same SUMO Type in different contexts:
        • water used for a purpose (dishwater)
        • water occurring somewhere or originating from somewhere (tap water)
        • water being the result of a process (meltwater)
      • Such synsets do not warrant the introduction of new Types in the ontology

  15. Complex ontology mapping
      • theewater (for making tea)
      • (exists (?A ?W)
          (and (instance ?W Water)
               (hasPurposeForAgent ?W
                 (exists (?T)
                   (and (instance ?T Tea)
                        (part ?W ?T))))))
      • Simplified representation as a list of triplets:
        (instance, 0, Water) (instance, 1, Tea) (instance, 2, Making) (component, 0, 1) (resource, 0, 2) (result, 1, 2)
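
To see how the simplified triplet list for theewater can be read, the sketch below separates the `instance` triplets, which type each numbered variable, from the remaining triplets, which relate variables to each other. The function names and the regular expression are assumptions for illustration; only the triplet string itself comes from the slide.

```python
import re

# One "(relation, arg, arg)" group; whitespace around commas is optional.
TRIPLET = re.compile(r"\(\s*(\w+)\s*,\s*(\w+)\s*,\s*(\w+)\s*\)")


def parse_triplets(text):
    """Split a triplet list into variable types and inter-variable relations."""
    types = {}       # variable number -> SUMO type
    relations = []   # (relation, variable number, variable number)
    for relation, first, second in TRIPLET.findall(text):
        if relation == "instance":
            types[int(first)] = second
        else:
            relations.append((relation, int(first), int(second)))
    return types, relations


def undeclared(types, relations):
    """Return relations that use a variable without an instance triplet."""
    return [r for r in relations if r[1] not in types or r[2] not in types]


def readable(types, relations):
    """Render each relation with variable numbers replaced by their types."""
    return [f"{rel}({types[a]}, {types[b]})" for rel, a, b in relations]


theewater = ("(instance, 0, Water) (instance, 1, Tea) (instance, 2, Making) "
             "(component, 0, 1) (resource, 0,2) (result,1, 2)")

types, relations = parse_triplets(theewater)
print(types)                        # {0: 'Water', 1: 'Tea', 2: 'Making'}
print(undeclared(types, relations))  # []
print(readable(types, relations))
# ['component(Water, Tea)', 'resource(Water, Making)', 'result(Tea, Making)']
```

The `undeclared` check is a small example of the kind of consistency test a flat triplet format makes easy; whether the Cornetto editing tools perform such a check is not stated in the slides.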

  16. Some more triplets for water
      • kwelwater (groundwater coming to the surface by the pressure of water, especially occurring close to a dike):
        (instance, 0, GroundWater), (instance, 1, StationaryArtifact (=Dike)), (instance, 2, StreamWaterArea), (instance, 3, MotionUpward)

  17. But what to do with…
      • Grondwater (groundwater)
        SUMO term: GroundWater ("Groundwater is the subclass of Water that is found in deposits in the earth.")
      • But is groundwater a subclass of Water, or is it an instance of water with a certain place, usage or origin?
        'The groundwater got polluted.'
        'They used groundwater for crop irrigation.'

  18. The end
