
Development and Alignment of a Domain-Specific Ontology for Question Answering

Shiyan Ou (1), Viktor Pekar (1), Constantin Orasan (1), Christian Spurk (2), Matteo Negri (3). (1) Research Group in Computational Linguistics, University of Wolverhampton, UK


Presentation Transcript


  1. Development and Alignment of a Domain-Specific Ontology for Question Answering • Shiyan Ou (1), Viktor Pekar (1), Constantin Orasan (1), Christian Spurk (2), Matteo Negri (3) • (1) Research Group in Computational Linguistics, University of Wolverhampton, UK • (2) German Research Centre for Artificial Intelligence GmbH (DFKI), Germany • (3) Fondazione Bruno Kessler (FBK), Italy

  2. Structure • Introduction to QALL-ME • The QALL-ME ontology • Alignment to WordNet and SUMO • How the ontology is used for data encoding • Conclusions

  3. Introduction to QALL-ME • QALL-ME (Question Answering Learning technologies Multilingual Multimodal Environment) is an EU-funded project which aims to establish a shared infrastructure for multilingual and multimodal question answering in the tourism domain. • Project’s website: http://qallme.fbk.eu/ • In the QALL-ME system: • users pose natural language questions in several languages (in both text and speech) using a variety of input devices (e.g. mobile phones), and • the system returns a list of specific answers formatted in the most appropriate modality, ranging from short texts to maps, videos, and pictures. • A domain-specific ontology for the tourism domain was developed and shared among all the partners.

  4. The ontology in the project (diagram) • Diagram labels: QALL-ME ontology; WP3: Multilingual question interpretation; WP4; Annotation of entities; Indexing of data; Retrieval of data; WP5: Multilingual answer extraction; WP9: Evaluation • See more in O39: Multilingual Resources (Ambasadeurs) at 13:05

  5. Design of the ontology • Analysis of data from content providers • Analysis of users’ requirements • Inspired by similar ontologies such as Harmonise, eTourism, Hi-Touch, TAGA and GETESS: • Harmonise and eTourism focus on static information (e.g. accommodation and events/activities) rather than on dynamic information related to the travel business (e.g. customers and itineraries), as the TAGA and Hi-Touch ontologies do. • Similar to eTourism in that it is written in OWL rather than RDFS • but with wider coverage than any individual existing ontology • Introspection

  6. Technical details of the ontology • Encoded using OWL DL, since it has more expressive power than OWL Lite and has more efficient reasoning support than OWL Full • Used Protégé-OWL as the editor and RacerPro7 as the reasoner • The ontology contains • 122 classes (concepts), • 55 datatype properties and • 52 object properties which indicate the relationships among the 122 classes. • 15 top-level classes. • The class hierarchy has a maximum depth of 4.

  7. Part of the ontology (cinema/movies)

  8. Ontology alignment • The QALL-ME ontology was designed as a model of the narrow knowledge domain of tourism. • The QALL-ME ontology was complemented with information from WordNet (and implicitly MultiWordNet) and SUMO via alignment • The QALL-ME ontology is still changing, so fully manual alignment was not a viable solution • Fully automatic alignment is not precise enough, but semi-automatic alignment may be a solution

  9. Ontology alignment (II) • The alignment relied on: • String similarity of element identifiers (e.g. chalet → chalet_1, SiteFacilityForChildren → facility_*) • Structural similarity for disambiguation (e.g. using the semantic distance to already aligned concepts) • Definition similarity for disambiguation (the similarity between comments in the ontology and WordNet glosses is used) • For unmatched concepts, structural similarity is calculated against all the nouns in WordNet
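The string-similarity step above can be sketched roughly as follows. This is a minimal illustration, not the project's algorithm: the `TOY_LEMMAS` set is a hypothetical stand-in for WordNet's noun lemmas, and the function names are invented for this example. It splits a CamelCase identifier into tokens and looks for either the whole compound or its individual tokens among the known lemmas.

```python
import re

# Hypothetical stand-in for WordNet noun lemmas (an assumption for this sketch);
# a real run would look the candidates up in WordNet itself.
TOY_LEMMAS = {"chalet", "facility", "site", "child", "cinema", "post_office"}


def split_identifier(identifier):
    """Split a CamelCase identifier into lowercase tokens."""
    return [t.lower() for t in re.findall(r"[A-Z][a-z]*|[a-z]+", identifier)]


def candidate_lemmas(identifier, lemmas=TOY_LEMMAS):
    """Return lemmas whose spelling matches the identifier or one of its tokens."""
    tokens = split_identifier(identifier)
    whole = "_".join(tokens)
    if whole in lemmas:
        # The full compound exists as a lemma (e.g. PostOffice -> post_office).
        return [whole]
    # Otherwise fall back to the individual tokens that are known lemmas.
    return [t for t in tokens if t in lemmas]


print(candidate_lemmas("Chalet"))
print(candidate_lemmas("PostOffice"))
print(candidate_lemmas("SiteFacilityForChildren"))
```

An identifier that yields several candidates, or a lemma with several senses, still has to be disambiguated, which is where the structural and definition similarity described on the slide come in.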

  10. Ontology alignment (III) • The overall accuracy of the fully automatic alignment is clearly suboptimal (precision of 49% and recall of 31%) • Error analysis: • For concept names with unambiguous matches in WordNet, the algorithm makes no errors • The poor disambiguation performance is due to the very different depths of the two ontologies • Only a few concepts have comments which are useful for definition similarity • Semi-automatic alignment requires under 30 minutes to obtain a “perfect” alignment
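For readers unfamiliar with the two measures, the sketch below shows one common way to compute precision and recall for an alignment, treating it as a set of (concept, synset) pairs compared with a gold standard. The slides do not say how the figures were computed, so this is an assumption; the toy pairs are hypothetical apart from `chalet_1`. The last line derives the F1 score implied by the reported 49% and 31%, a value that is not given in the presentation.

```python
def precision_recall(proposed, gold):
    """Precision and recall of a proposed alignment against a gold standard."""
    correct = len(proposed & gold)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical toy data: four gold pairs, two proposed pairs, one of them correct.
gold = {("Chalet", "chalet_1"), ("Cinema", "cinema_1"),
        ("Hotel", "hotel_1"), ("PostOffice", "post_office_1")}
proposed = {("Chalet", "chalet_1"), ("Cinema", "cinema_2")}

p, r = precision_recall(proposed, gold)
print(p, r)

# F1 derived from the figures reported on the slide (49% precision, 31% recall).
reported_f1 = f1(0.49, 0.31)
print(round(reported_f1, 2))
```

With the reported figures the derived F1 is roughly 0.38, which is consistent with the slide's assessment that the fully automatic alignment is suboptimal.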

  11. Example of alignment (columns: QALL-ME | SUMO | WN2.1 | WN2.1 gloss)
      • Accommodation | @inhabits | =02647858 | living quarters provided for public convenience; "overnight accommodations are available"
      • Chalet | @Building | =02973228 | a Swiss house with a sloping roof and wide eaves or a house built in this style
      • PostOffice | @Organization | =08034771 | an independent agency of the federal government responsible for mail delivery

  12. Semantic annotation and database organization • The ontology was used to encode the data • Annotated data from the content providers was converted to RDF triples • The RDF documents can be stored in databases or plain text files • The Jena RDF API was used for these operations
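To make the conversion to RDF triples concrete, here is a minimal sketch that serialises one annotated record as N-Triples lines. It is plain Python rather than the Jena RDF API the project used, and the `http://example.org/data/` instance IRIs, the record contents, and the `record_to_ntriples` helper are all hypothetical; only the ontology namespace and the `Movie`, `hasStar` and `name` terms come from the slides.

```python
QME = "http://qallme.itc.it/ontology/qallme-tourism.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def record_to_ntriples(subject_iri, rdf_class, object_links, literals):
    """Serialise one annotated record as a list of N-Triples lines."""
    lines = [f"<{subject_iri}> <{RDF_TYPE}> <{QME}{rdf_class}> ."]
    # Object properties link the record to other resources.
    for prop, target_iri in object_links.items():
        lines.append(f"<{subject_iri}> <{QME}{prop}> <{target_iri}> .")
    # Datatype properties carry string values; escape backslashes and quotes.
    for prop, value in literals.items():
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_iri}> <{QME}{prop}> "{escaped}"^^<{XSD_STRING}> .')
    return lines


# Hypothetical instance namespace and record, for illustration only.
BASE = "http://example.org/data/"
movie_triples = record_to_ntriples(
    BASE + "movie1",
    "Movie",
    {"hasStar": BASE + "star1"},
    {"name": "Example Film"},
)
for line in movie_triples:
    print(line)
```

Each output line is one subject, predicate, object statement, which is why the resulting documents can be kept either in a triple store or in plain text files.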

  13. Semantic annotation and database organization (diagram) • Diagram elements: QALL-ME Ontology, XML Schema, XML Documents, RDF Documents, World Wide Web, Database, HTML Parser • Arrow labels: Define, Determine, Transform, Convert, Download

  14. Content retrieval • SPARQL is used for retrieval • SPARQL is a query language for accessing RDF graphs, specified by the W3C RDF Data Access Working Group • SPARQL provides interoperability between languages

  15. What movie starring Halle Berry is on in Birmingham?
      Class: MovieShow
        → Property: isInSite, Range: Cinema
          → Property: hasPostalAddress, Range: PostalAddress
            → Property: isInDestination, Range: Destination
              → Property: name, Range: string <Birmingham>
        → Property: hasEventContent, Range: Movie
          → Property: name, Range: string <unknown>
          → Property: hasStar, Range: Star
            → Property: name, Range: string <Halle Berry>

  16. PREFIX qme: <http://qallme.itc.it/ontology/qallme-tourism.owl#>
      PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
      SELECT ?movieName
      WHERE {
        ?MovieShow qme:isInSite ?Cinema .
        ?Cinema qme:hasPostalAddress ?PostalAddress .
        ?PostalAddress qme:isInDestination ?Destination .
        ?Destination qme:name "Birmingham"^^xsd:string .
        ?MovieShow qme:hasEventContent ?Movie .
        ?Movie qme:name ?movieName .
        ?Movie qme:hasStar ?Star .
        ?Star qme:name "Halle Berry"^^xsd:string .
      }
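The query is a conjunction of triple patterns joined through shared variables. The sketch below evaluates the same patterns over a handful of toy triples to show how that join works. It is a simplified illustration, not a SPARQL engine: prefixes and datatypes are omitted, and the instance identifiers and film titles are hypothetical placeholders.

```python
def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable binds on first use and must agree afterwards.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)


# Hypothetical toy data: two shows in one Birmingham cinema.
TRIPLES = [
    ("show1", "isInSite", "cinema1"),
    ("show2", "isInSite", "cinema1"),
    ("cinema1", "hasPostalAddress", "addr1"),
    ("addr1", "isInDestination", "dest1"),
    ("dest1", "name", "Birmingham"),
    ("show1", "hasEventContent", "movie1"),
    ("show2", "hasEventContent", "movie2"),
    ("movie1", "name", "Example Film A"),
    ("movie2", "name", "Example Film B"),
    ("movie1", "hasStar", "star1"),
    ("movie2", "hasStar", "star2"),
    ("star1", "name", "Halle Berry"),
    ("star2", "name", "Another Actor"),
]

# The same eight patterns as the SPARQL query, without prefixes or datatypes.
QUERY = [
    ("?MovieShow", "isInSite", "?Cinema"),
    ("?Cinema", "hasPostalAddress", "?PostalAddress"),
    ("?PostalAddress", "isInDestination", "?Destination"),
    ("?Destination", "name", "Birmingham"),
    ("?MovieShow", "hasEventContent", "?Movie"),
    ("?Movie", "name", "?movieName"),
    ("?Movie", "hasStar", "?Star"),
    ("?Star", "name", "Halle Berry"),
]

movie_names = sorted({b["?movieName"] for b in match(QUERY, TRIPLES)})
print(movie_names)
```

Only the show whose movie has a star named "Halle Berry" survives all eight patterns; the second show is dropped at the final pattern even though it matches the location constraints.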

  17. Conclusions • The QALL-ME ontology was specifically designed for the domain of tourism • The ontology plays an important role in several parts of the project • The ontology went through several revisions before reaching its current stage (and it may change again!) • Both the ontology and its alignment to WordNet and SUMO will be made freely available on the project’s website

  18. Thank you! Project’s website: http://qallme.fbk.eu/
