Question Answering over Linked Data

QALD is an open challenge for question answering systems mediating between users and semantic data, aiming to evaluate and compare participating systems. The evaluation measures include Recall, Precision, and F-Measures for each question.

ezekield

Presentation Transcript


  1. Question Answering over Linked Data

  2. Outline: Question Answering over Linked Data (QALD); Case Studies; Challenges; Trends

  3. QALD

  4. What is QALD? QALD is a series of evaluation campaigns on multilingual question answering over linked data. This open challenge is aimed at all kinds of question answering systems that mediate between a user, expressing his or her information need in natural language, and semantic data. The goal is to evaluate and compare participating systems.

  5. Why QALD: Motivations. The web of data continues to grow. How can end users profit from the expressive power of Linked Data while its complexity stays hidden behind an intuitive, easy-to-use interface?

  6. Why QALD: Motivations. Querying structured data involves a trade-off between expressivity and usability. Ideally, a query mechanism for linked data provides both high expressivity and high usability.

  7. QALD-1, ESWC (2011). Datasets: DBpedia 3.6 (RDF), MusicBrainz (RDF). Tasks: 50 training questions and 50 test questions for each dataset.

  8. QALD-1

  9. QALD-1. [1] The answers will be either a literal (a boolean, date, number, or string) or a list of resources, for which both the URI and an English label or name (if it exists) are specified. [2] You are free to specify either a SPARQL query or the answers (or both), depending on which of them your system returns. [3] You are also allowed to change the query (insert quotes, reformulate, extract and use only keywords, use some structured input format, and the like). [4] You are free to use all resources (e.g. WordNet, GeoNames, dictionary tools, and so on). [5] All options count as correct answers, as long as the answer list contains all and only the correct resources.

  10. Evaluation. Test collection: questions, datasets, answers (gold standard). Evaluation measures: recall, precision, F-measure.

  11. QALD-1 Evaluation. The evaluation tool computes precision, recall and F-measure for every question. The tool then also computes the overall precision, recall and F-measure along standard definitions.
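
The slide's formulas did not survive in this transcript. As a minimal sketch, assuming the usual set-based definitions (precision is the share of returned answers that are correct, recall is the share of gold-standard answers that were returned, and F-measure is their harmonic mean), the per-question scores could be computed as follows. The function name and the example resources are hypothetical and are not taken from the QALD evaluation tool.

```python
def question_scores(system_answers, gold_answers):
    """Return (precision, recall, f_measure) for a single question.

    Assumes set-based definitions; this is not the official QALD tool.
    """
    system, gold = set(system_answers), set(gold_answers)
    correct = len(system & gold)  # answers that are both returned and in the gold standard
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: 2 of 4 returned resources are among the 3 gold answers.
gold = {"dbr:Berlin", "dbr:Hamburg", "dbr:Munich"}
system = {"dbr:Berlin", "dbr:Hamburg", "dbr:Cologne", "dbr:Bonn"}
p, r, f = question_scores(system, gold)
print(p, r, f)
```

In this example precision is 2/4 and recall is 2/3, so a system that returns extra resources is penalised on precision even when most gold answers are found.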

  12. QALD-2, ESWC (2012). Datasets: DBpedia 3.7 (RDF), MusicBrainz (RDF). Tasks: 100 training questions and 50 test questions for each dataset.

  13. QALD-2

  14. QALD-2. [1] answertype: resource / string / number / date / boolean. [2] aggregation: whether any operations beyond triple pattern matching are required to answer the question (e.g., counting, filters, ordering, etc.). [3] onlydbo: given only for DBpedia questions; reports whether the query relies solely on concepts from the DBpedia ontology. [4] You are also allowed to change the query (insert quotes, reformulate, extract and use only keywords, use some structured input format, and the like).

  15. QALD-2 As an additional challenge, a few of the training and test questions are out of scope

  16. QALD-2 Evaluation. The evaluation tool computes precision, recall and F-measure for every question. The tool then also computes the overall precision and recall by taking the mean of all single precision and recall values, as well as the overall F-measure.
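
The averaging described on this slide can be sketched as below. The slide does not say how the overall F-measure is derived; the sketch assumes it is the harmonic mean of the averaged precision and recall, a reading that is at least consistent with the TBSL figures reported later in the deck (0.61, 0.63, 0.62). The function names and per-question values are hypothetical.

```python
from statistics import mean


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def overall_scores(per_question):
    """per_question: list of (precision, recall) pairs, one per question.

    Overall precision and recall are the means of the per-question values.
    Assumption: the overall F-measure is computed from those two means.
    """
    overall_p = mean(p for p, _ in per_question)
    overall_r = mean(r for _, r in per_question)
    return overall_p, overall_r, f_measure(overall_p, overall_r)


# Hypothetical per-question results for four questions.
results = [(1.0, 1.0), (0.5, 1.0), (0.0, 0.0), (1.0, 0.5)]
overall = overall_scores(results)
print(overall)

# Consistency check against figures quoted later in the slides.
print(round(f_measure(0.61, 0.63), 2))
```

Averaging per question gives every question the same weight, regardless of how many answers it has.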

  17. QALD-3, CLEF (2013). Datasets: DBpedia 3.8 (RDF) 100/100; Spanish DBpedia 100/100; MusicBrainz (RDF) 100/50. Tasks: Multilingual QA: the goal is for users from all countries to have access to the same information. Given an RDF dataset and a natural language question or set of keywords in one of six languages (English, Spanish, German, Italian, French, Dutch), either return the correct answers or a SPARQL query that retrieves these answers. Ontology lexicalization: aimed at all methods that (semi-)automatically create lexicalizations for ontology concepts. [CLEF: Conference and Labs of the Evaluation Forum]

  18. QALD-3

  19. QALD-4, CLEF (2014). Task 1: multilingual QA over DBpedia [200, 50]. Task 2: biomedical QA over interlinked data. Task 3: hybrid approaches using information from both structured and unstructured data. Datasets: DBpedia 3.9 (RDF); SIDER, Diseasome, DrugBank (RDF).

  20. Task 2 [25, 25]. The focus of the task is on interlinked data: information is distributed among a large collection of interconnected datasets, and answers to questions can often only be provided if information from several sources is combined.

  21. Task 3 [25, 10]. The focus of the task is on hybrid QA, i.e. the integration of both structured data (RDF) and unstructured data (free text available in the DBpedia abstracts). A lot of information is still available only in textual form, both on the web and in the form of labels and abstracts in linked data sources. Therefore approaches are needed that can not only deal with the specific character of structured data but also find information in several sources, process both structured and unstructured information, and combine the gathered information into one answer.

  22. Task 3

  23. Task 3. The pseudo queries cannot be evaluated against the SPARQL endpoint. Therefore, when submitting results, provide answers.

  24. Summary: QALD. QALD is a series of evaluation campaigns on question answering over linked data: QALD-1 (ESWC 2011, workshop), QALD-2 (ESWC 2012, as part of a workshop), QALD-3 (CLEF 2013), QALD-4 (CLEF 2014). It is aimed at all kinds of systems that mediate between a user, expressing his or her information need in natural language, and semantic data. Trends: two shifts from single to multiple (languages and datasets); RDF + text.

  25. Case studies: The gap between natural language and Linked Data

  26. Case studies. AquaLog & PowerAqua (Vanessa Lopez et al., 2006, 2012): Querying the Semantic Web [KMi]. TBSL (Christina Unger et al., 2012): Template-based question answering [CITEC]. Treo (Andre Freitas et al., 2011, 2014): Schema-agnostic querying using distributional semantics [DERI]. gAnswer (Lei Zou et al., 2014): QA over RDF: A Graph Data Driven Approach [PKU]. [Knowledge Media Institute (KMi), The Open University, United Kingdom] [The Center of Excellence Cognitive Interaction Technology (CITEC), Bielefeld University, Germany] [Digital Enterprise Research Institute (DERI), National University of Ireland, Galway] [Institute of Computer Science and Technology, Peking University, China]

  27. AquaLog & PowerAqua (Vanessa Lopez et al.): Querying the Semantic Web

  28. What is PowerAqua? An ontology-based question answering system that is able to answer queries by locating and integrating information that can be distributed across heterogeneous semantic resources. PowerAqua supports query disambiguation, knowledge fusion (to aggregate similar or partial answers), and ranking mechanisms to identify the most accurate answers to queries. PowerAqua accepts users' queries expressed in natural language and retrieves precise answers by dynamically selecting and combining information massively distributed across highly heterogeneous semantic resources.

  29. PowerAqua: The architecture

  30. Main drawback. [1] Its main weakness is that, due to limitations in GATE, it cannot cope with aggregation, comparisons, or superlatives. [2] Negations, comparatives, superlatives, existentials, and queries involving circumstantial (why) or temporal reasoning (last week, in the 80's, between the years 94 and 95) are currently out of the scope of the linguistic component.

  31. PowerAqua: Summary. PowerAqua's main strength is that it locates and integrates information from different, heterogeneous semantic resources, relying on query disambiguation, ranking and fusion of answers. Key contributions: pioneering work on QA over Semantic Web data; semantic similarity mapping. Terminological matching: WordNet-based, ontology-based, string similarity, sense-based similarity matcher. Evaluation: QALD (2011) recall: 0.48, precision: 0.52, F-measure: 0.5.

  32. TBSL (Christina Unger et al., 2012): Template-based question answering

  33. What is TBSL? A prototype for an approach that combines an analysis of the semantic structure with a mapping of words to URIs. Two-step approach: [1] Template generation: parse the question to produce a SPARQL template that directly mirrors the structure of the question, including filters and aggregation operations. [2] Template instantiation: instantiate the SPARQL template by matching natural language expressions with ontology concepts using statistical entity identification and predicate detection.

  34. Example: Who produced the most films? SPARQL template: select distinct ?x where { ?y rdf:type ?c . ?y ?p ?x . } order by desc(count(?y)) limit 1 offset 0. Here ?c stands proxy for the URI of a class matching the input keyword "films", and ?p stands proxy for a property matching the input keyword "produced". Instantiation by a matching class and a matching property: ?c = <http://dbpedia.org/ontology/Film>, ?p = <http://dbpedia.org/ontology/producer>
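
The instantiation step in this example amounts to filling the template's slots with URIs. A minimal sketch follows, assuming slots are plain SPARQL variables replaced by textual substitution; the function name `instantiate` is hypothetical and this is not TBSL's actual implementation. The sketch only builds the query string and does not send it to an endpoint.

```python
import re

# Template from the slide, with ?c and ?p as the slots to be filled.
TEMPLATE = (
    "select distinct ?x where { ?y rdf:type ?c . ?y ?p ?x . } "
    "order by desc(count(?y)) limit 1 offset 0"
)


def instantiate(template, bindings):
    """Replace each slot variable that has a binding with its URI in angle brackets.

    Variables without a binding (here ?x and ?y) are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        if name in bindings:
            return f"<{bindings[name]}>"
        return match.group(0)

    # \w+ consumes the whole variable name, so ?c never matches inside a longer name.
    return re.sub(r"\?(\w+)", substitute, template)


bindings = {
    "c": "http://dbpedia.org/ontology/Film",
    "p": "http://dbpedia.org/ontology/producer",
}
query = instantiate(TEMPLATE, bindings)
print(query)
```

Because several classes and properties may match a keyword, one template generally yields several candidate queries, which is why a ranking step is needed afterwards.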

  35. TBSL: Overview. [1] The input question is first processed by a POS tagger. [2] Lexical entries: pre-defined, domain-independent lexical entries lead to a semantic representation of the natural language query, which is converted into a SPARQL query template with slots that need to be filled with URIs. [3] Entity identification: obtain URIs. [4] Entity and query ranking: filling the slots yields a range of different query candidates as potential translations of the input question; these candidates are then ranked.

  36. TBSL: Overview

  37. Main drawback. The created template structure does not always coincide with how the data is actually modelled. Considering all possibilities of how the data could be modelled leads to a large number of templates (and even more queries) for one question.

  38. Evaluation: QALD. [1] Of the 50 training questions provided by the QALD benchmark, 11 rely on namespaces that were not incorporated for predicate detection (FOAF and YAGO); these questions were not considered. [Predicate detection] [2] Of the remaining 39 questions (50 - 11), 5 cannot be parsed due to unknown syntactic constructions or uncovered domain-independent expressions. [Depends on the POS tagger] [3] Of the remaining 34 questions (39 - 5), 19 are answered exactly as required by the benchmark (i.e. with precision and recall 1.0) and another two are answered almost correctly (with precision and recall > 0.8). [4] Over all precision scores: 0.61; over all recall scores: 0.63; F-measure: 0.62.

  39. Evaluation: QALD

  40. TBSL: Summary. Key contributions: [1] The main contribution is a domain-independent question answering approach that first converts natural language questions into queries that faithfully capture the semantic structure of the question and then identifies domain-specific entities by combining NLP methods and statistical information.

  41. Treo (Andre Freitas et al., 2011, 2014) Schema-agnostic querying using distributional semantics

  42. Treo: Motivation. How can we provide a query mechanism that allows users to expressively query linked datasets without a previous understanding of the vocabularies behind the data? Linked Data brings the vision of exposing and interlinking datasets on the Web by using Semantic Web standards. Consuming Linked Data today can be challenging: users may need to query or search over potentially thousands of highly heterogeneous datasets.

  43. The critical problem. The structure and terms used in users' queries typically differ from the representation of the information in the datasets. To address this problem, a query/search mechanism needs a robust semantic matching approach.

  44. What is Treo? Treo is a natural language based semantic search engine for Linked Data that focuses on the semantic matching between user queries and linked datasets. The main goal behind Treo is to abstract data consumers from the representation of the datasets, allowing expressive natural language queries over linked datasets. Treo's query processing approach combines entity search, spreading activation search, and distributional semantic relatedness as the key elements to address the semantic matching problem.

  45. Treo's query processing. Three major steps: [1] Entity Search and Pivot Entity Determination. [2] Query Syntactic Analysis. [3] Semantic Matching (Spreading Activation using Distributional Semantic Relatedness).

  46. [1] Entity Search and Pivot Entity Determination. This step determines the key entities in the user query (what is the query about?) and maps them to entities in the datasets. The mapping from the natural language terms representing the entities to the URIs representing these entities in the datasets is done through an entity search step. The URIs define the pivot entities in the datasets, which are the entry points for the semantic search process.

  47. [2] Query Syntactic Analysis. This step transforms natural language queries into triple patterns. The user's natural language query is pre-processed into a partial ordered dependency structure (PODS), a format that is closer to the triple-like (subject, predicate, object) structure of RDF. The construction of the PODS requires the previous entity recognition step. The PODS is built by taking into account the dependency structure of the query, the position of the key entity, and a set of transformation rules.

  48. [3] Semantic Matching (Spreading Activation using Distributional Semantic Relatedness). Taking as inputs the pivot entity URIs and the PODS query representation, the semantic matching process starts by fetching all the relations associated with the top-ranked pivot entity. Starting from the pivot entity, the labels of each relation associated with the pivot node have their semantic relatedness measured against the next term in the PODS representation of the query. The query processing approach returns a set of triple paths, each of which is a connected set of triples defined by the spreading activation search path, starting from the pivot entities over the RDF graph.
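
The matching loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the graph, the query terms, the hand-picked relatedness scores that stand in for a real distributional relatedness measure, and the fixed threshold. It is not Treo's implementation.

```python
# Toy RDF-like graph (hypothetical data): subject -> list of (predicate label, object).
GRAPH = {
    "Barack_Obama": [("spouse", "Michelle_Obama"), ("birth place", "Honolulu")],
    "Michelle_Obama": [("alma mater", "Princeton_University"), ("birth place", "Chicago")],
}

# Stand-in for a distributional relatedness measure: hand-picked scores in [0, 1].
RELATEDNESS = {
    ("wife", "spouse"): 0.9,
    ("wife", "birth place"): 0.1,
    ("graduate", "alma mater"): 0.8,
    ("graduate", "birth place"): 0.1,
}


def relatedness(term, label):
    """Look up the toy relatedness score; unknown pairs count as unrelated."""
    return RELATEDNESS.get((term, label), 0.0)


def spread(pivot, terms, threshold=0.5):
    """Starting at the pivot entity, match each query term in order against the
    labels of the relations leaving the current nodes, and follow the relations
    whose relatedness reaches the (assumed) threshold.

    Returns a list of triple paths; each path is a list of (s, p, o) triples.
    """
    frontier = [(pivot, [])]  # (current node, path that led to it)
    for term in terms:
        next_frontier = []
        for node, path in frontier:
            for label, obj in GRAPH.get(node, []):
                if relatedness(term, label) >= threshold:
                    next_frontier.append((obj, path + [(node, label, obj)]))
        frontier = next_frontier
    return [path for _, path in frontier]


# Terms as they might appear after the pivot in a PODS for a question such as
# "From which university did the wife of Barack Obama graduate?" (hypothetical).
paths = spread("Barack_Obama", ["wife", "graduate"])
print(paths)
```

The point of the sketch is that the query terms ("wife", "graduate") never have to match the dataset vocabulary ("spouse", "alma mater") literally; only their relatedness scores decide which relations are followed.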

  49. T-Space. In order to define a scalable solution, a Vector Space Model was proposed based on the Treo principles. The construction of a semantic space based on the principles behind Treo defines a search/index generalization that can be applied to different problem spaces where data is represented as labeled data graphs, including graph databases and semantic-level representations of unstructured text.
