Natural Language Questions for the Web of Data. Mohamed Yahya (1), Klaus Berberich (1), Shady Elbassuoni (2), Maya Ramanath (3), Volker Tresp (4), Gerhard Weikum (1). (1) Max Planck Institute for Informatics, Germany; (2) Qatar Computing Research Institute; (3) Dept. of CSE, IIT-Delhi, India; (4) Siemens AG, Corporate Technology, Munich, Germany. EMNLP 2012
Example question • “Which female actor played in Casablanca and is married to a writer who was born in Rome?” • Translation to SPARQL: • ?x hasGender female • ?x isa actor • ?x actedIn Casablanca_(film) • ?x marriedTo ?w • ?w isa writer • ?w bornIn Rome • Characteristics of such SPARQL queries: • complex to formulate • they yield good results • difficult for the user to write by hand • What the authors want: to automatically create such structured queries by mapping the user’s question into this representation (a full query string is sketched below).
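As a rough illustration only (the SELECT clause and the choice of ?x as the projection variable are our assumptions, not taken from the slides), the triple patterns above can be assembled into a complete SPARQL 1.0 query string:

```python
# Hypothetical sketch: wrap the triple patterns from the slide into a full
# SPARQL 1.0 query string. Projecting on ?x is an assumption.
triple_patterns = [
    "?x hasGender female",
    "?x isa actor",
    "?x actedIn Casablanca_(film)",
    "?x marriedTo ?w",
    "?w isa writer",
    "?w bornIn Rome",
]

query = "SELECT ?x WHERE {\n" + " .\n".join("  " + tp for tp in triple_patterns) + " .\n}"
print(query)
```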
Translate qNL to qFL • qNL → qFL • qNL: natural language question • qFL: formal language query (SPARQL 1.0)
Yago2 • Yago2 is a huge semantic knowledge base derived from Wikipedia, WordNet, and GeoNames.
sample facts from Yago2 • Examples of relations: • type, subclassOf, and actedIn • Examples of classes: • person and film • Examples of entities: • Entities are represented in canonical form, such as ‘Ingrid_Bergman’ and ‘Casablanca_(film)’ • Special types of entities: strings, numbers, and dates
DEANNA • DEANNA (DEep Answers for maNy Naturally Asked questions)
question sentence • qNL = (t0, t1, ..., tn), a sequence of tokens • A phrase is a contiguous subsequence (ti, ti+1, ..., ti+l) ⊆ qNL with 0 ≤ i ≤ i + l ≤ n • Phrases focus on entities, classes, and relations
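A minimal sketch (our own illustration, not the paper's code) of enumerating candidate phrases as contiguous token spans; the length cap max_len is an assumed parameter:

```python
# Minimal sketch: enumerate all contiguous token spans of a tokenized
# question as candidate phrases. The span-length limit is our assumption.
def candidate_phrases(tokens, max_len=4):
    phrases = []
    for i in range(len(tokens)):
        for l in range(min(max_len, len(tokens) - i)):
            phrases.append(tuple(tokens[i:i + l + 1]))
    return phrases

q_nl = "Which female actor played in Casablanca".split()
print(candidate_phrases(q_nl)[:5])
```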
Phrase detection Phrases are detected that potentially correspond to semantic items such as ‘Who’, ‘played in’, ‘movie’ and ‘Casablanca’.
Phrase detection • A detected phrase p is a pair <Toks, l> • Toks: the token sequence of the phrase • l: a label with l ∈ {concept, relation} • Pr: the set of all detected relation phrases • Pc: the set of all detected concept phrases
concept detection • works against a phrase-concept dictionary • the phrase-concept dictionary consists of the instances of the means relation in Yago2
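A minimal sketch of concept detection as dictionary lookup; the tiny phrase-concept dictionary below is invented for illustration (in DEANNA it comes from the means relation of Yago2):

```python
# Minimal sketch (our assumption, not DEANNA's code): concept detection as a
# lookup of candidate spans in a phrase-concept dictionary.
phrase_concept_dict = {
    "casablanca": ["Casablanca_(film)", "Casablanca,_Morocco"],
    "movie": ["wordnet_movie"],
}

def detect_concept_phrases(phrases):
    """Keep the spans that have at least one dictionary entry."""
    detected = []
    for toks in phrases:
        key = " ".join(toks).lower()
        if key in phrase_concept_dict:
            detected.append((toks, "concept"))
    return detected

print(detect_concept_phrases([("Casablanca",), ("played", "in")]))
```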
relation detection • relies on a relation detector based on ReVerb (Fader et al., 2011) extended with additional POS tag patterns, plus the authors’ own detector that looks for patterns in dependency parses
Phrase Mapping • each phrase is mapped to a set of candidate semantic items • To map concept phrases: • again relies on the phrase-concept dictionary • To map relation phrases: • relies on a corpus of textual-pattern-to-relation mappings of the form • {‘play’, ‘star in’, ‘act’, ‘leading role’} → actedIn • {‘married’, ‘spouse’, ‘wife’} → marriedTo
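A minimal sketch of relation-phrase mapping against a pattern-to-relation dictionary; the dictionary entries below merely echo the examples on the slides:

```python
# Minimal sketch (our assumption): map a detected relation phrase to its
# candidate relations via a textual-pattern-to-relation dictionary.
pattern_to_relation = {
    "play": {"actedIn", "playedForTeam"},
    "star in": {"actedIn"},
    "married": {"marriedTo"},
}

def map_relation_phrase(toks):
    """Return the set of candidate relations for a relation phrase."""
    text = " ".join(toks).lower()
    candidates = set()
    for pattern, relations in pattern_to_relation.items():
        if pattern in text:
            candidates |= relations
    return candidates

print(map_relation_phrase(("played", "in")))  # ambiguous: actedIn vs. playedForTeam
```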
Example of Phrase Mapping • ‘played in’ can either refer to the semantic relation actedIn or to playedForTeam and • ‘Casablanca’ can potentially refer to Casablanca_(film) or Casablanca,_Morocco.
Dependency parsing • Dependency parsing identifies triples of tokens, or triploids, <trel, targ1, targ2>, where trel, targ1, targ2 ∈ qNL • trel: the seed for the relation phrase • targ1, targ2: the seeds for the concept phrases • There is no attempt to assign subject/object roles to the arguments
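A rough sketch of triploid extraction from a dependency parse. The use of spaCy and the particular POS/dependency heuristics are our assumptions; the paper uses its own patterns over dependency parses:

```python
# Rough sketch (our assumption, not the paper's extractor): read a triploid
# <t_rel, t_arg1, t_arg2> off a dependency parse by pairing a verb with two
# nominal dependents, following one level through prepositions.
import spacy

nlp = spacy.load("en_core_web_sm")

def triploids(question):
    doc = nlp(question)
    result = []
    for tok in doc:
        if tok.pos_ == "VERB":
            args = []
            for child in tok.children:
                if child.pos_ in ("NOUN", "PROPN", "PRON"):
                    args.append(child)
                elif child.dep_ == "prep":
                    args.extend(c for c in child.children if c.pos_ in ("NOUN", "PROPN"))
            if len(args) >= 2:
                result.append((tok.text, args[0].text, args[1].text))
    return result

print(triploids("Which actress played in Casablanca?"))
```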
Q-Unit Generation • By combining triploids with detected phrases, we obtain q-units • A q-unit is a triple of sets of phrases <{prel ∈ Pr}, {parg1 ∈ Pc}, {parg2 ∈ Pc}> • such that trel ∈ prel, targ1 ∈ parg1, and targ2 ∈ parg2, i.e., each set collects the detected phrases that contain the corresponding seed token
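A minimal sketch (our own formulation) of grouping detected phrases around a triploid into a q-unit, where a phrase belongs to a slot if it contains the corresponding seed token:

```python
# Minimal sketch (our assumption): build a q-unit from a triploid by
# collecting the detected phrases that contain each seed token.
def build_q_unit(triploid, relation_phrases, concept_phrases):
    t_rel, t_arg1, t_arg2 = triploid
    contains = lambda toks, seed: seed in toks
    return (
        [p for p in relation_phrases if contains(p, t_rel)],
        [p for p in concept_phrases if contains(p, t_arg1)],
        [p for p in concept_phrases if contains(p, t_arg2)],
    )

q_unit = build_q_unit(
    ("played", "actress", "Casablanca"),
    [("played", "in")],
    [("actress",), ("Casablanca",)],
)
print(q_unit)
```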
goal of the disambiguation step • each phrase is assigned to at most one semantic item • resolves the phrase boundary ambiguity • (only nonoverlapping phrases are mapped)
Disambiguation Graph • Joint disambiguation takes place over a disambiguation graph DG = (V, E), • V = Vs∪Vp∪Vq • E = Esim∪Ecoh∪Eq
Type of vertices • V = Vs ∪ Vp ∪ Vq • Vs: the set of s-nodes • an s-node represents a semantic item • Vp: the set of p-nodes • a p-node represents a phrase • Vrp: the relation-phrase nodes • Vrc: the concept-phrase nodes • Vq: a set of placeholder nodes for q-units
Type of edges • Esim ⊆ Vp × Vs • a set of weighted similarity edges • Ecoh ⊆ Vs × Vs • a set of weighted coherence edges • Eq ⊆ Vq × Vp × {rel, arg1, arg2} • called q-edges; the third component records the role of the phrase within the q-unit
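A small data-structure sketch of the disambiguation graph; the concrete Python representation is our assumption, while the node and edge sets follow the slides:

```python
# Minimal sketch (our own representation, not DEANNA's): the disambiguation
# graph DG = (V, E) with V = Vs ∪ Vp ∪ Vq and E = Esim ∪ Ecoh ∪ Eq.
from dataclasses import dataclass, field

@dataclass
class DisambiguationGraph:
    s_nodes: set = field(default_factory=set)      # semantic items
    p_nodes: set = field(default_factory=set)      # detected phrases
    q_nodes: set = field(default_factory=set)      # q-unit placeholders
    sim_edges: dict = field(default_factory=dict)  # (p, s) -> similarity weight
    coh_edges: dict = field(default_factory=dict)  # (s1, s2) -> coherence weight
    q_edges: set = field(default_factory=set)      # (q, p, d), d in {rel, arg1, arg2}
```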
Cohsem (Semantic Coherence) • The semantic coherence Cohsem between two semantic items s1 and s2 is defined as the Jaccard coefficient of their sets of inlinks • For Yago2, an entity e is characterized by its inlinks • InLinks(e): the set of Yago2 entities whose corresponding Wikipedia pages link to e • For a class c with entities e: InLinks(c) = ∪e∈c InLinks(e) • For a relation r: InLinks(r) = ∪(e1, e2)∈r (InLinks(e1) ∩ InLinks(e2))
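A minimal sketch of Cohsem as the Jaccard coefficient of inlink sets, together with the inlink definitions for classes and relations given above; the toy inlink sets are invented:

```python
# Minimal sketch: semantic coherence as the Jaccard coefficient of inlink sets.
def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

def coh_sem(inlinks_s1, inlinks_s2):
    return jaccard(inlinks_s1, inlinks_s2)

def inlinks_of_class(member_inlinks):
    """InLinks(c) = union of InLinks(e) over the entities e in class c."""
    out = set()
    for links in member_inlinks:
        out |= links
    return out

def inlinks_of_relation(pair_inlinks):
    """InLinks(r) = union over facts (e1, e2) of InLinks(e1) ∩ InLinks(e2)."""
    out = set()
    for links_e1, links_e2 in pair_inlinks:
        out |= links_e1 & links_e2
    return out

# toy inlink sets, invented for illustration
print(coh_sem({"Ingrid_Bergman", "Humphrey_Bogart"}, {"Ingrid_Bergman"}))
```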
Similarity Weights • For entities • based on how often a phrase refers to a given entity in Wikipedia • For classes • reflects the number of members in the class • For relations • reflects the maximum n-gram similarity between the phrase and any of the relation’s surface forms
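A minimal sketch of the relation-phrase similarity weight. Treating "n-gram similarity" as Jaccard over character trigrams is our assumption; the slides do not fix the exact definition:

```python
# Minimal sketch (assumed definition): relation similarity as the maximum
# character-trigram Jaccard between the phrase and any surface form.
def char_ngrams(text, n=3):
    s = text.lower()
    return {s[i:i + n] for i in range(max(1, len(s) - n + 1))}

def ngram_sim(a, b):
    na, nb = char_ngrams(a), char_ngrams(b)
    return len(na & nb) / len(na | nb) if (na | nb) else 0.0

def relation_similarity(phrase, surface_forms):
    return max(ngram_sim(phrase, sf) for sf in surface_forms)

print(relation_similarity("played in", ["play in", "star in", "acted in"]))
```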
Disambiguation Graph Processing • The result of disambiguation is a subgraph of the disambiguation graph that yields the most coherent mappings • An integer linear program (ILP) is employed to this end
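A toy sketch of the joint disambiguation as an ILP using PuLP. The library choice, variable layout, toy weights, and hyperparameter values are our assumptions; the paper's actual ILP (solved with Gurobi) has additional constraints, e.g. for q-edges, phrase overlap, and type compatibility:

```python
# Toy sketch of the joint-disambiguation ILP (our simplification, using PuLP).
# Binary variables: x[p, s] = 1 if phrase p is mapped to semantic item s;
# y[s1, s2] = 1 if both items are selected, so their coherence weight counts.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

# toy similarity and coherence weights, invented for illustration
sim = {("played in", "actedIn"): 0.9, ("played in", "playedForTeam"): 0.6,
       ("Casablanca", "Casablanca_(film)"): 0.7,
       ("Casablanca", "Casablanca,_Morocco"): 0.7}
coh = {("actedIn", "Casablanca_(film)"): 0.8,
       ("playedForTeam", "Casablanca,_Morocco"): 0.1}
alpha, beta = 1.0, 1.0  # assumed hyperparameter values (gamma/q-edges omitted)

prob = LpProblem("disambiguation", LpMaximize)
x = {ps: LpVariable(f"x_{i}", cat=LpBinary) for i, ps in enumerate(sim)}
y = {ss: LpVariable(f"y_{i}", cat=LpBinary) for i, ss in enumerate(coh)}

# objective: weighted sum of selected similarity and coherence edges
prob += alpha * lpSum(w * x[ps] for ps, w in sim.items()) + \
        beta * lpSum(w * y[ss] for ss, w in coh.items())

# each phrase is mapped to at most one semantic item
for phrase in {p for p, _ in sim}:
    prob += lpSum(x[(p, s)] for (p, s) in sim if p == phrase) <= 1

# a coherence edge may only count if both of its items are selected
for (s1, s2), var in y.items():
    prob += var <= lpSum(x[(p, s)] for (p, s) in sim if s == s1)
    prob += var <= lpSum(x[(p, s)] for (p, s) in sim if s == s2)

prob.solve()
chosen = [(p, s) for (p, s), var in x.items() if var.value() == 1]
print(chosen)
```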
Query Generation • Triploids and q-units do not assign subject/object roles; these are resolved when generating the query • Example: • “Which singer is married to a singer?” • ?x type singer, ?x marriedTo ?y, and ?y type singer
5 Evaluation • Datasets • Evaluation Metrics • Results & Discussion
Datasets • QALD-1 • from the 1st Workshop on Question Answering over Linked Data (QALD-1) • NAGA collection • created in the context of the NAGA project • based on the Yago2 knowledge base • Training set • 23 QALD-1 questions • 43 NAGA questions • used to obtain the hyperparameters (α, β, γ) in the ILP objective function • Test set • 27 QALD-1 questions • 44 NAGA questions • 19 QALD-1 questions in the test set
Evaluation Metrics • The authors evaluated the output of DEANNA at three stages • 1. after the disambiguation of phrases • 2. after the generation of the SPARQL query • 3. after obtaining answers from the underlying linked-data sources • Judgement • two human assessors judged whether each output item was good or not • if the two disagreed, a third person resolved the judgment
disambiguation stage • The judges • looked at each q-node/s-node pair, in the context of the question and the underlying data schemas, • determined whether the mapping was correct or not, • and determined whether any expected mappings were missing
query-generation stage • The judges • looked at each triple pattern, • determined whether the pattern was meaningful for the question or not, • and whether any expected triple pattern was missing
query-answering stage • the judges were asked to determine whether the result sets for the generated queries were satisfactory
For a question q and item set s in one of the stages of evaluation • correct(q, s): the number of correct items in s • ideal(q): the size of the ideal item set • retrieved(q, s): the number of retrieved items • define coverage and precision as follows: • cov(q, s) = correct(q, s) / ideal(q) • prec(q, s) = correct(q, s) / retrieved(q, s) • Micro-averaging • aggregates over all assessed items regardless of the questions to which they belong • Macro-averaging • first aggregates the items for the same question, and then averages the quality measure over all questions
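A minimal sketch of the coverage/precision measures and the difference between micro- and macro-averaging; the per-question counts are toy numbers:

```python
# Minimal sketch: coverage and precision per question, plus micro- vs.
# macro-averaged precision. The counts below are invented for illustration.
def coverage(correct, ideal):
    return correct / ideal if ideal else 0.0

def precision(correct, retrieved):
    return correct / retrieved if retrieved else 0.0

# per-question counts: (correct, ideal, retrieved)
questions = {"q1": (3, 4, 3), "q2": (1, 2, 2)}

# micro-averaging pools items across questions; macro-averaging averages
# the per-question scores
micro_prec = sum(c for c, _, _ in questions.values()) / \
             sum(r for _, _, r in questions.values())
macro_prec = sum(precision(c, r) for c, _, r in questions.values()) / len(questions)

print(micro_prec, macro_prec)
```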
Conclusions • The authors presented a method for translating natural language questions into structured queries. • Although the model, in principle, leads to high combinatorial complexity, they observed that the Gurobi solver could handle their judiciously designed ILP very efficiently. • Their experimental studies showed very high precision and good coverage of the query translation, and good results for the actual question answers.
qNL focuses on entities, classes, and relations • Ex: “Which actress from Casablanca is married to a writer from Rome?” • Entities: Casablanca, … • Classes: actresses, … • Relations: marriedTo, …