Information Access through Textual Entailment: The Experience of the QALL-ME project Bernardo Magnini FBK-irst, Trento, Italy
Outline • The Qallme scenario • Semantic Interpretation of user queries (suggested direction: textual entailment engines) • Interacting with the user (suggested direction: provide answers with as much structure as possible, RDF) • Porting the system (suggested direction: learn as much as possible from data, i.e. user questions) • Conclusions
QALLME Question Answering Learning Technologies in a Multilingual and Multimodal Environment • Reference: FP6 IST-033860 • Contract Type: STREP • Start date: October 1st, 2006 • Duration: 36 months • Project Funding: 2.82 M euros http://qallme.itc.it
Query Driven vs Answer Driven Information Access • How many people live in Trento? No answer in the first ten documents using Google. • When did Hitler attack the Soviet Union? We find documents containing the question itself, whether or not the answer is actually provided. • Current information access is query driven. • Question Answering proposes an answer-driven approach to information access. See how Google and Yahoo answer “Who is Bill Clinton?”
QALL-ME Scenario • Mobile Devices: Mobile Phones & PDA • Question Input: Voice/SMS • Answer Output: Voice/SMS/MMS/Digital Assistant (Images/Audio/Video/Maps and geo-referenced interactive maps) [Figure: input and output channels; labels: SMS, Voice, Text, MMS, Video, Digital Assistant]
QALL-ME: Requests (from the QALL-ME benchmark) “hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks”
QALL-ME Questions (from the QALL-ME benchmark) “hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks” • To greet
QALL-ME Questions (from the QALL-ME benchmark) “hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks” • To contextualise • This is explicit context • Time is implicit
QALL-ME Questions (from the QALL-ME benchmark) “hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks” • To ask
QALL-ME Questions (from the QALL-ME benchmark) “hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks” • To thank
QALL-ME Resources • Qallme benchmark: acquisition for four languages (about 12,000 requests in total). Semantic annotations: transcriptions, speech acts, EAT (expected answer type), translations • Qallme Ontology: version 4 • Both the QALL-ME benchmark and the QALL-ME ontology are being made incrementally available at the project website (http://qallme.fbk.eu) under a Creative Commons license • Two papers at LREC 2008
QALL-ME Mobile Infrastructure (Waycom srl, Demo Prototype) [Figure: infrastructure diagram; labels: Front-End Application, Client Library API, Virtual Phone Engine, ASR Engine Manager, ASR Resource Interface, TTS Engine Manager, TTS Resource Interface, Resource Interface (German/English), Webservices, QALL-ME Webservices, Server Side Application; links labelled IP, Voice data, Application data; sample fields: Town: TRENTO, Address: VIA VERDI 3]
Showcases • Cinema and Accommodation domain • Automatic procedures for daily updating (Trento) • Distributed services • Cross-language • More complex questions
Mobile showcase • Infrastructure has been consolidated • Runs on Comdata server • Nokia N95 with GPS • Speech input (Italian only) • Cross-language: SMS only • Navigation • Text to Speech
QALL-ME architecture [Figure: architecture diagram; labels: QALL-ME central QA planner; English, German, Spanish and Italian Answer Extractors; Shared Semantic representation; Question Type Ontology; Answer Type Ontology; Local Information Sources; Service Provider; Speech Recognizers; Dialog Models]
QALL-ME in a nutshell [Figure: pipeline diagram; labels: Question, Question Annotation, QALL-ME Question Collection, Training, Entailment Engine, Qallme Ontology, Answer Representation, Presentation Template, Presentation output, User Data; step markers A, M and SM are not expanded on the slide]
Outline • The Qallme scenario • Semantic interpretation of user queries (suggested direction: Entailment Engine) • Presenting information • How to build the system • Conclusions
Question Interpretation Domain ontology (entailment-based Relation Extraction) Given: A domain ontology
Question Interpretation Domain ontology (entailment-based RE) Given: A domain ontology describing binary relations of interest
Question Interpretation Domain ontology (entailment-based RE) Question Given: A domain ontology describing binary relations of interest A natural language question
Question Interpretation Domain ontology (entailment-based RE) Question Given: A domain ontology describing binary relations of interest A natural language question Determine ALL the relations of interest expressed by the question
Question Interpretation (entailment-based RE) [Figure: questions Q1 to Q10]
Question Interpretation (entailment-based RE) [Figure: questions Q1 to Q10, with some marked as out-of-domain questions]
The task: example INPUT: “What science fiction movie can I see today at cinema Astra in Trento?” OUTPUT:
The task: example R2: HasGenre(Movie, Genre) INPUT: “What science fiction movie can I see today at cinema Astra in Trento?” OUTPUT: R2
The task: example R5: IsInDestination(Cinema, Destination) INPUT: “What science fiction movie can I see today at cinema Astra in Trento?” OUTPUT: R2, R5
The task: example R7: IsInSite(Movie, Site) INPUT: “What science fiction movie can I see today at cinema Astra in Trento?” OUTPUT: R2, R5, R7
The task: example R8: HasDate(Movie, Date) INPUT: “What science fiction movie can I see today at cinema Astra in Trento?” OUTPUT: R2, R5, R7, R8
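The task in the example above can be sketched in a few lines of Python. This is a toy illustration, not the QALL-ME implementation: the relation IDs come from the slides, but the hypothesis texts attached to them, the extra relation `RX`, and the `entails` function (a plain word-containment check standing in for a real entailment engine) are assumptions made for this example.

```python
import re

# Relation IDs come from the slides; the hypothesis texts attached to them
# are hypothetical and only serve this example.
RELATION_HYPOTHESES = {
    "R2": "science fiction movie",   # HasGenre(Movie, Genre)
    "R5": "cinema in Trento",        # IsInDestination(Cinema, Destination)
    "R7": "movie at cinema Astra",   # IsInSite(Movie, Site)
    "R8": "movie today",             # HasDate(Movie, Date)
    "RX": "movie directed by",       # hypothetical relation, not in the slides
}


def tokens(text):
    """Lower-cased word set of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def entails(text, hypothesis):
    """Crude stand-in for an entailment engine:
    h counts as entailed when every word of h also occurs in t."""
    return tokens(hypothesis) <= tokens(text)


def relations_in(question):
    """Return ALL relations whose hypothesis is entailed by the question."""
    return [rel for rel, hyp in RELATION_HYPOTHESES.items()
            if entails(question, hyp)]


question = "What science fiction movie can I see today at cinema Astra in Trento?"
found = relations_in(question)
```

A question that entails none of the hypotheses yields an empty list, which is one simple way to flag out-of-domain questions.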
Textual Entailment t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting. h: Ivan Getting invented the GPS. TE tutorial at ACL 2007, Dagan, Roth, Zanzotto
Applied Textual Entailment • A directional relation between two text fragments, Text (t) and Hypothesis (h): t entails h if humans reading t will infer that h is most likely true • Operational (applied) definition: • Human gold standard, as in NLP applications • Assuming common background knowledge, which is indeed expected from applications TE tutorial at ACL 2007, Dagan, Roth, Zanzotto
Distance-Based TE Engine • Determines the best (least costly) sequence of edit operations that transforms T into H: - Linear distance - Tree Edit Distance • Determines the cost of the three edit operations (insertion, deletion, substitution) • Each entailment rule has a probability representing the degree of confidence in the rule • Rules can be at different levels (e.g. lexical, syntactic)
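The linear-distance variant can be illustrated with a token-level edit distance. The sketch below is a minimal example under stated assumptions, not the engine used in the project: the operation costs, the normalisation (cost of deleting all of T plus inserting all of H) and the decision threshold are hypothetical values chosen for illustration.

```python
def edit_cost(t_tokens, h_tokens, ins=1.0, dele=0.2, sub=1.0):
    """Cheapest sequence of insertions, deletions and substitutions
    that transforms the token list T into the token list H."""
    rows, cols = len(t_tokens) + 1, len(h_tokens) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * dele            # delete every remaining T token
    for j in range(1, cols):
        d[0][j] = j * ins             # insert every remaining H token
    for i in range(1, rows):
        for j in range(1, cols):
            same = t_tokens[i - 1] == h_tokens[j - 1]
            d[i][j] = min(
                d[i - 1][j] + dele,                       # deletion from T
                d[i][j - 1] + ins,                        # insertion into T
                d[i - 1][j - 1] + (0.0 if same else sub)  # match or substitution
            )
    return d[-1][-1]


def entails(t, h, threshold=0.25, ins=1.0, dele=0.2, sub=1.0):
    """Decide entailment when the normalised edit cost is at most the threshold.
    Both the threshold and the costs are illustrative assumptions."""
    t_tokens, h_tokens = t.lower().split(), h.lower().split()
    worst = dele * len(t_tokens) + ins * len(h_tokens)
    if worst == 0:
        return True
    return edit_cost(t_tokens, h_tokens, ins, dele, sub) / worst <= threshold
```

Making deletion cheaper than insertion reflects the directional nature of entailment: dropping material from T is usually harmless, while H content that is absent from T has to be paid for in full.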
Entailment-based QA over structured data Input question Pattern repository Q: “Where is cinema Astra located?” Entailment engine
Entailment-based QA over structured data Input question Pattern repository Q: “Where is cinema Astra located?” Entailment engine Q P4
Entailment-based QA over structured data Input question Pattern repository Q: “Where is cinema Astra located?” Entailment engine CONSTRUCT ?address WHERE { ?cinema rdf:type tourism:Cinema . ?cinema tourism:name "Astra" . ?cinema tourism:hasPostalAddress ?addr . ?addr tourism:street ?address } Q P4
Entailment-based QA over structured data Input question Pattern repository Q: “Where is cinema Astra located?” Entailment engine CONSTRUCT ?address WHERE { ?cinema rdf:type tourism:Cinema . ?cinema tourism:name "Astra" . ?cinema tourism:hasPostalAddress ?addr . ?addr tourism:street ?address } Q P4 A: Corso Buonarroti, 16 - Trento Answer
Entailment-based QA over structured data Input question Pattern repository Q: “What’s the address of Astra?” Entailment engine CONSTRUCT ?address WHERE { ?cinema rdf:type tourism:Cinema . ?cinema tourism:name "Astra" . ?cinema tourism:hasPostalAddress ?addr . ?addr tourism:street ?address } Q P4 A: Corso Buonarroti, 16 - Trento Answer
Entailment-based QA over structured data Input question Pattern repository Q: “Where can I find a cinema in the city centre?” Entailment engine CONSTRUCT ?address WHERE { ?cinema rdf:type tourism:Cinema . ?cinema tourism:name "Astra" . ?cinema tourism:hasPostalAddress ?addr . ?addr tourism:street ?address } Q P4 A: Corso Buonarroti, 16 - Trento Answer
Entailment-based QA over structured data Input question Pattern repository Q: “I want to see a movie at Astra. Where is it?” Entailment engine CONSTRUCT ?address WHERE { ?cinema rdf:type tourism:Cinema . ?cinema tourism:name "Astra" . ?cinema tourism:hasPostalAddress ?addr . ?addr tourism:street ?address } Q P4 A: Corso Buonarroti, 16 - Trento Answer
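The flow on these slides (question, pattern repository, selected pattern, query) can be sketched as follows. This is a hedged illustration only: the pattern texts, the second pattern `P_movies`, the `[cinema]` slot convention, the 0.5 threshold and the use of `difflib.SequenceMatcher` as a similarity score are all assumptions standing in for the real entailment engine. The query template reuses the triple patterns from the slide but is written as a plain `SELECT`, since the point here is slot filling.

```python
from difflib import SequenceMatcher

ADDRESS_QUERY = (
    'SELECT ?address WHERE {{ ?cinema rdf:type tourism:Cinema . '
    '?cinema tourism:name "{cinema}" . '
    '?cinema tourism:hasPostalAddress ?addr . '
    '?addr tourism:street ?address }}'
)
MOVIES_QUERY = (  # hypothetical second template, not from the slides
    'SELECT ?movie WHERE {{ ?event tourism:isInSite ?cinema . '
    '?cinema tourism:name "{cinema}" . '
    '?event tourism:hasEventContent ?movie }}'
)

# Pattern repository: pattern id -> (minimal question with a slot, query template).
PATTERNS = {
    "P4": ("where is cinema [cinema] located", ADDRESS_QUERY),
    "P_movies": ("what movies are on at cinema [cinema]", MOVIES_QUERY),
}


def normalise(question, cinema):
    """Lower-case, replace the recognised cinema name by a slot, tokenise."""
    text = question.lower().replace(cinema.lower(), "[cinema]")
    return text.strip(" ?!.").split()


def select_pattern(question, cinema, threshold=0.5):
    """Return (pattern id, instantiated query), or None when no pattern is
    similar enough (a rough proxy for 'the question entails no pattern')."""
    q_tokens = normalise(question, cinema)
    best_id, best_score = None, 0.0
    for pid, (pattern, _) in PATTERNS.items():
        score = SequenceMatcher(None, q_tokens, pattern.split()).ratio()
        if score > best_score:
            best_id, best_score = pid, score
    if best_id is None or best_score < threshold:
        return None
    return best_id, PATTERNS[best_id][1].format(cinema=cinema)


result = select_pattern("Where is cinema Astra located?", "Astra")
```

A surface similarity score like this one only copes with small variations; handling paraphrases such as “What’s the address of Astra?” is exactly where a real entailment engine is needed.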
Entailment-Based QA • Language variations are handled at the textual level • Alleviates the need for lexical mapping (as in traditional NLI systems) • Any textual entailment approach/algorithm can be used: distance-based, machine learning based, entailment rules with lexical and syntactic information • Linguistic phenomena are independent of the database organization • Re-usable across different tasks (e.g. Relation Extraction) • Does not change in case of open domain QA
Outline • The Qallme scenario • Semantic Interpretation of user queries • Presenting information (suggested direction: provide answers with as much structure as possible, RDF) • How to build the system • Conclusions
QALLME: RDF-based output • RDF is a standard for representing knowledge in the Semantic Web • RDF is independent both of languages and of media, allowing specific presentation components to be designed on top of it • All reasoning capabilities allowed by RDF will be available in order to draw inferences from answers • In order to represent the informative content of an answer, it seems natural to re-use concepts and relations already defined for the QALL-ME Ontology, rather than defining a new set of predicates • However, the informative content alone is not adequate for generating interactive QA presentations
A closer look at SPARQL queries CONSTRUCT{ … } WHERE{ … }
A closer look at SPARQL queries CONSTRUCT{ … } WHERE{ … } “Construct” portion: selects fragments of the ontology that represent the “answer” (core answer PLUS relevant additional information, for different answer presentation strategies)
A closer look at SPARQL queries CONSTRUCT{ … } WHERE{ … } “Construct” portion: returns fragments of the ontology, in the form of an RDF graph, that represent the “answer” (core answer PLUS relevant additional information, useful for answer presentation) “Where” portion: represents the constraints necessary for answer extraction
CONSTRUCT portion IN: What’s on at Modena? CONSTRUCT {?event qmo:hasPeriod ?period . ?event qmo:isInSite ?cinema . ?event qmo:hasEventContent ?movie . ?movie rdf:type ?movietype . ?movie qmo:name ?moviename . ?cinema qmo:hasGPSCoordinate ?coordinate . ?cinema qmo:name ?cinemaname . ?cinema qmo:hasPostalAddress ?postaladdress . ?postaladdress qmo:isInDestination ?destination . … qma:AnswerInstance a qma:AnswersObject ; qma:hasAnswerValue ?movie }
CONSTRUCT portion IN: What’s on at Modena? CONSTRUCT {?event qmo:hasPeriod ?period . ?event qmo:isInSite ?cinema . ?event qmo:hasEventContent ?movie . ?movie rdf:type ?movietype . ?movie qmo:name ?moviename . ?cinema qmo:hasGPSCoordinate ?coordinate . ?cinema qmo:name ?cinemaname . ?cinema qmo:hasPostalAddress ?postaladdress . ?postaladdress qmo:isInDestination ?destination . … qma:AnswerInstance a qma:AnswersObject ; qma:hasAnswerValue ?movie } [Figure: graph described by the CONSTRUCT template; nodes: event, period, cinema, movie, postalAddress, Destination, coordinate, movietype, moviename, cinemaName; edges: hasPeriod, isInSite, hasEventContent, hasPostalAddr., isInDestination, hasGPSCoord., type, name]
WHERE portion IN: What’s on at Modena? CONSTRUCT { … } WHERE{ ?event qmo:hasPeriod ?period . ?event qmo:isInSite ?cinema . … { ?cinema qmo:name "Supercinema Modena" } UNION { ?cinema qmo:name "Multisala Modena" } } . … FILTER (xsd:dateTime("2008-12-05T14:19:55") <= xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date),"T"),xsd:string(?time)))) … } …the name of the cinema is “SUPERCINEMA MODENA” or “MULTISALA MODENA”
WHERE portion IN: What’s on at Modena? CONSTRUCT { … } WHERE{ ?event qmo:hasPeriod ?period . ?event qmo:isInSite ?cinema . … { ?cinema qmo:name "Supercinema Modena" } UNION { ?cinema qmo:name "Multisala Modena" } } . … FILTER (xsd:dateTime("2008-12-05T14:19:55") <= xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date),"T"),xsd:string(?time)))) … } …the movie should be TODAY, and AFTER THE TIME OF THE QUERY
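The two constraints highlighted on these slides can be mirrored in plain Python to make their logic explicit. This is only a sketch: the helper names `name_union` and `is_today_and_later` are hypothetical, and the way the system actually assembles and evaluates the WHERE portion is not shown in the slides.

```python
from datetime import datetime


def name_union(var, names):
    """Build the UNION of name constraints for all cinemas matching the
    (possibly ambiguous) name given by the user."""
    return " UNION ".join(f'{{ {var} qmo:name "{name}" }}' for name in names)


def is_today_and_later(query_time, date, time):
    """Mirror of the slide's stated condition: the show starts on the day
    of the query and not before the time of the query."""
    start = datetime.fromisoformat(f"{date}T{time}")
    return start.date() == query_time.date() and query_time <= start


query_time = datetime.fromisoformat("2008-12-05T14:19:55")
clause = name_union("?cinema", ["Supercinema Modena", "Multisala Modena"])
evening_show_ok = is_today_and_later(query_time, "2008-12-05", "21:00:00")
```

Joining the date and the time with a `T` before parsing follows the same idea as the FILTER expression on the slide, which concatenates `?date`, `"T"` and `?time` before casting to `xsd:dateTime`.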
Resulting RDF graph IN: What’s on at Modena? [Figure: RDF graph returned by the query; event hasPeriod period; period HasDatePeriod Dateperiod (StartDate 12/11/2008) and HasTimePeriod Timeperiod (StartTime 21.00); event isInSite cinema; cinema name Modena; cinema hasPostalAddr. postalAddress; postalAddress isInDestination Trento; cinema hasGPSCoord. coordinate (Latitude 46°4′0′′N, Longitude 11°7′0′′E); event hasEventContent movie; movie type Crime; movie name La Fuga]
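To show how a presentation component might consume such a graph, the sketch below stores a few of its triples as Python tuples and reads off the core answer plus some additional information. It is an assumption-laden illustration: the triples are read off the flattened figure and may not match the original exactly, node names such as `event` and `cinema` are placeholders for real URIs, and `describe_answer` is a hypothetical helper.

```python
# A small part of the answer graph as (subject, predicate, object) tuples.
# Values are taken from the figure; node identifiers are placeholders.
TRIPLES = [
    ("event", "hasPeriod", "period"),
    ("event", "isInSite", "cinema"),
    ("event", "hasEventContent", "movie"),
    ("movie", "type", "Crime"),
    ("movie", "name", "La Fuga"),
    ("cinema", "name", "Modena"),
    ("AnswerInstance", "hasAnswerValue", "movie"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def describe_answer():
    """Core answer (the node marked by hasAnswerValue) plus extra details
    that a presentation template could use."""
    movie = objects("AnswerInstance", "hasAnswerValue")[0]
    event = subjects("hasEventContent", movie)[0]
    cinema = objects(event, "isInSite")[0]
    return {
        "movie": objects(movie, "name")[0],
        "genre": objects(movie, "type")[0],
        "cinema": objects(cinema, "name")[0],
    }


answer = describe_answer()
```

Keeping the core answer reachable through a dedicated node lets a text, voice or map presentation pick different parts of the same graph.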