Natural Language Interfaces to Conceptual Models. NLP talk, Sheffield, 07 October 2010. Danica Damljanović, University of Sheffield, danica@dcs.shef.ac.uk. What are Natural Language Interfaces to Conceptual Models? Ontology engineer. Domain expert. Customisation.
Natural Language Interfaces to Conceptual Models • NLP talk, Sheffield, 07 October 2010 • Danica Damljanović, University of Sheffield, danica@dcs.shef.ac.uk
What are Natural Language Interfaces to Conceptual Models? [Diagram labels: Ontology engineer; Domain expert; Customisation; Ontology editing (e.g. using Protégé); WordNet; …; NLI for querying; NLI for Ontology authoring; Domain lexicon; Domain knowledge]
The Objective • Increase usability of Natural Language Interfaces to ontologies • For end users: increase precision and recall • For application developers: decrease the time for customisation
Previous Work: QuestIO 1.15 1.19 compare
But... • Ontologies are not perfect: • ontology lexicalisations often missing or too many • ranking based on ontology structure might be misleading • Encouraging users to use keywords might be misleading • User evaluation: • defined tasks: user satisfaction reaching 90% • undefined tasks: user satisfaction low (~44%)
FREyA - Feedback, Refinement, Extended Vocabulary Aggregator • Feedback: showing the user the system's interpretation of the query • Refinement: • resolving ambiguity: generating a dialog whenever one term refers to more than one concept in the ontology (precision) • Extended Vocabulary: • expressiveness: generating a dialog whenever an “unknown” term appears in the question (recall) • portability: no need for customisation by application developers • The dialog: • is generated by combining syntactic parsing and ontology-based lookup • learns from the user’s selections
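The refinement and extended-vocabulary dialogs described above could be sketched roughly as follows. This is a minimal illustration and not FREyA's implementation: the toy `LEXICON`, the `votes` table, and the functions `needs_dialog`, `suggest` and `record_selection` are all hypothetical names, and plain selection counts stand in for the learning model, which the slides do not specify.

```python
from collections import defaultdict

# Hypothetical toy lexicon: term -> candidate ontology concepts (OCs).
LEXICON = {
    "new york": ["geo:State", "geo:City"],
    "area": ["geo:stateArea"],
}

# Hypothetical memory of past user selections: (term, concept) -> count.
votes = defaultdict(int)


def needs_dialog(term):
    """A dialog is needed when a term is unknown or maps to several OCs."""
    return len(LEXICON.get(term, [])) != 1


def suggest(term, all_concepts):
    """Return dialog candidates, most frequently selected first."""
    # Unknown terms fall back to every concept the caller passes in.
    candidates = LEXICON.get(term) or list(all_concepts)
    # sorted() is stable, so ties keep their original order.
    return sorted(candidates, key=lambda c: -votes[(term, c)])


def record_selection(term, concept):
    """Learn from the user's choice so that it ranks higher next time."""
    votes[(term, concept)] += 1
    if concept not in LEXICON.setdefault(term, []):
        LEXICON[term].append(concept)
```

In this sketch an ambiguous term such as "new york" triggers a dialog, and once the user picks `geo:City` that option moves to the top of later suggestions; an unknown term such as "point" is added to the lexicon after its first selection.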
FREyA Workflow [diagram; recoverable labels: NL query, POCs, OCs, triples, SPARQL, answer, Answer Type; steps: Identify the Answer Type, learn] • Potential Ontology Concept (POC) • Ontology Concept (OC)
Find Potential Ontology Concepts CNL 2010, Marettimo, Sicily
Mapping POC to OCs: Ambiguities POC POC population geo:State geo:State new york geo:City geo:cityPopulation
Ambiguous Lexicon IF THEN
The User Controls the Output POC min geo:loElevation point POC geo:isLowestPointOf geo:LoPoint POC max state geo:stateArea area geo:State
What is the lowest point of the state with the largest area?
TRIPLES:
?firstJoker – geo:isLowestPointOf – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?firstJoker ?p0 ?c1 ?p2 ?lastJoker
where {
  { { ?c1 ?p0 ?firstJoker } UNION { ?firstJoker ?p0 ?c1 } .
    filter (?p0=<http://www.mooney.net/geo#isLowestPointOf>) . }
  ?c1 rdf:type <http://www.mooney.net/geo#State> .
  ?c1 ?p2 ?lastJoker .
  filter (?p2=<http://www.mooney.net/geo#stateArea>) .
}
ORDER BY DESC(xsd:double(?lastJoker))
What is the lowest point of the state with the largest area? (The answer for both interpretations is Death Valley.)
TRIPLES:
?firstJoker – (min) geo:loElevation – geo:LoPoint
geo:LoPoint – ?joker3 – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?firstJoker ?p0 ?c1 ?joker3 ?c2 ?p3 ?lastJoker
where {
  ?c1 ?p0 ?firstJoker .
  filter (?p0=<http://www.mooney.net/geo#loElevation>) .
  ?c1 rdf:type <http://www.mooney.net/geo#LoPoint> .
  { { ?c2 ?joker3 ?c1 } UNION { ?c1 ?joker3 ?c2 } }
  ?c2 rdf:type <http://www.mooney.net/geo#State> .
  ?c2 ?p3 ?lastJoker .
  filter (?p3=<http://www.mooney.net/geo#stateArea>) .
}
ORDER BY ASC(xsd:double(?firstJoker)) DESC(xsd:double(?lastJoker))
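In both queries the (min) and (max) markers on the triples end up as ASC and DESC terms in the ORDER BY clause. A minimal sketch of that one step is shown here, assuming the modifiers arrive as (variable, kind) pairs; the function name `order_by_clause` and this input format are hypothetical and are not taken from FREyA.

```python
def order_by_clause(modifiers):
    """Build a SPARQL ORDER BY clause from (variable, 'min'|'max') pairs."""
    # 'min' sorts ascending so the smallest value comes first;
    # 'max' sorts descending so the largest value comes first.
    direction = {"min": "ASC", "max": "DESC"}
    parts = [
        f"{direction[kind]}(xsd:double(?{var}))" for var, kind in modifiers
    ]
    return "ORDER BY " + " ".join(parts) if parts else ""


# Modifiers of the second interpretation: (min) loElevation, (max) stateArea.
clause = order_by_clause([("firstJoker", "min"), ("lastJoker", "max")])
```

The first result row under this ordering is then read as the answer; casting to `xsd:double` makes the comparison numeric rather than lexical.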
New Lexicon IF THEN
FREyA: a Natural Language Interface to Ontologies • http://gate.ac.uk/freya ESWC 2010
Evaluation: correctness • Mooney GeoQuery dataset, 250 questions • 34 no dialog, 14 failed to be answered • Precision=recall=94.4%
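The reported 94.4% is consistent with counting the 14 failed questions against the 250 asked. The short check below assumes that reading; the slide does not spell out the calculation.

```python
total_questions = 250
failed = 14

# Assumption: every question that did not fail was answered correctly.
answered_correctly = total_questions - failed

score = answered_correctly / total_questions
print(f"{answered_correctly}/{total_questions} = {score:.1%}")
```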
Evaluation: Learning • 10-fold cross-validation, 202 Mooney GeoQuery questions that could be correctly mapped into SPARQL and required dialog, from 0.25 to 0.48 • Errors: ambiguity and sparseness
Evaluation: Ranking • Mean Reciprocal Rank: 0.76 (default ranking based on string similarity and synonym detection)
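Mean Reciprocal Rank averages 1/rank of the correct suggestion over all dialogs, so a value of 1.0 would mean the correct option was always listed first. A minimal sketch of the standard formula follows; the function name and the convention of scoring a missing answer as 0 are assumptions made here, not details from the talk.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all dialogs.

    ranks: 1-based position of the correct suggestion in each dialog,
    or None when the correct suggestion was not listed (scored as 0).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r) / len(ranks)


# Three illustrative dialogs: correct option ranked 1st, 2nd and 4th.
example = mean_reciprocal_rank([1, 2, 4])
```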
Learning the Correct Ranking • Randomly selected 103 dialogs from 202 questions (343 dialogs) • MRR increased by 0.06 (6 percentage points), from 0.72 to 0.78
Evaluation: Customisation • Small empirical evaluation with 1 subject who is not familiar with ontologies and NLP • No training, short introduction to the domain • 17 questions asked in total; 3 were cancelled by the user during one of the dialogs • 78.57% correctly answered • 21.43% failed or incorrectly answered
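The two percentages appear to be computed over the 14 questions that were not cancelled. The check below assumes that 11 of them were answered correctly, a count inferred from 78.57% and not stated on the slide.

```python
asked = 17
cancelled = 3
completed = asked - cancelled

# Assumption: 11 correct answers, inferred from the reported 78.57%.
correct = 11
failed_or_incorrect = completed - correct

pct_correct = round(100 * correct / completed, 2)
pct_failed = round(100 * failed_or_incorrect / completed, 2)
print(completed, pct_correct, pct_failed)
```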
Conclusion • Combining syntactic parsing with ontology-based lookup through user interaction can increase the precision and recall of NLIs to ontologies, • while reducing the time for customisation by shifting it from application developers to end users.
Next steps • Improvement of the learning model to avoid errors due to ambiguities • e.g. "point" can map to geo:HiPoint or geo:LoPoint • Using the lexicon to improve other systems
More information... • D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC 2010), Springer Verlag, Heraklion, Greece, May 31-June 3, 2010. • D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC 2010), ELRA 2010, La Valletta, Malta, May 17-23, 2010. • D. Damljanovic: Towards portable controlled natural languages for querying ontologies. In Rosner, M., Fuchs, N., eds.: Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily (September 2010).