9th International Conference on Applications of Natural Language to Information Systems, NLDB’04. Ontology-driven question answering in AquaLog. By: Vanessa López, Enrico Motta. Knowledge Media Institute, Open University. {v.lopez, e.motta}@open.ac.uk
Index • Motivation: NL front-end for the Semantic Web • AquaLog approach • System Architecture • Examples • Evaluation/ Discussion • Future Lines/ Conclusions
The Semantic Web Vision: Engineering Semantics on the Web
• The future web: knowledge to be managed in an automatic way
• Semantics through:
  - Set of representation languages: RDF,…
  - Structures for knowledge: ontologies
ASSUMPTION: Ontology-based semantic markup will become widely available
Question-Answering
So…
- Novel, sophisticated question answering using semantic mark-up
- Semantic mark-up queried directly
Similar scenario: asking NL queries to databases (semantic mark-up viewed as a knowledge base)
Ontology-portable
Example!!: What are the projects of enrico motta?
AquaLog: Approach
[Figure: NL SENTENCE INPUT → LINGUISTIC & QUERY CLASSIFICATION → QUERY TRIPLES → RELATION SIMILARITY SERVICE → ONTOLOGY COMPATIBLE TRIPLES → INFERENCE ENGINE → ANSWER]
Intermediate triples: <subject, predicate, object> + features
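The intermediate triple can be pictured as a small record. The following is a minimal sketch, not AquaLog's actual data structure: the class name `QueryTriple`, its field names, and the `category` feature key are assumptions made for illustration. It encodes the query triple shown later in the deck for "What are the projects of john domingue".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QueryTriple:
    """Hypothetical intermediate triple: <subject, predicate, object> + features."""
    subject: Optional[str]      # None marks the unknown value being asked for
    predicate: Optional[str]    # None marks an unknown relation
    obj: str
    features: Dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        # Print unknown slots the way the slides do ("VALUE?" and "?").
        subj = self.subject.upper() if self.subject else "VALUE?"
        pred = self.predicate.upper() if self.predicate else "?"
        return f"{subj} – {pred} – {self.obj.upper()}"


# The feature key "category" is an assumed name; the value comes from the slides.
triple = QueryTriple(
    subject=None,
    predicate="projects",
    obj="john domingue",
    features={"category": "WH-UNKNOWN TERM"},
)
print(triple.render())
```

Keeping the features in an open dictionary is one way to let the linguistic component attach extra annotations (for example a passive-voice flag) without changing the triple shape.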
AquaLog: Approach
2 main subtasks:
• Linguistic Component: intermediate representation from the input query
• Relation Similarity Service: map the intermediate representation to the KB
[Figure labels: ontologies and knowledge bases as PLUG-INS to AQUALOG]
Linguistic Component
[Figure labels: QUERY INTERFACE, USER’S SESSION, NL QUERY, GATE LIBRARIES, JAPE, TOKENS, NOUNS, VERBS, PREPS, WH-S, WH-TERMS, TERMS, RELATIONS, FEATURES, QUERY-PATTERN-CLASSIFICATION, TRIPLE(s)]
Linguistic Component
Query categories:
• WH-GENERIC TERM
• WH-UNKNOWN TERM
• WH-UNKNOWN RELATION
• DESCRIPTION
• AFFIRMATIVE-NEGATIVE
• COMBINATION OF BASIC QUERIES
Example query triples:
• projects? – involved – semantic web
• person/organization? – managed (passive) – motta
• value? – job title – motta
• value? – web address – peter
• person/organization? – has interest – semantic web
Relation Similarity Service
[Figure labels: LINGUISTIC TRIPLE, RSS, MECHANISM(S): STRING ALG, WORDNET, LEXICON, KB HIERARCHY, RELATIONS; USER FEEDBACK; Ontologies; ANSWER]
The relation similarity service
THE PROBLEM: similarities between the relations/concepts in the translated query and the (dynamic) ontological structures
Who is the secretary in Kmi?
Query triple: secretary (person, KMI)
RSS → works-in-unit (secretary, knowledge-media-institute)
[Figure label: Research institute]
What are the projects of john domingue
CATEGORY: WH-UNKNOWN TERM
QUERY TRIPLE: VALUE? – PROJECTS – JOHN DOMINGUE
PROJECT? – ? – JOHN DOMINGUE (USER’S FEEDBACK REQUIRED!!)
ONTO TRIPLE: PROJECTS – HAS-PROJECT-MEMBER (OR) HAS-PROJECT-LEADER – JOHN-DOMINGUE
ANSWER: LIST OF PROJECTS
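How an ontology triple with alternative relations could be answered can be sketched over a toy fact list. This is an illustration only, not AquaLog's inference engine: the function `answer_onto_triple`, the tuple layout `(relation, subject, object)`, and the project names `project-a`, `project-b`, `project-c` are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Fact = Tuple[str, str, str]  # (relation, subject, object), an assumed layout

# Toy knowledge base with made-up project names.
TOY_KB: List[Fact] = [
    ("has-project-member", "project-a", "john-domingue"),
    ("has-project-leader", "project-b", "john-domingue"),
    ("has-project-member", "project-c", "enrico-motta"),
]


def answer_onto_triple(kb: Iterable[Fact], relations: Set[str], obj: str) -> List[str]:
    """Return every subject linked to `obj` by ANY of the candidate relations."""
    return sorted({subj for rel, subj, o in kb if rel in relations and o == obj})


# HAS-PROJECT-MEMBER (OR) HAS-PROJECT-LEADER, as on the slide.
projects = answer_onto_triple(
    TOY_KB, {"has-project-member", "has-project-leader"}, "john-domingue"
)
print(projects)
```

Treating the "(OR)" as a set of accepted relations keeps the lookup a single pass over the facts; a real system would query the ontology server instead of a Python list.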
What are the research areas covered by the akt project?
CATEGORY: WH-GENERIC TERM
QUERY TRIPLE: RESEARCH AREAS – COVERED – AKT
ONTO TRIPLE: RESEARCH-AREA – ADDRESSES-GENERIC-AREA-OF-INTEREST – AKT
SOLUTION: LIST OF RESEARCH AREAS
AquaLog: Architecture
[Figure labels: QUERY INTERFACE, USER’S SESSION, QUERY, LINGUISTIC COMPONENT, GATE libraries, Configuration files, TRIPLES, RELATION SIMILARITY SERVICE, String pattern libraries, WordNet thesaurus libraries, User’s feedback, Help, RSS - IE modules, Ontology-compliant query, Ontologies / Knowledge Bases, ‘Raw’ Answer, Answer Interpreter, Post-process Semantic modules, ANSWERING PROCESSING INTERFACE]
Evaluation
Aktive-reference ontology
• Initial study:
  - Satisfy the users’ expectations about the range of questions?
  - Possible extensions to the ontology and linguistic components?
• 70 questions: no linguistic constraints
• 48.68% of the total were handled correctly
Failures:
• LINGUISTIC FAILURE (NLP -> TRIPLE): 69% of errors
• DATA MODEL FAILURE (NL too complicated for triples): 0% of errors
• RSS FAILURE (Query TRIPLE -> Onto TRIPLE): 7.6% of errors
• CONCEPTUAL FAILURE (ontology does not cover query): 10.2% of errors
• SERVICE FAILURE (requires ranking and similarity services): 20.5% of errors
AquaLog version 2 • Improved linguistic coverage: • which researchers wrote publications related to social aspects? • Implementing services • Similarity services: is there a project similar to akt? • Ranking services: what are the most successful projects? NEW VERSION TO HANDLE 87% OF THE FAILURES
Current work: Example
Are there any projects about ontologies sponsored by eprsc?
CATEGORY: WH-GENERIC-1TERM-CLAUSE
QUERY TRIPLE: SPONSORED (PROJECTS, ONTOLOGY, EPRSC)
CLAUSE
CATEGORY: WH-UNKNOWN-REL, ONTO TRIPLE: ADDRESSES-GENERIC-AREA-OF-INTEREST (PROJECT?, ONTOLOGIES)
CATEGORY: WH-GENERIC-TERM, ONTO TRIPLE: FUNDING SOURCE (PROJECT?, ONTOLOGIES)
SOLUTION (WH-GENERIC-1TERM-CLAUSE): COMBINATION OF LISTS
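One plausible reading of "combination of lists" is an intersection: answer each ontology triple separately, then keep the projects that appear in both lists. The sketch below assumes that reading. The fact layout, the helper `subjects_for`, the relation spelling `funding-source`, and the project names are hypothetical, and the funder is spelled `eprsc` as in the slide's query.

```python
from typing import Iterable, List, Set, Tuple

Fact = Tuple[str, str, str]  # (relation, subject, object), an assumed layout

# Toy facts with made-up project names.
TOY_KB: List[Fact] = [
    ("addresses-generic-area-of-interest", "project-a", "ontologies"),
    ("addresses-generic-area-of-interest", "project-b", "ontologies"),
    ("funding-source", "project-b", "eprsc"),
    ("funding-source", "project-c", "eprsc"),
]


def subjects_for(kb: Iterable[Fact], relation: str, obj: str) -> Set[str]:
    """Answer one ontology triple: all subjects with `relation` to `obj`."""
    return {subj for rel, subj, o in kb if rel == relation and o == obj}


# One list per triple, then combine (here: intersect) the two lists.
about_ontologies = subjects_for(TOY_KB, "addresses-generic-area-of-interest", "ontologies")
sponsored = subjects_for(TOY_KB, "funding-source", "eprsc")
combined = sorted(about_ontologies & sponsored)
print(combined)
```

For an "are there any" question, a non-empty combined list would correspond to a yes answer.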
Current work: Example
Which projects are headed by researchers in akt?
NOTE: RESEARCHERS IS A CLASS!
CATEGORY: WH-3TERM
QUERY TRIPLE: HEADED (PROJECTS, RESEARCHERS, AKT)
CLAUSE
CATEGORY: WH-3TERM, ONTO TRIPLE: HAS-PROJECT-MEMBER OR LEADER (PROJECT?, RESEARCHER)
CATEGORY: WH-UNKNOWN-REL, ONTO TRIPLE: HAS-PROJECT-MEMBER OR LEADER (RESEARCHERS, AKT)
SOLUTION (WH-3-TERM – CLAUSE TO THE 2 TERM): GET THE FIRST LIST FOR THE CLAUSE AND GET A LIST FOR EACH OF THE ELEMENTS IN THE LIST
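The two-step solution on this slide (first resolve the clause into a list, then run one lookup per element of that list) can be sketched as a nested query. This is an illustrative sketch under assumptions: the fact layout, the helper names, and the names `researcher-1`, `researcher-2`, `researcher-3`, `project-x`, `project-y` are hypothetical, and both steps use the member-or-leader relations named on the slide.

```python
from typing import Dict, Iterable, List, Tuple

Fact = Tuple[str, str, str]  # (relation, project, person), an assumed layout

MEMBER_OR_LEADER = {"has-project-member", "has-project-leader"}

# Toy facts with made-up researcher and project names.
TOY_KB: List[Fact] = [
    ("has-project-member", "akt", "researcher-1"),
    ("has-project-leader", "akt", "researcher-2"),
    ("has-project-leader", "project-x", "researcher-1"),
    ("has-project-member", "project-y", "researcher-3"),
]


def people_in(kb: Iterable[Fact], project: str) -> List[str]:
    """Step 1: resolve the clause, i.e. the researchers attached to `project`."""
    return sorted({p for rel, proj, p in kb if rel in MEMBER_OR_LEADER and proj == project})


def projects_of(kb: Iterable[Fact], person: str) -> List[str]:
    """Step 2: one lookup per element of the first list."""
    return sorted({proj for rel, proj, p in kb if rel in MEMBER_OR_LEADER and p == person})


def nested_answer(kb: List[Fact], project: str) -> Dict[str, List[str]]:
    return {person: projects_of(kb, person) for person in people_in(kb, project)}


result = nested_answer(TOY_KB, "akt")
print(result)
```

Returning one list per researcher, rather than a flattened list, mirrors the slide's wording "a list for each of the elements in the list".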
Conclusion
• Novel RSS Service: combination of pattern matching, lexicon & reasoning about the ontology (taxonomy, relationships).
  - Term (Triple): instance/class
  - Relation (Triple): relation/class (not necessarily known)
• Linguistic Component: GATE (Sheffield University). Very flexible through the use of patterns: currently around 26 linguistic patterns.
• String algorithms: find matches in the ontology for any of the triple terms. Based on a combination of string distance metrics for name matching tasks (open source: Carnegie Mellon University, Pittsburgh).
• Portable: little configuration effort.
• Portability across ontologies has to be evaluated.
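The string-matching step can be illustrated with a small name matcher. This sketch does not use the open-source string-distance library the slide refers to; it substitutes Python's standard `difflib.SequenceMatcher` as a stand-in metric. The function names, the normalization rules, and the 0.8 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def normalize(name: str) -> str:
    """Lower-case and treat hyphens/underscores as spaces before comparing."""
    return name.lower().replace("-", " ").replace("_", " ").strip()


def best_match(term: str, candidates: Iterable[str], threshold: float = 0.8) -> Optional[str]:
    """Return the ontology name most similar to `term`, or None if nothing is close."""
    scored = [
        (SequenceMatcher(None, normalize(term), normalize(c)).ratio(), c)
        for c in candidates
    ]
    if not scored:
        return None
    score, candidate = max(scored)
    return candidate if score >= threshold else None


# Hypothetical ontology names, written in the hyphenated style used on the slides.
ontology_names = ["john-domingue", "research-area", "has-project-member"]
print(best_match("john domingue", ontology_names))
print(best_match("research areas", ontology_names))
print(best_match("secretary", ["project", "akt"]))
```

When no candidate clears the threshold, returning `None` leaves room for the other mechanisms the deck mentions (lexicon lookup, ontology reasoning, or asking the user).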
End Thanks for your attention V.Lopez@open.ac.uk