
Ontology-based question analysis in a federation of university sites: the MOSES case study

Paolo Atzeni***, Paolo Missier***, Patrizia Paggio**, Dorte H. Hansen**, Roberto Basili*, Maria Teresa Pazienza*, Fabio Massimo Zanzotto*


Presentation Transcript


  1. Ontology-based question analysis in a federation of university sites: the MOSES case study
     Paolo Atzeni***, Paolo Missier***, Patrizia Paggio**, Dorte H. Hansen**, Roberto Basili*, Maria Teresa Pazienza*, Fabio Massimo Zanzotto*
     * Dipart. di Informatica, Sist. e Prod., University of Roma “Tor Vergata”
     ** Centre for Language Technology, University of Copenhagen
     *** Dipart. di Informatica e Automatica, University of Roma Tre

  2. The environment
     • A federation of university sites dealing with similar information (a partially sharable conceptualization)
     • Different logical organization
     • Different physical organization
     • Different languages for stored information
     • Remote access
     • Multilingual queries
     • Independence between a query and the location of the related information

  3. Interaction in a Multilingual Environment
     [Figure: an Italian customer and an English barman, each with a beverage ontology.]
     Italian labels: BEVANDA (Beverage); CAPPUCCINO (Cappuccino); CAFFE (Coffee); MINI-CAPPUCCINO (Small Cappuccino); CAFFE MACCHIATO (Dashed Coffee); CAFFE RISTRETTO (Short Coffee); CAFFE LUNGO (Long Coffee); CAFFE CORRETTO (Corrected Coffee)
     English labels: BEVERAGE; COFFEE; ESPRESSO; CAPPUCCINO; LONG CAPPUCCINO
     The customer asks for “Caffè macchiato caldo”, i.e. CAFFE MACCHIATO CALDO (Hot Dashed Coffee). On the barman's side the figure shows the candidates “Hot Dashed Coffee” and “Dashed Coffee”, with the markers ?, ? and OK.

  4. The approach
     Answering a question in a federation of sites can be seen as a collaborative task between ontological nodes belonging to the same Q/A system: it requires a mapping between similar concepts in different linguistic representations.
     Relevant problems:
     • To work on existing ontologies
     • To interface ontologies and linguistic expressions
     • To model the mapping task for federated questions

  5. Motivation
     The problems of ontological Q&A are well known. Why invest now?
     • Semantic resources are available (at least for English and some other languages): WordNet, FrameNet
     • Besides the ontology representation language issue, the Semantic Web will push the construction of many (interrelated) domain ontologies
     • Only “valuable” domain ontologies will survive after the initial jungle
     • Surviving domain ontologies should be representative of the portion of the world they are built for

  6. Motivation
     teacherOf(Professor(John Brown), Course(functional programming languages))

  7. Motivation

  8. Motivation

  9. Ontological Q&A Architecture
     [Figure: a question (Q?) enters and an answer (A!) is returned. Components: Query Analyzer, Query Matcher, Answer Generator, Knowledge Feeder. Resources: Lexical KB, Domain Ontology, Knowledge structure, www site.]

  10. Semantic Web Ontologies and Natural Language
      • Objective of an ontology: sharing knowledge among different (human or automatic) web users
      • Formal languages have been established: OWL (and, formerly, DAML+OIL, SHOE, ...)
      • Relevant concepts, e.g. ID: Course; Label: Course; Subclassof: Work
      • Relevant relationships, e.g. ID: teacherOf; Label: Teaches; Domain: #Faculty; Range: #Course
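The concept and relationship entries on this slide can be pictured as plain records. What follows is a minimal Python sketch, not the MOSES representation (the slide names OWL as the formal language). The class names `Concept` and `Relationship` and the helper `signature` are hypothetical; only the field values come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """One ontology concept, with the fields listed on the slide."""
    id: str
    label: str
    subclass_of: Optional[str] = None


@dataclass(frozen=True)
class Relationship:
    """One ontology relationship between a domain and a range concept."""
    id: str
    label: str
    domain: str
    range_: str  # trailing underscore avoids shadowing the builtin `range`


# Values copied from the slide.
course = Concept(id="Course", label="Course", subclass_of="Work")
teacher_of = Relationship(
    id="teacherOf", label="Teaches", domain="#Faculty", range_="#Course"
)


def signature(rel: Relationship) -> str:
    """Render a relationship as id(domain,range)."""
    return f"{rel.id}({rel.domain},{rel.range_})"


print(signature(teacher_of))
```

Rendering the relationship this way gives the same `teacherOf(#Faculty,#Course)` form that the next slide uses.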

  11. Semantic Web Ontologies and Natural Language
      teacherOf(#Faculty,#Course) can be expressed as:
      • John Brown gives the database course
      • Professor Brown delivers the database course
      • Prof. Brown teaches the database course
      “Linguistic Interfaces” to SW ontologies deal with synonymy: they cover all linguistic expressions used to convey
      • concepts
      • relationships
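The synonymy problem on this slide can be illustrated with a toy lookup from verb forms to an ontology relationship. This is only a sketch under strong assumptions: the table `VERB_TO_RELATION` and the function `relation_for` are hypothetical, the matching is on surface forms rather than lemmas, and no check is made on the types of the arguments.

```python
from typing import Optional

# Hypothetical lexical table: verb forms from the three example sentences,
# all conveying the same ontology relationship.
VERB_TO_RELATION = {
    "gives": "teacherOf",
    "delivers": "teacherOf",
    "teaches": "teacherOf",
}

SENTENCES = [
    "John Brown gives the database course",
    "Professor Brown delivers the database course",
    "Prof. Brown teaches the database course",
]


def relation_for(sentence: str) -> Optional[str]:
    """Return the relationship conveyed by the first known verb form, if any."""
    for token in sentence.lower().split():
        relation = VERB_TO_RELATION.get(token.strip(".,?!"))
        if relation is not None:
            return relation
    return None


for s in SENTENCES:
    print(s, "->", relation_for(s))
```

A verb such as "gives" is ambiguous outside this domain, so a bare lookup like this one would overgenerate; constraining the types of subject and object is what keeps the mapping safe.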

  12. “Linguistic Interface” to Domain Concepts
      Linguistic KB (e.g. WN) synsets: {person, individual, someone, somebody, mortal, human, soul}; {worker}; {adult, grownup}; {employee}; {professional}; {educator, pedagogue}; {academic, faculty member}; {professor}; {teacher, instructor}
      Domain Concept Hierarchy: person; student; employee; faculty; professor

  13. “Linguistic Interface” to Domain Relationships
      teach_course ==> tenere v insegnare v fare.
      props(teach_course(E),[
        (consequence(E, [relation(E,teacherOf), r_arg1(E,X), r_arg2(E,Z)]) :-
           nodeprop(E,lsubj(E,X)), X <- ewn4123(_),    /* human_1 */
           nodeprop(E,lobj(E,Z)),  Z <- ewn567704(_)   /* education_1 */
        )
      ]).
      (tenere, insegnare and fare are Italian verbs: “to hold”, “to teach”, “to do/make”.)
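Read informally, the rule on this slide says: an event expressed by one of the listed verbs yields teacherOf if its logical subject is a kind of human_1 and its logical object is a kind of education_1. The Python sketch below mimics that reading. It is not the rule interpreter used in MOSES: the event dictionary layout, the toy `HYPERNYMS` table and the function names are all assumptions made for illustration.

```python
from typing import Optional

# Toy hypernym table standing in for the lexical KB (assumption).
HYPERNYMS = {
    "professor": "educator",
    "educator": "human_1",
    "course": "education_1",
    "table": "furniture_1",
}


def is_a(concept: Optional[str], ancestor: str, hypernyms: dict) -> bool:
    """Walk up the hypernym chain and report whether `ancestor` is reached."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = hypernyms.get(concept)
    return False


def teach_course_rule(event: dict, hypernyms: dict) -> Optional[dict]:
    """Apply the slide's rule: typed subject and object give teacherOf."""
    subj, obj = event.get("lsubj"), event.get("lobj")
    if subj is None or obj is None:
        return None
    if is_a(subj["type"], "human_1", hypernyms) and is_a(
        obj["type"], "education_1", hypernyms
    ):
        return {
            "relation": "teacherOf",
            "r_arg1": subj["text"],
            "r_arg2": obj["text"],
        }
    return None


event = {
    "lsubj": {"text": "Prof. Brown", "type": "professor"},
    "lobj": {"text": "database course", "type": "course"},
}
print(teach_course_rule(event, HYPERNYMS))
```

The type checks are what let a generic verb such as fare trigger teacherOf only when its arguments fit the domain.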

  14. Ontological Q&A in the Web: a possible Scenario
      [Figure: a question Q and an answer A exchanged with a set of knowledge nodes K1, K2, ..., Kn, Kn+1.]

  15. Different Nodes ... Different Languages
      • Questions are posted to one node
      • Represented in its internal language
      • Translated for the node having the required content
      • Ontology Mapping
      • Ontology Mapping in a Multilingual environment

  16. Ontology Mapping in a Multilingual Environment
      [Figure: an English barman and an Italian customer, each with a beverage ontology.]
      Italian labels: BEVANDA (Beverage); CAPPUCCINO (Cappuccino); CAFFE (Coffee); MINI-CAPPUCCINO (Small Cappuccino); CAFFE MACCHIATO (Dashed Coffee); CAFFE RISTRETTO (Short Coffee); CAFFE LUNGO (Long Coffee); CAFFE CORRETTO (Corrected Coffee)
      English labels: BEVERAGE; COFFEE; ESPRESSO; CAPPUCCINO; LONG CAPPUCCINO
      The request CAFFE MACCHIATO CALDO (Hot Dashed Coffee) is shown against the candidates “Hot Dashed Coffee”, “Dashed Coffee” and “Coffee”, with the markers ?, ? and OK.

  17. Ontology Mapping in a Multilingual Environment
      • Concepts and relations in the two ontologies are labelled in different languages
      • The mapping algorithm needs a translation phase (possibly passing through a pivot language: English)
      • Structural differences: not all nodes of one ontology are represented in the other
      • Domain relations are managed differently in the two ontologies
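One way to picture translation through a pivot language combined with structural differences is the coffee example from the preceding slide: a label is glossed in English, looked up in the other ontology, and generalised to its parent when nothing matches. The sketch below is an illustration only, not the MOSES mapping algorithm. The parent links in `ITALIAN_PARENT` are an assumed reading of the figure, and all names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed parent links in the Italian ontology (a reading of the figure).
ITALIAN_PARENT = {
    "CAFFE MACCHIATO CALDO": "CAFFE MACCHIATO",
    "CAFFE MACCHIATO": "CAFFE",
    "CAFFE": "BEVANDA",
    "BEVANDA": None,
}

# English (pivot) glosses, as given in parentheses on the slide.
PIVOT_GLOSS = {
    "CAFFE MACCHIATO CALDO": "hot dashed coffee",
    "CAFFE MACCHIATO": "dashed coffee",
    "CAFFE": "coffee",
    "BEVANDA": "beverage",
}

# Concepts available in the English ontology, indexed by lower-cased label.
ENGLISH_CONCEPTS = {
    "beverage": "BEVERAGE",
    "coffee": "COFFEE",
    "espresso": "ESPRESSO",
    "cappuccino": "CAPPUCCINO",
    "long cappuccino": "LONG CAPPUCCINO",
}


def map_concept(concept: Optional[str]) -> Tuple[Optional[str], List[str]]:
    """Return the closest English concept and the glosses tried on the way."""
    tried: List[str] = []
    while concept is not None:
        gloss = PIVOT_GLOSS[concept]
        tried.append(gloss)
        if gloss in ENGLISH_CONCEPTS:
            return ENGLISH_CONCEPTS[gloss], tried
        concept = ITALIAN_PARENT[concept]  # generalise and retry
    return None, tried


print(map_concept("CAFFE MACCHIATO CALDO"))
```

Under these assumptions the two specific glosses fail and the more general one succeeds, which mirrors the ?, ?, OK sequence in the figure. The price is a loss of precision: the barman ends up serving a plain coffee.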

  18. Test-Bed
      • MOSES: a European project
      • Domain: university domain
        • two different universities
        • two different national education systems
      • Languages: Danish, Italian
      • Linguistic formalisms and theories
        • Danish site: POS-tagging: Brill (Brill, 1995); feature structure parser: PET software (Callmeier, 2000)
        • Italian site: robust and shallow syntactic parser (Basili, Pazienza, Zanzotto, 1998), (Basili, Zanzotto, 2002); a shallow semantic interpretation (Gaizauskas et al., 1997)

  19. Italian Process Architecture
      [Figure: stages Syntactic Analysis and Semantic Analysis; labels Q, Q-LF and Q-QLF; resource: WN1.5:EWN Base Ontology, with Attributes, Objects and Events.]

  20. An example of the Query Analysis in Italian
      Pipeline: Tokenizer; Named Entity Recognition; Morphological Analyser; POS Tagging; Chunking; SSA; Semantic Analysis
      Query: Chi insegna il corso di Database? (“Who teaches the Database course?”)

  21. An example of the Query Analysis in Italian
      Pipeline: Tokenizer; Named Entity Recognition; Morphological Analyser; POS Tagging; Chunking; SSA; Semantic Analysis
      Chunks: [Chi] [insegna] [il corso] [di Database] [?], labelled NPK VPK NPK PPK
      Dependencies: lsubj, lobj, mod
      props(teach_1(E),[
        (consequence(E, [relation(E,teacherOf), arg1(E,X), arg2(E,Z)]) :-
           lsubj(E,X), X <- person_c(_),
           lobj(E,Y), name(Y,'corso'),
           mod(Y,Z), Z <- topic_c(_))
      ]).

  22. An example of the Query Analysis in Italian
      Pipeline: Tokenizer; Named Entity Recognition; Morphological Analyser; POS Tagging; Chunking; SSA; Semantic Analysis
      Chunks: [Chi] [insegna] [il corso] [di Database] [?], labelled NPK VPK NPK PPK
      Dependencies: lsubj, lobj, mod
      Resulting query:
      list_all(X).
      relation(E,teacherOf), arg1(E, person_c(X)), arg2(E, course_c("Database"))
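The last step of the example, going from the dependencies (lsubj, lobj, mod) to the query form shown on this slide, can be sketched as follows. This is a deliberately small illustration, not the Italian processing chain: the `parse` dictionary stands in for the output of the earlier stages, and `TEACH_VERBS`, `INTERROGATIVES` and `build_query` are hypothetical names.

```python
from typing import Optional

TEACH_VERBS = {"insegna"}   # verb forms that trigger the teach rule (assumption)
INTERROGATIVES = {"chi"}    # "who": the subject becomes the variable to list


def build_query(parse: dict) -> Optional[str]:
    """Turn a dependency summary into the query form shown on the slide."""
    if parse["verb"] not in TEACH_VERBS or parse["lobj"] != "corso":
        return None
    subject_text = parse["lsubj"]
    if subject_text.lower() in INTERROGATIVES:
        # An interrogative subject is what the user wants listed.
        prefix, subject = "list_all(X). ", "person_c(X)"
    else:
        prefix, subject = "", f'person_c("{subject_text}")'
    topic = parse["mod"]
    return (
        f"{prefix}relation(E,teacherOf), "
        f'arg1(E, {subject}), arg2(E, course_c("{topic}"))'
    )


# Hand-built stand-in for the analysis of "Chi insegna il corso di Database?"
parse = {"verb": "insegna", "lsubj": "Chi", "lobj": "corso", "mod": "Database"}
print(build_query(parse))
```

The design point is that the interrogative pronoun is not resolved to an entity; it is kept as a variable that the query asks the knowledge base to enumerate.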

  23. Mapping University Ontologies
      [Figure: two faculty hierarchies side by side, labelled DK and IT, with the label Faculty between them.]
      DK: Lærersta (Faculty); Professorat (Professorship); Lektor (Associate Professor); Adjunkt (Assistant Professor); …; Professor (FullProfessor); GæsteProfessor (GuestProfessor)
      IT: Professore (Tenured Professor); TitolareCorso (Teaching Assistant); Ricercatore (Research Assistant); …; Ordinario (FullProfessor); ProfessoreAssociato (Associated Professor)
      QUESTION: all(x) (lektor(x) & CourseOffer(x,y) & Course(y) & Name(y, French))
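Once a correspondence between the two faculty hierarchies is available, the question on this slide can be restated in the other node's vocabulary by renaming predicates. The sketch below assumes a hand-written table `DK_TO_IT` built from the English glosses on the slide (FullProfessor on both sides; Associate Professor taken as equivalent to Associated Professor). That table and the function `translate_query` are hypothetical, and predicates without a known counterpart are left unchanged.

```python
import re

# Assumed correspondences, derived from the English glosses on the slide.
DK_TO_IT = {
    "professor": "Ordinario",
    "lektor": "ProfessoreAssociato",
}


def translate_query(query: str, mapping: dict) -> str:
    """Rename every predicate that has a counterpart in the target ontology."""

    def rename(match):
        name = match.group(1)
        return mapping.get(name.lower(), name) + "("

    # A predicate is a word immediately followed by an opening parenthesis.
    return re.sub(r"\b(\w+)\(", rename, query)


question = "all(x) (lektor(x) & CourseOffer(x,y) & Course(y) & Name(y, French))"
print(translate_query(question, DK_TO_IT))
```

Only the concept predicate changes here; relations such as CourseOffer would need their own mapping, since the two ontologies handle domain relations differently.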

  24. Open Issues and Conclusions
      • SW ontologies seem to be valuable resources for language processing
      • Ontological QA does not imply multilingual QA
      • Challenges:
        • learning “linguistic interfaces” for concepts and relationships
        • mapping algorithms between different ontologies (representing similar knowledge domains)
        • learning to map web documents into the ontological language (IE for web documents)

  25. Samples from a Specific Domain
      [Figure: an excerpt from a web page (Professor) and a possible interaction with a QA system.]
      The “Linguistic Interface” for Concepts is relevant

  26. Feature Structures for the Danish Processors
      undervisning_1 := lex-phrase &
        [ORTH "undervisning",
         SYNSEM.LOC [ARG-ST < [LOC.CONT #arg1], [LOC.CONT #arg2] >,
                     CONT verb-obj &
                       [LOA < CourseOffer & [TEACHER #arg1, COURSE #course],
                              CourseSubject & [WORK #course, SUBJECT #arg2] >]]].
      (Syntactic features are largely omitted here; “undervisning” is Danish for “teaching”.)

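  27. Danish Processing Architecture
      [Figure: an NL user query passes through a preprocessing module and a parsing module, producing the User Query Analysis. Preprocessing: Tokeniser (STO), Named Entity Recogniser, POS-tagger, Lemmatiser, with the resources Tokeniser lex, NER lex, POS lex and Lemmatiser lex. Parsing: Parser, with the resource Danish grm.]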
