CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 37: Semantics; Universal Networking Language). Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 12th April, 2011.
Semantics: wikipedia • Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning. • It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.
Computational Semantics: wikipedia • Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions. • Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution. • Methods employed usually draw from formal semantics or statistical semantics. • Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving). • Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.
A hurdle: signifier-denotata dichotomy • Divide between a word and what it stands for • “red” is NOT red in colour • “red wine”, “red rose”, “he is in the red” denote very different senses of the word • Translation into another language reveals this difference
A Perspective • Discourse • Pragmatics • Semantics • Syntax • Lexicon • Morphology
Our tryst with semantics: Universal Networking Language (UNL)
Motivation • Extraction of semantics, i.e., deep meaning, is important for many applications. • Machine Translation, Meaning-based IR, CLIR • Robust, scalable and efficient methods of knowledge extraction are required • Machine Translation and Cross Lingual IR: a need of the hour for crossing the language barrier
Interlingua: a vehicle for machine translation • [Figure: Hindi, English, Chinese and French linked through the Interlingua (UNL) by analysis and generation]
UNL: a United Nations project • Started in 1996 • 10 year program • 15 research groups across continents • First goal: generators • Next goal: analysers (requires solving various ambiguity problems) • Current active language groups • UNL_French (GETA-CLIPS, IMAG) • UNL_English+Hindi • UNL_Italian (Univ. of Pisa) • UNL_Portuguese (Univ. of São Paulo, Brazil) • UNL_Russian (Institute of Linguistics, Moscow) • UNL_Spanish (UPM, Madrid)
World-wide Universal Networking Language (UNL) Project • Language independent meaning representation. • [Figure: UNL linked to Marathi, English, Russian, Spanish, Japanese, Hindi and others]
Foundations and Applications • UNL Foundations • Semantic Relations • Universal Words • Attributes • How to write UNL expressions • UNL Applications • Machine Translation: Rule based and Statistical • Search • Text Entailment • Sentiment Analysis
Language Processing & Understanding • IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback • Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization • Machine Translation: Statistical, Interlingua Based, English-Indian languages, Indian languages-Indian languages, Indowordnet • Machine Learning: Semantic Role labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks • Resources: http://www.cfilt.iitb.ac.in • Publications: http://www.cse.iitb.ac.in/~pb • Linguistics is the eye and computation the body
UNL represents knowledge: John eats rice with a spoon • [Figure: UNL graph of the sentence, with its universal words, semantic relations and attributes labelled] • Repository of 42 semantic relations and 84 attribute labels
Sentence embeddings • Deepa claimed that she had composed a poem.
[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
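To make the notation concrete, here is a minimal Python sketch that reads relation lines of the form shown for the Deepa sentence. The names `Relation`, `split_args` and `parse_relation` are illustrative assumptions, not part of any UNL toolkit, and the sketch only covers the surface syntax visible on this slide (a relation label, an optional `:NN` scope id, and two arguments).

```python
import re
from typing import NamedTuple, Optional, Tuple


class Relation(NamedTuple):
    label: str            # relation name, e.g. "agt"
    scope: Optional[str]  # "01" for agt:01(...); None for the main sentence
    head: str             # first argument, attributes included
    tail: str             # second argument; may be a scope reference like ":01"


# Relation label, optional ":NN" scope id, then everything inside the outer parentheses.
_RELATION = re.compile(r"^\s*([a-z]+)(?::(\d+))?\((.*)\)\s*$")


def split_args(body: str) -> Tuple[str, str]:
    """Split 'head, tail' on the first comma that is not inside parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError(f"expected two arguments in {body!r}")


def parse_relation(line: str) -> Relation:
    match = _RELATION.match(line)
    if match is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    label, scope, body = match.groups()
    head, tail = split_args(body)
    return Relation(label, scope, head, tail)


LINES = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@indef)",
]

relations = [parse_relation(line) for line in LINES]
for relation in relations:
    print(relation)
```

The second relation has `:01` as its tail, and the last two relations carry scope `01`: that pairing is how the embedded clause "she had composed a poem" becomes the object of "claim".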
Constituents of Universal Networking Language • Universal Words (UWs) • Relations • Attributes • Knowledge Base
UNL Graph • He forwarded the mail to the minister. • [Figure: the entry node forward(icl>send).@entry.@past with three outgoing arcs: agt to he(icl>person), gol to minister(icl>person).@def, obj to mail(icl>collection).@def]
UNL Expression
agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
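Each node in such an expression bundles a headword, an optional constraint list and a run of attributes. The sketch below, with hypothetical names `Node`, `parse_node`, `outgoing` and `entry_headword`, splits nodes into those three parts and collects the arcs of the sentence "He forwarded the mail to the minister", taking the mail as the object and the minister as the goal. It is a toy reading of the notation, not an implementation of the UNL specification.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class Node(NamedTuple):
    headword: str                # e.g. "forward"
    constraint: str              # e.g. "icl>send"; empty for a basic UW
    attributes: Tuple[str, ...]  # e.g. ("entry", "past")


# Headword, optional "(constraint list)", then zero or more ".@attribute" suffixes.
_NODE = re.compile(r"^([^(@.]+)(?:\(([^)]*)\))?((?:\.@\w+)*)$")


def parse_node(text: str) -> Node:
    match = _NODE.match(text.strip())
    if match is None:
        raise ValueError(f"not a UNL node: {text!r}")
    headword, constraint, attrs = match.groups()
    return Node(headword, constraint or "", tuple(a for a in attrs.split(".@") if a))


# (relation label, head node, tail node)
EDGES: List[Tuple[str, str, str]] = [
    ("agt", "forward(icl>send).@entry.@past", "he(icl>person)"),
    ("obj", "forward(icl>send).@entry.@past", "mail(icl>collection).@def"),
    ("gol", "forward(icl>send).@entry.@past", "minister(icl>person).@def"),
]


def outgoing(edges: List[Tuple[str, str, str]]) -> Dict[str, Node]:
    """Map each relation label to the node it points at (a single head node is assumed)."""
    return {label: parse_node(tail) for label, _head, tail in edges}


def entry_headword(edges: List[Tuple[str, str, str]]) -> Optional[str]:
    """Return the headword of the first node that carries the @entry attribute."""
    for _label, head, tail in edges:
        for text in (head, tail):
            node = parse_node(text)
            if "entry" in node.attributes:
                return node.headword
    return None


arcs = outgoing(EDGES)
print(entry_headword(EDGES), {label: node.headword for label, node in arcs.items()})
```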
What is a Universal Word (UW)? • Words of UNL • Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions • A UW represents a concept • Basic UW (an English word/compound word/phrase with no restrictions or Constraint List) • Restricted UW (with a Constraint List ) • Examples: “crane(icl>device)” “crane(icl>bird)”
The Lexicon • Format of the dictionary entry: [headword] {} “Universal word” (Attribute list); • e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN); • Head word • Universal word • Attributes • Morphological - Pl (plural), V_ed (past tense form) • Syntactic - V (verb), VOA (verb of action) • Semantic - ANIMT (animate), PLACE, TIME
The Lexicon (contd.) • He forwarded the mail to the minister. • Content words (headword, universal word, attributes): [forward] {} “forward(icl>send)” (V,VOA) <E,0,0>; [mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>; [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;
The Lexicon (contd.) • He forwarded the mail to the minister. • Function words (headword, universal word, attributes): [he] {} “he” (PRON,SUB,SING,3RD) <E,0,0>; [the] {} “the” (ART,THE) <E,0,0>; [to] {} “to” (PRE,#TO) <E,0,0>;
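The dictionary entry format is regular enough to read with a single pattern. The following Python sketch is an illustration only: `LexEntry` and `parse_entry` are hypothetical names, and the trailing `<E,0,0>` group is kept as raw strings because the slides do not say what its fields mean.

```python
import re
from typing import NamedTuple, Tuple


class LexEntry(NamedTuple):
    headword: str
    universal_word: str
    attributes: Tuple[str, ...]
    flags: Tuple[str, ...]  # contents of the optional <...> group, uninterpreted


_ENTRY = re.compile(
    r'\[(?P<headword>[^\]]+)\]\s*'       # [headword]
    r'\{[^}]*\}\s*'                       # {} (always empty in the examples)
    r'["“”](?P<uw>[^"“”]*)["“”]\s*'       # "universal word", straight or curly quotes
    r'\((?P<attrs>[^)]*)\)\s*'            # (attribute list)
    r'(?:<(?P<flags>[^>]*)>\s*)?;'        # optional <...> group, then the closing ;
)


def _split(field: str) -> Tuple[str, ...]:
    return tuple(part.strip() for part in field.split(",") if part.strip())


def parse_entry(line: str) -> LexEntry:
    match = _ENTRY.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    return LexEntry(
        match["headword"],
        match["uw"],
        _split(match["attrs"]),
        _split(match["flags"] or ""),
    )


entry = parse_entry('[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;')
print(entry.headword, entry.universal_word, entry.attributes, entry.flags)
```

Accepting both straight and curly quotes is a convenience for text copied from slides; a real dictionary file would presumably use one quoting style throughout.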
Hindi example: example of a noun (1/2) • Universal word: farmer(icl>creator) • Head word (language flag): attributes • farmer (E): N,ANIMT,FAUNA,MML,PRSN • शेतकरी (M): N,M,ANIMT,FAUNA,MML,PRSN • किसान (H): N,M,ANIMT,FAUNA,MML,PRSN,Na
The Features of a UW • Every concept existing in any language must correspond to a UW • The constraint list should be as small as necessary to disambiguate the headword • Every UW should be defined in the UNL Knowledge-Base
Restricted UWs • Examples • He will hold office until the spring of next year. • The spring was broken. • Restricted UWs, which are Headwords with a constraint list, for example: “spring(icl>season)” “spring(icl>device)” “spring(icl>jump)” “spring(icl>fountain)”
How to create UWs? • Pick a concept • the concept of “crane” as “a device for lifting heavy loads” or as “a long-legged bird that wades in water in search of food” • Choose an English word for the concept. • In the case of “crane”, since it is a word of English, the corresponding word should be “crane” • Choose a constraint list for the word. • [ ] “crane(icl>device)” • [ ] “crane(icl>bird)”
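The three steps above amount to pairing an English headword with a constraint list. As a rough Python sketch (the helper names `make_uw`, `headword_of` and `group_by_headword` are assumptions made for illustration), a restricted UW can be assembled and a list of UWs grouped by headword to see which headwords are ambiguous:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def make_uw(headword: str, constraint: Optional[str] = None) -> str:
    """Return a basic UW (no constraint list) or a restricted UW."""
    return headword if not constraint else f"{headword}({constraint})"


def headword_of(uw: str) -> str:
    """Strip the constraint list, if any, from a UW."""
    return uw.split("(", 1)[0]


def group_by_headword(uws: Iterable[str]) -> Dict[str, List[str]]:
    """Collect the restricted UWs that share a headword."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for uw in uws:
        groups[headword_of(uw)].append(uw)
    return dict(groups)


UWS = [
    make_uw("crane", "icl>device"),
    make_uw("crane", "icl>bird"),
    make_uw("spring", "icl>season"),
    make_uw("spring", "icl>device"),
    make_uw("spring", "icl>jump"),
    make_uw("spring", "icl>fountain"),
]

groups = group_by_headword(UWS)
for headword, candidates in groups.items():
    print(headword, candidates)
```

Choosing among the candidates for a given sentence is the disambiguation problem itself, which this sketch does not attempt.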
English sentences: basic structure
• A <verb> B: John eats bread
agt(eat.@entry, John)
obj(eat.@entry, bread)
• A <verb>: John sleeps
aoj(sleep.@entry, John)
• A <be> B: John is good
aoj(good.@entry, John)
[Figure: a verb node linked to A and B by relations R1 and R2; for the <be> pattern, a single aoj arc to A]
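The three patterns can be written as small templates. This Python sketch simply reproduces the relations used on the slide; the function names are hypothetical, and the choice of `aoj` for the intransitive case follows the "John sleeps" example rather than any general rule (other verbs may call for a different relation).

```python
from typing import List


def transitive(a: str, verb: str, b: str) -> List[str]:
    """A <verb> B: A is the agent, B the object, and the verb is the entry node."""
    return [f"agt({verb}.@entry, {a})", f"obj({verb}.@entry, {b})"]


def intransitive(a: str, verb: str) -> List[str]:
    """A <verb>: uses aoj, as in the slide's 'John sleeps' example."""
    return [f"aoj({verb}.@entry, {a})"]


def copular(a: str, b: str) -> List[str]:
    """A <be> B: the predicate B becomes the entry node; the copula itself is dropped."""
    return [f"aoj({b}.@entry, {a})"]


for lines in (
    transitive("John", "eat", "bread"),
    intransitive("John", "sleep"),
    copular("John", "good"),
):
    print("\n".join(lines))
```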
Hindi sentences: basic structure
• A B <verb>: John roti khaataa hai
agt(eat.@entry, John)
obj(eat.@entry, bread)
• A <verb>: John sotaa hai
aoj(sleep.@entry, John)
• A <be> B: John acchaa hai
aoj(good.@entry, John)
[Figure: a verb node linked to A and B by relations R1 and R2; for the <be> pattern, a single aoj arc to A]
Complex English sentences: use recursion on the basic structure
• A <verb> B: John who is a good boy eats bread which is toasted
agt(eat.@entry, :01)
obj(eat.@entry, :02)
aoj:01(boy, John.@entry)
mod:01(boy, good)
obj:02(toast, bread.@entry.@focus)
[Figure: eat with an agt arc to scope :01 (boy, John, good; aoj and mod arcs) and an obj arc to scope :02 (toast, bread; obj arc). Red arrows indicate entry nodes.]
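The recursion can be made explicit by grouping relations by scope and expanding each scope reference in place. The Python sketch below is a toy illustration with assumed names (`group_by_scope`, `entry_of`, `expand`); the toast relation is placed in scope `:02`, the scope that stands for the bread phrase in the main clause.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Rel = Tuple[str, str, str]  # (label, head, tail)

# (scope id or None for the main sentence, label, head, tail)
RELATIONS: List[Tuple[Optional[str], str, str, str]] = [
    (None, "agt", "eat.@entry", ":01"),
    (None, "obj", "eat.@entry", ":02"),
    ("01", "aoj", "boy", "John.@entry"),
    ("01", "mod", "boy", "good"),
    ("02", "obj", "toast", "bread.@entry.@focus"),
]


def group_by_scope(relations) -> Dict[Optional[str], List[Rel]]:
    scopes: Dict[Optional[str], List[Rel]] = defaultdict(list)
    for scope, label, head, tail in relations:
        scopes[scope].append((label, head, tail))
    return dict(scopes)


def entry_of(rels: List[Rel]) -> Optional[str]:
    """Return the headword of the node marked @entry within one scope."""
    for _label, head, tail in rels:
        for node in (head, tail):
            name, *attrs = node.split(".")
            if "@entry" in attrs:
                return name
    return None


def expand(scope: Optional[str], scopes: Dict[Optional[str], List[Rel]]) -> dict:
    """Recursively replace scope references such as ':01' by the sub-graph they name."""
    def resolve(node: str):
        return expand(node[1:], scopes) if node.startswith(":") else node

    rels = scopes[scope]
    return {
        "entry": entry_of(rels),
        "relations": [(label, resolve(head), resolve(tail)) for label, head, tail in rels],
    }


tree = expand(None, group_by_scope(RELATIONS))
print(tree["entry"])
for label, _head, sub in tree["relations"]:
    print(label, "->", sub["entry"], sub["relations"])
```

Each scope has its own entry node (John in `:01`, bread in `:02`), which is what lets the outer relation treat the whole sub-graph as a single argument.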