“Lexical Knowledge Structures” Seminar Presentation J. Ramanand (KReSIT) Advisor: Prof. Pushpak Bhattacharyya (CSE)
Outline • Some Problems • A Solution using Lexical Knowledge Structures • Lexical Knowledge Structures - Introduction • Critical Comparisons • SUMO – an upper ontology initiative • Conclusions
The Problems • How do you disambiguate ‘web’ in ‘the spider spun a web’ from ‘go surf the web’? • How do you summarise a long paragraph? • How do you automatically construct language phrasebooks for tourists? • Can a search query such as “a game played with bat and ball” be answered as “cricket”? • Can the emotional state of a person who blogs “I didn’t expect to win the prize!” be determined?
Lexical Knowledge Structures – A Solution? • Many of these issues can be (partially) resolved just by knowing more about the meaning of words, i.e. lexical semantics • We want a lexicon that provides not only dictionary- or thesaurus-like information, but also richer associations among words • The underlying structure has to be discovered
Solution - Key Elements • Machine Readable by NLP tools • Ease of Construction • Coherent Principles behind Construction • Storage Structure • Coverage • Usability • Quality
Lexical Knowledge Structures • Several examples currently exist • Mainly research efforts, though some, such as WordNet, are moving into mainstream use • Conceptually, directed acyclic graphs • Labeled relations between nodes • Relations range from Synonymy to Cause-Effect to Performs-Functions to Motivates
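The graph view above can be made concrete with a small sketch. This is only an illustration, not the storage format of any of the resources discussed: the class name `LexicalGraph`, its methods, the relation labels and the sample edges are all assumptions made for this example.

```python
from collections import defaultdict


class LexicalGraph:
    """Toy labeled, directed graph: nodes are concepts, edges carry a relation label."""

    def __init__(self):
        # source node -> list of (relation label, target node)
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        """Record one labeled edge from source to target."""
        self._edges[source].append((relation, target))

    def related(self, source, relation):
        """Return all targets reachable from source over edges with this label."""
        return [t for r, t in self._edges.get(source, []) if r == relation]


# Hypothetical sample edges, chosen only to show different relation types.
graph = LexicalGraph()
graph.add("web", "Synonym", "cobweb")
graph.add("rain", "Causes", "wet ground")
graph.add("spider", "PerformsFunction", "spin web")

print(graph.related("rain", "Causes"))
```

Keeping the relation label on the edge, rather than one graph per relation, lets a single structure hold Synonymy and Cause-Effect links side by side.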
WordNet • Lexical Matrix of words vs senses • Key application: WSD • Principles: Differentiation, Minimality, Coverage, Replaceability • Synsets are the nodes • Relations: Paradigmatic (Synonymy et al)
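As a rough illustration of how a words-vs-senses matrix supports WSD, the sketch below picks the sense whose gloss shares the most words with the sentence (a simplified Lesk-style overlap, chosen here for brevity and not described in the slides). The sense identifiers and glosses are invented for this example and are not real WordNet entries.

```python
# Toy lexical matrix: word form -> {sense id: gloss}. All entries are hypothetical.
LEXICAL_MATRIX = {
    "web": {
        "web.n.01": "a net of silk threads spun by a spider to catch insects",
        "web.n.02": "the network of linked pages on the internet that users surf and browse",
    },
}

STOPWORDS = {"a", "an", "the", "of", "to", "by", "on", "that", "and", "in"}


def tokens(text):
    """Lowercase the text, strip surrounding punctuation, and drop stopwords."""
    return {w.strip(".,!?'\"").lower() for w in text.split()} - STOPWORDS


def disambiguate(word, sentence):
    """Return the sense id whose gloss overlaps most with the sentence context."""
    context = tokens(sentence) - {word}
    senses = LEXICAL_MATRIX[word]
    return max(senses, key=lambda sense: len(context & tokens(senses[sense])))


print(disambiguate("web", "the spider spun a web"))
print(disambiguate("web", "go surf the web"))
```

With these toy glosses, the first sentence overlaps with the spider sense on "spider" and "spun", and the second overlaps with the internet sense on "surf".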
ConceptNet • Common Sense semantic network • Graph of simple concepts and rich relations • Relations: Mainly Syntagmatic • Common Sense data collected from volunteers
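A graph of simple concepts and relations can be sketched as a list of (concept, relation, concept) assertions. The example below is a toy under stated assumptions: the assertions and the relation names `IsA` and `PlayedWith` are made up for illustration and are not taken from the ConceptNet data. It shows how a query such as “a game played with bat and ball” could, in principle, be matched against such assertions.

```python
# Hypothetical common-sense assertions: (concept, relation, concept).
ASSERTIONS = [
    ("cricket", "IsA", "game"),
    ("cricket", "PlayedWith", "bat"),
    ("cricket", "PlayedWith", "ball"),
    ("tennis", "IsA", "game"),
    ("tennis", "PlayedWith", "racket"),
    ("tennis", "PlayedWith", "ball"),
]


def find_concepts(constraints):
    """Return, sorted, every concept for which all (relation, target) pairs are asserted."""
    facts = set(ASSERTIONS)
    subjects = {subject for subject, _, _ in ASSERTIONS}
    return sorted(
        s for s in subjects
        if all((s, relation, target) in facts for relation, target in constraints)
    )


# "a game played with bat and ball"
query = [("IsA", "game"), ("PlayedWith", "bat"), ("PlayedWith", "ball")]
print(find_concepts(query))
```

Turning the free-text query into the three constraints is assumed to have happened already; that parsing step is outside this sketch.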
HowNet • Network of “concepts” that describe the “objective world” • Principles: composition, evolution, construction • Basis: Identify sememes and combine them • Significant emphasis on the Chinese character philosophy
HowNet Example • Concept to be represented: “Teacher” • In HowNet, the concept is expressed as a combination of the sememes for “human” (entity), “teach” (event) and “education” (entity) • The HowNet record for “teacher” will have: • Its hypernym: “human” • Attribute(s): “education” • “Agent” relation to “teach” • Its part of speech: Noun
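The “teacher” record can be pictured as a small data structure. The sketch below is not HowNet's actual record syntax; the class `ConceptRecord` and its field names are assumptions made only to show how a concept is assembled from sememes.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptRecord:
    """Hypothetical record layout: a concept described by a combination of sememes."""
    word: str
    part_of_speech: str
    hypernym: str
    attributes: list = field(default_factory=list)
    relations: dict = field(default_factory=dict)  # relation name -> sememe

    def sememes(self):
        """Return the set of all sememes this concept is composed of."""
        return {self.hypernym, *self.attributes, *self.relations.values()}


teacher = ConceptRecord(
    word="teacher",
    part_of_speech="Noun",
    hypernym="human",
    attributes=["education"],
    relations={"Agent": "teach"},
)

print(sorted(teacher.sememes()))
```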
FrameNet • “Frame semantic” oriented approach - similar to senses • Situational frames • Frames: frame elements, lexical units • Complex structural representation • Rich set of annotations with illustrations of combinatorial possibilities of lexical units
FrameNet - e.g. • Frame for “Apply_Heat”: • Definition: A Cook applies heat to Food, with Temperature of the heat and Duration of application. A Heating_Instrument, generally indicated by a locative phrase, may also be expressed. Some methods involve the use of a Medium by which heat is transferred to the Food, which may be expressed in more than one phrase. The grammatically and semantically prominent one, usually an object NP, is marked Food1, and the other, usually in a prepositional phrase, is marked Food2. • Example: Sally FRIED an egg in butter. • FEs (sample): • (Core) Food: Food is the entity to which heat is applied by the Cook. • (Non-Core) Degree: This FE identifies the Degree to which heat application occurs. • Other attributes (sample): • Inherits From: Activity, Intentionally_affect; Is Used By: Cooking_creation • Lexical Units: bake.v, barbecue.v, blanch.v, boil.v, braise.v, broil.v ...
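The parts of a frame listed above can be sketched as a plain record. The class `Frame`, its field names and the helper `frames_evoked_by` are assumptions for illustration and do not reflect FrameNet's own data format; only the sample values shown on the slide are filled in, so the lists are deliberately incomplete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical container for a frame, its elements and its lexical units."""
    name: str
    core_elements: tuple
    non_core_elements: tuple
    inherits_from: tuple
    lexical_units: tuple


APPLY_HEAT = Frame(
    name="Apply_Heat",
    core_elements=("Food",),        # sample only, as on the slide
    non_core_elements=("Degree",),  # sample only, as on the slide
    inherits_from=("Activity", "Intentionally_affect"),
    lexical_units=("bake.v", "barbecue.v", "blanch.v", "boil.v", "braise.v", "broil.v"),
)


def frames_evoked_by(lexical_unit, frames):
    """Return the names of all frames that list the given lexical unit."""
    return [frame.name for frame in frames if lexical_unit in frame.lexical_units]


print(frames_evoked_by("boil.v", [APPLY_HEAT]))
```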
MindNet • Microsoft effort • Broad Coverage Parser run on corpus • Definition oriented • Hierarchical “sem-rels” • e.g.: “A pen is an instrument used for writing” =>Pen: • Hypernym: instrument • Tobj: • Purpose: writing • Collection and inversion
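One possible reading of “collection and inversion” is that the relations extracted from each definition are also indexed from the target word's side, so that “writing” leads back to “pen”. The sketch below follows that reading; the triple layout, the function `invert` and the `_of` suffix are assumptions for illustration, not MindNet's actual representation.

```python
from collections import defaultdict

# Relations read off the definition "A pen is an instrument used for writing".
SEM_RELS = [
    ("pen", "Hypernym", "instrument"),
    ("pen", "Purpose", "writing"),
]


def invert(sem_rels):
    """Index each relation by its target word, pointing back at the defined word."""
    inverted = defaultdict(list)
    for head, relation, target in sem_rels:
        inverted[target].append((relation + "_of", head))
    return dict(inverted)


inverted = invert(SEM_RELS)
print(inverted["writing"])
```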
Critical Comparisons - I • Domain • Principles of construction: • Constructive: HowNet • Differentiative: WordNet • Definition oriented: MindNet • Generative: FrameNet • Uncentralised: MindNet, ConceptNet • Method of Construction: • Manual: WordNet, HowNet, FrameNet • Automated: ConceptNet, MindNet
Critical Comparisons - II • Representation: • Record-oriented: HowNet, MindNet, FrameNet • Synsets: WordNet • Assertions: ConceptNet • Coverage and Quality: • High: WordNet, HowNet, FrameNet • Medium: MindNet, ConceptNet • Potentially Low: ConceptNet
SUMO • SUMO: Suggested Upper Merged Ontology • Examples: An upper ontology would have high level entities such as “Animal” or “Country” or “GovernmentalOrganisation”. A domain (lower) ontology would have “Blackbuck” or “India” or “BMCC” • Aim: New knowledge bases design, interoperability, integration with legacy DBs • Contains a set of entities or high level concepts • Hypernymy/Hyponymy tree structure
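The upper/domain split and the tree structure can be sketched with a simple parent map. This is a toy, not an excerpt from SUMO: the links below are assumptions based on the examples above, with a single root named `Entity`, and instance-of and subclass-of links are deliberately not distinguished.

```python
# child -> parent links; upper-level concepts near the root, domain concepts below.
PARENT = {
    "Animal": "Entity",
    "Country": "Entity",
    "Blackbuck": "Animal",  # assumed domain-level link
    "India": "Country",     # assumed domain-level link
}


def ancestors(concept):
    """Return the chain of parents from the concept up to the root."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, candidate):
    """True if candidate appears anywhere above concept in the tree."""
    return candidate in ancestors(concept)


print(ancestors("Blackbuck"))
print(is_a("India", "Animal"))
```

Because every concept has a single parent, a subsumption check is just a walk up the chain.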
Conclusions • Lexical structures can be applied to NLP, IR tasks • Increasing in coverage and utility • Key issues: Method of construction, quality of database, coverage • Key tradeoff: Quality vs. speed of collection