CS 460: Natural Language Processing
Combinatory Categorial Grammar
Hemant Noval (08005017), Saurabh Goyal (08005016), Mudit Malpani (08005020), Palak Dalal (08005034)
Guided by: Prof. Pushpak Bhattacharya
Contents • Introduction • Motivation • Categorial Grammar • Combinatory Categorial Grammar • Three parts of the formalism of CCG • Subtypes • Semantics • CCG and Parsing Algorithms • Conclusion
Introduction • Limitations of context-free grammar • “Peter is from England and Paul from Sweden” – the verb is elided in the second conjunct, so knowledge of context is needed to interpret it. • Crossing dependencies cannot be resolved, e.g. “John kicks skillfully the ball”.
Motivation (1/2) • Models based on lexical dependencies • The dependencies are typically derived from a context-free phrase structure tree • This does not work well for long-range dependencies: “Ram ka yeh baar baar Shyaam ke ghar jaana mujhe pasand nahi” (“I do not like this repeated going of Ram to Shyaam’s house”) • CCG • A “mildly context-sensitive” formalism • Provides the most linguistically satisfactory account of such dependencies • Aims to facilitate recovery of unbounded dependencies Ref - nlp.korea.ac.kr/~hjchung/sprg/summary/021104.ppt
Motivation (2/2) • Principle of compositionality – the meaning of a complex expression is determined by the meanings of its constituent expressions and the rules used to combine them • CCG has a close relation to (compositional) semantics: syntactic constituents combine as functions, according to a function–argument relationship • Cross-linguistic generalizations can be made easily, since the same set of rules applies everywhere • Arguably psychologically plausible, since processing can proceed in a left-to-right fashion
Categorial Grammar (1/2) • Categorial Grammar (CG) involves syntactic objects with well-defined syntactic types, or categories, and rules for combining them. • The rules of the grammar are entirely conditioned on lexical categories. • There are many categories but only a small set of applicable rules.
Categorial Grammar (2/2) • Categories • Primitive categories: N, NP, S, etc. • Man – N • The old man – NP • Function categories: combinations of primitive categories; more specifically, a function from one category to another. • S/NP • NP/N • (S\NP)/NP
Function Types A simple categorial grammar may have just two function types - • X/Y – the type of a phrase that results in a phrase of type X when followed (on the right) by a phrase of type Y. • X\Y – the type of a phrase that results in a phrase of type X when preceded (on the left) by a phrase of type Y. (This is the “result leftmost” notation used in the CCG rules on the following slides.)
Categorial Grammar • An English grammar might have three basic types (N, NP and S). Other types can be derived - • Adjective – N/N • Determiner – NP/N • Intransitive verbs – S\NP • Transitive verbs – (S\NP)/NP

The     bad     boy    made         that    mess
NP/N    N/N     N      (S\NP)/NP    NP/N    N
Combinatory Categorial Grammar • Combinatory categorial grammar (CCG) is an efficiently parseable, yet linguistically expressive grammar formalism. • CCG is mildly context sensitive. • Basic categorial grammar uses just forward and backward application combinators. • CCG also includes functional composition and type-raising combinators. • CCG provides incremental derivations (left to right) to the language.
Definition of CCG • A CCG G = (VT, VN, f, S, R) is defined as follows: • VT is the finite set of all terminals. • VN is the finite set of all nonterminals. These nonterminals are also called “atomic categories”, which can be combined into more complex functional categories using the backward operator \ or the forward operator /. • The function f maps terminals to sets of categories and corresponds to the first step in bottom-up parsing. • The unique starting symbol is denoted by S. • R is a finite set of combinatory rules.
Functional Application • The two basic rules used in Pure Categorial Grammar (AB Calculus) • Forward Application: (>) X/Y Y => X • Backward Application: (<) Y X\Y => X
Functional Application (Example)

Brazil    defeated     Germany
  np      (s\np)/np      np
          ------------------- >
                 s\np
----------------------------- <
              s

The     dog     bit          John
np/n     n      (s\np)/np     np
------------ >
     np
                ------------------ >
                       s\np
---------------------------------- <
                s
Functional Composition • Two functional types can compose if the domain of one type corresponds to the range of the other. • Forward Composition: (>B) X/Y Y/Z =>B X/Z • Backward Composition: (<B) Y\Z X\Y =>B X\Z
Functional Composition (Example)

Ram         likes        football
s/(s\np)    (s\np)/np      np
---------------------- >B
        s/np
------------------------------- >
           s
Type Raising • Type-raising combinators take elementary syntactic types (primitive types) to functor types. • Forward Type-Raising: (>T) X =>T T/(T\X) • Backward Type-Raising: (<T) X =>T T\(T/X)
Type Raising (Example)

Ram     likes        football
np      (s\np)/np      np
------- >T
s/(s\np)
---------------------- >B
        s/np
------------------------------- >
           s
Modifications • The rules above are order-preserving. • In some languages, certain words can be permuted without changing the meaning of the sentence. • E.g. “Kahn blocked skillfully a powerful shot by Rivaldo” instead of “skillfully blocked”. • Extra rules are needed to parse such sentences.
Crossed Composition • Forward Crossed Composition: (>Bx) X/Y Y\Z =>B X\Z • Forward crossed composition is generally considered inactive in the grammar of English, because it can induce highly ungrammatical scrambled orders. • Backward Crossed Composition: (<Bx) Y/Z X\Y =>B X/Z
Substitution • Allows a single resource to be utilized by two different functors. • Forward Substitution: (>S) (X/Y)/Z Y/Z =>S X/Z • Backward Substitution: (<S) Y\Z (X\Y)\Z =>S X\Z
Example

team    that            I     persuaded              everyone    to support
 n      (n\n)/(s/np)    np    ((s\np)/(s\np))/np      np/np       (s\np)/np
                       ---- >T
                     s/(s\np)
                              -------------------------------- >B
                                   ((s\np)/(s\np))/np
                              -------------------------------------------- >S
                                             (s\np)/np
                       ---------------------------------------------------- >B
                                             s/np
        -------------------------------------------------------------------- >
                                       n\n
----------------------------------------------------------------------------- <
                                        n
Subtypes (1/2) • With the categories described so far, ungrammatical phrases such as “John run” or “many coffee” would be accepted. • Introduce type hierarchies. • For example, John is not only an NP, it is also singular. • So we introduce NPsg as a subtype of NP. • Anything that requires an NP will accept an NPsg as well. • But a slot can specifically require NPsg, where a plain NP will not fit. http://www.wellnowwhat.net/blog/?p=294
Subtypes (2/2)

John    run
NPsg    S\NPpl
--------------- cannot apply

John    runs
NPsg    S\NPsg
--------------- <
      S

many        coffee
NPpl/Npl    Nmass
------------------ cannot apply

much            coffee
NPmass/Nmass    Nmass
---------------------- >
       NPmass

http://www.wellnowwhat.net/blog/?p=294
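The asymmetry on this slide (an NP slot accepts NPsg, but an NPsg slot rejects plain NP) is an ordinary subtype check. A minimal sketch, with a hypothetical hierarchy mapping each subtype to its immediate supertype:

```python
# Hypothetical subtype hierarchy: each subtype points to its supertype.
SUPERTYPE = {'np_sg': 'np', 'np_pl': 'np', 'np_mass': 'np',
             'n_sg': 'n', 'n_pl': 'n', 'n_mass': 'n'}

def satisfies(required, given):
    """A slot requiring `required` accepts `given` or any subtype of it."""
    while given != required:
        if given not in SUPERTYPE:
            return False      # climbed to the top without matching
        given = SUPERTYPE[given]
    return True

print(satisfies('np', 'np_sg'))   # True  -- anything wanting NP takes NPsg
print(satisfies('np_sg', 'np'))   # False -- a bare NP cannot fill an NPsg slot
```

With this check substituted for plain equality in the application rules, “John runs” goes through while “John run” fails, exactly as on the slide.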
Semantics (1/2) • The most common way to represent semantics is through predicate calculus and lambda terms. • Each word has a semantic content. • The proper noun John has the content John’ (the ’ distinguishing the semantic individual from the word with the same orthography). • The verb run has the content λx.run’(x). • The application rules become:

Y : y    X\Y : λv.p(v)
----------------------- <
       X : p(y)

X/Y : λv.p(v)    Y : y
----------------------- >
       X : p(y)

http://www.wellnowwhat.net/blog/?p=294
Semantics (2/2)

John     runs
NP       S\NP
John’    λx.run’(x)
-------------------- <
  S : run’(John’)

John     saw                  Frank
NP       (S\NP)/NP            NP
John’    λy.λx.see’(x, y)     Frank’
         ---------------------------- >
          S\NP : λx.see’(x, Frank’)
------------------------------------- <
        S : see’(John’, Frank’)

http://www.wellnowwhat.net/blog/?p=294
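Because the semantic side of each rule is just function application, the lambda terms on this slide can be mimicked directly with Python functions (a sketch; the tuple results stand in for the primed predicates):

```python
# Word meanings as Python functions standing in for the lambda terms.
john, frank = 'John', 'Frank'
run = lambda x: ('run', x)                 # λx.run'(x)
see = lambda y: (lambda x: ('see', x, y))  # λy.λx.see'(x, y)

# John runs: backward application feeds the subject to the verb meaning.
print(run(john))             # ('run', 'John')

# John saw Frank: forward application (object), then backward (subject).
saw_frank = see(frank)       # λx.see'(x, Frank')
print(saw_frank(john))       # ('see', 'John', 'Frank')
```

The order of the applications mirrors the derivation exactly: the syntactic rule that fires determines which meaning is the function and which is the argument.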
CCG and Parsing Algorithms • Normal CYK algorithm (discussed in class) • It is exhaustive: it explores all possible analyses of all possible spans, irrespective of whether such analyses are likely to be part of the highest-probability derivation. • Two methods to do better: • Adaptive supertagging • A* parsing
Adaptive Supertagging • Treats the assignment of lexical categories (supertags) as a sequence tagging problem. • Lexical categories are pruned to contain only those with high posterior probability. • It is this extensive pruning of lexical categories that leads to substantially faster parsing times. • The pruning threshold is relaxed whenever the parser fails to find an analysis. • The process either succeeds and returns a parse after some iteration, or gives up after a predefined number of iterations.
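The relax-and-retry loop can be sketched in a few lines. This is a schematic outline, not the actual Auli and Lopez implementation; `tag` and `parse` are hypothetical stand-ins for a real supertagger and chart parser, and the beta thresholds are illustrative:

```python
def adaptive_supertag_parse(sentence, tag, parse,
                            betas=(0.075, 0.03, 0.01, 0.005, 0.001)):
    """Retry parsing with progressively looser supertagger pruning.

    `tag(sentence, beta)` is assumed to keep, per word, only the categories
    whose posterior probability is within a factor beta of the best one;
    `parse(sentence, tags)` is assumed to return None when no spanning
    analysis exists with the given category assignments.
    """
    for beta in betas:
        result = parse(sentence, tag(sentence, beta))
        if result is not None:
            return result       # success at this pruning level
    return None                 # give up after the last threshold
```

Smaller beta means more categories survive pruning, so each retry trades speed for coverage; most sentences succeed at the tightest setting, which is where the speedup comes from.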
A* Parsing • A* search is an agenda-based, best-first graph search algorithm. • It finds the lowest-cost parse exactly, without necessarily traversing the entire search space. • Items are processed from a priority queue, which orders them by the product of their inside probability and a heuristic estimate of their outside probability. • If the heuristic is admissible, the solution is guaranteed to be exact. (Klein and Manning, 2003)
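The agenda ordering described above (inside probability times outside estimate) can be sketched with a standard priority queue. The items here are illustrative placeholders, not real chart edges:

```python
import heapq

def astar_pop_order(items):
    """Order agenda items by inside probability x outside estimate.

    `items` is a list of (label, inside, outside_estimate) triples.
    heapq is a min-heap, so scores are negated to pop the best item first.
    """
    agenda = [(-(inside * outside), label) for label, inside, outside in items]
    heapq.heapify(agenda)
    return [heapq.heappop(agenda)[1] for _ in range(len(agenda))]

items = [('a', 0.5, 0.1), ('b', 0.9, 0.2), ('c', 0.3, 0.9)]
print(astar_pop_order(items))   # ['c', 'b', 'a'] -- 0.27 > 0.18 > 0.05
```

Note that item 'c' wins despite its low inside probability: the outside estimate is what lets A* favor items likely to appear in a complete high-probability derivation, and admissibility of that estimate is what preserves exactness.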
Conclusion • Accurate, efficient wide-coverage parsing is possible with CCG. • CCG is mildly context-sensitive. • It uses functors and function rules for parsing sentences; the semantics is analyzed using lambda calculus / combinatory logic.
References (1/2) • “A Brief History of Grammar – Categorial Grammar (CG) and Combinatory Categorial Grammar (CCG)”, July 24th, 2009 (http://www.wellnowwhat.net/blog/?p=294) • Wikipedia: Combinatory categorial grammar (http://en.wikipedia.org/wiki/Combinatory_categorial_grammar) • Michael Auli and Adam Lopez, “Efficient CCG Parsing: A* versus Adaptive Supertagging”, ACL 2011 • Joon-Ho Lim, “Generative Models for Statistical Parsing with Combinatory Categorial Grammar”, NLP Lab., Korea University, 2002-10-23
References (2/2) • Daniel Gildea and Julia Hockenmaier, “Identifying Semantic Roles Using Combinatory Categorial Grammar”, University of Pennsylvania • Stephen Clark, “Building Deep Dependency Structures with a Wide-Coverage CCG Parser”, ACL 2002 • Jason Baldridge and Geert-Jan M. Kruijff, “Multi-Modal Combinatory Categorial Grammar”