Grammars for Natural Language
Ahmad Abdollahzadeh Barfouroush
AILAB, Computer Engineering Faculty, Amirkabir University of Technology
Mehr 1381
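An Example of an NLU System Structure
[Figure: block diagram of an NLU system]
• Understanding path: Words (Input) --> Parsing --> Syntactic Structure and Logical Form --> Contextual Interpretation --> Final Meaning --> Application Reasoning
• Response path: Application Reasoning --> Meaning of Response --> Utterance Planning --> Syntactic Structure and Logical Form of Response --> Realization --> Words (Response)
• Knowledge sources: Lexicon, Grammars, Discourse Context, Application Context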
Grammar and Parsing
• To examine how the syntactic structure of a sentence can be computed, you must consider two things:
• Grammar: a formal specification of the allowable structures in the language.
• Parsing: the method of analysing a sentence to determine its structure according to the grammar.
Grammar and Language
• A grammar G generates a characteristic language L(G) and assigns structures to all s ∈ L(G).
• For a grammar G with start symbol S, L(G) = {x ∈ Σ* | S derives x}.
• Let X ∈ N and α, β, γ, δ ∈ (Σ ∪ N)* (i.e. sequences of terminal and non-terminal symbols).
• αXβ immediately derives αγβ iff X --> γ is a rule of G.
• αXβ derives αδβ if
  - αXβ immediately derives αδβ, or
  - αXβ immediately derives αγβ and αγβ derives αδβ.
• A grammar does not tell us how to generate L(G) or how to discover such structures.
Grammar Definition
• A grammar G is defined by a four-tuple, written G = (N, Σ, P, S0), where
  - N is the set of non-terminal symbols,
  - Σ is the set of terminal symbols,
  - P is the set of rewrite rules of the form α --> β, where α and β are strings of symbols,
  - S0 is the start symbol.
• In this definition N and Σ are disjoint sets.
• Only non-terminals can be rewritten, and they can occur on both sides of a rule.
An Example of a Grammar
  1- S --> NP VP
  2- VP --> V NP
  3- NP --> NAME
  4- NP --> ART N
  5- NAME --> Amir
  6- V --> ate
  7- ART --> the
  8- N --> biscuit
S0 = S
N = {S, NP, VP, NAME, ART, N, V}
P = {rules 1 to 8}
Σ = {ate, the, Amir, biscuit}
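The derivation relation can be made concrete with a small sketch. The code below is only an illustration, not part of the original slides: it encodes rules 1 to 8 (using the spelling "biscuit") and enumerates L(G) by breadth-first search over leftmost derivations. The names `RULES`, `immediately_derives`, and `language`, and the `max_steps` bound, are assumptions made for this example.

```python
from collections import deque

# Rules 1-8 of the example grammar; each right-hand side is a tuple of symbols.
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("NAME",), ("ART", "N")],
    "NAME": [("Amir",)],
    "V": [("ate",)],
    "ART": [("the",)],
    "N": [("biscuit",)],
}


def immediately_derives(form):
    """Yield every form obtained by rewriting the leftmost non-terminal once."""
    for i, symbol in enumerate(form):
        if symbol in RULES:
            for rhs in RULES[symbol]:
                yield form[:i] + rhs + form[i + 1:]
            return  # leftmost derivation: only the first non-terminal is rewritten


def language(start="S", max_steps=20):
    """Enumerate L(G) by breadth-first search (max_steps is a hypothetical bound)."""
    sentences = set()
    queue = deque([((start,), 0)])
    while queue:
        form, steps = queue.popleft()
        if all(symbol not in RULES for symbol in form):
            sentences.add(" ".join(form))  # only terminals left: a sentence
        elif steps < max_steps:
            for next_form in immediately_derives(form):
                queue.append((next_form, steps + 1))
    return sentences
```

For this grammar, `language()` returns the four sentences obtained by choosing "Amir" or "the biscuit" for each of the two NPs around "ate". The bound matters only for recursive grammars, whose language is infinite.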
Sentence Structure
• Two methods for representing sentence structure are:
  - Parse tree
  - Lists
Parse Tree
Man ate the apple
  S
    NP
      NAME  Man
    VP
      V  ate
      NP
        ART  the
        N  apple
Parse Tree
• Includes information about
  - precedence between constituents
  - dominance between constituents
• Constitutes a trace of the rule applications used to derive a sentence.
• Does not tell you the order in which the rules were used.
Lists
Man ate the apple
(S (NP (NAME Man)) (VP (V ate) (NP (ART the) (N apple))))
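The two representations carry the same information. As a minimal sketch (the tuple encoding and the function names are assumptions made for illustration), a parse tree can be stored as nested tuples and printed in the list notation:

```python
# Parse tree for "Man ate the apple" as nested tuples: (label, child, ...).
TREE = ("S",
        ("NP", ("NAME", "Man")),
        ("VP", ("V", "ate"),
               ("NP", ("ART", "the"), ("N", "apple"))))


def to_list_notation(node):
    """Render a tree in bracketed list notation."""
    if isinstance(node, str):  # a leaf is a word
        return node
    label, *children = node
    return "(" + " ".join([label] + [to_list_notation(c) for c in children]) + ")"


def leaves(node):
    """Return the words of the sentence in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]
```

Nesting encodes dominance and the order of children encodes precedence, so `to_list_notation(TREE)` reproduces the bracketed list and `leaves(TREE)` recovers the sentence.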
Chomsky's Hierarchy
• Different classes of grammar result from various restrictions on the form of rules.
• Grammars can be compared according to the range of languages each formalism can describe.
Types of Grammar in the Hierarchy
• Regular or Right Linear (Type 3): every rewrite rule is of the form X --> aY or X --> a, where X, Y ∈ N and a is a sequence of terminals.
• Context-free Grammar (CFG) (Type 2): every rewrite rule is of the form X --> α, where X ∈ N and α ∈ (Σ ∪ N)+.
• Context-sensitive Grammar (Type 1): every rewrite rule is of the form γ1Xγ2 --> γ1αγ2, where X ∈ N, γ1, γ2 ∈ (N ∪ Σ)* and α ∈ (N ∪ Σ)+.
• Unrestricted (Type 0): every rewrite rule is of the form α --> β, with no restriction on the rule.
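The restrictions can be checked mechanically, rule by rule. The following is a rough sketch under the definitions given on this slide (a non-empty terminal sequence in Type 3 rules, non-empty right-hand sides in Types 1 to 3); `rule_type` and `grammar_type` are hypothetical helper names, not part of the original material.

```python
def rule_type(lhs, rhs, nonterminals):
    """Return the most restrictive type (3, 2, 1 or 0) that one rule fits.

    lhs and rhs are tuples of symbols; any symbol not in `nonterminals`
    is treated as a terminal.
    """
    def is_nt(symbol):
        return symbol in nonterminals

    if len(lhs) == 1 and is_nt(lhs[0]) and rhs:
        body, last = rhs[:-1], rhs[-1]
        # Right linear: terminals only, optionally followed by one non-terminal.
        if not any(is_nt(s) for s in body) and (len(body) > 0 or not is_nt(last)):
            return 3
        return 2
    # Context-sensitive: lhs = g1 X g2 and rhs = g1 alpha g2 with alpha non-empty.
    for i, symbol in enumerate(lhs):
        if not is_nt(symbol):
            continue
        left, right = lhs[:i], lhs[i + 1:]
        alpha_len = len(rhs) - len(left) - len(right)
        if (alpha_len >= 1 and rhs[:len(left)] == left
                and rhs[len(rhs) - len(right):] == right):
            return 1
    return 0


def grammar_type(rules, nonterminals):
    """A grammar is only as restricted as its least restricted rule."""
    return min(rule_type(lhs, rhs, nonterminals) for lhs, rhs in rules)
```

For instance, with non-terminals {S, A, B}, the rule S --> a A is classified as Type 3, S --> A B as Type 2, and a A b --> a B A b as Type 1.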
Categorized Grammar
• A categorized grammar G is defined by a five-tuple, written G = (N, Σ, T, P, S0), where
  - N is the set of non-terminal symbols,
  - Σ is the set of terminal symbols,
  - T is the set of category (lexical) symbols T1, T2, …, Tn,
  - P is the set of rewrite rules of the form α --> β, where α and β are strings of symbols,
  - S0 is the start symbol.
• Σ is the union of the word sets of the categories: Σ = T1 ∪ T2 ∪ … ∪ Tn.
• Every category is defined by a rule of the form Ti --> ai1 | ai2 | … | ain.
An Example of a Categorized Grammar
  1- S --> NP VP
  2- NP --> Art NP2
  3- NP --> NP2
  4- NP2 --> Noun
  5- NP2 --> Adj NP2
  6- NP2 --> NP2 PP
  7- PP --> Prep NP
  8- PP --> Prep NP PP
  9- VP --> Verb
  10- VP --> Verb NP
  11- VP --> VP PP
S0 = S
N = {S, NP, VP, NP2, PP}
T = {Art, Noun, Adj, Prep, Verb}
Art = {a, the}
Noun = {Man, Woman, boy, cow, chicken}
Verb = {eat, run, put}
Adj = {old, young, heavy}
Prep = {in, by, of, over}
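One practical effect of categorization is that the syntactic rules never mention individual words. As a small sketch (the names `CATEGORIES`, `LEXICON`, and `categorize` are assumptions for illustration), the category definitions can be inverted into a lookup table that maps a sentence to the category string the rules 1 to 11 operate on:

```python
# Category definitions T_i --> a_i1 | a_i2 | ... from the example grammar.
CATEGORIES = {
    "Art": {"a", "the"},
    "Noun": {"Man", "Woman", "boy", "cow", "chicken"},
    "Verb": {"eat", "run", "put"},
    "Adj": {"old", "young", "heavy"},
    "Prep": {"in", "by", "of", "over"},
}

# Invert the definitions into a word -> category table.
# This works because no word belongs to two categories in this example.
LEXICON = {word: cat for cat, words in CATEGORIES.items() for word in words}


def categorize(sentence):
    """Replace each word by its lexical category (KeyError on unknown words)."""
    return [LEXICON[word] for word in sentence.split()]
```

Here `categorize("the old cow eat the chicken")` gives the sequence Art, Adj, Noun, Verb, Art, Noun. A word with several categories would need a table of sets instead of single values.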
Criteria for Evaluating Grammars
• Does it undergenerate?
• Does it overgenerate?
• Does it assign appropriate structures to the sentences it generates?
• Is it simple to understand? How many rules does it have?
• Does it contain generalisations or special cases?
• How ambiguous is it?
Overgeneration and Undergeneration
• Overgeneration: the grammar generates sentences that are not in the language. A grammar should generate only sentences in the language and reject all others.
• Undergeneration: the grammar fails to generate some sentences of the language. A grammar should generate all sentences in the language.
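Viewed as sets, the two failures are simply the two set differences between what the grammar generates and what the language contains. A minimal sketch, assuming both are available as finite samples (the function name and the toy sentences in the usage note are hypothetical):

```python
def evaluate_coverage(generated, language_sample):
    """Compare a grammar's output with a reference sample of the language."""
    generated, language_sample = set(generated), set(language_sample)
    return {
        # Produced by the grammar but not in the language.
        "overgenerated": generated - language_sample,
        # In the language but never produced by the grammar.
        "undergenerated": language_sample - generated,
    }
```

For example, comparing the output {"Amir ate the biscuit", "the biscuit ate Amir"} with the sample {"Amir ate the biscuit", "Amir ate"} reports "the biscuit ate Amir" as overgenerated and "Amir ate" as undergenerated.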
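Appropriate Structures
• A grammar should assign linguistically plausible structures.
• The following grammar generates "John ate a juicy hamburger" but groups "ate a juicy" into a VP, which is not a plausible constituent:
  S --> N VP N
  VP --> V ART ADJ
  N --> [John]
  V --> [ate]
  ART --> [a]
  ADJ --> [juicy]
  N --> [hamburger]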
Understandability/Generality
• Understandability: the grammar should be simple.
• Generality: the range of sentences the grammar analyzes correctly.
Ambiguity
  NP --> NP PP
  PP --> Prep NP
(the man)(on the hill with a telescope by the sea)
(the man on the hill)(with a telescope by the sea)
(the man on the hill with a telescope)(by the sea)
etc.
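How many analyses hide behind the "etc." can be counted with a small chart-style recursion. This is only a sketch: it treats each basic noun phrase ("the man", "the hill", ...) as a single NP token and each preposition as a Prep token, and the names `BINARY_RULES` and `count_parses` are assumptions made for the example.

```python
from functools import lru_cache

# The two rules from the slide, indexed by right-hand side.
BINARY_RULES = {("NP", "PP"): "NP", ("Prep", "NP"): "PP"}


def count_parses(tokens, goal="NP"):
    """Count the distinct parse trees that analyse `tokens` as a `goal` phrase."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j, label):
        # Number of ways tokens[i:j] can be parsed as `label`.
        if j - i == 1:
            return 1 if tokens[i] == label else 0
        total = 0
        for (left, right), parent in BINARY_RULES.items():
            if parent == label:
                for k in range(i + 1, j):  # every split point
                    total += count(i, k, left) * count(k, j, right)
        return total

    return count(0, len(tokens), goal)
```

With one, two, and three prepositional phrases after the head NP, the counts are 1, 2, and 5, so "the man on the hill with a telescope by the sea" has five structures under these two rules.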
Context-free Grammars (CFG)
• The CFG formalism is powerful enough to describe most of the structure in natural languages.
• CFGs are restricted enough that efficient parsers can be built to analyse sentences.
CFGs: Advantages and Disadvantages
• Advantages
  - Easy to write
  - Declarative
  - Linguistically natural (sometimes)
  - Well understood
  - Formal properties
  - Computationally effective
• Disadvantages
  - Notion of "head" is absent
  - Categories are unanalysable
Chomsky Normal Form (CNF)
• Suppose G = (N, Σ, P, S0) is a context-free grammar. G is in Chomsky Normal Form if every rule in P has one of the following forms:
  1) X --> Y Z, where X, Y, Z ∈ N, or
  2) X --> a, where a ∈ Σ.
• Every CFG can be converted by an algorithm into an equivalent grammar in CNF.
An Algorithm for Converting CFG to CNF
• For every grammar G = (N, Σ, P, S0) there is an equivalent grammar G' in Chomsky Normal Form, with rule set P'.
  1- Transfer every rule that is already of the form X --> Y Z (X, Y, Z ∈ N) or X --> a (a ∈ Σ) to P' unchanged.
  2- Consider every rule of the form X --> Y1 a1 Y2 … Yn. Each terminal symbol ai is replaced by a new non-terminal Xi, and the new rule Xi --> ai is added to P'.
  3- Step 2 produces rules of the form X --> Y1 Y2 … Yn. If n <= 2, transfer the rule directly to P' (unit rules X --> Y, if present, must be eliminated separately). If n > 2, new non-terminals are added as follows:
     X --> Y1 Z1, where Z1 = <Y2,…,Yn>
     Z1 --> Y2 Z2, where Z2 = <Y3,…,Yn>
     …
     Zn-2 --> Yn-1 Yn
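The three steps can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not a complete converter: the input grammar is assumed to contain no empty rules and no unit rules X --> Y, and the generated names `X_a` and `<Y2,…,Yn>` are assumed not to clash with existing symbols. The example rules at the bottom are hypothetical.

```python
def to_cnf(rules, nonterminals):
    """Convert CFG rules, given as (lhs, rhs-tuple) pairs, to CNF."""
    cnf = []

    def add(lhs, rhs):
        if (lhs, rhs) not in cnf:  # avoid duplicate rules
            cnf.append((lhs, rhs))

    def proxy(terminal):
        # Step 2: replace terminal a by a new non-terminal X_a with X_a --> a.
        name = f"X_{terminal}"
        add(name, (terminal,))
        return name

    for lhs, rhs in rules:
        if len(rhs) == 1:
            add(lhs, rhs)  # Step 1: X --> a is already in CNF
            continue
        symbols = [s if s in nonterminals else proxy(s) for s in rhs]
        # Step 3: X --> Y1 Z1, Z1 --> Y2 Z2, ... until two symbols remain.
        # A rule X --> Y Z that was already in CNF passes through unchanged.
        while len(symbols) > 2:
            rest = "<" + ",".join(symbols[1:]) + ">"
            add(lhs, (symbols[0], rest))
            lhs, symbols = rest, symbols[1:]
        add(lhs, tuple(symbols))
    return cnf


def is_cnf(rules, nonterminals):
    """Check that every rule is X --> Y Z or X --> a."""
    return all(
        (len(rhs) == 2 and all(s in nonterminals for s in rhs))
        or (len(rhs) == 1 and rhs[0] not in nonterminals)
        for _, rhs in rules
    )


# Hypothetical input grammar used only to exercise the sketch.
NONTERMINALS = {"S", "NP", "VP", "V", "PP"}
EXAMPLE_RULES = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP", "PP")),
    ("PP", ("in", "NP")),
    ("NP", ("Amir",)),
    ("V", ("ate",)),
]
CNF_RULES = to_cnf(EXAMPLE_RULES, NONTERMINALS)
```

In this example VP --> V NP PP becomes VP --> V <NP,PP> and <NP,PP> --> NP PP, while PP --> in NP becomes PP --> X_in NP together with X_in --> in.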
Greibach Normal Form (GNF)
• Suppose G = (N, Σ, T, P, S0) is a categorized context-free grammar. A rule of the form X --> a1 a2 … an is in GNF if a1 is in Σ or T and a2, …, an are non-terminals.
• If all rules in P are in GNF, then G is in GNF.
• So G must not contain rules of the form X --> ε.
• GNF can be reached by direct substitution.
An Example of CFG --> GNF
CFG
  1- S --> NP VP
  2- S --> NP VP PREPS
  3- NP --> Det NP2
  4- NP --> NP2
  5- NP2 --> Noun
  6- NP2 --> Adj NP2
  7- NP2 --> NP3 PREPS
  8- NP3 --> Noun
  9- PREPS --> PP
  10- PREPS --> PP PREPS
  11- PP --> Prep NP
  12- VP --> Verb
An Example of CFG --> GNF
GNF
  1a- S --> Det NP2 VP
  1b- S --> Noun VP
  1c- S --> Adj NP2 VP
  1d- S --> Noun PREPS VP
  2a- S --> Det NP2 VP PREPS
  2b- S --> Noun VP PREPS
  2c- S --> Adj NP2 VP PREPS
  2d- S --> Noun PREPS VP PREPS
  3- NP --> Det NP2
  4a- NP --> Noun
  4b- NP --> Adj NP2
  4c- NP --> Noun PREPS
  5- NP2 --> Noun
  6- NP2 --> Adj NP2
  7- NP2 --> Noun PREPS
  8- NP3 --> Noun
  9- PREPS --> Prep NP
  10- PREPS --> Prep NP PREPS
  11- PP --> Prep NP
  12- VP --> Verb
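The substitution that produces the GNF rules can be sketched as a short recursion: a rule whose first symbol is a non-terminal is replaced by one copy per expansion of that non-terminal, until every rule starts with a category. The sketch assumes the grammar has no left recursion (otherwise the recursion would not terminate) and takes CFG rule 6 as NP2 --> Adj NP2, which is what the GNF rules 1c and 6 imply. The names `CFG`, `CATEGORIES`, and `to_gnf` are assumptions for illustration.

```python
CATEGORIES = {"Det", "Noun", "Adj", "Prep", "Verb"}

# The example CFG, with rule 6 taken as NP2 --> Adj NP2 (assumption).
CFG = {
    "S": [("NP", "VP"), ("NP", "VP", "PREPS")],
    "NP": [("Det", "NP2"), ("NP2",)],
    "NP2": [("Noun",), ("Adj", "NP2"), ("NP3", "PREPS")],
    "NP3": [("Noun",)],
    "PREPS": [("PP",), ("PP", "PREPS")],
    "PP": [("Prep", "NP")],
    "VP": [("Verb",)],
}


def to_gnf(rules, categories):
    """Substitute leading non-terminals until every rule starts with a category."""
    def expand(rhs):
        head, tail = rhs[0], rhs[1:]
        if head in categories:
            return [rhs]  # already starts with a category
        # Replace the leading non-terminal by each of its (expanded) bodies.
        return [new + tail for body in rules[head] for new in expand(body)]

    return {lhs: [new for body in bodies for new in expand(body)]
            for lhs, bodies in rules.items()}


GNF = to_gnf(CFG, CATEGORIES)
```

For S --> NP VP this yields the four rules 1a to 1d, in that order, because NP expands to Det NP2, Noun, Adj NP2, and Noun PREPS.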
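Problems with Phrase Structure
• The shooting of the hunters was terrible.
  (The shooting) (of the hunters) (was terrible.)
  One phrase structure covers two readings: the hunters shoot, or the hunters are shot.
• The boy hit the ball. / The ball was hit by the boy.
  Two different phrase structures express the same meaning.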
Surface vs. Deep Structure
• Surface structure: the phrase structure of the current utterance.
• Deep structure: a canonical phrase structure that has the same meaning as the surface structure.
• Transformational grammar: rules that transform a deep phrase structure into surface phrase structures with the same meaning.
Deep Structure: "The boy hit the ball" and "The ball was hit by the boy"
  Sentence
    NP  The boy
    VP
      V  hit
      NP  the ball
Transformational Grammar (1965)
• Generates surface structure from deep structure.
• Syntactic component: Phrase-structure rules --> Deep structure --> Transformational rules --> Surface structure
• Semantic component: interprets the deep structure
• Phonological component: interprets the surface structure
Example of TG
• A context-free grammar generates the deep structure; then a set of transformations maps the deep structure to the surface structure.
  S
    NP
      ART  The
      N  cat
    VP
      AUX  will
      V  catch
      NP  man
Example of TG
• Yes/No question transformation:
  [S [NP ART N] [VP AUX V NP]]  =>  [S AUX [NP ART N] [VP V NP] ?]
• Result: Will the cat catch man?
  S
    AUX  Will
    NP
      ART  the
      N  cat
    VP
      V  catch
      NP  man
    ?
Transformational Grammar
• Base component: generates the deep structure.
• Transformational component: transforms the deep structure into the surface structure by using transformational rules.
• Transformational rules rearrange sentence elements, insert or delete elements, and/or replace one element with another.
• Example of a rule:
  NP + V + ed + NP  =>  Did + NP + V + NP + ?
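A transformation is an operation on trees, not on word strings. As a minimal sketch of the yes/no question transformation from the earlier example (the tuple encoding and the function names are assumptions, and the sketch expects an S with exactly an NP and a VP that contains an AUX):

```python
# Deep structure of "The cat will catch man" as nested tuples.
DEEP = ("S",
        ("NP", ("ART", "The"), ("N", "cat")),
        ("VP", ("AUX", "will"), ("V", "catch"), ("NP", "man")))


def yes_no_question(tree):
    """Move AUX out of the VP to the front of S and append '?'."""
    label, np, vp = tree
    aux = next(child for child in vp[1:] if child[0] == "AUX")
    rest = tuple(child for child in vp[1:] if child is not aux)
    return (label, aux, np, ("VP",) + rest, "?")


def words(node):
    """Read the words off a tree from left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in words(child)]


SURFACE = yes_no_question(DEEP)
```

Reading the leaves of `SURFACE` gives "will The cat catch man ?"; adjusting capitalisation is left out of the sketch.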
Grammar Types (1)
• Constraint-based Lexicalist Grammar (CBLG)
  - Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
• Categorial Grammar (CG)
  - König, E., LexGram: A Practical Categorial Grammar Formalism, Journal of Language and Computation, 2000. http://www.ims.uni-stuttgart.de/CUF/LexGram/LexGram.html
• Dependency Grammar (DG)
  - Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
Grammar Types (2)
• Link Grammar
  - Sleator, D. and Temperley, D., Parsing English with a Link Grammar, Carnegie Mellon University. http://www.cs.cmu.edu/project/link/tr91-196.html
• Lexical Functional Grammar (LFG)
  - Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
• Tree-Adjoining Grammar (TAG)
  - Allen, James, Natural Language Understanding, 1995.
Grammar Types (3)
• Generalized Phrase Structure Grammar (GPSG)
  - Sag, I. A. and Wasow, T., Syntactic Theory: A Formal Introduction, CSLI Publications, 1999.
• Head-driven Phrase Structure Grammar (HPSG)
  - Pollard, C. and Sag, I. A., Head-driven Phrase Structure Grammar, University of Chicago Press, 1994.
  - http://hpsg.stanford.edu/ : information about HPSG-related activities at the Center for the Study of Language and Information (CSLI) at Stanford University, and pointers to other resources on the web.
Grammar Types (4)
• Probabilistic Feature Grammar (PFG)
  - Goodman, Joshua, Probabilistic Feature Grammar, Harvard University, 1997. Goodman@eecs.harvard.edu