Explore the fundamentals of parsing in NLP and CL, covering implementation, grammar writing, lexical ambiguity, and more. Discover basic issues like top-down vs. bottom-up, handling ambiguity, and recursive rules.
Parsing I: Context-free grammars and issues for parsers
Bibliography • More or less all books on CL or NLP have chapters on parsing, and some are all or mostly about parsing • Many are written for computer scientists • They explain linguistic things like POSs and PS grammars • They go into detail about implementation (e.g. Earley’s algorithm, shift-reduce parsers) • D Jurafsky & JH Martin, Speech and Language Processing, Upper Saddle River NJ (2000): Prentice Hall. Chs 9 & 10 • RM Kaplan, ‘Syntax’ = Ch 4 of R Mitkov (ed), The Oxford Handbook of Computational Linguistics, Oxford (2003): OUP • J Allen, Natural Language Understanding (2nd ed) (1994): Addison Wesley
Parsing • Bedrock of (almost) all NLP • Familiar from linguistics • But issues for NLP are practicalities rather than universality • Implementation • Grammar writing • Interplay with lexicons • Suitability of representation (what do the trees show?)
Basic issues • Top-down vs. bottom-up • Handling ambiguity • Lexical ambiguity • Structural ambiguity • Breadth first vs. depth first • Handling recursive rules • Handling empty rules
Some terminology • Rules written A → B c • Terminal vs. non-terminal symbols • Left-hand side (head): always non-terminal • Right-hand side (body): can be a mix of terminal and non-terminal symbols, any number of them • Unique start symbol (usually S) • ‘→’ means “rewrites as”, but is not directional (an “=” sign would be better)
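As a minimal sketch of this terminology, rules can be stored as (head, body) pairs, with word categories such as det, n and v defined by a lexicon. The names GRAMMAR, LEXICON, nonterminals and is_wellformed are illustrative assumptions, not part of the slides; the rules are those of the simple grammar used in the examples.

```python
# Hypothetical representation: each rule is a (head, body) pair.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]

# Word categories and the words that belong to them.
LEXICON = {
    "det": {"an", "the"},
    "n": {"elephant", "man"},
    "v": {"shot"},
}


def nonterminals(grammar):
    """Heads of rules are always non-terminal symbols."""
    return {head for head, _ in grammar}


def is_wellformed(grammar, lexicon, start):
    """Check that the start symbol has a rule and every body symbol is defined."""
    heads = nonterminals(grammar)
    known = heads | set(lexicon)
    return start in heads and all(
        symbol in known for _, body in grammar for symbol in body
    )


print(sorted(nonterminals(GRAMMAR)))
print(is_wellformed(GRAMMAR, LEXICON, "S"))
```

Keeping bodies as tuples lets a parser treat "any number of symbols" uniformly, whatever their mix of categories.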
1. Top-down with simple grammar • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Lexicon: det → {an, the}; n → {elephant, man}; v → shot • Input: the man shot an elephant • Expand S → NP VP, then NP → det n, matching det = the and n = man • Try the first VP rule, VP → v, matching v = shot • No more rules, but input is not completely accounted for… • So we must backtrack, and try the other VP rule
1. Top-down with simple grammar • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Lexicon: det → {an, the}; n → {elephant, man}; v → shot • Input: the man shot an elephant • After backtracking, try VP → v NP: v = shot, then NP → det n with det = an and n = elephant • No more rules, and input is completely accounted for
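The walk-through above can be sketched as a small top-down, depth-first recognizer with backtracking. This is an illustration under assumptions, not the implementation behind the slides: the function names expand, end_positions and recognize are made up, and rules are tried in the order they are written.

```python
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}


def expand(symbols, words, pos):
    """Yield every input position reached after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        # Word category: consume one word if it fits.
        if pos < len(words) and words[pos] in LEXICON[first]:
            yield from expand(rest, words, pos + 1)
        return
    for head, body in GRAMMAR:
        # Non-terminal: try each rule in turn; asking the generator
        # for another result is what backtracking amounts to here.
        if head == first:
            yield from expand(body + rest, words, pos)


def end_positions(sentence):
    """Positions at which an S starting at word 0 can end."""
    return list(expand(("S",), sentence.split(), 0))


def recognize(sentence):
    """True if some analysis accounts for the whole input."""
    return len(sentence.split()) in end_positions(sentence)


print(end_positions("the man shot an elephant"))  # [3, 5]
print(recognize("the man shot an elephant"))      # True
```

The first result, 3, is the analysis with VP → v that stops after shot; the second, 5, is the one found after backtracking to VP → v NP.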
Breadth-first vs depth-first (1) • When we came to the VP rule we were faced with a choice of two rules • “Depth-first” means following the first choice through to the end • “Breadth-first” means keeping all your options open • We’ll see this distinction more clearly later, and also see that it is quite significant
2. Bottom-up with simple grammar • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Lexicon: det → {an, the}; n → {elephant, man}; v → shot • Input: the man shot an elephant • Label the words: det n v det n • Build NP → det n over the man, VP → v over shot, then S → NP VP • We’ve reached the top, but input is not completely accounted for… • So we must backtrack, and try the other VP rule • Build NP → det n over an elephant, VP → v NP over shot an elephant, then S → NP VP • We’ve reached the top, and input is completely accounted for
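One way to sketch this bottom-up procedure is a shift-reduce recognizer that backtracks when it reaches the top too early. The function name shift_reduce is an assumption for illustration, and the sketch assumes a grammar with no empty rules and no cycles of single-symbol rules, since either would make it loop.

```python
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}


def shift_reduce(stack, words, goal="S"):
    """Return True if some sequence of shifts and reductions leaves just [goal]."""
    if not words and stack == [goal]:
        return True
    # Reduce: replace a rule body on top of the stack by the rule's head.
    for head, body in GRAMMAR:
        n = len(body)
        if n <= len(stack) and tuple(stack[-n:]) == body:
            if shift_reduce(stack[:-n] + [head], words, goal):
                return True
    # Shift: move the next word onto the stack under each category it can have.
    if words:
        for category, forms in LEXICON.items():
            if words[0] in forms:
                if shift_reduce(stack + [category], words[1:], goal):
                    return True
    # Neither move led to success: the caller backtracks.
    return False


print(shift_reduce([], "the man shot an elephant".split()))  # True
print(shift_reduce([], "the man shot an".split()))           # False
```

Reducing before shifting means the sketch first builds S over the man shot, finds input left over, and only then backs up to shift an elephant and use VP → v NP.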
Same again but with lexical ambiguity • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Lexicon: det → {an, the}; n → {elephant, man, shot}; v → shot • shot can be v or n
3. Top-down with lexical ambiguity • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Lexicon: det → {an, the}; n → {elephant, man, shot}; v → shot • Input: the man shot an elephant • Expand S → NP VP and NP → det n, then move on to the VP rules, VP → v and VP → v NP • Same as before: at this point, we are looking for a v, and shot fits the bill; the n reading never comes into play
4. Bottom-up with lexical ambiguity • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Lexicon: det → {an, the}; n → {elephant, man, shot}; v → shot • Input: the man shot an elephant • Label the words, with shot as both v and n • Apply NP → det n, VP → v, VP → v NP and S → NP VP wherever they fit • Terminology: graph, nodes, arcs (edges)
4. Bottom-up with lexical ambiguity • Input: the man shot an elephant • Let’s get rid of all the unused arcs • [Figure: the chart before and after pruning; only the arcs for S, VP, NP and NP remain above the word categories]
4. Bottom-up with lexical ambiguity • Input: the man shot an elephant • And let’s clear away all the arcs… • [Figure: the remaining nodes S, VP, NP and NP drawn as a tree over the words]
Breadth-first vs depth-first (2) • In chart parsing, the distinction is more clear-cut: • At any point there may be a choice of things to do: which arcs to develop • Breadth-first vs. depth-first can be seen as the order in which they are done • Queue (FIFO = breadth-first) vs. stack (LIFO = depth-first)
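The queue-versus-stack point can be sketched with an agenda of pending items, where the only difference between the two strategies is which end of the agenda the next item is taken from. This is a simplified top-down agenda rather than a full chart parser, and the name search and the (symbols, position) item format are assumptions for illustration.

```python
from collections import deque

GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}


def search(words, depth_first):
    """Return (found, trace), where trace lists agenda items in processing order."""
    agenda = deque([(("S",), 0)])  # item = (symbols still to find, input position)
    trace = []
    while agenda:
        # Stack (LIFO) for depth-first, queue (FIFO) for breadth-first.
        item = agenda.pop() if depth_first else agenda.popleft()
        trace.append(item)
        symbols, pos = item
        if not symbols:
            if pos == len(words):
                return True, trace
            continue
        first, rest = symbols[0], symbols[1:]
        if first in LEXICON:
            if pos < len(words) and words[pos] in LEXICON[first]:
                agenda.append((rest, pos + 1))
            continue
        options = [(body + rest, pos) for head, body in GRAMMAR if head == first]
        # A stack hands back the latest push first, so push in reverse
        # order to make the first-written rule the first one tried.
        agenda.extend(reversed(options) if depth_first else options)
    return False, trace


words = "the man shot an elephant".split()
found_df, trace_df = search(words, depth_first=True)
found_bf, trace_bf = search(words, depth_first=False)
print(found_df, found_bf)
print(trace_df[6], trace_bf[6])
```

Both runs succeed, but the traces differ from the VP choice onwards: depth-first follows VP → v to its dead end before touching VP → v NP, while breadth-first starts on VP → v NP before finishing VP → v.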
Same again but with structural ambiguity • Input: the man shot an elephant in his pyjamas • Grammar: S → NP VP; NP → det n; NP → det n PP; VP → v; VP → v NP; VP → v NP PP; PP → prep NP • Lexicon: det → {an, the, his}; n → {elephant, man, shot, pyjamas}; v → shot; prep → in • We introduce a PP rule in two places
5. Top-down with structural ambiguity • Input: the man shot an elephant in his pyjamas • Grammar: S → NP VP; NP → det n; NP → det n PP; VP → v; VP → v NP; VP → v NP PP; PP → prep NP • Expand S → NP VP and NP → det n, matching the man • At this point, depending on our strategy (breadth-first vs. depth-first) we may consider the NP complete and look for the VP, or we may try the second NP rule. Let’s see what happens in the latter case. • NP → det n PP needs PP → prep NP, with prep → in • The next word, shot, isn’t a prep, so this rule simply fails
5. Top-down with structural ambiguity • Input: the man shot an elephant in his pyjamas • Move on to the VP rules: VP → v; VP → v NP; VP → v NP PP • As before, the first VP rule works, but does not account for all the input • Similarly, if we try the second VP rule, and the first NP rule …
5. Top-down with structural ambiguity • Input: the man shot an elephant in his pyjamas • Depth-first: it’s a stack, LIFO • Breadth-first: it’s a queue, FIFO • So what do we try next: the third VP rule (VP → v NP PP) or the second NP rule (NP → det n PP)?
5. Top-down with structural ambiguity (depth-first) • Input: the man shot an elephant in his pyjamas • Grammar: S → NP VP; NP → det n; NP → det n PP; VP → v; VP → v NP; VP → v NP PP; PP → prep NP • [Figure: the parse tree reached first under the depth-first strategy]
5. Top-down with structural ambiguity (breadth-first) • Input: the man shot an elephant in his pyjamas • Grammar: S → NP VP; NP → det n; NP → det n PP; VP → v; VP → v NP; VP → v NP PP; PP → prep NP • [Figure: the parse tree reached first under the breadth-first strategy]
Recognizing ambiguity • Notice how the choice of strategy determines which result we get (first). • In both strategies, there are often rules left untried, on the list (whether queue or stack). • If we want to know if our input is ambiguous, at some point we do have to follow these through. • As you will see later, trying out alternative paths can be computationally quite intensive
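To check whether an input is ambiguous, the untried rules have to be followed through as well. A minimal sketch, assuming a top-down generator-based parser (the names parse, parse_seq and all_parses and the tuple format for trees are illustrative choices), collects every complete tree for the pyjamas sentence:

```python
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("NP", ("det", "n", "PP")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
    ("VP", ("v", "NP", "PP")),
    ("PP", ("prep", "NP")),
]
LEXICON = {
    "det": {"an", "the", "his"},
    "n": {"elephant", "man", "shot", "pyjamas"},
    "v": {"shot"},
    "prep": {"in"},
}


def parse(symbol, words, pos):
    """Yield (tree, end) for every way `symbol` can cover words[pos:end]."""
    if symbol in LEXICON:
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield (symbol, words[pos]), pos + 1
        return
    for head, body in GRAMMAR:
        if head == symbol:
            for children, end in parse_seq(body, words, pos):
                yield (head, *children), end


def parse_seq(symbols, words, pos):
    """Yield (trees, end) for every way to match a sequence of symbols."""
    if not symbols:
        yield (), pos
        return
    for tree, mid in parse(symbols[0], words, pos):
        for rest, end in parse_seq(symbols[1:], words, mid):
            yield (tree, *rest), end


def all_parses(sentence):
    """Keep only the analyses that account for the whole input."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


trees = all_parses("the man shot an elephant in his pyjamas")
print(len(trees))  # 2
for tree in trees:
    print(tree)
```

In this sketch the first tree attaches the PP inside the object NP (via NP → det n PP) and the second attaches it to the VP (via VP → v NP PP); that order is a consequence of trying rules depth-first in the order written.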
6. Bottom-up with structural ambiguity • Grammar: S → NP VP; NP → det n; NP → det n PP; VP → v; VP → v NP; VP → v NP PP; PP → prep NP • Input: the man shot an elephant in his pyjamas • Label the words: det n v det n prep det n • Rules applied bottom-up: VP → v; NP → det n; PP → prep NP; NP → det n PP; VP → v NP; VP → v NP PP; S → NP VP • [Figure: the chart, with several competing NP, VP and S arcs over the words]
Recursive rules • “Recursive” rules call themselves • We already have some recursive rule pairs: NP → det n PP and PP → prep NP • Rules can be immediately recursive: AdjG → adj AdjG, as in (the) big fat ugly (man)
Recursive rules • Left recursive: AdjG → AdjG adj; AdjG → adj • Right recursive: AdjG → adj AdjG; AdjG → adj • [Figure: a left-branching and a right-branching AdjG tree, each over big fat rich old]
7. Top-down with left recursion • Grammar: NP → det n; NP → det AdjG n; AdjG → AdjG adj; AdjG → adj • Input: the big fat rich old man • [Figure: after det = the, AdjG is expanded again and again by AdjG → AdjG adj] • You can’t have left-recursive rules with a top-down parser, even if the non-recursive rule is first
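Since a top-down parser cannot cope with left recursion, it can help to check a grammar for it before parsing. The following is a sketch under assumptions: the name left_recursive is made up, and the check assumes there are no empty rules (otherwise a later symbol in a body could also end up in first position).

```python
def left_recursive(grammar):
    """Return the non-terminals that can appear first in their own expansion."""
    heads = {head for head, _ in grammar}
    # starts[A] = symbols that can be the first symbol in an expansion of A.
    starts = {
        h: {body[0] for head, body in grammar if head == h and body}
        for h in heads
    }
    changed = True
    while changed:
        changed = False
        for h in heads:
            for symbol in list(starts[h]):
                # Whatever can start `symbol` can also start `h`.
                new = starts.get(symbol, set()) - starts[h]
                if new:
                    starts[h] |= new
                    changed = True
    return {h for h in heads if h in starts[h]}


LEFT = [
    ("NP", ("det", "n")),
    ("NP", ("det", "AdjG", "n")),
    ("AdjG", ("AdjG", "adj")),
    ("AdjG", ("adj",)),
]
RIGHT = [
    ("NP", ("det", "n")),
    ("NP", ("det", "AdjG", "n")),
    ("AdjG", ("adj", "AdjG")),
    ("AdjG", ("adj",)),
]

print(left_recursive(LEFT))   # {'AdjG'}
print(left_recursive(RIGHT))  # set()
```

The closure step also catches indirect left recursion, where a symbol comes back to itself through one or more other rules.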
7. Top-down with right recursion • Grammar: NP → det n; NP → det AdjG n; AdjG → adj AdjG; AdjG → adj • Input: the big fat rich old man • [Figure: NP → det AdjG n, with AdjG expanded as adj AdjG for big, fat and rich, then as adj for old, followed by n = man]
8. Bottom-up with left and right recursion • Grammar: NP → det n; NP → det AdjG n; AdjG → AdvG adj AdjG; AdjG → adj; AdvG → AdvG adv; AdvG → adv • Input: the very very fat ugly man • Label the words: det adv adv adj adj n • AdjG rule is right recursive, AdvG rule is left recursive • Quite a few useless paths, but overall no difficulty
Empty rules • For example: NP → det AdjG n; AdjG → adj AdjG; AdjG → ε • Equivalent to: NP → det AdjG n; NP → det n; AdjG → adj; AdjG → adj AdjG • Or: NP → det (AdjG) n; AdjG → adj (AdjG)
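The equivalence above can be computed mechanically: find the symbols that can rewrite as nothing, then write out each rule with and without them. This is a sketch with an assumed function name, remove_empty_rules, and it ignores the special case where the start symbol itself can be empty.

```python
from itertools import product


def remove_empty_rules(grammar):
    """Return an equivalent set of rules without empty bodies."""
    # A symbol is nullable if some rule rewrites it using only nullable symbols.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    result = set()
    for head, body in grammar:
        # Each nullable symbol may be kept or dropped; others must be kept.
        choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
        for combo in product(*choices):
            new_body = tuple(s for part in combo for s in part)
            if new_body:  # never emit an empty body
                result.add((head, new_body))
    return result


WITH_EMPTY = [
    ("NP", ("det", "AdjG", "n")),
    ("AdjG", ("adj", "AdjG")),
    ("AdjG", ()),  # AdjG → ε
]

for rule in sorted(remove_empty_rules(WITH_EMPTY)):
    print(rule)
```

For the example grammar this yields the four rules listed as equivalent on the slide.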
7. Top-down with empty rules • Grammar: NP → det AdjG n; AdjG → adj AdjG; AdjG → ε • Inputs: the man; the big fat man • [Figure: for the man, AdjG rewrites as ε straight away; for the big fat man, AdjG → adj AdjG applies for big and fat before AdjG → ε]
8. Bottom-up with empty rules • Grammar: NP → det AdjG n; AdjG → adj AdjG; AdjG → ε • Input: the fat man • Label the words: det adj n • [Figure: the chart, with several AdjG arcs built from AdjG → ε and AdjG → adj AdjG below the NP] • Lots of useless paths, especially in a long sentence, but otherwise no difficulty
Some additions to formalism • Recursive rules build unattractive tree structures: you’d rather have flat trees with unrestricted numbers of daughters • Kleene star: AdjG → adj* • [Figure: flat tree for the big fat old ugly man, with NP over det AdjG n and AdjG over adj adj adj adj]
Some additions to formalism • As grammars grow, the rule combinations multiply and it gets clumsy: NP → det n; NP → det AdjG n; NP → det n PP; NP → det AdjG n PP; NP → n; NP → AdjG n; NP → n PP; NP → AdjG n PP • Bracketed (optional) symbols collapse these into a single rule: NP → (det) (AdjG) n (PP)
Processing implications • Parsing with Kleene star • Neatly combines empty rules and recursive rules
9. Top-down with Kleene star • Grammar: NP → det adj* n • Inputs: the man; the big fat man • adj* = adj adj* • adj* = ε • [Figure: for the man, adj* rewrites as ε; for the big fat man, adj* rewrites as adj adj* for big and fat, then as ε]
10. Bottom-up with Kleene star • Grammar: NP → det adj* n • Inputs: the fat man; the fat ugly man • Label the words: det adj n; det adj adj n • [Figure: NP built directly over the word categories in each case]
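A minimal sketch of top-down matching with a starred symbol, following the two equations adj* = adj adj* and adj* = ε. The function name match and the convention of writing the star as a suffix on the category name are assumptions for illustration.

```python
LEXICON = {
    "det": {"the"},
    "adj": {"big", "fat", "old", "ugly"},
    "n": {"man"},
}


def match(symbols, words, pos=0):
    """True if `symbols` can account for exactly words[pos:]."""
    if not symbols:
        return pos == len(words)
    first, rest = symbols[0], symbols[1:]
    if first.endswith("*"):
        category = first[:-1]
        # adj* = adj adj*: consume one word and stay on the starred symbol.
        if pos < len(words) and words[pos] in LEXICON[category]:
            if match(symbols, words, pos + 1):
                return True
        # adj* = ε: move past the starred symbol without consuming anything.
        return match(rest, words, pos)
    return (
        pos < len(words)
        and words[pos] in LEXICON[first]
        and match(rest, words, pos + 1)
    )


NP_RULE = ("det", "adj*", "n")  # NP → det adj* n
print(match(NP_RULE, "the man".split()))          # True
print(match(NP_RULE, "the big fat man".split()))  # True
print(match(NP_RULE, "the big".split()))          # False
```

Because the recursive case always consumes a word first, the star behaves like a right-recursive rule plus an empty rule and cannot loop.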
Processing implications • Parsing with bracketed symbols • Parser has to expand rules • either in a single pass beforehand • or (better) on the fly (as it comes to them) • So bracketing convention is just a convenience for rule-writers • Example: NP → (det) (AdjG) n (PP) expands first to NP → det (AdjG) n (PP) and NP → (AdjG) n (PP)
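The single-pass expansion can be sketched as follows, assuming rule bodies are written as strings with optional symbols in round brackets. The function name expand_optional and the string format are illustrative assumptions.

```python
from itertools import product


def expand_optional(head, body):
    """Expand a body such as "(det) (AdjG) n (PP)" into plain rules."""
    choices = []
    for symbol in body.split():
        if symbol.startswith("(") and symbol.endswith(")"):
            # Optional symbol: either present or absent.
            choices.append([symbol[1:-1], None])
        else:
            choices.append([symbol])
    return [
        (head, tuple(s for s in combo if s is not None))
        for combo in product(*choices)
    ]


for rule in expand_optional("NP", "(det) (AdjG) n (PP)"):
    print(rule)
```

With three optional symbols this produces eight rules, the same eight NP rules that the bracketed rule was introduced to abbreviate.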
Top-down vs. bottom-up • Bottom-up builds many useless trees • Top-down can propose false trails, sometimes quite long, which are only abandoned when they reach the word level • Especially a problem if breadth-first • Bottom-up very inefficient with empty rules • Top-down CANNOT handle left-recursion • Top-down cannot do partial parsing, which is especially useful for speech • Wouldn’t it be nice to combine them to get the advantages of both?
Left-corner parsing • The “left corner” of a rule is the first symbol after the rewrite arrow • e.g. in S → NP VP, the left corner is NP. • Left-corner parsing starts bottom-up, taking the first item off the input and finding a rule for which it is the left corner. • This provides a top-down prediction, but we continue working bottom-up until the prediction is fulfilled. • When a rule is completed, apply the left-corner principle: is that completed constituent a left-corner?
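9. Left-corner with simple grammar • Grammar: S → NP VP; NP → det n; VP → v; VP → v NP • Input: the man shot an elephant • the is a det, the left corner of NP → det n, which predicts n; man completes the NP • NP is the left corner of S → NP VP, which predicts VP • shot is a v, the left corner of VP → v, which completes VP and S, but text not all accounted for, so try VP → v NP • an elephant completes the predicted NP by NP → det n, and with it VP and S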
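The left-corner procedure can be sketched as a recognizer in which each word is taken bottom-up and each rule it starts supplies top-down predictions for the rest of the body. The names parse, complete, parse_seq and recognize are assumptions, and so are the simplifications: no empty rules, no cycles of single-symbol rules, and no precomputed table to filter out left corners that cannot lead to the goal.

```python
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
LEXICON = {"det": {"an", "the"}, "n": {"elephant", "man"}, "v": {"shot"}}


def parse(goal, words, pos):
    """Yield end positions of a `goal` constituent starting at `pos`."""
    if pos >= len(words):
        return
    # Bottom-up step: take the next word and look up its categories.
    for category, forms in LEXICON.items():
        if words[pos] in forms:
            yield from complete(category, goal, words, pos + 1)


def complete(found, goal, words, pos):
    """`found` ends at `pos`; work upwards until it becomes `goal`."""
    if found == goal:
        yield pos
    for head, body in GRAMMAR:
        if body[0] == found:  # `found` is the left corner of this rule
            # Top-down step: the rest of the body is a prediction.
            for end in parse_seq(body[1:], words, pos):
                yield from complete(head, goal, words, end)


def parse_seq(symbols, words, pos):
    """Yield end positions after finding each predicted symbol in turn."""
    if not symbols:
        yield pos
        return
    for mid in parse(symbols[0], words, pos):
        yield from parse_seq(symbols[1:], words, mid)


def recognize(sentence):
    words = sentence.split()
    return len(words) in parse("S", words, 0)


print(list(parse("S", "the man shot an elephant".split(), 0)))  # [3, 5]
print(recognize("the man shot an elephant"))                    # True
```

As on the slide, the first S found ends after shot (via VP → v) and leaves input unaccounted for; the second, via VP → v NP, covers all five words.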