LING 438/538 Computational Linguistics Sandiway Fong Lecture 28
Administrivia • Homework 7 was due today • finally done grading Homework 6 • you should have received an email from me
Administrivia • Rules • Email me your slides BEFORE class • Each talk is 8 minutes (2 mins for questions) • We’ll be constrained for time, so be well prepared and well practiced
Administrivia • Spend a few minutes to fill out TCE forms at the end of today …
Background Reading • JM • Chapter 12: Formal Grammars of English • Chapter 13: Syntactic Parsing
Top-Down Parsing • we already know one top-down parsing algorithm • DCG rule system starting at the top node • using the Prolog computation rule • always try the first matching rule • expand x --> y, z. • top-down: x then y and z • left-to-right: do y then z • depth-first: expands DCG rules for y before tackling z • problems • left-recursion • gives termination problems (but fixable by grammar transformation) • no bottom-up filtering • inefficient (be aware there are more efficient algorithms) • left-corner idea … discussed back in Lecture 14, i.e. we don’t need to try all possible rules if we pre-compute some information about the grammar … we’ll take a look at this in a moment
Top-Down Parsing • Prolog computation rule • is equivalent to the stack-based algorithm shown in the 1st edition of the textbook • not in 2nd edition • Prolog advantage • we don’t need to implement this explicitly • this is default Prolog strategy • We can concentrate our efforts on implementing language phenomena…
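As a small illustration of the points above, here is a sketch of a DCG that Prolog runs top-down, left-to-right and depth-first with no explicit parser. The grammar and lexicon are toy assumptions, not the course grammar, and the helper non-terminals np_base and np_rest are hypothetical names used to show one common transformation that removes left recursion.

```prolog
% Toy grammar for illustration only (not the course grammar).
s(s(NP,VP)) --> np(NP), vp(VP).

% A left-recursive rule such as
%     np(np(NP,PP)) --> np(NP), pp(PP).
% would make top-down, depth-first expansion call np again before
% consuming any input. The transformed rules below accept the same
% strings and still build a left-branching np tree.
np(NP)           --> np_base(NP0), np_rest(NP0, NP).
np_base(np(D,N)) --> det(D), noun(N).
np_rest(NP0, NP) --> pp(PP), np_rest(np(NP0,PP), NP).
np_rest(NP, NP)  --> [].

pp(pp(P,NP)) --> prep(P), np(NP).
vp(vp(V,NP)) --> verb(V), np(NP).
vp(vp(V))    --> verb(V).

det(det(the))    --> [the].
noun(noun(dog))  --> [dog].
noun(noun(cat))  --> [cat].
verb(verb(saw))  --> [saw].
prep(prep(with)) --> [with].
```

For example, the query `phrase(s(T), [the,dog,saw,the,cat])` binds T to `s(np(det(the),noun(dog)),vp(verb(saw),np(det(the),noun(cat))))`.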
Left Corner Parsing • Top-down parsing can use bottom-up filtering • filter top-down rule expansion using bottom-up information • current input provides the bottom-up information • left-corner idea • example • s(s(NP,VP)) --> np(NP), vp(VP). • what terminals can be used to begin this phrase? • answer: whatever can begin NP • np(np(D,N)) --> det(D), nominal(N). • np(np(PN)) --> propernoun(PN). • (recursive) answer: whatever can begin Det or ProperNoun • det(det(that)) --> [that]. • det(det(this)) --> [this]. • det(det(a)) --> [a]. • propernoun(propn(houston)) --> [houston]. • propernoun(propn(twa)) --> [twa]. • Final answer: • {that,this,a,houston,twa} • [Figure: “Left Corner” tree: s dominates np and vp; np dominates det and nominal, or propernoun]
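The recursive computation of the left-corner set can be sketched in a few lines of Prolog. The encoding of the grammar as rule/2 and lex/2 facts, and the predicate names left_corner/2 and lc_set/2, are assumptions made for this sketch. It also assumes the grammar has no left-recursive rules and no empty categories; otherwise left_corner/2 would not terminate.

```prolog
% Grammar from the slide, re-encoded as facts (this encoding is an assumption).
rule(s,  [np, vp]).
rule(np, [det, nominal]).
rule(np, [propernoun]).

lex(det, that).
lex(det, this).
lex(det, a).
lex(propernoun, houston).
lex(propernoun, twa).

% left_corner(+Cat, -Word): Word is a terminal that can begin a Cat phrase.
left_corner(Cat, Word) :-
    lex(Cat, Word).
left_corner(Cat, Word) :-
    rule(Cat, [First|_]),          % only the leftmost daughter matters
    left_corner(First, Word).

% lc_set(+Cat, -Words): sorted, duplicate-free left-corner set of Cat.
lc_set(Cat, Words) :-
    setof(W, left_corner(Cat, W), Words).
```

The query `lc_set(s, Ws)` gives `Ws = [a,houston,that,this,twa]`, the same set as on the slide (setof/3 returns it in sorted order).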
Left Corner Parsing • example • does this flight include a meal? • Grammar rules (augmented with Left Corner terminal sets) • s(s(NP,VP)) --> np(NP), vp(VP). LC: {that,this,a,houston,twa} • s(s(Aux,NP,VP)) --> aux(Aux), np(NP), vp(VP). LC:{does} • s(s(VP)) --> vp(VP). LC:{book,include,prefer} • only rule 2 is compatible with the input • Algorithm • match first input terminal against left-corner (LC) set for each possible matching rule • left-corner idea prunes away or rules out options 1 and 3
Left Corner Parsing • DCG Rules • s(s(NP,VP)) --> np(NP), vp(VP). LC: {that,this,a,houston,twa} • s(s(Aux,NP,VP)) --> aux(Aux), np(NP), vp(VP). LC: {does} • s(s(VP)) --> vp(VP). LC: {book,include,prefer} • Implementation: • lc1, lc2, and lc3 are dummy non-terminals: they neither contribute to structure nor consume input
Left Corner Parsing • Prolog code (not grammar rules) • for efficiency, once(Goal) means Goal can only succeed once • L is the input sentence • we look at its first word W but don’t “consume” the word
Left Corner Parsing • grammar rules are always compiled into Prolog code • the last two arguments of each nonterminal are always lists
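The code shown on these slides did not survive in this transcript, so the following is a reconstruction from the annotations, not the original code. It assumes that the dummy non-terminals lc1, lc2 and lc3 are placed at the front of the three s rules, and that each is written as ordinary Prolog with the two list arguments spelled out. The table lc/2 and the small lexicon are assumptions for illustration.

```prolog
% lc(RuleNumber, Word): left-corner sets from the slides, stored as facts.
lc(1, that).  lc(1, this).  lc(1, a).  lc(1, houston).  lc(1, twa).
lc(2, does).
lc(3, book).  lc(3, include).  lc(3, prefer).

% Plain Prolog, not grammar rules: both list arguments are explicit.
% L is the remaining input. It is returned unchanged, so the first
% word W is inspected but not consumed.
lc1(L, L) :- L = [W|_], once(lc(1, W)).
lc2(L, L) :- L = [W|_], once(lc(2, W)).
lc3(L, L) :- L = [W|_], once(lc(3, W)).

% The s rules, each guarded by its left-corner check.
s(s(NP,VP))     --> lc1, np(NP), vp(VP).
s(s(Aux,NP,VP)) --> lc2, aux(Aux), np(NP), vp(VP).
s(s(VP))        --> lc3, vp(VP).

% Remaining rules and lexicon (assumed fragment).
np(np(D,N))     --> det(D), nominal(N).
np(np(PN))      --> propernoun(PN).
nominal(nom(N)) --> noun(N).
vp(vp(V,NP))    --> verb(V), np(NP).
vp(vp(V))       --> verb(V).

aux(aux(does))             --> [does].
det(det(that))             --> [that].
det(det(this))             --> [this].
det(det(a))                --> [a].
noun(noun(flight))         --> [flight].
noun(noun(meal))           --> [meal].
verb(verb(book))           --> [book].
verb(verb(include))        --> [include].
verb(verb(prefer))         --> [prefer].
propernoun(propn(houston)) --> [houston].
propernoun(propn(twa))     --> [twa].
```

With `phrase(s(T), [does,this,flight,include,a,meal])`, the first and third s rules fail at their lc check without expanding np or vp, and only the second rule builds a tree.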
Left Corner Parsing • Note: • not all rules need be rewritten • lexicon rules are direct left-corner rules, so no filtering is necessary • det(det(a)) --> [a]. • noun(noun(book)) --> [book]. • i.e. no need to call lcN as in • det(det(a)) --> lc11,[a]. • noun(noun(book)) --> lc12,[book]. • (only one aux in our grammar)
Left Corner Parsing s = skip, not single step
Bottom-Up Parsing • LR(0) parsing • An example of bottom-up tabular parsing • Similar to the top-down Earley algorithm described in the textbook in that it uses the idea of dotted rules • finite state automata revisited…
Tabular Parsing • e.g. LR(k) (Knuth, 1965) • invented for efficient parsing of programming languages • disadvantage: a potentially huge number of states can be generated when the number of rules in the grammar is large • can be applied to natural languages (Tomita 1985) • build a Finite State Automaton (FSA) from the grammar rules, then add a stack • tables encode the grammar (FSA) • grammar rules are compiled • no longer interpret the grammar rules directly • Parser = Table + Push-down Stack • table entries contain instruction(s) that tell what to do at a given state … possibly factoring in lookahead • stack data structure deals with maintaining the history of computation and recursion
Tabular Parsing • Shift-Reduce Parsing • example • LR(0) • left to right • bottom-up • (0) no lookahead (input word) • Three possible machine actions • Shift: read an input word • i.e. advance current input word pointer to the next word • Reduce: complete a nonterminal • i.e. complete parsing a grammar rule • Accept: complete the parse • i.e. start symbol (e.g. S) derives the terminal string
Tabular Parsing • LR(0) Parsing • L(G) = LR(0) • i.e. the language generated by grammar G is LR(0) if there is a unique instruction per state (or no instruction = error state), i.e. parsing is deterministic! • LR(0) is a proper subset of context-free languages • note • human language tends to be ambiguous • there are likely to be multiple or conflicting actions per state • if we are using Prolog, we can let Prolog’s computation rule handle it • via Prolog backtracking
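To make the three machine actions concrete, here is a minimal sketch of a shift-reduce parser in Prolog. It is deliberately not table-driven: there is no state machine, and whenever both a reduce and a shift are possible, Prolog backtracking tries the alternatives. The fact encodings rule/2 and word/2, the predicate names, and the toy lexicon are all assumptions for illustration.

```prolog
:- use_module(library(lists)).

% Toy grammar and lexicon (assumed for this sketch).
rule(s,  [np, vp]).
rule(np, [d, n]).
rule(np, [n]).
rule(vp, [v, np]).
rule(vp, [v]).

word(d, a).
word(n, man).
word(n, milk).
word(v, hit).

% parse(+Words, -Tree): start with an empty stack.
parse(Words, Tree) :-
    sr(Words, [], Tree).

% sr(+Input, +Stack, -Tree): Stack holds trees, most recent first.
% Accept: input exhausted and a single s tree is left on the stack.
sr([], [Tree], Tree) :-
    functor(Tree, s, _).
% Reduce: the top of the stack matches the right-hand side of a rule.
sr(Input, Stack, Tree) :-
    rule(LHS, RHS),
    pop(RHS, Stack, Kids, Rest),
    Node =.. [LHS|Kids],
    sr(Input, [Node|Rest], Tree).
% Shift: move the next word (with its category) onto the stack.
sr([W|Input], Stack, Tree) :-
    word(Cat, W),
    Leaf =.. [Cat, W],
    sr(Input, [Leaf|Stack], Tree).

% pop(+RHS, +Stack, -Kids, -Rest): remove trees matching RHS from the
% stack (the last RHS symbol is on top) and return them in rule order.
pop(RHS, Stack, Kids, Rest) :-
    reverse(RHS, RevRHS),
    pop_(RevRHS, Stack, [], Kids, Rest).

pop_([], Rest, Kids, Kids, Rest).
pop_([Cat|Cats], [T|Stack], Acc, Kids, Rest) :-
    functor(T, Cat, _),
    pop_(Cats, Stack, [T|Acc], Kids, Rest).
```

The query `parse([a,man,hit,milk], T)` yields `T = s(np(d(a),n(man)),vp(v(hit),np(n(milk))))`. Along the way the parser first tries reducing `hit` to an intransitive vp, fails to finish the sentence, and backtracks to the shift; an LR table is what lets a parser avoid some of this wasted search.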
Tabular Parsing • Dotted rule notation • “dot” used to track the progress of a parse through a phrase structure rule • examples • vp --> v . np means we’ve seen v and predict np • np --> . d np means we’re predicting a d (followed by np) • vp --> vp pp . means we’ve completed a vp • state • a set of dotted rules encodes the state of the parse • set of dotted rules = name of the state • kernel • vp --> v . np • vp --> v . • completion (of predict NP) • np --> . d n • np --> . n • np --> . np cp
Tabular Parsing • compute possible states by advancing the dot • example: • (Assume d is next in the input) • vp --> v . np • vp --> v . (eliminated) • np --> d . n • np --> . n (eliminated) • np --> . np cp
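The kernel/completion split and the dot-advancing step correspond to the two standard LR(0) operations, usually called closure and goto. Below is a sketch under assumed representations: a dotted rule is a term item(LHS, BeforeDot, AfterDot), the grammar is stored as rule/2 facts, and states are kept as sorted lists. Note that the standard goto keeps only the items whose dot actually moved over the symbol, so after d the new state contains just np --> d . n.

```prolog
:- use_module(library(lists)).

% Grammar fragment from the slides, as facts (encoding assumed).
rule(vp, [v, np]).
rule(vp, [v]).
rule(np, [d, n]).
rule(np, [n]).
rule(np, [np, cp]).

% closure(+Items, -State): repeatedly add X --> . RHS for every
% non-terminal X that appears right after a dot, until nothing new
% is added. States are sorted lists, so duplicates disappear.
closure(Items, State) :-
    sort(Items, S0),
    findall(item(X, [], RHS),
            ( member(item(_, _, [X|_]), S0),
              rule(X, RHS) ),
            Predicted),
    append(S0, Predicted, All0),
    sort(All0, All),
    (   All == S0
    ->  State = S0
    ;   closure(All, State)
    ).

% goto(+State, +Symbol, -Next): advance the dot over Symbol in every
% item that allows it, then close the resulting kernel.
goto(State, Symbol, Next) :-
    findall(item(L, Before1, After),
            ( member(item(L, Before, [Symbol|After]), State),
              append(Before, [Symbol], Before1) ),
            Kernel),
    Kernel \== [],
    closure(Kernel, Next).
```

Starting from the kernel `[item(vp,[v],[np]), item(vp,[v],[])]`, closure/2 adds the three np items for a state of five dotted rules, and `goto(State, d, Next)` then gives `Next = [item(np,[d],[n])]`.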
Tabular Parsing • Dotted rules • example • State 0: • s --> . np vp • np --> . d n • np --> . n • np --> . np pp • possible actions • shift d and go to new state • shift n and go to new state • Creating new states • State 0 --shift d--> State 1: np --> d . n • State 0 --shift n--> State 2: np --> n .
Tabular Parsing • State 1: Shift N, goto State 3 • State 1 --shift n--> State 3: np --> d n .
Tabular Parsing • Shift • take input word, and • put it on the stack • [animation: starting in State 0 with input [D a] [N man] [V hit] … and an empty stack, shift [D a] (State 1), then shift [N man] (State 3); the stack then holds [N man] on top of [D a], and the input is [V hit] …]
Tabular Parsing • State 2: Reduce action • np --> n .
Tabular Parsing • Reduce NP -> N . • pop [N milk] off the stack, and • replace with [NP [N milk]] on stack • [animation: State 2; stack [N milk] becomes [NP milk]; input [V is] …]
Tabular Parsing • State 3: Reduce • np --> d n .
Tabular Parsing • Reduce NP -> D N . • pop [N man] and [D a] off the stack • replace with [NP[D a][N man]] • [animation: State 3; stack [N man] [D a] becomes [NP[D a][N man]]; input [V hit] …]
Tabular Parsing • State 0: Transition NP • State 0 --np--> State 4: • s --> np . vp • np --> np . pp • vp --> . v np • vp --> . v • vp --> . vp pp • pp --> . p np
Tabular Parsing • for both states 2 and 3 • NP -> N . (reduce NP -> N) • NP -> D N . (reduce NP -> D N) • after Reduce NP operation • goto state 4 • notes: • states are unique • grammar is finite • procedure generating states must terminate since the number of possible dotted rules is finite • no left recursion problem (bottom-up means input driven)
Tabular Parsing • Observations • table is sparse • example • State 0, Input: [V ..] • parse fails immediately • in a given state, input may be irrelevant • example • State 2 (there is no shift operation) • there may be action conflicts • example • State 0: shift D, shift N (only if word is ambiguous…) • more interesting cases • shift-reduce and reduce-reduce conflicts
Tabular Parsing • finishing up • an extra initial rule is usually added to the grammar • SS --> S $ • SS = start symbol • $ = end of sentence marker • input: • milk is good for you $ • accept action • discard $ from input • return element at the top of stack as the parse tree
LR Parsing in Prolog • Recap • finite state machine technology + a stack • each state represents a set of dotted rules • Example • s --> . np vp • np --> . d n • np --> . n • np --> . np pp • we transition, i.e. move, from state to state by advancing the “dot” over the possible terminal and nonterminal symbols
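LR State Machine [animation] (each state lists its dotted rules, then its outgoing transitions)
• State 0: ss --> . s $ ; s --> . np vp ; np --> . np pp ; np --> . n ; np --> . d n (s: State 1, np: State 4, d: State 2, n: State 3)
• State 1: ss --> s . $ ($: State 13)
• State 2: np --> d . n (n: State 12)
• State 3: np --> n .
• State 4: s --> np . vp ; np --> np . pp ; vp --> . v np ; vp --> . v ; vp --> . vp pp ; pp --> . p np (vp: State 8, pp: State 5, v: State 7, p: State 6)
• State 5: np --> np pp .
• State 6: pp --> p . np ; np --> . np pp ; np --> . n ; np --> . d n (np: State 11, d: State 2, n: State 3)
• State 7: vp --> v . np ; vp --> v . ; np --> . np pp ; np --> . n ; np --> . d n (np: State 10, d: State 2, n: State 3)
• State 8: s --> np vp . ; vp --> vp . pp ; pp --> . p np (pp: State 9, p: State 6)
• State 9: vp --> vp pp .
• State 10: vp --> v np . ; np --> np . pp ; pp --> . p np (pp: State 5, p: State 6)
• State 11: pp --> p np . ; np --> np . pp ; pp --> . p np (pp: State 5, p: State 6)
• State 12: np --> d n .
• State 13: ss --> s $ .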
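The state machine above can drive a parser directly: "Parser = Table + Push-down Stack". The sketch below is one possible encoding, not the course implementation. The transition facts trans/3 were read off the dotted rules of each state (advancing the dot over a symbol leads to the state containing the advanced rule), final/3 records the completed rule in a state together with the number of daughters to pop, and the lexicon word/2 is a toy assumption. Where a state allows both a shift and a reduce, Prolog backtracking explores both.

```prolog
:- use_module(library(lists)).

% trans(State, Symbol, NextState): edges of the state machine.
trans(0, s, 1).    trans(0, np, 4).   trans(0, d, 2).   trans(0, n, 3).
trans(2, n, 12).
trans(4, vp, 8).   trans(4, pp, 5).   trans(4, v, 7).   trans(4, p, 6).
trans(6, np, 11).  trans(6, d, 2).    trans(6, n, 3).
trans(7, np, 10).  trans(7, d, 2).    trans(7, n, 3).
trans(8, pp, 9).   trans(8, p, 6).
trans(10, pp, 5).  trans(10, p, 6).
trans(11, pp, 5).  trans(11, p, 6).

% final(State, LHS, N): State contains a completed rule LHS --> (N symbols).
final(3, np, 1).   final(12, np, 2).  final(5, np, 2).
final(7, vp, 1).   final(10, vp, 2).  final(9, vp, 2).
final(8, s, 2).    final(11, pp, 2).

% Toy lexicon (assumed): word(Word, Category).
word(a, d).      word(man, n).   word(dog, n).   word(stick, n).
word(hit, v).    word(with, p).

% parse(+Words, -Tree): append the end marker and start in State 0.
parse(Words, Tree) :-
    append(Words, ['$'], Input),
    lr(Input, [0], [], Tree).

% lr(+Input, +StateStack, +TreeStack, -Tree)
% Accept: State 1 (ss --> s . $) with only the end marker left.
lr(['$'], [1|_], [Tree], Tree).
% Shift: push the word's tree and the new state.
lr([W|Ws], [S|Ss], Trees, Tree) :-
    word(W, Cat),
    trans(S, Cat, S1),
    Leaf =.. [Cat, W],
    lr(Ws, [S1,S|Ss], [Leaf|Trees], Tree).
% Reduce: pop N trees and N states, then follow the LHS transition.
lr(Input, [S|Ss], Trees, Tree) :-
    final(S, LHS, N),
    length(Kids0, N),
    append(Kids0, Trees1, Trees),
    reverse(Kids0, Kids),
    length(Popped, N),
    append(Popped, [S0|Ss0], [S|Ss]),
    trans(S0, LHS, S1),
    Node =.. [LHS|Kids],
    lr(Input, [S1,S0|Ss0], [Node|Trees1], Tree).
```

The query `parse([a,man,hit,a,dog], T)` gives `T = s(np(d(a),n(man)),vp(v(hit),np(d(a),n(dog))))`. For `[a,man,hit,a,dog,with,a,stick]` backtracking finds two trees, one attaching the pp to the np and one attaching it to the vp, which is the shift-reduce conflict in State 10 being resolved both ways. The accept step here simply checks for the end marker in State 1 rather than shifting it into State 13.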
Build Actions • two main actions • Shift • move a word from the input onto the stack • Example: • np --> . d n • Reduce • build a new constituent • Example: • np --> d n .
Lookahead • LR(1) • a shift/reduce tabular parser • using one (terminal) lookahead symbol • decide on whether to take a reduce action depending on state × next input symbol • Example • select the valid reduce operation by consulting the next word • cf. LR(0): select an action based on just the current state
Lookahead • potential advantage • the input symbol may partition the action space • resulting in fewer conflicts • provided the current input symbol can help to choose between possible actions • potential disadvantages • larger finite state machine • more possible dotted rule/lookahead combinations than just dotted rule combinations • might not help much • depends on grammar • more complex (off-line) computation • building the LR machine gets more complicated
Lookahead • formally • X --> α . Y β, L • L = lookahead set • L = set of possible terminals that can follow X • α, β: (possibly empty) strings of terminals/non-terminals • example • State 0 • ss --> . s $ [[]] • s --> . np vp [$] • np --> . d n [p,v] • np --> . n [p,v] • np --> . np pp [p,v]
Lookahead • central idea • for propagating lookahead in state machine • if dotted rule is complete, • lookahead informs parser about what the next terminal symbol should be • example • NP --> D N. , L • reduce by NP rule provided current input symbol is in set L
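A small sketch of how the lookahead set filters reduce actions. Items are represented here as item(LHS, BeforeDot, AfterDot, Lookahead), which is an assumed encoding. The example state is State 7 of the LR state machine shown earlier (vp --> v . np, vp --> v ., plus the predicted np rules); its lookahead sets were worked out by hand for this illustration and are not taken from the slides.

```prolog
:- use_module(library(lists)).

% state(Id, Items): State 7 with hand-computed lookahead sets (assumption).
state(7, [ item(vp, [v], [np],     ['$', p]),
           item(vp, [v], [],       ['$', p]),
           item(np, [],  [np, pp], ['$', p]),
           item(np, [],  [n],      ['$', p]),
           item(np, [],  [d, n],   ['$', p]) ]).

% action(+Items, +Next, -Action): Next is the category of the next
% input word (a terminal).
% Reduce only if the rule is complete AND Next is in its lookahead set.
action(Items, Next, reduce(LHS, RHS)) :-
    member(item(LHS, RHS, [], Lookahead), Items),
    memberchk(Next, Lookahead).
% Shift if some item has Next immediately after the dot.
action(Items, Next, shift(Next)) :-
    member(item(_, _, [Next|_], _), Items).

% actions(+StateId, +Next, -Actions): all distinct actions available.
actions(StateId, Next, Actions) :-
    state(StateId, Items),
    findall(A, action(Items, Next, A), As),
    sort(As, Actions).
```

Without lookahead, State 7 has a shift-reduce conflict: it can always reduce by vp --> v and can also shift d or n. With lookahead, `actions(7, n, As)` gives `As = [shift(n)]` and `actions(7, '$', As)` gives `As = [reduce(vp,[v])]`, so in this state the next input symbol partitions the action space completely.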
LR Parsing • in fact • LR-parsers are generally acknowledged to be the fastest parsers • especially when combined with the chart technique (table: dynamic programming) • reference • (Tomita, 1985) • textbook • Earley’s algorithm • uses chart • but follows the dotted-rule configurations dynamically at parse-time • instead of ahead of time (so slower than LR)