Accurate Parsing
('they worry that air the shows , drink too much , whistle johnny b. goode and watch the other ropes , whistle johnny b. goode and watch closely and suffer through the sale', 2.1730387621600077e-11)
David Caley, Thomas Folz-Donahue, Rob Hall, Matt Marzilli
Accurate Parsing: Our Goal
• Given a grammar and a sentence S, return the parse tree with the maximum probability conditioned on S:
  argmax_{t ∈ T} P(t | S)
  where T is the set of possible parse trees of sentence S.
Talking Points
• Using the Penn-Treebank
• Reading in n-ary trees
• Finding head-tags within n-ary productions
• Converting to binary trees
• Inducing a CFG grammar
• Probabilistic CYK
• Handling unary rules
• Dealing with unknowns
• Dealing with run times: beam search, limiting depth of unary rules, further optimizations
• Example parses and trees
• Lexicalization attempts
Using the Penn-Treebank: Our Training Data
• Contains tagged data and n-ary trees from a Wall Street Journal corpus.
• Contains some information the parser does not need.
• Questionable tagging in places, e.g. (JJ the) ??
• Example…
Using the Penn-Treebank: Handling N-ary Trees
( (S (NP-SBJ-1 (NNS Consumers) ) (VP (MD may) (VP (VB want) (S (NP-SBJ (-NONE- *-1) ) (VP (TO to) (VP (VB move) (NP (PRP$ their) (NNS telephones) ) (ADVP-DIR (NP (DT a) (RB little) ) (RBR closer) (PP (TO to) (NP (DT the) (NN TV) (NN set) ))))))))
• Functional tags such as NP-SBJ-1 are ignored; we simply call this an NP.
• -NONE- tags, used for traces, are also ignored.
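A minimal sketch of how the bracketed Treebank trees could be read in and normalized; the Node class, function names, and the 'TOP' placeholder label are our own illustrations, not the project's actual code. Functional suffixes such as -SBJ-1 are stripped so NP-SBJ-1 becomes NP, and -NONE- trace subtrees are dropped.

    import re

    class Node:
        def __init__(self, label, children=None, word=None):
            self.label = label            # nonterminal or POS tag
            self.children = children or []
            self.word = word              # set only on preterminal (POS) nodes

    def tokenize(treebank_string):
        # split a bracketed Treebank string into '(', ')' and plain tokens
        return re.findall(r'\(|\)|[^\s()]+', treebank_string)

    def strip_functional(label):
        # NP-SBJ-1 -> NP, PP-DIR=2 -> PP; leave -NONE- alone so it can be pruned
        if label == '-NONE-':
            return label
        return re.split(r'[-=]', label)[0] or label

    def read_tree(tokens, pos=0):
        # expects tokens[pos] == '('
        pos += 1
        if tokens[pos] == '(':            # outermost wrapper "( (S ...))" has no label
            label = 'TOP'                 # 'TOP' is our own placeholder name
        else:
            label = tokens[pos]
            pos += 1
        node = Node(strip_functional(label))
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                child, pos = read_tree(tokens, pos)
                if child is not None:     # None means a pruned trace subtree
                    node.children.append(child)
            else:
                node.word = tokens[pos]
                pos += 1
        pos += 1                          # consume ')'
        if node.label == '-NONE-' or (not node.children and node.word is None):
            return None, pos              # drop traces and any now-empty parents
        return node, pos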
Using the Penn-Treebank: Head-Tag Finding Algorithm
• For a context-free rule X -> Y1 … Yn, a head-finding function determines which of Y1 … Yn is the "head", the most important child tag.
• We use the head-tag rules as outlined in Collins' thesis.
• The head-tags are used later for the binary-tree conversion.
Using the Penn-Treebank: Head-Tag Finding Algorithm
• If nothing is found during a list traversal, the head-tag becomes the left-most or right-most element.
Using the Penn-Treebank: Head-Rule Finding Algorithm
• Rules for NPs are a bit different (a sketch follows this list):
  • If the last word is tagged POS, return the last word
  • Else search from right to left for the first child which is in the set {NN, NNP, NNPS, NNS, NX, POS, JJR}
  • Else search from left to right for the first child which is an NP
  • Else search from right to left for the first child which is in the set {$, ADJP, PRN}
  • Else do the same with the set {CD}
  • Else do the same with the set {JJ, JJS, RB, QP}
  • Else return the last word
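A sketch of the NP rule above, where the right-hand side is given as a list of child labels; the function and table names are ours, not from the original code.

    NP_HEAD_SETS = [
        ('right', {'NN', 'NNP', 'NNPS', 'NNS', 'NX', 'POS', 'JJR'}),
        ('left',  {'NP'}),
        ('right', {'$', 'ADJP', 'PRN'}),
        ('right', {'CD'}),
        ('right', {'JJ', 'JJS', 'RB', 'QP'}),
    ]

    def np_head_index(children):
        """Return the index of the head child of an NP, per the rules above."""
        if children and children[-1] == 'POS':
            return len(children) - 1
        for direction, tags in NP_HEAD_SETS:
            order = (range(len(children) - 1, -1, -1) if direction == 'right'
                     else range(len(children)))
            for i in order:
                if children[i] in tags:
                    return i
        return len(children) - 1          # fall back to the last child

    # e.g. np_head_index(['DT', 'JJ', 'NN']) -> 2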
Using the Penn-Treebank: Binary Tree Conversion
• Now we put the head-tags to use; binarization is necessary to use the CFG grammar with probabilistic CYK.
• A general n-ary rule has the form R -> Li Li-1 … L1 L0 H R0 R1 … Ri-1 Ri, where H is the head child.
• On the right side of the head we recursively split off the last element to make a new binary rule (left-recursive); on the left side we do the same by removing the first element (right-recursive), until only the binary remainder Li Li-1 … L1 L0 H is left.
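A rough sketch of this head-outward binarization over child labels; the intermediate-label scheme and the function name are illustrative assumptions, not the project's exact convention.

    def binarize(label, children, head):
        """Binarize an n-ary production: `children` is a list of child labels,
        `head` the index of the head child.  Returns nested (label, left, right)
        tuples with synthetic intermediate labels marking the head."""
        if len(children) <= 2:
            return (label, *children)
        base = label.split(':-')[0]
        inner = base + ':-(' + children[head] + ')'    # synthetic intermediate label
        if head < len(children) - 1:
            # there are children to the right of the head: split off the last one
            return (label, binarize(inner, children[:-1], head), children[-1])
        else:
            # only children to the left of the head remain: split off the first one
            return (label, children[0], binarize(inner, children[1:], head - 1))

    # e.g. binarize('VP', ['VB', 'NP', 'PP', 'ADVP'], 0)
    # -> ('VP', ('VP:-(VB)', ('VP:-(VB)', 'VB', 'NP'), 'PP'), 'ADVP')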
Using the Penn-Treebank: Grammar Induction Procedure
• Once we have binary trees we can easily identify rules and record their frequency.
• Identify every production and store it in a Python dictionary.
• Frequencies are cached in a local file for later use and read in on subsequent executions.
• No immediate smoothing is done on the probabilities; the grammar is later trimmed to help with performance.
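A minimal illustration of this counting step with dictionaries and relative-frequency estimates; the names are ours and no smoothing is applied, matching the slide.

    from collections import defaultdict

    rule_counts = defaultdict(int)     # (lhs, rhs_tuple) -> frequency
    lhs_counts = defaultdict(int)      # lhs -> total frequency

    def record_production(lhs, rhs):
        """Count one production read off a binarized training tree."""
        rule_counts[(lhs, tuple(rhs))] += 1
        lhs_counts[lhs] += 1

    def rule_probability(lhs, rhs):
        """Maximum-likelihood estimate of P(lhs -> rhs | lhs); no smoothing."""
        return rule_counts[(lhs, tuple(rhs))] / lhs_counts[lhs]

    # e.g. record_production('NP', ['DT', 'NN']); record_production('NP', ['NNP'])
    #      rule_probability('NP', ['DT', 'NN']) -> 0.5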
Probabilistic CYK: The Parsing Step
• We use a probabilistic CYK implementation to parse with our CFG grammar and to assign probabilities to the final parse trees.
• Useful for providing multiple parses and disambiguating sentences.
• New concerns:
  • Unary rules and the lengths of their chains
  • Runtime (a result of the incredibly large grammar)
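For concreteness, a stripped-down version of the probabilistic CYK recurrence over binary rules only; the grammar and chart representations are assumptions, and unary handling, backpointers, and tree recovery are omitted here.

    def cyk_parse(words, lexicon, binary_rules):
        """lexicon:      {word: [(tag, prob), ...]}   -- P(tag -> word)
           binary_rules: {(B, C): [(A, prob), ...]}   -- P(A -> B C)
           Returns chart, where chart[i][j] maps a nonterminal to the best
           probability of deriving words[i:j] from it."""
        n = len(words)
        chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            for tag, p in lexicon.get(w, []):
                if p > chart[i][i + 1].get(tag, 0.0):
                    chart[i][i + 1][tag] = p
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                cell = chart[i][j]
                for k in range(i + 1, j):              # split point
                    for B, pb in chart[i][k].items():
                        for C, pc in chart[k][j].items():
                            for A, pr in binary_rules.get((B, C), []):
                                p = pr * pb * pc
                                if p > cell.get(A, 0.0):
                                    cell[A] = p
        return chart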
Probabilistic CYK: Handling Unary Rules within the Grammar
• Unary rules of the form X -> Y or X -> a are ubiquitous in our grammar.
• The closure of a constituent is needed to determine all the unary productions that can lead to that constituent.
• Def: Closure(X) = ∪ { Closure(Y) | Y -> X }, i.e. all nonterminals that are reachable from X by unary rules.
• We implement this iteratively, maintain a closed list, and limit the depth to prevent possible infinite recursion.
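One way the iterative, depth-limited closure could look; the rule representation and default depth are assumptions, and keeping only the best chain probability per ancestor plays the role of the closed list.

    def unary_closure(constituent, unary_rules, max_depth=3):
        """Return {ancestor: prob} for every nonterminal reachable from
        `constituent` through chains of unary rules, keeping the best chain
        probability.  unary_rules: {Y: [(X, P(X -> Y)), ...]}"""
        reachable = {constituent: 1.0}
        frontier = [(constituent, 1.0)]
        for _ in range(max_depth):
            next_frontier = []
            for y, p in frontier:
                for x, pr in unary_rules.get(y, []):
                    q = p * pr
                    if q > reachable.get(x, 0.0):   # skip already-closed, better paths
                        reachable[x] = q
                        next_frontier.append((x, q))
            if not next_frontier:
                break
            frontier = next_frontier
        return reachable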
Probabilistic CYK: Dealing with Run Times
• Beam search: limit the number of nodes saved in each cell of the CYK dynamic-programming table.
• Using beam width k, all generated constituents are kept sorted and only the k best are saved for the next iteration (list size <= k).
• Experiences with k = 100, 200, 1000.
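A sketch of the per-cell beam pruning, keeping the k highest-probability constituents in each chart cell; the cell representation matches the CYK sketch above and is an assumption.

    import heapq

    def prune_cell(cell, k):
        """Keep only the k most probable constituents in a CYK cell ({label: prob})."""
        if len(cell) <= k:
            return cell
        best = heapq.nlargest(k, cell.items(), key=lambda item: item[1])
        return dict(best)

    # e.g. prune_cell({'NP': 1e-4, 'S': 1e-9, 'VP': 3e-5, 'PP': 2e-7}, 2)
    #      -> {'NP': 0.0001, 'VP': 3e-05}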
Probabilistic CYK: Dealing with Run Times
• Another optimization was to remove all production rules with frequency < fc (we used fc = 1, 2, …).
• We also limited the depth when calculating the unary closure of a constituent in the CYK table.
  • Extensive unary rules were found to greatly slow down the parser.
  • Long chains of unary productions also have extremely low probabilities, so they are commonly pruned by beam search anyway.
Probabilistic CYK: Random Sentences and Example Trees
• Some random sentences from our grammar, with associated probabilities:
('buy jam , cocoa and other war-rationed goodies', 0.0046296296296296294)
('cartoonist garry trudeau refused to impose sanctions , including petroleum equipment , which go into semiannual payments , including watches , including three , which the federal government , the same company formed by mrs. yeargin school district would be confidential', 2.9911073159300768e-33)
('33 men selling individual copies selling securities at the central plaza hotel die', 7.4942533128815141e-08)
Probabilistic CYK: Random Sentences and Example Trees
('young people believe criticism is led by south korea', 1.3798001044090654e-11)
('the purchasing managers believe the art is the often amusing , often supercilious , even vicious chronicle of bank of the issue yen-support intervention', 7.1905882731776209e-1)
[Two example parse trees for 'buy jam , cocoa and other war-rationed goodies': the binarized tree with head-marked intermediate labels (e.g. S:-(VP), VP:-(VB)-NP, NP:-(NP)-CC) and the corresponding lexicalized tree (e.g. S(buy), VP(buy), NP(jam), NP(goodies)); the original tree diagrams are not reproduced here.]
Accurate Parsing: Conclusion
• Massive lexicalized grammar
• Working probabilistic parser
• Future work:
  • Handle sparsity
  • Smooth probabilities