
Prepositional Phrase Attachment

Prepositional Phrase Attachment. Chris Brew, Ohio State University. Hindle and Rooth: partial parser to get statistics. Collins and Brooks: backed-off estimation from treebank data plus attachment decision. Merlo, Crocker and Berthouzoz: multiple PPs disambiguated.


Presentation Transcript


  1. Prepositional Phrase Attachment Chris Brew Ohio State University

  2. Prepositional Phrase Attachment • Hindle and Rooth: partial parser to get statistics • Collins and Brooks: backed-off estimation from treebank data + attachment decision • Merlo, Crocker and Berthouzoz: multiple PPs disambiguated • Ratnaparkhi: entirely unsupervised

  3. The problem

  4. Hindle and Rooth • Whittemore, Ferrara and Brunner • Structural heuristics (Kimball’s Right Association, Frazier’s Minimal Attachment) account for only 55% of behaviour • Lexical preferences do much better • H and R • note that the preferences for this experiment were provided by human judgement • ask how to get automatically a good list of lexical preferences

  5. Discovering Lexical Association in text • Church’s part of speech analyser • Hindle’s Fidditch partial parser • 13 million words of AP news wire

  6. Fidditch [Figure: Fidditch partial parse tree, with unattached nodes marked "?", for a sentence containing "the radical changes in export and customs regulations evidently are aimed at remedying an extreme shortage of consumer goods in the Soviet Union"]

  7. Extract information about words
     ID  Verb     Noun        Prep  Syntax
     a            change      in    -V
     b            regulation
     c   aim      PRO-+       at
     d   remedy   shortage    of
     e            good        in
     f            DART-PNP
     g   assuage  citizen
     h            scarcity    of
     i            item        as
     j            wiper
     k            VING
     l            VING

  8. What the table means • noun column has the head noun of the noun phrase (or various special cases) • verb column has the head verb if the noun phrase was its object • prep column has the following preposition • Syntax column has -V if there is no preceding verb

  9. Counting attachments • Parser isn’t reliable, so use a decision procedure to assign nouns and verbs to noun-attach (na) and verb-attach (va)

  10. No preposition • add a count for <noun,NULL> or <verb,NULL>

  11. Sure Verb Attach 1: • if the noun phrase head is a pronoun add a count for <verb,prep>

  12. Sure Verb Attach 2: • if the verb is passivized, verb attach unless preposition is “by”

  13. Sure Noun Attach • if no verb available, then noun attach

  14. Ambiguous Attach 1: • if LA score > 2.0 verb attach, < -2.0 noun attach. Use stats so far for calculating score. Repeat until stable. • In the table from slide 7, row d (remedy, shortage, of) is marked "maybe"

  15. Ambiguous Attach 2: • Share counts between noun and verb • Row d (remedy, shortage, of) is again marked "maybe"

  16. Unsure Attach: • attach to noun by default
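
The "sure" steps of the decision procedure (slides 11 to 13) could be sketched roughly as below. This is only an illustration, not Hindle and Rooth's code: the function name `sure_attachment`, the boolean flags `noun_is_pronoun` and `passive`, and the example rows are all hypothetical, and the ambiguous and default steps (slides 14 to 16) are left out.

```python
from collections import Counter


def sure_attachment(verb, prep, noun_is_pronoun=False, passive=False):
    """Return 'va', 'na', or None when the row is not a sure case.

    Hypothetical helper for rows that do have a preposition.
    """
    # Sure Verb Attach 1: a pronoun head cannot take the PP, so the verb does.
    if verb is not None and noun_is_pronoun:
        return "va"
    # Sure Verb Attach 2: passivized verb, unless the preposition is "by".
    if verb is not None and passive and prep != "by":
        return "va"
    # Sure Noun Attach: no verb is available.
    if verb is None:
        return "na"
    return None  # ambiguous: left to the LA-score steps


# Hypothetical rows: (verb, noun, prep, noun_is_pronoun, passive)
rows = [
    ("see", "it", "in", True, False),
    (None, "change", "in", False, False),
    ("remedy", "shortage", "of", False, False),
]

counts = Counter()
for verb, noun, prep, is_pronoun, passive in rows:
    decision = sure_attachment(verb, prep, is_pronoun, passive)
    if decision == "va":
        counts[(verb, prep)] += 1
    elif decision == "na":
        counts[(noun, prep)] += 1
```

Only the first two rows contribute counts here; the third is ambiguous and would wait for the LA-score pass.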

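  17. LA scores
      va: (send (soldier NULL) (into Afghanistan))
      na: (send (soldier (into Afghanistan)))
      $LA = \log_2 \frac{P(va, p \mid v, n)}{P(na, p \mid v, n)} = \log_2 \frac{P(va, \text{into} \mid \text{send}, \text{soldier})}{P(na, \text{into} \mid \text{send}, \text{soldier})}$
      • and we approximate this using collected counts
      $P(va, \text{into} \mid \text{send}, \text{soldier}) \approx P(\text{into} \mid \text{send}) \cdot P(\text{NULL} \mid \text{soldier})$
      $P(na, \text{into} \mid \text{send}, \text{soldier}) \approx P(\text{into} \mid \text{soldier})$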

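  18. Estimating the counts
      $P(\text{into} \mid \text{send}) = |\text{send}, \text{into}| / |\text{send}| = .049$
      $P(\text{NULL} \mid \text{soldier}) = |\text{soldier}, \text{NULL}| / |\text{soldier}| = .800$
      $P(\text{into} \mid \text{soldier}) = |\text{soldier}, \text{into}| / |\text{soldier}| = .0007$
      $LA = \log_2(.049 \cdot .800 / .0007) = 5.81$
      • which is enough to be very sure that verb attach is right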
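
The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch; the function name `la_score` and its parameter names are made up for the illustration, and the three probabilities are simply the values quoted on the slide.

```python
import math


def la_score(p_prep_given_verb, p_null_given_noun, p_prep_given_noun):
    """log2 of the verb-attach estimate over the noun-attach estimate."""
    verb_attach = p_prep_given_verb * p_null_given_noun
    noun_attach = p_prep_given_noun
    return math.log2(verb_attach / noun_attach)


# Values from the slide: P(into|send), P(NULL|soldier), P(into|soldier)
score = la_score(0.049, 0.800, 0.0007)
print(round(score, 2))  # 5.81
```

The ratio inside the logarithm is 56, so the verb-attach reading is estimated to be 56 times more likely, well above the 2.0 threshold used in the ambiguous-attach step.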

  19. Smooth the estimates • using typical association rates of prepositions with the whole classes of nouns and verbs
      $P(p \mid n) = \frac{|n, p| + P(na \mid p)}{|n| + 1}$
      where $P(na \mid p) = |\text{any noun}, p| / |\text{any noun}|$, and similarly for verbs • Laplace’s M-estimate again
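
A small sketch of the smoothed estimate may make its effect clearer. The function `smoothed_prob` and all the counts below are hypothetical; only the formula itself comes from the slide.

```python
def smoothed_prob(count_word_prep, count_word, count_class_prep, count_class):
    """(|n,p| + class rate) / (|n| + 1), with class rate = |any noun,p| / |any noun|.

    The same form is assumed to apply to verbs with verb counts.
    """
    class_rate = count_class_prep / count_class
    return (count_word_prep + class_rate) / (count_word + 1)


# Hypothetical counts: a noun never seen in training falls back to the class rate.
unseen = smoothed_prob(0, 0, 5, 100)

# A frequent noun is barely moved away from its raw relative frequency.
frequent = smoothed_prob(70, 1000, 5, 100)
```

With these made-up counts the unseen noun gets exactly the class rate, while the frequent noun stays close to its raw estimate of 0.07.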

  20. Performance • ~ 80% correct • can get better precision by accepting lower recall (useful for exploratory text analysis) • “good enough to be added to a parser like Fidditch”

  21. Backed-off estimation [Figure: two attachment trees over the nodes V, N2, P, N1] • Collins and Brooks • use N2 as well as N1

  22. Use treebank data • similar approaches • Ratnaparkhi, Reynar and Roukos • Brill and Resnik • difficult to compare results with Hindle and Rooth, because the corpora used are different (but raw scores around 80% in both cases)

  23. The data • 20801 training and 3097 test examples • about 95% of the quadruples in the test data had not been seen in the training set. • compare H&R 200,000 triples

  24. The backed-off method • Katz’s approach to n-grams
      • If there are enough trigrams: $p(w_n \mid w_{n-1}, w_{n-2}) = \frac{|w_{n-2}, w_{n-1}, w_n|}{|w_{n-2}, w_{n-1}|}$
      • otherwise back off to bigrams: $p(w_n \mid w_{n-1}, w_{n-2}) = \alpha_1 \frac{|w_{n-1}, w_n|}{|w_{n-1}|}$
      • otherwise back off to unigrams: $p(w_n \mid w_{n-1}, w_{n-2}) = \alpha_1 \alpha_2 |w_n|$

  25. Take this method and apply to PP data • Start with full quadruples • Four possible triples to back off to • Six possible pairs to back off to • Restrict attention to those containing P
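
The counts on this slide are easy to verify by enumeration. The sketch below is illustrative only; the field names and the helper `subtuples` are hypothetical.

```python
from itertools import combinations

FIELDS = ("v", "n1", "p", "n2")


def subtuples(size, must_contain=None):
    """All field subsets of the given size, optionally only those with one field."""
    return [
        combo
        for combo in combinations(FIELDS, size)
        if must_contain is None or must_contain in combo
    ]


triples = subtuples(3)                         # four triples
pairs = subtuples(2)                           # six pairs
triples_with_p = subtuples(3, must_contain="p")  # three remain
pairs_with_p = subtuples(2, must_contain="p")    # three remain
```

Restricting attention to tuples that contain the preposition leaves three triples and three pairs, which are the terms that appear in the combination formulas on the next slide.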

  26. How to combine counts from triples and pairs
      $p_{triple}(1 \mid v, n_1, p, n_2) \approx \frac{p(1, v, n_1, p) + p(1, v, p, n_2) + p(1, n_1, p, n_2)}{p(v, n_1, p) + p(v, p, n_2) + p(n_1, p, n_2)}$
      $p_{pair}(1 \mid v, n_1, p, n_2) \approx \frac{p(1, v, p) + p(1, p, n_2) + p(1, n_1, p)}{p(v, p) + p(p, n_2) + p(n_1, p)}$
      • other combinations tried, this formula is better than simple averaging for this task

  27. What was “enough data”? • In this task it turns out that using a threshold of 0 for the denominator is best. If there is even one instance of the quadruple, trust it. • For n-grams, it was better to ignore low counts • reason for this is not obvious, but in such situations trying things is essential.
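
Putting the back-off levels and the threshold together, a backed-off attachment estimate might look like the sketch below. It is not Collins and Brooks's implementation: it works on raw counts, the names `LEVELS`, `train` and `p_attach` are invented, label 1 is assumed to mean noun attachment (the slides do not say which attachment 1 stands for), and the final fallback of 1.0 when even the preposition is unseen is also an assumption.

```python
from collections import Counter

# Back-off levels, most specific first; every tuple contains the preposition.
LEVELS = [
    [("v", "n1", "p", "n2")],
    [("v", "n1", "p"), ("v", "p", "n2"), ("n1", "p", "n2")],
    [("v", "p"), ("n1", "p"), ("p", "n2")],
    [("p",)],
]


def _keys(level, v, n1, p, n2):
    values = {"v": v, "n1": n1, "p": p, "n2": n2}
    return [(fields, tuple(values[f] for f in fields)) for fields in level]


def train(examples):
    """examples: iterable of (label, v, n1, p, n2), with label 0 or 1."""
    total, attached = Counter(), Counter()
    for label, v, n1, p, n2 in examples:
        for level in LEVELS:
            for key in _keys(level, v, n1, p, n2):
                total[key] += 1
                attached[key] += label
    return total, attached


def p_attach(total, attached, v, n1, p, n2, threshold=0):
    """Estimate p(1 | v, n1, p, n2) from the first level with enough data."""
    for level in LEVELS:
        keys = _keys(level, v, n1, p, n2)
        denominator = sum(total[k] for k in keys)
        if denominator > threshold:
            return sum(attached[k] for k in keys) / denominator
    return 1.0  # assumed default when nothing has been seen


# Hypothetical training data
examples = [
    (1, "join", "board", "as", "director"),
    (0, "send", "soldier", "into", "Afghanistan"),
    (0, "send", "troop", "into", "Iraq"),
]
total, attached = train(examples)
```

With `threshold=0`, a single matching quadruple is trusted outright, which is the setting the slide reports as best for this task.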

  28. Results • 84.1% correct without morphological analysis, 84.5% with • Quadruples are more accurate than triples, which are in turn more accurate than doubles, etc. • But only 148 quadruples in test data, vs 764 triples, 1965 doubles, 216 singles

  29. Comparison with Hindle and Rooth • We have 1924 test cases where H&R would have made a decision • The backoff method using just the |v,p| and |n1,p| counts (86.5%) outscores H&R style (82.1%).

  30. Extra experiments • Setting threshold to 5 reduces performance to 81.6% • Tuples that contain the preposition are the most effective.

  31. Attaching Multiple PPs • Merlo, Crocker, Berthouzoz • For a single PP there are two structures, for 2 PPs there are 5, for 3 PPs 14 • so the problem is harder, a dumb algorithm will do poorly • Generalization of Collins/Brooks
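
The counts 2, 5 and 14 quoted above coincide with consecutive Catalan numbers. The slide gives no general formula, so the sketch below, with its hypothetical helper names, only checks that pattern for the three cases mentioned.

```python
from math import comb


def catalan(n):
    """n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def pp_structures(num_pps):
    """Structures for V NP followed by num_pps PPs, assuming the Catalan pattern."""
    return catalan(num_pps + 1)


structure_counts = [pp_structures(k) for k in (1, 2, 3)]
```

The rapid growth is why always choosing the most frequent structure becomes a weaker baseline as more PPs are added.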

  32. Five structures for V NP PP PP • Structure 1 535 The agency said it will [keep]v [the debt]np [under review]pp [for possible downgrade]pp • Structure 2 1160 Penney will [extend]v [[its involvement]np [with the service]pp]np [for at least five years]pp

  33. • Structure 3 1394 [address]v [[budget limits]np [on [credit allocations [for the Federal Housing agency]pp]np]pp]np • Structure 4 1055 [abandon] [the everyday pricing approach] [in the face of [the poor results]]

  34. Structure 5 539 [answering] [questions [from members of Parliament]] [after his announcement]

  35. Algorithm • Model of PP1 as Collins and Brooks, but excluding p2 • Model of 2PPs is back off over sextuples (i,v,n1,p1,n2,p2) until we get to tuples that don’t have p1, or that don’t have p2 • then Competitive Back off

  36. Competitive Back off • Do standard back off for PP1 using v,n1,p1 • Do standard back off for PP2 using v,n2,p2 • Do back off for PP2 using n1 instead of n2 (i.e., v,n1,p2) • Combine these results using a simple procedure, with tiebreak where they conflict.

  37. Results • PP1(2) 84.3% baseline 61.2% (choose most frequent) • PP2(5) 69.6% baseline 29.8% (choose most frequent) • PP3(14) 43.6% baseline 18.5% (choose most frequent)

  38. Results • Take-home messages • Devise a baseline • Measure performance • Pick tasks where beating the baseline is • Impressive • Useful

  39. Ratnaparkhi (Coling 98) • 970K unannotated sentences of WSJ • tagger, simple chunker • heuristic extraction of unambiguous cases

  40. Heuristic extraction • (v,p,n2) if • p is a real preposition (not “of”) • v is the first verb that occurs < K words left of p • v is not a form of the verb “to be” • No noun occurs between v and p • n2 is first word < K words right of p • No verb occurs between p and n2

  41. Heuristic extraction 2 • (n,p,n2) if • p is a real preposition (not “of”) • n is the first noun that occurs < K words left of p • No verb occurs between v and p • n2 is first word < K words right of p • No verb occurs between p and n2
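
The two extraction heuristics could be sketched as below, on input that has already been tagged. Several things here are assumptions rather than details from the slides: the coarse tags "N", "V", "P", the window size `k=6`, the list of forms of "to be", taking n2 to be the first noun (the slides say "first word"), and reading the noun rule's verb condition as "no verb anywhere in the K-word window left of p".

```python
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}


def first_noun_before_verb(window):
    """First noun in the window, or None if a verb comes first or no noun exists."""
    for word, tag in window:
        if tag == "V":
            return None
        if tag == "N":
            return word
    return None


def extract_tuples(tagged, k=6):
    """tagged: list of (word, tag). Returns ('v'|'n', head, prep, n2) tuples."""
    tuples = []
    for i, (prep, tag) in enumerate(tagged):
        if tag != "P" or prep == "of":
            continue
        n2 = first_noun_before_verb(tagged[i + 1 : i + 1 + k])
        if n2 is None:
            continue
        left = tagged[max(0, i - k) : i][::-1]  # nearest word first
        tags = [t for _, t in left]
        if "V" in tags:
            j = tags.index("V")
            verb = left[j][0]
            # Verb tuple: no noun between v and p, and v is not a form of "to be".
            if "N" not in tags[:j] and verb not in BE_FORMS:
                tuples.append(("v", verb, prep, n2))
        elif "N" in tags:
            # Noun tuple: nearest noun on the left, with no verb in the window.
            tuples.append(("n", left[tags.index("N")][0], prep, n2))
    return tuples


verb_case = extract_tuples([("he", "O"), ("arrived", "V"), ("in", "P"), ("Paris", "N")])
noun_case = extract_tuples([("the", "O"), ("shortage", "N"), ("in", "P"), ("supplies", "N")])
ambiguous = extract_tuples([("sent", "V"), ("soldiers", "N"), ("into", "P"), ("Afghanistan", "N")])
```

The genuinely ambiguous V N P N pattern yields nothing, which is the point of the heuristics: only cases that look unambiguous are harvested as training data.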

  42. Accuracy of extraction • Noisy data (c. 69% correct) • But abundant

  43. Evaluation • 81.91% with a back off technique • 81.85% with interpolation like H&R • Baseline for this data 70.39%

  44. Portability • Moved to Spanish and got similar performance • H&R would have had to port Fidditch to Spanish

  45. Where to get more information • Charniak ch 8. • Hindle and Rooth CL 19(1) pp 103-120, 1993 • Manning and Schütze, section 8.3 • Original papers
