
Preposition Phrase Attachment


Presentation Transcript


  1. Preposition Phrase Attachment • To what previous verb or noun phrase does a prepositional phrase (PP) attach? The woman (with a poodle) saw a man (in the park) (with a poodle) (with a telescope) (on Tuesday) (on his bicycle)

  2. A Simplified Version • Assume ambiguity only between the preceding base NP and the preceding base VP: The woman had seen the man with the telescope. Q: Does the PP attach to the NP or the VP? • Assumption: Consider only the NP/VP head and the preposition

  3. Simple Formulation • Determine attachment based on the log-likelihood ratio: LLR(v, n, p) = log P(p | v) - log P(p | n). If LLR > 0, attach to the verb; if LLR < 0, attach to the noun
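A minimal Python sketch of this decision rule, assuming the conditional probabilities P(p | v) and P(p | n) have already been estimated somehow; the function name is illustrative, and the printed example reuses the Chrysler figures from the next slide:

```python
import math

def llr_attachment(p_prep_given_verb, p_prep_given_noun):
    """Decide PP attachment from LLR(v, n, p) = log P(p | v) - log P(p | n)."""
    llr = math.log(p_prep_given_verb) - math.log(p_prep_given_noun)
    if llr > 0:
        return "verb"
    if llr < 0:
        return "noun"
    return "undecided"  # LLR == 0: the rule itself does not choose

# Chrysler example from the next slide: P(with | end) vs. P(with | venture)
print(llr_attachment(0.118, 0.107))  # -> "verb", even though 'venture' is closer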

  4. Issues • Multiple attachment: • Attachment lines cannot cross • Proximity: • Preference for attaching to closer structures, all else being equal Chrysler will end its troubled venture with Maserati. P(with | end) = 0.118 P(with | venture) = 0.107 !!!

  5. Hindle & Rooth (1993) • Consider just sentences with a transitive verb and PP, i.e., of the form: ... bVP bNP PP ... Q: Where does the first PP attach (NP or VP)? Indicator variables (0 or 1): VAp: Is there a PP headed by p after v attached to v? NAp: Is there a PP headed by p after n attached to n? NB: Both variables can be 1 in a sentence

  6. Attachment Probabilities • P(attach(p) = n | v, n) = P(NAp=1 | n) • Verb attachment is irrelevant; if it attaches to the noun it cannot attach to the verb • P(attach(p) = v | v, n) = P(VAp=1, NAp=0 | v, n) = P(VAp=1 | v) P(NAp=0 | n) • Noun attachment is relevant, since the noun ‘shadows’ the verb (by proximity principle)

  7. Estimating Parameters • MLE: P(VAp= 1 | v) = C(v,p) / C(v) P(NAp= 1 | n) = C(n,p) / C(n) • Using an unlabeled corpus: • Bootstrap from unambiguous cases: The road from Chicago to New York is long. She went from Albany towards Buffalo.
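A sketch of the MLE estimates on this slide together with the attachment probabilities from slide 6, assuming simple count tables C(v), C(v, p), C(n), C(n, p); the counts and data structures here are illustrative, not from the slides:

```python
from collections import Counter

# Hypothetical counts harvested from a corpus (not from the slides).
verb_prep = Counter({("end", "with"): 12})     # C(v, p)
verb_tot  = Counter({"end": 102})              # C(v)
noun_prep = Counter({("venture", "with"): 8})  # C(n, p)
noun_tot  = Counter({"venture": 75})           # C(n)

def p_va(v, p):
    """MLE of P(VA_p = 1 | v) = C(v, p) / C(v)."""
    return verb_prep[(v, p)] / verb_tot[v]

def p_na(n, p):
    """MLE of P(NA_p = 1 | n) = C(n, p) / C(n)."""
    return noun_prep[(n, p)] / noun_tot[n]

def attachment_probs(v, n, p):
    """Slide 6: noun attachment 'shadows' the verb, so verb attachment
    additionally requires NA_p = 0."""
    p_noun = p_na(n, p)
    p_verb = p_va(v, p) * (1.0 - p_na(n, p))
    return p_noun, p_verb

print(attachment_probs("end", "venture", "with"))
```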

  8. Unsupervised Training • Build initial model using only unambiguous attachments • Apply initial model and assign attachments if LLR is above a threshold • Divide remaining ambiguous cases as 0.5 counts for each possibility Use of EM as principled method?
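A rough sketch of that bootstrapping loop. The `model` object with `.add(...)` and `.llr(...)` methods and the threshold value are stand-ins introduced here for illustration, not part of the original method's code:

```python
THRESHOLD = 2.0  # illustrative cutoff, not taken from the slides

def bootstrap(unambiguous_cases, ambiguous_cases, model):
    """unambiguous_cases: (v, n, p, site) tuples; ambiguous_cases: (v, n, p)."""
    # 1. Initial model from unambiguous attachments only.
    for v, n, p, site in unambiguous_cases:
        model.add(v, n, p, site, weight=1.0)

    # 2. Assign ambiguous cases whose |LLR| clears the threshold.
    undecided = []
    for v, n, p in ambiguous_cases:
        score = model.llr(v, n, p)
        if score > THRESHOLD:
            model.add(v, n, p, "verb", weight=1.0)
        elif score < -THRESHOLD:
            model.add(v, n, p, "noun", weight=1.0)
        else:
            undecided.append((v, n, p))

    # 3. Split the rest as 0.5 counts for each possibility.
    for v, n, p in undecided:
        model.add(v, n, p, "verb", weight=0.5)
        model.add(v, n, p, "noun", weight=0.5)
    return model
```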

  9. Limitations • Semantic issues: I examined the man with a stethoscope. I examined the man with a broken leg. • Other contextual features: Superlative adjectives (biggest) indicate NP • More complex sentences: The board approved its acquisition by BigCo of Milwaukee for $32 a share at its meeting on Tuesday.

  10. Memory-Based Formulation • Each example has four components: V N1 P N2 examine man with stethoscope Class = V • Similarity based on information gain weighting for matching components • Need ‘semantic’ similarity measure for words: • stethoscope ~ thermometer, kidney ~ leg
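A minimal sketch of weighted-overlap memory-based classification over (V, N1, P, N2) tuples. The training tuples and the information-gain weights are placeholders; in practice the weights would be computed from training data:

```python
from collections import Counter

# Tiny training memory of (V, N1, P, N2) -> class; examples are illustrative.
memory = [
    (("examine", "man", "with", "stethoscope"), "V"),
    (("examine", "man", "with", "leg"), "N"),
]

# Placeholder information-gain weights for positions V, N1, P, N2.
weights = [0.3, 0.2, 0.4, 0.1]

def similarity(x, y):
    """Weighted feature overlap: exact match per position, scaled by its weight."""
    return sum(w for w, a, b in zip(weights, x, y) if a == b)

def classify(x, k=1):
    """Majority class among the k most similar stored examples."""
    ranked = sorted(memory, key=lambda ex: similarity(x, ex[0]), reverse=True)
    return Counter(cls for _, cls in ranked[:k]).most_common(1)[0][0]

# Exact matching cannot see that 'thermometer' resembles 'stethoscope';
# that is what the MVDM / lexical-space similarity on the next slides is for.
print(classify(("examine", "man", "with", "thermometer")))
```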

  11. MVDM Word Similarity • Idea: Words are similar to the extent that they predict similar class distributions • Data sparseness is a serious problem, though! • Extend the idea to a task-independent similarity metric...

  12. Lexical Space • Represent ‘semantics’ of a word by frequencies of words which coöccur with it, instead of relative frequencies of classes • Each word has 4 vectors of frequencies for words 2 before, 1 before, 1 after, and 2 after
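A sketch of building the four positional co-occurrence vectors (two before, one before, one after, two after) from a tokenized corpus; the data structures and the toy sentence are illustrative:

```python
from collections import Counter, defaultdict

# The four positional vectors: 2 before, 1 before, 1 after, 2 after.
OFFSETS = (-2, -1, 1, 2)

def lexical_space(sentences):
    """Map each word to four Counters of co-occurring words, one per offset."""
    vectors = defaultdict(lambda: {off: Counter() for off in OFFSETS})
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for off in OFFSETS:
                j = i + off
                if 0 <= j < len(tokens):
                    vectors[w][off][tokens[j]] += 1
    return vectors

vecs = lexical_space([["she", "examined", "the", "man", "with", "a", "stethoscope"]])
print(vecs["man"][-1])  # Counter({'the': 1}), the word one position before 'man'
```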

  13. Results • Baseline comparisons: • Humans (4-tuple): 88.2% • Humans (full sentence): 93.2% • Noun always: 59.0% • Most likely for prep: 72.2% • Without Info Gain: 83.7% • With Info Gain: 84.1%

  14. Using Many Features • Use many features of an example together • Consider interaction between features during learning • Each example represented as a feature vector: x = (f1,f2,...,fn)

  15. kNN Geometric Interpretation → Linear Separator Learning (figure only on the original slide)

  16. Linear Separators • Linear separator model is a vector of weights: w = (w1,w2,...,wn) • Binary classification: Is wTx > 0? • ‘Positive’ and ‘Negative’ classes A threshold other than 0 is possible by adding a dummy element of “1” to all vectors – the threshold is just the weight for that element
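A small sketch of the decision rule with the threshold folded in as a dummy "1" feature whose weight is the negated threshold; the weights and threshold value are illustrative:

```python
def predict(w, x):
    """Binary decision: 'positive' iff w.x > 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) > 0

# Fold a threshold t into the weights: append a constant 1 to every example
# and use -t as the weight for that dummy position (values are illustrative).
t = 0.4
w = [0.8, -0.3, -t]   # weights for f1, f2, and the dummy feature
x = [1.0, 1.0, 1.0]   # example (f1, f2, dummy = 1)
print(predict(w, x))  # True: same as asking whether 0.8 - 0.3 > 0.4
```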

  17. Error-Based Learning • Initialize w to be all 1’s • Cycle x through examples repeatedly (random order): • If wTx > 0 but x is really negative, then decrease w’s elements • If wTx < 0 but x is really positive, then increase w’s elements
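A sketch of this error-driven loop with additive, perceptron-style updates; the learning rate and epoch count are assumptions added for illustration:

```python
import random

def error_based_train(examples, n_features, lr=0.1, epochs=10):
    """examples: list of (x, label) with label in {+1, -1}.
    Learning rate and epoch count are illustrative choices."""
    w = [1.0] * n_features                  # initialize w to all 1's
    for _ in range(epochs):
        random.shuffle(examples)            # cycle in random order
        for x, label in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if score > 0 and label < 0:     # predicted positive, really negative
                w = [wi - lr * xi for wi, xi in zip(w, x)]
            elif score <= 0 and label > 0:  # predicted negative, really positive
                w = [wi + lr * xi for wi, xi in zip(w, x)]
    return w
```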

  18. Winnow • Initialize w to be all 1’s • Cycle x through examples repeatedly (random order): a) If wTx < 0 but x is really positive, then multiply the weights of the active features by a constant α > 1 (promotion) b) If wTx > 0 but x is really negative, then divide the weights of the active features by α (demotion)
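A sketch of Winnow using the standard multiplicative scheme; the promotion factor α and the threshold θ = n/2 are the usual textbook choices rather than values from the slides:

```python
import random

def winnow_train(examples, n_features, alpha=2.0, epochs=10):
    """examples: list of (x, label) with binary features x[i] in {0, 1}
    and label in {+1, -1}; alpha and theta are textbook defaults."""
    w = [1.0] * n_features                      # initialize w to all 1's
    theta = n_features / 2.0                    # common Winnow threshold
    for _ in range(epochs):
        random.shuffle(examples)
        for x, label in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if score < theta and label > 0:     # a) missed a positive: promote
                w = [wi * alpha if xi else wi for wi, xi in zip(w, x)]
            elif score >= theta and label < 0:  # b) false positive: demote
                w = [wi / alpha if xi else wi for wi, xi in zip(w, x)]
    return w
```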

  19. Issues • No negative weights possible! • Balanced Winnow: Formulate weights as the difference of 2 weight vectors: w = w+ − w− Learn each vector separately, w+ regularly, and w− with polarity reversed • Multiple classes: • Learn one weight vector for each class (learning X vs. not-X) • Choose the highest-valued result for the example
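A small sketch of the Balanced Winnow decision and a one-vs-rest multiclass wrapper, assuming the per-class weight pairs (w+, w−) have already been trained; the function names are introduced here for illustration:

```python
def balanced_score(w_plus, w_minus, x):
    """Effective weights are w+ - w-, so negative weights become possible
    even though each half is learned with positive, multiplicative updates."""
    return sum((p - m) * xi for p, m, xi in zip(w_plus, w_minus, x))

def predict_multiclass(models, x):
    """models: dict mapping class -> (w_plus, w_minus), one separator per
    class (X vs. not-X); pick the class with the highest score."""
    return max(models, key=lambda c: balanced_score(*models[c], x))
```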

  20. PP Attachment Features • Words in each position • Subsets of the above, e.g.: <v=run, p=with> • Word classes at various levels of generality: stethoscope → medical instrument → instrument → device → instrumentation → artifact → object → physical thing • Derived from WordNet – handmade lexicon • 15 basic features plus word-class features
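A sketch of deriving word-class features from WordNet hypernym chains, assuming NLTK's WordNet interface is available; taking only the first noun sense and the exact feature string format are simplifying assumptions made here:

```python
from nltk.corpus import wordnet as wn  # requires the NLTK WordNet data

def class_features(word, max_levels=8):
    """Hypernym-class features for the first noun sense of `word`,
    e.g. stethoscope -> medical_instrument -> instrument -> device -> ..."""
    synsets = wn.synsets(word, pos=wn.NOUN)
    if not synsets:
        return []
    features, synset = [], synsets[0]  # first sense only (a simplification)
    for _ in range(max_levels):
        hypernyms = synset.hypernyms()
        if not hypernyms:
            break
        synset = hypernyms[0]
        features.append(f"class({word})={synset.name()}")
    return features

print(class_features("stethoscope"))
```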

  21. Results • Results including preposition of: Transform 81.9, Backoff 84.5, MBL 84.4, Winnow 84.8 • Results without preposition of:
