Chapter 8: Lexical Acquisition – February 19, 2007 – Additional Notes to Manning's slides
Slide 2 notes
• Language is constantly evolving
• Many NLP properties of interest are not available in dictionary form – for instance, the frequency or probability of occurrence of a word
• Need to constantly learn and acquire new terms and usages
• Focus areas for this chapter:
• Attachment ambiguity (in A the prepositional phrase attaches to the verb; in B it attaches to the noun):
  A. The children ate the cake with their hands.
  B. The children ate the cake with blue icing.
• Semantic characterization of a verb's arguments
Slide 3 notes
• Evaluation measures discussion:
• tp – true positives
• fp – false positives (Type I errors)
• fn – false negatives (Type II errors)
• tn – true negatives
Slide 5 notes
• A trade-off exists between precision and recall
• One can simply return all possible documents and get 100% recall (no false negatives)
• But precision will be low, because there will be many false positives (see the sketch below)
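A minimal sketch (not from the slides) of how precision, recall, and the F measure follow from the tp/fp/fn counts above; the example counts are made-up numbers illustrating the "return everything" strategy:

    def precision(tp, fp):
        # Fraction of the items returned that are correct.
        return tp / (tp + fp)

    def recall(tp, fn):
        # Fraction of the correct items that were returned.
        return tp / (tp + fn)

    def f_measure(tp, fp, fn, alpha=0.5):
        # F = 1 / (alpha/P + (1 - alpha)/R); alpha = 0.5 gives the
        # harmonic mean of precision and recall (F1).
        p, r = precision(tp, fp), recall(tp, fn)
        return 1.0 / (alpha / p + (1.0 - alpha) / r)

    # "Return everything": perfect recall, terrible precision.
    print(precision(tp=10, fp=990))    # 0.01
    print(recall(tp=10, fn=0))         # 1.0
    print(f_measure(10, 990, 0))       # ~0.0198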
Slide 11 notes
• tell – has the subcategorization frame NP NP S (subject, object, clause)
• find – lacks such a frame, but has NP NP (subject, object)
Slide 12 notes
• Cue for the frame NP NP:
• (OBJ | SUBJ_OBJ | CAP) (PUNC | CC)
• OBJ – personal pronouns in object form, like me and him
• SUBJ_OBJ – pronouns that can be either subject or object, such as you and it
• CAP – a capitalized word; PUNC – a punctuation mark
• CC – subordinating conjunctions like if, before, or as
• The reliability of a cue is assessed with the binomial distribution: each occurrence of the verb is treated as an independent coin flip in which, with probability ej (the error rate), the cue occurs but does not correctly indicate the frame, and with probability (1 − ej) it works correctly (see the sketch after this list).
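A sketch of the resulting hypothesis test in the style of Brent's method; the verb counts, error rate, and significance threshold below are hypothetical, chosen only for illustration:

    from math import comb

    def frame_p_value(n, m, error_rate):
        # Probability of the cue firing m or more times in n occurrences
        # of the verb if the verb does NOT permit the frame, i.e. if every
        # cue occurrence were an error with probability error_rate.
        return sum(comb(n, r) * error_rate**r * (1 - error_rate)**(n - r)
                   for r in range(m, n + 1))

    # Hypothetical verb: seen 80 times, cue fired 6 times, cue error rate 2%.
    p = frame_p_value(n=80, m=6, error_rate=0.02)
    if p < 0.02:   # illustrative significance threshold
        print("assign the frame to the verb (p = %.4g)" % p)

If the null hypothesis ("the verb does not take the frame") is this improbable, the frame is assigned; a strict threshold is what gives the method its high precision.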
Slide 13 notes
• Brent's (1993) Lerner system has high precision but low recall.
• Manning (1993): by combining the cues with a part-of-speech tagger that looks for patterns such as the following, one can increase reliability and coverage:
• (OBJ | SUBJ_OBJ | CAP) (PUNC | CC)
Slide 32 notes
• For instance, the verb "eat" strongly prefers something edible as its object.
• Exceptions include metaphorical uses of the word: "eating one's words" or "fear eats the soul".
Slide 33 notes – Kullback-Leibler divergence
• Relative entropy, or KL (Kullback-Leibler) divergence
• Example for A(v, n) with a noun like "chair":
• Susan interrupted the chair.
• "chair" is ambiguous between a furniture sense and a person (chairperson) sense; A(v, n) is taken as the maximum of A(v, c) over the noun's classes, which here resolves "chair" to its person sense.
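A sketch of the underlying quantities, Resnik's selectional preference strength S(v) and selectional association A(v, c) as presented in the textbook; the class probabilities below are made-up toy numbers:

    from math import log2

    def kl_divergence(p, q):
        # D(p || q) = sum over classes c of p(c) * log2(p(c) / q(c))
        return sum(pc * log2(pc / q[c]) for c, pc in p.items() if pc > 0)

    # Toy distributions over noun classes (hypothetical numbers).
    prior   = {"furniture": 0.5, "person": 0.5}   # P(c)
    given_v = {"furniture": 0.1, "person": 0.9}   # P(c | v) for v = "interrupt"

    # Selectional preference strength S(v) = D(P(c|v) || P(c)).
    S_v = kl_divergence(given_v, prior)

    def assoc(c):
        # Selectional association A(v, c) = P(c|v) * log2(P(c|v) / P(c)) / S(v)
        return given_v[c] * log2(given_v[c] / prior[c]) / S_v

    # For an ambiguous noun such as "chair", take the maximum over its classes:
    A_v_chair = max(assoc("furniture"), assoc("person"))
    print(S_v, A_v_chair)   # ~0.53 and ~1.44 (the person sense wins)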
Slide 38 notes
• X = {1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1}
• Y = {1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0}
• Matching coefficient = |X ∩ Y| = 1
• Dice coefficient = (2 × 1) / (10 + 10) = 0.1
• Jaccard coefficient = 1 / (10 + 10 − 1) ≈ 0.05
• Overlap coefficient = 1 / min(10, 10) = 0.1
• Cosine coefficient = 1 / sqrt(10 × 10) = 0.1
• Cosine is very useful for comparing vectors of widely differing sparsity: with one vector that has a single non-zero entry and another with 1000 non-zero entries (sharing that one entry), Dice gives ≈ 0.002 while cosine gives ≈ 0.03 (see the sketch below).
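A small sketch (not from the slides) verifying the numbers above, treating the binary vectors as sets of their non-zero positions:

    from math import sqrt

    X = [1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1]
    Y = [1,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0]

    x = {i for i, v in enumerate(X) if v}   # 10 non-zero positions
    y = {i for i, v in enumerate(Y) if v}   # 10 non-zero positions
    common = len(x & y)                     # 1 (only the first position matches)

    matching = common                               # 1
    dice     = 2 * common / (len(x) + len(y))       # 0.1
    jaccard  = common / len(x | y)                  # 1/19 ≈ 0.053
    overlap  = common / min(len(x), len(y))         # 0.1
    cosine   = common / sqrt(len(x) * len(y))       # 0.1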