LING 438/538 Computational Linguistics Sandiway Fong Lecture 26
Administrivia • Reminder • 538 requirement • Email me chapter/section presentation choices (1st-3rd) • Range: chapter 14 through 25: first come, first served…
Today’s Topics • We didn’t finish reviewing Homework 6 … • Finite State Transducers (FST) • Background: JM sections 3.4-3.7 • English spelling rules • Homework 7 (due next Thursday)
Homework 6 Review
Question 1: write a Prolog grammar for the following sentences
Examples:
• (a) John praises himself
• (a) John thinks he is smart
• (a) John considers himself to be intelligent
• (a) John likes his dog
• (a) John found a picture of himself
• (a) John found Mary’s picture of herself
Examples:
• (b) Mary praises herself
• (b) Mary thinks she is smart
• (b) Mary considers herself to be intelligent
• (b) Mary likes her dog
• (b) Mary found a picture of herself
• (b) Mary found John’s picture of himself
Homework 6 Review
• It’s straightforward to build a grammar to cover the examples
• Unfortunately, it also overgenerates …
• We want to rule out ungrammatical examples:
• *John praises herself
• *John thinks himself is smart
• *John considers herself to be intelligent
• *John likes himself’s dog
• *John found a picture of herself
• *John found Mary’s picture of himself
Homework 6 Review Idea: propagate value of lexical feature Gender up the tree
Homework 6 Review [Figures: parse trees for the examples, with Gender values (m, f, or a variable G) annotated on the nodes]
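The percolation idea can be illustrated with a small sketch. It is written in Python rather than the Prolog used for the homework, and everything in it is an assumption made for illustration: the toy lexicon, the `VERBS` set, and the function names `vp_gender` and `grammatical` are hypothetical, and only simple subject-verb-object sentences are covered. The object's Gender value is passed up to the VP, and the S level checks it against the subject.

```python
# Hypothetical toy lexicon: word -> (category, gender).
LEXICON = {
    "John": ("name", "m"),
    "Mary": ("name", "f"),
    "himself": ("anaphor", "m"),
    "herself": ("anaphor", "f"),
}
VERBS = {"praises", "likes"}


def vp_gender(verb, obj):
    """Return the Gender value that the VP node passes up the tree.

    Only an anaphor object constrains the subject; a name passes up
    None, meaning "no constraint".
    """
    if verb not in VERBS or obj not in LEXICON:
        raise ValueError("outside the toy grammar")
    category, gender = LEXICON[obj]
    return gender if category == "anaphor" else None


def grammatical(sentence):
    """Check S -> NP VP, requiring the two Gender values to agree."""
    subj, verb, obj = sentence.split()
    category, subj_gender = LEXICON[subj]
    if category != "name":
        return False  # the toy grammar only allows names as subjects
    percolated = vp_gender(verb, obj)
    return percolated is None or percolated == subj_gender


print(grammatical("John praises himself"))  # agreement succeeds
print(grammatical("John praises herself"))  # agreement fails
```

In a Prolog grammar the same effect would come from sharing one Gender variable between the NP and VP nonterminals; the explicit comparison here stands in for that unification.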
Homework 6 Review • Can’t use unrestricted Gender feature percolation, otherwise *she [Figure: parse tree containing the pronoun she]
Homework 6 Review • Practical solution: block anaphors from appearing in subject position under sbar. • Key linguistic insight: anaphors and pronouns behave differently! [Figure: parse tree containing the pronoun she]
Homework 6 Review
• Practical solution: prevent anaphors from appearing in subject position under sbar (pronouns are allowed there)
• Two categories of personal pronouns: prp and prp1
• np1 behaves just like np but cannot access prp1
Homework 6 Review • For information on a more general solution … • Google Binding Theory
Homework 6 Review
• Feature percolation implementation is difficult… maybe impossible…
• Local usually means in the same S
• c-command (constituent command)
Today’s Topic • Going to step away from sentence-level syntax and go back to talking about words (for a couple of lectures) … • Let’s talk about a generalization of the finite state automaton (FSA): the finite state transducer (FST)
Finite State Transducers (FST)
• Problem:
• not all FSTs can be turned into a deterministic FST (DFST)
• i.e. non-deterministic FSTs and DFSTs are not equivalent in power
• e.g. a DFST can only compute functions; it has no way to handle one-to-many relations
• [Figure: FST with start state 1, an arc a:b to state 2, and an arc a:c to state 3]
• There are also functions that a non-deterministic FST can compute but a DFST cannot
• Restrictions (so that things can be determinized):
• Sequential transducers:
• deterministic on the input side
• no epsilon (ε) on the input side; ok on the output side
• Subsequential transducers = sequential transducer + extra tail output at final state(s)
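The one-to-many example can be made concrete with a minimal sketch of a non-deterministic FST. The arc table follows the a:b / a:c example; treating states 2 and 3 as final is an assumption, and the names `ARCS`, `FINAL`, and `transduce` are hypothetical. Because state 1 has two arcs on the same input symbol, a single input maps to more than one output, which no transducer that is deterministic on the input side can reproduce.

```python
# Arcs of the assumed machine: (state, input symbol) -> [(output, next state)].
ARCS = {
    (1, "a"): [("b", 2), ("c", 3)],  # two arcs on the same input: non-deterministic
}
FINAL = {2, 3}  # assumption: both target states are accepting


def transduce(inp, start=1):
    """Return the set of all outputs the FST relates to the input string."""
    configs = {(start, "")}  # every (state, output so far) reachable
    for sym in inp:
        configs = {
            (nxt, out + emitted)
            for (state, out) in configs
            for (emitted, nxt) in ARCS.get((state, sym), [])
        }
    return {out for (state, out) in configs if state in FINAL}


print(sorted(transduce("a")))  # one input string, two output strings
```

Tracking a set of configurations is the transducer analogue of simulating a non-deterministic FSA by keeping a set of current states.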
Spelling Rules • aka orthographic rules • Examples: • Note: the Porter stemmer tries to reconstruct these… • Idea: apply orthographic rules to fix up the output of a morphological parsing FST:
Spelling Rules • Tlex:
e-insertion FST • Context-sensitive rule (parts labeled in the figure: rewrite, left context, right context) • Corresponding FST: other = any feasible pair not in this transducer
e-insertion FST
• other = any feasible pair not in this transducer, i.e. anything not covered by the other columns in the table
• State transition table and FST: [figures not preserved]
• [Figure: example runs from Input to Output. One run visits q0 q0 q0 q1 q2 q5 and ends in Reject; states q3, q4, q0 appear on a run that ends in Accept]
• [Figure: example run visiting q0 q0 q1 q1 q0 q0 q0, ending in Accept]
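A table-driven implementation of the e-insertion FST might look like the sketch below. The transition table did not survive in these notes, so the one used here is an assumption: it is the table as recalled from the JM textbook treatment of the rule ε → e / {x, s, z} ^ __ s #, and it should be checked against the book before being relied on. It is at least consistent with the example runs listed above. Deleting # on the output side and the names `TABLE`, `ACCEPTING`, `column`, and `surface_forms` are also assumptions.

```python
# Assumed state transition table (recalled, not taken from these notes).
# Columns: s, x, z, ^ (output nothing), eps:e (insert e), # and other.
ACCEPTING = {0, 1, 2}
TABLE = {
    0: {"s": 1, "x": 1, "z": 1, "^": 0, "#": 0, "other": 0},
    1: {"s": 1, "x": 1, "z": 1, "^": 2, "#": 0, "other": 0},
    2: {"s": 5, "x": 1, "z": 1, "^": 0, "eps:e": 3, "#": 0, "other": 0},
    3: {"s": 4},
    4: {"#": 0},
    5: {"s": 1, "x": 1, "z": 1, "^": 2, "other": 0},
}


def column(ch):
    """Map an input character to its table column."""
    return ch if ch in "sxz^#" else "other"


def surface_forms(intermediate):
    """Return every surface string the FST pairs with the intermediate form."""
    results = set()
    stack = [(0, 0, "")]  # (state, input position, output so far)
    while stack:
        state, pos, out = stack.pop()
        row = TABLE[state]
        if pos == len(intermediate):
            if state in ACCEPTING:
                results.add(out)
            continue
        if "eps:e" in row:  # insert e without consuming any input
            stack.append((row["eps:e"], pos, out + "e"))
        ch = intermediate[pos]
        if column(ch) in row:
            emitted = "" if ch in "^#" else ch  # boundaries are not output
            stack.append((row[column(ch)], pos + 1, out + emitted))
    return results


print(surface_forms("fox^s#"))
print(surface_forms("cat^s#"))
```

The search is non-deterministic at q2, where the machine may either read the next symbol or insert an e. For fox^s# the branch that reads s directly reaches q5 and dies on #, which matches the rejected run q0 q0 q0 q1 q2 q5, so only the branch with the inserted e survives.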
Homework 7
• Input: verb_stem^morpheme#
• ^ = morpheme boundary
• # = word boundary
• try^s# ⤇ tries
• try^ed# ⤇ tried
• play^s# ⤇ plays
• play^ed# ⤇ played
• yank^s# ⤇ yank^s#
• yank^ed# ⤇ yank^ed#
• Draw an FST that handles Y replacement for the examples shown
• Give the state transition table
• Implement your table in Perl
• Show working examples