
CMSC 723 / LING 645: Intro to Computational Linguistics

CMSC 723 / LING 645: Intro to Computational Linguistics. September 22, 2004: Dorr Supplement: PC-Kimmo Tutorial. Prof. Bonnie J. Dorr, Dr. Christof Monz, TA: Adam Lee. Building Automata in Kimmo: English Epenthesis.


Presentation Transcript


  1. CMSC 723 / LING 645: Intro to Computational Linguistics. September 22, 2004: Dorr Supplement: PC-Kimmo Tutorial. Prof. Bonnie J. Dorr, Dr. Christof Monz, TA: Adam Lee.

  2. Building Automata in Kimmo: English Epenthesis. Chomsky & Halle: +s → es / X__, else s; where X = {s, z, ch, sh, x}. How do we implement this? X+s ==> Xes; else Y+s ==> Ys. What does this look like? Let S = {x, s, z}. [Diagram: automaton with arcs labeled S; +:0, =; +:e; s.] Note 1: S = S:S, s = s:s, etc. Note 2: There is no way out of the last state except on 's'.

  3. Building Automata in Kimmo: English Epenthesis. Chomsky & Halle: +s → es / X__, else s; where X = {s, z, ch, sh, x}. How do we implement this? X+s ==> Xes; else Y+s ==> Ys. What does this look like? Let S = {x, s, z}. [Diagram: automaton with arcs labeled S; +:0, =; +:e; +:0; =, S; s.] Note 1: S = S:S, s = s:s, etc. Note 2: There is no way out of the last state except on 's'. Note 3: There is no way out of the intermediate state on 's'.
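The rewrite rule on this slide can be sketched directly in Python before turning to automata. This is only an illustration of "X+s ==> Xes; else Y+s ==> Ys", not how PC-Kimmo applies rules; the function name `attach_s` and the use of `+` as the only morpheme boundary are assumptions made for the example.

```python
# Stem endings that trigger e-insertion before the suffix "s".
SIBILANT_ENDINGS = ("s", "z", "x", "ch", "sh")


def attach_s(lexical: str) -> str:
    """Map a lexical form such as 'fox+s' to a surface form such as 'foxes'."""
    stem, boundary, suffix = lexical.partition("+")
    if boundary and suffix == "s" and stem.endswith(SIBILANT_ENDINGS):
        return stem + "es"   # X+s ==> Xes
    return stem + suffix     # Y+s ==> Ys (the boundary is simply dropped)


print(attach_s("fox+s"))     # foxes
print(attach_s("church+s"))  # churches
print(attach_s("cat+s"))     # cats
```

A one-directional function like this cannot run in reverse (recognizing "foxes" as "fox+s"), which is one reason the slides build a two-level automaton instead.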

  4. Building Automata in Kimmo: English Epenthesis. This takes care of x, s, z, but what about ch, sh? [Diagram: automaton with arcs labeled S; +:0, =; +:e; +:0; =, S; s.]

  5. Building Automata in Kimmo: English Epenthesis. This takes care of x, s, z, but what about ch, sh, ss? Add two new states. (And now we need to add numbers!) [Diagram: automaton extended with two states; new arcs labeled c; s; s, h, S.]

  6. Building Automata in Kimmo: English Epenthesis. Problem: Now that we have introduced c, s, h, we need to worry about these in other states! Add numbers to states! [Diagram: six-state automaton, states numbered 1-6.]

  7. Building Automata in Kimmo: English Epenthesis. Problem: Now that we have introduced c, s, h, we need to worry about these in other states! Add numbers to states! [Diagram: six-state automaton, states numbered 1-6, with additional arcs on c and h.]

  8. Building Automata in Kimmo: English Epenthesis. Problem: In fact, we have to worry about all feasible pairs in every state. Add numbers to states! [Diagram: six-state automaton, states numbered 1-6, with additional arcs on =.] Note: If a feasible pair is missing between two states, that pair is assumed to be impossible; e.g., we cannot go from 6 to 1 on s:s.

  9. Building Automata in Kimmo: English Epenthesis. Unfortunately, life can get even more complicated (because of the restriction below): there can be interaction with other automata. For example, Epenthesis interacts with Y-replacement, so we need a feasible pair for y:i and +:0 in Epenthesis! [Diagram: six-state automaton, states numbered 1-6.] Note: If a feasible pair is missing between two states, that pair is assumed to be impossible; e.g., we cannot go from 6 to 1 on s:s.

  10. Building Automata in Kimmo: English Epenthesis. [Diagram: six-state automaton, states numbered 1-6.] Here is the actual automaton matrix used in simple-english.aut:

        RULE "Epenthesis" 6 9
             c  h  s  S  y  +  +  =  _
             c  h  s  S  i  e  0  =  _
         1:  2  1  4  3  3  0  1  1  1
         2:  2  3  3  3  3  0  1  1  1
         3:  2  1  3  3  3  5  6  1  1
         4:  2  3  3  3  3  5  6  1  1
         5.  0  0  1  0  0  0  0  0  0
         6:  1  1  0  1  1  5  6  1  1

      Hint: This is the sort of thing you'll need for German find+t.
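To see how the matrix is read, here is a minimal Python sketch that steps through it one lexical:surface pair at a time. It is not PC-Kimmo's implementation. The names `TABLE`, `FINAL`, `column`, and `accepts` are invented for the example, and several points are assumptions: the two strings are pre-aligned to equal length with `0` for a null surface character, a specific column such as s:s takes precedence over the subset S, the row marked `5.` is the only non-final state, and the last column (`_`) is never selected because the slide does not say what it stands for.

```python
# Rows copied from the slide: state -> next state per column (0 means "fail").
# Column order: c:c, h:h, s:s, S:S, y:i, +:e, +:0, =:=, and the unexplained "_".
TABLE = {
    1: [2, 1, 4, 3, 3, 0, 1, 1, 1],
    2: [2, 3, 3, 3, 3, 0, 1, 1, 1],
    3: [2, 1, 3, 3, 3, 5, 6, 1, 1],
    4: [2, 3, 3, 3, 3, 5, 6, 1, 1],
    5: [0, 0, 1, 0, 0, 0, 0, 0, 0],
    6: [1, 1, 0, 1, 1, 5, 6, 1, 1],
}
FINAL = {1, 2, 3, 4, 6}  # assumption: "5." on the slide marks a non-final state


def column(lex: str, surf: str) -> int:
    """Pick the column for one lexical:surface pair (specific pairs first)."""
    special = {("y", "i"): 4, ("+", "e"): 5, ("+", "0"): 6}
    if (lex, surf) in special:
        return special[(lex, surf)]
    if lex == surf and lex in "chs":
        return "chs".index(lex)
    if lex == surf and lex in "xz":  # remaining members of S = {x, s, z}
        return 3
    return 7  # "=": any other pair


def accepts(lexical: str, surface: str) -> bool:
    """Check an aligned lexical/surface pair against the Epenthesis rule only."""
    if len(lexical) != len(surface):
        raise ValueError("strings must be aligned; use 0 for a null character")
    state = 1
    for lex, surf in zip(lexical, surface):
        state = TABLE[state][column(lex, surf)]
        if state == 0:
            return False
    return state in FINAL


print(accepts("fox+s", "foxes"))  # True
print(accepts("fox+s", "fox0s"))  # False
print(accepts("cat+s", "cat0s"))  # True
print(accepts("try+s", "tries"))  # True
```

The sketch checks this one rule in isolation. It does not verify that each pair is feasible, and in PC-Kimmo every rule automaton must accept the same pair sequence in parallel.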

  11. What about long-distance dependencies?
      • How do we handle Buch and Bücher?
      • Trick: Use maybe-umlaut, coupled with a special marker in the suffix.
      • Root form in lexicon: b|ch. The continuation class (or alternation) is called /NOUN-NEUT-ADD-ER.
      • In the lexicon for NOUN-NEUT-ADD-ER, put the ending +&er, where & indicates that the root may have a character that should be umlauted.
      • In the Maybe-Umlaut automaton, search for the sequence "| .. +&e" and change it to "* .. +&e".

  12. Building a Lexicon in Kimmo: Simple English Example
      • ALTERNATION /Root Root
      • ALTERNATION /N N
      • ALTERNATION /End End
      • LEXICON INITIAL
          0 /Root "("
      • LEXICON N
          0 /C1 "(cat n) (person p3) (number sg)"
          +s /C2 "(cat n) (person p3) (number pl)"
          +y /A "(cat a)"
      • LEXICON Root
          spy /N
          spy /V
      • LEXICON End
          0 # ")"
      • END
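The way continuation classes chain sublexicons together can be sketched as a small traversal in Python. This is an illustration only, not PC-Kimmo's lexicon loader. The names `ALTERNATIONS`, `LEXICONS`, and `expand` are invented for the example. The slide does not define /C1, /C2, /A, or /V, so the sketch simply stops when it reaches an undefined class (an assumption), and entries shown without a gloss get an empty string.

```python
from typing import Iterator

# Alternation name -> sublexicons it leads to (as listed on the slide).
ALTERNATIONS = {"/Root": ["Root"], "/N": ["N"], "/End": ["End"]}

# Sublexicon name -> entries of (lexical item, continuation, gloss).
# "0" is the null item; a missing gloss is represented as "".
LEXICONS = {
    "INITIAL": [("0", "/Root", "(")],
    "N": [
        ("0", "/C1", "(cat n) (person p3) (number sg)"),
        ("+s", "/C2", "(cat n) (person p3) (number pl)"),
        ("+y", "/A", "(cat a)"),
    ],
    "Root": [("spy", "/N", ""), ("spy", "/V", "")],
    "End": [("0", "#", ")")],
}


def expand(name: str, form: str = "", gloss: str = "") -> Iterator[tuple[str, str]]:
    """Yield (lexical form, accumulated gloss) for each path starting at `name`."""
    for item, continuation, entry_gloss in LEXICONS[name]:
        new_form = form + ("" if item == "0" else item)
        new_gloss = gloss + entry_gloss
        if continuation not in ALTERNATIONS:
            # "#" ends a word; classes not defined on the slide also stop here.
            yield new_form, new_gloss
            continue
        for sublexicon in ALTERNATIONS[continuation]:
            yield from expand(sublexicon, new_form, new_gloss)


for lexical_form, gloss in expand("INITIAL"):
    print(lexical_form, gloss)
```

In the full simple-english.dic the undefined classes presumably continue to further sublexicons, so the glosses printed here are left unclosed.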

  13. Demo of PC-Kimmo
      • cd kimmo
      • pckimmo
      • load rules simple-english.aut
      • load lexicon simple-english.dic
      • generate try+s
      • recognize tries
      • set tracing on
      • recognize tries
