CS 182 Sections 101 - 102



  1. March 15, 2006 CS 182 Sections 101 - 102 slides created by Eva Mok (emok@icsi.berkeley.edu), modified by JGM

  2. Announcements • a5 is due Friday night at 11:59pm • a6 is out tomorrow (2nd coding assignment), due the Monday after spring break • Midterm solution will be posted (soon)

  3. Quick Recap • This Week • you just had the midterm • a bit more motor control • some belief nets and feature structures • Coming up • Bailey’s model of learning hand action words

  4. Your Task: As far as the brain / thought / language is concerned, what is the single biggest mystery to you at this point?

  5. Remember Recruitment Learning? • One-shot learning • The idea is that for things like words or grammar, kids learn at least something from a single input • Granted, they might not get it completely right on the first shot • But over time, their knowledge slowly converges to the right answer (i.e., they build a model that fits the data)

  6. Model Merging • Goal: • learn a model given data • The model should: • explain the data well • be "simple" • be able to make generalizations

  7. Naïve way to make a model • create a special case for each piece of data • this of course gets the training data completely right • but it cannot generalize at all when test data comes in • how to fix this: Model Merging • "compact" the special cases into more descriptive rules without losing too much performance

  8. Basic idea of Model Merging • Start with the naïve model: one special case for each piece of data • While performance increases • Create a more general rule that explains some of the data • Discard the corresponding special cases
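A minimal Python sketch of this greedy loop (the helper names initial_model, candidate_merges, and score are hypothetical stand-ins, not from the actual VerbLearn or a6 code):

    # Generic model-merging loop (sketch only; helper names are hypothetical).
    def model_merging(data, initial_model, candidate_merges, score):
        """Start with one special case per datum, then greedily apply the
        merge that most improves the score until no merge helps."""
        model = initial_model(data)          # naive model: one rule per data point
        while True:
            candidates = [merge(model) for merge in candidate_merges(model)]
            if not candidates:
                return model
            best = max(candidates, key=lambda m: score(m, data))
            if score(best, data) <= score(model, data):
                return model                 # no merge improves performance: stop
            model = best                     # keep the general rule, drop the special cases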

  9. 2 examples of Model Merging • Bailey’s VerbLearn system • model that maps actions to verb labels • performance: complexity of model + ability to explain data → MAP • Assignment 6 - Grammar Induction • model that maps sentences to grammar rules • performance: size of grammar + derivation length of sentences → cost

  10. Grammar • Grammar: rules that govern which sentences are legal in a language • e.g. Regular Grammar, Context Free Grammar • Production rules in a grammar have the form α → β • Terminal symbols: a, b, c, etc. • Non-terminal symbols: S, A, B, X, etc. • Different classes of grammar restrict where these symbols can go • We’ll see an example on the next slide

  11. Right-Regular Grammar • Right-Regular Grammar is a further restricted class of Regular Grammar • Non-terminal symbols are always at the right end of a rule • e.g.: S → a b c X; X → d e; X → f • valid sentences would be "abcde" and "abcf"
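As a concrete illustration of this grammar, here is a small Python sketch that checks which strings the rules can derive (the representation and the derives helper are made up for this handout, not part of a6):

    # Right-regular grammar from the slide, as (lhs, rhs) pairs.
    # Non-terminals are uppercase tokens; everything else is a terminal.
    RULES = [
        ("S", ["a", "b", "c", "X"]),
        ("X", ["d", "e"]),
        ("X", ["f"]),
    ]

    def derives(symbol, target):
        """Return True if `symbol` can derive the list of terminals `target`."""
        for lhs, rhs in RULES:
            if lhs != symbol:
                continue
            # split the rhs into its terminal prefix and optional trailing non-terminal
            terminals, rest = (rhs[:-1], rhs[-1]) if rhs[-1].isupper() else (rhs, None)
            if target[:len(terminals)] != terminals:
                continue
            remainder = target[len(terminals):]
            if rest is None:
                if not remainder:
                    return True
            elif derives(rest, remainder):
                return True
        return False

    print(derives("S", list("abcde")))   # True
    print(derives("S", list("abcf")))    # True
    print(derives("S", list("abcd")))    # False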

  12. Grammar Induction • As input data (e.g. “abcde”, “abcf”) comes in, we’d like to build up a grammar that explains the data • We can certainly have one rule for each sentence we see in the data → naïve approach, no generalization • Would rather “compact” your grammar • In a6, you have two ways of doing this “compaction” • prefix merge • suffix merge

  13. Prefix merge: S → a b c d e; S → a b c f becomes S → a b c X; X → d e; X → f • Suffix merge: S → a b c d e; S → f c d e becomes S → a b X; S → f X; X → c d e • How do we find the model?
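A rough Python sketch of the two merge operations, with rules represented as (lhs, list-of-symbols) pairs; the function names and signatures here are illustrative only, not the actual a6 API:

    # Hypothetical merge helpers (not from the a6 starter code).
    def prefix_merge(r1, r2, n, new_nt):
        """Merge two rules with the same lhs that share their first n symbols:
        keep the shared prefix followed by a new non-terminal, and move the
        two differing tails into rules for that non-terminal."""
        (lhs1, rhs1), (lhs2, rhs2) = r1, r2
        assert lhs1 == lhs2 and rhs1[:n] == rhs2[:n]
        return [(lhs1, rhs1[:n] + [new_nt]),
                (new_nt, rhs1[n:]),
                (new_nt, rhs2[n:])]

    def suffix_merge(r1, r2, n, new_nt):
        """Merge two rules with the same lhs that share their last n symbols:
        each rule keeps its own head followed by a new non-terminal that
        expands to the shared suffix."""
        (lhs1, rhs1), (lhs2, rhs2) = r1, r2
        assert lhs1 == lhs2 and rhs1[-n:] == rhs2[-n:]
        return [(lhs1, rhs1[:-n] + [new_nt]),
                (lhs2, rhs2[:-n] + [new_nt]),
                (new_nt, rhs1[-n:])]

    # The prefix-merge example above, merging on the length-3 prefix "a b c":
    r1 = ("S", ["a", "b", "c", "d", "e"])
    r2 = ("S", ["a", "b", "c", "f"])
    print(prefix_merge(r1, r2, 3, "X"))
    # [('S', ['a', 'b', 'c', 'X']), ('X', ['d', 'e']), ('X', ['f'])]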

  14. Contrived Example • Suppose you have these 3 grammar rules: r1: S → eat them here or there r2: S → eat them anywhere r3: S → like them anywhere or here or there • 5 merging options • prefix merge (r1, r2, 1) • prefix merge (r1, r2, 2) • suffix merge (r1, r3, 1) • suffix merge (r1, r3, 2) • suffix merge (r1, r3, 3)
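To see where these five options come from, the throwaway sketch below simply enumerates the shared prefixes and suffixes between each pair of rules; the length argument is the number of shared words:

    from itertools import combinations

    rules = {
        "r1": "eat them here or there".split(),
        "r2": "eat them anywhere".split(),
        "r3": "like them anywhere or here or there".split(),
    }

    def shared_len(xs, ys):
        """Number of leading symbols two lists have in common."""
        n = 0
        for a, b in zip(xs, ys):
            if a != b:
                break
            n += 1
        return n

    for (na, a), (nb, b) in combinations(rules.items(), 2):
        for k in range(1, shared_len(a, b) + 1):
            print(f"prefix merge ({na}, {nb}, {k})")
        for k in range(1, shared_len(a[::-1], b[::-1]) + 1):
            print(f"suffix merge ({na}, {nb}, {k})")
    # prints exactly the 5 options listed on the slide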

  15. Computationally • Kids aren’t presented all the data at once • Instead they’ll hear these sentences one by one: • eat them here or there • eat them anywhere • like them anywhere or here or there • As each sentence (i.e. data) comes in, you create one rule for it, e.g. S → eat them here or there • Then you look for ways to merge as more sentences come in

  16. Example 1: just prefix merge • After the first two sentences are presented, we can already do a prefix merge of length 2: r1: S → eat them here or there r2: S → eat them anywhere r3: S → eat them X1 r4: X1 → here or there r5: X1 → anywhere (r1 and r2 are then discarded)

  17. Example 2: just suffix merge • After the first three sentences are presented, we can do a suffix merge of length 3: r1: S → eat them here or there r2: S → eat them anywhere r3: S → like them anywhere or here or there r4: S → eat them X2 r5: S → like them anywhere or X2 r6: X2 → here or there (r1 and r3 are then discarded; r2 stays)

  18. Your Task in a6 • pull in sentences one by one • monitor your sentences • do either a prefix merge or a suffix merge as soon as it’s “good” to do so
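Schematically, the loop looks something like the sketch below; this is only an illustration of the idea, and candidate_merges and cost stand in for whatever the actual assignment code provides:

    # Incremental grammar induction sketch: one new rule per incoming sentence,
    # then greedily apply any prefix/suffix merge that lowers the cost c(G).
    def induce(sentences, alpha, candidate_merges, cost):
        grammar = []
        for sentence in sentences:
            grammar.append(("S", sentence.split()))      # one rule per sentence
            improved = True
            while improved:
                improved = False
                for merged in candidate_merges(grammar):  # all legal prefix/suffix merges
                    if cost(merged, sentences, alpha) < cost(grammar, sentences, alpha):
                        grammar = merged                  # the merge is "good": take it
                        improved = True
                        break
        return grammar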

  19. How do we know if a model is good? • want a small grammar • but want it to explain the data well • minimize the cost along the way: c(G) = α · s(G) + d(G, D) • s(G): size of grammar • d(G, D): derivation length of sentences • α: learning factor to play with
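In code, and assuming (as the worked example on the next two slides suggests) that s(G) counts the right-hand-side symbols of all rules and d(G, D) is the total number of rule applications needed to derive the data, the cost works out to roughly the following; the learning factor is written here as alpha, and computing derivation lengths for real would need a small parser, so they are passed in directly:

    # Cost sketch: c(G) = alpha * s(G) + d(G, D).
    def grammar_size(grammar):
        """s(G): total number of symbols on the right-hand sides of all rules."""
        return sum(len(rhs) for _, rhs in grammar)

    def cost(grammar, derivation_lengths, alpha):
        """d(G, D) is the total number of rule applications used to derive the
        data; here the per-sentence derivation lengths are supplied directly."""
        return alpha * grammar_size(grammar) + sum(derivation_lengths)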

  20. Back to Example 2 • Remember your data is: • eat them here or there • eat them anywhere • like them anywhere or here or there • Your original grammar: r1: S → eat them here or there r2: S → eat them anywhere r3: S → like them anywhere or here or there • size of grammar = 15 • derivation length of sentences = 1 + 1 + 1 = 3 • c(G) = α · s(G) + d(G, D) = α · 15 + 3

  21. Back to Example 2 • Remember your data is: • eat them here or there • eat them anywhere • like them anywhere or here or there • Your new grammar: r2: S → eat them anywhere r4: S → eat them X2 r5: S → like them anywhere or X2 r6: X2 → here or there • size of grammar = 14 • derivation length of sentences = 2 + 1 + 2 = 5 • c(G) = α · s(G) + d(G, D) = α · 14 + 5 • so in fact you SHOULDN’T merge if α ≤ 2
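Plugging the two grammars from slides 20 and 21 into that cost formula reproduces the numbers on the slides and the α > 2 threshold (self-contained sketch with the cost inlined):

    # Reproducing the slide arithmetic.
    def cost(grammar, derivation_lengths, alpha):
        return alpha * sum(len(rhs) for _, rhs in grammar) + sum(derivation_lengths)

    old_grammar = [
        ("S", "eat them here or there".split()),
        ("S", "eat them anywhere".split()),
        ("S", "like them anywhere or here or there".split()),
    ]
    new_grammar = [
        ("S", "eat them anywhere".split()),
        ("S", "eat them X2".split()),
        ("S", "like them anywhere or X2".split()),
        ("X2", "here or there".split()),
    ]

    for alpha in (1, 2, 3):
        old = cost(old_grammar, [1, 1, 1], alpha)   # alpha * 15 + 3
        new = cost(new_grammar, [2, 1, 2], alpha)   # alpha * 14 + 5
        print(alpha, old, new, "merge" if new < old else "don't merge")
    # alpha = 1: 18 vs 19; alpha = 2: 33 vs 33; alpha = 3: 48 vs 47
    # so the merge only pays off once alpha > 2, matching the slide.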
