
Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 24th March, 2011

CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 29: CYK; Inside Probability; Parse Tree Construction)


Presentation Transcript


  1. CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 29: CYK; Inside Probability; Parse Tree Construction) • Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 24th March, 2011

  2. CYK Parsing

  3. Shared Sub-Problems: Example

  4. CKY Parsing: CNF • CKY parsing requires that the grammar consist of ε-free, binary rules, i.e., that it be in Chomsky Normal Form (CNF) • All rules are of the form A → B C or A → a • What does the tree look like? • What if my CFG isn’t in CNF? A → B C D → w
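One common answer to "What if my CFG isn't in CNF?" is to binarize long rules by introducing fresh non-terminals. The sketch below is only an illustration of that idea, not the lecture's procedure: the rule VP → VBD NP PP and the helper names `X1`, `X2`, ... are hypothetical, and rule probabilities are left out.

```python
def binarize(lhs, rhs, fresh_prefix="X"):
    """Split one rule lhs -> rhs (two or more symbols) into binary rules.

    The fresh non-terminals X1, X2, ... are hypothetical helper symbols.
    """
    rules = []
    counter = 0
    current = lhs
    symbols = list(rhs)
    while len(symbols) > 2:
        counter += 1
        new_nt = f"{fresh_prefix}{counter}"
        # Keep the first symbol and push the remainder down into new_nt.
        rules.append((current, (symbols[0], new_nt)))
        current = new_nt
        symbols = symbols[1:]
    rules.append((current, tuple(symbols)))
    return rules


# Hypothetical ternary rule, not part of the lecture's grammar.
for rule in binarize("VP", ["VBD", "NP", "PP"]):
    print(rule)
```

A right-branching split is used here for simplicity; a left-branching split would work as well. In a PCFG, one usual choice is to keep the original probability on the first new rule and give the helper rules probability 1.0.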

  5. CKY Algorithm

  6. Illustrating CYK [Cocke, Younger, Kasami] Algo • DT → the 1.0 • NN → gunman 0.5 • NN → building 0.5 • VBD → sprayed 1.0 • NNS → bullets 1.0 • S → NP VP 1.0 • NP → DT NN 0.5 • NP → NNS 0.3 • NP → NP PP 0.2 • PP → P NP 1.0 • VP → VP PP 0.6 • VP → VBD NP 0.4
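The grammar above can be written down as data and sanity-checked: in a PCFG, the probabilities of all rules sharing a left-hand side should sum to 1. This is a minimal sketch; the tuple layout and the function name `lhs_totals` are illustrative choices. The entry P → with 1.0 is an assumption: it is not in the slide's list, but P (1.0) over "with" appears in the example parses later in the lecture. Note also that NP → NNS is a unary rule, so the grammar is not strictly in CNF.

```python
from collections import defaultdict

# (left-hand side, right-hand side, probability), transcribed from the slide.
RULES = [
    ("DT", ("the",), 1.0),
    ("NN", ("gunman",), 0.5),
    ("NN", ("building",), 0.5),
    ("VBD", ("sprayed",), 1.0),
    ("NNS", ("bullets",), 1.0),
    ("P", ("with",), 1.0),  # assumption: taken from the later example parses
    ("S", ("NP", "VP"), 1.0),
    ("NP", ("DT", "NN"), 0.5),
    ("NP", ("NNS",), 0.3),
    ("NP", ("NP", "PP"), 0.2),
    ("PP", ("P", "NP"), 1.0),
    ("VP", ("VP", "PP"), 0.6),
    ("VP", ("VBD", "NP"), 0.4),
]


def lhs_totals(rules):
    """Sum the rule probabilities for each left-hand-side symbol."""
    totals = defaultdict(float)
    for lhs, _rhs, prob in rules:
        totals[lhs] += prob
    return dict(totals)


for lhs, total in sorted(lhs_totals(RULES).items()):
    print(f"{lhs}: {total:.2f}")
```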

  7. CYK: Start with (0,1) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (0,1)

  8. CYK: Keep filling diagonals • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (1,2)

  9. CYK: Try getting higher level structures • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (0,2)

  10. CYK: Diagonal continues • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (2,3)

  11. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

  12. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (3,4)

  13. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (4,5)

  14. CYK: starts filling the 5th column • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (3,5)

  15. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (2,5)

  16. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (1,5)

  17. CYK: S found, but NO termination! • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (0,5)

  18. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (5,6)

  19. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7

  20. CYK: Control moves to last column • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (6,7)

  21. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (5,7)

  22. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (3,7)

  23. CYK (cont…) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (2,7)

  24. CYK: filling the last column • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (2,7)

  25. CYK: terminates with S in (0,7) • 0 The 1 gunman 2 sprayed 3 the 4 building 5 with 6 bullets 7 • Span: (0,7)
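The column-by-column filling traced in the slides above can be sketched as a small recognizer. This is a minimal illustration under stated assumptions, not the lecture's code: the names `LEXICAL`, `UNARY`, `BINARY`, `close_unary` and `cyk` are illustrative, the entry for "with" (P) is assumed from the later example parses, and the unary rule NP → NNS is handled by an extra closure step because it falls outside strict CNF. Spans use the same fencepost indices 0..7 as the slides.

```python
LEXICAL = {
    "the": {"DT"}, "gunman": {"NN"}, "building": {"NN"},
    "sprayed": {"VBD"}, "bullets": {"NNS"},
    "with": {"P"},  # assumption: taken from the later example parses
}
UNARY = {"NNS": {"NP"}}  # child -> parents, for NP -> NNS
BINARY = {
    ("NP", "VP"): {"S"},
    ("DT", "NN"): {"NP"},
    ("NP", "PP"): {"NP"},
    ("P", "NP"): {"PP"},
    ("VP", "PP"): {"VP"},
    ("VBD", "NP"): {"VP"},
}


def close_unary(cell):
    """Add parents of unary rules until nothing new appears."""
    agenda = list(cell)
    while agenda:
        child = agenda.pop()
        for parent in UNARY.get(child, ()):
            if parent not in cell:
                cell.add(parent)
                agenda.append(parent)


def cyk(words):
    """Return a chart mapping each span (i, j) to the set of non-terminals covering it."""
    n = len(words)
    chart = {}
    for j in range(1, n + 1):            # one column per word
        cell = set(LEXICAL.get(words[j - 1].lower(), ()))
        close_unary(cell)
        chart[(j - 1, j)] = cell
        for i in range(j - 2, -1, -1):   # move up the column
            cell = set()
            for k in range(i + 1, j):    # every split point
                for b in chart[(i, k)]:
                    for c in chart[(k, j)]:
                        cell |= BINARY.get((b, c), set())
            close_unary(cell)
            chart[(i, j)] = cell
    return chart


chart = cyk("The gunman sprayed the building with bullets".split())
print("S" in chart[(0, 5)])  # True: S found, but the input is not exhausted
print("S" in chart[(0, 7)])  # True: S covers the whole sentence
```

The sentence is accepted only when the start symbol appears in the cell for the full span (0,7), which is why finding S in (0,5) does not terminate the algorithm.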

  26. CYK: Extracting the Parse Tree • The parse tree is obtained by keeping back pointers. • [S (0-7) [NP (0-2) [DT (0-1) The] [NN (1-2) gunman]] [VP (2-7) [VBD (2-3) sprayed] [NP (3-7) [NP (3-5) [DT (3-4) the] [NN (4-5) building]] [PP (5-7) [P (5-6) with] [NP (6-7) [NNS (6-7) bullets]]]]]]

  27. Probabilistic parse tree construction

  28. Interesting Probabilities • Sentence with word positions: The (1) gunman (2) sprayed (3) the (4) building (5) with (6) bullets (7) • Inside probabilities: what is the probability of having an NP at this position such that it will derive “the building”? • Outside probabilities: what is the probability of starting from $N^1$ and deriving “The gunman sprayed”, an NP and “with bullets”?

  29. Interesting Probabilities • Random variables to be considered: • The non-terminal being expanded, e.g., NP • The word span covered by the non-terminal, e.g., (4,5) refers to the words “the building” • While calculating probabilities, consider: • The rule to be used for expansion, e.g., NP → DT NN • The probabilities associated with the RHS non-terminals, e.g., the inside/outside probabilities of the DT subtree and of the NN subtree

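  30. Outside Probability • $\alpha_j(p,q)$: the probability of beginning with $N^1$ and generating the non-terminal $N^j_{pq}$ and all the words outside $w_p \ldots w_q$, i.e., $\alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)$ • (Figure: $N^1$ at the root and $N^j$ over $w_p \ldots w_q$, with $w_1 \ldots w_{p-1}$ to its left and $w_{q+1} \ldots w_m$ to its right)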

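  31. Inside Probabilities • $\beta_j(p,q)$: the probability of generating the words $w_p \ldots w_q$ starting with the non-terminal $N^j_{pq}$, i.e., $\beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)$ • (Figure: $N^1$ at the root, $\alpha$ marking the region outside $N^j$ and $\beta$ the region below it; words $w_1 \ldots w_{p-1}$, $w_p \ldots w_q$, $w_{q+1} \ldots w_m$)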

  32. Outside & Inside Probabilities: example • (Figure: tree rooted at $N^1$ with an NP node, over the sentence The (1) gunman (2) sprayed (3) the (4) building (5) with (6) bullets (7))

  33. Calculating Inside Probabilities $\beta_j(p,q)$ • Base case: • The base case is used for rules which derive the words (terminals) directly • E.g., suppose $N^j$ = NN is being considered and NN → building is one of the rules, with probability 0.5

  34. Induction Step: Assuming Grammar in Chomsky Normal Form • Induction step: $N^j$ expands to $N^r N^s$, where $N^r$ covers $w_p \ldots w_d$ and $N^s$ covers $w_{d+1} \ldots w_q$ • Consider the different splits of the words, indicated by d, e.g., “the huge building” can be split at d=2 or d=3 • Consider the different non-terminals to be used in the rule: NP → DT NN and NP → DT NNS are available options • Consider the summation over all these.
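The base case and the induction step described on these two slides can be written out as follows. This is the standard inside recursion for a PCFG in Chomsky Normal Form, using the slide's symbols ($N^j$, $N^r$, $N^s$, split point $d$); writing $\beta$ for the inside probability is an assumption, since the symbol itself did not survive in the transcript.

```latex
% Base case: a rule that derives the word w_k directly
\beta_j(k,k) = P(N^j \to w_k \mid G)

% Induction step for p < q: sum over all rules N^j -> N^r N^s and all split points d
\beta_j(p,q) = \sum_{r,s} \; \sum_{d=p}^{q-1}
    P(N^j \to N^r N^s \mid G)\; \beta_r(p,d)\; \beta_s(d+1,q)
```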

  35. The Bottom-Up Approach • The idea of induction • Consider “the gunman” • Base cases: apply the unary rules DT → the (Prob = 1.0) and NN → gunman (Prob = 0.5) • Induction: the probability that an NP covers these 2 words = P(NP → DT NN) × P(DT deriving the word “the”) × P(NN deriving the word “gunman”) = 0.5 × 1.0 × 0.5 = 0.25 • (Figure: [NP 0.5 [DT 1.0 The] [NN 0.5 gunman]])

  36. Parse Triangle • A parse triangle is constructed for calculating $\beta_j(p,q)$ • Probability of a sentence using $\beta_j(p,q)$: $P(w_{1m} \mid G) = \beta_1(1,m)$

  37. Parse Triangle • Rows are indexed by the start position (from) and columns by the end position (to) • Fill the diagonal with $\beta_j(k,k)$

  38. Parse Triangle • Calculate the remaining entries $\beta_j(p,q)$, $p < q$, using the induction formula
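Filling the parse triangle bottom-up can be sketched as below, using the lecture's grammar and 1-based inclusive word positions (p, q). This is an illustrative sketch, not the lecture's code: the dictionary names and the function `inside` are made up for the example, the rule P → with 1.0 is assumed from the example parses, and the unary rule NP → NNS is applied only on the diagonal, which is sufficient here because its child is a preterminal.

```python
from collections import defaultdict

LEXICAL = {
    ("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
    ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0,
    ("P", "with"): 1.0,  # assumption: taken from the example parses
}
UNARY = {("NP", "NNS"): 0.3}
BINARY = {
    ("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5, ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0, ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4,
}


def inside(words):
    """Return beta[(A, p, q)], the inside probability of A over words p..q."""
    m = len(words)
    beta = defaultdict(float)
    # Diagonal: lexical rules, then the unary rule over a preterminal.
    for k, word in enumerate(words, start=1):
        for (a, w), prob in LEXICAL.items():
            if w == word.lower():
                beta[(a, k, k)] += prob
        for (a, b), prob in UNARY.items():
            beta[(a, k, k)] += prob * beta[(b, k, k)]
    # Longer spans: sum over rules and split points d.
    for length in range(2, m + 1):
        for p in range(1, m - length + 2):
            q = p + length - 1
            for (a, b, c), prob in BINARY.items():
                for d in range(p, q):
                    beta[(a, p, q)] += prob * beta[(b, p, d)] * beta[(c, d + 1, q)]
    return beta


beta = inside("The gunman sprayed the building with bullets".split())
print(f"{beta[('NP', 1, 2)]:.4f}")  # 0.2500, the 'the gunman' example
print(f"{beta[('S', 1, 7)]:.4f}")   # 0.0060, probability of the sentence
```

The value for S over (1,7) is the sentence probability, since S is the start symbol.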

  39. Example Parse t1 • The gunman sprayed the building with bullets. • Rule used for the top-level VP: VP → VP PP • [S 1.0 [NP 0.5 [DT 1.0 The] [NN 0.5 gunman]] [VP 0.6 [VP 0.4 [VBD 1.0 sprayed] [NP 0.5 [DT 1.0 the] [NN 0.5 building]]] [PP 1.0 [P 1.0 with] [NP 0.3 [NNS 1.0 bullets]]]]]

  40. Another Parse t2 • The gunman sprayed the building with bullets. • Rule used for the top-level VP: VP → VBD NP • [S 1.0 [NP 0.5 [DT 1.0 The] [NN 0.5 gunman]] [VP 0.4 [VBD 1.0 sprayed] [NP 0.2 [NP 0.5 [DT 1.0 the] [NN 0.5 building]] [PP 1.0 [P 1.0 with] [NP 0.3 [NNS 1.0 bullets]]]]]]
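As a worked check, the probability of each parse is the product of the probabilities of the rules it uses. The grouping below follows the subtrees of t1 and t2 (subject NP, verb phrase, prepositional phrase); the notation $P(t_1)$, $P(t_2)$ is introduced here for illustration.

```latex
P(t_1) = 1.0 \times (0.5 \times 1.0 \times 0.5) \times 0.6
         \times (0.4 \times 1.0 \times 0.5 \times 1.0 \times 0.5)
         \times (1.0 \times 1.0 \times 0.3 \times 1.0) = 0.0045

P(t_2) = 1.0 \times (0.5 \times 1.0 \times 0.5) \times 0.4 \times 1.0 \times 0.2
         \times (0.5 \times 1.0 \times 0.5)
         \times (1.0 \times 1.0 \times 0.3 \times 1.0) = 0.0015

P(t_1) + P(t_2) = 0.006
```

Under this grammar these are the only two parses of the sentence, so their sum is the inside probability of S over the whole sentence, and t1 is the more probable parse.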

  41. Parse Triangle

  42. Different Parses • Consider: • Different splitting points, e.g., the 5th and 3rd positions • Different rules for VP expansion, e.g., VP → VP PP and VP → VBD NP • Different parses for the VP “sprayed the building with bullets” can be constructed this way.

  43. The Viterbi-like Algorithm for PCFGs • Very similar to the calculation of the inside probabilities $\beta_i(p,q)$ • Instead of summing over all ways of constructing the parse for $w_{pq}$, choose only the best way (the one with maximum probability)

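  44. Calculation of $\delta_i(p,q)$ for VP(3,7) • VP → VP PP, split at 5: 0.6 × 0.1 × 0.3 = 0.018, where $\delta_{VP}(3,5)$ = 0.4 × 1.0 × 0.25 = 0.1 and $\delta_{PP}(6,7)$ = 0.3 (this rule is chosen) • VP → VBD NP, split at 3: 0.4 × 1.0 × 0.015 = 0.006, where $\delta_{VBD}(3,3)$ = 1.0 and $\delta_{NP}(4,7)$ = 0.2 × 0.25 × 0.3 = 0.015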

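  45. Viterbi-like Algorithm • Base case: $\delta_i(p,p) = P(N^i \to w_p)$ • Induction: $\delta_i(p,q) = \max_{j,k,\; p \le r < q} P(N^i \to N^j N^k)\, \delta_j(p,r)\, \delta_k(r+1,q)$ • $\psi_i(p,q)$ stores: • The RHS of the rule selected • The position of the split • Example: $\psi_{VP}(3,7)$ stores VP, PP and split position = 5, because VP → VP PP is the rule used. • Backtracing: start from $\delta_1(1,7)$ and $\psi_1(1,7)$ and backtrace.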

  46. Example • $\psi_1(1,7)$ records S → NP VP and split position 2 • $\psi_{NP}(1,2)$ records NP → DT NN and split position 1 • $\psi_{VP}(3,7)$ records VP → VP PP and split position 5 • (Figure: S over NP and VP; word positions: The (1) gunman (2) sprayed (3) the (4) building (5) with (6) bullets (7))

  47. Example • [S [NP [DT The] [NN gunman]] [VP [VP sprayed the building] [PP with bullets]]] • Word positions: The (1) gunman (2) sprayed (3) the (4) building (5) with (6) bullets (7)
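The Viterbi-like computation and the backtrace illustrated in these examples can be sketched as follows, keeping the best probability and a back pointer (the RHS of the chosen rule and the split position) for every non-terminal and span. This is a minimal sketch under stated assumptions: the names `delta`, `psi`, `viterbi` and `build` are illustrative, the rule P → with 1.0 is assumed from the example parses, and the unary rule NP → NNS is applied only over single words.

```python
LEXICAL = {
    ("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
    ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0,
    ("P", "with"): 1.0,  # assumption: taken from the example parses
}
UNARY = {("NP", "NNS"): 0.3}
BINARY = {
    ("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5, ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0, ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4,
}


def viterbi(words):
    """Return (delta, psi): best probabilities and back pointers per (A, p, q)."""
    m = len(words)
    delta, psi = {}, {}
    for k, word in enumerate(words, start=1):
        for (a, w), prob in LEXICAL.items():
            if w == word.lower():
                delta[(a, k, k)] = prob
                psi[(a, k, k)] = word            # back pointer to the word itself
        for (a, b), prob in UNARY.items():
            if (b, k, k) in delta:
                score = prob * delta[(b, k, k)]
                if score > delta.get((a, k, k), 0.0):
                    delta[(a, k, k)] = score
                    psi[(a, k, k)] = (b,)        # unary back pointer
    for length in range(2, m + 1):
        for p in range(1, m - length + 2):
            q = p + length - 1
            for (a, b, c), prob in BINARY.items():
                for d in range(p, q):
                    score = (prob * delta.get((b, p, d), 0.0)
                             * delta.get((c, d + 1, q), 0.0))
                    if score > delta.get((a, p, q), 0.0):   # max instead of sum
                        delta[(a, p, q)] = score
                        psi[(a, p, q)] = (b, c, d)          # RHS and split position
    return delta, psi


def build(psi, a, p, q):
    """Follow back pointers from (a, p, q) and return a bracketed tree."""
    back = psi[(a, p, q)]
    if isinstance(back, str):
        return f"({a} {back})"
    if len(back) == 1:
        return f"({a} {build(psi, back[0], p, q)})"
    b, c, d = back
    return f"({a} {build(psi, b, p, d)} {build(psi, c, d + 1, q)})"


delta, psi = viterbi("The gunman sprayed the building with bullets".split())
print(psi[("VP", 3, 7)])  # ('VP', 'PP', 5)
print(build(psi, "S", 1, 7))
```

Replacing the sum of the inside computation with a maximum is the only change to the recursion; the back pointers are what make it possible to recover the tree afterwards.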

  48. Grammar Induction • Annotated corpora like the Penn Treebank • Counts used as follows: • Sample training data: [NP [DT The] [NN boy]], [NP [DT Those] [NNS cars]], [NP [NNS Bears]], [NP [DT That] [NN book]], [NP [PRP She]]
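The formula for how the counts are used did not survive in the transcript. The sketch below assumes the usual relative-frequency (maximum likelihood) estimate, P(A → γ) = count(A → γ) / count(A), and applies it to the NP expansions in the five sample trees. The function name `estimate` is illustrative, and the lexical rules are left out for brevity.

```python
from collections import Counter

# NP expansions read off the five sample trees on the slide.
OBSERVED = [
    ("NP", ("DT", "NN")),   # The boy
    ("NP", ("DT", "NNS")),  # Those cars
    ("NP", ("NNS",)),       # Bears
    ("NP", ("DT", "NN")),   # That book
    ("NP", ("PRP",)),       # She
]


def estimate(rule_occurrences):
    """Relative-frequency estimate: count(A -> rhs) / count(A)."""
    rule_counts = Counter(rule_occurrences)
    lhs_counts = Counter(lhs for lhs, _rhs in rule_occurrences)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


for (lhs, rhs), prob in estimate(OBSERVED).items():
    print(f"{lhs} -> {' '.join(rhs)}: {prob:.1f}")
```

With this data, NP → DT NN is seen twice out of five NP expansions, so it gets probability 0.4, and each of the other three expansions gets 0.2.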

  49. Grammar Induction for Unannotated Corpora: EM algorithm • Start with initial estimates for the rule probabilities • EXPECTATION PHASE: compute the probability of each parse of a sentence according to the current estimates of the rule probabilities, then compute the expectation of how often each rule is used (summing the probabilities of the rules used in the previous step) • MAXIMIZATION PHASE: refine the rule probabilities so that the training corpus likelihood increases
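One iteration of the loop above can be sketched for a toy corpus in which the parses of each sentence are listed explicitly. This is only an illustration under stated assumptions: practical implementations use the inside and outside probabilities instead of enumerating parses, the corpus here is the single example sentence with its two parses (VP → VP PP versus NP → NP PP attachment), the initial estimates are the lecture's rule probabilities plus the assumed rule P → with 1.0, and `em_step` and `R` are hypothetical helper names.

```python
from collections import defaultdict
from math import prod


def R(lhs, *rhs):
    """Build a rule key (lhs, rhs)."""
    return (lhs, rhs)


def em_step(rule_probs, corpus):
    """One EM iteration; corpus is a list of sentences, each a list of parses,
    and each parse is the list of rules it uses."""
    expected = defaultdict(float)
    for parses in corpus:
        # E-step: probability of each parse, normalized within the sentence.
        scores = [prod(rule_probs[rule] for rule in parse) for parse in parses]
        total = sum(scores)
        for parse, score in zip(parses, scores):
            for rule in parse:
                expected[rule] += score / total   # expected rule usage
    # M-step: renormalize the expected counts per left-hand side.
    lhs_total = defaultdict(float)
    for (lhs, _rhs), count in expected.items():
        lhs_total[lhs] += count
    return {rule: count / lhs_total[rule[0]] for rule, count in expected.items()}


PROBS = {
    R("DT", "the"): 1.0, R("NN", "gunman"): 0.5, R("NN", "building"): 0.5,
    R("VBD", "sprayed"): 1.0, R("NNS", "bullets"): 1.0,
    R("P", "with"): 1.0,  # assumption: taken from the example parses
    R("S", "NP", "VP"): 1.0, R("NP", "DT", "NN"): 0.5, R("NP", "NNS"): 0.3,
    R("NP", "NP", "PP"): 0.2, R("PP", "P", "NP"): 1.0,
    R("VP", "VP", "PP"): 0.6, R("VP", "VBD", "NP"): 0.4,
}
# Rules shared by both parses of the example sentence.
COMMON = [
    R("S", "NP", "VP"), R("NP", "DT", "NN"), R("DT", "the"), R("NN", "gunman"),
    R("VP", "VBD", "NP"), R("VBD", "sprayed"), R("NP", "DT", "NN"),
    R("DT", "the"), R("NN", "building"), R("PP", "P", "NP"), R("P", "with"),
    R("NP", "NNS"), R("NNS", "bullets"),
]
T1 = COMMON + [R("VP", "VP", "PP")]   # PP attached to the VP
T2 = COMMON + [R("NP", "NP", "PP")]   # PP attached to the NP

new_probs = em_step(PROBS, [[T1, T2]])
print(f"{new_probs[R('VP', 'VP', 'PP')]:.4f}")
print(f"{new_probs[R('NP', 'NP', 'PP')]:.4f}")
```

In this toy run the two parses receive posterior weights 0.75 and 0.25, so VP → VP PP gets an expected count of 0.75 against 1.0 for VP → VBD NP. A single sentence is far too little data for meaningful estimates; the point is only to show the expectation and maximization phases.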

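  50. Outside Probabilities $\alpha_j(p,q)$ • Base case: • Inductive step for calculating $\alpha_j(p,q)$: summation over f, g and e • (Figure: $N^1$ at the root dominates $N^f_{pe}$, which expands to $N^j_{pq}$ and $N^g_{(q+1)e}$; words $w_1 \ldots w_{p-1}$, $w_p \ldots w_q$, $w_{q+1} \ldots w_e$, $w_{e+1} \ldots w_m$)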
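The formulas for this slide did not survive in the transcript. What follows is the standard outside recursion for a grammar in Chomsky Normal Form, written with the slide's indices f, g and e; the symbols $\alpha$ and $\beta$ are assumed notation. The slide's figure shows the first term, where $N^j$ is the left child of $N^f$; the second term covers the symmetric case where $N^j$ is the right child.

```latex
% Base case: only the start symbol can span the whole sentence
\alpha_1(1,m) = 1, \qquad \alpha_j(1,m) = 0 \quad \text{for } j \neq 1

% Inductive step: N^j_{pq} as the left child, then as the right child, of some N^f
\alpha_j(p,q) = \sum_{f,g} \sum_{e=q+1}^{m}
      \alpha_f(p,e)\, P(N^f \to N^j N^g)\, \beta_g(q+1,e)
  \;+\; \sum_{f,g} \sum_{e=1}^{p-1}
      \alpha_f(e,q)\, P(N^f \to N^g N^j)\, \beta_g(e,p-1)
```

The recursion works top-down: the outside probability of a node is built from the outside probability of its parent and the inside probability of its sibling.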
