
CS 461 – Oct. 10


Presentation Transcript


  1. CS 461 – Oct. 10 • Review PL grammar as needed • How to tell if a word is in a CFL? • Convert to PDA and run it.  • CYK algorithm • Modern parsing techniques

  2. Accepting input • How can we tell if a given source file (input stream of tokens) is a valid program? Language defined by CFG, so … • Can see if there is some derivation from grammar? • Can convert CFG to PDA? • Exponential performance not acceptable (e.g. running time doubling every time we add a token). • Two improvements: • CYK algorithm, runs in O(n^3) time • Bottom-up parsing, generally linear, but with restrictions on the grammar.

  3. CYK algorithm • In 1965-67, discovered independently by Cocke, Younger, Kasami. • Given any CFG and any string, can tell if grammar generates string. • The grammar needs to be in Chomsky normal form (CNF) first. • This ensures that the rules are simple. Rules are of the form X → a or X → YZ • Consider all substrings of length 1 first. See which variables generate them. Next try all substrings of length 2, length 3, … up to length n.
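
The CNF requirement above can be checked mechanically. A minimal sketch, assuming a hypothetical encoding that is not from the slides: the grammar is a dict from variable to a list of right-hand sides, uppercase letters are variables, and every other character is a terminal.

```python
def is_cnf(rules):
    """Return True if every rule has the form X -> a or X -> YZ.

    Assumed encoding (not from the slides): uppercase letters are
    variables, any other character is a terminal.
    """
    for head, bodies in rules.items():
        for body in bodies:
            # X -> a: exactly one terminal on the right-hand side
            terminal_rule = len(body) == 1 and not body.isupper()
            # X -> YZ: exactly two variables on the right-hand side
            pair_rule = len(body) == 2 and all(sym.isupper() for sym in body)
            if not (terminal_rule or pair_rule):
                return False
    return True


# The 3+7+ grammar that the later slides use.
grammar = {"S": ["AB"], "A": ["3", "AC"], "B": ["7", "BD"], "C": ["3"], "D": ["7"]}
print(is_cnf(grammar))                # True
print(is_cnf({"S": ["aSb", "ab"]}))   # False: bodies are too long or mix in terminals
```

This sketch ignores the special case S → ε that some definitions of CNF allow, since the slides do not use it.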

  4. continued • Maintain results in an n × n table. Top right portion not used. • Example on right is for testing word of length 3. • Start at bottom; work your way up. • For length 1, just look for rules with a single terminal on the right-hand side, e.g. X → a.
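
The length-1 step can be sketched as follows. The dict encoding of the grammar and the name `length_one_row` are assumptions made for illustration, not part of the slides.

```python
def length_one_row(word, rules):
    """For each position of the word, collect every variable X with a rule X -> symbol."""
    row = []
    for symbol in word:
        row.append({head for head, bodies in rules.items() if symbol in bodies})
    return row


# Grammar for 3+7+ in CNF, as used in the worked example.
grammar = {"S": ["AB"], "A": ["3", "AC"], "B": ["7", "BD"], "C": ["3"], "D": ["7"]}
for position, variables in enumerate(length_one_row("337", grammar), start=1):
    print(position, sorted(variables))
# 1 ['A', 'C']
# 2 ['A', 'C']
# 3 ['B', 'D']
```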

  5. continued • For general case i..j • Think of all possible ways this string can be broken into 2 pieces. • Ex. 1..3 = 1..2 + 3..3 or 1..1 + 2..3 • We want to know which variables generate each piece. If B generates the first piece and C the second, a rule A → BC means A generates i..j. • Let's try an example: a grammar for 3⁺7⁺ (in CNF).
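
The ways of breaking i..j into two pieces can be enumerated directly. A small sketch; the helper name `splits` is hypothetical, and positions are 1-based and inclusive to match the slides.

```python
def splits(i, j):
    """All ways to cut positions i..j (1-based, inclusive) into two nonempty pieces."""
    return [((i, k), (k + 1, j)) for k in range(i, j)]


print(splits(1, 3))  # [((1, 1), (2, 3)), ((1, 2), (3, 3))]
print(len(splits(1, 4)))  # 3
```

A substring of length m has m - 1 split points. Trying every split for each of the roughly n²/2 substrings is where the cubic running time comes from.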

  6. 337 ∈ 3⁺7⁺ ? Grammar: S → AB; A → 3 | AC; B → 7 | BD; C → 3; D → 7. For each length-1 string, which variables generate it? 1..1 is 3: variables A and C. 2..2 is 3: variables A and C. 3..3 is 7: variables B and D.

  7. 337 ∈ 3⁺7⁺ ? Grammar: S → AB; A → 3 | AC; B → 7 | BD; C → 3; D → 7. Length 2: 1..2 = 1..1 + 2..2 = (A or C)(A or C) = A, by rule A → AC. 2..3 = 2..2 + 3..3 = (A or C)(B or D) = S, by rule S → AB.

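  8. 337 ∈ 3⁺7⁺ ? Grammar: S → AB; A → 3 | AC; B → 7 | BD; C → 3; D → 7. Length 3: 2 cases for 1..3. Case 1..2 + 3..3: (A)(B or D) = S, by rule S → AB. Case 1..1 + 2..3: (A or C)(S): no rule has AS or CS on the right, so no! We only need one case to work, so 337 is accepted.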
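
The whole procedure on this example can be sketched in a few lines. This is a minimal illustration under an assumed dict encoding of the grammar, not the course's reference implementation; the function name `cyk` and the table layout (a dict keyed by 1-based `(i, j)` pairs) are choices made here.

```python
def cyk(word, rules, start="S"):
    """Return True if the CNF grammar `rules` generates `word`."""
    n = len(word)
    if n == 0:
        return False  # this sketch assumes no rule produces the empty string
    # table[(i, j)] = set of variables generating positions i..j (1-based, inclusive)
    table = {}
    for i, symbol in enumerate(word, start=1):
        table[(i, i)] = {head for head, bodies in rules.items() if symbol in bodies}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):  # split into i..k and k+1..j
                for head, bodies in rules.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in table[(i, k)]
                                and body[1] in table[(k + 1, j)]):
                            cell.add(head)
            table[(i, j)] = cell
    return start in table[(1, n)]


grammar = {"S": ["AB"], "A": ["3", "AC"], "B": ["7", "BD"], "C": ["3"], "D": ["7"]}
print(cyk("337", grammar))  # True
print(cyk("73", grammar))   # False
```

The three nested loops over length, start position, and split point give the O(n^3) bound for a fixed grammar.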

  9. CYK example #2 Let's test the word baab. Grammar: S → AB | BC; A → BA | a; B → CC | b; C → AB | a. Length 1: 'a' generated by A, C. 'b' generated by B.

  10. baab. Grammar: S → AB | BC; A → BA | a; B → CC | b; C → AB | a. Length 2: 1..2 = 1..1 + 2..2 = (B)(A,C) = S,A. 2..3 = 2..2 + 3..3 = (A,C)(A,C) = B. 3..4 = 3..3 + 4..4 = (A,C)(B) = S,C.

  11. baab. Grammar: S → AB | BC; A → BA | a; B → CC | b; C → AB | a. Length 3: [ each has 2 chances! ] 1..3 = 1..2 + 3..3 = (S,A)(A,C) = Ø. 1..3 = 1..1 + 2..3 = (B)(B) = Ø. 2..4 = 2..3 + 4..4 = (B)(B) = Ø. 2..4 = 2..2 + 3..4 = (A,C)(S,C) = B.

  12. Finally… Grammar: S → AB | BC; A → BA | a; B → CC | b; C → AB | a. Length 4 [has 3 chances!] 1..4 = 1..3 + 4..4 = (Ø)(B) = Ø. 1..4 = 1..2 + 3..4 = (S,A)(S,C) = Ø. 1..4 = 1..1 + 2..4 = (B)(B) = Ø. Ø for the whole word means we lose! baab ∉ L. However, in general don't give up if you encounter Ø in the middle of the process: only the final cell 1..n decides.
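
To see the point about Ø in the middle of the process, the sketch below builds the full table for this grammar. The function name `cyk_table` and the dict encoding are assumptions for illustration. The second word, baaba, is not from the slides; it is added here because it has empty cells along the way and is still accepted.

```python
def cyk_table(word, rules):
    """Build the CYK table: (i, j) -> variables generating positions i..j (1-based)."""
    n = len(word)
    table = {}
    for i, symbol in enumerate(word, start=1):
        table[(i, i)] = {head for head, bodies in rules.items() if symbol in bodies}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            table[(i, j)] = {
                head
                for k in range(i, j)  # every split i..k + k+1..j
                for head, bodies in rules.items()
                for body in bodies
                if len(body) == 2
                and body[0] in table[(i, k)]
                and body[1] in table[(k + 1, j)]
            }
    return table


grammar = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}

t = cyk_table("baab", grammar)
print(sorted(t[(2, 4)]))  # ['B']
print(sorted(t[(1, 4)]))  # []  so baab is rejected

t = cyk_table("baaba", grammar)
print(sorted(t[(1, 3)]))  # []  an empty cell in the middle of the process
print("S" in t[(1, 5)])   # True, so baaba is accepted anyway
```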
