
CSC 3130: Automata theory and formal languages Tutorial 4

CSC 3130: Automata theory and formal languages Tutorial 4. KN Hung Office: SHB 1026. Department of Computer Science & Engineering. Agenda. Context Free Grammar (CFG) Design Parse Tree Cocke-Younger-Kasami (CYK) algorithm Parsing CFG in normal form Pushdown Automata (PDA) Design.


Presentation Transcript


  1. CSC 3130: Automata theory and formal languages, Tutorial 4 • KN Hung • Office: SHB 1026 • Department of Computer Science & Engineering

  2. Agenda • Context Free Grammar (CFG): Design, Parse Tree • Cocke-Younger-Kasami (CYK) algorithm: Parsing CFG in normal form • Pushdown Automata (PDA): Design

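  3. Context-Free Grammar (Recap) • A context-free grammar consists of: 1) Variables 2) Terminals 3) Production rules 4) A start variable • Example: S → AB | ba, A → aA | a, B → b • Here S, A, B are the variables, a and b are the terminals, each line is a production rule, and S is the start variable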

  4. Context-Free Grammar (Recap) • A string is said to belong to the language (of the CFG) if it can be derived from the start variable • Each derivation step (⇒) applies one production rule • CFG Example: S → AB | ba, A → aA | a, B → b • Derivation: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab • Therefore, aab belongs to the language
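
The derivation on this slide can be replayed mechanically. The following is a minimal Python sketch, not part of the tutorial: the dict layout and the helper name `derive` are illustrative assumptions. Each step rewrites the leftmost occurrence of the named variable using one production.

```python
# Grammar from the slide; storing it as a dict is an illustrative assumption.
GRAMMAR = {
    "S": ["AB", "ba"],
    "A": ["aA", "a"],
    "B": ["b"],
}

def derive(start, steps, grammar=GRAMMAR):
    """Apply (variable, right-hand side) steps in order and
    return every sentential form, beginning with `start`."""
    form = start
    forms = [form]
    for variable, rhs in steps:
        if rhs not in grammar.get(variable, []):
            raise ValueError(f"{variable} -> {rhs} is not a production")
        if variable not in form:
            raise ValueError(f"{variable} does not occur in {form}")
        # Rewrite the leftmost occurrence of the variable.
        form = form.replace(variable, rhs, 1)
        forms.append(form)
    return forms

forms = derive("S", [("S", "AB"), ("A", "aA"), ("A", "a"), ("B", "b")])
print(" => ".join(forms))  # S => AB => aAB => aaB => aab
```

A step that does not correspond to a production raises an error, so a successful run is a valid derivation of the final string.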

  5. Why CFG? • L = {0^n 1^n : n is a positive integer} • L is not a regular language • Proved by the “Pumping Lemma” • A context-free grammar can describe it: S → 0S1 | 01 • Thus, CFGs are more general than regular expressions • NFA ⇔ Regular Expression ⇔ DFA
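
To see how the two productions S → 0S1 and S → 01 produce exactly the strings 0^n 1^n, here is a small Python sketch. The function names `generate` and `in_language` are hypothetical helpers, not from the tutorial.

```python
def generate(n):
    """Derive 0^n 1^n (n >= 1): use S -> 0S1 (n - 1) times, then S -> 01."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "0S1")
    return form.replace("S", "01")

def in_language(w):
    """Undo the productions: w is derivable iff it is 01,
    or it is 0u1 where u is itself derivable."""
    if w == "01":
        return True
    return len(w) >= 4 and w[0] == "0" and w[-1] == "1" and in_language(w[1:-1])

print(generate(3))            # 000111
print(in_language("000111"))  # True
print(in_language("0101"))    # False
```

The recursion in `in_language` mirrors the nesting of the grammar, which is the kind of matching a finite automaton cannot keep track of.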

  6. CFG Design • Given a context-free language, design the CFG • L = { w : w is an ab-string with fewer a’s than b’s } • Take a minute to think about it • S → ? …

  7. CFG Design (Con’t) • Trial: Bottom-up • Shortest string in L : “b” • Given a string in L, we can expand it, s.t. it is still in L • i.e., Add terminals, while not violating the constraints

  8. CFG Design (Con’t) One Wrong Trial: • S → b • S → bS | Sb (after adding 1 “b”, the number of “b” is still greater than that of “a”) • S → abS | baS | bSa | aSb (adding 1 “a” and 1 “b” keeps the difference between the numbers of “a” and “b” constant) • However, this grammar cannot derive strings like “aabbbbbaa”

  9. CFG Design (Con’t) Approach 1: • S → b (base case) • S → SS (#b is still > #a) • S → SaS | aSS | SSa (1st S: #b ≥ #a + 1; 2nd S: #b ≥ #a + 1; the added a: #a = 1; so in total #b ≥ #a + 2 − 1) • But, is it sufficient to say the grammar is correct?
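
One way to gain confidence in a candidate grammar before attempting a proof is to compare it with the language on all short strings. The sketch below is an illustration only: `make_recognizer` and `mismatches` are hypothetical helpers, and the brute-force membership test assumes a grammar with no ε-productions and no unit productions. A bounded check like this is evidence, not a proof that L(G) = L.

```python
from functools import lru_cache
from itertools import product

def make_recognizer(rules, start="S"):
    """Return a membership test for the grammar `rules`.

    Assumes no ε-productions and no unit productions, so every symbol
    derives at least one character and the recursion terminates.
    """
    @lru_cache(maxsize=None)
    def derives(symbols, w):
        # Can the symbol sequence `symbols` derive exactly the string w?
        if not symbols:
            return w == ""
        head, rest = symbols[0], symbols[1:]
        if head not in rules:  # terminal: must match the next character
            return w.startswith(head) and derives(rest, w[1:])
        # Try every prefix for `head`, leaving one character per remaining symbol.
        for cut in range(1, len(w) - len(rest) + 1):
            if derives(rest, w[cut:]) and any(
                derives(rhs, w[:cut]) for rhs in rules[head]
            ):
                return True
        return False

    return lambda w: derives(start, w)

def mismatches(rules, max_len):
    """ab-strings of length 1..max_len on which the grammar disagrees with L."""
    accepts = make_recognizer(rules)
    bad = []
    for n in range(1, max_len + 1):
        for chars in product("ab", repeat=n):
            w = "".join(chars)
            in_L = w.count("a") < w.count("b")
            if accepts(w) != in_L:
                bad.append(w)
    return bad

WRONG_TRIAL = {"S": ["b", "bS", "Sb", "abS", "baS", "bSa", "aSb"]}
APPROACH_1 = {"S": ["b", "SS", "SaS", "aSS", "SSa"]}

print("aabbbbbaa" in mismatches(WRONG_TRIAL, 9))  # True
print(mismatches(APPROACH_1, 9))                   # []
```

The wrong trial misses “aabbbbbaa” because every production that starts with “a” forces either a “b” in second position or a “b” at the end.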

  10. CFG Design (Con’t) Approach 2: • Start with the grammar for ab-strings with the same number of a’s and b’s • Call the start symbol of this grammar E • Now, we generate all strings of type EbE | EbEbE | EbEbEbE | … • Thus, we have the grammar…

  11. CFG Design (Con’t) Approach 2 (Con’t): • S → EbET • T → bET | ε • E → … • S and T generate the pattern EbE | EbEbE | … • E generates ab-strings with the same number of a’s and b’s (c.f. “09L7.pdf” – Slide #32)

  12. CFG Design (Con’t) • After designing the grammar, G, you may have to prove (if required) that the language of this grammar is equal to the given language • i.e., Prove that L(G) = L • Proof: Part 1) L(G) ⊆ L, Part 2) L ⊆ L(G) • Due to the time limit, I will not do this part

  13. Parse Tree • How to parse “aab” in this grammar? (Previous example) • CFG Example: S → AB | ba, A → aA | a, B → b • Derivation: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab

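  14. Parse Tree (Con’t) • Idea: Production Rule = Node + Children • Should be very intuitive to understand • Derivation: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab • Parse tree: the root S has children A and B; A has children a and A; the inner A has the child a; B has the child b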

  15. Parse Tree (Con’t) • Ambiguity • CFG: S → S - S, S → 1 | 2 | 3 • String: 3 - 1 - 2 • The string has two parse trees: one groups it as (3 - 1) - 2, the other as 3 - (1 - 2)
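
The ambiguity can be made concrete by enumerating every parse tree of the string. This is a minimal Python sketch under the assumption that the input is already split into tokens; `parses` and `evaluate` are hypothetical helper names.

```python
DIGITS = {"1", "2", "3"}

def parses(tokens):
    """Return all parse trees of `tokens` under S -> S - S | 1 | 2 | 3.
    A tree is either a digit string or a tuple (left, "-", right)."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] in DIGITS else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok == "-":  # use this minus sign as the top-level S -> S - S
            for left in parses(tokens[:i]):
                for right in parses(tokens[i + 1:]):
                    trees.append((left, "-", right))
    return trees

def evaluate(tree):
    """Compute the value that a parse tree assigns to the expression."""
    if isinstance(tree, str):
        return int(tree)
    left, _, right = tree
    return evaluate(left) - evaluate(right)

trees = parses(["3", "-", "1", "-", "2"])
for t in trees:
    print(t, "=", evaluate(t))
# ('3', '-', ('1', '-', '2')) = 4
# (('3', '-', '1'), '-', '2') = 0
```

The two trees give different values, which is why ambiguity matters once a parse tree is used to assign meaning to a string.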

  16. Parse Tree (Con’t) • Useful in programming language • CSC3180 • Useful in compiler • CSC3120

  17. Cocke-Younger-Kasami Algorithm • Used to parse a context-free grammar in Chomsky normal form (or simply normal form) • Normal Form: every production is of type X → YZ, X → a, or S → ε • Example: S → AB | BC, A → BA | a, B → CC | b, C → AB | a

  18. CYK Algorithm - Idea • = Algorithm 2 in Lecture Note (09L8.pdf) • Idea: Bottom Up Parsing • Algorithm: Given a string s of length N: for k = 1 to N, for every substring of length k, determine what variable(s) can derive it • sub(x,y): the substring that starts at index x and ends at index y
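
The loop described on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the code from the lecture note: the grammar is the normal-form example from the previous slide stored as a dict, the function name `cyk_table` is hypothetical, and cells are keyed by (x, y) to match sub(x,y) with 1-based indices.

```python
# Example grammar in normal form; the dict layout is an assumption.
GRAMMAR = {
    "S": ["AB", "BC"],
    "A": ["BA", "a"],
    "B": ["CC", "b"],
    "C": ["AB", "a"],
}

def cyk_table(word, grammar):
    """Map (x, y) to the set of variables that derive sub(x, y)."""
    n = len(word)
    table = {}
    # k = 1: a variable derives a single character if it has that terminal rule.
    for x in range(1, n + 1):
        ch = word[x - 1]
        table[(x, x)] = {v for v, bodies in grammar.items() if ch in bodies}
    # k > 1: split sub(x, y) into sub(x, m) + sub(m + 1, y) in every possible way.
    for k in range(2, n + 1):
        for x in range(1, n - k + 2):
            y = x + k - 1
            cell = set()
            for m in range(x, y):
                for v, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in table[(x, m)]
                                and body[1] in table[(m + 1, y)]):
                            cell.add(v)
            table[(x, y)] = cell
    return table

table = cyk_table("baaba", GRAMMAR)
print(sorted(table[(1, 2)]))  # ['A', 'S']
print("S" in table[(1, 5)])   # True
```

The string is in the language exactly when the start variable appears in the cell for the whole string, here `table[(1, 5)]`.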

  19. CYK Algorithm - Init • Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a • We want to parse the string b a a b a • Base Case: k = 1 • The possible choices of variable(s) can be found by scanning through each production • Length 1 row: B | A,C | A,C | B | A,C

  20. CYK Algorithm – Table • Each cell: variables deriving the substring • Rows: length of substring; columns: start index of substring • Example: the cell for a substring of length 3 starting at index 2 covers “aab” = sub(2,4) • String: b a a b a • Length 1 row: B | A,C | A,C | B | A,C

  21. CYK Algorithm – Loop (k>1) • Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a • When k = 2 • Example: sub(1,2) = “ba” • “ba” = “b” + “a” = sub(1,1) + sub(2,2) • Possible: BA | BC • ⇒ Variables A, S, since A → BA and S → BC • Table so far (string b a a b a): length 2, start index 1: S,A • length 1: B | A,C | A,C | B | A,C

  22. CYK Algorithm – Loop (k>1) • Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a • For each substring, decompose it into two substrings • Example: sub(2,4) = “aab” = sub(2,2) + sub(3,4) = sub(2,3) + sub(4,4) • Possible: AS, AC, CS, CC, BB • Only CC is the right-hand side of a production (B → CC); therefore, B is put into the cell • Table so far (string b a a b a): length 2: S,A | B | S,C | S,A • length 1: B | A,C | A,C | B | A,C

  23. CYK Algorithm – Loop (k>1) • Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a • How about sub(3,5)? • Give you 1 min • Table so far (string b a a b a): length 2: S,A | B | S,C | S,A • length 1: B | A,C | A,C | B | A,C

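  24. CYK Algorithm – Parse Tree • Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a • Parse Tree is known from the table • See “09L8.pdf” - Slide #21 • Completed table for the string b a a b a (rows: length of substring; columns: start index of substring; “-” marks an empty cell): length 5: S,A,C • length 4: - | S,A,C • length 3: - | B | B • length 2: S,A | B | S,C | S,A • length 1: B | A,C | A,C | B | A,C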
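
One common way to read a parse tree off the table is to remember, for each variable in each cell, one split and pair of variables that produced it. The sketch below illustrates that idea and is not the construction from “09L8.pdf”: the names `cyk_with_backpointers`, `build_tree`, and `leaves` are hypothetical, and when several splits work it simply keeps the first one found.

```python
GRAMMAR = {
    "S": ["AB", "BC"],
    "A": ["BA", "a"],
    "B": ["CC", "b"],
    "C": ["AB", "a"],
}

def cyk_with_backpointers(word, grammar):
    """back[(x, y)][V] records one way V derives sub(x, y): the terminal
    itself for length 1, otherwise (split m, left variable, right variable)."""
    n = len(word)
    back = {}
    for x in range(1, n + 1):
        ch = word[x - 1]
        back[(x, x)] = {v: ch for v, bodies in grammar.items() if ch in bodies}
    for k in range(2, n + 1):
        for x in range(1, n - k + 2):
            y = x + k - 1
            cell = {}
            for m in range(x, y):
                for v, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in back[(x, m)]
                                and body[1] in back[(m + 1, y)]):
                            # Keep only the first derivation found for v.
                            cell.setdefault(v, (m, body[0], body[1]))
            back[(x, y)] = cell
    return back

def build_tree(back, var, x, y):
    """Follow the backpointers to get a tree: (var, terminal) or (var, left, right)."""
    entry = back[(x, y)][var]
    if x == y:
        return (var, entry)
    m, left, right = entry
    return (var, build_tree(back, left, x, m), build_tree(back, right, m + 1, y))

def leaves(tree):
    """Concatenate the terminals at the leaves, left to right."""
    if isinstance(tree[1], str):
        return tree[1]
    return leaves(tree[1]) + leaves(tree[2])

back = cyk_with_backpointers("baaba", GRAMMAR)
tree = build_tree(back, "S", 1, 5)
print(tree)
print(leaves(tree))  # baaba
```

Reading the leaves from left to right gives back the input string, which is a quick sanity check on the reconstructed tree.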

  25. CYK Algorithm (Conclusion) • Start from shortest substring to the longest • i.e., from single-character-string to the whole string • For Context-free grammar, G 1) Convert G into normal form • Remove ε-productions • Remove unit-productions 2) Apply CYK algorithm • Con: Loss in intuition

  26. End • Thanks for coming! =] • Any questions?
