CSCI 3130: Formal languages and automata theory. Tutorial 4. Chin
Reminder • Homework 3 is due next Thursday. • You can collect Homework 1 after this tutorial.
Context Free Grammar
• Productions: A → 0A1, A → B, B → #
• Start variable: A
• Variables: A, B
• Terminals: 0, 1, #
• Example derivation: A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111
• derivation = a sequence of productions that results in a string
Context Free Grammar • Regular languages are context free • Context free languages are NOT necessarily regular • A CFG describes the recursive structure of a language
Context Free Grammar
• Design a CFG for each of the following languages.
• Σ = {a, b}
• L1 = {wyw^R : y, w ∈ Σ*}
• L2 = {a^i b^j : i < j}
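The derivation on this slide can be replayed mechanically. The following is a minimal Python sketch, assuming the productions A → 0A1, A → B and B → # from the slide; the function name `derive` and the encoding of sentential forms as strings are illustrative choices, not part of the tutorial.

```python
# Grammar from the slide: A -> 0A1 | B, B -> #
# Uppercase letters are variables; every other character is a terminal.
def derive(n):
    """Return the derivation of 0^n # 1^n as a list of sentential forms."""
    current = "A"
    forms = [current]
    for _ in range(n):
        current = current.replace("A", "0A1", 1)  # apply A -> 0A1
        forms.append(current)
    current = current.replace("A", "B", 1)        # apply A -> B
    forms.append(current)
    current = current.replace("B", "#", 1)        # apply B -> #
    forms.append(current)
    return forms

print(" => ".join(derive(3)))
```

Calling `derive(3)` reproduces the six sentential forms listed on the slide.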
Context Free Grammar
• L1 = {wyw^R : y, w ∈ Σ*}
• Consider a simpler language L1' = {ww^R : w ∈ Σ*}
• e.g. z = aabaabaa
• The first character and the last character are the same:
  z = aabaabaa = a x1 a
  x1 = abaaba = a x2 a
  x2 = baab = b x3 b
  x3 = aa = a x4 a, where x4 = ε
• Recursive rules: S → aSa | bSb. How to stop? We need a base case.
• S → aSa | bSb | ε
Context Free Grammar
• L1 = {wyw^R : y, w ∈ Σ*}
• The simpler language L1' = {ww^R : w ∈ Σ*} has the CFG S → aSa | bSb | ε
• How to turn it into a CFG for L1? e.g. aabaaababaa
• Change the base case ε in S → aSa | bSb | ε
• How to write a CFG for {y : y ∈ Σ*}? T → aT | bT | ε
• Replace ε by T:
  S → aSa | bSb | T
  T → aT | bT | ε
Context Free Grammar
• L2 = {a^i b^j : i < j}
• Consider a simpler language L2' = {a^i b^j : i = j}
• e.g. z = aaaabbbb
• The first character must be a and the last character must be b:
  z = aaaabbbb = a x1 b
  x1 = aaabbb = a x2 b
  x2 = aabb = a x3 b
  x3 = ab = a x4 b, where x4 = ε
• Recursive rule: S → aSb. Base case again:
• S → aSb | ε
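The peeling argument for L1' (match the first and last characters, then recurse on the middle) translates directly into a recursive membership test. This is a minimal sketch, assuming the grammar S → aSa | bSb | ε; the function name `in_l1_prime` is hypothetical.

```python
def in_l1_prime(z):
    """Check membership in {w w^R : w in {a, b}*} by following S -> aSa | bSb | ε."""
    if z == "":
        return True                      # base case: S -> ε
    if len(z) >= 2 and z[0] == z[-1] and z[0] in "ab":
        return in_l1_prime(z[1:-1])      # recursive case: S -> aSa or S -> bSb
    return False

print(in_l1_prime("aabaabaa"))  # the example z from the slide
print(in_l1_prime("aab"))
```

Each recursive call corresponds to one of the steps z = a x1 a, x1 = a x2 a, and so on, until the empty string is reached.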
Context Free Grammar
• L2 = {a^i b^j : i < j}
• The simpler language L2' = {a^i b^j : i = j} has the CFG S → aSb | ε
• How to turn it into a CFG for L2? We need at least one more b in bb…b, e.g. aaaabbbbbb
• Change the base case ε in S → aSb | ε: insert b's in front of bb…b
• How to write a CFG for {b^i : i > 0}? T → bT | b
• Replace ε by T:
  S → aSb | T
  T → bT | b
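One way to gain confidence in the grammar S → aSb | T, T → bT | b is to enumerate the short strings it generates and compare them with the definition of L2. The sketch below does this by breadth-first expansion of the leftmost variable; the names `RULES`, `generate` and `in_l2` are illustrative assumptions.

```python
from collections import deque

RULES = {"S": ["aSb", "T"], "T": ["bT", "b"]}

def generate(max_len):
    """Return all terminal strings of length <= max_len derivable from S."""
    seen, results = {"S"}, set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is all terminals.
        pos = next((i for i, c in enumerate(form) if c in RULES), None)
        if pos is None:
            results.add(form)
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # No rule shortens a form, so forms that are too long can be pruned.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=lambda s: (len(s), s))

def in_l2(s):
    """Direct test of the definition: s = a^i b^j with i < j."""
    i = len(s) - len(s.lstrip("a"))
    j = len(s) - i
    return s == "a" * i + "b" * j and i < j

print(generate(4))
print(all(in_l2(s) for s in generate(8)))
```

The pruning step relies on the fact that no production in this grammar has ε on its right-hand side.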
Parse Trees
• Representation of derivations
• e.g. S → AB | ba, A → aA | a, B → b
• Derivation of aab: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab
• Parse tree of aab: S(A(a, A(a)), B(b))
Ambiguity • A CFG is ambiguous if some string has two different parse trees. • Removing ambiguity is impossible for some CFGs.
Ambiguity
• Show that the following CFG is ambiguous.
  E → E * E | E / E | (A) | N
  A → A + A | N
  N → 6 | 1 | 2
• Find a string generated by the CFG and show that it has two different parse trees.
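Ambiguity
  E → E * E | E / E | (A) | N
  A → A + A | N
  N → 6 | 1 | 2
• 6 / 2 * (1 + 2) has two parse trees:
  [Figure: parse tree 1 has E → E * E at the root, with 6 / 2 on the left and (1 + 2) on the right; parse tree 2 has E → E / E at the root, with 6 on the left and 2 * (1 + 2) on the right]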
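To see why the two parse trees matter, the sketch below encodes both as nested tuples, checks that they have the same yield, and evaluates them. The tuple encoding, the names `TREE_1`, `TREE_2`, `yield_of` and `evaluate`, and the arithmetic reading of the operators are assumptions made for illustration; the slide itself only asks for the two trees.

```python
from fractions import Fraction

# A tree is an integer leaf or a tuple (operator, left, right).
# A "+" node stands for the parenthesised part E -> (A), A -> A + A.
TREE_1 = ("*", ("/", 6, 2), ("+", 1, 2))   # root E -> E * E
TREE_2 = ("/", 6, ("*", 2, ("+", 1, 2)))   # root E -> E / E

def yield_of(tree):
    """Read the leaves of the tree from left to right."""
    if isinstance(tree, int):
        return str(tree)
    op, left, right = tree
    text = f"{yield_of(left)} {op} {yield_of(right)}"
    return f"({text})" if op == "+" else text

def evaluate(tree):
    """Evaluate the tree with exact rational arithmetic."""
    if isinstance(tree, int):
        return Fraction(tree)
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    return a / b

print(yield_of(TREE_1), "=", evaluate(TREE_1))
print(yield_of(TREE_2), "=", evaluate(TREE_2))
```

Both trees yield the same string, yet under this reading they evaluate to different values, which is one practical reason to care about ambiguity.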
Parsing - Preprocessing • 1. Eliminate ε-productions • 2. Eliminate unit productions • Do the steps in order (1 then 2)
Parsing - Preprocessing
• Eliminate ε-productions
  i. Identify nullable variables (N)
  Let Q be the queue containing the nullable variables.
  Repeat the following:
    If X → ε, push X into Q
    If X → YZ…W and Y, Z, …, W are all in Q, push X into Q
  If the start variable S is nullable, add S' → S | ε
Parsing - Preprocessing
• Eliminate ε-productions
  i. Identify nullable variables (N)
  e.g. S → XY, X → ε, Y → ε
  Q = (X, Y, S) but not (S, X, Y)
  S is nullable, so the grammar becomes S' → S | ε, S → XY, X → ε, Y → ε
• Repeat the following:
    If X → ε, push X into Q
    If X → YZ…W and Y, Z, …, W are all in Q, push X into Q
Parsing - Preprocessing
• Eliminate ε-productions
  ii. Remove nullable variables
  Repeat the following until Q is empty:
    Let N be the first element in Q
    a) For each X → αNβ, add X → αβ
    b) Remove all N → ε
    Remove N from Q
  Caution: S → NTN becomes S → NTN | NT | TN | T
Parsing - Preprocessing
• Eliminate ε-productions
  ii. Remove nullable variables
  e.g. Q = (Y, X)
  S → XY
  X → XY | YZY | Y
  Y → ε
  After one step (N = Y), Q = (X)
  S → XY | X
  X → XY | X | YZY | YZ | ZY | Z | Y | ε
  Y → ε is removed
Parsing - Preprocessing
• Eliminate ε-productions for the following CFG
  S → ASA | aB
  A → B | S
  B → b | ε
Parsing - Preprocessing
  S → ASA | aB
  A → B | S
  B → b | ε
• B → ε, so Q = (B)
• A → B and B is in Q, so Q = (B, A)
• i. Identify nullable variables (N)
  Let Q be the queue containing the nullable variables.
  Repeat the following:
    If X → ε, push X into Q
    If X → YZ…W and Y, Z, …, W are all in Q, push X into Q
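The two rules for filling Q can be sketched as a simple fixed-point loop. The grammar encoding (a dict from variable to a list of right-hand sides, with the empty string standing for ε) and the name `nullable_variables` are assumptions made for this example.

```python
def nullable_variables(grammar):
    """Return the nullable variables in the order they are discovered."""
    queue = []
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in queue:
                continue
            # "" encodes ε; all() over an empty body is True, so the rule
            # "X -> ε" is covered by the same test as "X -> YZ...W".
            if any(all(sym in queue for sym in body) for body in bodies):
                queue.append(var)
                changed = True
    return queue

GRAMMAR = {"S": ["ASA", "aB"], "A": ["B", "S"], "B": ["b", ""]}
print(nullable_variables(GRAMMAR))
```

For the grammar on this slide the loop discovers B first and then A, matching Q = (B, A); S is never added because aB contains a terminal and ASA contains S itself.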
Parsing - Preprocessing
  Original grammar: S → ASA | aB, A → B | S, B → b | ε
  Q = (B, A), N = B
  After processing B:
  S → ASA | aB | a
  A → B | S | ε
  B → b
  ii. Remove nullable variables
  Repeat the following until Q is empty:
    Let N be the first element in Q
    a) For each X → αNβ, add X → αβ
    b) Remove all N → ε
    Remove N from Q
Parsing - Preprocessing
  Original grammar: S → ASA | aB, A → B | S, B → b | ε
  Q = (A), N = A
  After processing A:
  S → ASA | aB | a | SA | AS | S
  A → B | S
  B → b
  ii. Remove nullable variables
  Repeat the following until Q is empty:
    Let N be the first element in Q
    a) For each X → αNβ, add X → αβ
    b) Remove all N → ε
    Remove N from Q
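The removal step can be sketched as follows, processing the queue Q in order. The grammar encoding (strings of single-character symbols, "" for ε) and the names `drop_occurrences` and `eliminate_epsilon` are assumptions for illustration. The helper handles the "Caution" case by deleting every subset of the occurrences of N.

```python
from itertools import product

def drop_occurrences(body, var):
    """All bodies obtained from `body` by deleting any subset of the `var` symbols."""
    choices = [(sym, "") if sym == var else (sym,) for sym in body]
    return {"".join(pick) for pick in product(*choices)}

def eliminate_epsilon(grammar, queue):
    """Process the nullable variables in `queue` one at a time."""
    rules = {v: set(bodies) for v, bodies in grammar.items()}
    done = set()
    for n in queue:
        # a) For each X -> αNβ, add X -> αβ (for every subset of N's).
        for var in rules:
            for body in list(rules[var]):
                rules[var] |= drop_occurrences(body, n)
        # b) Remove N -> ε, and do not let ε come back for processed variables.
        done.add(n)
        for v in done:
            rules[v].discard("")
    return rules

GRAMMAR = {"S": ["ASA", "aB"], "A": ["B", "S"], "B": ["b", ""]}
result = eliminate_epsilon(GRAMMAR, ["B", "A"])
for var, bodies in result.items():
    print(var, "->", " | ".join(sorted(bodies)))
```

This sketch does not add the new start variable S', which is only needed when S itself is nullable.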
Parsing - Preprocessing
• Eliminate unit productions
  i. If there is a cycle of unit productions A → B → ... → C → A, delete it and replace everything with A
  e.g. S → T | X | Y, T → U | X | Y, U → S | Y | a
  The cycle S → T → U → S is deleted and T, U are replaced with S, giving S → X | Y | a
Parsing - Preprocessing
• Eliminate unit productions
  ii. Replace every chain A → B → ... → C → α by A → α, B → α, ..., C → α
  e.g. S → T | XY | YZ, T → U | BY, U → AX | a
  becomes S → AX | a | XY | YZ, T → AX | a | BY, U → AX | a
Parsing - Preprocessing
  S → ASA | aB | a | SA | AS | S
  A → B | S
  B → b
  i. If there is a cycle of unit productions A → B → ... → C → A, delete it and replace everything with A
  Here S → S is a cycle, so it is deleted:
  S → ASA | aB | a | SA | AS
  A → B | S
  B → b
Parsing - Preprocessing
  S → ASA | aB | a | SA | AS
  A → B | S
  B → b
  ii. Replace every chain A → B → ... → C → α by A → α, B → α, ..., C → α
  The chains A → B → b and A → S → ASA | aB | a | SA | AS are replaced:
  S → ASA | aB | a | SA | AS
  A → b | ASA | aB | a | SA | AS
  B → b
Parsing - Preprocessing
• Eliminate ε-productions and unit productions for the following CFG
  S → ASA | aB
  A → B | S
  B → b | ε
• Ans:
  S → ASA | aB | a | SA | AS
  A → b | ASA | aB | a | SA | AS
  B → b
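Unit productions can also be removed in one pass by collecting, for each variable, every variable reachable through unit productions and copying over the non-unit bodies. This is a common alternative to the cycle-then-chain procedure on the slides, not the tutorial's own method; the name `eliminate_units` and the grammar encoding are assumptions.

```python
def eliminate_units(grammar):
    """Return an equivalent grammar without unit productions X -> Y."""
    variables = set(grammar)
    result = {}
    for start in grammar:
        # Collect every variable reachable from `start` via unit productions.
        reachable, stack = {start}, [start]
        while stack:
            for body in grammar[stack.pop()]:
                if body in variables and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # Keep only the non-unit bodies of the reachable variables.
        result[start] = {
            body
            for var in reachable
            for body in grammar[var]
            if body not in variables
        }
    return result

# Grammar after ε-elimination, as on the earlier slide.
GRAMMAR = {"S": ["ASA", "aB", "a", "SA", "AS", "S"], "A": ["B", "S"], "B": ["b"]}
for var, bodies in eliminate_units(GRAMMAR).items():
    print(var, "->", " | ".join(sorted(bodies)))
```

Unlike step i on the slides, this variant does not merge the variables of a cycle into one; it keeps them separate but gives each the same non-unit bodies.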
CYK Algorithm • 1. Eliminate ε-productions • 2. Eliminate unit productions • 3. CYK Algorithm: (1) more preprocessing to Chomsky normal form, (2) dynamic programming
CYK Algorithm - Preprocessing
• Chomsky normal form: every production has the form A → BC or A → a
• Allow S → ε for the start variable
• e.g. A → BcDE
  becomes A → BCDE, C → c
  becomes A → BX, X → CDE, C → c
  becomes A → BX, X → CY, Y → DE, C → c
Parsing - Preprocessing
• Convert the CFG to Chomsky normal form
  S → ASA | aB | a | SA | AS
  A → b | ASA | aB | a | SA | AS
  B → b
Parsing - Preprocessing
  Before:
  S → ASA | aB | a | SA | AS
  A → b | ASA | aB | a | SA | AS
  B → b
  After converting the productions of S:
  S → AX | UB | a | SA | AS
  X → SA
  U → a
  A → b | ASA | aB | a | SA | AS
  B → b
Parsing - Preprocessing
  Before:
  S → AX | UB | a | SA | AS
  X → SA
  U → a
  A → b | ASA | aB | a | SA | AS
  B → b
  After converting the productions of A:
  S → AX | UB | a | SA | AS
  A → b | AX | UB | a | SA | AS
  X → SA
  U → a
  B → b
Parsing - Preprocessing
• Convert the CFG to Chomsky normal form
  S → ASA | aB | a | SA | AS
  A → b | ASA | aB | a | SA | AS
  B → b
• Ans:
  S → AX | UB | a | SA | AS
  A → b | AX | UB | a | SA | AS
  X → SA
  U → a
  B → b
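The two conversion steps (replace terminals inside long bodies, then split long bodies) can be sketched as below. The sketch assumes single-character symbols, uppercase variables, and a grammar already free of ε- and unit productions; the name `to_cnf` is hypothetical, and the helper variables it invents (taken from the end of the alphabet) will differ from the X and U used on the slides.

```python
import string

def to_cnf(grammar):
    """Convert a grammar without ε- and unit productions to Chomsky normal form."""
    used = set(grammar) | {s for bodies in grammar.values() for b in bodies for s in b}
    unused = [c for c in string.ascii_uppercase if c not in used]
    rules = {v: set() for v in grammar}
    helper = {}  # body -> helper variable whose only production is that body

    def var_for(body):
        if body not in helper:
            helper[body] = unused.pop()          # fresh variable name
            rules[helper[body]] = {binarize(body)}
        return helper[body]

    def binarize(body):
        # A -> BCDE becomes A -> BX with X -> CDE, which is split again.
        if len(body) <= 2:
            return body
        return body[0] + var_for(body[1:])

    for var, bodies in grammar.items():
        for body in bodies:
            if len(body) >= 2:
                # Replace each terminal by a variable that produces it.
                body = "".join(s if s.isupper() else var_for(s) for s in body)
            rules[var].add(binarize(body))
    return rules

GRAMMAR = {
    "S": ["ASA", "aB", "a", "SA", "AS"],
    "A": ["b", "ASA", "aB", "a", "SA", "AS"],
    "B": ["b"],
}
for var, bodies in to_cnf(GRAMMAR).items():
    print(var, "->", " | ".join(sorted(bodies)))
```

Because helper variables are shared by body, the tail SA of ASA gets a single new variable for both S and A, just as X → SA is shared in the answer above.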
Parsing
• CYK algorithm: Dynamic Programming (taught in CSCI3160)
• Let s = s1 s2 s3 s4 … sn be a string
• Let s(i, j) be the substring si … sj of s, e.g. s = abcde, s(2, 4) = bcd
• s(i, j) can be constructed as s(i, k) + s(k + 1, j) for some k with i ≤ k < j
• e.g. s = abcde = "a" + "bcde", "ab" + "cde", …, "abcd" + "e"
Parsing
• CYK algorithm: Dynamic Programming (taught in CSCI3160)
• s(i, j) can be constructed as s(i, k) + s(k + 1, j) for some k with i ≤ k < j
• Main idea: if A derives s(i, k), B derives s(k + 1, j), and S → AB, then S derives s(i, j)
• e.g. A derives "ab", B derives "cde", and S → AB. Then S derives "abcde"
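The substring notation and the split points can be written out in a few lines. This is a small sketch; the function names `sub` and `splits` are illustrative, and `sub` uses the slide's 1-indexed, inclusive convention.

```python
def sub(s, i, j):
    """s(i, j): the substring s_i ... s_j, 1-indexed and inclusive."""
    return s[i - 1:j]

def splits(s):
    """All ways to write s as a concatenation of two non-empty parts."""
    return [(s[:k], s[k:]) for k in range(1, len(s))]

print(sub("abcde", 2, 4))
print(splits("abcde"))
```

A string of length n has n - 1 split points, which is why each CYK cell examines at most n - 1 pairs of smaller cells.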
Cocke-Younger-Kasami algorithm
• Use the CYK algorithm to parse abbab for the CFG
  S → ASA | aB
  A → B | S
  B → b | ε
Cocke-Younger-Kasami algorithm
• Chomsky Normal Form:
  S → AX | UB | SA | AS | a
  A → AX | UB | SA | AS | a | b
  X → SA
  U → a
  B → b
• x = abbab
  -
  -    -
  -    -    -
  -    -    -    -
  -    -    -    -    -
  a    b    b    a    b
Cocke-Younger-Kasami algorithm
  S → AX | UB | SA | AS | a
  A → AX | UB | SA | AS | a | b
  X → SA
  U → a
  B → b
• S, A, U can derive a
• A, B can derive b
  -
  -    -
  -    -    -
  -    -    -    -
  SAU  AB   AB   SAU  AB
  a    b    b    a    b
Cocke-Younger-Kasami algorithm
  S → AX | UB | SA | AS | a
  A → AX | UB | SA | AS | a | b
  X → SA
  U → a
  B → b
• If we can derive "a" (SAU) and "b" (AB), then we can derive "ab".
• SAU derives "a", AB derives "b". Look for variables that produce SA, SB, AA, AB, UA, or UB:
  S → SA, A → SA, X → SA (also S → UB, A → UB)
• Similarly for "bb", "ba", "ab"
  -
  -    -
  -    -    -
  SAX  -    SA   SAX
  SAU  AB   AB   SAU  AB
  a    b    b    a    b
Cocke-Younger-Kasami algorithm
  S → AX | UB | SA | AS | a
  A → AX | UB | SA | AS | a | b
  X → SA
  U → a
  B → b
• If we can derive "ab" (SAX) and "b" (AB), or "a" (SAU) and "bb" (-), then we can derive "abb".
  1. SAU derives "a", nothing derives "bb", so this split contributes nothing
  2. SAX derives "ab", AB derives "b". Look for variables that produce SA, SB, AA, AB, XA, or XB:
     S → SA, A → SA, X → SA
• Similarly for "bba" and "bab"
  -
  -    -
  SAX  SA   SAX
  SAX  -    SA   SAX
  SAU  AB   AB   SAU  AB
  a    b    b    a    b
Cocke-Younger-Kasami algorithm
  S → AX | UB | SA | AS | a
  A → AX | UB | SA | AS | a | b
  X → SA
  U → a
  B → b
• If we can derive "a" (SAU) and "bba" (SA), or "ab" (SAX) and "ba" (SA), or "abb" (SAX) and "a" (SAU), then we can derive "abba".
• e.g. SAX derives "ab", SA derives "ba". Look for variables that produce SS, SA, AA, AS, XS, or XA:
  S → SA, A → SA, X → SA
• Similarly for "bbab"
  -
  SAX  SAX
  SAX  SA   SAX
  SAX  -    SA   SAX
  SAU  AB   AB   SAU  AB
  a    b    b    a    b
Cocke-Younger-Kasami algorithm
  S → AX | UB | SA | AS | a
  A → AX | UB | SA | AS | a | b
  X → SA
  U → a
  B → b
• If we can derive "a" (SAU) and "bbab" (SAX), or "ab" (SAX) and "bab" (SAX), or "abb" (SAX) and "ab" (SAX), or "abba" (SAX) and "b" (AB), then we can derive "abbab".
• e.g. SAU derives "a", SAX derives "bbab". Look for variables that produce SS, SA, SX, AS, AA, AX, US, UA, or UX:
  S → SA, A → SA, X → SA
• If S is in the top-left cell, then x can be derived.
  SAX
  SAX  SAX
  SAX  SA   SAX
  SAX  -    SA   SAX
  SAU  AB   AB   SAU  AB
  a    b    b    a    b
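The table-filling procedure walked through on the last few slides can be sketched in Python as follows. The grammar encoding (variable to list of right-hand sides, each a one- or two-character string), the dictionary keyed by (i, j), and the name `cyk_table` are assumptions made for this example.

```python
CNF = {
    "S": ["AX", "UB", "SA", "AS", "a"],
    "A": ["AX", "UB", "SA", "AS", "a", "b"],
    "X": ["SA"],
    "U": ["a"],
    "B": ["b"],
}

def cyk_table(grammar, text):
    """table[(i, j)] = set of variables that derive s(i, j), 1-indexed inclusive."""
    n = len(text)
    table = {}
    # Bottom row: variables with a production V -> s_i.
    for i, ch in enumerate(text, start=1):
        table[(i, i)] = {v for v, bodies in grammar.items() if ch in bodies}
    # Longer substrings: try every split s(i, k) + s(k + 1, j).
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                for var, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in table[(i, k)]
                                and body[1] in table[(k + 1, j)]):
                            cell.add(var)
            table[(i, j)] = cell
    return table

table = cyk_table(CNF, "abbab")
print("S" in table[(1, 5)])  # membership test for the whole string
```

The cell `table[(1, n)]` corresponds to the top-left cell of the slides' triangular table. With three nested loops over positions, the running time grows cubically in the length of the string, times the size of the grammar.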
Parse tree reconstruction
• Starting from S in the top-left cell, there must be some k such that s(i, j) = s(i, k) + s(k+1, j)
• Then there must be some l1, l2 such that
  s(i, k) = s(i, l1) + s(l1+1, k)
  s(k+1, j) = s(k+1, l2) + s(l2+1, j)
• Do the rest recursively.
  SAX
  SAX  SAX
  SAX  SA   SAX
  SAX  -    SA   SAX
  SAU  AB   AB   SAU  AB
  a    b    b    a    b
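The recursive reconstruction can be sketched on top of a filled CYK table. The code below is a self-contained illustration under the same assumed grammar encoding as before; `cyk_table`, `build_tree` and `tree_yield` are hypothetical names, and the tree returned is just the first one found, since an ambiguous grammar may admit several.

```python
CNF = {
    "S": ["AX", "UB", "SA", "AS", "a"],
    "A": ["AX", "UB", "SA", "AS", "a", "b"],
    "X": ["SA"],
    "U": ["a"],
    "B": ["b"],
}

def cyk_table(grammar, text):
    """table[(i, j)] = set of variables that derive s(i, j), 1-indexed inclusive."""
    n = len(text)
    table = {(i, i): {v for v, bs in grammar.items() if ch in bs}
             for i, ch in enumerate(text, start=1)}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            table[(i, j)] = {
                var
                for k in range(i, j)
                for var, bodies in grammar.items()
                for body in bodies
                if len(body) == 2
                and body[0] in table[(i, k)] and body[1] in table[(k + 1, j)]
            }
    return table

def build_tree(grammar, table, text, var, i, j):
    """Rebuild one parse tree for s(i, j) rooted at var (var must be in the cell)."""
    if i == j:
        return (var, text[i - 1])
    for k in range(i, j):                      # find a split point k
        for body in grammar[var]:
            if (len(body) == 2 and body[0] in table[(i, k)]
                    and body[1] in table[(k + 1, j)]):
                left = build_tree(grammar, table, text, body[0], i, k)
                right = build_tree(grammar, table, text, body[1], k + 1, j)
                return (var, left, right)
    raise ValueError(f"{var} does not derive s({i}, {j})")

def tree_yield(tree):
    """Concatenate the leaves of a tree from left to right."""
    return tree[1] if len(tree) == 2 else tree_yield(tree[1]) + tree_yield(tree[2])

TEXT = "abbab"
table = cyk_table(CNF, TEXT)
tree = build_tree(CNF, table, TEXT, "S", 1, len(TEXT))
print(tree)
```

Storing the successful (k, body) pair in each cell while filling the table would avoid repeating the search during reconstruction; the sketch recomputes it for brevity.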
End • Questions?