Simplification of Context-Free Grammars

  1. Simplification of Context-Free Grammars • Some useful substitution rules. • Removing useless productions. • Removing λ-productions. • Removing unit-productions.

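  2. Some Useful Substitution Rules Let G = (V, T, S, P) with A → x1Bx2 ∈ P, where A and B are distinct variables, and let B → y1 | y2 | ... | yn be all the productions in P with B on the left. Then L(G) = L(G^), where G^ = (V, T, S, P^) and P^ is obtained from P by replacing A → x1Bx2 with A → x1y1x2 | x1y2x2 | ... | x1ynx2.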

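  3. Example G = ({A, B}, {a, b, c}, A, P) with productions A → a | aaA | abBc and B → abbA | b. Substituting for B in A → abBc gives G^ = (V, T, S, P^) with A → a | aaA | ababbAc | abbc and B → abbA | b. B stays in P^ but can no longer occur in any derivation from A.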
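
The substitution in this example can be reproduced with a minimal Python sketch. The representation is an assumption made for illustration: a grammar is a dict from each variable to a list of right-hand sides, each right-hand side is a string of one-character symbols, and the function name `substitute` is hypothetical.

```python
def substitute(prods, a, rhs, pos):
    """Apply the substitution rule to the production a -> rhs.

    prods maps each variable to a list of right-hand sides (strings of
    one-character symbols). rhs[pos] must be a variable B different from a.
    """
    b = rhs[pos]
    if b == a or b not in prods:
        raise ValueError("rhs[pos] must be a variable distinct from a")
    new_rhs = []
    for r in prods[a]:
        if r == rhs:
            # Replace B by each of its alternatives y1, ..., yn.
            new_rhs.extend(rhs[:pos] + y + rhs[pos + 1:] for y in prods[b])
        else:
            new_rhs.append(r)
    result = dict(prods)
    result[a] = new_rhs
    return result


# Grammar of the example: A -> a | aaA | abBc, B -> abbA | b
grammar = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
simplified = substitute(grammar, "A", "abBc", 2)
print(simplified["A"])  # ['a', 'aaA', 'ababbAc', 'abbc']
```

The productions of B are left untouched, which matches the rule: only the chosen production of A is replaced.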

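  4. Some Useful Substitution Rules Let G = (V, T, S, P), where A → Ax1 | Ax2 | ... | Axn and A → y1 | y2 | ... | ym are all the productions in P with A on the left, and no yi begins with A. Then L(G) = L(G^), where G^ = (V ∪ {Z}, T, S, P^), Z ∉ V is a new variable, and P^ is obtained from P by replacing the A-productions with A → yi | yiZ (i = 1, ..., m) and Z → xi | xiZ (i = 1, ..., n).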

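  5. Example G = ({A, B}, {a, b, c}, A, P) with productions A → Aa | aBc | λ and B → Bb | ba. Removing the left recursion on A with a new variable Z gives A → aBc | aBcZ | Z | λ, Z → a | aZ, B → Bb | ba. Removing the left recursion on B with a new variable Y then gives A → aBc | aBcZ | Z | λ, Z → a | aZ, B → ba | baY, Y → b | bY.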
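
The two left-recursion removals in this example can be sketched as follows. This is only an illustration under stated assumptions: right-hand sides are strings of one-character symbols, the empty string stands for λ, the caller supplies a fresh variable name, and the function name `remove_left_recursion` is hypothetical.

```python
def remove_left_recursion(prods, a, z):
    """Replace A -> A x_i | y_j by A -> y_j | y_j Z and Z -> x_i | x_i Z.

    z is the name of the new variable and must not be used in prods yet.
    """
    if z in prods:
        raise ValueError("z must be a fresh variable")
    xs = [r[1:] for r in prods[a] if r.startswith(a)]   # the x_i
    ys = [r for r in prods[a] if not r.startswith(a)]   # the y_j
    result = dict(prods)
    if not xs:
        return result  # A is not left-recursive, nothing to do
    result[a] = [y + suffix for y in ys for suffix in ("", z)]
    result[z] = [x + suffix for x in xs for suffix in ("", z)]
    return result


# Grammar of the example: A -> Aa | aBc | λ, B -> Bb | ba
grammar = {"A": ["Aa", "aBc", ""], "B": ["Bb", "ba"]}
step1 = remove_left_recursion(grammar, "A", "Z")
step2 = remove_left_recursion(step1, "B", "Y")
for variable, alternatives in step2.items():
    print(variable, "->", " | ".join(r or "λ" for r in alternatives))
```

The alternatives may be listed in a different order than on the slide, but as sets they are the same productions.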

  6. Removing Useless Productions S → aSb | λ | A, A → aA. The production S → A is redundant, as A cannot be transformed into a terminal string.

  7. Removing Useless Productions Let G = (V, T, S, P). A variable A ∈ V is useful iff there is a w ∈ L(G) such that S ⇒* xAy ⇒* w, with x, y ∈ (V ∪ T)*. A variable that is not useful is useless. A production is useless if it involves any useless variable.

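  8. Example G = ({S, A, B}, {a, b}, S, P) with productions S → A, A → aA | λ, B → bA. B is useless: it can derive terminal strings, but it cannot be reached from S, so B → bA is a useless production.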

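  9. Example G = ({S, A, B, C}, {a, b}, S, P) with productions S → aS | A | C, A → a, B → aa, C → aCb. C cannot derive a terminal string, so removing it leaves S → aS | A, A → a, B → aa. B cannot be reached from S, so removing it leaves S → aS | A, A → a.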

  10. Example (continued) [Figure: variable dependency graph for S → aS | A, A → a, B → aa, with nodes S, A, B. S has an edge to A; B cannot be reached from S.]

  11. Theorem Let G = (V, T, S, P) be a context-free grammar. Then there exists an equivalent grammar G^ = (V^, T^, S, P^) that does not contain any useless variables or productions.

  13. Theorem Proof: • Construct G1 = (V1, T, S, P1) such that V1 contains only variables A for which A ⇒* w ∈ T*. 1. Set V1 to ∅. 2. Repeat until no more variables are added to V1: For every A ∈ V for which P has a production of the form A → x1x2...xn with all xi ∈ V1 ∪ T, add A to V1. 3. Take P1 as all the productions in P whose symbols are all in V1 ∪ T.

  14. Theorem Proof: • Draw the variable dependency graph for G1 and find all variables that cannot be reached from S. • Remove those variables and the productions involving them. • Eliminate any terminal that does not occur in a useful production. The result is G^ = (V^, T^, S, P^).
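
The two phases of this proof can be sketched in Python. This is a sketch, not a definitive implementation, and it rests on assumptions: every variable appears as a key of the dict (possibly with an empty list), every other symbol is treated as a terminal, right-hand sides are strings of one-character symbols, and the name `remove_useless` is hypothetical.

```python
def remove_useless(prods, start):
    """Remove useless variables and productions in two phases."""
    variables = set(prods)

    def usable(rhs, allowed):
        # True if every variable occurring in rhs belongs to `allowed`.
        return all(s in allowed for s in rhs if s in variables)

    # Phase 1: V1 = variables that can derive a terminal string.
    v1 = set()
    changed = True
    while changed:
        changed = False
        for a, alternatives in prods.items():
            if a not in v1 and any(usable(r, v1) for r in alternatives):
                v1.add(a)
                changed = True
    p1 = {a: [r for r in prods[a] if usable(r, v1)] for a in prods if a in v1}

    # Phase 2: keep only variables reachable from the start symbol.
    reachable, stack = set(), []
    if start in p1:
        reachable.add(start)
        stack.append(start)
    while stack:
        a = stack.pop()
        for r in p1[a]:
            for s in r:
                if s in p1 and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {a: alts for a, alts in p1.items() if a in reachable}


# S -> aS | A | C, A -> a, B -> aa, C -> aCb
grammar = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
cleaned = remove_useless(grammar, "S")
print(cleaned)  # {'S': ['aS', 'A'], 'A': ['a']}
```

The order of the phases matters: running the reachability phase first would keep B → aa out but could leave variables that only became unreachable after the non-generating ones were removed.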

  15. Removing λ-Productions • Any production of a context-free grammar of the form A → λ is called a λ-production. • Any variable A for which the derivation A ⇒* λ is possible is called nullable.

  16. Example The grammar S → aS1b, S1 → aS1b | λ is equivalent to the grammar S → aS1b | ab, S1 → aS1b | ab, which has no λ-productions.

  17. Theorem Let G = (V, T, S, P) be a CFG such that λ ∉ L(G). Then there exists an equivalent grammar G^ having no λ-productions.

  18. Theorem Proof: • Find the set VN of all nullable variables of G: 1. For all productions A → λ, put A into VN. 2. Repeat until no more variables are added to VN: For all productions B → A1A2...An with all Ai ∈ VN, add B to VN.

  19. Theorem Proof: • For each production in P of the form A → x1x2...xm (m ≥ 1, xi ∈ V ∪ T), put into P^ that production as well as all those generated by replacing nullable variables with λ in all possible combinations. Exception: if all xi are nullable, then A → λ is not put into P^.

  20. Example Original grammar: S → ABaC, A → BC, B → b | λ, C → D | λ, D → d. Nullable variables: VN = {A, B, C}. Resulting grammar: S → ABaC | BaC | AaC | ABa | aC | Aa | Ba | a, A → B | C | BC, B → b, C → D, D → d.
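
The construction of VN and P^ for this example can be checked with a short Python sketch. The representation is assumed for illustration: right-hand sides are strings of one-character symbols, the empty string stands for λ, and the function names are hypothetical.

```python
from itertools import product


def nullable_variables(prods):
    """Compute VN, the set of variables A with A =>* λ."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for a, alternatives in prods.items():
            # An empty right-hand side (λ) passes the all() test trivially.
            if a not in nullable and any(
                all(s in nullable for s in r) for r in alternatives
            ):
                nullable.add(a)
                changed = True
    return nullable


def remove_lambda_productions(prods):
    """Build P^ by dropping nullable variables in every combination."""
    nullable = nullable_variables(prods)
    result = {}
    for a, alternatives in prods.items():
        new_alts = []
        for r in alternatives:
            # Each nullable symbol may be kept or replaced by λ.
            options = [(s, "") if s in nullable else (s,) for s in r]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_alts:
                    new_alts.append(candidate)  # never add A -> λ
        result[a] = new_alts
    return result


# S -> ABaC, A -> BC, B -> b | λ, C -> D | λ, D -> d
grammar = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
print(sorted(nullable_variables(grammar)))  # ['A', 'B', 'C']
print(remove_lambda_productions(grammar)["S"])
```

Using `itertools.product` over keep-or-drop options is one convenient way to enumerate "all possible combinations"; the number of generated productions can grow exponentially in the number of nullable symbols on a right-hand side.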

  21. Removing Unit-Productions Any production of a context-free grammar of the form A → B, where A, B ∈ V, is called a unit-production.

  22. Theorem Let G = (V, T, S, P) be a CFG without λ-productions. Then there exists an equivalent grammar G^ = (V, T, S, P^) that does not have any unit-productions.

  23. Theorem Proof: 1. Put into P^ all non-unit-productions of P. 2. Repeat until no more productions are added to P^: For every A and B ∈ V such that A ⇒* B and B → y1 | y2 | ... | yn ∈ P^, add A → y1 | y2 | ... | yn to P^.

  24. Example S → Aa | B, B → A | bb, A → a | bc | B

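  25. Example Original grammar: S → Aa | B, B → A | bb, A → a | bc | B. Unit derivations: S ⇒* A, S ⇒* B, A ⇒* B, B ⇒* A. Non-unit productions put into P^: S → Aa, A → a | bc, B → bb. Productions added: S → a | bc | bb, A → bb, B → a | bc. Resulting grammar: S → Aa | a | bc | bb, A → a | bc | bb, B → bb | a | bc.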
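
The example can be replayed with a small Python sketch of the construction. It assumes a grammar without λ-productions, one-character symbols, and that every variable is a key of the dict; the function name `remove_unit_productions` is hypothetical.

```python
def remove_unit_productions(prods):
    """Replace unit-productions A -> B by the non-unit productions of B."""
    variables = set(prods)

    def is_unit(rhs):
        return rhs in variables  # right-hand side is a single variable

    result = {}
    for a in prods:
        # All B with A =>* B using unit-productions only (A itself included).
        reach = [a]
        for b in reach:  # the list grows while it is being traversed
            for r in prods[b]:
                if is_unit(r) and r not in reach:
                    reach.append(r)
        # Collect the non-unit productions of every such B.
        new_alts = []
        for b in reach:
            for r in prods[b]:
                if not is_unit(r) and r not in new_alts:
                    new_alts.append(r)
        result[a] = new_alts
    return result


# S -> Aa | B, B -> A | bb, A -> a | bc | B
grammar = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
for variable, alternatives in remove_unit_productions(grammar).items():
    print(variable, "->", " | ".join(alternatives))
```

Computing the unit-reachable variables first avoids looping forever on cycles such as A → B, B → A.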

  26. Theorem Let L be a context-free language that does not contain λ. Then there exists a CFG that generates L and does not have any useless productions, λ-productions, or unit-productions.

  27. Theorem Proof: Apply the three constructions in this order: 1. Remove λ-productions. 2. Remove unit-productions. 3. Remove useless productions.

  28. Two Important Normal Forms • Chomsky normal form. • Greibach normal form.

  29. Chomsky Normal Form A context-free grammar G = (V, T, S, P) is in Chomsky normal form iff all productions are of the form A → BC or A → a, where A, B, C ∈ V and a ∈ T.

  30. Theorem Any context-free grammar G = (V, T, S, P) such that λ ∉ L(G) has an equivalent grammar G^ = (V^, T, S, P^) in Chomsky normal form.

  31. Theorem Proof: Assume G has no λ-productions and no unit-productions. First, construct an equivalent grammar G1 = (V1, T, S, P1). • V1 = V ∪ {Ba | a ∈ T}, P1 = {Ba → a | a ∈ T}. • Remove all terminals from productions of length > 1: 1. Put all productions A → a into P1. 2. Repeat until no more productions are added to P1: For each production A → x1x2...xn (n ≥ 2, xi ∈ T ∪ V), add A → C1C2...Cn to P1, where Ci = xi if xi ∈ V, or Ci = Ba if xi = a ∈ T.

  32. Theorem Proof: Construct G^ = (V^, T, S, P^) from G1 = (V1, T, S, P1). • V^ consists of V1 together with the new variables Di introduced below. • Reduce the length of the right sides of the productions: 1. Put all productions A → a and A → BC into P^. 2. Repeat until no more productions are added to P^: For each production A → C1C2...Cn (n > 2), add A → C1D1, D1 → C2D2, ..., Dn-2 → Cn-1Cn to P^, where D1, ..., Dn-2 are new variables.

  33. Example S → ABa, A → aaB, B → aC
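
Applying the two steps of the proof to this example can be sketched in Python. This is a sketch under stated assumptions: the input has no λ-productions or unit-productions, right-hand sides are tuples of symbols so that new variables may have longer names, the set of variables is passed explicitly, and the names `to_cnf`, `B_a` and `D1`, `D2`, ... are hypothetical choices that must not clash with existing variables.

```python
def to_cnf(prods, variables):
    """Convert a grammar without λ- and unit-productions to Chomsky normal form."""
    # Step 1: in right-hand sides of length >= 2, replace each terminal a
    # by a new variable B_a with the single production B_a -> a.
    step1, terminal_var = {}, {}
    for a, alternatives in prods.items():
        for rhs in alternatives:
            if len(rhs) >= 2:
                rhs = tuple(
                    s if s in variables else terminal_var.setdefault(s, "B_" + s)
                    for s in rhs
                )
            step1.setdefault(a, []).append(rhs)
    for terminal, var in terminal_var.items():
        step1[var] = [(terminal,)]

    # Step 2: split right-hand sides longer than 2 using new variables D1, D2, ...
    result, counter = {}, 0
    for a, alternatives in step1.items():
        for rhs in alternatives:
            head = a
            while len(rhs) > 2:
                counter += 1
                d = "D%d" % counter
                result.setdefault(head, []).append((rhs[0], d))
                head, rhs = d, rhs[1:]
            result.setdefault(head, []).append(rhs)
    return result


# S -> ABa, A -> aaB, B -> aC
grammar = {"S": [("A", "B", "a")], "A": [("a", "a", "B")], "B": [("a", "C")]}
cnf = to_cnf(grammar, {"S", "A", "B", "C"})
for variable, alternatives in cnf.items():
    print(variable, "->", " | ".join(" ".join(rhs) for rhs in alternatives))
```

For this input the sketch yields S → A D1, D1 → B B_a, A → B_a D2, D2 → B_a B, B → B_a C, B_a → a. The variable C has no productions in the example as given, so the sketch leaves it as it is.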

  34. Greibach Normal Form A context-free grammar G = (V, T, S, P) is in Greibach normal form iff all productions are of the form A → ax, where a ∈ T and x ∈ V*.

  35. Theorem Any context-free grammar G = (V, T, S, P) such that λ ∉ L(G) has an equivalent grammar G^ = (V^, T, S, P^) in Greibach normal form.

  36. Theorem Proof: • Rewrite the grammar in Chomsky normal form. • Relabel the variables A1, A2, ..., An. • Rewrite the grammar so that all productions have one of the following forms: Ai → Ajxj (j > i), Zi → Ajxj (j ≤ n, Zi introduced to eliminate left recursion), Ai → axi (a ∈ T and xi ∈ V*). • Start from An → axn to derive Greibach productions.

  37. Example A2 → A1A2 | b, A1 → A2A2 | a
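
One possible way to work this example through the steps of the proof is shown below. The name Z for the variable introduced when removing the left recursion is an assumption; the slides only say that such variables Zi are introduced.

```latex
\begin{align*}
A_2 &\to A_1 A_2 \mid b
    && \text{violates } j > i \\
A_2 &\to A_2 A_2 A_2 \mid a A_2 \mid b
    && \text{substitute } A_1 \to A_2 A_2 \mid a \\
A_2 &\to a A_2 \mid b \mid a A_2 Z \mid b Z, \quad
Z \to A_2 A_2 \mid A_2 A_2 Z
    && \text{remove left recursion} \\
A_1 &\to a A_2 A_2 \mid b A_2 \mid a A_2 Z A_2 \mid b Z A_2 \mid a
    && \text{substitute } A_2 \text{ in } A_1 \to A_2 A_2 \\
Z &\to a A_2 A_2 \mid b A_2 \mid a A_2 Z A_2 \mid b Z A_2 \\
  &\quad \mid a A_2 A_2 Z \mid b A_2 Z \mid a A_2 Z A_2 Z \mid b Z A_2 Z
    && \text{substitute } A_2 \text{ in } Z
\end{align*}
```

After the last step every production of A1, A2 and Z starts with a terminal followed only by variables, so the grammar is in Greibach normal form.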

  38. Homework • Exercises: 3, 4, 5, 6, 7, 8, 17, 22 of Section 6.1 - Linz’s book. • Exercises: 2, 3, 4, 6, 9, 10, 11 of Section 6.2 - Linz’s book. • Presentations: Section 6.3 and Section 7.4.

