
Chapter 6 Simplification of CFGs and Normal Forms


leone

Presentation Transcript


  1. Chapter 6: Simplification of CFGs and Normal Forms

  2. Parsing (Review) • Given a string w and a grammar G, a parser finds a derivation of the string w from the grammar G, or else determines that the string is not part of the language • Thus, a parser solves the membership problem for a language, which is the problem of deciding, for any string w and grammar G, whether w belongs to the language generated by G • Typically, a parser also constructs a parse tree for the string (which can be used by a compiler for code generation)

  3. Parsing (Review) • Can we solve the membership problem for context-free languages? • That is, can we develop a parsing algorithm for any context-free language? • If so, can we develop an efficient parsing algorithm? • We saw in the previous chapter (ch5) that we can, if we place restrictions on the grammar. • Normal forms of context-free grammars are interesting in that, although they are restricted forms, it can be shown that every context-free grammar can be converted to a normal form. • The two types of normal forms that we will look at are Chomsky normal form and Greibach normal form.

  4. Parsing (Review) • Simplified forms can eliminate ambiguity and otherwise “improve” a grammar • What we would like is for every production in a context-free grammar to have a form such that the length of the sentential form never decreases during a derivation. • Given this form, if a derivation produces a sentential form longer than the input string, that derivation cannot yield the string and can be abandoned; if every derivation ends this way, the string does not belong to the language.

  5. 6.1: Methods for Transforming Grammars (1) A Useful Substitution Rule • Theorem 6.1: • This intuitive theorem allows us to simplify grammars. • Let G = (NT, T, S, P) be a context-free grammar. Suppose that P contains a production rule of the form • A → xBz • Assume that A and B are different non-terminals and that • B → y1 | y2 | ... | yn is the set of all productions in P which have B as the left side. • Let G’ = (NT, T, S, P’) be the grammar in which P’ is constructed from P by replacing the rule • A → xBz with A → xy1z | xy2z | ... | xynz • Then L(G’) = L(G)

  6. 6.1: Methods for Transforming Grammars (2) A Useful Substitution Rule • Let G be • S → a | aaS | abBc • B → abbS | b • Applying Theorem 6.1 results in • S → a | aaS | ababbSc | abbc • B → abbS | b • The rules B → abbS | b, which are still part of the grammar, no longer serve any purpose, because B can no longer be reached from S. • Both of these useless rules may be deleted without changing the language of the grammar.
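The substitution in Theorem 6.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a grammar is a dict from a non-terminal to a list of right-hand-side strings, every symbol is one character, and the helper name `substitute` is hypothetical rather than taken from the slides.

```python
def substitute(grammar, a, rule, b):
    """Replace the rule a -> rule by one rule per b-production (Theorem 6.1).

    Only the first occurrence of b in rule is expanded; a and b must differ.
    """
    if a == b or b not in rule:
        raise ValueError("need a rule A -> xBz with A different from B")
    x, z = rule.split(b, 1)                      # rule = x b z
    expanded = [x + y + z for y in grammar[b]]   # x y1 z, x y2 z, ...
    new_rules = []
    for rhs in grammar[a]:
        # swap the chosen rule for its expansions, keep the other rules as-is
        new_rules.extend(expanded if rhs == rule else [rhs])
    return {**grammar, a: new_rules}


g = {"S": ["a", "aaS", "abBc"], "B": ["abbS", "b"]}
g2 = substitute(g, "S", "abBc", "B")
print(g2["S"])  # ['a', 'aaS', 'ababbSc', 'abbc']
```

The B rules are left in place on purpose; deleting them is a separate step, since they only become useless once no remaining rule mentions B.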

  7. 6.1: Methods for Transforming Grammars (3) Removing Useless Productions • A non-terminal A is useful (it occurs in at least one derivation) if: • it is reachable: it occurs in a sentential form, S ⇒* αAβ • it is live: it generates a terminal string, A ⇒* w with w ∈ T* • A non-terminal A is useless if: • A does not occur in any sentential form (it cannot be reached from the start symbol), OR • A does not generate any string of terminals (it cannot derive a terminal string) • A terminal is useful if it occurs in a sentence w ∈ L(G) • Any production involving a useless symbol is a useless production.

  8. 6.1: Methods for Transforming Grammars (4) Removing Useless Productions • To eliminate useless symbols: • First: Find the set TERM that contains all non-terminals that derive a terminal string • A ⇒* w, where w ∈ T* • Non-terminals NOT in TERM are useless; they cannot contribute to generating strings in L(G) • Second: Find the set REACH that contains all non-terminals A ∈ TERM that are reachable from S • S ⇒* αAβ

  9. 6.1: Methods for Transforming Grammars (5) Removing Useless Productions • Example 1 • G: • S → AC | BS | B • A → aA | aF • B → CF | b • C → cC | D • D → aD | BD | C • E → aA | BSA • F → bB | b • L(G) is b+ • B, F ∈ TERM, since both generate terminals • S ∈ TERM, since S → B and hence S ⇒* b • A ∈ TERM, since A → aF and hence A ⇒* ab • E ∈ TERM, since E → aA and hence E ⇒* aab

  10. 6.1: Methods for Transforming Grammars (6) Removing Useless Productions • C and D do not belong to TERM, so all rules containing C and D are removed • The new grammar is • GT: • S → BS | B • A → aA | aF • B → b • E → aA | BSA • F → bB | b • All non-terminals in GT derive terminal strings • Now, we must remove the non-terminals that do not occur in sentential forms of the grammar • A set REACH is built that contains all non-terminals in TERM that are derivable from S

  11. 6.1: Methods for Transforming Grammars (7) Removing Useless Productions • GT: • S → BS | B • A → aA | aF • B → b • E → aA | BSA • F → bB | b • S ∈ REACH, since it is the start symbol • B ∈ REACH, since S → BS, and hence B is derivable from S • A, E, and F cannot be derived from S or B, so all rules containing A, E and F are removed

  12. 6.1: Methods for Transforming Grammars (8) Removing Useless Productions • The new grammar is • GU: • S → BS | B • B → b • L(GU) = b+ • The set of terminals of GU is {b}; a is removed since it does not occur in any string in the language of GU • The order is important: • Applying the second step (REACH) before the first step (TERM) may not remove all useless symbols.
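The two-step procedure (TERM first, then REACH) can be sketched as follows. The representation is an assumption made for illustration: a dict from non-terminal to a list of right-hand-side strings, one character per symbol, with any symbol that is not a dict key treated as a terminal. The function name `remove_useless` is hypothetical.

```python
def remove_useless(grammar, start):
    """Return the grammar restricted to live and reachable non-terminals."""

    def all_live(rhs, term):
        # every symbol is a terminal or an already-known live non-terminal
        return all(s in term or s not in grammar for s in rhs)

    # Step 1 (TERM): non-terminals that derive some terminal string.
    term = set()
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            if nt not in term and any(all_live(rhs, term) for rhs in rules):
                term.add(nt)
                changed = True
    # Drop every rule that mentions a non-terminal outside TERM.
    live = {
        nt: [rhs for rhs in rules if all_live(rhs, term)]
        for nt, rules in grammar.items()
        if nt in term
    }

    # Step 2 (REACH): non-terminals reachable from the start symbol.
    reach = set()
    todo = [start] if start in live else []
    while todo:
        nt = todo.pop()
        if nt in reach:
            continue
        reach.add(nt)
        for rhs in live[nt]:
            todo.extend(s for s in rhs if s in live)
    return {nt: rules for nt, rules in live.items() if nt in reach}


example = {
    "S": ["AC", "BS", "B"],
    "A": ["aA", "aF"],
    "B": ["CF", "b"],
    "C": ["cC", "D"],
    "D": ["aD", "BD", "C"],
    "E": ["aA", "BSA"],
    "F": ["bB", "b"],
}
simplified = remove_useless(example, "S")
print(simplified)  # {'S': ['BS', 'B'], 'B': ['b']}
```

Running REACH on the already pruned `live` grammar is what makes the order matter: A is reachable in the original grammar only through S → AC, a rule that disappears once C is found not to be live.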

  13. 6.1: Methods for Transforming Grammars (9) Removing Useless Productions • Home exercise: Remove all useless productions. • S → AB | CD | ADF | CF | EA • A → abA | ab • B → bB | aD | BF | aF • C → cB | EC | Ab • D → bB | FFB • E → bC | AB • F → abbF | baF | bD | BB • G → EbE | CE | ba • Second exercise: Let G = ({S, A, B, C}, {a, b}, S, {S → aS | A | C, A → a, B → aa, C → aCb}) be a CFG. • Remove all useless productions • C is useless because it derives no terminal string, and B is useless because it cannot be reached from S, which leaves S → aS | A and A → a • Substituting A → a into S → A (Theorem 6.1) gives the final grammar • G’ = ({S}, {a}, S, {S → aS | a})

  14. 6.1: Methods for Transforming Grammars (10) Removing ε-Productions • Let G be • S → SaB | aB • B → bB | ε • A non-terminal symbol that can derive the null string (ε) is called nullable. • For example, in G above, B is nullable since B → ε • A grammar without nullable non-terminals is called non-contracting • G, above, is not non-contracting, since it has one nullable non-terminal, which is B.

  15. 6.1: Methods for Transforming Grammars (11) Removing ε-Productions • How to find nullable non-terminals? • Mark all non-terminals A for which there exists a production of the form A → ε • Repeat • Mark each non-terminal X for which there exists a rule X → β such that all symbols in β have been marked as nullable • Until no new non-terminal is marked • Read Theorem 6.3

  16. 6.1: Methods for Transforming Grammars (12) Removing ε-Productions • The set of nullable non-terminals of the grammar • S → ACA • A → aAa | B | C • B → bB | b • C → cC | ε • is {S, A, C} • C is nullable since C → ε and hence C ⇒* ε • A is nullable since A → C, and C is nullable • S is nullable since S → ACA, and A and C are nullable
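The marking procedure for nullable non-terminals is a small fixed-point loop. The sketch below assumes a dict from non-terminal to a list of right-hand-side strings with one character per symbol, and writes ε as the empty string; the function name `nullable` is hypothetical.

```python
def nullable(grammar):
    """Return the set of non-terminals that derive the empty string."""
    marked = set()
    changed = True
    while changed:                      # repeat until no new mark is added
        changed = False
        for nt, rules in grammar.items():
            # all() over an empty string is True, so A -> ε marks A at once;
            # a terminal is never in marked, so it blocks the rule
            if nt not in marked and any(
                all(s in marked for s in rhs) for rhs in rules
            ):
                marked.add(nt)
                changed = True
    return marked


g = {"S": ["ACA"], "A": ["aAa", "B", "C"], "B": ["bB", "b"], "C": ["cC", ""]}
print(sorted(nullable(g)))  # ['A', 'C', 'S']
```

On this grammar the loop marks C in the first pass, A in the second and S in the third, which is the same order as the argument on the slide.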

  17. 6.1: Methods for Transforming Grammars (13) Removing ε-Productions • Find nullable non-terminals. • S → aS | SS | bA • A → BB • B → CC | ab | aAbC • C → ε

  18. 6.1: Methods for Transforming Grammars (14) Removing ε-Productions • If ε ∉ L(G), we can eliminate all productions A → ε • For every B referring to A, add a copy of the rule with A omitted: • B → aAb | … and A → ε | … become • B → ab | aAb | … and A → … • For example, if B → ε and A → BABa • Then after eliminating the rule B → ε, new rules for A will be added • A → BABa • A → ABa • A → BAa • A → Aa

  19. 6.1: Methods for Transforming Grammars (15) Removing ε-Productions • Let G be • S → SaB | aB • B → bB | ε • After removing ε-productions, the new grammar will be • S → SaB | Sa | aB | a • B → bB | b • Let G = (NT, T, P, S) be a CFG. If A ⇒* w, then the grammar G’ = (NT, T, P ∪ {A → w}, S) is equivalent to G (i.e., L(G) = L(G’)) • The removal of ε-productions increases the number of rules but reduces the length of derivations.

  20. 6.1: Methods for Transforming Grammars (16) Removing ε-Productions • Let G be • S → ACA • A → aAa | B | C • B → bB | b • C → cC | ε • The equivalent essentially non-contracting grammar GL is • GL: • S → ACA | CA | AA | AC | A | C | ε • A → aAa | aa | B | C • B → bB | b • C → cC | c • Since S ⇒* ε in G, the rule S → ε is allowed in GL, but all other ε-productions are replaced • A grammar satisfying these conditions is called essentially non-contracting (only the start symbol is nullable)
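One way to build an essentially non-contracting grammar in a single pass is to compute the nullable set and then, for each rule, emit every variant obtained by keeping or dropping each nullable symbol. The sketch below is a variant of the idea on the slides, not a transcription of them. It assumes a dict from non-terminal to a list of right-hand-side strings, one character per symbol, ε written as the empty string, and a hypothetical function name `remove_epsilon`.

```python
from itertools import product


def remove_epsilon(grammar, start):
    """Remove ε-productions, keeping start -> ε only if start is nullable."""
    # Nullable non-terminals, found by the usual marking loop.
    null = set()
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            if nt not in null and any(all(s in null for s in r) for r in rules):
                null.add(nt)
                changed = True

    result = {}
    for nt, rules in grammar.items():
        new_rules = []
        for rhs in rules:
            # each nullable symbol may be kept or dropped; others are kept
            choices = [(s, "") if s in null else (s,) for s in rhs]
            for pick in product(*choices):
                candidate = "".join(pick)
                # skip the empty right-hand side and duplicates
                if candidate and candidate not in new_rules:
                    new_rules.append(candidate)
        result[nt] = new_rules
    if start in null:
        result[start].append("")        # the one permitted ε-production
    return result


g = {"S": ["ACA"], "A": ["aAa", "B", "C"], "B": ["bB", "b"], "C": ["cC", ""]}
gl = remove_epsilon(g, "S")
for nt, rules in gl.items():
    print(nt, "->", " | ".join(r or "ε" for r in rules))
```

For the grammar above, the rule sets agree with GL up to the order of the alternatives. A rule with k nullable symbols can produce up to 2^k variants, which is the growth in the number of rules mentioned on the previous slide.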

  21. 6.1: Methods for Transforming Grammars (17) Removing ε-Productions • Let G be • S → aS | SS | bA • A → BB • B → ab | aAbC | aAb | CC • C → ε • We eliminate C → ε by replacing: • B → CC with B → CC, B → C, and B → ε • B → aAbC with B → aAbC and B → aAb • Since C → ε is the only C production, • only B → ε and B → aAb are retained. • The new grammar: • S → aS | SS | bA • A → BB • B → ε | ab | aAb

  22. 6.1: Methods for Transforming Grammars (18) Removing ε-Productions • The new grammar: • S → aS | SS | bA • A → BB • B → ε | ab | aAb • We eliminate B → ε by replacing • A → BB with A → BB, A → B, and A → ε • Since there are other B productions, these are all retained • The new grammar: • S → aS | SS | bA • A → BB | B | ε • B → ab | aAb

  23. 6.1: Methods for Transforming Grammars (19) Removing ε-Productions • The new grammar: • S → aS | SS | bA • A → BB | B | ε • B → ab | aAb • Finally we eliminate A → ε by replacing • B → aAb with B → aAb, B → ab • S → bA with S → bA | b • The final CFG is: • S → aS | SS | bA | b • A → BB | B • B → ab | aAb

  24. 6.1: Methods for Transforming Grammars (20) Removal of Unit Rules • Rules of the form A → B, where B is a single non-terminal, are called unit rules • Consider the rules • A → aA | a | B • B → bB | b | C • The unit rule A → B indicates that any string derivable from B is also derivable from A • The removal of unit rules increases the number of rules but reduces the length of derivations.

  25. 6.1: Methods for Transforming Grammars (21) Removal of Unit Rules • To eliminate the unit rule A → B, add A rules that directly generate the same strings as B • Add a rule A → u for each B → u and delete A → B from the grammar • The rules A → B, B → a | ... become A → a | ..., B → a | ... • Read Theorem 6.4

  26. 6.1: Methods for Transforming Grammars (22) Removal of Unit Rules • Consider the rules • A → aA | a | B • B → bB | b | C • The new rules after eliminating the unit rule A → B • A → aA | a | bB | b | C • B → bB | b | C • We add new rules to A by replacing B in A with all of B's right-hand sides

  27. 6.1: Methods for Transforming Grammars (23) Removal of Unit Rules • GL: • S → ACA | CA | AA | AC | A | C | ε • A → aAa | aa | B | C • B → bB | b • C → cC | c • The new equivalent grammar (without unit rules) • GC: • S → ACA | CA | AA | AC | aAa | aa | bB | b | cC | c | ε • A → aAa | aa | bB | b | cC | c • B → bB | b • C → cC | c

  28. 6.1: Methods for Transforming Grammars (24) Removal of Unit Rules • Remove unit rules: • S → T | S + T • T → F | F * T • F → a | (S) • After eliminating T → F: • S → T | S + T • T → a | (S) | F * T • F → a | (S) • After eliminating S → T: • S → a | (S) | F * T | S + T • T → a | (S) | F * T • F → a | (S)
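Unit rules can also be removed in one pass by first collecting, for each non-terminal, every non-terminal it reaches through unit rules alone, and then copying over the non-unit rules of that whole chain. This closure-based variant is swapped in here for brevity; the slides instead remove one unit rule at a time. The sketch assumes a dict from non-terminal to a list of right-hand-side strings, one character per symbol, and no ε-productions; `remove_unit_rules` is a hypothetical name.

```python
def remove_unit_rules(grammar):
    """Return an equivalent grammar with no rules of the form A -> B."""
    result = {}
    for nt in grammar:
        # chain: nt plus everything reachable from it via unit rules only
        chain = [nt]
        for b in chain:                 # the list grows while it is scanned
            for rhs in grammar[b]:
                if rhs in grammar and rhs not in chain:
                    chain.append(rhs)
        # copy every non-unit rule of every non-terminal in the chain
        rules = []
        for b in chain:
            for rhs in grammar[b]:
                if rhs not in grammar and rhs not in rules:
                    rules.append(rhs)
        result[nt] = rules
    return result


g = {"S": ["T", "S+T"], "T": ["F", "F*T"], "F": ["a", "(S)"]}
for nt, rules in remove_unit_rules(g).items():
    print(nt, "->", " | ".join(rules))
# S -> S+T | F*T | a | (S)
# T -> F*T | a | (S)
# F -> a | (S)
```

The output matches the last grammar on the slide up to the order of the alternatives.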

  29. 6.2: Chomsky Normal Form (1) • The Chomsky normal form places restrictions on the length and the composition of the right-hand side of a rule • Definition 6.4: • A CFG is in Chomsky normal form if each production rule has one of the following forms: • A → a • A → BC • S → ε • where B, C ∈ NT • Read Theorem 6.6

  30. 6.2: Chomsky Normal Form (2) • Algorithm Step 1 • Make sure that the following are satisfied: • No ε-productions (other than S → ε) • No chain rules • No useless symbols

  31. 6.2: Chomsky Normal Form (3) • Algorithm Step 2 • Eliminate terminals from RHS of productions • For each production A → X1X2…Xm • where Xi ∈ NT ∪ T • If m > 1, replace each terminal a in the RHS of A • Add (if needed) Ca → a for each a ∈ T, where each Ca is a new non-terminal. • In production A, replace terminal a with the corresponding Ca

  32. 6.2: Chomsky Normal Form (4) • Algorithm Step 3 • Eliminate productions with long RHS: • For each production: • A → B1B2…Bm, m > 2, where Bi ∈ NT • replace with productions • A → B1D1 • D1 → B2D2 • … • Dm-2 → Bm-1Bm • where D1…Dm-2 are new non-terminals.

  33. 6.2: Chomsky Normal Form (5) • Original grammar (no chain rules, useless symbols, or ε-productions): • S → XaY | Yb • X → YXaY | a • Y → SS | aX | b • Grammar after eliminating terminals from RHSs: • S → XAY | YB • A → a • X → YXAY | a • B → b • Y → SS | AX | b • Grammar after eliminating long RHSs: • S → XT | YB • T → AY • A → a • X → YF | a • F → XG • B → b • Y → SS | AX | b • G → AY • Note: Could simplify by combining redundant variables T and G

  34. 6.2: Chomsky Normal Form (6) • Original grammar (no chain rules, useless symbols, or ε-productions): • S → aXYZ | a • X → aX | a • Y → bcY | bc • Z → cZ | c • Grammar after eliminating terminals from RHSs: • S → AXYZ | a • A → a • X → AX | a • B → b • Y → BCY | BC • C → c • Z → CZ | c • Grammar after eliminating long RHSs: • S → AF | a • A → a • F → XG • X → AX | a • B → b • G → YZ • Y → BH | BC • C → c • H → CY • Z → CZ | c • See Example 6.8
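Steps 2 and 3 of the conversion can be sketched as below. The representation is an assumption for illustration: a dict from non-terminal to a list of tuples of symbols, so that new non-terminals can have multi-character names. The names `to_cnf`, `C_a` and `D1, D2, …` are hypothetical and are assumed not to clash with symbols already in the grammar; the input is assumed to have passed step 1 already.

```python
def to_cnf(grammar):
    """Apply steps 2 and 3 to a grammar with no ε-, unit, or useless rules."""
    symbols = {s for rules in grammar.values() for rhs in rules for s in rhs}
    terminals = symbols - set(grammar)

    # Step 2: in right-hand sides longer than 1, replace terminal a by C_a.
    step2 = {nt: [] for nt in grammar}
    for nt, rules in grammar.items():
        for rhs in rules:
            if len(rhs) > 1:
                rhs = tuple(f"C_{s}" if s in terminals else s for s in rhs)
            step2[nt].append(rhs)
    for a in sorted(terminals):
        # add C_a -> a only if C_a is actually used somewhere
        if any(f"C_{a}" in rhs for rules in step2.values() for rhs in rules):
            step2[f"C_{a}"] = [(a,)]

    # Step 3: split A -> B1 B2 ... Bm (m > 2) into a chain of binary rules.
    out = {}
    counter = 0
    for nt, rules in step2.items():
        out.setdefault(nt, [])
        for rhs in rules:
            left = nt
            while len(rhs) > 2:
                counter += 1
                fresh = f"D{counter}"               # new non-terminal
                out.setdefault(left, []).append((rhs[0], fresh))
                left, rhs = fresh, rhs[1:]
            out.setdefault(left, []).append(rhs)
    return out


g = {
    "S": [("a", "X", "Y", "Z"), ("a",)],
    "X": [("a", "X"), ("a",)],
    "Y": [("b", "c", "Y"), ("b", "c")],
    "Z": [("c", "Z"), ("c",)],
}
cnf = to_cnf(g)
for nt, rules in cnf.items():
    print(nt, "->", " | ".join(" ".join(rhs) for rhs in rules))
```

On the grammar from the slide this yields the same shape as the hand-worked result, with `C_a`, `C_b`, `C_c` in place of A, B, C and `D1`, `D2`, `D3` in place of F, G, H.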

  35. 6.2: Greibach Normal Form (7) • A context-free grammar is in Greibach Normal Form if every production is of the form A → aX • where A ∈ NT, X ∈ NT*, and a ∈ Σ • Examples: • G1 = ({S, A}, {a, b}, S, {S → aSA | a, A → aA | b}) • GNF • G2 = ({S, A}, {a, b}, S, {S → AS | AAS, A → SA | aa}) • not GNF • This grammar is not in GNF: • S → AB • A → aA | bB | b • B → b • This grammar is in GNF: • S → aAB | bBB | bB • A → aA | bB | b • B → b
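Checking whether a grammar is in Greibach normal form only requires inspecting the shape of each rule. A minimal sketch, assuming a dict from non-terminal to a list of right-hand-side strings with one character per symbol (anything that is not a dict key counts as a terminal); `is_gnf` is a hypothetical name.

```python
def is_gnf(grammar):
    """True if every rule is one terminal followed by non-terminals only."""
    for rules in grammar.values():
        for rhs in rules:
            # the first symbol must exist and be a terminal
            if not rhs or rhs[0] in grammar:
                return False
            # everything after it must be a non-terminal
            if any(s not in grammar for s in rhs[1:]):
                return False
    return True


g1 = {"S": ["aSA", "a"], "A": ["aA", "b"]}
g2 = {"S": ["AS", "AAS"], "A": ["SA", "aa"]}
g3 = {"S": ["AB"], "A": ["aA", "bB", "b"], "B": ["b"]}
g4 = {"S": ["aAB", "bBB", "bB"], "A": ["aA", "bB", "b"], "B": ["b"]}
print(is_gnf(g1), is_gnf(g2), is_gnf(g3), is_gnf(g4))  # True False False True
```

G2 fails twice over: S → AS starts with a non-terminal, and A → aa has a terminal in second position.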

  36. CFG Simplification: Example • How can the following be simplified? • S → AB • S → ACD • A → Aa • A → a • A → aA • A → a • C → ε • D → dD • D → E • E → eAe • F → ff • Simplifications: • 1) Delete B: useless because nothing is derivable from B • 2) Delete either A → Aa or A → aA • 3) Delete one of the identical productions A → a • 4) Delete C → ε, and replace S → ACD with S → AD • 5) Replace D → E with D → eAe • 6) Delete E: useless after change #5 • 7) Delete F: useless because it is not derivable from S • Note how some simplifications can allow other subsequent simplifications.
