Learn how to simplify grammars using various techniques like removing useless symbols, epsilon and unit productions, and transforming to Chomsky Normal Form. Understand Chomsky hierarchy and language types. Suitable for students studying Theory of Computation.
About Grammars • CS 130 Theory of Computation • HMU Textbook: Sec 7.1, 6.3, 5.4
About grammars • Simplifying grammars • Normal forms for grammars • Grammar Ambiguity
Grammar Productions • Formal definition of a grammar provides much leeway • Productions can be simplified or restricted to make proofs about CFGs simpler
Simplifications • Removing useless symbols • Those that cannot be derived from S and those that cannot reduce to a terminal string • Removing є-productions • A → є • Removing unit productions • A → B • Normal forms, e.g., Chomsky Normal Form
Useless symbols • We want to ensure all productions in the grammar have no useless symbols, i.e., all symbols are generating and reachable • Generating symbols • All variables that could eventually derive a string of terminals; i.e., all A in V, such that there exists a string w of terminals where A ⇒* w • Reachable symbols • All variables that can be reached from the start symbol; i.e., all A in V, such that S ⇒* uAw, for some u and w
Removing useless productions • Remove productions with non-generating symbols • Requires identifying generating symbols recursively: right hand side of production contains only terminals and generating symbols • Remove productions with non-reachable symbols • Requires identifying reachable symbols recursively: S is reachable, and so are symbols that exist on the right hand side of productions with reachable symbols on the left hand side
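The two recursive identifications above can be sketched as fixed-point loops. This is a minimal sketch, not the textbook's code: the representation of a grammar as a list of (head, body) pairs and the function names are assumptions made for illustration. Non-generating symbols are removed first and reachability is computed afterwards, because dropping a production can make further symbols unreachable.

```python
def generating_symbols(productions, terminals):
    """Symbols that derive some string of terminals (terminals included)."""
    generating = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # A head is generating once every symbol of some body is generating.
            if head not in generating and all(s in generating for s in body):
                generating.add(head)
                changed = True
    return generating


def reachable_symbols(productions, start):
    """Symbols that appear in some sentential form derived from start."""
    reachable = {start}
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head in reachable:
                for s in body:
                    if s not in reachable:
                        reachable.add(s)
                        changed = True
    return reachable


def remove_useless(productions, terminals, start):
    """Drop productions with non-generating symbols, then unreachable ones."""
    gen = generating_symbols(productions, terminals)
    kept = [(h, b) for h, b in productions
            if h in gen and all(s in gen for s in b)]
    reach = reachable_symbols(kept, start)
    return [(h, b) for h, b in kept if h in reach]


# Illustrative grammar: S -> AB | a, A -> b (B is non-generating).
grammar = [("S", ("A", "B")), ("S", ("a",)), ("A", ("b",))]
print(remove_useless(grammar, {"a", "b"}, "S"))
```

In this example B generates nothing, so S → AB is dropped; A then becomes unreachable, and only S → a survives.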
Epsilon Productions • є-productions: productions of the form A → є • Nullable symbols: symbols A where A → є or A → B1B2…Bn such that each Bi is nullable • For each production that has a nullable symbol on the right hand side, add a production without that symbol; apply rule iteratively on resulting productions • After this step, all є-productions can be removed • Note, if the language L generated by the original grammar includes є, then the language generated by the resulting grammar will be L – {є}
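A minimal sketch of this step, assuming a grammar is given as a list of (head, body) pairs with the empty tuple standing for є (the representation and function names are illustrative assumptions). Instead of applying the rule iteratively, the sketch enumerates every keep-or-drop choice for the nullable symbols of a body at once, which yields the same set of productions.

```python
from itertools import product


def nullable_symbols(productions):
    """Variables A with A =>* epsilon; the empty body () stands for epsilon."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # all() over an empty body is True, so A -> epsilon seeds the set.
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon(productions):
    """Return an equivalent set of productions (minus epsilon itself)."""
    nullable = nullable_symbols(productions)
    result = set()
    for head, body in productions:
        # Each nullable symbol may be kept or dropped; others must be kept.
        options = [((s,), ()) if s in nullable else ((s,),) for s in body]
        for choice in product(*options):
            new_body = tuple(sym for part in choice for sym in part)
            if new_body:  # never emit an epsilon-production
                result.add((head, new_body))
    return result


# Illustrative grammar: S -> AB, A -> aAA | epsilon, B -> bBB | epsilon.
grammar = [("S", ("A", "B")), ("A", ("a", "A", "A")), ("A", ()),
           ("B", ("b", "B", "B")), ("B", ())]
for head, body in sorted(remove_epsilon(grammar)):
    print(head, "->", " ".join(body))
```

Here S, A and B are all nullable, so the original language contains є and the resulting grammar generates L – {є}.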
Unit Productions • Unit productions: all productions of the form A → B • Removing unit productions • Identify unit pairs: pairs of variables (A, B) such that A ⇒* B, and the derivation involves only unit productions • For each unit pair (A, B), add the production A → w, whenever B → w and w is not a variable • Unit productions may now be removed
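The unit-pair construction can be sketched as follows. The (head, body) representation and the function name are assumptions for illustration. The sketch treats (A, A) as a unit pair for every variable A (a derivation of zero steps), so that each variable keeps its own non-unit productions.

```python
def remove_unit(productions, variables):
    """Replace unit productions A -> B using unit pairs."""
    # Basis: (A, A) is a unit pair via a derivation of zero steps.
    unit_pairs = {(v, v) for v in variables}
    changed = True
    while changed:
        changed = False
        for a, b in list(unit_pairs):
            for head, body in productions:
                # If (a, b) is a unit pair and b -> c is a unit production,
                # then (a, c) is a unit pair as well.
                if head == b and len(body) == 1 and body[0] in variables:
                    pair = (a, body[0])
                    if pair not in unit_pairs:
                        unit_pairs.add(pair)
                        changed = True
    result = set()
    for a, b in unit_pairs:
        for head, body in productions:
            is_unit = len(body) == 1 and body[0] in variables
            if head == b and not is_unit:
                result.add((a, body))  # add A -> w whenever B -> w
    return result


# Illustrative grammar: E -> E + T | T, T -> T * F | F, F -> n | i.
grammar = [("E", ("E", "+", "T")), ("E", ("T",)),
           ("T", ("T", "*", "F")), ("T", ("F",)),
           ("F", ("n",)), ("F", ("i",))]
for head, body in sorted(remove_unit(grammar, {"E", "T", "F"})):
    print(head, "->", " ".join(body))
```

For this grammar the unit pairs beyond the trivial ones are (E, T), (T, F) and (E, F), so E inherits the non-unit productions of T and F, and T inherits those of F.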
Chomsky Normal Form • CNF: all productions are of the form • A → BC (B, C are variables) • A → a (a is a terminal) • How do we convert a grammar to an equivalent CNF grammar?
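One common way to do the conversion, sketched here under stated assumptions: the grammar already has no є-productions and no unit productions, so every body of length 1 is a single terminal. Terminals inside longer bodies are replaced by fresh variables, and long bodies are then split into chains of two-variable bodies. The fresh names T_x and X1, X2, … are hypothetical and assumed not to clash with existing variables.

```python
from itertools import count


def to_cnf(productions, terminals):
    """Convert to CNF; assumes no epsilon-productions and no unit productions."""
    result = []
    terminal_var = {}       # terminal -> hypothetical fresh variable name
    fresh_ids = count(1)

    def var_for(terminal):
        # Introduce T_a -> a the first time terminal a needs a variable.
        if terminal not in terminal_var:
            terminal_var[terminal] = "T_" + terminal
            result.append((terminal_var[terminal], (terminal,)))
        return terminal_var[terminal]

    for head, body in productions:
        if len(body) == 1:
            result.append((head, tuple(body)))   # A -> a is already in CNF
            continue
        symbols = [var_for(s) if s in terminals else s for s in body]
        while len(symbols) > 2:
            # A -> B1 B2 ... Bn becomes A -> B1 X and X -> B2 ... Bn.
            fresh = "X" + str(next(fresh_ids))
            result.append((head, (symbols[0], fresh)))
            head, symbols = fresh, symbols[1:]
        result.append((head, tuple(symbols)))
    return result


# Illustrative fragment: E -> E + T | n.
for head, body in to_cnf([("E", ("E", "+", "T")), ("E", ("n",))], {"+", "n"}):
    print(head, "->", " ".join(body))
```

For the fragment above this produces T_+ → +, E → E X1, X1 → T_+ T and E → n, all of which have the form A → BC or A → a.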
Greibach Normal Form • GNF: all productions are of the form • A → aB1B2…Bn • Note that A → a is allowed (the case n = 0) • Note that if the grammar is GNF, each step in a derivation of a string adds a terminal • How do we convert a grammar to an equivalent GNF grammar?
Recall CFG to PDA conversion • Transition function is based on the variables, productions and terminals of the grammar: • δ(q0, є, A) includes (q0, w) whenever A → w • δ(q0, a, a) = {(q0, є)} for each a in T • Easier and more intuitive if the grammar is in GNF • δ(q0, a, A) includes (q0, B1B2…Bn) for each production A → aB1B2…Bn
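A minimal sketch of the GNF-based PDA, simulated by tracking every stack the single state q0 could hold after each input symbol, with acceptance by empty stack. The function name, the (head, body) representation and the example grammar are assumptions for illustration. Because each GNF move consumes exactly one input symbol, the simulation runs for as many rounds as the input has symbols.

```python
def accepts(productions, start, word):
    """Simulate the one-state PDA of a GNF grammar; accept by empty stack."""
    stacks = {(start,)}                  # all possible stacks, top at index 0
    for a in word:
        next_stacks = set()
        for stack in stacks:
            if not stack:
                continue                 # empty stack cannot read more input
            top, rest = stack[0], stack[1:]
            for head, body in productions:
                # Move (q0, a, A) -> (q0, B1...Bn) for production A -> a B1...Bn.
                if head == top and body[0] == a:
                    next_stacks.add(body[1:] + rest)
        stacks = next_stacks
    return () in stacks


# Illustrative GNF grammar: S -> aSB | aB, B -> b.
grammar = [("S", ("a", "S", "B")), ("S", ("a", "B")), ("B", ("b",))]
print(accepts(grammar, "S", "aabb"))
print(accepts(grammar, "S", "aab"))
```

The example grammar generates strings of n a's followed by n b's (n ≥ 1), so the first call accepts and the second does not.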
Ambiguous grammar • A grammar G is ambiguous if there exists a string for which two different parse trees exist (two different leftmost derivations) • Example: S → i = E, E → n, E → i, E → E + E, E → E * E • Parse tree for i = n + n * n?
Two leftmost derivations • S ⇒ i = E ⇒ i = E + E ⇒ i = n + E ⇒ i = n + E * E ⇒ i = n + n * E ⇒ i = n + n * n • S ⇒ i = E ⇒ i = E * E ⇒ i = E + E * E ⇒ i = n + E * E ⇒ i = n + n * E ⇒ i = n + n * n
Grammar and precedence • Parse tree for i = n + n * n? • S → i = E, E → E + T, E → T, T → T * F, T → F, F → n, F → i • S ⇒ i = E ⇒ i = E + T ⇒ i = T + T ⇒ i = F + T ⇒ i = n + T ⇒ i = n + T * F ⇒ i = n + F * F ⇒ i = n + n * F ⇒ i = n + n * n
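The ambiguity can be checked by counting parse trees. The sketch below is specific to the expression part of the example grammar (E → E + E, E → E * E, E → n, E → i); the function name and the token-list input are assumptions for illustration. Since S → i = E is the only production for S, the number of trees for the whole statement equals the number of trees for the expression to the right of "=".

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of E -> E + E | E * E | n | i over the tokens."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] in ("n", "i") else 0
        total = 0
        for k in range(i + 1, j - 1):
            # Try each operator as the root: E -> E op E.
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["n", "+", "n", "*", "n"]))
```

A count greater than 1 for n + n * n corresponds to the two leftmost derivations above, one with + at the root and one with * at the root.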
Chomsky hierarchy • Relaxing or adding restrictions to productions in a grammar leads to a hierarchy of languages • Note: the context-free grammar definition requires every production to take the form A → w, where A ∈ V and w is a string over T ∪ V
Chomsky hierarchy • Regular languages (type 3) • A → sB, A → s (A, B ∈ V, s ∈ T) • Context-free languages (type 2) • A → w (w is a string over T ∪ V) • Context-sensitive languages (type 1) • uAw → uvw (A ∈ V; u, v, w are strings over T ∪ V, v nonempty) • Recursively enumerable languages (type 0) • v → w (productions are unrestricted)
Chomsky hierarchy • Nested language classes, from outermost to innermost: recursively enumerable (type 0) ⊃ recursive ⊃ context-sensitive (type 1) ⊃ context-free (type 2) ⊃ regular (type 3)