About Grammars • CS 130 Theory of Computation • HMU Textbook: Sec 7.1, 6.3, 5.4
About grammars • Simplifying grammars • Normal forms for grammars • Grammar Ambiguity
Grammar Productions • Formal definition of a grammar provides much leeway • Productions can be simplified or restricted to make proofs about CFGs simpler
Simplifications • Removing useless symbols • Those that cannot be reached from S and those that cannot derive a terminal string • Removing ε-productions • A → ε • Removing unit productions • A → B • Normal forms, e.g., Chomsky Normal Form
Useless symbols • We want to ensure all productions in the grammar have no useless symbols, i.e., all symbols are generating and reachable • Generating symbols • All variables that can eventually derive a string of terminals; i.e., all A in V such that A ⇒* w for some string w of terminals • Reachable symbols • All variables that can be reached from the start symbol; i.e., all A in V such that S ⇒* uAw for some strings u and w
Removing useless productions • First, remove productions with non-generating symbols • Identify generating symbols recursively: a variable is generating if it has a production whose right-hand side contains only terminals and generating symbols • Then, remove productions with non-reachable symbols • Identify reachable symbols recursively: S is reachable, and so is every symbol on the right-hand side of a production whose left-hand side is reachable • Order matters: removing non-generating symbols can make other symbols unreachable
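The two passes can be sketched in Python as fixed-point computations. This is a minimal illustration, not the textbook's code: the grammar encoding (a dict from each variable to a list of right-hand sides, each a tuple of symbols) and all function names are assumptions made for this example.

```python
def generating_symbols(productions, terminals):
    """Variables that can derive some string of terminals."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            # head is generating if some body uses only terminals
            # and variables already known to be generating
            if any(all(s in terminals or s in generating for s in body)
                   for body in bodies):
                generating.add(head)
                changed = True
    return generating


def reachable_symbols(productions, start):
    """Variables that appear in some sentential form derived from start."""
    reachable = {start}
    stack = [start]
    while stack:
        head = stack.pop()
        for body in productions.get(head, []):
            for s in body:
                if s in productions and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return reachable


def remove_useless(productions, terminals, start):
    """Drop non-generating symbols first, then non-reachable ones."""
    gen = generating_symbols(productions, terminals)
    step1 = {
        head: [b for b in bodies
               if all(s in terminals or s in gen for s in b)]
        for head, bodies in productions.items() if head in gen
    }
    reach = reachable_symbols(step1, start)
    return {head: bodies for head, bodies in step1.items() if head in reach}


# Illustrative grammar: S -> AB | a, A -> b, and B has no productions.
grammar = {"S": [("A", "B"), ("a",)], "A": [("b",)], "B": []}
cleaned = remove_useless(grammar, {"a", "b"}, "S")
print(cleaned)
```

In this example B is not generating, so S → AB is dropped in the first pass; A then becomes unreachable and is dropped in the second pass, leaving only S → a.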
Epsilon Productions • ε-productions: productions of the form A → ε • Nullable symbols: symbols A where A → ε, or A → B1B2…Bn such that each Bi is nullable • For each production that has a nullable symbol on the right-hand side, add a production without that symbol; apply the rule iteratively to the resulting productions • After this step, all ε-productions can be removed • Note: if the language L generated by the original grammar includes ε, then the language generated by the resulting grammar will be L − {ε}
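A minimal Python sketch of this procedure follows. The grammar encoding (dict from variable to a list of tuples, with the empty tuple standing for ε) and the function names are assumptions for illustration. Instead of applying the rule one symbol at a time, it generates every keep-or-drop combination of the nullable symbols in a body, which yields the same set of productions.

```python
from itertools import product


def nullable_symbols(productions):
    """Variables A with A =>* epsilon, found by fixed-point iteration."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon(productions):
    """Return an equivalent grammar (minus epsilon) without epsilon-productions."""
    nullable = nullable_symbols(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # a nullable symbol may be kept or dropped; others are always kept
            choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for combo in product(*choices):
                new_body = tuple(sym for part in combo for sym in part)
                if new_body:  # never emit an epsilon-production
                    new_bodies.add(new_body)
        result[head] = sorted(new_bodies)
    return result


# Illustrative grammar: S -> AB, A -> aAA | epsilon, B -> bBB | epsilon
grammar = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
no_eps = remove_epsilon(grammar)
print(no_eps)
```

Here S, A and B are all nullable, so the original language contains ε and the resulting grammar generates L − {ε}.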
Unit Productions • Unit productions: all productions of the form A → B, where B is a variable • Removing unit productions • Identify unit pairs: pairs of variables (A, B) such that A ⇒* B and the derivation uses only unit productions • For each unit pair (A, B), add the production A → w whenever B → w and w is not a single variable • Unit productions may now be removed
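The unit-pair construction can be sketched as follows. The encoding and function names are assumptions for this example; every variable is paired with itself as the zero-step base case, and the sample grammar is the expression fragment E, T, F of the precedence grammar used later in these slides.

```python
def unit_pairs(productions):
    """All pairs (A, B) with A =>* B using only unit productions."""
    variables = set(productions)
    pairs = {(a, a) for a in variables}  # zero-step derivations
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in productions[b]:
                if len(body) == 1 and body[0] in variables:
                    if (a, body[0]) not in pairs:
                        pairs.add((a, body[0]))
                        changed = True
    return pairs


def remove_unit(productions):
    """Replace unit productions by the non-unit bodies they lead to."""
    variables = set(productions)
    result = {a: set() for a in variables}
    for a, b in unit_pairs(productions):
        for body in productions[b]:
            is_unit = len(body) == 1 and body[0] in variables
            if not is_unit:
                result[a].add(body)
    return {a: sorted(bodies) for a, bodies in result.items()}


# E -> E + T | T, T -> T * F | F, F -> n | i
grammar = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("n",), ("i",)],
}
no_unit = remove_unit(grammar)
print(no_unit)
```

After the transformation E has the bodies E + T, T * F, n and i, and no production has a single variable as its right-hand side.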
Chomsky Normal Form • CNF: all productions are of the form • A → BC (B, C are variables) • A → a (a is a terminal) • How do we convert a grammar to an equivalent CNF grammar?
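One common answer, sketched here under stated assumptions: the grammar already has no ε-productions and no unit productions, so every body of length 1 is a single terminal. Terminals inside longer bodies are replaced by fresh variables, and bodies longer than two symbols are broken into a chain. The fresh variable names (T_a, X1, ...) and the grammar encoding are hypothetical and assumed not to clash with existing variables.

```python
def to_cnf(productions, terminals):
    """Convert a grammar without epsilon- or unit productions to CNF."""
    result = {head: [] for head in productions}
    term_var = {}  # terminal -> fresh variable deriving exactly that terminal
    counter = 0

    def var_for(t):
        if t not in term_var:
            term_var[t] = f"T_{t}"
            result[term_var[t]] = [(t,)]
        return term_var[t]

    for head, bodies in productions.items():
        for body in bodies:
            if len(body) == 1:
                result[head].append(body)  # already of the form A -> a
                continue
            # replace terminals in long bodies by their fresh variables
            symbols = [var_for(s) if s in terminals else s for s in body]
            current = head
            # break A -> B1 B2 ... Bk into A -> B1 X1, X1 -> B2 X2, ...
            while len(symbols) > 2:
                counter += 1
                fresh = f"X{counter}"
                result[current].append((symbols[0], fresh))
                result[fresh] = []
                current = fresh
                symbols = symbols[1:]
            result[current].append(tuple(symbols))
    return result


# Illustrative grammar: S -> aSb | ab
grammar = {"S": [("a", "S", "b"), ("a", "b")]}
cnf = to_cnf(grammar, {"a", "b"})
print(cnf)
```

For this grammar the sketch produces S → T_a X1 | T_a T_b, X1 → S T_b, T_a → a and T_b → b.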
Greibach Normal Form • GNF: all productions are of the form • A → aB1B2…Bn (a is a terminal; B1, …, Bn are variables, n ≥ 0) • Note that A → a is allowed • Note that if the grammar is in GNF, each step in a derivation of a string adds a terminal • How do we convert a grammar to an equivalent GNF grammar?
Recall CFG to PDA conversion • Transition function is based on the variables, productions and terminals of the grammar: • δ(q0, ε, A) includes (q0, w) whenever A → w • δ(q0, a, a) = {(q0, ε)} for each a in T • Easier and more intuitive if the grammar is in GNF • δ(q0, a, A) includes (q0, B1B2…Bn) for each production A → aB1B2…Bn
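The general construction (the first two rules) can be sketched in Python together with a small simulator that accepts by empty stack. The encoding of δ as a dict, the function names, and the breadth-first search are assumptions for illustration. The search prunes any configuration whose stack is longer than the remaining input, which is only valid when the grammar has no ε-productions.

```python
from collections import deque


def cfg_to_pda(productions, terminals):
    """Build delta: (state, input symbol or '', stack top) -> set of moves."""
    delta = {}
    for head, bodies in productions.items():
        # expand a variable on top of the stack without reading input
        delta[("q0", "", head)] = {("q0", body) for body in bodies}
    for a in terminals:
        # match a terminal on top of the stack against the input
        delta[("q0", a, a)] = {("q0", ())}
    return delta


def accepts(delta, start, word):
    """Accept by empty stack; assumes the grammar has no epsilon-productions."""
    word = tuple(word)
    start_config = (0, (start,))
    seen = {start_config}
    queue = deque([start_config])
    while queue:
        pos, stack = queue.popleft()
        if not stack:
            if pos == len(word):
                return True
            continue
        top, rest = stack[0], stack[1:]
        moves = [(pos, s) for _, s in delta.get(("q0", "", top), ())]
        if pos < len(word):
            moves += [(pos + 1, s)
                      for _, s in delta.get(("q0", word[pos], top), ())]
        for new_pos, pushed in moves:
            config = (new_pos, pushed + rest)
            # every stack symbol must still consume at least one input symbol
            if len(config[1]) <= len(word) - new_pos and config not in seen:
                seen.add(config)
                queue.append(config)
    return False


# S -> i = E, E -> n | i | E + E | E * E
grammar = {
    "S": [("i", "=", "E")],
    "E": [("n",), ("i",), ("E", "+", "E"), ("E", "*", "E")],
}
delta = cfg_to_pda(grammar, {"i", "n", "=", "+", "*"})
print(accepts(delta, "S", "i = n + n * n".split()))
```

The nondeterminism of the PDA shows up as the set of moves per key; the simulator simply explores all of them.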
Ambiguous grammar • A grammar G is ambiguous if there exists a string with two different parse trees (equivalently, two different leftmost derivations) • Example: S → i = E; E → n; E → i; E → E + E; E → E * E • Parse tree for i = n + n * n?
Two leftmost derivations • S ⇒ i = E ⇒ i = E + E ⇒ i = n + E ⇒ i = n + E * E ⇒ i = n + n * E ⇒ i = n + n * n • S ⇒ i = E ⇒ i = E * E ⇒ i = E + E * E ⇒ i = n + E * E ⇒ i = n + n * E ⇒ i = n + n * n
Grammar and precedence • Parse tree for i = n + n * n? • S → i = E; E → E + T; E → T; T → T * F; T → F; F → n; F → i • S ⇒ i = E ⇒ i = E + T ⇒ i = T + T ⇒ i = F + T ⇒ i = n + T ⇒ i = n + T * F ⇒ i = n + F * F ⇒ i = n + n * F ⇒ i = n + n * n
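The difference between the two grammars can be checked by counting parse trees for i = n + n * n. The sketch below is an illustration with assumed names and encoding, not part of the original slides. It counts trees by splitting the token span among the symbols of each body, and it assumes the grammar has no ε-productions and no cycles of unit productions, so that the recursion terminates.

```python
from functools import lru_cache


def count_parse_trees(productions, start, tokens):
    """Number of distinct parse trees deriving tokens from start."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(symbol, i, j):
        if symbol not in productions:  # terminal: must match one token
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(count_body(body, i, j) for body in productions[symbol])

    @lru_cache(maxsize=None)
    def count_body(body, i, j):
        if not body:
            return 1 if i == j else 0
        first, rest = body[0], body[1:]
        # first covers tokens[i:k]; each remaining symbol needs >= 1 token
        return sum(
            count_symbol(first, i, k) * count_body(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return count_symbol(start, 0, len(tokens))


ambiguous = {
    "S": [("i", "=", "E")],
    "E": [("n",), ("i",), ("E", "+", "E"), ("E", "*", "E")],
}
with_precedence = {
    "S": [("i", "=", "E")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("n",), ("i",)],
}
tokens = "i = n + n * n".split()
print(count_parse_trees(ambiguous, "S", tokens))        # 2
print(count_parse_trees(with_precedence, "S", tokens))  # 1
```

The first grammar yields two trees, one grouping n + n first and one grouping n * n first; the second grammar forces * to bind tighter than +, so only one tree remains.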
Chomsky hierarchy • Relaxing or adding restrictions to productions in a grammar leads to a hierarchy of languages • Note: the context-free grammar definition requires that a production take the form A → w, where A ∈ V and w is a string over T ∪ V
Chomsky hierarchy • Regular languages (type 3) • A → sB, A → s (A, B ∈ V, s ∈ T) • Context-free languages (type 2) • A → w (w is a string over T ∪ V) • Context-sensitive languages (type 1) • uAw → uvw (u, v, w are strings over T ∪ V, v nonempty) • Recursively enumerable languages (type 0) • v → w (productions are unrestricted)
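As a rough illustration, the production forms listed above can be tested mechanically. The function name and the encoding (left- and right-hand sides as tuples of symbols) are assumptions for this sketch, and it takes v to be nonempty in the type 1 form, as is standard. It returns the most restrictive type whose form a single production matches; it classifies production shapes only and says nothing about which grammars generate a given language.

```python
def production_type(lhs, rhs, variables, terminals):
    """Most restrictive Chomsky type (3, 2, 1 or 0) matched by lhs -> rhs."""
    if len(lhs) == 1 and lhs[0] in variables:
        # type 3 forms: A -> s or A -> sB
        if len(rhs) == 1 and rhs[0] in terminals:
            return 3
        if len(rhs) == 2 and rhs[0] in terminals and rhs[1] in variables:
            return 3
        return 2  # A -> w with a single variable on the left
    # type 1 form: uAw -> uvw with v nonempty, for some variable A in lhs
    for p, sym in enumerate(lhs):
        if sym in variables:
            u, w = lhs[:p], lhs[p + 1:]
            if (len(rhs) > len(u) + len(w)
                    and rhs[:len(u)] == u
                    and rhs[len(rhs) - len(w):] == w):
                return 1
    return 0


V = {"A", "B"}
T = {"a", "b"}
print(production_type(("A",), ("a", "B"), V, T))
print(production_type(("A",), ("a", "A", "b"), V, T))
print(production_type(("a", "A", "b"), ("a", "b", "B", "b"), V, T))
print(production_type(("A", "B"), ("B", "A"), V, T))
```

The four sample productions are, in order, of the regular, context-free, context-sensitive and unrestricted forms.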
Chomsky hierarchy • Nested language classes, from outermost to innermost: recursively enumerable (type 0) ⊃ recursive ⊃ context-sensitive (type 1) ⊃ context-free (type 2) ⊃ regular (type 3)