
Languages That Are and Are Not Context-Free

Delve into the study of regular vs. context-free languages, closure properties, Kleene star, and intersection theorems. Understand key concepts like the Pumping Lemma and parse tree heights.

allenfry

Presentation Transcript


  1. Languages That Are and Are Not Context-Free • Section 3.5 • Wed, Oct 26, 2005

  2. Regular vs. Context-Free • Theorem: Every regular language is context-free. • Proof: • Let L be regular. • Given a DFA for L, add a stack, but do not use the stack. • That is, change each DFA transition (p, a, q) to a PDA transition ((p, a, e), (q, e)). • The result is a PDA whose language is L. • Therefore, L is context-free.
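
The transition rewrite in the proof above can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of the DFA, the state names, and the use of the empty string "" for e are assumptions made here, not part of the slides.

```python
def dfa_to_pda(delta):
    """Turn DFA transitions {(p, a): q} into PDA transitions ((p, a, e), (q, e)).

    The empty string "" stands for e: nothing is popped and nothing is pushed,
    so the stack is never used.
    """
    return {((p, a, ""), (q, "")) for (p, a), q in delta.items()}


# Hypothetical DFA over {a, b} that accepts strings with an even number of a's.
delta = {
    ("even", "a"): "odd",
    ("odd", "a"): "even",
    ("even", "b"): "even",
    ("odd", "b"): "odd",
}

pda = dfa_to_pda(delta)
for transition in sorted(pda):
    print(transition)
```

The start state and final states carry over unchanged, so the resulting PDA accepts exactly the strings the DFA accepts.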

  3. Closure under Union • Theorem: Let L1 and L2 be CFLs. Then L1 ∪ L2 is also a CFL. • Proof: • Let L1 have grammar (V1, Σ1, R1, S1) and let L2 have grammar (V2, Σ2, R2, S2). • Then L1 ∪ L2 has the grammar (V, Σ, R, S) where • Σ = Σ1 ∪ Σ2 • V = V1 ∪ V2 ∪ {S} • S is a new start symbol • R = R1 ∪ R2 ∪ {S → S1 | S2}.

  4. Proof, continued • Therefore, L1 ∪ L2 is a CFL. • We must assume in the proof that (V1 – Σ1) ∩ (V2 – Σ2) = ∅. • Why?

  5. Closure under Concatenation • Theorem: Let L1 and L2 be CFLs. Then L1L2 is also a CFL. • Proof: • Let L1 have grammar (V1, Σ1, R1, S1) and let L2 have grammar (V2, Σ2, R2, S2). • Then L1L2 has the grammar (V, Σ, R, S) where • Σ = Σ1 ∪ Σ2 • V = V1 ∪ V2 ∪ {S} • S is a new start symbol • R = R1 ∪ R2 ∪ {S → S1S2}.

  6. Proof, continued • Therefore, L1L2 is a CFL. • Again, we must assume that (V1 – Σ1) ∩ (V2 – Σ2) = ∅.
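
The union and concatenation constructions can be tried out on small grammars. The sketch below is an illustration under stated assumptions: grammars are encoded as dictionaries from nonterminals to lists of right-hand sides, the helper names (`rename`, `combine`, `language_up_to`) are hypothetical, and the nonterminals are made disjoint by appending a tag, which is one way to meet the disjointness assumption. The new start symbol is called "S" and is assumed not to be a terminal.

```python
from collections import deque


def rename(rules, tag):
    """Append tag to every nonterminal so that two grammars share none."""
    fresh = {nt: nt + tag for nt in rules}
    return {
        fresh[nt]: [tuple(fresh.get(s, s) for s in rhs) for rhs in rhss]
        for nt, rhss in rules.items()
    }


def combine(rules1, start1, rules2, start2, mode):
    """Build the union or concatenation grammar with new start symbol 'S'."""
    new_rules = {**rename(rules1, "_1"), **rename(rules2, "_2")}
    s1, s2 = start1 + "_1", start2 + "_2"
    if mode == "union":
        new_rules["S"] = [(s1,), (s2,)]   # S -> S1 | S2
    else:
        new_rules["S"] = [(s1, s2)]       # S -> S1 S2
    return new_rules, "S"


def language_up_to(rules, start, max_len):
    """Terminal strings of length <= max_len found by leftmost derivations.

    Sentential forms longer than max_len + 3 symbols are pruned, a bound that
    is generous enough for the small grammars used here.
    """
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                   # no nonterminal left
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and len(new) <= max_len + 3 and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


G1 = {"S": [("a", "S", "b"), ()]}         # { a^n b^n | n >= 0 }
G2 = {"S": [("c", "S"), ("c",)]}          # { c^m | m >= 1 }

union_lang = language_up_to(*combine(G1, "S", G2, "S", "union"), 4)
concat_lang = language_up_to(*combine(G1, "S", G2, "S", "concat"), 4)
print(sorted(union_lang, key=len))
print(sorted(concat_lang, key=len))
```

Both example grammars use the nonterminal S. Without the renaming step, their rules would be merged under one symbol and derivations could mix rules from both grammars, which is what the disjointness assumption rules out.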

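  7. Closure under Kleene Star • Theorem: Let L be a CFL. Then L* is also a CFL. • Proof: • Let L have grammar (V, Σ, R, S). • Then L* has the grammar (V ∪ {S'}, Σ, R', S') where • S' is a new start symbol • R' = R ∪ {S' → e | S'S}. • A new start symbol is needed because S may occur on the right-hand side of rules in R, where the added rules must not apply. • Therefore, L* is a CFL.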
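
The sketch below builds a star grammar with a fresh start symbol (named "S0" here, an assumption) and contrasts it with adding S → e | SS directly to the original start symbol. The contrast matters when S occurs on a right-hand side of R: for the grammar S → aSb | ab, the direct version derives aababb, which is not a concatenation of strings a^n b^n. The dictionary encoding and helper names are illustrative assumptions.

```python
from collections import deque


def star_grammar(rules, start, new_start="S0"):
    """Star construction with a fresh start symbol: S0 -> e | S0 S."""
    new_rules = dict(rules)
    new_rules[new_start] = [(), (new_start, start)]
    return new_rules, new_start


def naive_star(rules, start):
    """Adds S -> e | SS to the old start symbol (unsound if S occurs in R)."""
    new_rules = dict(rules)
    new_rules[start] = list(rules[start]) + [(), (start, start)]
    return new_rules, start


def language_up_to(rules, start, max_len):
    """Terminal strings of length <= max_len found by leftmost derivations.

    Sentential forms longer than max_len + 3 symbols are pruned, which is
    generous enough for this small example.
    """
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and len(new) <= max_len + 3 and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


G = {"S": [("a", "S", "b"), ("a", "b")]}   # { a^n b^n | n >= 1 }

proper = language_up_to(*star_grammar(G, "S"), 6)
naive = language_up_to(*naive_star(G, "S"), 6)
print(sorted(proper, key=len))
print("aababb" in naive, "aababb" in proper)
```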

  8. Intersection of a Regular Language and a CFL. • Theorem: The intersection of a CFL and a regular language is a CFL. • Proof (outline): • Use the cross product to construct the intersection of the PDA and the DFA. • Only one component uses the stack. • Therefore, there is no complication. • The cross product will function as a PDA.

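  9. Intersection of a Regular Language and a CFL. • More specifically, the transitions (p, a) → q from the DFA and (p', a, β) → (q', γ) from the PDA may be combined into ((p, p'), a, β) → ((q, q'), γ) for the new PDA. • A PDA transition (p', e, β) → (q', γ) that reads no input becomes ((p, p'), e, β) → ((p, q'), γ) for every DFA state p.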
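
A minimal sketch of the cross product follows, together with a small configuration search to try it out. The encodings are assumptions made for illustration: DFA transitions are a dictionary {(p, a): q}, PDA transitions are pairs ((p', a, pop), (q', push)) with "" standing for e, and acceptance means final state, consumed input, and empty stack. The example machines are hypothetical.

```python
def product_pda(dfa_delta, dfa_states, pda):
    """Cross product of a DFA and a PDA; only the PDA component uses the stack."""
    combined = set()
    for (p2, a, pop), (q2, push) in pda:
        if a == "":
            # The PDA moves without reading input, so the DFA state stays put.
            for p1 in dfa_states:
                combined.add((((p1, p2), "", pop), ((p1, q2), push)))
        else:
            for (p1, b), q1 in dfa_delta.items():
                if b == a:
                    combined.add((((p1, p2), a, pop), ((q1, q2), push)))
    return combined


def accepts(pda, start, finals, word):
    """Search configurations (state, position, stack).

    The top of the stack is the left end of the string. The search may not
    terminate for PDAs whose e-moves can push forever; the example has none.
    """
    todo, seen = [(start, 0, "")], set()
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if pos == len(word) and state in finals and stack == "":
            return True
        for (p, a, pop), (q, push) in pda:
            if p != state or not stack.startswith(pop):
                continue
            rest = push + stack[len(pop):]
            if a == "":
                todo.append((q, pos, rest))
            elif pos < len(word) and word[pos] == a:
                todo.append((q, pos + 1, rest))
    return False


# Hypothetical PDA for { a^n b^n | n >= 0 }.
pda = {
    (("s", "a", ""), ("s", "A")),
    (("s", "b", "A"), ("f", "")),
    (("f", "b", "A"), ("f", "")),
}
# Hypothetical DFA accepting strings with an even number of a's.
dfa = {("even", "a"): "odd", ("odd", "a"): "even",
       ("even", "b"): "even", ("odd", "b"): "odd"}

both = product_pda(dfa, {"even", "odd"}, pda)
start = ("even", "s")
finals = {("even", "s"), ("even", "f")}
for word in ["", "ab", "aabb", "abab"]:
    print(repr(word), accepts(both, start, finals, word))
```

The product accepts a^n b^n only when n is even, which is the intersection of the two languages.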

  10. Complementation and Intersection • The complement of a context-free language is not necessarily context-free. • The intersection of two context-free languages is not necessarily context-free. • Counterexamples will be given later.

  11. The Concept behind the Pumping Lemma for CFLs • The Pumping Lemma for CFLs will allow us to show that some languages are not context-free. • If a CFL contains a word w with a sufficiently long derivation S ⇒* w, then some nonterminal A must appear more than once on a single path of the parse tree. • This is the Pigeonhole Principle.

  12. The Concept behind the Pumping Lemma for CFLs • That is, we have S ⇒* uAz ⇒* uvAyz ⇒* uvxyz. • Thus, A ⇒* vAy and A ⇒* x. • We may repeat the derivation A ⇒* vAy as many times as we like (including zero times), producing strings uv^n x y^n z, for any n ≥ 0.
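
As a small worked illustration, take the grammar S → aSb | e (an example chosen here, not from the slides). The derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb repeats the nonterminal S, giving u = a, v = a, x = e, y = b, z = b. The sketch below pumps that decomposition; the function names are hypothetical.

```python
def pump(u, v, x, y, z, n):
    """Return u v^n x y^n z."""
    return u + v * n + x + y * n + z


def balanced(s):
    """Membership test for { a^k b^k | k >= 0 }, the language of S -> aSb | e."""
    half = len(s) // 2
    return s == "a" * half + "b" * half


u, v, x, y, z = "a", "a", "", "b", "b"
for n in range(4):
    word = pump(u, v, x, y, z, n)
    print(n, word, balanced(word))
```

Every pumped string stays in the language, as the repeated derivation A ⇒* vAy predicts.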

  13. The Length of a Path in a Parse Tree • In a parse tree T, define a path to be • empty, or • a sequence of nodes, starting at a node in the tree and ending at one of its descendants, and including every node in between (each node is a child of the one before it). • The length of a path is • 0, if the path is empty, or • 1 less than the number of nodes in the path.

  14. Height and Fanout • The height of a parse tree is the length of the tree’s longest path. • Given a grammar G, the fanout of G, denoted φ(G), is the largest number of symbols on the right side of any rule in G.

  15. A Lemma for the Lemma • Lemma: Let G be a CFG. The yield of any parse tree of G of height h has length no greater than φ(G)^h. • Proof: • The longest possible string is obtained if we always use a grammar rule with the maximum number of symbols on the right-hand side. • Therefore, if we apply grammar rules to each nonterminal in the string at most h times, then the length of the resulting string is at most φ(G)^h.
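
The bound can be checked numerically on a small grammar. The sketch below computes the fanout and, by recursion on the height, the longest yield of any parse tree of height at most h. The dictionary encoding, the helper names, and the example grammar S → SS | a are assumptions made for illustration.

```python
from functools import lru_cache


def fanout(rules):
    """Largest number of symbols on the right side of any rule."""
    return max(len(rhs) for rhss in rules.values() for rhs in rhss)


def max_yield(rules, symbol, height):
    """Longest yield of a parse tree rooted at symbol with height <= height.

    Returns None if no complete parse tree of that height exists.
    """
    @lru_cache(maxsize=None)
    def best(sym, h):
        if sym not in rules:          # a terminal is a leaf contributing 1 symbol
            return 1
        if h == 0:                    # a nonterminal needs at least one rule
            return None
        options = []
        for rhs in rules[sym]:
            parts = [best(s, h - 1) for s in rhs]
            if all(p is not None for p in parts):
                options.append(sum(parts))
        return max(options) if options else None

    return best(symbol, height)


G = {"S": [("S", "S"), ("a",)]}       # fanout 2
for h in range(1, 6):
    print(h, max_yield(G, "S", h), fanout(G) ** h)
```

For this grammar the longest yield at height h is 2^(h-1), comfortably within the bound φ(G)^h = 2^h.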

  16. The Pumping Lemma for CFLs • The Pumping Lemma for CFLs: Let G = (V, Σ, R, S) be a context-free grammar. Then any string w ∈ L(G) with length at least n = φ(G)^(|V – Σ| + 1) can be written as w = uvxyz for some strings u, v, x, y, z ∈ Σ* such that • |v| > 0 or |y| > 0, • |vxy| ≤ n, and • uv^k x y^k z ∈ L(G) for every k ≥ 0.

  17. The Pumping Lemma for CFLs • Proof: • Let n = φ(G)^(|V – Σ| + 1). • Let w ∈ L(G) with |w| ≥ n. • Let T be a parse tree for w that uses the smallest number of leaves possible (this minimizes the number of empty strings). • Let P be a path of maximum length in T. • Since |w| ≥ n > φ(G)^|V – Σ| (taking φ(G) ≥ 2), the length of P is greater than |V – Σ|, i.e., the length of P is at least |V – Σ| + 1. (Lemma) • Therefore, the number of nodes on P is at least |V – Σ| + 2.

  18. The Pumping Lemma for CFLs • Let P' be the last part of P consisting of exactly |V – Σ| + 2 nodes. • P' must contain exactly |V – Σ| + 1 nonterminals (its last node is a leaf). • Therefore, at least one nonterminal must be repeated. • Let A be the first nonterminal that is repeated as we follow the path from the leaf back towards the root. • Let T' be the subtree with root at the second-to-last occurrence of A on the path P. • If we remove T' from T, except for its root A, the result is a parse tree for a string uAz.

  19. The Pumping Lemma for CFLs • Let T'' be the subtree whose root node is the last occurrence of A on the path P. • T'' is a parse tree for a string x. • If we remove T'' from T' except the root A, the result is a parse tree for a string vAy. • This parse tree may be attached at the leaf A in the tree T – T' repeatedly as many times as we like (including zero times), creating parse trees for uv^k A y^k z for any k ≥ 0. • Finally, we re-attach T'' and get a parse tree for uv^k x y^k z.

  20. The Pumping Lemma for CFLs • If v = e and y = e, then T' could have been replaced by T'', producing a smaller parse tree for w. • We chose T to be as small as possible (fewest leaves) among the parse trees for w. • Therefore, v ≠ e or y ≠ e. • The path from the second-to-last A to the last A and then to the terminal has length at most |V – Σ| + 1. • Therefore, the subtree T' represents no more than φ(G)^(|V – Σ| + 1) terminals. (Lemma) • Thus, |vxy| ≤ n.

  21. Standard Example of a Non-CFL • The language {a^n b^n c^n | n ≥ 0} is not context-free. • Proof: • Suppose it is. • Let n be the n of the Pumping Lemma. • Let w = a^n b^n c^n. • Then w = uvxyz where |v| > 0 or |y| > 0 and |vxy| ≤ n. • Since |vxy| ≤ n, vxy contains at most two different symbols. • Suppose it contains only a's and b's (but no c's). • Then v and y together contain at least one a or at least one b.

  22. Standard Example of a Non-CFL • Say v and y together contain i a's and j b's, for some i and j, with i > 0 or j > 0. • Then uv^2 x y^2 z contains n + i a's and n + j b's, at least one of which is greater than n. • But uv^2 x y^2 z contains only n c's. • Thus, uv^2 x y^2 z ∉ L. • This is a contradiction. • Therefore, this language is not context-free. • The other case, where vxy contains only b's and c's, but no a's, is handled similarly.
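
The case analysis above can be cross-checked by brute force for small values of n. The sketch below enumerates every decomposition w = uvxyz with |vxy| ≤ n and v, y not both empty, and keeps those for which pumping with k = 0 and k = 2 stays in the language. It is a sanity check for a few small n, not a proof, and the function names are hypothetical.

```python
def in_abc(s):
    """Membership test for { a^n b^n c^n | n >= 0 }."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def in_ab(s):
    """Membership test for { a^n b^n | n >= 0 }, used as a contrast."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def decompositions(w, n):
    """Yield all (u, v, x, y, z) with w = uvxyz, |vxy| <= n, and |v| + |y| > 0."""
    for i in range(len(w) + 1):
        for j in range(i, min(i + n, len(w)) + 1):
            for a in range(i, j + 1):
                for b in range(a, j + 1):
                    v, y = w[i:a], w[b:j]
                    if v or y:
                        yield w[:i], v, w[a:b], y, w[j:]


def survivors(w, n, member, ks=(0, 2)):
    """Decompositions whose pumped strings stay in the language for every k in ks."""
    return [
        d for d in decompositions(w, n)
        if all(member(d[0] + d[1] * k + d[2] + d[3] * k + d[4]) for k in ks)
    ]


for n in range(1, 5):
    w = "a" * n + "b" * n + "c" * n
    print(n, len(survivors(w, n, in_abc)))

# Contrast: a context-free language does have a decomposition that pumps.
print(survivors("aaabbb", 2, in_ab, ks=(0, 2, 3)))
```

For a^n b^n c^n no decomposition survives, while for a^n b^n the split with v = a and y = b around the middle does.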

  23. Example of a Non-CFL • The language {a^m b^n c^m d^n | m, n ≥ 0} is not context-free. • Proof: • Suppose that it is context-free. • Let n be the n of the Pumping Lemma. • Let w = a^n b^n c^n d^n. • Complete the proof using the Pumping Lemma.

  24. Example of a Non-CFL • The language L = {w ∈ Σ* | #a's = #b's = #c's} is not context-free. • Proof: • Suppose that it is context-free. • Intersect it with L(a*b*c*), which is regular. • The intersection would then be context-free, but it is {a^n b^n c^n | n ≥ 0}, which is known not to be a CFL. • Therefore, the language L is not context-free.
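
The intersection step can be illustrated on all strings up to a small length. The sketch below assumes Σ = {a, b, c} and uses Python's `re` module for the regular language a*b*c*; the helper names are hypothetical.

```python
import re
from itertools import product


def strings_up_to(alphabet, max_len):
    """All strings over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def equal_counts(w):
    """Membership test for L: equally many a's, b's, and c's."""
    return w.count("a") == w.count("b") == w.count("c")


ordered = re.compile(r"a*b*c*")   # the regular language L(a*b*c*)

intersection = sorted(
    (w for w in strings_up_to("abc", 6) if equal_counts(w) and ordered.fullmatch(w)),
    key=len,
)
print(intersection)
```

Up to length 6 the only survivors are the empty string, abc, and aabbcc, matching {a^n b^n c^n | n ≥ 0}.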

  25. Nonclosure Properties • Theorem: The set of context-free languages is not closed under intersection. • Proof: • Let L1 = {a^n b^n c^m | m, n ≥ 0} and let L2 = {a^m b^n c^n | m, n ≥ 0}. • Clearly, L1 and L2 are context-free. • However, L1 ∩ L2 = {a^n b^n c^n | n ≥ 0}, which is known to be non-context-free.
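
A quick bounded check of the claimed intersection, using membership predicates for L1 and L2. The predicate names and the bound of length 6 are choices made here for illustration.

```python
import re
from itertools import product


def blocks(w):
    """Lengths of the a-, b-, and c-blocks if w has the form a*b*c*, else None."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return None if m is None else tuple(len(g) for g in m.groups())


def in_L1(w):
    """a^n b^n c^m: the a-block and b-block have equal length."""
    b = blocks(w)
    return b is not None and b[0] == b[1]


def in_L2(w):
    """a^m b^n c^n: the b-block and c-block have equal length."""
    b = blocks(w)
    return b is not None and b[1] == b[2]


words = ["".join(t) for n in range(7) for t in product("abc", repeat=n)]
both = {w for w in words if in_L1(w) and in_L2(w)}
print(sorted(both, key=len))
```

Strings such as aabbc lie in L1 only and abbcc in L2 only; the strings in both are exactly those with all three blocks of equal length.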

  26. Nonclosure Properties • Theorem: The set of context-free languages is not closed under complementation. • Proof: • Suppose it were closed under complementation. • Let L1 and L2 be context-free languages. • Then (L1' ∪ L2')' is also context-free, since CFLs are closed under union. • However, by De Morgan’s laws, this is L1 ∩ L2, which we now know is not necessarily context-free.

  27. Example • The language L = {w ∈ Σ* | w ≠ uu for any u ∈ Σ*} is context-free. • The language L′ = {w ∈ Σ* | w = uu for some u ∈ Σ*} is not context-free.
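
For concreteness, membership in L′ is easy to decide directly, as the sketch below shows (the function name is hypothetical). This does not conflict with the slide: a language can have a simple membership test and still not be context-free.

```python
from itertools import product


def is_square(w):
    """True if w = uu for some string u, i.e. w is in L'."""
    half, odd = divmod(len(w), 2)
    return odd == 0 and w[:half] == w[half:]


words = ["".join(t) for t in product("ab", repeat=4)]
print([w for w in words if is_square(w)])
```

Over Σ = {a, b} there are four squares of length 4, one for each choice of u of length 2; every other string of length 4 belongs to L.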
