Chapter 6 Properties of Context-free Languages
6.1 Pumping Lemma for CFL's
Lemma 1 (The pumping lemma for context-free languages). Let L be any infinite CFL. Then there exists a constant n, depending on L, such that if z is in L and $|z| \ge n$, then we can write z = uvwxy such that $|vx| \ge 1$, $|vwx| \le n$, and for all $i \ge 0$, $u v^i w x^i y$ is in L.
Proof (sketch of the proof) Let G be a Chomsky normal-form grammar generating $L - \{\varepsilon\}$. Since the number of variables of G is fixed and L is infinite, there exists a long sentence z in L such that any parse tree for z must contain a long path of variables, and at least one variable must appear twice on that path. A Chomsky normal-form grammar has only two types of production rules: $A \to BC$ and $A \to a$. Hence, if a parse tree of a sentence z in L has no path of length greater than i, then the length of z is no more than $2^{i-1}$. Here the length of a path is the number of internal nodes on the path, not including the leaf.
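[Figure: parse trees illustrating the induction on path length.] Base cases: for |path| = k = 1, |word| = 1 $\le 2^{k-1}$; for |path| = k = 2, |word| = 2 $\le 2^{k-1}$. Inductive step: if the root S has subtrees $T_1$ and $T_2$ with |path of $T_1$| $\le i$, |word of $T_1$| $\le 2^{i-1}$, |path of $T_2$| $\le i$, and |word of $T_2$| $\le 2^{i-1}$, then for the whole tree |path| $\le i+1$ and |word| $\le 2^i$.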
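The bound relating path length and word length can be checked on small trees. The following is a minimal sketch, assuming a hypothetical encoding of Chomsky normal-form parse trees as nested tuples; the names `path_length`, `word`, and `full_tree` are illustrative and not part of the original slides.

```python
# Hypothetical encoding of CNF parse trees (an assumption for illustration):
#   ("A", "a")          stands for the production A -> a
#   ("A", left, right)  stands for the production A -> B C

def path_length(tree):
    """Length of the longest path, counting internal (variable) nodes only."""
    if len(tree) == 2:              # A -> a: one variable above the leaf
        return 1
    _, left, right = tree
    return 1 + max(path_length(left), path_length(right))

def word(tree):
    """The terminal string (yield) derived by the tree."""
    if len(tree) == 2:
        return tree[1]
    _, left, right = tree
    return word(left) + word(right)

def full_tree(depth):
    """A complete tree whose longest path has `depth` variables."""
    if depth == 1:
        return ("A", "a")
    sub = full_tree(depth - 1)
    return ("S", sub, sub)

# The complete tree meets the bound |word| <= 2^(i-1) with equality.
for i in range(1, 8):
    t = full_tree(i)
    assert path_length(t) == i
    assert len(word(t)) == 2 ** (i - 1)

# A lopsided tree stays strictly below the bound.
skewed = ("S", ("A", "a"), ("S", ("A", "a"), ("B", "b")))
assert path_length(skewed) == 3 and len(word(skewed)) == 3 <= 2 ** 2
```

The complete binary tree is the extreme case, which is why a word of length at least $2^k$ forces a path with more than k variables.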
Suppose that G has k variables, and let $n = 2^k$. If z is in L and $|z| \ge n$, then $|z| > 2^{k-1}$, so any parse tree of z must have a path of length greater than k. Such a path contains at least k+1 variables, so at least one variable appears at least twice on it. Let $v_1$ and $v_2$ be two vertices on this path labeled with the same variable A, with $v_1$ closer to the root than $v_2$. Among all such pairs, choose the one closest to the leaves of the tree.
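[Figure: parse tree with root S and two vertices labeled A on one path. The path from the upper A down to the leaf has length $\le k+1$, and the subtree of the upper A derives the substring vwx of z = uvwxy, with $|vwx| \le 2^k = n$.] Since the production applied at the upper A has the form $A \to BC$, we have $|vx| \ge 1$. Moreover, $A \Rightarrow^* vAx$ and $A \Rightarrow^* w$, where $|vwx| \le n$.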
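Now, we have that $A \Rightarrow^* vAx$ and $A \Rightarrow^* w$, where $|vwx| \le n$. Therefore $A \Rightarrow^* v^i w x^i$ for i = 0, 1, 2, …. Hence $S \Rightarrow^* u v^i w x^i y$, so $u v^i w x^i y$ is in L, for i = 0, 1, 2, …. End of lemma 1.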
Example 1 $L = \{0^n 1^n 0^n \mid n = 0, 1, 2, \ldots\}$ is not context-free. Pf Suppose that L is context-free. Let n be the constant in the pumping lemma for CFL. Take the sentence $z = 0^n 1^n 0^n$ in L, so that $|z| \ge n$. Consider any way of writing z = uvwxy with $|vx| \ge 1$ and $|vwx| \le n$. Since $|vwx| \le n$, the substring vwx cannot contain 0's from both blocks of 0's. If v or x contains two kinds of symbols, then for i = 2, $uv^2wx^2y$ is not in L, since there are 0's between two 1's. Otherwise, for i = 0, uwy is not in L: one block of 0's still has length n while some other block has become shorter, so the number of 1's is not equal to the number of 0's in each block. Contradiction.
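For small values of the constant, the argument of Example 1 can be confirmed by exhaustive search. The sketch below assumes the sentence $z = 0^n 1^n 0^n$ is the one being pumped and tries only the exponents i = 0 and i = 2; the helper names `in_L`, `decompositions`, and `survives` are hypothetical.

```python
def in_L(s):
    """Membership in L = { 0^m 1^m 0^m : m >= 0 }."""
    m, r = divmod(len(s), 3)
    return r == 0 and s == "0" * m + "1" * m + "0" * m

def decompositions(z, n):
    """Yield every (u, v, w, x, y) with z = uvwxy, |vx| >= 1, |vwx| <= n."""
    for start in range(len(z) + 1):
        for end in range(start, min(start + n, len(z)) + 1):
            for a in range(start, end + 1):
                for b in range(a, end + 1):
                    u, v, w = z[:start], z[start:a], z[a:b]
                    x, y = z[b:end], z[end:]
                    if v or x:
                        yield u, v, w, x, y

def survives(z, n, exponents=(0, 2)):
    """True if some decomposition stays inside L for every tested exponent."""
    return any(
        all(in_L(u + v * i + w + x * i + y) for i in exponents)
        for u, v, w, x, y in decompositions(z, n)
    )

# No decomposition of 0^n 1^n 0^n survives pumping, for each small n tried.
for n in range(1, 6):
    z = "0" * n + "1" * n + "0" * n
    assert in_L(z)
    assert not survives(z, n)
```

A search like this only checks finitely many values of n, so it illustrates the proof rather than replacing it.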
Example 2 $L = \{0^n \mid n \text{ is a prime number}\}$ is not context-free. Pf Suppose that L is context-free. Let n be the constant in the pumping lemma for CFL. Take a sentence z in L with $|z| = p \ge n$, where p is a prime. Consider any way of writing z = uvwxy with $|vx| \ge 1$ and $|vwx| \le n$. Let $|vx| = k \ge 1$. Choose i = p + 1. We have that $|u v^i w x^i y| = (p - k) + i \cdot k = (p - k) + (p + 1)k = p(k + 1)$, a composite number, since $p \ge 2$ and $k + 1 \ge 2$. Hence $u v^i w x^i y$ is not in L. Contradiction.
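The length computation in Example 2 can be checked numerically. This is a small sketch, assuming the exponent is taken to be p + 1 so that the pumped part contributes (p + 1)k symbols; `is_prime` and `pumped_length` are illustrative names.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def pumped_length(p, k, i):
    """|u v^i w x^i y| when |uvwxy| = p and |vx| = k."""
    return (p - k) + i * k

# For every prime p and every possible k = |vx|, the exponent p + 1
# yields length p(k + 1), which is never prime.
for p in [2, 3, 5, 7, 11, 13]:
    for k in range(1, p + 1):
        length = pumped_length(p, k, p + 1)
        assert length == p * (k + 1)
        assert not is_prime(length)
```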
Example 3 $L = \{a^m b^r c^s d^t \mid m = 0 \text{ or } r = s = t\}$ is not context-free. Pumping lemma for CFL fails. For $z = b^r c^s d^t = uvwxy$: when v and x are of the same type of symbol, say b, then for i = 0, 1, 2, …, $u v^i w x^i y$ are all in L. For $z = a^m b^r c^r d^r = uvwxy$: when v and x are of the same type of symbol, say a, then for i = 0, 1, 2, …, $u v^i w x^i y$ are all in L. Ogden's lemma for CFL works.
Lemma 2 (Ogden's lemma). Let L be any infinite CFL. Then there exists a constant n, depending on L, such that if z is in L and $|z| \ge n$, and we mark any n or more positions of z "distinguished," then we can write
• z = uvwxy, such that:
• v and x together have at least one distinguished position,
• vwx has at most n distinguished positions, and
• for all $i \ge 0$, $u v^i w x^i y$ is in L.
Proof (sketch of the proof) Let G be a Chomsky normal-form grammar generating $L - \{\varepsilon\}$. Suppose that G has k variables, and let $n = 1 + 2^k$. Suppose that z is in L and $|z| \ge n$, and that n or more positions of z are marked "distinguished". Select a path in a parse tree of z, and call a vertex of the path a branch point if both of its children have distinguished descendants.
The selection is as follows:
Step 1: path = {}, empty.
Step 2: path ← S; P := S; // path = <S>
Step 3: If P is a leaf, done.
Step 4: If P has only one child (a leaf), then path ← that child; P := that child; goto step 3.
Step 5: If P has two children, say A and B: if the sub-tree of B has fewer distinguished descendants than the sub-tree of A, then path ← A; P := A; // path = <S, …, A>
else path ← B; P := B; // path = <S, …, B>
goto step 3.
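The selection procedure can be sketched in Python. This is a minimal illustration under stated assumptions: the `Node` class, the `marked` flag on leaves, and the functions `marked_count`, `select_path`, and `branch_points` are hypothetical names, and the example tree is made up for demonstration. A vertex with a single child (a production $A \to a$) simply passes the path on to its leaf.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    marked: bool = False  # only meaningful for leaves (distinguished positions)

    def children(self) -> List["Node"]:
        return [c for c in (self.left, self.right) if c is not None]

def marked_count(node: Node) -> int:
    """Number of distinguished leaves below (or at) this node."""
    kids = node.children()
    if not kids:
        return 1 if node.marked else 0
    return sum(marked_count(c) for c in kids)

def select_path(root: Node) -> List[Node]:
    """Walk from the root to a leaf, always moving to the child whose
    subtree has more distinguished leaves (ties go to the second child)."""
    path = [root]
    p = root
    while p.children():
        kids = p.children()
        if len(kids) == 1:
            p = kids[0]
        else:
            a, b = kids
            p = a if marked_count(b) < marked_count(a) else b
        path.append(p)
    return path

def branch_points(path: List[Node]) -> List[Node]:
    """Vertices on the path whose two children both have distinguished leaves."""
    return [p for p in path
            if len(p.children()) == 2
            and all(marked_count(c) > 0 for c in p.children())]

def unit(var: str, sym: str, marked: bool = False) -> Node:
    """Subtree for a production var -> sym."""
    return Node(var, left=Node(sym, marked=marked))

# Made-up example: the word "bcbb" with the three b's distinguished.
tree = Node("S",
            Node("A", unit("B", "b", True), unit("C", "c")),
            Node("A", unit("B", "b", True), unit("B", "b", True)))
path = select_path(tree)
```

Recomputing `marked_count` at every step keeps the sketch short; a real implementation would more likely cache the counts in one bottom-up pass.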
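Since there are at least n distinguished positions on the leaves, and each branch point passes at least half of its distinguished descendants on to the next vertex of the path, the selected path must have at least k+1 branch points. Hence at least one variable appears twice or more among the branch points on the path. The rest of the proof is similar to the proof of the pumping lemma for context-free languages. End of lemma 2.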
Example 4 $L = \{b^r c^s d^t \mid r \ne s,\ s \ne t,\ t \ne r\}$ is not context-free. Proof (by Ogden's lemma) Suppose that L is CF. Let n be the constant in Ogden's lemma. Choose $z = b^n c^{n+n!} d^{n+2n!}$. Let the positions of the b's be distinguished and write z = uvwxy. Then v and x together have at least one b, and vwx has at most n b's. If either v or x contains two different kinds of symbols, then $uv^2wx^2y$ is not of the form $b^* c^* d^*$ and so is not in L.
If each of v and x contains only one kind of symbol, then one of v and x must be a substring of $b^+$.
• If x is in $c^*$ or $d^*$, then v must be in $b^+$. Assume that x is in $c^*$ and let |v| = s, i.e., $v = b^s$. Then $1 \le s \le n$, so s divides n!. Let t = n!/s. Choose i = 2t+1. By Ogden's lemma, $z' = u v^{2t+1} w x^{2t+1} y$ is in L. But $v^{2t+1} = b^{s+2st} = b^{s+2n!}$, and uwy has (n – s) b's, so z' has (n – s) + s + 2n! = n + 2n! b's, which equals the number of d's in z'. Hence z' is not in L. Contradiction. Assume instead that x is in $d^*$ and let |v| = s, i.e., $v = b^s$. Let t = n!/s. Choose i = t+1. By Ogden's lemma, $z' = u v^{t+1} w x^{t+1} y$ is in L. But $v^{t+1} = b^{s+st} = b^{s+n!}$, and uwy has (n – s) b's. Then z' has (n – s) + s + n! = n + n! b's, which equals the number of c's in z', and z' is not in L. Contradiction.
• If x is in $b^+$, then v must be in $b^*$, and w is in $b^*$. Let |vx| = s, i.e., $vx = b^s$. Then $1 \le s \le n$, so s divides n!. Let t = n!/s. Choose i = 2t+1. By Ogden's lemma, $z' = u v^{2t+1} w x^{2t+1} y$ is in L. But $v^{2t+1} x^{2t+1}$ contains s + 2st = s + 2n! b's, and the string uwy has (n – s) b's. Then z' has n + 2n! b's, which equals the number of d's in z', and hence z' is not in L. Contradiction. End of example 4.
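The arithmetic behind the choice of exponents in Example 4 can be checked for small n. The sketch below assumes, as in the example, that z starts with n b's, that the pumped part holds s of them, and that t = n!/s; the function name `b_count_after_pumping` is hypothetical.

```python
from math import factorial

def b_count_after_pumping(n, s, i):
    """Number of b's in u v^i w x^i y when the pumped part holds s of the n b's."""
    return (n - s) + i * s

for n in range(1, 8):
    f = factorial(n)
    for s in range(1, n + 1):
        assert f % s == 0           # s <= n, so s divides n!
        t = f // s
        # i = 2t + 1 makes the number of b's equal the n + 2n! d's.
        assert b_count_after_pumping(n, s, 2 * t + 1) == n + 2 * f
        # i = t + 1 makes the number of b's equal the n + n! c's.
        assert b_count_after_pumping(n, s, t + 1) == n + f
```

This is why the exponents n + n! and n + 2n! are used in z: whatever the size s of the pumped block of b's, some exponent makes the b's collide with the c's or the d's.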
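Example 5 $L = \{a^m b^r c^s d^t \mid m = 0 \text{ or } r = s = t\}$ is not context-free. Proof (by Ogden's lemma) Suppose that L is context-free, and let n be the constant in Ogden's lemma. Choose $z = a^n b^n c^n d^n$. Mark all positions of b's "distinguished". Write z = uvwxy, where vx contains at least one b, and vwx contains at most n b's. Choose i = 2. By Ogden's lemma, $z' = uv^2wx^2y$ is in L. If v or x contains two kinds of symbols, then z' is not of the form $a^* b^* c^* d^*$. Otherwise, z' still contains a's and has more than n b's, while v and x together change the number of at most one other kind of symbol, so z' has exactly n c's or exactly n d's. Then the number of b's in z' ≠ the number of c's in z', or the number of b's in z' ≠ the number of d's in z'. In both cases z' is not in L. Contradiction. Note that i = 0 would not suffice in every case: if $v = a^n$ and x consists of b's, then uwy contains no a's and is in L.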