1 / 9

Section 12.4 Context-Free Language Topics

Section 12.4 Context-Free Language Topics Algorithm . Remove L -productions from grammars for langauges without L . Find nonterminals that derive L . For each production A  w construct all productions A  w’ where w’ is obtained from

julesn
Download Presentation

Section 12.4 Context-Free Language Topics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Section 12.4 Context-Free Language Topics Algorithm. Remove L-productions from grammars for langauges without L. Find nonterminals that derive L. For each production A w construct all productions A w’ where w’ is obtained from w by removing one or more occurrences of the nonterminals from Step 1. Combine the original productions with those of step 2 and eliminate any L-productions. Example. Remove L-productions from the grammar S  ABc A aA | L B  bB | L. Solution. Step 1: The nonterminals A and B derive L. Step 2: From the production S  ABc we construct S  Bc | Ac | c. From the production A aA we construct A a. From the production B bB we construct B b. Step 3: S  ABc | Bc | Ac | c A aA | a B  bB | b. Quiz. Remove L -productions from S  ABc | Ab | c A ABa | L B  Bbc | L. Solution. S  ABc | Ab | c| Bc | Ac | b A ABa | Ba | Aa | a B  Bbc | bc.

  2. Chomsky Normal Form. Productions have one of the following forms A  a (a a terminal) A  BC S  L (if L is in the language). Advantages: Parse trees are binary, which are easy to represent. Any string of length n > 0 can be derived in 2n – 1 steps. Algorithm. Transform to Chomsky normal form (with the additional property that no start symbol occurs on the right side of a production) 1. If the start symbol S occurs on some right side, create a new start symbol S´ and a new production S´  S. 2. Remove AL (if A ≠ S) by previous algorithm. (If S  L is removed, add it back.) 3. Remove unitproductions (i.e., A  B): If A  B or A + B, then construct productions Aw where Bw is not a unit production. Now remove all unit productions. 4. For each production whose right side has two or more symbols, replace all occurrrences of each terminal a with a new nonterminal A and also add the new production Aa. 5. Replace each production B C1…Cnwith n > 2 with B C1D where D C2 …Cn. Repeat this step until all right sides have length two.

  3. Example. Construct a Chomsky normal form for the grammar S  aSb | D D  Dc | L. Solution. Step 1: Add the production S´  S. Step 2: S´  S | L S  aSb | ab | D D  Dc | c. Step 3: S´  aSb | ab | Dc | c | L S  aSb | ab | Dc | c D  Dc | c. Step 4: S´  ASB | AB | DC | c | L S  ASB | AB | DC | c D  DC | c A  a B  b C  c. Step 5: Replace S´ ASB and S  ASB by S´ AE, S´ AE, and E  SB.

  4. Quiz. Construct a Chomsky normal form for the grammar S  aTbb | U |L T  cT | c U  Ud | d. Solution. Step 1: No change.Step 2: No change. Step 3: Remove unit production S  U to obtain S  aTbb | Ud | d |L T  cT | c U  Ud | d. Step 4: Transform right sides of length at least two into strings of nonterminals. S  ATBB | UD | d |L T  CT | c U  UD | d. A  a B  b C  c D  d. Step 5: Replace S  ATBB with the productions S  AE, E  TF, F  BB.

  5. Greibach Normal Form. Productions have one of the following forms A  b (b a terminal) A  bD1…Dk S  L (if L is in the language). Advantage: Any string of length n > 0 can be derived in n steps. Algorithm (idea). Transform context-free grammar to Greibach normal form. Perform steps 1, 2, and 3 of the Chomsky algorithm. Remove all left-recursion, including indirect, without addingL 3. Make substitutions to transform the grammar into the proper form. Example. Put the following grammar into Greibach normal form. S  AB | Ac | d A  aA | a B Ab | c. Solution: Steps 1 (Chomsky steps 1, 2, and 3) and 2 are non needed. Step 3: Replace A in S  AB | Ac | d with aA | a to obtain S  aAB | aB | aAc | ac | d. Replace A in B Ab | c with theright side of A  aA | a to obtainB aAb | ab | c. Now add the new productions C  c and D  b to obtain the proper form: S  aAB | aB | aAC | aC | d A  aA | a B aAD | aD | c C  c D  b.

  6. Example. Put the following grammar into Greibach normal form. S  AB | Ac | d A  aA | a B Ab | c. Solution: Steps 1 (Chomsky steps 1, 2, and 3) and 2 are not needed. Step 3: Replace A in S  AB | Ac | d with aA | a to obtain S  aAB | aB | aAc | ac | d. Replace A in B Ab | c with theright side of A  aA | a to obtain B aAb | ab | c. Now add the new productions C  c and D  b and make appropriate replacements to obtain the proper form: S  aAB | aB | aAC | aC | d A  aA | a B aAD | aD | c C  c D  b.

  7. Properties of Context-Free Languages When we know some properties of context-free languages they can help us argue, BWOC, that certain languages are not context-free. The Pumping Lemma If L is an infinite context-free language, then any grammar for L must be recursive, so there must be derivations of the the following form where u, v, w, x, and y are terminal strings. S + uNy N + vNx (where v and x are not both L) N + w. These derivations lead to derivations like S + uNy + uvNxy + uv2Nx2y + uvkNxky + uvkwxky L for all kN. This is the basis for the Pumping Lemma: There is an integer m > 0 such that if zL and | z | ≥ m, then z has the form z = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ m and uvkwxky L for all kN. Note: The number m depends on the grammar as we’ll see in the following example.

  8. Example. Suppose we have the following grammar for {L, bbc}  {abcnd | n N}. S  aNd | bbc | L N Nc | b. Here are a few derivations: S  aNd  abd S  aNd  aNcd  abcd S  aNd  aNcd  aNccd  abccd S + abckd for any k in N. For this grammar m = 4 can be used in the pumping lemma because any derivation of a string z with | z | ≥ 4 must use the nonterminal N. For example, if | z | = 8 and z = abcccccd, then the pumping lemma factors z = abcccccd = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ 4 and uvkwxky L for all kN. In this case let u = a, v = L, w = b, x = c, and y = ccccd. Example. The language L = {anbncn+k| k, n N} is not context-free. Proof: Assume, BWOC, that L is context-free. L is infinite, so pumping lemma applies. Choose z = ambmcmwhere m is the positive integer from the lemma. Then z = ambmcm = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ m and uvkwxky L for all kN. Observe neither v nor x can contain distinct letters. For example, if v = …a…b…, then v2 = …a…b……a…b…, which can’t appear as a substring of any string in L. So v and x must be strings of repeated occurrences of a single letter. Now since | vwx | ≤ m, there are two possible places in ambmcm where v and x must occur: (1) v and x occur in ambm. (2) v and x occur in bmcm. But we obtain the following contradictions because v and x are not both L. (1) Let k = 2 to obtain uv2wx2y = am+ibm+jcm, where i > 0 or j > 0. So uv2wx2y L (2) Let k = 0 to obtain uwy = ambm-icm–j, where i > 0 or j > 0. So we have uwy L. These contradictions imply that L is not context-free. QED.

  9. Example/Quiz. Prove that the language L = {ss | s  {a, b}*} is not context-free. Proof: Assume, BWOC, that L is context-free. L is infinite, so pumping lemma applies. Choose z = ambmambmwhere m is the positive integer from the lemma. Then z = ambmambm = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ m and uvkwxky L for all kN. Now since | vwx | ≤ m, there are three possible places in ambmambm where v and x must occur: (1) v and x occur in ambm (on the left of z). (2) v and x occur in bmam (in the center of z). (3) v and x occur in ambm (on the right of z). Notice that v and x can consist only of repetitions a single letter. For example, in case (1) suppose v = aibj for some i > 0 and j > 0 and x = bn for some n ≥ 0. Then, letting k = 0, we would obtain uwy = am–ibm–j–nambm, which cannot be in L. The argument is similar for the other cases. So v and x must consist only of repetitions of a single letter. We need to find a contradiction in each of the three cases. We’ll do it by using k = 0. This tells us that uwy L. But we obtain the following contradictions because v and x are not both L. (1) uwy = am–ibm–jambmwhere either i > 0 or j > 0 Souwy  L, (2) uwy = ambm–iam–jbmwhere either i > 0 or j > 0.Souwy  L. (3) uwy = ambmam–ibm–jwhere either i > 0 or j > 0.Souwy  L. Therefore L is not context-free. QED. Remark: Be careful that the choice of z is not in a context-free sublanguage of L. For example, if we chose z = (ab)m(ab)m in the preceding example, we would not get any contradictions.

More Related