This tutorial discusses the Pumping Lemma, DFA Minimization, and Context-Free Languages. It provides examples and outlines the algorithms involved.
Tutorial 03-- CSC3130 : Formal Languages and Automata Theory Haifeng Wan (hfwan@cse.cuhk.edu.hk) 2009-09-27
Outline • Pumping Lemma • DFA Minimization • Context-free Languages
Pigeonhole Principle • Pigeonhole principle • If m objects are put into n containers, where m > n, then at least one container must hold more than one object. • The pigeonhole principle can be used to prove that certain infinite languages are not regular. • Reminder: every finite language is regular.
Pumping Lemma for Regular Languages • Theorem: For every regular language L there exists a number n such that every string z in L with |z| ≥ n can be written as z = uvw, where • |uv| ≤ n, • |v| ≥ 1, • for every i ≥ 0, the string uv^iw is in L. • [Figure: the string z split into consecutive segments u, v, w]
Pumping Lemma • What does the Pumping Lemma say? • If an infinite language is regular, it can be defined by a DFA. • The DFA has a finite number m of states. • Since the language is infinite, some strings of the language must have length greater than m. • For a string of length greater than m accepted by the DFA, the walk through the DFA must visit some state twice (pigeonhole principle), so it contains a cycle. • Repeating the cycle an arbitrary number of times must yield another string accepted by the DFA. • Reminder: the Pumping Lemma gives a necessary condition for regularity, not a sufficient one. • It can be used to prove that a given infinite language is not regular, but it cannot be used to prove that a given infinite language is regular.
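The cycle argument can be tried out on a concrete machine. The sketch below is only an illustration: the DFA (even number of 0s), the state names "even" and "odd", and the helper names run, pump_split and accepts are all assumptions made for this example, not part of the tutorial.

```python
def run(delta, state, s):
    """Return the list of states visited while reading s, starting at state."""
    path = [state]
    for ch in s:
        state = delta[(state, ch)]
        path.append(state)
    return path


def pump_split(delta, start, z):
    """Split z into (u, v, w) where v labels the first cycle in the walk."""
    first_seen = {}
    for pos, state in enumerate(run(delta, start, z)):
        if state in first_seen:
            i = first_seen[state]
            return z[:i], z[i:pos], z[pos:]
        first_seen[state] = pos
    raise ValueError("z is too short to force a repeated state")


# Hypothetical DFA: accepts strings over {0, 1} with an even number of 0s.
DELTA = {
    ("even", "0"): "odd", ("even", "1"): "even",
    ("odd", "0"): "even", ("odd", "1"): "odd",
}
ACCEPTING = {"even"}


def accepts(s):
    return run(DELTA, "even", s)[-1] in ACCEPTING


u, v, w = pump_split(DELTA, "even", "1001")
# Every pumped string u v^i w should still be accepted.
pumped_ok = all(accepts(u + v * i + w) for i in range(5))
```

Because the split is taken at the first repeated state, |uv| can never exceed the number of states, which matches the bound |uv| ≤ n in the lemma.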
Outline of a Proof by the Pumping Lemma • Main idea: proof by contradiction. • Brief outline: • Assume the language L is regular (and thus the Pumping Lemma holds). • Show that repeating the cycle some number of times (“pumping” the cycle) yields a string that is not in L. • Conclude by contradiction that L is not regular. • What do we get to choose when using the Pumping Lemma? • The particular string z in L. • The number of times to “pump” the cycle.
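Example 1 • Prove that L3 = {uu : u in {0,1}*} is not regular. • Suppose L3 is regular; then the Pumping Lemma gives some number n. • Choose the string z = 0^m10^m1 with m > n. It is in L3 because it is 0^m1 repeated twice. • Although the decomposition of z into uvw is unknown, uv must consist entirely of 0s because |uv| ≤ n < m. Moreover, |v| ≥ 1. • Simply choose i = 2. Then uv^2w = 0^{m+|v|}10^m1 has more 0s before the first 1 than between the first and the second 1. A string of the form xx that ends in 1 and contains exactly two 1s must look like 0^a10^a1, so uv^2w is not in L3. • Thus L3 is not regular, by contradiction.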
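For small values of n the argument can be checked by brute force. The following is a sketch under assumptions: the names in_L3, witness and all_splits_fail are made up here, and m = n + 1 is just one admissible choice of m > n. It tries every decomposition allowed by the lemma and confirms that pumping with i = 2 always leaves L3.

```python
def in_L3(s):
    """True if s has the form xx for some string x."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s[:half] == s[half:]


def witness(n):
    """The string 0^m 1 0^m 1 with m = n + 1 (any m > n would do)."""
    m = n + 1
    return "0" * m + "1" + "0" * m + "1"


def all_splits_fail(z, n, i=2):
    """Check every split z = uvw with |uv| <= n and |v| >= 1:
    is u v^i w outside L3 in all cases?"""
    for uv_len in range(1, n + 1):
        for v_len in range(1, uv_len + 1):
            u = z[:uv_len - v_len]
            v = z[uv_len - v_len:uv_len]
            w = z[uv_len:]
            if in_L3(u + v * i + w):
                return False
    return True


results = {n: all_splits_fail(witness(n), n) for n in range(1, 8)}
```

A finite check like this is not a proof, since the lemma quantifies over all n, but it is a quick way to catch a badly chosen string z.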
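Example 2 • Prove that L = {x : x has different numbers of 0s and 1s} is not regular. • Trick: • Instead of proving this directly, first prove that its complement D = {x : x has the same number of 0s and 1s} is not regular. • Steps: • Recall that we have proven that L’ = {0^n1^n : n ≥ 0} is not regular, and that L’ = D ∩ L(0*1*). • If D were regular, then L’ would also be regular, because regular languages are closed under intersection. This is a contradiction, so D is not regular. • If L were regular, then its complement D would also be regular, because regular languages are closed under complement. Thus L is not regular either.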
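Example 2 (cont.) • Alternatively, apply the Pumping Lemma to L directly. • Take x = 0^n1^{n! + n}, which is in L. • Then the adversary splits it as uvw. Let k be the length of the v part. Since |uv| ≤ n, v consists only of 0s and 1 ≤ k ≤ n, so k divides n!. • Now pump it i = (n! + k)/k = n!/k + 1 times. • Then you get uv^iw = 0^{k((n! + k)/k) + (n - k)}1^{n! + n} = 0^{n! + n}1^{n! + n}, which has the same number of 0s and 1s and so is not in L.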
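The exponent arithmetic is easy to get wrong, so here is a small sanity check that builds the pumped string explicitly for small n. The function name pump_to_balance and the parameter u_len (the length of u) are assumptions made for this sketch.

```python
from math import factorial


def pump_to_balance(n, u_len, k):
    """Pump x = 0^n 1^(n!+n), split as u = x[:u_len], v = next k symbols,
    exactly i = n!/k + 1 times and return u v^i w."""
    assert k >= 1 and u_len + k <= n      # the lemma's constraints on u and v
    x = "0" * n + "1" * (factorial(n) + n)
    u, v, w = x[:u_len], x[u_len:u_len + k], x[u_len + k:]
    i = factorial(n) // k + 1             # exact: k <= n, so k divides n!
    return u + v * i + w


# Try every admissible split for n = 1..5.
balanced = all(
    s.count("0") == s.count("1")
    for n in range(1, 6)
    for k in range(1, n + 1)
    for u_len in range(0, n - k + 1)
    for s in [pump_to_balance(n, u_len, k)]
)
```

The reason for using n! is visible in the code: the number of pumps depends on k, which the adversary picks, and n! is divisible by every possible k.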
Example • Prove that L2 = {1^m : m is prime} is not regular. • Suppose L2 is regular, and thus the Pumping Lemma holds. Although n is unknown, we can still assume that there is one. • Choose a string z = 1^m where m is a prime number and |z| = m > n + 1. Any prefix of z consists entirely of 1s. • Although the decomposition of z into uvw is unknown, |uv| ≤ n and |uvw| > n + 1 imply |w| > 1. Moreover, |v| ≥ 1. • Choose i = |uw|. (Recall that |w| > 1, so |uw| > 1.) We have |uv^iw| = |uw| + i|v| = |uw| + |v||uw| = (1 + |v|)|uw|. Because both 1 + |v| and |uw| are greater than 1, the product is a composite number, not a prime. So uv^iw is not in L2. • Thus, L2 is not regular, by contradiction. Q.E.D.
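The length computation can be replayed numerically. This sketch uses made-up helper names (is_prime, pumped_lengths) and sample values of n and m chosen only for illustration. It lists |uv^iw| for i = |uw| over every allowed length of v and checks that none of the lengths is prime.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def pumped_lengths(m, n):
    """Lengths |u v^i w| with i = |uw| for z = 1^m, over all |v| in 1..n.
    Only |v| matters because z consists of a single repeated symbol."""
    lengths = []
    for v_len in range(1, n + 1):
        uw_len = m - v_len
        lengths.append(uw_len + v_len * uw_len)   # equals (1 + |v|) * |uw|
    return lengths


# Sample values: n = 5 with the prime m = 11 > n + 1.
sample = pumped_lengths(11, 5)
none_prime = not any(is_prime(length) for length in sample)
```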
Outline • Pumping Lemma • DFA Minimization • Context-free grammars (CFG)
DFA Minimization • There is an algorithm to start with any DFA and reduce it to the smallest possible DFA • The algorithm attempts to identify classes of equivalent states • These are states that can be merged together without affecting the answer of the computation
Equivalent and Distinguishable States • Two states q, q’ are equivalent if for every string w, the states δ̂(q, w) and δ̂(q’, w) are either both accepting or both rejecting. • Here, δ̂(q, w) is the state that the machine is in if it starts at q and reads the string w. • q, q’ are distinguishable if they are not equivalent: for some string w, one of the states δ̂(q, w), δ̂(q’, w) is accepting and the other is rejecting.
DFA Minimization Algorithm • Find all pairs of distinguishable states as follows:
  For any pair of states (q, q’):
    If exactly one of q, q’ is accepting
      Mark (q, q’) as distinguishable
  Repeat until no new pair is marked:
    For any pair of states (q, q’):
      For every alphabet symbol a:
        If (δ(q, a), δ(q’, a)) is marked as distinguishable
          Mark (q, q’) as distinguishable
  For any pair of states (q, q’):
    If (q, q’) is not marked as distinguishable
      Merge q and q’ into a single state
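The marking procedure translates fairly directly into code. What follows is a minimal sketch, not a tuned implementation: the function names, the dictionary encoding of the transition function, and the three-state test DFA (strings ending in 1, with a deliberately redundant state) are all assumptions made for illustration.

```python
from itertools import combinations


def distinguishable_pairs(states, alphabet, delta, accepting):
    """Return the set of unordered state pairs marked as distinguishable."""
    # Base case: exactly one state of the pair is accepting.
    marked = {
        frozenset((q, r))
        for q, r in combinations(states, 2)
        if (q in accepting) != (r in accepting)
    }
    changed = True
    while changed:                        # repeat until no new pair is marked
        changed = False
        for q, r in combinations(states, 2):
            pair = frozenset((q, r))
            if pair in marked:
                continue
            for a in alphabet:
                successors = frozenset((delta[(q, a)], delta[(r, a)]))
                if successors in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked


def equivalence_classes(states, alphabet, delta, accepting):
    """Group states whose pairs were never marked."""
    marked = distinguishable_pairs(states, alphabet, delta, accepting)
    classes = []
    for q in states:
        for cls in classes:
            if frozenset((q, cls[0])) not in marked:
                cls.append(q)
                break
        else:
            classes.append([q])
    return classes


# Hypothetical DFA for strings ending in 1; states "s" and "a" are redundant.
STATES = ["s", "a", "b"]
DELTA = {
    ("s", "0"): "a", ("s", "1"): "b",
    ("a", "0"): "a", ("a", "1"): "b",
    ("b", "0"): "a", ("b", "1"): "b",
}
classes = equivalence_classes(STATES, "01", DELTA, {"b"})
```

Comparing each state only with the first member of a class is enough because unmarked pairs form an equivalence relation once marking has stopped.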
Example 1 • [Figure: DFA with states q0, q1, q2, q3, q4 over the alphabet {0, 1}, shown next to an empty table of state pairs]
Example 1 (cont.) • q4 is distinguishable from all other states.
Example 1 (cont.) • q0 is distinguishable from q1, q2, q3, q4.
Example 1 (cont.) • Merge states not marked distinguishable: • q0 cannot be merged → group A • q1, q2, q3 are equivalent → group B • q4 cannot be merged → group C • Final table (x = distinguishable, B = equivalent and merged into group B):
       q0   q1   q2   q3
  q1   x
  q2   x    B
  q3   x    B    B
  q4   x    x    x    x
Example 1 (cont.) • Minimized DFA with states qA, qB, qC. [Figure: minimized DFA]
Example 2 • [Figure: DFA with states q0, q1, q2, q3, q4, q5, q6 over the alphabet {0, 1}, shown next to an empty table of state pairs]
Example 2 (cont.) • q2 is distinguishable from all other states.
Example 2 (cont.) • q0 is distinguishable from q1, q2, q4, q5, q6.
Example 2 (cont.) • q1 is distinguishable from q0, q2, q3, q4, q5.
Example 2 (cont.) • q3 is distinguishable from q1, q2, q4, q5, q6.
Example 2 (cont.) • q4 is distinguishable from q0, q1, q2, q3, q5, q6.
Example 2 (cont.) • q5 is distinguishable from q0, q1, q2, q3, q4, q6.
Example 2 (cont.) • Merge states not marked distinguishable: • q0, q3 are equivalent → group A • q1, q6 are equivalent → group B • q2 cannot be merged → group C • q4 cannot be merged → group D • q5 cannot be merged → group E • Final table (x = distinguishable, A or B = equivalent and merged into that group):
       q0   q1   q2   q3   q4   q5
  q1   x
  q2   x    x
  q3   A    x    x
  q4   x    x    x    x
  q5   x    x    x    x    x
  q6   x    B    x    x    x    x
Example 2 (cont.) • Minimized DFA with states qA, qB, qC, qD, qE. [Figure: minimized DFA]
Outline • Pumping Lemma • DFA Minimization • Context-free Languages
Relations • Context-free languages L, context-free grammars G, and push-down automata M describe the same class: L = L(G), L(G) = L(M), L = L(M). • PDA = NFA + a stack (infinite memory)
Example (I) • Given the following CFG G over Σ = {a, b}: • S → X | Y • X → aXb | aX | a • Y → aYb | Yb | b • (1) L(G) = ?
Example (I) --- solution: L(S) • Grammar: S → X | Y, X → aXb | aX | a, Y → aYb | Yb | b • Try to write some strings generated by it: • S ⇒ X ⇒ aXb ⇒ aaXbb ⇒ aaaXbb ⇒ aaaabb (more a’s than b’s) • S ⇒ Y ⇒ aYb ⇒ aYbb ⇒ aaYbbb ⇒ aabbbb (more b’s than a’s) • Observations: • Starting from S, we can enter two states X and Y, and X, Y are “independent”; • In state X, more a’s than b’s are always generated; • In state Y, more b’s than a’s are always generated. • Ls = Lx ∪ Ly, where Lx = {a^ib^j : i > j} and Ly = {a^ib^j : i < j}, so L(S) = {a^ib^j : i ≠ j}.
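The claimed answer can be cross-checked by enumerating short strings. This is a sketch only: the helper names generated and expected and the length bound 8 are assumptions for this example. It expands the leftmost nonterminal exhaustively and compares the result with {a^ib^j : i ≠ j}, restricted to the same length.

```python
RULES = {
    "S": ["X", "Y"],
    "X": ["aXb", "aX", "a"],
    "Y": ["aYb", "Yb", "b"],
}


def generated(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    seen, frontier, words = {"S"}, ["S"], set()
    while frontier:
        form = frontier.pop()
        # Position of the leftmost nonterminal, if there is one.
        idx = next((i for i, c in enumerate(form) if c in RULES), None)
        if idx is None:
            words.add(form)
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # No rule shrinks a sentential form, so long forms can be pruned.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                frontier.append(new)
    return words


def expected(max_len):
    """The strings a^i b^j with i != j and i + j <= max_len."""
    return {
        "a" * i + "b" * j
        for i in range(max_len + 1)
        for j in range(max_len + 1 - i)
        if i != j
    }


agree = generated(8) == expected(8)
```

Agreement up to a fixed length supports the answer but does not replace the argument in the observations.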
Example (II) • Given the following language: • L = {0^i1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1} • (1) design a CFG for it.
Example (II) -- solution: CFG • L = {0^i1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1} • Consider two extreme cases: • (a) if j = i, then L1 = {0^i1^j : i = j}, generated by S → 0S1 (“red-rule”) and S → ε; • (b) if j = 2i, then L2 = {0^i1^j : 2i = j}, generated by S → 0S11 (“blue-rule”) and S → ε. • If i ≤ j ≤ 2i, then freely choose the “red-rule” or the “blue-rule” at each step of the generation: • S → 0S1 • S → 0S11 • S → ε
Example (II) -- solution: CFG • L = {0^i1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1} • G: S → 0S1 (“red-rule”), S → 0S11 (“blue-rule”), S → ε • Need to verify L = L(G). • 1) L(G) is a subset of L: each application of the “red-rule” or the “blue-rule” generates exactly one 0 on the left and one or two 1s on the right, so every derived string is 0^i1^j with i ≤ j ≤ 2i. So L(G) is a subset of L. • 2) L is a subset of L(G): for any w = 0^i1^j with i ≤ j ≤ 2i, we use the “red-rule” (2i - j) times and then the “blue-rule” (j - i) times, i.e., S ⇒* 0^{2i-j}S1^{2i-j} ⇒* 0^{2i-j}0^{j-i}S1^{2(j-i)}1^{2i-j} ⇒ 0^i1^j = w.
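Both inclusions can be spot-checked for small i. The sketch below relies on an observation specific to this grammar: every sentential form is 0^i S 1^j, so it is enough to track the pair (i, j). The function names generated and target and the bound of 6 zeros are assumptions for this example.

```python
def generated(max_zeros):
    """Strings derivable from S -> 0S1 | 0S11 | ε with at most max_zeros
    applications of the two non-ε rules."""
    pairs = {(0, 0)}          # (i, j) stands for the form 0^i S 1^j
    frontier = {(0, 0)}
    for _ in range(max_zeros):
        # Each step adds one 0 and either one 1 (0S1) or two 1s (0S11).
        frontier = {(i + 1, j + d) for (i, j) in frontier for d in (1, 2)}
        pairs |= frontier
    # Finishing with S -> ε turns the form 0^i S 1^j into the string 0^i 1^j.
    return {"0" * i + "1" * j for (i, j) in pairs}


def target(max_zeros):
    """The strings 0^i 1^j with i <= j <= 2i and i <= max_zeros."""
    return {
        "0" * i + "1" * j
        for i in range(max_zeros + 1)
        for j in range(i, 2 * i + 1)
    }


agree = generated(6) == target(6)
```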