Dive into the world of Deterministic Finite Automata (DFA) and Non-Deterministic Finite Automata (NFA) computation with finite memory, exploring how to transition from NFAs to DFAs using subset construction. Learn about regular languages, subset construction, and closure properties under operations like concatenation and star. Discover how regular expressions represent languages, with examples and inductive definitions for understanding complex computations.
CS 154, Lecture 3: DFA ≡ NFA, Regular Expressions
Deterministic Finite Automata: Computation with finite memory
Non-Deterministic Finite Automata: Computation with finite memory and “verified guessing”
From NFAs to DFAs
Input: NFA N = (Q, Σ, δ, Q0, F)
Output: DFA M = (Q′, Σ, δ′, q0′, F′)
To learn whether an NFA accepts, we could do the computation in parallel, maintaining the set of all possible states that can be reached.
Idea: Set Q′ = 2^Q
From NFAs to DFAs: Subset Construction
Input: NFA N = (Q, Σ, δ, Q0, F)
Output: DFA M = (Q′, Σ, δ′, q0′, F′)
Q′ = 2^Q
δ′ : Q′ × Σ → Q′, where δ′(R, σ) = ⋃_{r ∈ R} ε(δ(r, σ)) *
q0′ = ε(Q0)
F′ = { R ∈ Q′ | f ∈ R for some f ∈ F }
* For S ⊆ Q, the ε-closure of S is ε(S) = { q | q reachable from some s ∈ S by taking 0 or more ε-transitions }
Example of the ε-closure
ε({q0}) = {q0, q1, q2}
ε({q1}) = {q1, q2}
ε({q2}) = {q2}
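The ε-closure can be computed with a simple graph search. The following is a minimal sketch, not the lecture's own code: the function name `eps_closure`, the dictionary `eps`, and the chain of ε-transitions q0 → q1 → q2 are assumptions chosen to be consistent with the values on the slide, since the slide's figure is not available.

```python
def eps_closure(states, eps):
    """Return ε(S): all states reachable from S by 0 or more ε-transitions.

    `eps` maps a state to the set of states one ε-transition away.
    """
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

# Hypothetical ε-transitions (assumed, not taken from the slide): q0 -> q1 -> q2
eps = {"q0": {"q1"}, "q1": {"q2"}}
for q in ("q0", "q1", "q2"):
    print(q, sorted(eps_closure({q}, eps)))
```

Every state belongs to its own closure because taking zero ε-transitions is allowed.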
Given: NFA N = ( {1,2,3}, {a,b}, δ, {1}, {1} )
Construct: Equivalent DFA M = (2^{1,2,3}, {a,b}, δ′, {1,3}, …)
[Figure: state diagrams of N (states 1, 2, 3) and of M (states {1,3}, {3}, {2}, {2,3}, {1,2,3})]
ε({1}) = {1,3}
What about {1} and {1,2}?
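The subset construction can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the transition table `delta` is a guess at the arrows of N (the extracted slide does not spell them out), chosen so that ε({1}) = {1,3} as the slide states, and the names `subset_construction` and `EPS` are hypothetical. The sketch builds only the subsets reachable from the start state rather than all of 2^Q.

```python
EPS = ""  # label used for ε-transitions in this sketch

def eps_closure(states, delta):
    """ε(S): states reachable from S by 0 or more ε-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def subset_construction(alphabet, delta, starts, accepts):
    """Return (Q', δ', q0', F'), restricted to subsets reachable from q0'."""
    q0 = eps_closure(starts, delta)
    dfa_delta, seen, todo = {}, {q0}, [q0]
    while todo:
        R = todo.pop()
        for a in alphabet:
            moved = set()
            for r in R:
                moved |= delta.get((r, a), set())
            target = eps_closure(moved, delta)  # δ'(R, a)
            dfa_delta[(R, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accepts = {R for R in seen if R & set(accepts)}
    return seen, dfa_delta, q0, dfa_accepts

# Assumed arrows of N (not recoverable from the extracted figure).
delta = {
    (1, "b"): {2}, (1, EPS): {3},
    (2, "a"): {2, 3}, (2, "b"): {3},
    (3, "a"): {1},
}
states, dfa_delta, q0, dfa_accepts = subset_construction("ab", delta, {1}, {1})
for R in sorted(states, key=lambda s: (len(s), sorted(s))):
    print(sorted(R))
```

Under these assumed arrows, the subsets {1} and {1,2} are never reached from the start state {1,3}, so they can be dropped from M without changing the language.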
Reverse Theorem for Regular Languages
Theorem: The reverse of a regular language is also a regular language.
If a language can be recognized by a DFA that reads strings from right to left, then there is a “normal” DFA that accepts the same language.
Proof? Given a DFA for a language L, “reverse” its arrows and flip its start and accept states, getting an NFA. Convert that NFA back to a DFA!
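The proof idea can be tried out on a small machine. The sketch below is an illustration only: the DFA for { w | w ends in 01 } and the names `reverse_dfa` and `nfa_accepts` are assumptions, not part of the lecture. The reversed machine may have several start states, which is why it is an NFA.

```python
from itertools import product

def reverse_dfa(delta, start, accepts):
    """Reverse every arrow of a DFA and swap the roles of start and accept states.

    Returns an NFA (delta, starts, accepts) that may have several start states.
    """
    rev = {}
    for (q, a), r in delta.items():
        rev.setdefault((r, a), set()).add(q)
    return rev, set(accepts), {start}

def nfa_accepts(delta, starts, accepts, w):
    """Track the set of all states the NFA could be in while reading w."""
    current = set(starts)
    for a in w:
        nxt = set()
        for q in current:
            nxt |= delta.get((q, a), set())
        current = nxt
    return bool(current & accepts)

# Hypothetical DFA for { w | w ends in 01 }.
dfa = {
    ("s0", "0"): "s1", ("s0", "1"): "s0",
    ("s1", "0"): "s1", ("s1", "1"): "s2",
    ("s2", "0"): "s1", ("s2", "1"): "s0",
}
rev, starts, accepts = reverse_dfa(dfa, "s0", {"s2"})

# The reversed machine should accept exactly the strings that start with 10.
words = ["".join(p) for n in range(7) for p in product("01", repeat=n)]
agrees = all(nfa_accepts(rev, starts, accepts, w) == w.startswith("10") for w in words)
print(agrees)
```

Feeding the resulting NFA to a subset construction would yield the “normal” DFA promised by the theorem.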
Using NFAs in place of DFAs can make proofs about regular languages much easier! Remember this on homework/exams!
Regular Languages are closed under concatenation
Concatenation: A · B = { vw | v ∈ A and w ∈ B }
Given DFAs M1 and M2, connect the accept states of M1 to the start state of M2 with ε-transitions.
[Figure: DFAs M1 and M2, and the NFA N obtained by adding ε-transitions from the accept states of M1 to the start state of M2]
L(N) = L(M1) · L(M2)
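The concatenation construction can be sketched as follows. The two DFAs are hypothetical examples (M1 accepts strings ending in 1, M2 accepts 0*), and `concat_nfa`, `nfa_accepts`, and `EPS` are names assumed for this sketch. States are tagged with 1 or 2 so that the two machines stay disjoint.

```python
from itertools import product

EPS = ""  # label used for ε-transitions in this sketch

def eps_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, accepts, w):
    """Simulate the NFA by tracking every state it could be in."""
    current = eps_closure({start}, delta)
    for a in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, a), set())
        current = eps_closure(moved, delta)
    return bool(current & accepts)

def concat_nfa(dfa1, dfa2):
    """Each DFA is (delta, start, accepts) with delta[(q, a)] = r."""
    d1, s1, f1 = dfa1
    d2, s2, f2 = dfa2
    delta = {}
    for (q, a), r in d1.items():
        delta[((1, q), a)] = {(1, r)}
    for (q, a), r in d2.items():
        delta[((2, q), a)] = {(2, r)}
    for f in f1:
        delta[((1, f), EPS)] = {(2, s2)}  # accept state of M1 -> start of M2
    return delta, (1, s1), {(2, f) for f in f2}

# Hypothetical DFAs: M1 accepts strings ending in 1, M2 accepts 0*.
m1 = ({("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "A", ("B", "1"): "B"}, "A", {"B"})
m2 = ({("Z", "0"): "Z", ("Z", "1"): "D", ("D", "0"): "D", ("D", "1"): "D"}, "Z", {"Z"})
delta, start, accepts = concat_nfa(m1, m2)

# L(M1) · L(M2) is the set of strings containing at least one 1.
words = ["".join(p) for n in range(7) for p in product("01", repeat=n)]
agrees = all(nfa_accepts(delta, start, accepts, w) == ("1" in w) for w in words)
print(agrees)
```

Only the accept states of M2 remain accepting in N; the accept states of M1 merely offer the ε-jump into M2.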
Regular Languages are closed under star
A* = { s1 … sk | k ≥ 0 and each si ∈ A }
Let M be a DFA, and let L = L(M). We can construct an NFA N that recognizes L*.
[Figure: the DFA M extended with ε-transitions to form the NFA N]
Formally, the construction is:
Input: DFA M = (Q, Σ, δ, q1, F)
Output: NFA N = (Q′, Σ, δ′, q0, F′)
Q′ = Q ∪ {q0}
F′ = F ∪ {q0}
δ′(q, a) =
  {δ(q, a)} if q ∈ Q and a ≠ ε
  {q1} if q ∈ F and a = ε
  {q1} if q = q0 and a = ε
  ∅ if q = q0 and a ≠ ε
  ∅ else
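The case analysis for δ′ translates almost line by line into code. The sketch below assumes a hypothetical DFA M with L(M) = {01} and a fresh state named "q0" that does not occur in Q; `star_nfa`, `nfa_accepts`, and `EPS` are illustrative names. Missing dictionary entries play the role of ∅.

```python
import re
from itertools import product

EPS = ""  # label used for ε-transitions in this sketch

def eps_closure(states, delta):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, accepts, w):
    current = eps_closure({start}, delta)
    for a in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, a), set())
        current = eps_closure(moved, delta)
    return bool(current & accepts)

def star_nfa(delta, q1, accepts, q0="q0"):
    """Build N from M = (delta, q1, accepts); q0 must be a fresh state not in Q."""
    new_delta = {(q, a): {r} for (q, a), r in delta.items()}  # {δ(q,a)} for q in Q
    for f in accepts:
        new_delta[(f, EPS)] = {q1}   # accept states jump back to q1
    new_delta[(q0, EPS)] = {q1}      # new start state jumps to q1
    return new_delta, q0, set(accepts) | {q0}

# Hypothetical DFA M with L(M) = {01}; "dead" is a trap state.
m = {
    ("s0", "0"): "s1", ("s0", "1"): "dead",
    ("s1", "0"): "dead", ("s1", "1"): "s2",
    ("s2", "0"): "dead", ("s2", "1"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
delta, start, accepts = star_nfa(m, "s0", {"s2"})

# L(N) should equal (01)*.
words = ["".join(p) for n in range(7) for p in product("01", repeat=n)]
agrees = all(
    nfa_accepts(delta, start, accepts, w) == (re.fullmatch(r"(01)*", w) is not None)
    for w in words
)
print(agrees)
```

Making the new state q0 accepting is what lets N accept ε even when M does not.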
Regular Languages are Closed Under Star
• How would we prove that this NFA construction works?
• Want to show: L(N) = L*
• L(N) ⊇ L*
• L(N) ⊆ L*
1. L(N) ⊇ L*
Assume w = w1…wk is in L*, where w1,…,wk ∈ L. We show N accepts w by induction on k.
Base Cases: k = 0 (w = ε) and k = 1 (w ∈ L)
Inductive Step: Assume N accepts all strings v = v1…vk ∈ L*, vi ∈ L. Let u = u1…uk uk+1 ∈ L*, uj ∈ L. Since N accepts u1…uk (by induction) and M accepts uk+1, N also accepts u (by construction).
2. L(N) ⊆ L*
Assume w is accepted by N; we want to show w ∈ L*. If w = ε, then w ∈ L*.
I.H.: If N accepts u and takes at most k ε-transitions, then u ∈ L*.
• Let w be accepted by N with k+1 ε-transitions.
• Write w as w = uv, where v is the substring read after the last ε-transition.
Then u ∈ L(N), so by I.H. u ∈ L*. Also v ∈ L, since after the last ε-transition N follows only the transitions of M until it accepts. Therefore w = uv ∈ L*.
Closure Properties for Regular Languages
Union: A ∪ B = { w | w ∈ A or w ∈ B }
Intersection: A ∩ B = { w | w ∈ A and w ∈ B }
Complement: ¬A = { w ∈ Σ* | w ∉ A }
Reverse: A^R = { w1 …wk | wk …w1 ∈ A, wi ∈ Σ }
Concatenation: A · B = { vw | v ∈ A and w ∈ B }
Star: A* = { s1 … sk | k ≥ 0 and each si ∈ A }
Theorem: If A and B are regular, then so are A ∪ B, A ∩ B, ¬A, A^R, A · B, and A*.
Regular Expressions: Computation as simple, logical description
A totally different way of thinking about computation: What is the complexity of describing the strings in the language?
Inductive Definition of Regexp
Let Σ be an alphabet. We define the regular expressions over Σ inductively:
For all σ ∈ Σ, σ is a regexp
ε is a regexp
∅ is a regexp
If R1 and R2 are both regexps, then (R1R2), (R1 + R2), and (R1)* are regexps
Precedence Order: * then · then +
Example: R1*R2 + R3 = ( ( R1* ) · R2 ) + R3
Definition: Regexps Represent Languages
The regexp σ ∈ Σ represents the language {σ}
The regexp ε represents {ε}
The regexp ∅ represents ∅
If R1 and R2 are regular expressions representing L1 and L2 then:
(R1R2) represents L1 · L2
(R1 + R2) represents L1 ∪ L2
(R1)* represents L1*
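The inductive definition can be read as a recursive function. The sketch below is an illustration under assumptions: regexps are encoded as nested tuples (an encoding invented for this example), and since languages may be infinite, `language(r, n)` returns only the strings of length at most n.

```python
def language(r, n):
    """Return the strings of length <= n in the language represented by r.

    Assumed encoding: ("sym", c), ("eps",), ("empty",),
    ("cat", r1, r2), ("alt", r1, r2), ("star", r1).
    """
    kind = r[0]
    if kind == "sym":
        return {r[1]} if n >= 1 else set()
    if kind == "eps":
        return {""}
    if kind == "empty":
        return set()
    if kind == "alt":
        return language(r[1], n) | language(r[2], n)
    if kind == "cat":
        return {v + w
                for v in language(r[1], n)
                for w in language(r[2], n)
                if len(v + w) <= n}
    if kind == "star":
        base = language(r[1], n) - {""}  # ε pieces add nothing new
        result, frontier = {""}, {""}
        while frontier:
            frontier = {v + w for v in frontier for w in base
                        if len(v + w) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown regexp kind: {kind}")

# (01)*0, written in the assumed tuple encoding
r = ("cat", ("star", ("cat", ("sym", "0"), ("sym", "1"))), ("sym", "0"))
print(sorted(language(r, 5)))

# The star of the empty language still contains ε.
print(language(("star", ("empty",)), 3))
```

The star case terminates because every non-empty piece strictly lengthens the string and lengths are capped at n.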
Regexps Represent Languages
For every regexp R, define L(R) to be the language that R represents.
A string w ∈ Σ* is accepted by R (or, w matches R) if w ∈ L(R)
Examples: 0, 010, and 01010 match (01)*0
110101110100100 matches (0+1)*0
Assume Σ = {0,1} { w | w has exactly a single 1 } 0*10* { w | w contains 001 } (0+1)*001(0+1)*
Assume Σ = {0,1} What language does the regexp ∅* represent? {ε}
Assume Σ = {0,1} { w | w has length ≥ 3 and its 3rd symbol is 0 } (0+1)(0+1)0(0+1)*
Assume Σ = {0,1} { w | every odd position in w is a 1 } (1(0 + 1))*(1 + ε)
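The regexps on these example slides can be checked against their set descriptions by brute force. The sketch below uses Python's `re` module as a stand-in matcher; the helper `to_python` (an assumed name) translates the slides' `+` into `|` and `ε` into the empty pattern. It checks all strings up to length 8 only, so it is a sanity check rather than a proof.

```python
import re
from itertools import product

def to_python(regexp):
    """Translate slide notation: '+' (union) -> '|', 'ε' -> empty pattern."""
    return regexp.replace("+", "|").replace("ε", "").replace(" ", "")

# (regexp in slide notation, predicate describing the intended language)
examples = [
    ("0*10*", lambda w: w.count("1") == 1),
    ("(0+1)*001(0+1)*", lambda w: "001" in w),
    ("(0+1)(0+1)0(0+1)*", lambda w: len(w) >= 3 and w[2] == "0"),
    # odd positions 1, 3, 5, ... are indices 0, 2, 4, ...
    ("(1(0 + 1))*(1 + ε)", lambda w: all(c == "1" for c in w[0::2])),
]

words = ["".join(p) for n in range(9) for p in product("01", repeat=n)]
results = {
    regexp: all(
        (re.fullmatch(to_python(regexp), w) is not None) == pred(w)
        for w in words
    )
    for regexp, pred in examples
}
for regexp, ok in results.items():
    print(regexp, ok)
```

`re.fullmatch` is used because a regexp in the sense of the slides must describe the whole string, not just a substring.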
Assume Σ = {0,1}
{ w | w has equal number of occurrences of 01 and 10 } = { w | w = 1, w = 0, or w = ε, or w starts with a 0 and ends with a 0, or w starts with a 1 and ends with a 1 }
Claim: A string w has equal occurrences of 01 and 10 ⇔ w starts and ends with the same bit.
1 + 0 + ε + 0(0+1)*0 + 1(0+1)*1
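The claim can be sanity-checked by enumerating short strings. In the sketch below, `occurrences` is an assumed helper name, and the regexp is transcribed into Python's `re` syntax with `|` for union and an empty alternative for ε. Checking strings up to length 10 does not prove the claim, but a counterexample would show up quickly.

```python
import re
from itertools import product

def occurrences(w, pat):
    """Count (possibly overlapping) occurrences of a 2-symbol pattern in w."""
    return sum(w[i:i + 2] == pat for i in range(len(w) - 1))

# 1 + 0 + ε + 0(0+1)*0 + 1(0+1)*1 in Python's syntax
regexp = re.compile(r"1|0||0(0|1)*0|1(0|1)*1")

words = ["".join(p) for n in range(11) for p in product("01", repeat=n)]
agrees = all(
    (occurrences(w, "01") == occurrences(w, "10")) == (regexp.fullmatch(w) is not None)
    for w in words
)
print(agrees)
```

Intuitively, each 01 is a switch from 0 to 1 and each 10 is a switch back, so the two counts differ by at most one and are equal exactly when the string ends on the bit it started with.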
DFAs ≡ NFAs ≡ Regular Expressions!
L can be represented by some regexp ⇔ L is regular
L can be represented by some regexp ⇒ L is regular
L can be represented by some regexp ⇒ L is regular
Given any regexp R, we will construct an NFA N s.t. N accepts exactly the strings accepted by R.
Proof by induction on the length of the regexp R.
Base Cases (R has length 1): R = σ, R = ε, R = ∅
Induction Step: Suppose every regexp of length < k represents some regular language. Consider a regexp R of length k > 1.
Three possibilities for R:
R = R1 + R2: By induction, R1 and R2 represent some regular languages, L1 and L2. But L(R) = L(R1 + R2) = L1 ∪ L2, so L(R) is regular, by the union theorem.
R = R1 R2: By induction, R1 and R2 represent some regular languages, L1 and L2. But L(R) = L(R1 · R2) = L1 · L2, so L(R) is regular, by the concatenation theorem.
R = (R1)*: By induction, R1 represents some regular language L1. But L(R) = L(R1*) = L1*, so L(R) is regular, by the star theorem.
Therefore: If L is represented by a regexp, then L is regular.
Give an NFA that accepts the language represented by (1(0 + 1))*
[Figure: NFA for the regular expression (1(0+1))*, with arrows labeled 1 and 0,1 and ε-transitions for the star]
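One way to build such an NFA mechanically is to follow the induction, gluing small NFAs together with ε-transitions. The sketch below uses a Thompson-style construction, which is a standard variant and may differ in detail from the machine drawn on the slide; all function names and the integer state names are assumptions of this sketch. Each NFA is a triple (transitions, start state, single accept state).

```python
import re
from itertools import count, product

EPS = ""          # label used for ε-transitions in this sketch
fresh = count()   # supplies unused integer state names

def sym(c):
    s, f = next(fresh), next(fresh)
    return {(s, c): {f}}, s, f

def union(n1, n2):
    (d1, s1, f1), (d2, s2, f2) = n1, n2
    s, f = next(fresh), next(fresh)
    d = {**d1, **d2, (s, EPS): {s1, s2}, (f1, EPS): {f}, (f2, EPS): {f}}
    return d, s, f

def concat(n1, n2):
    (d1, s1, f1), (d2, s2, f2) = n1, n2
    return {**d1, **d2, (f1, EPS): {s2}}, s1, f2

def star(n1):
    d1, s1, f1 = n1
    s, f = next(fresh), next(fresh)
    return {**d1, (s, EPS): {s1, f}, (f1, EPS): {s1, f}}, s, f

def accepts(nfa, w):
    d, s, f = nfa
    def close(states):
        closure, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for r in d.get((q, EPS), set()) - closure:
                closure.add(r)
                stack.append(r)
        return closure
    current = close({s})
    for a in w:
        current = close(set().union(*(d.get((q, a), set()) for q in current)))
    return f in current

# (1(0+1))* assembled from its subexpressions
nfa = star(concat(sym("1"), union(sym("0"), sym("1"))))

words = ["".join(p) for n in range(7) for p in product("01", repeat=n)]
agrees = all(
    accepts(nfa, w) == (re.fullmatch(r"(1(0|1))*", w) is not None) for w in words
)
print(agrees)
```

The construction keeps one invariant: an accept state never has outgoing arrows, so adding an ε-transition from it cannot overwrite an existing entry.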
Generalized NFAs (GNFA)
L can be represented by a regexp ⇐ L is a regular language
Idea: Transform an NFA for L into a regular expression by removing states and re-labeling the arcs with regular expressions.
Rather than reading in just letters from the string on a step, we can read in entire substrings.
Generalized NFA (GNFA) Is aaabcbcba accepted or rejected? Is bba accepted or rejected? Is bcba accepted or rejected? This GNFA recognizes L(a*b(cb)*a)
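Taking the slide's statement that this GNFA recognizes L(a*b(cb)*a) as given, the three questions can be answered by matching each string against that regexp. The sketch below uses Python's `re` module as the matcher; the variable names are illustrative.

```python
import re

# The language the slide says the GNFA recognizes.
gnfa_language = re.compile(r"a*b(cb)*a")

verdicts = {
    w: ("accepted" if gnfa_language.fullmatch(w) else "rejected")
    for w in ["aaabcbcba", "bba", "bcba"]
}
for w, verdict in verdicts.items():
    print(w, verdict)
```

For instance, aaabcbcba splits as aaa · b · cbcb · a, whereas in bba the second b can be read neither as part of (cb)* nor as the final a.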
Add unique start and accept states to the NFA
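While the machine has more than 2 states: Pick an internal state, rip it out and re-label the arrows with regexps, to account for paths through the missing state.
[Figure: ripping out a state with incoming arrow 0, self-loop 1, and outgoing arrow 0 leaves a single arrow labeled 01*0]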
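While the machine has more than 2 states:
In general: when q2 is ripped out, the arrow from q1 to q3 is re-labeled R(q1,q2)R(q2,q2)*R(q2,q3) + R(q1,q3)
[Figure: GNFA G with states q1, q2, q3 and arrows R(q1,q2), R(q2,q2), R(q2,q3), R(q1,q3)]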
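[Figure: NFA N with new start state q0 and new accept state q3 added by ε-arrows; ripping out q1 and then q2 leaves the labels a*b and then (a*b)(a+b)*]
R(q0,q3) = (a*b)(a+b)* represents L(N)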
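The rip-out rule R(q1,q2)R(q2,q2)*R(q2,q3) + R(q1,q3) can be applied mechanically. The sketch below is illustrative: the arrows in `R` are an assumed reading of the lost figure (q0 →ε q1, a loop a on q1, q1 →b q2, a loop a+b on q2, q2 →ε q3), `None` stands for ∅ (no arrow), and the helpers `cat`, `alt`, `star`, and `rip` are hypothetical names. The output is heavily parenthesized rather than simplified.

```python
import re
from itertools import product

def cat(r, s):
    """Concatenate two labels; None stands for ∅ (no arrow)."""
    if r is None or s is None:
        return None
    if r == "ε":
        return s
    if s == "ε":
        return r
    return f"({r})({s})"

def alt(r, s):
    """Union of two labels."""
    if r is None:
        return s
    if s is None:
        return r
    return f"({r})+({s})"

def star(r):
    """Star of a label; ∅* and ε* are both ε."""
    return "ε" if r in (None, "ε") else f"({r})*"

def rip(R, states, q2):
    """Remove q2 and relabel each remaining arrow q1 -> q3."""
    rest = [q for q in states if q != q2]
    new_R = {}
    for q1 in rest:
        for q3 in rest:
            through = cat(cat(R.get((q1, q2)), star(R.get((q2, q2)))), R.get((q2, q3)))
            label = alt(through, R.get((q1, q3)))
            if label is not None:
                new_R[(q1, q3)] = label
    return new_R, rest

# Assumed arrows of the GNFA (the figure itself is not available).
R = {("q0", "q1"): "ε", ("q1", "q1"): "a", ("q1", "q2"): "b",
     ("q2", "q2"): "a+b", ("q2", "q3"): "ε"}
states = ["q0", "q1", "q2", "q3"]
for q in ("q1", "q2"):
    R, states = rip(R, states, q)
result = R[("q0", "q3")]
print(result)

# Compare with (a*b)(a+b)* on all short strings, using Python's re syntax.
python_re = result.replace("+", "|").replace("ε", "")
words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
agrees = all(
    (re.fullmatch(python_re, w) is not None)
    == (re.fullmatch(r"a*b(a|b)*", w) is not None)
    for w in words
)
print(agrees)
```

Because the start state has no incoming arrows and the accept state has no outgoing ones, only the single arrow from q0 to q3 survives once every internal state is gone.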
[Diagram: DFAs, NFAs, and Regular Expressions all describe the Regular Languages; the arrow from DFAs is labeled DEFINITION]
Parting thoughts:
Regular Languages can be defined by their closure properties.
NFA = DFA: does it mean that non-determinism is free for Finite Automata?
Questions?