CS 3240 – Chapter 4 Properties of Regular Languages
Topics • Closure Properties • Algorithms for Elementary Questions: • Is a given word, w, in L? • Is L empty, finite or infinite? • Are L1 and L2 the same set? • Detecting non-regular languages CS 3240 - Properties of Regular Languages
Closure Properties • Closure of operations • If x and y are in the same set, is x op y also? • Example: The integers are closed under addition • They are not closed under division • Regular languages are closed under every operation considered in this chapter! • Including the typical set operations
Regular Operations • Regular languages are closed under: • Kleene Star (*) • Union (+) • Concatenation (xy) • (By definition!) • They are also closed under: • Complement (swap the accepting and non-accepting states of a DFA ✓) • Intersection • Set difference • Reversal (already proved in homework #12, 2.3 ✓)
Closure Under Intersection • Proof from set theory: • L1 ∩ L2 = (L1’ ∪ L2’)’ • Since regular languages are closed under complement and union, they must be closed under intersection also! QED
Union of Complements: And Complement the Result • Note how the intersection is never shaded • L1’ ∪ L2’ shades everything but where they overlap • Therefore, (L1’ ∪ L2’)’ is the overlap (intersection)
Set Difference: A Simple Proof • A – B: • Everything that is in A but not in B • A – B = A ∩ B’ • We have already shown that regular languages are closed under intersection and complement. QED
Computing the Union by Machine: By “combining” the machines • Start with a composite start state: • Consisting of the two start states • Follow all out-edges simultaneously • As we did for NFA-to-DFA conversion • For union, a state containing any original final state is a final state in the result • Because at least one of the machines accepts there • For intersection, a state containing an original final state from each original machine is a final state in the result • Because both of the machines accept there • How would you construct the difference machine?
[Figure: transition diagrams of the two example machines, Double-a (states -x1, x2, +x3) and EVEN-EVEN, with edges labeled a and b]
For union: assign accepting states where any original xi or yi accepts. For intersection: assign accepting states only where both the original xi and yi accept simultaneously. No need to compute (L1’ ∪ L2’)’! For the difference L1 – L2: assign accepting states where xi accepts and yi does not.
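The combined-machine construction can be sketched in Python as follows. This is only an illustration under stated assumptions: the `(start, accepting, delta)` triple, the function names, the state names y1 to y4, and both transition tables are hypothetical reconstructions of Double-a (words containing aa) and EVEN-EVEN (even number of a’s and of b’s), not the exact machines drawn on the slides.

```python
def product_dfa(d1, d2, mode):
    """Combine two complete DFAs over the same alphabet.

    A DFA is a triple (start, accepting, delta), where
    delta[(state, symbol)] gives the next state.
    """
    (s1, f1, t1), (s2, f2, t2) = d1, d2
    alphabet = {symbol for (_, symbol) in t1}
    start = (s1, s2)                      # composite start state
    delta, seen, todo = {}, {start}, [start]
    while todo:                           # follow all out-edges simultaneously
        q1, q2 = todo.pop()
        for symbol in alphabet:
            nxt = (t1[(q1, symbol)], t2[(q2, symbol)])
            delta[((q1, q2), symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # The three operations differ only in which composite states accept.
    rule = {
        "union": lambda a, b: a or b,
        "intersection": lambda a, b: a and b,
        "difference": lambda a, b: a and not b,
    }[mode]
    accepting = {(q1, q2) for (q1, q2) in seen if rule(q1 in f1, q2 in f2)}
    return start, accepting, delta


def accepts(dfa, word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state, accepting, delta = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# Assumed transition tables (hypothetical reconstructions).
DOUBLE_A = ("x1", {"x3"}, {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
})
EVEN_EVEN = ("y1", {"y1"}, {
    ("y1", "a"): "y2", ("y1", "b"): "y3",
    ("y2", "a"): "y1", ("y2", "b"): "y4",
    ("y3", "a"): "y4", ("y3", "b"): "y1",
    ("y4", "a"): "y3", ("y4", "b"): "y2",
})

both = product_dfa(DOUBLE_A, EVEN_EVEN, "intersection")
print(accepts(both, "aabb"))
```

Only the accepting-state rule changes between the three operations; the states and transitions of the combined machine are identical in each case.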
The resulting machine… [Figure: transition diagram of the combined machine, with edges labeled a and b]
The Membership Problem: Section 4.2 • Given a word w, and a regular language, L, can we answer the question: • Is w ∊ L? • You tell me…
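One way to answer the membership question is simply to run w through a DFA for L. The sketch below assumes the DFA is given as a start state, a set of accepting states, and a transition dictionary; the name `is_member` and the sample table (a machine for words containing aa) are hypothetical.

```python
def is_member(start, accepting, delta, word):
    """Return True if the DFA accepts word.

    delta maps (state, symbol) to the next state. A missing entry is
    treated as a transition to an implicit dead state.
    """
    state = start
    for symbol in word:
        if (state, symbol) not in delta:
            return False
        state = delta[(state, symbol)]
    return state in accepting


# Assumed example machine: accepts words over {a, b} that contain "aa".
DELTA = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}

print(is_member("x1", {"x3"}, DELTA, "baab"))
print(is_member("x1", {"x3"}, DELTA, "abab"))
```

The loop takes one step per character of w, so membership is decided in time proportional to |w|.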
Is L Empty? • A graph theory problem: • Find a path from the start to a final state in the associated FA • Algorithm: • “Mark” the start state • Repeat: mark any state with an incoming edge from a previously marked state • Until an accepting state is marked or no new states were marked at all • L is empty iff no accepting state was ever marked
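The marking algorithm might look like this in Python. It is a minimal sketch; the representation (a transition dictionary keyed by `(state, symbol)`) and the names `is_empty` and `DELTA` are assumptions made for illustration.

```python
def is_empty(start, accepting, delta):
    """Return True if no accepting state is reachable from start."""
    marked = {start}                      # "mark" the start state
    changed = True
    while changed:
        if marked & accepting:            # an accepting state is marked
            return False
        changed = False
        for (src, _symbol), dst in delta.items():
            if src in marked and dst not in marked:
                marked.add(dst)           # edge from a marked state
                changed = True
    return not (marked & accepting)


# Assumed example machine: accepts words over {a, b} that contain "aa".
DELTA = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}

print(is_empty("x1", {"x3"}, DELTA))
```

Each pass over the transitions either marks a new state or stops, so the loop runs at most once per state plus one final pass.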
Another Solution: To See if L is Non-empty • Attempt to convert the associated FA to a regular expression • By the state bypass and elimination algorithm • If you get a regular expression, then a string is accepted
Yet Another Approach: To See if L is Non-empty (by computer) • Suppose a minimal machine, M, for the language L has p states • If M accepts any non-empty words at all, it must accept one of length ≤ p • Why? • So… • Systematically try all possible strings in Σ* of length 1 through p. If none are accepted, then no non-empty strings at all are in L.
Is L Finite or Infinite? • Convert its machine to a regular expression • It is infinite iff it has a star (applied to a subexpression that matches at least one non-empty string) • Another way: • A language is infinite iff there is a cycle in an accepting path • A (tedious) graph theory problem
An Observation: About Infinite Regular Languages • Suppose L’s minimal machine, M, has p states • Any path of length p has (or is) a cycle • And any cycle must have or be a cycle of length p or less • Because a state is revisited after at most p characters • So, infinite languages have a machine with at least one cycle of length p or less in an accepting path • And all non-empty languages have a string of length p or less (already showed that)…
Finishing the Reasoning: About Detecting Infinite Languages – A third way • Let m denote the length of a cycle in an accepting path • We know m ≤ p • Let k be the length of a string in L such that k ≤ p • There has to be one if the language is infinite! • Then strings of length k + im are accepted, i ≥ 0 • By traversing the cycle i times • These lengths start at k ≤ p and grow in steps of m ≤ p • So the first one to reach p satisfies p ≤ k + im < p + m ≤ 2p • Procedure: Test all strings of length p through 2p-1; L is infinite iff one of them is accepted
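The procedure can be tried out with a brute-force sketch like the one below. It assumes a complete DFA given as a transition dictionary and takes p to be the number of states of that DFA; the function name and both example machines are hypothetical. The search is exponential in p, so this is only practical for small machines.

```python
from itertools import product


def is_infinite(start, accepting, delta, alphabet):
    """Test every string of length p through 2p-1 on a complete DFA."""
    states = {start} | {q for (q, _) in delta} | set(delta.values())
    p = len(states)

    def accepts(word):
        state = start
        for symbol in word:
            state = delta[(state, symbol)]
        return state in accepting

    for length in range(p, 2 * p):        # lengths p, p+1, ..., 2p-1
        for word in product(alphabet, repeat=length):
            if accepts(word):
                return True               # a long accepted string forces a cycle
    return False


# Assumed example 1: words over {a, b} containing "aa" (infinite).
DOUBLE_A = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}
# Assumed example 2: the single word "ab" (finite), with an explicit dead state.
ONLY_AB = {
    ("q0", "a"): "q1", ("q0", "b"): "dead",
    ("q1", "a"): "dead", ("q1", "b"): "q2",
    ("q2", "a"): "dead", ("q2", "b"): "dead",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}

print(is_infinite("x1", {"x3"}, DOUBLE_A, "ab"))
print(is_infinite("q0", {"q2"}, ONLY_AB, "ab"))
```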
Is L1 = L2? • That is, are they the same set of strings? • Set-theoretic argument: • Two sets are equal iff their symmetric difference is empty (denoted by A ∆ B or A ⊖ B) • A ∆ B = (A ∪ B) – (A ∩ B) = (A – B) ∪ (B – A) • But A – B = A ∩ B’, and B – A = B ∩ A’ • So L1 = L2 iff (L1 ∩ L2’) ∪ (L1’ ∩ L2) = ∅
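A possible way to mechanize this test is to walk the combined machine for the symmetric difference and look for a reachable state where exactly one of the two machines accepts. The sketch below assumes complete DFAs given as `(start, accepting, delta)` triples; the function name and the three example machines are hypothetical.

```python
def equivalent(d1, d2, alphabet):
    """Return True iff the symmetric difference of the two languages is empty."""
    (s1, f1, t1), (s2, f2, t2) = d1, d2
    start = (s1, s2)
    seen, todo = {start}, [start]
    while todo:
        q1, q2 = todo.pop()
        if (q1 in f1) != (q2 in f2):      # exactly one machine accepts here
            return False
        for symbol in alphabet:
            nxt = (t1[(q1, symbol)], t2[(q2, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


# Assumed examples over {a, b}.
# EVEN_A: even number of a's, two states.
EVEN_A = ("e", {"e"}, {
    ("e", "a"): "o", ("o", "a"): "e",
    ("e", "b"): "e", ("o", "b"): "o",
})
# Counts a's modulo 4; b leaves the count unchanged.
MOD4_DELTA = {(i, s): ((i + 1) % 4 if s == "a" else i)
              for i in range(4) for s in "ab"}
EVEN_A_BIG = (0, {0, 2}, MOD4_DELTA)      # same language, four states
MULT4_A = (0, {0}, MOD4_DELTA)            # number of a's divisible by 4

print(equivalent(EVEN_A, EVEN_A_BIG, "ab"))
print(equivalent(EVEN_A, MULT4_A, "ab"))
```

This combines the combined-machine construction with the emptiness test: the symmetric-difference machine is never built explicitly, only searched.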
Non-Regular Languages: Section 4.3 • Not all languages are regular • We need to recognize whether languages are regular or not • We don’t want to waste time using regular language processing techniques where they don’t apply
[Figure slides: ab; ab + aabb; ab + aabb + aaabbb]
Recognizing Non-Regular Languages • Consider aⁿbⁿ • ab is regular • ab + aabb = aⁿbⁿ, 1 ≤ n ≤ 2, is regular • Any finite language is regular (why?) • But aⁿbⁿ, n ≥ 0 is not regular (why not?) • How do we prove it’s not regular!?!
An Observation • Finite Automata don’t have unlimited counting capability • They only have a fixed number of states • Intuitively, we see that an automaton can’t keep track of counts for aⁿbⁿ where n is arbitrarily large • But intuition is often faulty. We need a proof!
About Infinite Regular Languages: Redux • Any accepted string of length p (the number of states) or greater forces a cycle in an accepting path. • In other words, at least one state is visited a second time • And that “revisit” must happen within the first p characters of the string • Because that’s when the (p+1)th state is entered • This could be any state (start, final, other)
aⁿbⁿ is Not Regular: Proof by Contradiction • Consider aᵏbᵏ, where k is greater than the number of states in a supposed DFA accepting all aⁿbⁿ, n ≥ 0 • Before the first b is encountered, a state has been visited at least twice (because there are more a’s than states) • Suppose the length of the associated cycle is m • Then the string aᵏ⁺ⁱᵐbᵏ is also accepted for every i ≥ 1, even though it has more a’s than b’s! • This contradicts the existence of a DFA that accepts exactly aⁿbⁿ
“Revisiting” a State [Figure: the first “revisit”]
The Pumping Lemma: For Regular Languages • For every infinite regular language, L, there is a number, p, such that for all strings, s, in L, where |s| ≥ p, you can partition s into three concatenated substrings, xyz, such that: • |y| > 0 • |xy| ≤ p • xyⁱz ∈ L for every i ≥ 0 (that is, xy*z ⊆ L)
Using the Pumping Lemma: Regular ⇒ Pumpable ≡ ¬Pumpable ⇒ ¬Regular • You can only use the pumping lemma to show that a language is not regular • By showing it fails the “pumping” conditions of infinite regular languages • Note: Some non-regular languages pump! • The trick is to find a convenient string • Usually the condition |xy| ≤ p is also key • Sometimes pumping down (i = 0) is easiest
Using the Pumping Lemma on aⁿbⁿ • Consider the string aᵖbᵖ • It is in this language • It is long enough (≥ p in length) • Now let aᵖbᵖ = xyz • Remember |xy| ≤ p • What can you conclude about y?
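For a concrete value of p, the argument can be checked by enumerating every partition xyz of aᵖbᵖ with |xy| ≤ p and |y| > 0. The sketch below is an experiment for one fixed p, not a proof for all p; the function names are hypothetical.

```python
def in_anbn(s):
    """Return True if s has the form a^n b^n for some n >= 0."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def every_split_fails(p, i):
    """Check that x y^i z leaves the language for every legal split of a^p b^p."""
    s = "a" * p + "b" * p
    for xy_len in range(1, p + 1):            # |xy| <= p
        for y_len in range(1, xy_len + 1):    # |y| > 0
            x = s[:xy_len - y_len]
            y = s[xy_len - y_len:xy_len]      # lies inside the a's, so y is all a's
            z = s[xy_len:]
            if in_anbn(x + y * i + z):
                return False
    return True


print(every_split_fails(6, 2))   # pumping up
print(every_split_fails(6, 0))   # pumping down
```

Because |xy| ≤ p, y always falls within the leading a’s, so pumping changes the number of a’s while leaving the number of b’s alone.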
Playing Games • You can treat proving a language non-regular as a “game”: • You pick a string, s, in L, where |s| ≥ p • You may pick any such string; choose wisely! • Opponent picks x, y, and z • But must obey |xy| ≤ p and |y| > 0 • You show it can’t be “pumped” • Because a pumped string falls “outside” the language • Must anticipate all possible partitions xyz
Some Non-regular Languages: All require arbitrary counting capability • aⁱbʲ, i > j • PALINDROME • w = wᴿ (same backwards and forwards) • ww • Equal halves • PRIME (aᵐ where m is prime) • SQUARE (aᵐ where m is a perfect square)
Using Closure Properties • Strings with equal number of a’s and b’s • NOTPRIME
A Pumpable Non-regular Language • NOTPRIME is pumpable (at least when the |xy| ≤ p condition is set aside)! • Let y = the whole string (aᵏᵐ, with k, m > 1) • Pumping gives aⁱᵏᵐ: the number of a’s will always be a multiple of km, hence not prime • Note: zero is not a prime number • This does not violate the pumping lemma • The pumping lemma draws no conclusion about non-regular languages
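The claim about NOTPRIME can be spot-checked numerically. The sketch below takes y to be the whole string, so x and z are empty and the |xy| ≤ p condition is not enforced; the function names and the sample values k = 3, m = 5 are assumptions for illustration.

```python
def is_prime(n):
    """Trial-division primality test; 0 and 1 are not prime."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def pumped_lengths(k, m, max_i):
    """Lengths of y^i for y = a^(k*m), with i running from 0 to max_i."""
    return [i * k * m for i in range(max_i + 1)]


# With k, m > 1, every pumped length is 0 or a multiple of the composite k*m.
lengths = pumped_lengths(3, 5, 50)
print(any(is_prime(n) for n in lengths))
```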