Week 13 - Thursday CS322
Last time • What did we talk about last time? • Regular expressions • Introduction to finite state automata
Logical warmup • You are paddling a canoe around a perfectly circular pond • Enjoying yourself immensely, you fail to notice that a goblin has crept up to the shore • You remember four things from your old lessons on goblin lore • Goblins can't swim • Goblins are always hungry for human flesh • Goblins can run four times as fast as people can paddle canoes • People can run faster than goblins • The goblin always assumes that you are making for the closest point on the shore and is always trying to cut you off • If you can get to the shore, you can escape, provided that the goblin isn't waiting for you (you need a little margin) • What's your escape strategy?
Examples • Let Σ = {0, 1} • Find regular expressions for the following languages: • The language of all strings of 0's and 1's that have even length and in which the 0's and 1's alternate • The language consisting of all strings of 0's and 1's with an even number of 1's • The language consisting of all strings of 0's and 1's that do not contain two consecutive 1's • The language that gives all binary numbers written in normal form (that is, without leading zeroes, and the empty string is not allowed)
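One way to check an answer to these exercises is to compare a candidate regular expression against a direct definition of the language on every short string. The sketch below uses Python's `re` module. The four patterns are one possible set of answers (not necessarily the ones intended in lecture), and the names `CANDIDATES`, `REFERENCE`, and `disagreements` are made up for this illustration. It also assumes that the single string 0 counts as a binary number in normal form.

```python
import re
from itertools import product

# Candidate regular expressions: one possible answer per exercise.
CANDIDATES = {
    "even_alternating": r"(01)*|(10)*",
    "even_ones": r"0*(10*10*)*",
    "no_11": r"(0|10)*1?",
    "binary_normal": r"0|1[01]*",  # assumption: "0" itself is allowed
}

# Direct definitions of the same languages, used as a reference.
REFERENCE = {
    "even_alternating": lambda w: len(w) % 2 == 0
    and all(a != b for a, b in zip(w, w[1:])),
    "even_ones": lambda w: w.count("1") % 2 == 0,
    "no_11": lambda w: "11" not in w,
    "binary_normal": lambda w: w == "0" or w.startswith("1"),
}


def disagreements(name, max_len=8):
    """List the strings up to max_len where regex and definition differ."""
    pattern = re.compile(CANDIDATES[name])
    check = REFERENCE[name]
    bad = []
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            if (pattern.fullmatch(w) is not None) != check(w):
                bad.append(w)
    return bad


if __name__ == "__main__":
    for name in CANDIDATES:
        print(name, disagreements(name))
```

An empty list for each name means the candidate agrees with the definition on all strings of length 8 or less. That is evidence, not a proof.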
Practical notation • Regular expressions are used in some programming languages (notably Perl) and in grep and other find-and-replace tools • The notation is generally extended to make it a little easier, as in the following: • [A-C] means any character in that range, that is, (A | B | C) • [0-9] means (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9) • [ABC] means (A | B | C) • ABC means the concatenation of A, B, and C • A dot stands for any single character: A.C could match AxC, A&C, ABC • ^ at the start of a bracket means NOT, thus [^D-Z] means any character other than D through Z • Repetitions: • R? means 0 or 1 repetitions of R • R* means 0 or more repetitions of R • R+ means 1 or more repetitions of R • Notations vary and have considerable complexity • Use this notation to describe the regular expression for legal C++ identifiers
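As a rough sketch of the identifier exercise, the pattern below says "a letter or underscore, followed by any number of letters, digits, or underscores". It is written for Python's `re` module and is a simplification: it assumes ASCII identifiers only and does not exclude C++ keywords or reserved names. The name `is_identifier` is made up for the example.

```python
import re

# Simplified rule: first character is a letter or underscore,
# the rest are letters, digits, or underscores.
# Assumption: ASCII only; keywords such as "while" are not rejected.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    """Return True if the whole string matches the simplified rule."""
    return IDENTIFIER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["count2", "_tmp", "2count", "a-b", ""]:
        print(repr(candidate), is_identifier(candidate))
```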
Finite-State Automata Student Lecture
Finite-state automaton • A finite-state automaton is an idealized machine composed of five objects: • A finite set I, called the input alphabet, of input symbols • A finite set S of states the automaton can be in • A designated state s0 called the initial state • A designated set of states called the set of accepting states • A next-state function N: S × I → S that maps a current state and the current input symbol to the next state
Transition diagram • FSA's are often described with a state transition diagram • The starting state has an arrow pointing to it • The accepting states are marked with double circles • Each rule is represented by a labeled transition arrow • The following FSA represents a vending machine • [Figure: transition diagram with states 0¢, 25¢, 50¢, 75¢, and $1, and transitions labeled quarter and half-dollar]
FSA example • Consider this FSA: • What are its states? • What are its input symbols? • What is the initial state of A? • What are the accepting states of A? • What is N(s1, 1)? • What's a verbal description for the strings accepted? • [Figure: transition diagram with states s0, s1, s2 and transitions labeled 0 and 1]
Annotated next-state tables • Consider the same FSA: • We can also describe an FSA using an annotated next-state table • A next-state table shows what the transition is for each state for all possible input • An annotated next-state table also marks the initial state and accepting states • Find the annotated next-state table for this FSA • [Figure: transition diagram with states s0, s1, s2 and transitions labeled 0 and 1]
Table to transition diagram • Consider the following annotated next-state table (one symbol marks the initial state and another marks the accepting states): • [Table: annotated next-state table] • Draw the corresponding transition state diagram
FSA example • Consider this FSA again: • Which state will be reached on the following inputs: • 01 • 0011 • 0101100 • 10101 • [Figure: transition diagram with states s0, s1, s2 and transitions labeled 0 and 1]
Eventual-state function • Let A be an FSA with a set of states S, set of input symbols I, and next-state function N: S × I → S • Let I* be the set of all strings over I • The eventual-state function N*: S × I* → S is the following • N*(s, w) = the state that A goes to if the symbols of w are input to A in sequence, starting with A in state s • All of this is just a notational convenience so that we have a way of talking about the state that a string will transition an FSA to • We say that w is accepted by A iff N*(s0, w) is an accepting state of A • The language of A is L(A) = { w ∈ I* | w is accepted by A }
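The eventual-state function is just the next-state function applied once per input symbol. A minimal sketch in Python follows. The two-state automaton here is hypothetical (it is not one of the lecture's diagrams) and accepts strings of 0's and 1's that end in 1. The names `NEXT`, `eventual_state`, and `accepts` are chosen for this example only.

```python
from functools import reduce

# Hypothetical automaton, for illustration only:
# it accepts the strings over {0, 1} that end in 1.
INITIAL = "s0"
ACCEPTING = {"s1"}
NEXT = {  # the next-state function N as a lookup table
    ("s0", "0"): "s0", ("s0", "1"): "s1",
    ("s1", "0"): "s0", ("s1", "1"): "s1",
}


def eventual_state(state, word):
    """N*(state, word): apply N to each symbol of word in sequence."""
    return reduce(lambda s, symbol: NEXT[(s, symbol)], word, state)


def accepts(word):
    """w is accepted iff N*(s0, w) is an accepting state."""
    return eventual_state(INITIAL, word) in ACCEPTING


if __name__ == "__main__":
    for w in ["", "1", "10", "0101"]:
        print(repr(w), eventual_state(INITIAL, w), accepts(w))
```

Note that N*(s, w) for the empty string w is s itself, which the fold handles without a special case.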
Designing automata • Design a finite-state automaton that accepts the set of all strings of 0's and 1's such that the number of 1's in the string is divisible by 3 • Make a regular expression for this language • Design a finite-state automaton that accepts the set of all strings of 0's and 1's that contain exactly one 1 • Make a regular expression for this language
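One possible design for each exercise can be written as a next-state table and compared with a candidate regular expression on all short strings. In the sketch below the state names (`r0`, `r1`, `r2`, `none`, `one`, `many`), the two regular expressions, and the helper names are all choices made for this illustration, not the lecture's official answers.

```python
import re
from itertools import product

# Number of 1's divisible by 3: the state is the count of 1's mod 3.
DIV3 = {
    "initial": "r0",
    "accepting": {"r0"},
    "next": {
        ("r0", "0"): "r0", ("r0", "1"): "r1",
        ("r1", "0"): "r1", ("r1", "1"): "r2",
        ("r2", "0"): "r2", ("r2", "1"): "r0",
    },
}
DIV3_REGEX = r"0*(10*10*10*)*"

# Exactly one 1: the state records whether we saw none, one, or too many.
ONE = {
    "initial": "none",
    "accepting": {"one"},
    "next": {
        ("none", "0"): "none", ("none", "1"): "one",
        ("one", "0"): "one", ("one", "1"): "many",
        ("many", "0"): "many", ("many", "1"): "many",
    },
}
ONE_REGEX = r"0*10*"


def accepts(fsa, word):
    """Run the automaton on word and report whether it ends in an accepting state."""
    state = fsa["initial"]
    for symbol in word:
        state = fsa["next"][(state, symbol)]
    return state in fsa["accepting"]


def agrees(fsa, regex, max_len=8):
    """Check that automaton and regex agree on every string up to max_len."""
    pattern = re.compile(regex)
    return all(
        accepts(fsa, w) == (pattern.fullmatch(w) is not None)
        for n in range(max_len + 1)
        for w in map("".join, product("01", repeat=n))
    )


if __name__ == "__main__":
    print(agrees(DIV3, DIV3_REGEX), agrees(ONE, ONE_REGEX))
```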
FSA = regular expressions • Kleene's Theorem shows that Finite-State Automata are equivalent to regular expressions • That is, for every finite-state automaton there is some equivalent regular expression, and vice versa • We won't prove it, but it should be intuitively clear because there are algorithms for building FSA's from regular expressions and vice versa
Irregular • Languages that can be expressed as an FSA or a regular expression are called regular • Some languages are not regular • For example, the language consisting of strings a^k b^k, meaning all strings that have a positive number of a's followed by the same number of b's, is not regular • Prove it (by contradiction) • Hint: Use the pigeonhole principle to show that two strings a^p and a^q with p ≠ q must end up in the same state
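One way the argument in the hint can be written out is sketched below in LaTeX. It uses the eventual-state function N* and assumes, for contradiction, an automaton A with n states and initial state s0. The symbols n, p, and q are introduced only for this sketch.

```latex
Suppose some FSA $A$ with $n$ states and initial state $s_0$ accepts
$L = \{\, a^k b^k \mid k \ge 1 \,\}$.

Consider the $n+1$ states
$N^*(s_0, a^1),\; N^*(s_0, a^2),\; \dots,\; N^*(s_0, a^{n+1})$.
Since $A$ has only $n$ states, the pigeonhole principle gives integers
$1 \le p < q \le n+1$ with
\[
  N^*(s_0, a^p) = N^*(s_0, a^q).
\]
Feeding $b^p$ to $A$ from this common state gives
\[
  N^*(s_0, a^p b^p)
  = N^*\bigl(N^*(s_0, a^p),\, b^p\bigr)
  = N^*\bigl(N^*(s_0, a^q),\, b^p\bigr)
  = N^*(s_0, a^q b^p).
\]
Because $a^p b^p \in L$, this state is accepting, so $A$ also accepts
$a^q b^p$. But $q \ne p$, so $a^q b^p \notin L$, a contradiction.
Hence no FSA accepts $L$, and $L$ is not regular.
```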
Comparison • List strings accepted by the FSA A • List strings accepted by the FSA B • [Figure: transition diagrams for FSAs A and B, with transitions labeled 0 and 1]
*-equivalence • Two states of a finite-state automaton are *-equivalent if every string accepted by the automaton when it starts from one state is also accepted when it starts from the other, and vice versa • Given an automaton A with eventual-state function N*, we can formally say: • States s and t in A are *-equivalent iff, for every input string w, N*(s,w) and N*(t,w) are both accepting states or both not • It turns out that *-equivalence defines an equivalence relation
k-equivalence • *-equivalence is hard to demonstrate directly • Instead, we'll focus on equivalence after k or fewer inputs • Given an automaton A with eventual-state function N*, we can formally say: • States s and t in A are k-equivalent iff N*(s,w) and N*(t,w) are both accepting states or both not, for all strings w of length k or less
Facts about k-equivalence • For k ≥ 0, k-equivalence is an equivalence relation • For k ≥ 0, the k-equivalence classes partition the set of all states of the automaton into a union of mutually disjoint subsets • For k ≥ 1, if two states are k-equivalent, they are also (k-1)-equivalent • For k ≥ 1, each k-equivalence class is a subset of a (k-1)-equivalence class • Any two states that are k-equivalent for all integers k ≥ 0 are *-equivalent
k-equivalence theorems • Let A be an FSA with next-state function N • Given any states s and t in A: • s is 0-equivalent to t iff either s and t are both accepting states or they are both nonaccepting states • For every integer k ≥ 1, s is k-equivalent to t iff s and t are (k-1)-equivalent and for any input symbol m, N(s,m) and N(t,m) are also (k-1)-equivalent • These theorems essentially allow us to create a recursive definition for testing k-equivalence
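The two theorems translate almost directly into a recursive test. The following is a minimal sketch in Python. The four-state automaton is hypothetical (it is not the one from the lecture's example), and the name `k_equivalent` is chosen for this illustration. Results are cached with `functools.lru_cache` so repeated subproblems are not recomputed.

```python
from functools import lru_cache

# Hypothetical automaton over {0, 1}, for illustration only.
ALPHABET = ("0", "1")
ACCEPTING = frozenset({"s2"})
NEXT = {
    ("s0", "0"): "s1", ("s0", "1"): "s2",
    ("s1", "0"): "s0", ("s1", "1"): "s2",
    ("s2", "0"): "s2", ("s2", "1"): "s3",
    ("s3", "0"): "s3", ("s3", "1"): "s3",
}


@lru_cache(maxsize=None)
def k_equivalent(s, t, k):
    """Test k-equivalence of states s and t using the recursive theorem."""
    if k == 0:
        # Base case: both accepting or both nonaccepting.
        return (s in ACCEPTING) == (t in ACCEPTING)
    # Recursive case: (k-1)-equivalent, and so are all successors.
    return k_equivalent(s, t, k - 1) and all(
        k_equivalent(NEXT[(s, m)], NEXT[(t, m)], k - 1) for m in ALPHABET
    )


if __name__ == "__main__":
    print(k_equivalent("s0", "s1", 3))  # s0 and s1 behave identically
    print(k_equivalent("s0", "s3", 0))  # both nonaccepting
    print(k_equivalent("s0", "s3", 1))  # input 1 tells them apart
```

In this example s0 and s3 are 0-equivalent but not 1-equivalent, because the input 1 sends s0 to an accepting state and s3 to a nonaccepting one.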
k-equivalence examples • Find the 0-equivalence classes, the 1-equivalence classes, and the 2-equivalence classes for the following FSA: • [Figure: transition diagram with states s0, s1, s2, s3, s4 and transitions labeled 0 and 1]
Finding the *-equivalence classes • Keep finding k-equivalence classes for larger and larger values of k • If you ever find that the set of k-equivalence classes is equal to the set of (k+1)-equivalence classes, that is the set of *-equivalence classes • This is known as a fixed point in mathematics
The quotient automaton • We can build a new FSA from the *-equivalence classes • Recall that [s] means the equivalence class of s • This FSA is called the quotient automaton A', and is defined from an FSA A with states S, input symbols I, and next-state function N as follows: • The set of states S' of A' is the set of *-equivalence classes of states of A • The set of input symbols I' of A' equals I • The initial state of A' is [s0] where s0 is the initial state of A • The accepting states of A' are the states of the form [s] where s is an accepting state of A • The next-state function N': S' × I → S' is: For all states [s] in S' and input symbols m, N'([s], m) = [N(s,m)]
Constructing a quotient automaton • Let A be an FSA with states S, input symbols I, and next-state function N • To build A': • Find the set of 0-equivalence classes of S • For each integer k ≥ 1, find the k-equivalence classes of S until the k-equivalence classes are the same as the (k-1)-equivalence classes • Build a quotient automaton whose states are the equivalence classes given above with transition function N'([s],m) = [N(s,m)] for any input symbol m
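The construction can be sketched in Python as follows. The function names `equivalence_classes` and `quotient` and the four-state example automaton are assumptions made for this illustration. The refinement loop starts from the 0-equivalence classes and splits them until the number of classes stops growing, which is the fixed point described earlier.

```python
def equivalence_classes(states, alphabet, accepting, nxt):
    """Refine the 0-equivalence classes until they stop changing."""
    order = sorted(states)
    # 0-equivalence: class 1 for accepting states, class 0 otherwise.
    block = {s: int(s in accepting) for s in order}
    while True:
        ids, refined = {}, {}
        for s in order:
            # s and t stay together iff they were together before
            # and every input symbol sends them to the same old class.
            signature = (block[s],) + tuple(block[nxt[(s, m)]] for m in alphabet)
            refined[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            break  # no class was split: fixed point reached
        block = refined
    classes = {}
    for s in order:
        classes.setdefault(block[s], set()).add(s)
    return [frozenset(c) for c in classes.values()]


def quotient(states, alphabet, initial, accepting, nxt):
    """Build A' with N'([s], m) = [N(s, m)]."""
    classes = equivalence_classes(states, alphabet, accepting, nxt)
    class_of = {s: c for c in classes for s in c}
    return {
        "states": set(classes),
        "initial": class_of[initial],
        "accepting": {class_of[s] for s in accepting},
        # Any representative of a class gives the same target class.
        "next": {
            (c, m): class_of[nxt[(min(c), m)]] for c in classes for m in alphabet
        },
    }


# Hypothetical automaton over {0, 1}, for illustration only.
STATES = {"s0", "s1", "s2", "s3"}
ALPHABET = ("0", "1")
ACCEPTING = {"s2"}
NEXT = {
    ("s0", "0"): "s1", ("s0", "1"): "s2",
    ("s1", "0"): "s0", ("s1", "1"): "s2",
    ("s2", "0"): "s2", ("s2", "1"): "s3",
    ("s3", "0"): "s3", ("s3", "1"): "s3",
}
Q = quotient(STATES, ALPHABET, "s0", ACCEPTING, NEXT)

if __name__ == "__main__":
    for c in sorted(Q["states"], key=min):
        print(sorted(c), [sorted(Q["next"][(c, m)]) for m in ALPHABET])
```

In this example s0 and s1 collapse into a single state of the quotient automaton, so A' has three states instead of four.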
Quotient automaton example • Find the quotient automaton for the following FSA • [Figure: transition diagram with states s0, s1, s2, s3, s4 and transitions labeled 0 and 1]
Equivalent automata • Two automata A1 and A2 are equivalent iff L(A1) = L(A2) • Proving that the languages accepted by two automata are equal can be difficult • However, the quotient automata for both A1 and A2 will be the same (except for labeling) if A1 is equivalent to A2
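Equivalence can also be checked mechanically. The sketch below does not compare quotient automata. It uses a different, standard technique: run both automata in lockstep over every reachable pair of states and look for a pair where exactly one of them is accepting. The function name `distinguishing_string` and the three example automata are assumptions made for this illustration.

```python
from collections import deque


def distinguishing_string(a, b, alphabet):
    """Return a shortest string accepted by exactly one automaton, or None."""
    start = (a["initial"], b["initial"])
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (s, t), word = queue.popleft()
        if (s in a["accepting"]) != (t in b["accepting"]):
            return word  # word leads to disagreeing states
        for m in alphabet:
            pair = (a["next"][(s, m)], b["next"][(t, m)])
            if pair not in seen:
                seen.add(pair)
                queue.append((pair, word + m))
    return None  # no reachable pair disagrees: the languages are equal


def mod_counter(modulus, accepting_counts):
    """Hypothetical helper: automaton tracking the number of 1's mod modulus."""
    nxt = {}
    for i in range(modulus):
        nxt[(i, "0")] = i
        nxt[(i, "1")] = (i + 1) % modulus
    return {"initial": 0, "accepting": set(accepting_counts), "next": nxt}


EVEN_SMALL = mod_counter(2, {0})     # even number of 1's, 2 states
EVEN_LARGE = mod_counter(4, {0, 2})  # same language, 4 states
DIV_BY_4 = mod_counter(4, {0})       # a different language

if __name__ == "__main__":
    print(distinguishing_string(EVEN_SMALL, EVEN_LARGE, "01"))
    print(distinguishing_string(EVEN_SMALL, DIV_BY_4, "01"))
```

The two even-parity automata are equivalent even though one has twice as many states. The string 11 separates even parity from divisibility by 4.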
Proving equivalence • Prove that the following two automata are equivalent by finding their quotient automata • [Figure: two transition diagrams, one with states s0, s1, s2, s3 and one with states s0', s1', s2', s3', with transitions labeled 0 and 1]
Next time… • Finish simplifying finite state automata • Context free languages • Push down automata
Reminders • Keep reading Chapter 12 • Work on Assignment 10 • Due Friday, April 25 before midnight • No class on Monday!