This course covers the theory of finite automata, including deterministic and non-deterministic finite automata, converting NFAs to DFAs, simplifying DFAs, and regular expressions.
Outline • Introduction • Deterministic finite automata (DFA’s) • Non-deterministic finite automata (NFA’s) • NFA’s to DFA’s • Simplifying DFA’s • Regular expressions ↔ finite automata
Automatic one-way door Consider the control system for a one-way swinging door: • There are two states: Open and Closed. • It has two inputs: person detected at position A and person detected at position B. • If the door is closed, it should open only if a person is detected at A but not at B. • The door should close only if no one is detected at either A or B.
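Control schematic [State diagram with two states, Closed and Open. Closed goes to Open on "A, no B" and stays Closed on "No A or B", "A and B" and "B, no A". Open goes to Closed on "No A or B" and stays Open on "A, no B", "B, no A" and "A and B".]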
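The door controller can be written down as a small transition table. The following is only a sketch in Python. The state names and input names (`"closed"`, `"A_only"` and so on) are labels chosen for this illustration and do not appear in the slides.

```python
# Possible sensor readings: which positions currently detect a person.
INPUTS = ("neither", "A_only", "B_only", "both")

# Transition table read off the rules for the door:
# a closed door opens only on "A, no B";
# an open door closes only on "no A or B".
DOOR = {
    ("closed", "neither"): "closed",
    ("closed", "A_only"): "open",
    ("closed", "B_only"): "closed",
    ("closed", "both"): "closed",
    ("open", "neither"): "closed",
    ("open", "A_only"): "open",
    ("open", "B_only"): "open",
    ("open", "both"): "open",
}

def run_door(signals, state="closed"):
    """Feed a sequence of sensor readings to the controller; return the final state."""
    for signal in signals:
        state = DOOR[state, signal]
    return state
```

Calling `run_door(["A_only", "neither"])` walks the door from closed to open and back to closed.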
Finite automaton • A finite automaton is usually represented like this as a directed graph • Two parts of a directed graph: The states (also called nodes or vertices) The edges with arrows which represent the allowed transitions • One state is usually picked out as the starting point • For so-called ‘accepting automata,’ some states are chosen to be final states
Strings and automata • The input data are represented by a string over some alphabet and it determines how the machine progresses from state to state. • Beginning in the start state, the characters of the input string cause the machine to change from one state to another. • Accepting automata give only yes or no answers, depending on whether they end up in a ‘final state.’ Strings which end in a final state are accepted by the automaton.
Example The labeled graph in the figure above represents an FA over the alphabet Σ = {a, b} with start state 0 and final state 3. Final states are denoted by a double circle.
Deterministic finite automata (DFA’s) • The previous graph was an example of a deterministic finite automaton – every node had two edges (a and b) coming out • A DFA over a finite alphabet Σ is a finite directed graph with the property that each node emits one labeled edge for each distinct element of Σ.
More formally • A DFA accepts a string w in Σ* if there is a path from the start state to some final state such that w is the concatenation of the labels on the edges of the path. • Otherwise, the DFA rejects w. • The set of all strings accepted by a DFA M is called the language of M and is denoted by L(M).
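Example: (a+b)* • Construct a DFA to recognize the regular language represented by the regular expression (a + b)* over the alphabet Σ = {a, b}. • This is the set {a, b}* of all strings over {a, b}. It can be recognised by a one-state DFA whose single state is both the start state and a final state, and which loops back to itself on both a and b.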
Example: a(a+b)* • Find a DFA to recognize the language represented by the regular expression a(a + b)* over the alphabet Σ = {a, b}. • This is the set of all strings in Σ* which begin with a. One possible DFA is:
Example: pattern recognition • Build a DFA to recognize the regular language represented by the regular expression (a + b)*abb over the alphabet Σ = {a, b}. • The language is the set of strings that begin with anything, but must end with the string abb. • Effectively, we’re looking for strings which have a particular pattern to them
Solution: (a+b)*abb The diagram below shows a DFA to recognize this language. If in state 1: the last character was a If in state 2 : the last two symbols were ab If in state 3: the last three were abb
State transition function • We can also represent a DFA by a state transition function, which we'll denote by T, where any state transition from state i to state j along an edge labelled a is represented by: T(i,a) = j • To describe a full DFA we need to know: • what states there are, • which are the start and final ones, • the set of transitions between them.
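As an illustration, the transition function T can be stored as a lookup table and used to run a DFA on a string. The sketch below uses the (a + b)*abb machine. Its transition table is reconstructed here from the state descriptions (state 1: last character was a, state 2: last two were ab, state 3: last three were abb), so treat it as an assumption and not as a copy of the original diagram.

```python
# DFA for (a + b)*abb: states 0..3, start state 0, final state 3.
STATES = {0, 1, 2, 3}
START = 0
FINAL = {3}

# T[(i, a)] = j means: in state i, reading letter a, move to state j.
T = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}

def accepts(word):
    """Run the DFA on `word` (a string over {a, b}) and report acceptance."""
    state = START
    for ch in word:
        state = T[state, ch]
    return state in FINAL
```

The three ingredients listed above map directly onto `STATES`, `START`/`FINAL` and `T`.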
Regular languages • The class of regular languages is exactly the same as the class of languages accepted by DFAs! • Kleene (1956) • For any regular language, we can find a DFA which recognizes it!
Applications of DFA’s • DFA’s are very often used for pattern matching, e.g. searching for words/structures in strings • This is used often in UNIX, particularly by the grep command, which searches text for lines matching a regular expression built from literal strings and operators such as * and ? • The name grep comes from the ed editor command g/re/p: globally search for a regular expression and print the matching lines • DFA’s are also used to design and check simple circuits, to verify protocols, etc. • They are of use whenever significant memory is not required
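For a feel of pattern matching in practice, here is a small sketch using Python's standard `re` module. Python writes the union operator as `|` where these slides write `+`. Note also that Python's matcher works by backtracking and does not build a DFA, so this shows the pattern language and not the automaton construction.

```python
import re

# The slides' (a + b)*abb, written in Python syntax where | means "or".
PATTERN = re.compile(r"(a|b)*abb")

def ends_with_abb(word):
    """True if the whole word is a string over {a, b} that ends with abb."""
    return PATTERN.fullmatch(word) is not None
```

`fullmatch` is used so that the entire string must belong to the language, which mirrors how an accepting automaton consumes its whole input.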
Non-deterministic finite automata • DFA’s are called deterministic because, following any input string, we know exactly which state the machine is in and the path it took to get there • For NFA’s, sometimes there is more than one direction we can go with the same input character • Non-determinism can occur because, following a particular string, one could be in many possible states, or could have taken different paths to end at the same state!
NFA’s • A non-deterministic finite automaton (NFA) over an alphabet Σ is a finite directed graph with each node having zero or more edges. • Each edge is labelled either with a letter from Σ or with Λ, the empty string. • Multiple edges may be emitted from the same node with the same label. • Some letters may not have an edge associated with them. Strings following such paths are not recognised.
Non-determinism • If an edge is labelled with the empty string Λ, then we can travel the edge without consuming an input letter. Effectively we could be in either state, and so the possible paths could branch. • If there are two edges with the same label, we can take either path. • An NFA recognises a string if any one of the many possible states it could be in after reading the string is a final state. • Otherwise, it rejects the string.
NFA’s versus DFA’s NFA for a*a : DFA for a*a : Why is the top an NFA while the bottom is a DFA?
Example • Draw two NFAs to recognize the language of the regular expression ab + a*a. • This NFA has a Λ edge, which allows us to travel to state 2 without consuming an input letter. • The upper path corresponds to ab and the lower one to a*a
An equivalent NFA This NFA also recognizes the same language. Perhaps it's easier to see this by considering the equality ab + a*a = ab + aa*
NFA transition functions • Since there may be non-determinism, we'll let the values of this function be sets of states. • For example, if there are no edges from state k labelled with a, we'll write T(k, a) = ∅ • If there are three edges from state k, all labelled with a, going to states i, j and k, we'll write T(k, a) = {i, j, k}
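A set-valued transition function translates naturally into a dictionary of sets. The sketch below simulates an NFA for ab + a*a by tracking the set of states the machine could be in. The state numbering and edges are an assumed layout (upper path 0 -a-> 1 -b-> 3, lower path 0 -Λ-> 2 with a loop on a at 2 and an a edge to 3), since the original diagram is not available here. The empty Python string stands in for the empty-string label.

```python
LAMBDA = ""  # label used here for edges that consume no input letter

# Assumed NFA for ab + a*a: start state 0, final state 3.
T = {
    (0, "a"): {1},
    (0, LAMBDA): {2},
    (1, "b"): {3},
    (2, "a"): {2, 3},
}
START, FINAL = 0, {3}

def step(state, label):
    """T(state, label) as a set; a missing entry means the empty set."""
    return T.get((state, label), set())

def closure(states):
    """All states reachable from `states` using only empty-string edges."""
    seen, todo = set(states), list(states)
    while todo:
        for nxt in step(todo.pop(), LAMBDA):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def accepts(word):
    """Accept if any state reachable after reading `word` is final."""
    current = closure({START})
    for ch in word:
        moved = set()
        for q in current:
            moved |= step(q, ch)
        current = closure(moved)
    return bool(current & FINAL)
```

Tracking a set of current states avoids backtracking: every branch is followed at once.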
Comments on non-determinism • All digital computers are deterministic; quantum computers may be another story! • The usual mechanism for deterministic computers is to try one particular path and to backtrack to the last decision point if that path proves poor. • Parallel computers make non-determinism almost realizable. We can let each process make a random choice at each branch point, thereby exploring many possible trees.
Some facts • The class of regular languages is exactly the same as the class of languages accepted by NFAs! • Rabin and Scott (1959) • Just like for DFA’s! • Every NFA has an equivalent DFA which recognises the same language.
From NFA’s to DFA’s • We prove the equivalence of NFA’s and DFA’s by showing how, for any NFA, to construct a DFA which recognises the same language • Generally the DFA will have more possible states than the NFA. If the NFA has n states, then the DFA could have as many as 2^n states! • Example: if the NFA has three states A, B, C, the DFA could have eight: {}, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C} • These correspond to the sets of states the NFA could be in after any string
DFA construction • Begin in the NFA start state, which could be a set of several states if the start state is connected to any others by edges labelled with the empty string • Determine the set of possible NFA states you could be in after receiving each character. Each set is a new DFA state, and is connected to the start by that character. • Repeat for each new DFA state, exploring the possible results for each character until the system is closed (no new sets appear) • DFA final states are any that contain an NFA final state
Example (a+b)*ab • NFA: states A, B and C, with start state A. • The start state is A, but following an a you could be in A or B; following a b you could only be in state A. • DFA: states {A}, {A,B} and {A,C}.
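The construction can be sketched in a few lines of Python. This is a minimal version that assumes the NFA has no empty-string edges, which holds for the (a+b)*ab example. The NFA edges below (A loops on a and b, A goes to B on a, B goes to C on b, C final) are the usual machine for this expression and are an assumption, since the diagram is not reproduced here.

```python
# Assumed NFA for (a + b)*ab.
NFA = {
    ("A", "a"): {"A", "B"},
    ("A", "b"): {"A"},
    ("B", "b"): {"C"},
}
ALPHABET = "ab"

def subset_construction(nfa, start, finals, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start_set = frozenset({start})
    dfa = {}                      # (state_set, letter) -> state_set
    seen, todo = {start_set}, [start_set]
    while todo:                   # repeat until no new sets appear
        current = todo.pop()
        for ch in alphabet:
            target = frozenset(
                t for q in current for t in nfa.get((q, ch), ())
            )
            dfa[current, ch] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A DFA state is final if it contains any NFA final state.
    dfa_finals = {s for s in seen if s & finals}
    return seen, dfa, start_set, dfa_finals

DFA_STATES, DFA_T, DFA_START, DFA_FINALS = subset_construction(
    NFA, "A", {"C"}, ALPHABET
)
```

Only the reachable subsets are generated, which is why this example produces three DFA states and not all eight subsets of {A, B, C}.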
Summary • Regular expressions represent the regular languages. • DFA’s recognize the regular languages. • NFA’s also recognize the regular languages.
Finite automata • So far, we’ve introduced two kinds of automata: deterministic and non-deterministic. • We’ve shown that we can find a DFA to recognise any language that a given NFA recognises. • We’ve asserted that both DFA’s and NFA’s recognise the regular languages, which themselves are represented by regular expressions. • We prove this by construction, by showing how any regular expression can be made into an NFA and vice versa.
Regular expressions → finite automata • Given a regular expression, we can find an automaton which recognises its language. • Start the algorithm with a machine that has a start state, a single final state, and an edge from the start state to the final state labelled with the given regular expression.
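Four-step algorithm 1. If an edge is labelled with ∅, then erase the edge. 2. Transform any edge from i to j labelled R + S into two parallel edges from i to j, one labelled R and one labelled S. 3. Transform any edge from i to j labelled RS into an edge labelled R from i to a new state k, followed by an edge labelled S from k to j. 4. Transform any edge from i to j labelled R* into an edge labelled Λ from i to a new state k, a loop labelled R on k, and an edge labelled Λ from k to j.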
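Example a*+ab Construct an NFA for the regular expression a* + ab. • Start with a single edge labelled a* + ab. • Apply rule 2 to split it into two parallel edges labelled a* and ab. • Apply rule 4 to a*. • Apply rule 3 to ab, giving an edge labelled a followed by an edge labelled b.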
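The edge-rewriting algorithm lends itself to a recursive sketch. The code below assumes a particular reading of rules 2 to 4: a union splits into parallel edges, a concatenation goes through a new middle state, and a star becomes an empty-string edge into a new state with a loop, followed by an empty-string edge out. Regular expressions are given as nested tuples to avoid writing a parser; that tuple encoding is a convention invented for this illustration.

```python
from itertools import count

LAMBDA = ""  # stands for the empty-string label

def regex_to_nfa(regex):
    """Expand one edge labelled `regex`; return (edges, start, final)."""
    fresh = count(2)      # states 0 and 1 are the start and final states
    edges = []            # (source, label, target) triples

    def expand(i, r, j):
        kind = r[0]
        if kind == "empty":       # rule 1: an edge labelled with the empty set vanishes
            pass
        elif kind == "sym":       # a single letter: keep the edge as it is
            edges.append((i, r[1], j))
        elif kind == "union":     # rule 2: two parallel edges
            expand(i, r[1], j)
            expand(i, r[2], j)
        elif kind == "cat":       # rule 3: new middle state
            k = next(fresh)
            expand(i, r[1], k)
            expand(k, r[2], j)
        elif kind == "star":      # rule 4: new looping state
            k = next(fresh)
            edges.append((i, LAMBDA, k))
            expand(k, r[1], k)
            edges.append((k, LAMBDA, j))
        else:
            raise ValueError(f"unknown node {kind!r}")

    expand(0, regex, 1)
    return edges, 0, 1

def accepts(regex, word):
    """Simulate the NFA built from `regex` on `word`."""
    edges, start, final = regex_to_nfa(regex)

    def closure(states):
        seen, todo = set(states), list(states)
        while todo:
            q = todo.pop()
            for src, label, dst in edges:
                if src == q and label == LAMBDA and dst not in seen:
                    seen.add(dst)
                    todo.append(dst)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return final in current

# a* + ab in the tuple encoding.
EXAMPLE = ("union", ("star", ("sym", "a")),
           ("cat", ("sym", "a"), ("sym", "b")))
```

For `EXAMPLE` the expansion follows the same order as the worked example: the union is split first, then the star and the concatenation are expanded.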
Finite automata → regular expressions 1 Create a new start state s, and draw a new edge labelled with Λ (the empty string) from s to the original start state. 2 Create a new final state f, and draw new edges labelled with Λ from all the original final states to f. 3 For each pair of states i and j that have more than one edge from i to j, replace all the edges from i to j by a single edge labelled with the regular expression formed by the sum of the labels on each of the edges from i to j. 4 Construct a sequence of new machines by eliminating one state at a time until the only states remaining are s and f.
Eliminating states As each state is eliminated, a new machine is constructed from the previous machine as follows: • Let old(i,j) denote the label on edge i,j of the current machine. If no edge exists, label it ∅. • Assume that we wish to eliminate state k. For each pair of edges i,k (incoming edge) and k,j (outgoing edge) we create a new edge label new(i, j)
Eliminate state k • The label of this new edge is given by: new(i,j) = old(i,j) + old(i, k) old(k, k)* old(k,j) • All other edges, not involving state k, remain the same: new(i, j) = old(i, j) • After eliminating all states except s and f, we wind up with a two-state machine with the single edge s, f labelled with the desired regular expression new(s, f)
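The elimination formula can be tried out directly. The sketch below is a minimal version under a few assumptions: labels are kept as Python `re` pattern strings (so union is written `|`), `None` plays the role of a missing edge, the empty Python string plays the role of the empty-string label, and the new states are named `"s"` and `"f"` on the assumption that no original state uses those names. Letters are assumed to be plain characters that need no escaping.

```python
import re

def union(r, s):
    """r + s, where None means 'no edge'."""
    if r is None:
        return s
    if s is None:
        return r
    return f"(?:{r}|{s})"

def concat(r, s):
    """rs; concatenating with a missing edge gives a missing edge."""
    if r is None or s is None:
        return None
    return r + s

def star(r):
    """r*; the star of a missing edge or of the empty string is the empty string."""
    if r is None or r == "":
        return ""
    return f"(?:{r})*"

def dfa_to_regex(states, start, finals, T):
    """State elimination on a DFA given as T[(state, letter)] = state."""
    old = {}
    for (i, letter), j in T.items():      # step 3: merge parallel edges
        old[i, j] = union(old.get((i, j)), letter)
    old["s", start] = ""                  # step 1: new start state s
    for q in finals:
        old[q, "f"] = ""                  # step 2: new final state f
    for k in states:                      # step 4: eliminate states one by one
        loop = star(old.get((k, k)))
        ins = [(i, r) for (i, j), r in old.items() if j == k and i != k]
        outs = [(j, r) for (i, j), r in old.items() if i == k and j != k]
        new = {e: r for e, r in old.items() if k not in e}
        for i, r_in in ins:
            for j, r_out in outs:
                path = concat(concat(r_in, loop), r_out)
                new[i, j] = union(new.get((i, j)), path)
        old = new
    return old.get(("s", "f"))

def matches(regex, word):
    """Check a word against the extracted expression using Python's re."""
    return re.fullmatch(regex, word) is not None

# Assumed transition table for a DFA recognising (a + b)*abb.
ABB = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
ABB_REGEX = dfa_to_regex([0, 1, 2, 3], 0, {3}, ABB)
```

The expression produced this way is usually much longer than a hand-written one, and its exact shape depends on the order in which states are eliminated, but it describes the same language.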
Example • Initial DFA • Steps 1 and 2: add start and final states • Eliminate state 2 (no path to f) • Eliminate state 0 • Eliminate state 1 • Final regular expression
Finding simpler automata • Sometimes our constructions lead to more complicated automata than we need, having more states than are really necessary • Next, we look for ways of making DFA’s with a minimum number of states • Myhill-Nerode theorem: ‘Every regular language has a unique* minimum state DFA’ * up to a simple renaming of the states
Finding minimum state DFA Two steps to minimizing a DFA: 1 Discover which, if any, pairs of states are indistinguishable. Two states, s and t, are indistinguishable (equivalent) if for all possible strings w, T(s,w) and T(t,w) are either both final or both non-final. 2 Combine all equivalent states into a single state, modifying the transition functions appropriately.
Consider the DFA [Figure: DFA diagram including states 1 and 2, with edges labelled a, b and a,b] • States 1 and 2 are indistinguishable! Starting in either, b* is rejected and anything with a in it is accepted.
Part 1, finding indistinguishable pairs • Remove all inaccessible states, where no path exists to them from the start state. • Construct a grid of pairs of states. • Begin by marking those pairs which are clearly distinguishable, where one is final and the other non-final. • Next, mark every pair which, on the same input letter, leads to an already marked (distinguishable) pair of states. Repeat until a full pass over the pairs marks nothing new. • The remaining unmarked pairs are indistinguishable.
Part 2, construct minimum DFA • Construct a new DFA where any pairs of indistinguishable states form a single state in the new DFA. • The start state will be the state containing the original start state. • The final states will be those which contain original final states. • The transitions will be the full set of transitions from the original states (these should all be consistent.)
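Both parts can be sketched together in Python. The example DFA below is hypothetical and was chosen so that states 1 and 2 behave as described earlier (from either, strings of b's are rejected and anything containing an a is accepted). The code assumes inaccessible states have already been removed.

```python
from itertools import combinations

# Hypothetical DFA: start state 0, final state 3.
STATES = [0, 1, 2, 3]
ALPHABET = "ab"
T = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 3, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 1,
    (3, "a"): 3, (3, "b"): 3,
}
START, FINALS = 0, {3}

def indistinguishable_pairs(states, alphabet, T, finals):
    """Part 1: mark distinguishable pairs; return the unmarked ones."""
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in finals) != (p[1] in finals)}
    changed = True
    while changed:                # repeat until a pass marks nothing new
        changed = False
        for p in combinations(states, 2):
            if frozenset(p) in marked:
                continue
            for ch in alphabet:
                target = frozenset((T[p[0], ch], T[p[1], ch]))
                if target in marked:
                    marked.add(frozenset(p))
                    changed = True
                    break
    return [p for p in combinations(states, 2) if frozenset(p) not in marked]

def minimize(states, alphabet, T, start, finals):
    """Part 2: merge indistinguishable states into blocks."""
    same = {frozenset(p)
            for p in indistinguishable_pairs(states, alphabet, T, finals)}
    block = {s: frozenset(t for t in states
                          if t == s or frozenset((s, t)) in same)
             for s in states}
    new_T = {(block[s], ch): block[T[s, ch]]
             for s in states for ch in alphabet}
    new_finals = {block[s] for s in finals}
    return set(block.values()), new_T, block[start], new_finals

PAIRS = indistinguishable_pairs(STATES, ALPHABET, T, FINALS)
MIN_STATES, MIN_T, MIN_START, MIN_FINALS = minimize(
    STATES, ALPHABET, T, START, FINALS
)
```

Because indistinguishability is an equivalence relation, collecting each state together with its unmarked partners gives consistent blocks, and the merged transitions never conflict.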