Digital State Machines Finite Automata & Regular Languages
Chapter Outline • Introduction • Finite-State Automata • Regular Languages and Finite-State Automata • Summary Veton Këpuska
Introduction: Finite State Automata
• The finite-state automaton is one of the most significant tools of computational linguistics. Its variations include:
  • Finite-state transducers,
  • Hidden Markov Models, and
  • N-gram grammars.
• These are important components of speech recognition and synthesis, spell-checking, and information-extraction applications.
• FSA theory was developed at the beginning of computer science as a model of abstract computing machines, pioneered by the work of Alan Turing.
• FSAs are devices that accept (recognize) or reject an input stream of characters.
• FSAs are very efficient in terms of speed and memory.
• The most frequent use of finite-state automata is searching for words or phrases.
• Additional uses arise in application areas such as:
  • Morphological parsing,
  • Part-of-speech annotation, and
  • Speech processing and recognition.
A Simple Example of a Finite State Automaton
• This FSA accepts (recognizes) or generates strings such as ac, abc, abbc, abbbc, abbbbbbbbbbc, etc.
State Transition Table of a Finite State Automaton
• State transition table of the FSA above
2 September 2008
Example of a Finite State Automaton
• An automaton accepts an input string in the following way:
  • It starts in the initial state,
  • follows a transition whose arc character matches the first character of the string,
  • consumes the corresponding string character,
  • reaches the destination state,
  • makes each further transition based on the next unconsumed character of the input string, and
  • ends up in one of the final states when there are no characters left in the input.
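The acceptance steps above can be sketched in a few lines of Python for the simple FSA that accepts ac, abc, abbc, and so on. This is a minimal sketch: the state names s0, s1, s2 are hypothetical (the slide's diagram is not reproduced in this transcript), and the arcs are inferred from the example strings.

```python
# Hypothetical state names; arcs inferred from the strings ac, abc, abbc, ...
TRANSITIONS = {
    ("s0", "a"): "s1",  # the leading a
    ("s1", "b"): "s1",  # zero or more b's
    ("s1", "c"): "s2",  # the closing c
}
START = "s0"
FINAL = {"s2"}


def accepts(string):
    """Follow one arc per input character; accept only if we end in a final state."""
    state = START
    for ch in string:
        nxt = TRANSITIONS.get((state, ch))
        if nxt is None:  # no arc matches this character: reject
            return False
        state = nxt  # consume the character and move on
    return state in FINAL


for s in ["ac", "abc", "abbbc", "ab", "acc"]:
    print(s, accepts(s))
```

The dictionary plays the role of the state transition table: a missing key corresponds to a missing arc.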
Introduction: D-FSA vs. ND-FSA
• Adding non-determinism to FSAs does not allow us to define any language that cannot be defined by deterministic FSAs.
• Why then bother with ND-FSAs?
  • It turns out that describing an application with ND-FSAs can be substantially more efficient.
  • ND-FSAs allow us to program solutions to problems using a higher-level language.
  • This program is then compiled, by an algorithm that we will learn in this chapter, into a deterministic FSA that can be executed on a conventional computer.
Finite State Automata An Informal Description of Finite State Automata
Finite Automata
• Study an extended example of a real-world problem whose solution uses finite automata.
• Investigate protocols that support “electronic money”: files that
  • a customer can use to pay for goods on the internet, and
  • a seller can receive with assurance that the “money” is real. The seller must know that the file has not been forged, and that it has not been copied and sent to the seller while the customer retains a copy of the same file to spend again.
• Nonforgeability of the file must be guaranteed by a third party, a bank, and by a cryptography policy.
  • Encryption of the money files ensures that forgery is not a problem.
  • The bank must also keep a database of all the valid money that it has issued, so that it can verify to a store that the file it has received represents real money and can be credited to the store’s account.
• Encryption is not addressed here, as it is beyond the scope of the topics covered in this class.
Finite Automata
• Nevertheless, in order to use electronic money, protocols need to be devised that allow the money to be manipulated in the variety of ways that users want.
• Monetary systems always invite fraud, so the protocol must enforce whatever policy is adopted regarding how money is used.
• The solution needs to ensure that the only things that can happen are things we intend to happen: an unscrupulous user must not be able to steal from others or to “manufacture” money.
The Protocol
• The customer:
  • Assume that the customer cannot be relied upon to act responsibly.
  • The customer may try to copy the money file, use the same money file to pay several times, or both.
• The bank:
  • Assume that the bank must behave responsibly, or it cannot be a bank.
  • It must ensure that two stores cannot both redeem the same money file.
  • It will not allow a money file to be both canceled and redeemed.
• The store:
  • It will not ship goods until it is sure it has been given valid money.
The Protocol
• FSAs can represent protocols such as the one being discussed.
• States represent each possible “state” (situation) that each participant could be in.
  • A state remembers which important events have happened, and
  • which ones have not yet happened.
• Transitions between states occur when one of the five events (pay, cancel, ship, redeem, transfer) occurs.
FSAs for Money Transfer Example
Bank:
• The beginning state is state 1:
  • The bank has issued a money file.
  • No requests have been made to either redeem it or cancel it.
• Cancel request:
  • The bank restores the money and enters state 2.
  • The bank cannot leave state 2, since it cannot allow the same money to be canceled again or to be spent by the customer.
• Redeem request:
  • The bank enters state 3, and
  • initiates the transfer; upon completion it enters state 4.
• In state 4 the bank will no longer accept cancel or redeem requests, nor will it perform any other transactions regarding this particular money file.
FSAs for Money Transfer Example
Store: procedures in the store are assumed to be imperfect.
• The beginning state is state a.
• Pay request:
  • The customer orders the goods by performing the pay action.
  • The store enters state b and initiates both the shipping and the redemption processes.
• Ship and redeem requests:
  • Shipping and redeeming can happen in either order, so the store enters state c or d, and
  • it then initiates the remaining redeem/transfer or ship actions and enters state e or f.
Customer:
• Pay and cancel requests:
  • The customer can perform them any number of times and in any order.
Enabling Automata to Ignore Actions
• Missing transitions:
  • The store is not affected by a “cancel” action.
  • According to the formal definition of an FSA (given next), whenever an input X is received by an automaton, the automaton must follow an arc labeled X from its current state to a new state.
  • The store FSA must therefore be augmented with transitions that correspond to “cancel” actions.
• Effects of unexpected actions:
  • Suppose the customer executes the “pay” action a second time while the store is in state e.
  • Since the store automaton has no arc for the pay action in that state, the FSA will “die”.
Enabling Automata to Ignore Actions
• There are two kinds of actions that an FSA must ignore:
  • Actions that are irrelevant to the participant involved.
    • For the store FSA: “cancel”.
    • For the bank FSA: “pay” and “ship”.
    • For the customer FSA: “ship”, “redeem”, and “transfer”.
  • Actions that must not be allowed to kill an automaton.
    • For the store FSA: the customer’s second “pay” or “cancel” action should not be allowed to kill the FSA.
    • For the bank FSA: the store’s multiple “redeem” actions should be ignored.
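One way to make an automaton ignore an action is to add a self-loop for it, so the automaton stays where it is instead of dying. The sketch below applies this to the bank FSA. The helper name `add_ignore_loops` is hypothetical, and the choice of which states ignore which actions is an assumption read off the bullets above (“pay” and “ship” everywhere, repeated “redeem” in states 3 and 4).

```python
def add_ignore_loops(transitions, ignore):
    """Return a copy of `transitions` with a self-loop for every ignored action.

    transitions: dict mapping (state, action) -> next state
    ignore:      dict mapping state -> actions to ignore in that state
    Existing arcs are never overwritten.
    """
    completed = dict(transitions)
    for state, actions in ignore.items():
        for action in actions:
            completed.setdefault((state, action), state)
    return completed


# Bank FSA as described in the slides: cancel 1->2, redeem 1->3, transfer 3->4.
bank = {("1", "cancel"): "2", ("1", "redeem"): "3", ("3", "transfer"): "4"}

# Assumed ignore sets: pay/ship are irrelevant to the bank in every state;
# a repeated redeem is ignored once the bank is in state 3 or 4.
bank_ignore = {
    "1": ["pay", "ship"],
    "2": ["pay", "ship"],
    "3": ["pay", "ship", "redeem"],
    "4": ["pay", "ship", "redeem"],
}

completed_bank = add_ignore_loops(bank, bank_ignore)
print(completed_bank[("3", "redeem")])       # self-loop added
print(("2", "redeem") in completed_bank)     # still no arc: the bank dies here
```

Using `setdefault` keeps the real arcs (for example redeem from state 1) intact and only fills the gaps.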
Completed FSAs
Complete System as FSA
• The previous models account for the actions of each participant independently.
• The customer’s FSA is simple: no matter what actions are taken, it stays in the same state.
• The bank’s and store’s FSAs are more complex, and it is not immediately obvious which combinations of states these two automata can be in.
• Product automaton:
  • The normal way to explore the interaction of automata is to construct the product automaton.
  • The states of the product FSA are pairs of states, one from each original FSA: state (3,d) denotes the situation where the bank is in state 3 and the store is in state d.
  • Bank = 4 states, Store = 7 states, Product FSA = 4 × 7 = 28 states.
Product Automaton for the Store and Bank
Product Automaton
• Each of the two components of the product automaton makes its transitions on the various inputs independently.
• If an input action is received and one of the two automata has no state to go to on that input, then the product automaton “dies”; it has no state to go to.
• Formal rule:
  • Assume the (bank, store) product automaton is in state (i, x).
  • Let Z be one of the input actions.
  • Check whether there is a transition from state i under input Z. Suppose there is a transition to state j.
  • Similarly, check whether there is a transition from state x under the same input Z to some state y.
  • If both exist, there is a transition from (i, x) to (j, y) under input Z. If either j or y does not exist, then there is no arc labeled Z out of (i, x).
Product Automaton for the Store and Bank
Example: consider the input pay:
• The store goes from state a to state b.
• The store stays put if it is in any state other than a.
• The bank is unaffected by the input pay; the action is irrelevant to the bank (for example, when it is in state 1).
Consider the input redeem:
• If the bank receives a redeem message in state 1, it goes to state 3. If it is in state 3 or 4, it stays there. If it is in state 2, the bank automaton dies.
• For example, (1,b) → (3,d).
Using Product Automaton to Validate the Protocol
• Only 10 of the 28 states are accessible from the start state; the other states are not accessible.
• The real purpose of analyzing a protocol such as this one with automata is to ask and answer questions of the form: “Can the following type of error occur?”
• Example: “Is it possible that the store ships goods and never gets paid?” That is, can the product automaton be in a state whose store component is c, e, or g while no transition on input T (transfer) was ever made?
• Problem state: (2,c).
[Figure: product automaton with the accessible states numbered 1 to 10.]
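The product construction and the accessibility check can be explored mechanically. The sketch below is built on assumptions: the slide figures are not part of this transcript, so the store arcs (a→b on pay, b→c on ship, b→d on redeem, c→e on redeem, d→e on ship, d→f on transfer, e→g on transfer, f→g on ship) and the ignore loops are a reconstruction from the bullet descriptions, and all function names are hypothetical.

```python
from collections import deque

ACTIONS = ["pay", "ship", "cancel", "redeem", "transfer"]


def with_loops(arcs, states, ignored):
    """Add self-loops for the actions each state ignores (existing arcs win)."""
    table = dict(arcs)
    for s in states:
        for act in ignored(s):
            table.setdefault((s, act), s)
    return table


# Assumed bank: ignores pay/ship everywhere, repeated redeem in states 3 and 4.
BANK = with_loops(
    {("1", "cancel"): "2", ("1", "redeem"): "3", ("3", "transfer"): "4"},
    "1234",
    lambda s: ["pay", "ship"] + (["redeem"] if s in "34" else []),
)

# Assumed store arcs (reconstructed); the store ignores extra pay and cancel.
STORE = with_loops(
    {("a", "pay"): "b", ("b", "ship"): "c", ("b", "redeem"): "d",
     ("c", "redeem"): "e", ("d", "ship"): "e", ("d", "transfer"): "f",
     ("e", "transfer"): "g", ("f", "ship"): "g"},
    "abcdefg",
    lambda s: ["pay", "cancel"],
)


def product_step(pair, action):
    """Formal rule: (i, x) -> (j, y) only if both components can move."""
    i, x = pair
    j, y = BANK.get((i, action)), STORE.get((x, action))
    return None if j is None or y is None else (j, y)


def reachable(start=("1", "a")):
    """Breadth-first search over the product automaton."""
    seen, queue = {start}, deque([start])
    while queue:
        pair = queue.popleft()
        for action in ACTIONS:
            nxt = product_step(pair, action)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


states = reachable()
# Store has shipped (c, e, g) but the bank canceled (state 2): no transfer can follow.
problem = sorted(p for p in states if p[0] == "2" and p[1] in "ceg")
print(len(states), problem)
```

Under these assumed tables the search visits 10 of the 28 pairs and flags (2,c), in line with the slide; a different reading of the missing figures could change the details.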
Deterministic Finite State Automaton Formalism of a Deterministic Finite State Automaton
Deterministic Finite State Automaton
• “Deterministic” refers to the fact that on each input there is one and only one state to which the automaton can transition from its current state.
• A non-deterministic automaton can transition from its present state to more than one state on the same input.
Definition of D-FSA
• A deterministic finite state automaton consists of:
  • A finite set of states, Q.
  • A finite set of input symbols, Σ.
  • A transition function, δ, that takes as arguments a state and an input symbol and returns a state: δ: Q × Σ → Q.
  • A start state, q0, one of the states in Q.
  • A set of final, or accepting, states F, with F ⊆ Q.
• Five-tuple notation for a D-FSA named A: A = (Q, Σ, δ, q0, F)
Formal Definition of Automaton
String Processing with D-FSA
• Suppose a1a2…an is a sequence of input symbols.
• The initial state of the D-FSA is its start state q0; then:
  1. q1 = δ(q0, a1)
  2. q2 = δ(q1, a2)
  …
  i. qi = δ(qi-1, ai)
  …
  n. qn = δ(qn-1, an)
• If qn ∈ F, then the input sequence a1a2…an is “accepted” (or “recognized”); otherwise it is “rejected”.
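The five-tuple and the state sequence q0, q1, …, qn map naturally onto a small data structure. The following is a minimal sketch, not a reference implementation: the class name `DFSA`, its field names, and the two-state example machine (which accepts strings with an even number of 1s) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFSA:
    states: frozenset    # Q
    alphabet: frozenset  # Sigma
    delta: dict          # (state, symbol) -> state
    start: str           # q0
    finals: frozenset    # F

    def __post_init__(self):
        # Check the side conditions of the definition: q0 in Q, F subset of Q,
        # and delta defined (with a value in Q) for every pair in Q x Sigma.
        if self.start not in self.states or not self.finals.issubset(self.states):
            raise ValueError("q0 must be in Q and F must be a subset of Q")
        for q in self.states:
            for a in self.alphabet:
                if self.delta.get((q, a)) not in self.states:
                    raise ValueError(f"delta({q}, {a}) is missing or not in Q")

    def run(self, inputs):
        """Return the state sequence q0, q1, ..., qn for the given inputs."""
        trace = [self.start]
        for a in inputs:
            trace.append(self.delta[(trace[-1], a)])
        return trace

    def accepts(self, inputs):
        return self.run(inputs)[-1] in self.finals


# Hypothetical example machine: accepts strings with an even number of 1s.
EVEN_ONES = DFSA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"0", "1"}),
    delta={("even", "0"): "even", ("even", "1"): "odd",
           ("odd", "0"): "odd", ("odd", "1"): "even"},
    start="even",
    finals=frozenset({"even"}),
)

print(EVEN_ONES.run("1101"))
print(EVEN_ONES.accepts("1101"), EVEN_ONES.accepts("11"))
```

Validating the tuple at construction time makes the “one and only one next state” requirement explicit rather than implicit.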
D-FSA Example
• Using an FSA to recognize sheeptalk “baa…!”
FSA Use
• The FSA can be used for recognizing (i.e., accepting) strings in the following way. First, think of the input as being written on a long tape broken up into cells, with one symbol written in each cell of the tape, as in the figure below.
Recognition Process
• The machine starts in the start state (q0) and iterates the following process:
  • Check the next letter of the input.
  • If it matches the symbol on an arc leaving the current state, then
    • cross that arc,
    • move to the next state, and
    • advance one symbol in the input.
• If we are in the accepting state (q4) when we run out of input, the machine has successfully recognized (accepted) an instance of sheeptalk.
• If the machine never gets to the final state,
  • either because it runs out of input,
  • or because it gets some input that doesn’t match an arc (as in the figure on the previous slide),
  • or because it just happens to get stuck in some non-final state,
  we say the machine rejects, or fails to accept, the input.
FSA for “Sheeptalk” Example
• Q = {q0, q1, q2, q3, q4},
• Σ = {a, b, !}, // sheep language alphabet
• F = {q4}, and
• δ(q, i) // defined in the next slide
State Transition Table
We’ve marked state 4 with a * to indicate that it’s a final/accepting state (you can have as many final states as you want), and the Ø indicates an illegal or missing transition. We can read the first row as “if we’re in state q0 and we see the input b, we must go to state q1. If we’re in state q0 and we see the input a or !, we fail”.
Deterministic Algorithm for Recognizing a String
function D-RECOGNIZE(tape, machine) returns accept or reject
  index ← Beginning of tape
  current-state ← Initial state of machine
  loop
    if End of input has been reached then
      if current-state is an accept state then
        return accept
      else
        return reject
    elsif transition-table[current-state, tape[index]] is empty then
      return reject
    else
      current-state ← transition-table[current-state, tape[index]]
      index ← index + 1
  end
Tracing Execution for Some Sheep Talk
Before examining the beginning of the tape, the machine is in state q0. Finding a b on the input tape, it changes to state q1, as indicated by the contents of transition-table[q0,b]. It then finds an a and switches to state q2; another a puts it in state q3; a third a leaves it in state q3, where it reads the “!” and switches to state q4. Since there is no more input, the End of input condition at the beginning of the loop is satisfied for the first time and the machine halts in q4. State q4 is an accepting (final) state, so the machine has accepted the string baaa! as a sentence in the sheep language.
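The D-RECOGNIZE loop and the trace for baaa! can be reproduced with a short Python sketch. The transition table here is an assumption reconstructed from the trace and the first-row description (the slide's table image is not in this transcript); states are written as the integers 0 to 4, and a missing dictionary key stands for the Ø entries.

```python
# Assumed sheeptalk table: state -> {symbol: next state}; missing key = empty entry.
TABLE = {
    0: {"b": 1},
    1: {"a": 2},
    2: {"a": 3},
    3: {"a": 3, "!": 4},
    4: {},
}
ACCEPT = {4}


def d_recognize(tape, table=TABLE, accept=ACCEPT, start=0):
    """Return (accepted, trace), where trace lists the states visited in order."""
    index, state = 0, start
    trace = [state]
    while True:
        if index == len(tape):               # end of input reached
            return state in accept, trace
        nxt = table[state].get(tape[index])
        if nxt is None:                      # empty table entry: reject
            return False, trace
        state = nxt
        trace.append(state)
        index += 1


print(d_recognize("baaa!"))
print(d_recognize("abc"))
```

Returning the trace alongside the verdict makes it easy to compare a run with the prose walkthrough above.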
Fail State
• The algorithm fails whenever there is no legal transition for a given combination of state and input. The input abc will fail to be recognized since there is no legal transition out of state q0 on the input a (i.e., this entry of the transition table has a Ø).
• Even if the automaton had allowed an initial a, it would certainly have failed on c, since c isn’t even in the sheeptalk alphabet. We can think of these “empty” elements in the table as if they all pointed at one “empty” state, which we might call the fail state or sink state.
• In a sense, then, we can view any machine with empty transitions as if we had augmented it with a fail state and drawn in all the extra arcs, so that we always have somewhere to go from any state on any possible input.
• Just for completeness, the next figure shows the FSA from the previous figure with the fail state qF filled in.
Adding a Fail State to FSA
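Filling in the fail state amounts to replacing every empty table entry with an arc to qF and letting qF loop to itself on every symbol. Below is a minimal sketch; the function name `add_fail_state` is hypothetical, and the sheeptalk table is the same assumed reconstruction used for illustration (the slide's table is not in this transcript).

```python
def add_fail_state(table, alphabet, fail="qF"):
    """Return a total transition table: every empty entry now points at `fail`."""
    total = {state: dict(row) for state, row in table.items()}  # copy, don't mutate
    total[fail] = {}
    for row in total.values():
        for symbol in alphabet:
            row.setdefault(symbol, fail)  # the fail row ends up looping to itself
    return total


# Assumed partial sheeptalk table (missing key = empty entry).
SHEEP = {
    "q0": {"b": "q1"},
    "q1": {"a": "q2"},
    "q2": {"a": "q3"},
    "q3": {"a": "q3", "!": "q4"},
    "q4": {},
}

TOTAL = add_fail_state(SHEEP, alphabet="ab!")
print(TOTAL["q0"])
print(TOTAL["qF"])
```

Symbols outside the alphabet, such as c, still have no entry; a recognizer would need to treat them as a move to the fail state as well.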
Example 2
• Suppose we have a D-FSA that accepts all and only the strings of 0’s and 1’s that have the sequence 01 somewhere in the string. We can write this language L as follows: {w | w is of the form x01y for some strings x and y consisting of 0’s and 1’s}
• An equivalent description is: {x01y | x and y are any strings of 0’s and 1’s}
• Example strings in this language L include 01, 110110, and 100011.
• Example strings not in this language L are ε, 0, and 111000.
Example 2 (cont.)
• What can be said about the D-FSA A that accepts this language L?
• Alphabet: Σ = {0, 1}
• States: it has some (as yet unknown) number of states, one of which, say q0, is the start state.
• It has to remember the important facts about what inputs it has seen so far, in order to decide whether 01 is a substring of the input. That is, A needs to remember which of the following holds:
  1. It has already seen 01. If so, it will be in an accepting state from now on.
  2. It has not seen 01, but its most recent input was 0, so if it now sees a 1 it will have seen 01 and can accept everything it sees from then on.
  3. It has not seen 01, and its last input was either nonexistent (it has just started) or a 1. In this case A cannot accept until it first sees a 0 and then a 1 immediately after.
Example 2 (cont.)
• Each condition presented on the previous slide can be represented by a state.
• Condition (3) is represented by the start (first) state q0:
  • If we are in state q0 and the next input is “0”, we are then governed by condition (2).
[Transition diagram fragment: q0 loops to itself on 1 and moves to q2 on 0.]
Example 2 (cont.)
• If we are in state q2 (condition 2) and we receive input “1”, the FSA should move to the accepting state, which we choose to name q1.
• Finally, in the accepting state q1 no combination of 0’s and 1’s should change the state.
• Thus Q = {q0, q1, q2} and F = {q1}: A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})
[Transition diagram: q0 loops on 1 and goes to q2 on 0; q2 loops on 0 and goes to q1 on 1; q1 loops on 0 and 1.]
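The three-state automaton of Example 2 can be checked against the example strings with a few lines of Python. This is a sketch under the assumption that the transitions are the ones described for the diagram (q0 loops on 1 and moves to q2 on 0, q2 loops on 0 and moves to q1 on 1, q1 loops on both symbols); the dictionary name `DELTA` and the function name are hypothetical.

```python
# Transition function of A, keyed by (state, input symbol).
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",  # condition 3: nothing useful seen yet
    ("q2", "0"): "q2", ("q2", "1"): "q1",  # condition 2: last input was 0
    ("q1", "0"): "q1", ("q1", "1"): "q1",  # condition 1: 01 already seen
}


def accepts_01(w):
    """Run A on w and report whether it ends in the accepting state q1."""
    q = "q0"
    for symbol in w:
        q = DELTA[(q, symbol)]
    return q == "q1"


for w in ["01", "110110", "100011", "", "0", "111000"]:
    print(repr(w), accepts_01(w))
```

For any string over {0, 1}, the result should agree with Python's own substring test `"01" in w`, which gives a quick sanity check of the table.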
Simpler Notations for D-FSA
• A five-tuple with a detailed description of the δ transitions is both tedious and hard to read.
• There are two preferred notations:
  • A transition diagram, which is a graph such as the ones we have seen previously.
  • A transition table, which is a tabular listing of the δ function and which implicitly provides the set of states and the input alphabet.
Transition Diagrams
• A transition diagram for an FSA A = (Q, Σ, δ, q0, F) is a graph defined as follows:
  • For each state in Q there is a node.
  • For each state q in Q and each input symbol a in Σ, let δ(q, a) = p. The transition diagram then has an arc from node q to node p, labeled a. If several input symbols cause transitions from q to p, the diagram can have one arc labeled by the list of these symbols.
  • There is an arrow into the start state q0, labeled Start.
  • Nodes corresponding to accepting states (those in F) are marked with a double circle.
Example
• A = (Q, Σ, δ, q0, F)
• A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})
• Transition diagram of the FSA
Transition Tables
• A transition table is a conventional, tabular representation of a function like δ that takes two arguments and returns a value.
  • Rows correspond to states.
  • Columns correspond to inputs.
• Transition table for the D-FSA of the previous example
Extending the Transition Function to Strings
• A D-FSA defines a language:
  • the set of all strings that result in a sequence of state transitions from the start state to an accepting state, or alternatively,
  • in terms of the transition diagram, the set of label sequences along all the paths that lead from the start state to any accepting state.
Extending the Transition Function
• To formulate precisely the notion of the language expressed by a D-FSA:
  • Define the extended transition function δ̂ of δ.
  • It describes what happens when we start in any state and follow any sequence of inputs.
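Definition of Extended Transition Function
BASIS:
• δ̂(q, ε) = q. That is, if we are in state q and read no inputs, then we are still in state q.
INDUCTION:
• Suppose w is a string of the form xa (w = xa), where a is the last symbol of w and x is the rest of the string. Then δ̂(q, w) = δ(δ̂(q, x), a).
• Example: w = 1101 is split into x = 110 and a = 1.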
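The basis and induction steps correspond to a recursive function: the standard definition is δ̂(q, ε) = q and δ̂(q, xa) = δ(δ̂(q, x), a). The sketch below follows that recursion literally. The function name `delta_hat` is hypothetical, and the transition table used to try it out is an assumption: it is the automaton for “contains 01” from Example 2, chosen only because the slide's example string is w = 1101.

```python
def delta_hat(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    if w == "":                      # BASIS: reading no input leaves us in q
        return q
    x, a = w[:-1], w[-1]             # INDUCTION: split w = xa
    return delta[(delta_hat(delta, q, x), a)]


# Assumed example automaton (Example 2: strings containing 01).
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
}

# w = 1101 splits into x = 110 and a = 1.
print(delta_hat(DELTA, "q0", "110"))
print(delta_hat(DELTA, "q0", "1101"))
```

With δ̂ in hand, the language of a D-FSA is the set of strings w for which δ̂(q0, w) is in F.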
Design Example
• Design a D-FSA to accept the language L = {w | w has both an even number of 0’s and an even number of 1’s}.
• Solution:
  • Use states to count how many 0’s and 1’s have been seen. Since evenness only requires counting modulo 2, we need two states for each symbol of the alphabet, which gives 2 × 2 = 4 states in total.
  • Σ = {0, 1}
  • Q = {q0, q1, q2, q3}
  • q0: the numbers of 0’s and 1’s seen so far are both even. This is the accepting state; F = {q0}.
  • q1: the number of 0’s seen so far is even and the number of 1’s is odd.
  • q2: the number of 0’s seen so far is odd and the number of 1’s is even.
  • q3: the numbers of 0’s and 1’s seen so far are both odd.
Transition Diagram of D-FSA
Transition Table
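The four-state design can be written out and tested exhaustively on short strings. The diagram and table from the slides are not included in this transcript, so the transitions below are derived from the state meanings alone: reading a 0 flips the parity of 0's, and reading a 1 flips the parity of 1's. The names `DELTA` and `accepts_even` are hypothetical.

```python
from itertools import product

# q0: even 0s, even 1s   q1: even 0s, odd 1s
# q2: odd 0s,  even 1s   q3: odd 0s,  odd 1s
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}


def accepts_even(w):
    """Accept iff w has an even number of 0s and an even number of 1s."""
    q = "q0"
    for symbol in w:
        q = DELTA[(q, symbol)]
    return q == "q0"


# Compare the automaton with direct counting on all strings up to length 6.
mismatches = [
    "".join(t)
    for n in range(7)
    for t in product("01", repeat=n)
    if accepts_even("".join(t)) != (t.count("0") % 2 == 0 and t.count("1") % 2 == 0)
]
print(mismatches)
```

An empty mismatch list means the automaton and the counting definition agree on every string tested.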