Learn about DFA, its functioning, language, examples, and how to design one for C++ tokens like identifiers, integers, and floating-point numbers.
Deterministic Finite Automata Section 2.1 Fri, Sep 17, 2004
Intuitive View of an Automaton • An automaton is a machine that has an input tape and can be put into any of several states. • A string of symbols is written on the tape before execution. • The automaton begins by reading the symbols on the tape, from left to right. • Upon reading a symbol from the tape, the machine changes state and advances the tape. • After reading the last symbol, the machine halts. • The last state tells the result of the processing.
Transition Diagrams • In a transition diagram, • Each state is represented by a circle. • Each final state is represented by a circle within a circle. • Transitions are represented by arrows from one state to another state. • Transitions are labeled with an input symbol. • The following diagram represents the transition from state p to state q upon reading the symbol a. [Diagram: an arrow labeled a from state p to state q]
Example of an Automaton • Describe an automaton that reports whether the input string begins with a 0. • Describe an automaton that reports whether the input string ends with a 0. • Describe an automaton that would read the input tape and report whether the tape contained an even number or an odd number of symbols. • Describe an automaton that would read a binary string and report whether the string contained an even number of 1s and an odd number of 0s.
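The last automaton above can be sketched informally in C++ by letting the "state" be a pair of parities. This is only an illustration; the function name evenOnesOddZeros is hypothetical, and rejecting symbols other than 0 and 1 is an assumption.

```cpp
#include <string>

// Hypothetical helper: reports whether `input` has an even number of 1s
// and an odd number of 0s. The pair (oddZeros, oddOnes) plays the role of
// the machine's state, so there are four states in total.
bool evenOnesOddZeros(const std::string& input) {
    bool oddZeros = false;  // no 0s read yet, so the count is even
    bool oddOnes = false;   // no 1s read yet, so the count is even
    for (char c : input) {
        if (c == '0') {
            oddZeros = !oddZeros;
        } else if (c == '1') {
            oddOnes = !oddOnes;
        } else {
            return false;  // assumption: any other symbol rejects
        }
    }
    // The last state tells the result of the processing.
    return oddZeros && !oddOnes;
}
```

For instance, evenOnesOddZeros("110") is true (two 1s, one 0) and evenOnesOddZeros("10") is false.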
Definition of a Deterministic Finite Automaton • A deterministic finite automaton (DFA) is a quintuple (K, Σ, δ, s, F) where • K is a finite set of states. • Σ is a finite input alphabet. • s ∈ K is the initial state. • F ⊆ K is the set of final states. • δ is the transition function from K × Σ to K, i.e., δ: K × Σ → K.
The Functioning of a DFA • The DFA begins in the initial state s. • Upon reading an input symbol, the DFA changes state according to the rule expressed by the transition function δ. • After reading the last symbol on the tape, the DFA halts. • The input is accepted if the state in which the DFA halts is in F. Otherwise, the input is rejected.
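The steps above can be sketched as a short loop. This is a minimal illustration, not the only way to represent a DFA: the Dfa struct, the accepts function, and the endsWithZero example machine are hypothetical names, and treating a missing transition as rejection is an assumption (a DFA in the strict sense defines every transition).

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical representation of a DFA. States are ints; the state set
// and the alphabet are implied by the keys of the transition table.
struct Dfa {
    std::map<std::pair<int, char>, int> delta;  // transition function
    int start;                                  // initial state s
    std::set<int> finals;                       // set of final states F
};

// Runs the DFA on `input` and reports whether the input is accepted.
bool accepts(const Dfa& m, const std::string& input) {
    int state = m.start;  // begin in the initial state
    for (char symbol : input) {
        auto it = m.delta.find(std::make_pair(state, symbol));
        if (it == m.delta.end()) {
            return false;  // assumption: an undefined transition rejects
        }
        state = it->second;  // change state according to the table
    }
    // Accept exactly when the machine halts in a final state.
    return m.finals.count(state) > 0;
}

// Example machine over {0, 1}: state 1 means "the last symbol read was 0".
Dfa endsWithZero() {
    Dfa m;
    m.start = 0;
    m.finals = {1};
    m.delta = {
        {{0, '0'}, 1}, {{0, '1'}, 0},
        {{1, '0'}, 1}, {{1, '1'}, 0},
    };
    return m;
}
```

With this machine, accepts(endsWithZero(), "10") is true, while the empty string and "01" are rejected.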
The Language of a DFA • Let Σ = {0, 1}. • Design a DFA that accepts every string that begins with 0. • Design a DFA that accepts every string that ends with 0. • Design a DFA that accepts only strings of length 1. • Design a DFA that accepts only strings of length 2. • Design a DFA that accepts no string. • Design a DFA that accepts every string. • Design a DFA that accepts only the empty string.
Configurations • Given a DFA M, a configuration is a pair (q, w) where q is the current state of M and w is the remainder of the input string. • Beginning with configuration (q, w), if the next transition produces the configuration (q', w'), then we say that (q, w) yields (q', w') in one step. • That is, w = aw' for some a ∈ Σ and there is a transition δ(q, a) = q'. • This is denoted (q, w) ⊢_M (q', w').
Configurations • The configuration (q, w) yields the configuration (q', w') if there is a sequence of configurations (q₁, w₁), (q₂, w₂), …, (qₙ, wₙ) such that (qᵢ, wᵢ) yields (qᵢ₊₁, wᵢ₊₁) in one step, for i = 1, …, n – 1, and (q₁, w₁) = (q, w) and (qₙ, wₙ) = (q', w'). • This is denoted (q, w) ⊢*_M (q', w'). • The relation ⊢*_M is the reflexive, transitive closure of the relation ⊢_M.
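To make the "yields" relation concrete, the following sketch lists every configuration a small machine passes through. The names Config, delta, step, and run are hypothetical, and the transition function is an arbitrary two-state example over {0, 1} chosen for illustration.

```cpp
#include <string>
#include <vector>

// A configuration (q, w): the current state and the unread remainder.
struct Config {
    int state;
    std::string rest;
};

// Hypothetical transition function: the new state is 1 exactly when the
// symbol just read is 0, whatever the previous state was.
int delta(int /*state*/, char symbol) {
    return symbol == '0' ? 1 : 0;
}

// One application of "yields in one step".
// Precondition: c.rest is non-empty, i.e. c.rest = a w' for some symbol a.
Config step(const Config& c) {
    return Config{delta(c.state, c.rest[0]), c.rest.substr(1)};
}

// Returns the sequence of configurations from (start, w) until the
// remainder of the input is empty.
std::vector<Config> run(int start, const std::string& w) {
    std::vector<Config> trace{Config{start, w}};
    while (!trace.back().rest.empty()) {
        trace.push_back(step(trace.back()));
    }
    return trace;
}
```

Calling run(0, "110") produces the four configurations (0, 110), (0, 10), (0, 0), (1, e), so the first yields the last in three steps.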
The Language of a DFA • The string w is accepted by M if (s, w) ⊢*_M (q, e) for some state q ∈ F, where e denotes the empty string. • The language of a DFA M, denoted L(M), is L(M) = {w ∈ Σ* | M accepts w}.
Examples of DFAs • Let Σ = {0, 1}. • Design a DFA whose language is the set of all strings containing 00. • Design a DFA whose language is the set of all strings containing 00 or 11. • Design a DFA whose language is the set of all strings that do not contain 00.
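One possible answer to the first exercise, written as a transition table. The function name containsDoubleZero and the state numbering are illustrative choices, and rejecting symbols outside {0, 1} is an assumption.

```cpp
#include <string>

// One possible three-state DFA for "contains 00" over {0, 1}.
//   state 0: the last symbol was not 0 (or nothing has been read)
//   state 1: the last symbol was 0, but 00 has not been seen
//   state 2: 00 has been seen (final; the machine never leaves it)
bool containsDoubleZero(const std::string& input) {
    // delta[state][symbol], with symbol 0 or 1.
    static const int delta[3][2] = {
        {1, 0},
        {2, 0},
        {2, 2},
    };
    int state = 0;
    for (char c : input) {
        if (c != '0' && c != '1') {
            return false;  // assumption: symbols outside the alphabet reject
        }
        state = delta[state][c - '0'];
    }
    return state == 2;
}
```

Keeping the same table but making states 0 and 1 final instead of state 2 gives a machine for the strings that do not contain 00.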
C++ Tokens: Identifiers • A C++ identifier is a string of letters, digits, and underscores that begins with a letter or an underscore. • Design a DFA that will accept C++ identifiers.
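A possible design uses a start state, one final state, and a dead state. In the sketch below the function name isIdentifier and the state names are hypothetical. The sketch allows a leading underscore, as the C++ language does, and it does not exclude reserved words such as int.

```cpp
#include <cctype>
#include <string>

// Sketch of an identifier DFA with three states.
bool isIdentifier(const std::string& input) {
    enum State { Start, InIdentifier, Dead };
    State state = Start;
    for (char ch : input) {
        unsigned char c = static_cast<unsigned char>(ch);
        bool letterLike = std::isalpha(c) || c == '_';
        switch (state) {
            case Start:
                // The first symbol must be a letter or an underscore.
                state = letterLike ? InIdentifier : Dead;
                break;
            case InIdentifier:
                // Later symbols may also be digits.
                state = (letterLike || std::isdigit(c)) ? InIdentifier : Dead;
                break;
            case Dead:
                break;  // no transition leaves the dead state
        }
    }
    return state == InIdentifier;  // the only final state
}
```

Here isIdentifier("x1") and isIdentifier("_tmp") are true, while "1x", "a-b", and the empty string are rejected.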
C++ Tokens: Integers • A C++ integer may be expressed in decimal, octal, or hexadecimal. • In each base, the integer may begin with an optional + or – sign and end with an optional L or l. • Decimal • One or more decimal digits (0 – 9), first digit nonzero. • Octal • An initial 0, followed by one or more octal digits (0 – 7). • Hexadecimal • An initial 0x, followed by one or more hexadecimal digits (0 – 9, a – f, A – F).
C++ Tokens: Integers • Design a DFA that accepts C++ integers.
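One way to organize such a DFA is sketched below. It follows the rules on the previous slide rather than the full C++ grammar, so a lone 0 is rejected even though C++ itself treats 0 as a valid octal literal, and only a lowercase x is accepted in the hexadecimal prefix. The function name isInteger and the state names are hypothetical, and the ASCII hyphen '-' stands in for the minus sign.

```cpp
#include <cctype>
#include <string>

// Sketch of a DFA for integers as described on the slide:
// optional sign, then decimal, octal, or hexadecimal digits,
// then an optional L or l.
bool isInteger(const std::string& input) {
    enum State { Start, Signed, Zero, Decimal, Octal,
                 HexPrefix, Hex, Suffix, Dead };
    State state = Start;
    for (char ch : input) {
        unsigned char c = static_cast<unsigned char>(ch);
        bool octalDigit = c >= '0' && c <= '7';
        bool suffix = c == 'L' || c == 'l';
        // Where the first digit leads: 0 starts octal or hex, 1-9 decimal.
        State first = c == '0' ? Zero
                    : (c >= '1' && c <= '9') ? Decimal : Dead;
        switch (state) {
            case Start:
                state = (c == '+' || c == '-') ? Signed : first;
                break;
            case Signed:
                state = first;
                break;
            case Zero:
                state = octalDigit ? Octal : (c == 'x' ? HexPrefix : Dead);
                break;
            case Decimal:
                state = std::isdigit(c) ? Decimal : (suffix ? Suffix : Dead);
                break;
            case Octal:
                state = octalDigit ? Octal : (suffix ? Suffix : Dead);
                break;
            case HexPrefix:
                state = std::isxdigit(c) ? Hex : Dead;
                break;
            case Hex:
                state = std::isxdigit(c) ? Hex : (suffix ? Suffix : Dead);
                break;
            case Suffix:
                state = Dead;  // nothing may follow the L or l
                break;
            case Dead:
                break;
        }
    }
    // Final states: a complete number, with or without the suffix.
    return state == Decimal || state == Octal ||
           state == Hex || state == Suffix;
}
```

For example, "123", "-017L", and "0x1Fl" are accepted, while "089", "0x", and "12LL" are rejected.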
C++ Tokens: Floating-Point Numbers • A C++ floating-point number consists of the following. • The mantissa • A + or – sign, followed by zero or more digits, followed by a decimal point, followed by zero or more digits. • The exponent • An e or E, followed by a + or – sign, followed by one or more digits. • The final F or f.
C++ Tokens: Floating-Point Numbers • The mantissa and exponent + or – signs are optional. (-1.23, 1.23e-4) • Digits before the decimal point are optional, provided there is at least one digit after the decimal point, and vice versa. (.123 and 123.) • The decimal point, the exponential part, and the final F are optional, provided at least one is used. (1230f or 123e4 or 1230.)
C++ Tokens: Floating-Point Numbers • Design a DFA that accepts C++ floating-point numbers.
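A possible design is sketched below. It encodes the rules from the two preceding slides rather than the full C++ grammar, so it accepts every example given there, including 1230f. The function name isFloatingPoint and the state names are hypothetical, and the ASCII hyphen '-' stands in for the minus sign.

```cpp
#include <cctype>
#include <string>

// Sketch of a DFA for floating-point numbers as described on the slides.
// A plain digit string (state IntPart) is not final, because at least one
// of the decimal point, the exponent, or the final F must be present.
bool isFloatingPoint(const std::string& input) {
    enum State { Start, Signed, IntPart, PointOnly, Fraction,
                 ExpMark, ExpSign, ExpDigits, Suffix, Dead };
    State state = Start;
    for (char ch : input) {
        unsigned char c = static_cast<unsigned char>(ch);
        bool digit = std::isdigit(c) != 0;
        bool sign = c == '+' || c == '-';
        bool expMark = c == 'e' || c == 'E';
        bool suffix = c == 'f' || c == 'F';
        switch (state) {
            case Start:
                state = sign ? Signed : digit ? IntPart
                      : c == '.' ? PointOnly : Dead;
                break;
            case Signed:
                state = digit ? IntPart : c == '.' ? PointOnly : Dead;
                break;
            case IntPart:  // digits so far, no point yet
                state = digit ? IntPart : c == '.' ? Fraction
                      : expMark ? ExpMark : suffix ? Suffix : Dead;
                break;
            case PointOnly:  // a point with no digit before it
                state = digit ? Fraction : Dead;
                break;
            case Fraction:  // a point and at least one digit somewhere
                state = digit ? Fraction : expMark ? ExpMark
                      : suffix ? Suffix : Dead;
                break;
            case ExpMark:
                state = sign ? ExpSign : digit ? ExpDigits : Dead;
                break;
            case ExpSign:
                state = digit ? ExpDigits : Dead;
                break;
            case ExpDigits:
                state = digit ? ExpDigits : suffix ? Suffix : Dead;
                break;
            case Suffix:
                state = Dead;  // nothing may follow the final F
                break;
            case Dead:
                break;
        }
    }
    return state == Fraction || state == ExpDigits || state == Suffix;
}
```

The inputs "-1.23", "1.23e-4", ".123", "123.", "1230f", and "123e4" are accepted, while "123", ".", and "1e" are rejected.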
Programming DFAs in C++ • I have written a program that will read a description of a DFA from a file, and then simulate that DFA. • The project is named Universal DFA.mcp. • The input file contains • A list of transitions. • A list of final states. • States • Each state is represented as a nonnegative integer. • 0 is the start state.
Simulating DFAs in C++ • Input symbols • Each input symbol is a character. • Transitions • Each transition is of the form (state, symbol; state). • Example • Σ = {0, 1}. • The following DFA will accept all strings of even length. {(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)} {0}
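The following is an independent sketch of how such a description might be read and simulated; it is not the Universal DFA program itself. It assumes the comma-separated form used in the example, a list of transitions in braces followed by a list of final states in braces. The names ParsedDfa, parseDfa, and simulate are hypothetical, and a missing transition is assumed to reject.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical container for a parsed description. State 0 is the start.
struct ParsedDfa {
    std::map<std::pair<int, char>, int> delta;
    std::set<int> finals;
};

// Parses text such as "{(0, 0, 1), (1, 0, 0)} {0}" into `m`.
// Returns false if the text does not have the assumed shape.
bool parseDfa(const std::string& text, ParsedDfa& m) {
    std::istringstream in(text);
    char ch = 0;
    if (!(in >> ch) || ch != '{') return false;
    // Transition list: items "(from, symbol, to)" separated by commas.
    while (in >> ch && ch != '}') {
        if (ch == ',') continue;
        int from = 0, to = 0;
        char symbol = 0, sep1 = 0, sep2 = 0, close = 0;
        if (ch != '(' ||
            !(in >> from >> sep1 >> symbol >> sep2 >> to >> close)) {
            return false;
        }
        if (sep1 != ',' || sep2 != ',' || close != ')') return false;
        m.delta[std::make_pair(from, symbol)] = to;
    }
    if (!in) return false;  // input ended before the closing brace
    // Final-state list: integers separated by commas.
    if (!(in >> ch) || ch != '{') return false;
    for (;;) {
        in >> std::ws;
        int next = in.peek();
        if (next == '}') { in.get(); return true; }
        if (next == ',') { in.get(); continue; }
        int state = 0;
        if (!(in >> state)) return false;
        m.finals.insert(state);
    }
}

// Simulates the parsed DFA on `input`, starting in state 0.
bool simulate(const ParsedDfa& m, const std::string& input) {
    int state = 0;
    for (char symbol : input) {
        auto it = m.delta.find(std::make_pair(state, symbol));
        if (it == m.delta.end()) return false;  // assumption: reject
        state = it->second;
    }
    return m.finals.count(state) > 0;
}
```

After parsing the example description above, simulate accepts the empty string and "01" and rejects "011".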