Lexical Analysis
• Uses formalism of Regular Languages
• Regular Expressions
• Deterministic Finite Automata (DFA)
• Non-deterministic Finite Automata (NDFA)
• RE → NDFA → DFA → minimal DFA
• (F)Lex uses RE as input, builds lexer
DFAs: Formal Definition
DFA M = (Q, Σ, δ, q0, F)
• Q = states: a finite set
• Σ = alphabet: a finite set
• δ = transition function: a function in Q × Σ → Q
• q0 = initial/starting state: q0 ∈ Q
• F = final states: F ⊆ Q
DFAs: Example: strings over {a,b} with next-to-last symbol = a
[Figure: DFA state diagram with states ε, a, b, …aa, …ab, …ba, …bb and edges labeled a and b.]
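The example DFA can be simulated in a few lines. This is a minimal sketch, assuming each state simply records the last two symbols read (the names `START`, `FINAL`, `delta`, and `accepts` are illustrative and do not come from the slides):

```python
# Sketch of a DFA M = (Q, Σ, δ, q0, F) for "next-to-last symbol = a".
# Assumption: a state is the string of the last (up to) two symbols read,
# giving the seven states "", "a", "b", "aa", "ab", "ba", "bb".
ALPHABET = {"a", "b"}
START = ""             # nothing read yet (ε)
FINAL = {"aa", "ab"}   # states whose next-to-last symbol is a


def delta(state, symbol):
    """Transition function δ: remember only the last two symbols."""
    return (state + symbol)[-2:]


def accepts(word):
    """Run the DFA on word and report whether it ends in a final state."""
    state = START
    for symbol in word:
        if symbol not in ALPHABET:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = delta(state, symbol)
    return state in FINAL
```

Because δ is a function, there is exactly one run per input string, so acceptance needs only a single pass over the input.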
Nondeterministic Finite Automata
“Nondeterminism” implies having a choice.
• Multiple possible transitions from a state on a symbol.
• δ(q,a) is a set of states: δ : Q × Σ → Pow(Q)
• The set can be empty, so no need for error/nonsense state.
• Acceptance: does a path to a final state exist? I.e., try all choices.
• Also allow transitions on no input: δ : Q × (Σ ∪ {ε}) → Pow(Q)
NFAs: Formal Definition
NFA M = (Q, Σ, δ, q0, F)
• Q = states: a finite set
• Σ = alphabet: a finite set
• δ = transition function: a total function in Q × (Σ ∪ {ε}) → Pow(Q)
• q0 = initial state: q0 ∈ Q
• F = final states: F ⊆ Q
NFAs: Example: strings over {a,b} with next-to-last symbol = a
[Figure: NFA state diagram with edges labeled Σ and a, and states labeled …a and …aΣ.]
Loop until we “guess” which is the next-to-last a.
NFAs: Example: strings over {0,1,2} having (either 0-or-more 0’s or 0-or-more 1’s) followed by 0-or-more 2’s
[Figure: NFA state diagram with states 0s, 1s, 2s and edges labeled 0, 1, 2, and ε.]
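"Try all choices" can be made concrete by tracking the set of states the NFA could be in. The sketch below is one possible NFA for this language, not a transcription of the slide's diagram: the extra `start` state, the exact ε-edges, and the names `DELTA`, `eps_closure`, and `accepts` are assumptions made for illustration.

```python
# Sketch of simulating an NFA with ε-transitions by tracking state sets.
EPS = ""  # label used here for ε (a transition on no input)

# δ : Q × (Σ ∪ {ε}) → Pow(Q), stored sparsely; missing entries mean ∅.
DELTA = {
    ("start", EPS): {"0s", "1s"},  # guess: read 0's or read 1's
    ("0s", "0"): {"0s"},
    ("1s", "1"): {"1s"},
    ("0s", EPS): {"2s"},           # guess: the 0's are finished
    ("1s", EPS): {"2s"},           # guess: the 1's are finished
    ("2s", "2"): {"2s"},
}
START = "start"
FINAL = {"2s"}


def eps_closure(states):
    """All states reachable from `states` using only ε-edges."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(word):
    """Accept if some path on `word` ends in a final state."""
    current = eps_closure({START})
    for symbol in word:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, symbol), set())
        current = eps_closure(moved)
    return bool(current & FINAL)
```

An empty `current` set plays the role of the error state that the NFA definition does not need to spell out.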
Regular Expressions
Regular expression (over Σ):
• ∅
• ε
• a, where a ∈ Σ
• r+r’
• r r’
• r*
where r, r’ are regular (over Σ).
Notational shorthand:
• r^0 = ε, r^i = r r^(i-1)
• r^+ = r r*
RE → NFA
Defined inductively on structure of RE.
• This construction produces NFA with single final state.
• 6 cases: ∅, ε, a, r’+r’’, r’r’’, r’*
RE → NFA: ∅
[Figure: states q0 and qf, with no edge between them.]
Accepts nothing since no edge to final state.
RE → NFA: ε
[Figure: a single state, q0.]
RE → NFA: a
[Figure: q0 with an edge labeled a to qf.]
RE → NFA: r’+r’’
[Figure: q0 and qf joined by ε-edges to the sub-automata q’0 … q’f and q’’0 … q’’f.]
The ε-edges guess whether to use r’ or r’’.
RE → NFA: r’r’’
[Figure: q0, the sub-automata q’0 … q’f and q’’0 … q’’f, and qf, chained by ε-edges.]
Could conflate q0 with q’0, q’’f with qf.
RE → NFA: r’*
[Figure: q0, qf, and the sub-automaton q’0 … q’f, connected by ε-edges.]
Can loop r’ as many times as desired or skip it.
RE → NFA: Example (0+01)*
[Figure: NFA for (0+01)*, with edges labeled 0, 0, 1 and ε.]
RE → NFA: Notes
Most constructions produce very large NFAs.
• Not optimal for size.
• But easy to construct.
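The six inductive cases can be sketched as one recursive function. This is an illustrative sketch, not the slides' exact diagrams: the tuple encoding of regular expressions, the edge-list representation, and the names `build`, `closure`, and `matches` are all assumptions. In particular, the r’* case uses one common choice of four ε-edges.

```python
from itertools import count

EPS = None  # edge label standing for ε

# Assumed encoding of a regular expression as nested tuples:
# ("empty",), ("eps",), ("sym", a), ("alt", r1, r2), ("cat", r1, r2), ("star", r)


def build(regex, edges, fresh):
    """Add edges (src, label, dst) for regex; return its (q0, qf)."""
    kind = regex[0]
    if kind == "eps":                      # single state, initial and final
        q = next(fresh)
        return q, q
    q0, qf = next(fresh), next(fresh)
    if kind == "empty":                    # no edge, so qf is unreachable
        pass
    elif kind == "sym":
        edges.append((q0, regex[1], qf))
    elif kind == "alt":                    # ε-edges guess r' or r''
        for sub in regex[1:]:
            s0, sf = build(sub, edges, fresh)
            edges.extend([(q0, EPS, s0), (sf, EPS, qf)])
    elif kind == "cat":                    # chain r' then r''
        a0, af = build(regex[1], edges, fresh)
        b0, bf = build(regex[2], edges, fresh)
        edges.extend([(q0, EPS, a0), (af, EPS, b0), (bf, EPS, qf)])
    elif kind == "star":                   # skip r' or loop through it
        s0, sf = build(regex[1], edges, fresh)
        edges.extend([(q0, EPS, qf), (q0, EPS, s0),
                      (sf, EPS, qf), (sf, EPS, s0)])
    else:
        raise ValueError(f"unknown regex node {kind!r}")
    return q0, qf


def closure(states, edges):
    """States reachable from `states` through ε-edges only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def matches(regex, word):
    """Build the NFA for regex and simulate it on word."""
    edges = []
    q0, qf = build(regex, edges, count())
    current = closure({q0}, edges)
    for symbol in word:
        moved = {dst for src, label, dst in edges
                 if src in current and label == symbol}
        current = closure(moved, edges)
    return qf in current


# The slide's example, (0+01)*, in the assumed encoding.
EXAMPLE = ("star", ("alt", ("sym", "0"), ("cat", ("sym", "0"), ("sym", "1"))))
```

Every case except ε allocates two fresh states, which is why the resulting NFA is easy to build but far from minimal in size.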
NFA → DFA: Subset Construction
• Complicated but well described in the text
• Section 3.7.1 (pp 152-155), Algorithm 3.20 (2nd edition)
• Section 3.6 (pp 116-121) in the 1st edition
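The general idea of the subset construction is that each DFA state is a set of NFA states. The sketch below illustrates that idea only and is not a transcription of the textbook's Algorithm 3.20. The dictionary representation and the names `subset_construction`, `eps_closure`, `dfa_accepts`, and the NFA states `q0`, `q1`, `q2` are assumptions.

```python
from collections import deque

EPS = ""  # label used here for ε


def eps_closure(states, delta):
    """States reachable from `states` using only ε-edges."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def subset_construction(alphabet, delta, start, finals):
    """Return (states, dfa_delta, dfa_start, dfa_finals) of an equivalent DFA."""
    dfa_start = frozenset(eps_closure({start}, delta))
    dfa_delta, dfa_finals = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        subset = queue.popleft()
        if subset & finals:                 # contains an NFA final state
            dfa_finals.add(subset)
        for a in alphabet:
            moved = set()
            for q in subset:
                moved |= delta.get((q, a), set())
            target = frozenset(eps_closure(moved, delta))
            dfa_delta[(subset, a)] = target
            if target not in seen:          # only reachable subsets are built
                seen.add(target)
                queue.append(target)
    return seen, dfa_delta, dfa_start, dfa_finals


def dfa_accepts(dfa_delta, dfa_start, dfa_finals, word):
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals


# Assumed NFA for "next-to-last symbol = a": loop in q0, then guess the a.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "a"): {"q2"},
    ("q1", "b"): {"q2"},
}
STATES, DFA_DELTA, DFA_START, DFA_FINALS = subset_construction(
    ["a", "b"], NFA_DELTA, "q0", {"q2"})
```

Only subsets reachable from the start are generated, so this three-state NFA yields four DFA states rather than all eight possible subsets.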
Minimizing DFA
• Partition the states of the DFA, D, into two sets: final states and non-final states.
• For each partition, P, split the DFA states of P so that, within each subpartition, all DFA states have transitions into the same partition for each input symbol, x.
• Continue until no more splits are needed.
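The partitioning steps can be sketched as a refinement loop. This is a minimal sketch, assuming a total transition function stored as a dictionary and no unreachable states. The names `minimize`, `STATES`, `DELTA`, and `FINALS` are illustrative, and the input is the seven-state "last two symbols" DFA for next-to-last symbol = a, with state names chosen here.

```python
def minimize(states, alphabet, delta, finals):
    """Return the blocks (sets of equivalent states) of the minimal DFA."""
    symbols = sorted(alphabet)
    # Start with final vs. non-final states, dropping an empty side.
    partition = [b for b in (set(finals), set(states) - set(finals)) if b]
    while True:
        block_of = {q: i for i, block in enumerate(partition) for q in block}
        refined = []
        for block in partition:
            groups = {}
            for q in block:
                # Which block does q reach on each input symbol?
                signature = tuple(block_of[delta[(q, x)]] for x in symbols)
                groups.setdefault(signature, set()).add(q)
            refined.extend(groups.values())
        if len(refined) == len(partition):  # nothing was split: done
            return refined
        partition = refined


# Assumed input DFA: a state is the last (up to) two symbols read.
STATES = ["", "a", "b", "aa", "ab", "ba", "bb"]
ALPHABET = {"a", "b"}
DELTA = {(q, x): (q + x)[-2:] for q in STATES for x in ALPHABET}
FINALS = {"aa", "ab"}

BLOCKS = minimize(STATES, ALPHABET, DELTA, FINALS)
```

Each block becomes one state of the minimal DFA. Here the seven states collapse into four blocks, since states such as "a" and "ba" behave identically on every remaining input.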