Minimizing the number of states in DFAs
Paul Richardson
Student Lecture, 6 November
Course 08.73.11 Algorithms, Logic and Complexity
University of Iceland, Department of Computer Science
DFA use
• DFAs are widely used as pattern recognisers, e.g.:
  • in compilers
  • in text processing.
• Text processing tasks use increasingly large data sets, which usually require more complex DFAs
  • e.g. reference corpora (the Bank of English corpus has 400 million words)
• Smaller DFAs (fewer states) use less space and time.
• DFAs are minimised to save resources.
• The two uses above (compilers and text processing) are essentially the same processing problem.
Compiler conversion of regular expression
input → regular expression
regular expression → NFA by Thompson's construction
NFA → DFA by subset construction
DFA → DFAMIN by minimisation algorithm
DFAMIN → parser
Thompson's Construction
NFA generated automatically for the regex (a ∪ b)*abb
Subset Construction: NFA N → DFA D
Let N = (Q, Σ, δ, q0, F) and D = (Q´, Σ, δ´, q0´, F´)
Let Q´ = P(Q) = { Ø, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3} }
The start state of D is the set of all states reachable from q0 by λ-arrows: q0´ = {1,3}
The accept states of D are the subsets that contain an accept state of N: F´ = {{1}, {1,2}, {1,3}, {1,2,3}}
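A minimal Python sketch of the subset construction may help here. The slide does not list the transitions of N, so the three-state NFA below (start state 1, accept state 1, one λ-arrow from 1 to 3) is an assumption chosen to agree with the start and accept sets shown above; the function names are hypothetical. Unlike the slide, which takes all of P(Q), the sketch builds only the subsets reachable from the start state.

```python
def lambda_closure(states, lam):
    """All states reachable from `states` using only λ-arrows."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in lam.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(alphabet, delta, lam, start, accepting):
    """Build the reachable part of DFA D from NFA N."""
    d_start = lambda_closure({start}, lam)
    d_delta, seen, todo = {}, {d_start}, [d_start]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            # Union of the NFA moves of every member, then close under λ.
            moved = set().union(*(delta.get((s, a), ()) for s in subset))
            target = lambda_closure(moved, lam)
            d_delta[subset, a] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset accepts if it contains at least one accept state of N.
    d_accept = {s for s in seen if s & set(accepting)}
    return seen, d_delta, d_start, d_accept

# Assumed example NFA (transitions are illustrative, not from the slide).
nfa_delta = {(1, "b"): {2}, (2, "a"): {2, 3}, (2, "b"): {3}, (3, "a"): {1}}
nfa_lambda = {1: {3}}
d_states, d_delta, d_start, d_accept = subset_construction(
    "ab", nfa_delta, nfa_lambda, start=1, accepting={1})
print(sorted(d_start))  # the λ-closure of state 1
```

With this assumed NFA the subsets {1} and {1,2} are never reached, so they do not appear in the constructed DFA even though they belong to P(Q).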
DFA → DFAMIN
• All minimisation algorithms find equivalent states and merge them into one state, collapsing the DFA to fewer states. Many variations on this theme have been published.
• The most commonly presented algorithm is a partitioning algorithm that runs in O(n²) time.
• Hopcroft published an O(n log n) algorithm in 1971.
• Much of the literature studies this algorithm and its variations.
Partitioning Algorithm O(n²)
1. Partition the states into blocks by final/not final: (1,2,3,4,5) (6)
2. Partition each block by successor state. On input 0: (1,2,3,4) (5) (6)
3. Repeat until no new blocks are generated, i.e. (1,2,3) (4) (5) (6) ... (1) (2) (3) (4) (5) (6)
4. Combine the members of each block to form a single state.
In this case all states are inequivalent, so the DFA was already minimal and nothing is merged. Normally some blocks contain more than one state, and each such block becomes one state.
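The four steps can be sketched in Python as follows. The DFA diagram from the slide is not available, so the six-state chain DFA below is an assumption, picked because it reproduces the sequence of partitions listed above; `refine_partition` is a hypothetical name.

```python
def refine_partition(states, alphabet, delta, accepting):
    """Split blocks until all states in a block agree on successor blocks."""
    final = frozenset(accepting)
    # Step 1: final / not final.
    blocks = [b for b in (frozenset(states) - final, final) if b]
    history = [sorted(blocks, key=min)]
    while True:
        block_of = {s: i for i, b in enumerate(blocks) for s in b}
        new_blocks = []
        for block in blocks:
            # Step 2: group states by the blocks their successors fall in.
            groups = {}
            for s in block:
                signature = tuple(block_of[delta[s, a]] for a in alphabet)
                groups.setdefault(signature, set()).add(s)
            new_blocks.extend(frozenset(g) for g in groups.values())
        # Step 3: stop when a round produces no new blocks.
        if len(new_blocks) == len(blocks):
            return history
        blocks = new_blocks
        history.append(sorted(blocks, key=min))

# Assumed example: on 0 move one state along the chain, on 1 return to 1.
states = range(1, 7)
delta = {}
for s in states:
    delta[s, "0"] = min(s + 1, 6)
    delta[s, "1"] = 1

history = refine_partition(states, "01", delta, accepting={6})
for blocks in history:
    print(" ".join("(" + ",".join(map(str, sorted(b))) + ")" for b in blocks))
```

Each round costs time proportional to n for a fixed alphabet and there can be up to n rounds, which is where the O(n²) bound comes from. Step 4 (merging each block into one state) is left out because every block in this example ends up with a single member.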
Partitioning Algorithm O(n log n), Hopcroft 1971
This algorithm may need n iterations, but each iteration does less work, which gives O(n log n) overall. A block does not need to be partitioned on again with an input symbol until it is split; after a split we need only partition on one of the two sub-blocks, and we can always choose the smaller of the two, which needs less processing. [Hopcroft]
1. Invert the state table.
2. Partition by final/not final: (1,2,3,4,5) (6)
3. Select a block and an input symbol, e.g. ((6),0).
4. Split the blocks according to whether a state moves into (6) on input 0, giving (1,2,3,4) (5) (6). (N.B. using ((1,2,3,4,5),0) in step 3 is equivalent.)
5. Iterate, choosing the smaller block after each split.
6. Combine the members of each block to form a single state.
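A simplified Python sketch of these steps is given below. It follows the worklist idea (inverted table, split on predecessors, keep the smaller half), but for clarity it scans every block for each splitter, so it does not reach the O(n log n) bound that a careful implementation achieves. The chain DFA is an assumed example and the function name is hypothetical; the DFA is assumed to be complete.

```python
from collections import defaultdict

def hopcroft_partition(states, alphabet, delta, accepting):
    # Step 1: invert the state table: inverse[t, a] = states that move to t on a.
    inverse = defaultdict(set)
    for (s, a), t in delta.items():
        inverse[t, a].add(s)
    # Step 2: final / not final.
    final = frozenset(accepting)
    partition = {b for b in (final, frozenset(states) - final) if b}
    # Step 3: start with the smaller block as the only splitter.
    worklist = {min(partition, key=len)}
    while worklist:
        splitter = worklist.pop()
        for a in alphabet:
            # States that enter the splitter block on input a.
            pred = set().union(*(inverse[t, a] for t in splitter))
            for block in list(partition):
                inside, outside = block & pred, block - pred
                if not inside or not outside:
                    continue  # Step 4: this block is not split.
                partition.remove(block)
                partition.update((inside, outside))
                if block in worklist:
                    worklist.remove(block)
                    worklist.update((inside, outside))
                else:
                    # Step 5: only the smaller half needs processing.
                    worklist.add(min(inside, outside, key=len))
    return sorted(partition, key=min)

# Assumed example: on 0 move one state along the chain, on 1 return to 1.
states = range(1, 7)
delta = {}
for s in states:
    delta[s, "0"] = min(s + 1, 6)
    delta[s, "1"] = 1

blocks = hopcroft_partition(states, "01", delta, accepting={6})
print(blocks)
```

Starting from the splitter ((6),0), the first split separates state 5 from (1,2,3,4), as in step 4 above. Step 6 would then merge each returned block into one state.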
References
• John Hopcroft, An n log n Algorithm for Minimizing States in a Finite Automaton, STAN-CS-71-190, January 1971
• Sipser, Introduction to the Theory of Computation, ISBN 0-619-21764-2
• Google Scholar: various web sites on algorithms, research papers and teaching material.
Questions?