230 likes | 436 Views
CPSC 388 – Compiler Design and Construction. Scanner – Regular Expressions to DFA. Announcements. ACM Programming contest (Tues 8pm) PROG 1 Feedback Linux Install Fest – When? Saturday?, Fliers, CDROMS, Bring Laptops (do at own risk) LUG Understanding Editors (Eclipse, Vi, Emacs).
E N D
CPSC 388 – Compiler Design and Construction Scanner – Regular Expressions to DFA
Announcements • ACM Programming contest (Tues 8pm) • PROG 1 Feedback • Linux Install Fest – When? Saturday?, Fliers, CDROMS, Bring Laptops (do at own risk) • LUG • Understanding Editors (Eclipse, Vi, Emacs)
Scanners Source Code Lexical Analyzer (Scanner) Token Stream Deterministic Finite State Automata Regular Expression Nondeterministic Finite State Automata
Regular Expressions • Easy way to express a language that is accepted by FSA • Rules: • ε is a regular expression • Any symbol in Σ is a regular expression If r and s are any regular expressions then so is: • r|s denotes union e.g. “r or s” • rs denotes r followed by s (concatination) • (r)* denotes concatination of r with itself zero or more times (Kleene closer) • () used for controlling order of operations
* Cat a | a b RE to NFA: Step 1 • Create a tree from the Regular Expression • Example (a(a|b))* • Leaf Nodes are either • members of Σ • or ε • Internal Nodes are operators • cat, |, *
RE to NFA: Step 2 • Do a Post-Order Traversal of Tree(children processed before parent) • At each node follow rules for conversion from a RE to a NFA
F F Leaf Nodes • Either εor member ofΣ * ε S Cat a | a S a b
Internal Nodes • Need to keep track of left (l)and right (r) NFA and merge them into a single NFA • Or • Concatination • Kleene Closure
F Or Node l ε ε S ε r ε
F Kleene Closure ε ε ε S ε
Try It • Convert the regular expression to a NFA (a|b)*abb • First convert RE to a tree • Then convert tree to NFA
NFA to DFA • Recall that a DFA can be represented as a transition table
Operations on NFA • ε-closure(t) – Set of NFA states reachable from NFA state t on ε-transitions alone. • ε-closure(T) – Set of NFA states reachable from some NFA state t in set T on ε-transitions alone. • move(T,a) – Set of NFA states to which there is a transition on input symbol a from some state t in T
NFA to DFA Algorithm Initially ε-closure(s) is the only state in DFA and it is unmarked While (there is unmarked state T in DFA) mark T; for (each input symbol a) { U = ε-closure(move(T,a)); if (U not in DFA) add U unmarked to DFA transition[T,a]=U;
Try it • Take NFA from previous example and construct DFA Regular Expression: (a|b)*abb ε a 2 3 ε ε ε 6 7 a 8 b 9 b F ε S 1 ε ε 4 5 b ε
Corresponding DFA b C 1,2,4, 5,6,7 b b a NewS S,1,2,4,7 B 1,2,3,4 6,7,8 D 1,2,4,5, 6,7,9 NewF 1,2,4,5, 6,7,F a b b a a a
Start State and Accepting States • The Start State for the DFA isε-closure(s) • The accepting states in the DFA are those states that contain an accepting state from the NFA
Efficiency of Algorithms • RE -> NFA • NFA -> DFA • Recognition of a string by DFA O(|r|) where |r| is the size of the RE O(|r|22|r|) – worst case(not seen in typical programming languages) O(|x|) where |x| is length of string
4 2 More Practice • Convert RE to NFA ((ε|a)b*)* • Convert NFA to DFA a 1 a ε S b 3 ε b
Solution to Practice • RE to NFA ε ε 2 3 ε ε ε ε ε ε 6 7 b 8 9 F ε S 1 ε ε 4 5 a ε ε
Solution to Practice • NFA to DFA a A 2 a NewS S,1,3 B 4 b b
Summary of Scanners • Lexemes • Tokens • Regular Expressions, Extended RE • Regular Definitions • Finite Automata (DFA & NFA) • Conversion from RE->NFA->DFA • JLex