Other Issues - § 3.9 – Not Discussed
• More advanced algorithm construction – regular expression to DFA directly
Final Notes: R.E. to NFA Construction
• An NFA constructed using the previous techniques may be simulated by an algorithm.
• Algorithm run time is proportional to |N| * |x|, where |N| is the number of states and |x| is the length of the input.
• Alternatively, we can construct a DFA from the NFA and use the resulting Dtran to recognize input:

         space required    time to simulate
  NFA    O(|r|)            O(|r| * |x|)
  DFA    O(2^|r|)          O(|x|)

  where |r| is the length of the regular expression.
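The |N| * |x| cost of direct NFA simulation can be seen in a minimal C sketch. It assumes a hand-built four-state NFA for (a|b)*abb with no ε-transitions, so the ε-closure step is omitted. The names move_table, ACCEPTING and nfa_accepts are hypothetical and chosen only for illustration. Each input character triggers one pass over all states, which gives the |N| * |x| behaviour.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 4, NUM_SYMBOLS = 2 };

/* Assumed NFA for (a|b)*abb, written by hand for this sketch.
 * move_table[s][c] is a bitmask of the states reachable from state s
 * on symbol c, where c = 0 means 'a' and c = 1 means 'b'. */
static const unsigned move_table[NUM_STATES][NUM_SYMBOLS] = {
    { (1u << 0) | (1u << 1), (1u << 0) },  /* state 0: loops on a and b, guesses the final "abb" on a */
    { 0,                     (1u << 2) },  /* state 1: b leads to state 2 */
    { 0,                     (1u << 3) },  /* state 2: b leads to state 3 */
    { 0,                     0         },  /* state 3: accepting, no moves */
};
static const unsigned ACCEPTING = 1u << 3;

/* Returns 1 if the NFA accepts x, else 0.
 * The outer loop runs |x| times and the inner loop |N| times. */
int nfa_accepts(const char *x)
{
    assert(x != NULL);
    unsigned current = 1u << 0;            /* set of current states: {0} */
    for (; *x != '\0'; ++x) {
        if (*x != 'a' && *x != 'b')
            return 0;                      /* symbol outside the alphabet */
        int c = *x - 'a';
        unsigned next = 0;
        for (int s = 0; s < NUM_STATES; ++s)
            if (current & (1u << s))
                next |= move_table[s][c];
        current = next;
    }
    return (current & ACCEPTING) != 0;
}
```

Calling nfa_accepts("aababb") returns 1 and nfa_accepts("ab") returns 0. Storing each state set as a bitmask is a design choice of this sketch and only works while the number of states fits in an unsigned int.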
Pulling Together Concepts
• Designing Lexical Analyzer Generator
  • Reg. Expr. → NFA construction
  • NFA → DFA conversion
  • DFA simulation for lexical analyzer
• Recall Lex Structure
    Pattern   Action
    Pattern   Action
    …         …
  • Each pattern recognizes lexemes (a recognizer!)
  • Each pattern is described by a regular expression, e.g. (a | b)*abb, (abc)*ab, etc.
Lex Specification → Lexical Analyzer
• Let P1, P2, … , Pn be Lex patterns (regular expressions for valid tokens in prog. lang.)
• Construct N(P1), N(P2), … , N(Pn)
• Note: the accepting state of N(Pi) will be marked by Pi
• Construct NFA: [Figure: a new start state with ε-transitions to N(P1), N(P2), … , N(Pn)]
• Lex applies the conversion algorithm to construct a DFA that is equivalent!
Pictorially
(a) Lex compiler: Lex Specification → Lex Compiler → Transition Table
(b) Schematic lexical analyzer: [Figure: input buffer holding the lexeme, read by the FA Simulator, which consults the Transition Table]
Example
3 patterns:  P1 : a    P2 : abb    P3 : a*b+
NFA's:
  P1:  start -> 1 -a-> 2
  P2:  start -> 3 -a-> 4 -b-> 5 -b-> 6
  P3:  start -> 7 -b-> 8, with a loop on a at state 7 and a loop on b at state 8
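Example – continued (2)
Combined NFA: [Figure: new start state 0 with ε-transitions to states 1, 3 and 7; the NFA's for P1, P2 and P3 are otherwise unchanged]
Examples
  Input a a b a:
    {0,1,3,7} -a-> {2,4,7} -a-> {7} -b-> {8} -a-> death
    pattern matched:  {0,1,3,7}: -    {2,4,7}: P1    {7}: -    {8}: P3    death: -
  Input a b b:
    {0,1,3,7} -a-> {2,4,7} -b-> {5,8} -b-> {6,8}
    pattern matched:  {0,1,3,7}: -    {2,4,7}: P1    {5,8}: P3    {6,8}: P2, P3 (break tie in favor of P2)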
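The two traces above can be reproduced with a minimal C sketch that simulates the combined NFA directly, keeping the current state set as a bitmask and remembering the last pattern matched. The transition tables are transcribed from the example. The function names closure, nfa_move, best_pattern and nfa_longest_match are hypothetical. Choosing the lowest-numbered pattern on a tie is this sketch's way of breaking the {6,8} tie in favor of P2.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 9 };
#define BIT(s) (1u << (s))

/* Combined NFA for P1 = a, P2 = abb, P3 = a*b+ (states 0..8). */
static const unsigned eps[NUM_STATES]  = { BIT(1) | BIT(3) | BIT(7) };  /* only state 0 has ε-moves */
static const unsigned on_a[NUM_STATES] = { 0, BIT(2), 0, BIT(4), 0, 0, 0, BIT(7), 0 };
static const unsigned on_b[NUM_STATES] = { 0, 0, 0, 0, BIT(5), BIT(6), 0, BIT(8), BIT(8) };
/* pattern_of[s] is the pattern number marked on accepting state s, 0 if s is not accepting. */
static const int pattern_of[NUM_STATES] = { 0, 0, 1, 0, 0, 0, 2, 0, 3 };

/* ε-closure: keep adding ε-successors until the set stops growing. */
static unsigned closure(unsigned set)
{
    unsigned prev;
    do {
        prev = set;
        for (int s = 0; s < NUM_STATES; ++s)
            if (set & BIT(s))
                set |= eps[s];
    } while (set != prev);
    return set;
}

/* All states reachable from `set` on input character c ('a' or 'b'). */
static unsigned nfa_move(unsigned set, char c)
{
    const unsigned *table = (c == 'a') ? on_a : on_b;
    unsigned out = 0;
    for (int s = 0; s < NUM_STATES; ++s)
        if (set & BIT(s))
            out |= table[s];
    return out;
}

/* Lowest-numbered pattern among the accepting states in `set`, or 0 if none. */
static int best_pattern(unsigned set)
{
    int best = 0;
    for (int s = 0; s < NUM_STATES; ++s)
        if ((set & BIT(s)) && pattern_of[s] != 0 &&
            (best == 0 || pattern_of[s] < best))
            best = pattern_of[s];
    return best;
}

/* Returns the length of the longest prefix of input matching some pattern
 * and stores that pattern's number in *pattern (0 if nothing matched). */
int nfa_longest_match(const char *input, int *pattern)
{
    assert(input != NULL && pattern != NULL);
    unsigned current = closure(BIT(0));
    int length = 0;
    *pattern = 0;
    for (int i = 0; input[i] == 'a' || input[i] == 'b'; ++i) {
        current = closure(nfa_move(current, input[i]));
        if (current == 0)
            break;                         /* "death": no state survives */
        int p = best_pattern(current);
        if (p != 0) {
            length = i + 1;
            *pattern = p;
        }
    }
    return length;
}
```

On input "aaba" the sketch reports a lexeme of length 3 matched by P3, and on "abb" a lexeme of length 3 matched by P2, in line with the traces.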
Example – continued (3)
Alternatively, construct DFA (keep track of correspondence between patterns and new accepting states):

                 Input Symbol
  STATE          a          b          Pattern
  {0,1,3,7}      {2,4,7}    {8}        none
  {2,4,7}        {7}        {5,8}      P1
  {8}            -          {8}        P3
  {7}            {7}        {8}        none
  {5,8}          -          {6,8}      P3
  {6,8}          -          {8}        P2   (break tie in favor of P2)
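A table-driven recognizer over this DFA might look like the following C sketch. The table rows are transcribed from the slide in the same order, with the six state sets renumbered 0..5. The names dtran, pattern_of, DEAD and longest_match are hypothetical. The sketch assumes the input alphabet is just a and b and stops at the first other character.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 6, NUM_SYMBOLS = 2, DEAD = -1 };

/* Rows follow the table: 0 = {0,1,3,7}, 1 = {2,4,7}, 2 = {8},
 * 3 = {7}, 4 = {5,8}, 5 = {6,8}. Column 0 is input a, column 1 is input b. */
static const int dtran[NUM_STATES][NUM_SYMBOLS] = {
    { 1,    2 },
    { 3,    4 },
    { DEAD, 2 },
    { 3,    2 },
    { DEAD, 5 },
    { DEAD, 2 },
};
/* Pattern announced in each state (0 = none); {6,8} already carries P2, not P3. */
static const int pattern_of[NUM_STATES] = { 0, 1, 3, 0, 3, 2 };

/* Returns the length of the longest prefix of input that matches a pattern
 * and stores the pattern number in *pattern (0 if nothing matched). */
int longest_match(const char *input, int *pattern)
{
    assert(input != NULL && pattern != NULL);
    int state = 0;
    int length = 0;
    *pattern = 0;
    for (int i = 0; input[i] == 'a' || input[i] == 'b'; ++i) {
        state = dtran[state][input[i] - 'a'];
        if (state == DEAD)
            break;
        if (pattern_of[state] != 0) {      /* remember the last accepting state seen */
            length = i + 1;
            *pattern = pattern_of[state];
        }
    }
    return length;
}
```

Each character costs one table lookup, so recognition time is proportional to |x|. The tie between P2 and P3 is resolved once, when the table is built, rather than at run time.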
Minimizing the Number of States of DFA
1. Construct initial partition Π of S with two groups: accepting / non-accepting.
2. (Construct Πnew) For each group G of Π do begin
     Partition G into subgroups such that two states s, t of G are in the same subgroup iff for all input symbols a, states s and t have transitions on a to states of the same group of Π.
     Replace G in Πnew by the set of all these subgroups.
   end
3. Compare Πnew and Π. If equal, set Πfinal := Π and proceed to 4; else set Π := Πnew and go to 2.
4. Aggregate the states belonging to each group of Πfinal into a single state.
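The partition-refinement steps can be sketched in C as follows. The input DFA is an assumption of this sketch: the familiar five-state DFA for (a|b)*abb with states A..E numbered 0..4, not the automaton drawn on the slides. The names dtran, accepting, equivalent and minimize are hypothetical. The loop stops when a pass produces no new groups, which corresponds to the test that Πnew equals Π.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 5, NUM_SYMBOLS = 2 };

/* Assumed DFA for (a|b)*abb: A=0, B=1, C=2, D=3, E=4; column 0 is a, column 1 is b. */
static const int dtran[NUM_STATES][NUM_SYMBOLS] = {
    { 1, 2 },  /* A */
    { 1, 3 },  /* B */
    { 1, 2 },  /* C */
    { 1, 4 },  /* D */
    { 1, 2 },  /* E */
};
static const int accepting[NUM_STATES] = { 0, 0, 0, 0, 1 };

/* s and t stay together iff they are in the same group and, on every
 * symbol, move to states of the same group of the current partition. */
static int equivalent(const int *group, int s, int t)
{
    if (group[s] != group[t])
        return 0;
    for (int c = 0; c < NUM_SYMBOLS; ++c)
        if (group[dtran[s][c]] != group[dtran[t][c]])
            return 0;
    return 1;
}

/* Fills group[s] with the final group number of each state and
 * returns the number of groups, i.e. states of the minimized DFA. */
int minimize(int group[NUM_STATES])
{
    assert(group != NULL);
    for (int s = 0; s < NUM_STATES; ++s)
        group[s] = accepting[s];           /* step 1: accepting / non-accepting */
    int count = 0;                         /* 0 forces at least one refinement pass */
    for (;;) {
        int next[NUM_STATES];
        int next_count = 0;
        for (int s = 0; s < NUM_STATES; ++s) {   /* step 2: build the new partition */
            next[s] = -1;
            for (int t = 0; t < s; ++t)
                if (equivalent(group, s, t)) {
                    next[s] = next[t];
                    break;
                }
            if (next[s] < 0)
                next[s] = next_count++;
        }
        int stable = (next_count == count);      /* step 3: no group was split */
        for (int s = 0; s < NUM_STATES; ++s)
            group[s] = next[s];
        count = next_count;
        if (stable)
            return count;                        /* step 4 uses group[] to merge states */
    }
}
```

For the assumed DFA, minimize reports four groups, with A and C falling into the same group. Because each new partition refines the previous one, an unchanged group count is enough to detect that the partition itself is unchanged.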
Example
[Figure: DFA with states A, B, C, D, F and transitions on a and b]
Minimized DFA: [Figure: two states, {A,C,D} and {B,F}, with transitions on a and b]
Using LEX
Lex Program Structure:
  declarations
  %%
  translation rules
  %%
  auxiliary procedures
Name the file, e.g. test.lex. Then "lex test.lex" produces the file "lex.yy.c" (a C program).
LEX
Parts of the program, top to bottom: C declarations (between %{ and %}), declarations, rules, auxiliary procedures.

%{
/* definitions of all constants
   LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ... */
%}
......
letter   [A-Za-z]
digit    [0-9]
id       {letter}({letter}|{digit})*
......
%%
if       { return(IF); }
then     { return(THEN); }
{id}     { yylval = install_id(); return(ID); }
......
%%
install_id()
{
    /* procedure to install the lexeme to the ST */
}
Example of a Lex Program
%{
#include <stdio.h>
int num_lines = 0, num_chars = 0;
%}
%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%
int main(int argc, char **argv)
{
    ++argv, --argc;   /* skip over program name */
    if (argc > 0)
        yyin = fopen(argv[0], "r");
    else
        yyin = stdin;
    yylex();
    printf("# of lines = %d, # of chars = %d\n", num_lines, num_chars);
    return 0;
}
Another Example
%{
#include <stdio.h>
%}
WS      [ \t\n]*
%%
[0123456789]+           printf("NUMBER\n");
[a-zA-Z][a-zA-Z0-9]*    printf("WORD\n");
{WS}                    /* do nothing */
.                       printf("UNKNOWN\n");
%%
int main(int argc, char **argv)
{
    ++argv, --argc;   /* skip over program name */
    if (argc > 0)
        yyin = fopen(argv[0], "r");
    else
        yyin = stdin;
    yylex();
    return 0;
}
Concluding Remarks
Focused on Lexical Analysis Process, Including
• Regular Expressions
• Finite Automata
• Conversion
• Lex
• Interplay among all these various aspects of lexical analysis
Looking Ahead:
The next step in the compilation process is Parsing:
• Top-down vs. Bottom-up
• Relationship to Language Theory