SCRIBE SUBMISSION, GROUP 8
Date: 7/8/2013
By: IKHAR SUSHRUT MEGHSHYAM, 11CS10017
Topic Covered: Detecting lexemes from a given set of patterns/stream of chars

Index
• Lexical Analyser
• Constructing Tokens
• State-Transition Diagram
• S-T Diagrams of Operators, Variables, Digits
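Lexical Analyser
[Figure: a stream of characters enters the lexical analyser (shown as a black box). Patterns are given as regular expressions, each defining a regular language. The analyser matches lexemes against the patterns, constructs tokens, and passes them to the parser.]
• Describe patterns using regular expressions.
• For each specific pattern we can define a regular expression, which corresponds to a regular language.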
Construct Tokens for specified set of patterns
Tokens for some patterns:
1. Keywords: if <if>, else <else>, while <while>, then <then>, do <do>
2. Operators: > <op, GT>, >= <op, GE>, < <op, LT>, <= <op, LE>, = <op, EQ>
3. Variables: start with a letter, followed by letters/digits/underscores
   <id, pointer to symbol table>
4. Numbers: whole numbers and floating-point numbers
   <number, pointer to constant table>
5. Whitespace: tab/newline/space
   No tokens will be created
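The token classes listed above can be sketched as a small regular-expression tokenizer. This is only an illustration under stated assumptions, not the scribe's implementation: the names KEYWORDS, OP_NAMES, TOKEN_SPEC and tokenize are hypothetical, and the lexeme text stands in for the symbol-table and constant-table pointers.

```python
import re

KEYWORDS = {"if", "else", "while", "then", "do"}
OP_NAMES = {">": "GT", ">=": "GE", "<": "LT", "<=": "LE", "=": "EQ"}

# Hypothetical pattern table. Inside "op", the two-character operators are
# listed first so that ">=" is not split into ">" followed by "=".
TOKEN_SPEC = [
    ("number", r"\d+(?:\.\d+)?"),        # whole or floating-point number
    ("id", r"[A-Za-z][A-Za-z0-9_]*"),    # letter, then letters/digits/underscores
    ("op", r">=|<=|>|<|="),
    ("ws", r"[ \t\n]+"),                 # whitespace: no token is created
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token name, attribute) pairs for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue
        if kind == "id" and lexeme in KEYWORDS:
            tokens.append((lexeme, None))             # keyword token, e.g. <while>
        elif kind == "op":
            tokens.append(("op", OP_NAMES[lexeme]))   # e.g. <op, GE>
        else:
            # Assumption: the lexeme stands in for the table pointer.
            tokens.append((kind, lexeme))
    return tokens
```

For example, tokenize("while x1 >= 3.5 do") yields the keyword token for while, an id token for x1, <op, GE>, a number token for 3.5, and the keyword token for do.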
STATE-TRANSITION DIAGRAM
An S-T diagram is a directed graph whose nodes are states and whose directed edges correspond to transitions from one state to another.
[Figure: example S-T diagram over ∑ = {a, b}, showing the input, the start state, edges labelled a and b, and the final states.]
• For an input string X, if the machine M, defined over the alphabet ∑, ends in a final state after reading X, then X is accepted by M.
• L(M) denotes the set of all strings accepted by M.
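The acceptance rule can be sketched as a table-driven simulation. The machine below is a hypothetical example over ∑ = {a, b} (it accepts strings ending in a) and is not the machine drawn on the slide; the names accepts, TRANSITIONS, q0 and q1 are assumptions made for illustration.

```python
def accepts(transitions, start, finals, text):
    """Return True if the machine ends in a final state after reading text."""
    state = start
    for ch in text:
        key = (state, ch)
        if key not in transitions:
            return False  # no edge for this symbol: the string is rejected
        state = transitions[key]
    return state in finals

# Hypothetical machine M over {a, b}: q1 is reached exactly when the
# last symbol read was 'a'.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q1",
    ("q1", "b"): "q0",
}
START = "q0"
FINALS = {"q1"}
```

With this table, L(M) is the set of strings over {a, b} that end in a, so accepts(TRANSITIONS, START, FINALS, "abba") is true and the same call with "ab" is false.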
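S-T Diagrams for some patterns
1. S-T Diagram for ‘while’
[Figure: from the start state, edges labelled w, h, i, l, e lead through successive states; a final edge labelled \0 reaches the final state, which is marked * for backtracking.]
Token: <while>
2. S-T Diagram for ‘digits’
[Figure: from the start state, a digit edge leads to a state that loops on digit; a '.' edge followed by a digit edge leads to a second state that loops on digit; from either digit state, any other symbol leads to a final state marked * for backtracking.]
digit : [0-9]
digits : {digit}*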
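The backtracking marked by * in the digits diagram can be sketched as a hand-coded scanner that stops at the first "other symbol" without consuming it. The function name scan_number and the state names are hypothetical. One assumption goes beyond the slide: if a '.' is not followed by a digit, the sketch falls back to the whole-number part.

```python
DIGITS = "0123456789"

def scan_number(text, pos=0):
    """Return (lexeme, next_pos) for a number starting at pos, or None."""
    state = "start"
    i = pos
    last_accept = None  # end index of the longest valid number seen so far
    while True:
        ch = text[i] if i < len(text) else "\0"  # "\0" stands for end of input
        if state in ("start", "int") and ch in DIGITS:
            state = "int"
        elif state == "int" and ch == ".":
            state = "dot"
        elif state in ("dot", "frac") and ch in DIGITS:
            state = "frac"
        else:
            break  # "other symbol": stop here and leave it unread
        i += 1
        if state in ("int", "frac"):
            last_accept = i
    if last_accept is None:
        return None
    # Backtrack to the last accepting position (assumption for inputs like "12.x").
    return text[pos:last_accept], last_accept
```

For instance, scanning "42.5+x" from position 0 returns the lexeme 42.5 and the position of '+', so the next scan starts at the symbol that ended the number.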
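3. S-T Diagram for ‘operators’
[Figure: from the start state there are three branches.
• On '<': a following '=' gives <OP,LE>; \0 or any other symbol gives <OP,LT>, marked * for backtracking.
• On '>': a following '=' gives <OP,GE>; \0 or any other symbol gives <OP,GT>, marked * for backtracking.
• On '=': gives <OP,EQ>.]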
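The operator diagram amounts to a one-character lookahead: after '<' or '>', the scanner peeks at the next symbol and only consumes it when it is '='. A minimal sketch follows; the name scan_operator and the tuple form of the tokens are assumptions.

```python
def scan_operator(text, pos=0):
    """Return (token, next_pos) for an operator at pos, or None."""
    ch = text[pos] if pos < len(text) else "\0"
    nxt = text[pos + 1] if pos + 1 < len(text) else "\0"  # lookahead symbol
    if ch == "<":
        # '=' is consumed; any other symbol is left unread (the * edge).
        return (("op", "LE"), pos + 2) if nxt == "=" else (("op", "LT"), pos + 1)
    if ch == ">":
        return (("op", "GE"), pos + 2) if nxt == "=" else (("op", "GT"), pos + 1)
    if ch == "=":
        return (("op", "EQ"), pos + 1)
    return None
```

The returned position shows the backtracking: for "<1" it is 1, so the '1' is still available to the next scan, while for "<=1" it is 2.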
4. S-T Diagram for ‘variables’
[Figure: from the start state, a letter edge leads to a state that loops on letters/digits/underscore; anything else leads to a final state marked * for backtracking.]
To distinguish between keywords and variables, we try to maximize the length of the lexeme. We apply “parallel simulation” to all the above S-T diagrams and choose the token whose lexeme has the maximum length.
letter : [A-Za-z]
letters : {letter}*
digit : [0-9]
digits : {digit}*
underscore : _
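The longest-lexeme rule can be sketched by running every recognizer from the same position and keeping the longest match. This runs the recognizers one after another as a stand-in for true parallel simulation. The names match_keyword, match_variable, RECOGNIZERS and longest_match are hypothetical, and the tie-break (keyword wins when lengths are equal) is an assumption the slides do not state.

```python
KEYWORDS = ("if", "else", "while", "then", "do")

def match_keyword(text, pos):
    """Length of the longest keyword starting at pos, or 0."""
    return max((len(k) for k in KEYWORDS if text.startswith(k, pos)), default=0)

def is_letter(ch):
    return ch.isascii() and ch.isalpha()

def match_variable(text, pos):
    """Length of a letter followed by letters/digits/underscores, or 0."""
    if pos >= len(text) or not is_letter(text[pos]):
        return 0
    end = pos + 1
    while end < len(text) and (text[end] == "_" or is_letter(text[end])
                               or text[end] in "0123456789"):
        end += 1
    return end - pos

# Priority order: on a tie in length, the earlier entry wins (assumption).
RECOGNIZERS = [("keyword", match_keyword), ("id", match_variable)]

def longest_match(text, pos=0):
    """Return (kind, lexeme) for the longest match at pos, or None."""
    best_kind, best_len = None, 0
    for kind, matcher in RECOGNIZERS:
        length = matcher(text, pos)
        if length > best_len:  # strict '>' keeps the earlier recognizer on ties
            best_kind, best_len = kind, length
    if best_kind is None:
        return None
    return best_kind, text[pos:pos + best_len]
```

On "while(" both recognizers match five characters and the keyword wins the tie; on "whilex = 1" the variable recognizer matches six characters, so the lexeme whilex is classified as an id.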