1 / 20

Lex & Yacc

Lex & Yacc. Compilation Process. Lex. Yacc. Lexical Analyzer. Syntax Analyzer. Intermed. Code Gen. Code Generator. Machine Code. Source Code. Symbol Table. Lexical Analyzer (Scanner, Lexer). A pattern matcher Extracts lexeme in a given string Produces corresponding token

booker
Download Presentation

Lex & Yacc

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lex & Yacc

  2. Compilation Process Lex Yacc Lexical Analyzer Syntax Analyzer Intermed. Code Gen. Code Generator Machine Code Source Code Symbol Table

  3. Lexical Analyzer (Scanner, Lexer) • A pattern matcher • Extracts lexeme in a given string • Produces corresponding token • Detects errors in token • Front-end of a syntax analyzer • Serves as a finite state automata

  4. Syntax Analyzer (Parser) • Checks the syntactic correctness of input string • Produces a complete parse tree or trace structure of it • In case of error displays message and tries recover to detect as many errors as it can

  5. LEX • A tool for automatically generating a lexer or scanner given a lex specification (.l file) • A lexer or scanner is used to perform lexical analysis, or the breaking up of an input stream into meaningful units, or tokens. • Takes a set of descriptions of possible tokens (i.p. Regular expressions)

  6. lex.yy.c is generated after running > lex x.l x.l %{ < C global variables, prototypes, comments > %} [DEFINITION SECTION] %% [RULES SECTION] %% C auxiliary subroutines This part will be embedded into lex.yy.c substitutions, code and start states; will be copied into lex.yy.c define how to scan and what action to take for each token any user code. For example, a main function to call the scanning function yylex(). Skeleton of a lex specification (.l file)

  7. The rules section %% [RULES SECTION] <pattern> { <action to take when matched> } <pattern> { <action to take when matched> } … %% Patterns are specified by regular expressions. For example: %% [A-Za-z]* { printf(“this is a word”); } %%

  8. The rule of lex specification file Rule section is list of rules <pattern> { corresponding actions } <pattern> { corresponding actions } … … … [1-9][0-9]* { yylval = atoi (yytext); return NUMBER; } Actions are C statements Pattern in regular expr form

  9. Lex Reg Exp (cont) x|yx or y {i} definition of i x/yx, only if followed by y (y not removed from input) x{m,n} m to n occurrences of x xx, but only at beginning of line x$ x, but only at end of line "s" exactly what is in the quotes (except for "\" and following character) A regular expression finishes with a space, tab or newline

  10. Meta-characters • meta-characters (do not match themselves, because they are used in the preceding reg exps): • ( ) [ ] { } < > + / , ^ * | . \ " $ ? - % • to match a meta-character, prefix with "\" • to match a backslash, tab or newline, use \\, \t, or \n

  11. Two Rules • lex will always match the longest (number of characters) token possible. • 2. If two or more possible tokens are of the same length, then the token with the regular expression that is defined first in the lex specification is favored.

  12. LEX Rules Disambiguation • “ali” “aliye” “[a-zA-Z]+” • Rules defined before have precedence over rules defined after • An input is matched with at most one pattern • Action for the longest possible match among the patterns executed

  13. LEX: A Simple Example %{ /* firstlexer.l : Our First Lexer */ #include <stdio.h> }% %% [\t ]+ ; /* Ignore whitespace */ like|need|love|care {printf(“%s: verb\n”, yytext); } [a-zA-Z]+ { printf(“%s: yet not defined\n”, yytex); } .|\n { ECHO; } %% main() { yylex(); }

  14. LEX Compilation >lex firstlexer.l #Generates lex.yy.c >cc lex.yy.c -lfl

  15. LEX Built-in Functions & Variables • yymore() • append next string matched to current contents of yytext • yyless(n) • remove from yytext all but the first n characters • unput(c) • return character c to input stream • yywrap() • may be replaced by user • The yywrap method is called by the lexical analyser whenever it inputs an EOF as the first character when trying to match a regular expression

  16. LEX Built-in Functions & Variables • yytext • where text matched most recently is stored • yyleng • number of characters in text most recently matched • yylval • associated value of current token • yyin - points current file parsed by the lexer • yyout - points file that output of the lexer will be written

More Related