Structure of a YACC File
• Has the same three-part structure as a Lex file
• The parts are separated by lines containing only %%
• The three parts even play the same roles as in Lex:
  • definition section
  • rules section
  • code section (copied directly into the generated program)
Definition Section
• Declare the tokens used in the grammar, and the types of the values kept on the stack, here
• Tokens that are single-quoted characters, like '=' or '+', need not be declared
• Literal C code can be included in this section inside a %{…%} block
Declaring Tokens
• The tokens that are used in the grammar must be declared
• Include lines like the ones below in the definition section:
    %token CHARSTRING INT IDENTIFIER
    %token LPAREN RPAREN
The Rules Section
• The rules of the grammar are placed here
• Here is an example of the basic syntax
• Grammar:
    Expr → INTEGER + INTEGER | INTEGER - INTEGER
• YACC grammar definition:
    expr : INTEGER '+' INTEGER   {action}
         | INTEGER '-' INTEGER   {action}
         ;
YACC Actions
• Similar to Lex, actions can be defined that are performed whenever a production is applied (reduced) during parsing
• An action is usually written right after the production it belongs to
• Every symbol in the grammar has a corresponding value, and actions need to access those values
• They do so through the YACC stack
Accessing the Stack
• YACC generates an LR parser, which pushes the symbols it reads, along with their values, onto a stack until it is ready to reduce
• To access these values inside an action, write a dollar sign followed by a number: $1 is the value of the first symbol on the right-hand side of the production, $2 the second, and so on
Accessing the Stack
    expr : INTEGER '+' INTEGER   {$$ = $1 + $3;}
         | INTEGER '-' INTEGER   {$$ = $1 - $3;}
         ;
• $$ refers to the value of the left nonterminal (expr)
• $2 is the operator token, so the second INTEGER is $3
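The stack mechanics behind $$, $1 and $3 can be sketched in plain C. This is only an illustrative model, not the code YACC generates: the names value_stack, push_value, reduce_add and evaluate_sum are hypothetical, and a real generated parser also keeps a parallel stack of parser states.

```c
#include <assert.h>

/* Hypothetical, simplified model of the parser's value stack. */
#define STACK_MAX 64

static int value_stack[STACK_MAX];
static int top = 0;                  /* number of values currently stacked */

static void push_value(int v) {
    assert(top < STACK_MAX);
    value_stack[top++] = v;
}

/* Models a reduction by
 *     expr : INTEGER '+' INTEGER   {$$ = $1 + $3;}
 * The right-hand side has three symbols, so their values sit in the
 * top three slots: $1 is the deepest of the three, $3 is on top. */
static void reduce_add(void) {
    assert(top >= 3);
    int d1 = value_stack[top - 3];   /* $1: first INTEGER  */
    int d3 = value_stack[top - 1];   /* $3: second INTEGER */
    top -= 3;                        /* pop the whole right-hand side */
    push_value(d1 + d3);             /* push $$ as the value of expr  */
}

/* Shifts the three symbols of "a + b", reduces, and returns the value
 * that is left on the stack for expr. */
int evaluate_sum(int a, int b) {
    top = 0;
    push_value(a);                   /* INTEGER with value a */
    push_value(0);                   /* '+': its value slot is unused here */
    push_value(b);                   /* INTEGER with value b */
    reduce_add();
    return value_stack[top - 1];
}
```

Calling evaluate_sum(2, 3) walks through the same shift, shift, shift, reduce sequence the parser would perform for the input 2 + 3.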
Where do Tokens and Their Values Come From?
• Typically from the lexer
• yyparse (the parser generated by YACC) calls yylex (the lexer generated by Lex) each time it needs the next token
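The pull relationship between the two functions can be sketched as follows. Everything here is a stand-in: toy_yylex and toy_yyparse are hypothetical names, the token codes are made up (real ones come from the generated header), and a real yyparse shifts and reduces instead of counting.

```c
#include <stddef.h>

/* Hypothetical token codes; in a real build they come from y.tab.h. */
enum { INTEGER = 258, IF = 259 };

/* Stand-in for the Lex-generated yylex: hands out a fixed token
 * sequence, then returns 0, which is how a lexer signals end of input. */
static const int canned_tokens[] = { IF, INTEGER, INTEGER };
static size_t next_token = 0;

int toy_yylex(void) {
    size_t count = sizeof canned_tokens / sizeof canned_tokens[0];
    return next_token < count ? canned_tokens[next_token++] : 0;
}

/* Stand-in for yyparse: keeps asking the lexer for the next token
 * until end of input, and reports how many tokens it consumed. */
int toy_yyparse(void) {
    int consumed = 0;
    next_token = 0;                  /* restart the canned input */
    while (toy_yylex() != 0) {
        consumed++;
    }
    return consumed;
}
```

The point of the sketch is the direction of control: the parser drives, and the lexer is called once per token.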
Revisiting Lex
• The Lex file has to be modified in two main places to work with the YACC parser
• In the definition section, inside a %{…%} block, include this statement: #include "y.tab.h"
• That header file is created by YACC when the parser is generated with the -d option; it defines the token names
• The actions for the rules need to be changed too
Revisiting Lex Actions
• For tokens with a value, assign that value to yylval; YACC reads the value from that variable
• Include a return statement for the token name (the same name that is declared at the top of the YACC file)
    if            {return IF;}
    [1-9][0-9]*   {yylval = atoi(yytext); return INTEGER;}
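What those two Lex rules do can be approximated by a small hand-written lexer. This is a sketch under assumptions, not Lex output: set_input and toy_yylex are hypothetical names, the token codes are invented in place of y.tab.h, and strtol is used instead of atoi because it also reports where the number ends.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; normally they come from y.tab.h. */
enum { INTEGER = 258, IF = 259 };

int yylval;                          /* stand-in for YACC's yylval (int by default) */
static const char *input = "";       /* text that remains to be scanned */

void set_input(const char *text) { input = text; }

int toy_yylex(void) {
    while (*input == ' ') input++;   /* skip blanks between tokens */
    if (*input == '\0') return 0;    /* 0 signals end of input */

    /* if   {return IF;} */
    if (strncmp(input, "if", 2) == 0 && !isalnum((unsigned char)input[2])) {
        input += 2;
        return IF;
    }

    /* [1-9][0-9]*   {yylval = atoi(yytext); return INTEGER;} */
    if (*input >= '1' && *input <= '9') {
        char *end;
        yylval = (int)strtol(input, &end, 10);   /* the token's value */
        input = end;
        return INTEGER;                          /* the token's kind  */
    }

    /* Any other character is returned as itself, like '+' in the grammar. */
    return (unsigned char)*input++;
}
```

After set_input("if 42 +"), successive calls return IF, then INTEGER with yylval set to 42, then the character '+', then 0.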
The %union Declaration
• Different tokens have different data types
• INTEGER values are ints, FLOAT values are floats, CHARACTERSTRING values are char *, and IDENTIFIER values are pointers to the symbol table entry for that identifier
• The %union declaration lets the parser apply the right data type to each token
The %union Declaration
YACC Definition Section
    %union {
        int intValue;
        float floatValue;
    }
    %token <intValue> INTEGER
    %token <floatValue> FLOAT
Lex Rules Section
    …   {yylval.intValue = atoi(yytext); return INTEGER;}
    …   {yylval.floatValue = atof(yytext); return FLOAT;}
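On the C side, the %union declaration ends up as an ordinary C union that serves as the type of yylval. The sketch below writes that out by hand to show the effect; lex_number is a hypothetical helper standing in for the two Lex actions, and the token codes are invented in place of the generated header.

```c
#include <stdlib.h>

/* Hypothetical token codes; normally they come from y.tab.h. */
enum { INTEGER = 258, FLOAT = 259 };

/* Roughly what the %union above turns into: the value type of every
 * grammar symbol becomes a union with one member per declared type. */
typedef union {
    int   intValue;
    float floatValue;
} YYSTYPE;

YYSTYPE yylval;                      /* stand-in for the shared yylval */

/* Mimics the two Lex actions: store the value in the member named in
 * the matching %token <...> declaration, then return the token code. */
int lex_number(const char *text, int is_float) {
    if (is_float) {
        yylval.floatValue = (float)atof(text);
        return FLOAT;
    }
    yylval.intValue = atoi(text);
    return INTEGER;
}
```

Because all members share the same storage, only the member matching the returned token holds a meaningful value, which is why each token is tied to one member with %token <member>.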
References That Might Be Useful
• John R. Levine, Tony Mason, Doug Brown, “Lex & Yacc”, 2nd ed., O'Reilly, 1992
• Stephen C. Johnson, “Yacc: Yet Another Compiler-Compiler”, http://www.cs.utexas.edu/users/novak/yaccpaper.htm
• Bert Hubert, “Lex and YACC primer/HOWTO”, http://www.tldp.org/HOWTO/Lex-YACC-HOWTO.html#toc6