1 / 34

Compiler Principle and Technology

Compiler Principle and Technology. Prof. Dongming LU Mar. 26th, 2014. 5. Bottom-Up Parsing. PART THREE. Contents. PART ONE 5.1 Overview of Bottom-Up Parsing 5.2 Finite Automata of LR(0) Items and LR(0) Parsing PART TWO 5.3 SLR(1) Parsing 5.4 General LR(1) and LALR(1) Parsing

tad-simpson
Download Presentation

Compiler Principle and Technology

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Compiler Principle and Technology Prof. Dongming LU Mar. 26th, 2014

  2. 5. Bottom-Up Parsing PART THREE

  3. Contents PART ONE 5.1 Overview of Bottom-Up Parsing 5.2 Finite Automata of LR(0) Items and LR(0) Parsing PART TWO 5.3 SLR(1) Parsing 5.4 General LR(1) and LALR(1) Parsing 5.7 Error Recovery in Bottom-Up Parsers Part THREE 5.5 Yacc: An LALR(1) Parser Generator

  4. 5.5 Yacc: LALR(1) PARSING GENERATOR

  5. A parser generator is a program taking a specification of the syntax of a language and producing a parser procedure for the language • A parser generator is called compiler-compiler • One widely used parser generator incorporating the LALR(1) parsing algorithm is called YACC

  6. 5.5.1 Yacc Basics

  7. Yacc source filetranslate.y Yacc y.tab.c C compiler y.tab.c a.out a.out output input Parser generator--Yacc Lex source File x.l lex.yy.c Lex C compiler lex.yy.c a.out a.out inputStream Tokens 1/36

  8. Yacc takes a specification file • (usually with a .y suffix); • Yacc produces an output file consisting of C source code for the parser • (usually in a file called y.tab.c or ytab.c or, more recently <filename>.tab.c, where <filename>.y is the input file). • A Yacc specification file has the basic format. • {definitions} • %% • {rules} • %% • {auxiliary routines} • There are three sections separated by lines containing double percent signs

  9. The example: • A calculator for simple integer arithmetic expressions with the grammar exp → exp addop term | termx addop → + | - term → term mulop factor | factor mulop → * factor → ( exp ) | number

  10. %{ #include <stdio.h> #include <ctype.h> %} %token NUMBER %% command :exp { printf (“%d\n”,$1);} ; /*allows printing of the result */ exp: exp ‘+’ term {$$ = $1 + $3;} | exp ‘-‘ term {$$ = $1 - $3;} | term {$$ = $1;} ;

  11. term: term ‘*’ factor {$$ = $1* $3;} | factor {$$ = $1;} ; factor :NUMBER{$$ = $1;} | ‘(‘exp’)’ {$$=$2;} ; %% main ( ) { return yyparse( ); }

  12. int yylex(void) { int c; while( ( c = getchar ( ) )== ‘’ ); /*eliminates blanks */ if ( isdigit(c) ) { unget (c,stidin) ; scanf (“%d”,&yylval ) ; return (NUMBER ) ; } if (c== ‘\n’) return 0; /* makes the parse stop */ return ( c ) ; }

  13. int yyerror (char * s) { fprintf (stderr, “%s\n”,s ) ; return 0; }/* allows for printing of an error message */

  14. The definitions section (can be empty) • Information about the tokens, data types, grammar rules; • Any C code that must go directly into the output file at its beginning • The rules section • Grammar rules in a modified BNF form; • Actions in C code executed whenever the associated grammar rule is recognized • (i.e.. used in a reduction. according to the LALR(1) parsing algorithm) • The metasymbol conventions: • The vertical bar is used for alternatives; • The arrow symbol →is replaced in Yacc by a colon; • A semicolon must end each grammar rule.

  15. The auxiliary routines section ( can be empty): • Procedure and function declarations • A minimal Yacc specification file • consist only of %% followed by grammar rules and actions.

  16. Notes: • Two typical #include directives are set off from other Yacc declarations in the definitions section by the surrounding delimiters %{ and %}. • Yacc has two ways of recognizing token. • Single-character inside single quotes as itself • Symbolic tokens as a numeric value that does not conflict with any character value

  17. Notes: • %start command in the definition section will indicate that command be start symbol, otherwise the rule listed first in the rule section indicates the start symbol. • The variable yylval is defined internally by Yacc; • yyparse is the name of the parsing procedure; • yylex is the name of Lex scanner generator.

  18. 5.5.2 Yacc Options

  19. (1) –d option • It will produce the header file usually named y.tab.h or ytab.h. • Yacc needs access to many auxiliary procedures in addition to yylax and yyerror; They are often placed into external files rather than directly in the Yacc specification file. • Making Yacc-specific definitions available to other files. • For example, command in the file calc.y • yacc –d calc.y

  20. (1) –d option Produce (in addition to the y.tab.c file) the file y.tab.h (or similar name), whose contents vary, but typically include such things as the following: #ifndef YYSTYPE #define YYSTYPE int #endif #define NUMBER 258 extern YYSTYPE yylval;

  21. (2) –v option • Produces yet another file, with the name y. output (or similar name). • This file contains a textual description of the LALR( l )parsing table that is used by the parser. • As an example: consider the skeletal version of the Yacc specification above. • yacc -v calc .y

  22. (2) –v option %taken NUMBER %% command :exp ; exp: exp ‘+’ term | exp ‘-‘ term | term ; term: term ‘*’ factor | factor ; factor :NUMBER | ‘(‘exp’)’ ;

  23. 5.5.3 Parsing Conflicts and Disambiguating Rules

  24. Yaccand ambiguous rules Default rules for Disambiguating: • reduce-reduce conflict: preferring the reduction by the grammar rule listed first in the specification file • shift-reduce conflict: preferring the shift over the reduce

  25. Error recovery in Yacc The designer of a compiler will decide: • Which non-terminals may cause an error • Add error production A  error  for the non-terminals • Add semantic actions for these error productions Yaccregards the error productions as normal grammar productions

  26. s : C 1·A2 A· b A ·error . . . A C1A·2 . . . stack . . . A b . . . . . . . . a . . s Ab· . . . . . . find an error Error recovery procedure • Pop the state from stack,until we find A ·error  • Push ‘error’ into stack • If  is ,do the reduction,and scan the input symbols until we find the correct symbol. • If  isnot ,find  ,push  into stack and use A ·error  todo the reduction.

  27. lines : lines expr ‘\n’ {printf ( “%g \n”, $2 ) } | lines ‘\n’ | // | error ‘\n’ {printf ( “重新输入上一行”); yyerrok;} ;

  28. End of Part Three THANKS

More Related