Lecture #11, Feb. 19, 2007



  1. Lecture #11, Feb. 19, 2007 • ml-Yacc • Actions when reducing • Making ml-yacc work with ml-lex • Boiler plate

  2. Assignments • Reading • Chapter 4, Sections • 4.1 Context Sensitive Analysis • 4.2 Intro to Type Systems • Pages 151-170 • Quiz on Wednesday? • Homework #9 is due Wednesday. • Project 2 is assigned today. It is posted on the web site.

  3. Sml-yacc parser generator • Sml-yacc specifications contain 3 parts separated by %%

        <user declarations>
        %%
        <declarations about the grammar>
        %%
        <grammar rules>

  4. Declarations about the grammar • All begin with a single % followed by a keyword. • Some declarations are required! • You MUST name the specification. • %name XXX • You MUST describe the nonterminals and terminals of the grammar. • %term .... • %nonterm ... • The description of terminals and nonterminals requires that you give the type of any attribute they may have. • You MUST have a %pos declaration. This declares the type of "positions" (more about this later).

  5. %term and %nonterm • These things look like algebraic datatype declarations. • We will build an example parser for Regular Expressions.

        (* user declarations *)
        datatype Re
          = empty of int
          | simple of string * int
          | concat of Re * Re
          | closure of Re
          | union of Re * Re;

        val count = ref 0;
        fun next() = (count := (!count)+1; !count);
        %%
        (* declarations about the grammar *)
        %name XXX
        %term EOF | STAR | BAR | LP | RP | HASH | SINGLE of string
        %nonterm exp of Re
        %pos int
        %%
        (* grammar rules *)
        . . .

  6. Description of Example • Symbols are represented by EOF, BAR, ... • None of them has an attribute except SINGLE, whose string attribute represents the single character that we want to recognize. • There is only one non-terminal, exp, and it has one attribute, which is of type Re. • Note that the Re type is defined in the user declarations section. • The %pos declaration says that a position is an integer. This is for error reporting.
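The datatype analogy can be made concrete with a minimal sketch in plain SML. The `terminal` datatype below is only an illustration of how the %term line reads; it is an assumption for this example, not the representation that sml-yacc actually generates. The `tree` value is built by hand to show what an attribute of type Re looks like.

```sml
(* Illustration only: the %term declaration read as an ordinary datatype.
   The name "terminal" is hypothetical; sml-yacc generates its own tokens. *)
datatype terminal = EOF | STAR | BAR | LP | RP | HASH | SINGLE of string

(* The Re type and counter from the user declarations section. *)
datatype Re
  = empty of int
  | simple of string * int
  | concat of Re * Re
  | closure of Re
  | union of Re * Re

val count = ref 0
fun next () = (count := !count + 1; !count)

(* A token sequence a lexer might hand to the parser for the text a|b* ... *)
val tokens = [SINGLE "a", BAR, SINGLE "b", STAR, EOF]

(* ... and a hand-built Re value for the same expression.  Tuple fields are
   evaluated left to right, so "a" is numbered before "b". *)
val tree = union (simple ("a", next ()), closure (simple ("b", next ())))
```

Only SINGLE carries a value, just as in the %term declaration; every other terminal is a bare constructor.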

  7. Recall how a Bottom up Parse Works

        E ::= E + T   (1)
            | T       (2)
        T ::= T * F   (3)
            | F       (4)
        F ::= ( E )   (5)
            | id      (6)

        Stack      Input    Action
                   x + y    shift
        x          + y      reduce 6
        F          + y      reduce 4
        T          + y      reduce 2
        E          + y      shift
        E +        y        shift
        E + y               reduce 6
        E + F               reduce 4
        E + T               reduce 1
        E                   accept
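The shift/reduce trace for x + y can be reproduced with a small hand-written sketch in SML. This is not how sml-yacc works internally (it uses generated LR tables); the `step` function below simply hard-codes the shift/reduce decisions for this one grammar, and the names `sym`, `step`, and `run` are made up for the illustration.

```sml
(* Grammar symbols: terminals and the nonterminals E, T, F. *)
datatype sym = ID of string | PLUS | TIMES | LP | RP | E | T | F

(* One parser step.  The stack is a list with its top at the head.
   Returns the action taken, the new stack, and the remaining input. *)
fun step (ID _ :: rest, inp)            = ("reduce 6", F :: rest, inp)
  | step (RP :: E :: LP :: rest, inp)   = ("reduce 5", F :: rest, inp)
  | step (F :: TIMES :: T :: rest, inp) = ("reduce 3", T :: rest, inp)
  | step (F :: rest, inp)               = ("reduce 4", T :: rest, inp)
  | step (T :: rest, TIMES :: inp)      = ("shift", TIMES :: T :: rest, inp)
  | step (T :: PLUS :: E :: rest, inp)  = ("reduce 1", E :: rest, inp)
  | step (T :: rest, inp)               = ("reduce 2", E :: rest, inp)
  | step (stack, tok :: inp)            = ("shift", tok :: stack, inp)
  | step (stack, [])                    = ("error", stack, [])

(* Run until the stack holds just E and the input is empty. *)
fun run ([E], []) = ["accept"]
  | run (stack, inp) =
      let
        val (action, stack', inp') = step (stack, inp)
      in
        if action = "error" then [action] else action :: run (stack', inp')
      end

(* The actions taken for the input x + y, in order. *)
val trace = run ([], [ID "x", PLUS, ID "y"])
```

The list bound to `trace` contains the same actions as the Action column of the table, in the same order.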

  8. Grammar Rules Section • Grammar rules describe the grammar that is to be recognized. • They also tell what to do whenever a "reduce" action is encountered. • A grammar rule has the form:

        <non-terminal> : <rhs> ( action )
                       <optional more rules for that non-terminal>

     • For example:

        exp: SINGLE ( simple(SINGLE,next()) )
           | HASH   ( empty(next()) )

     • The "action" is a value that is associated with the lhs of the production when it is pushed on the stack. • It can "depend" upon the values of the symbols in the rhs (which are already on the stack).
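To see how an action's value replaces the rhs values on the stack, here is a rough model in plain SML. It is a simplification under stated assumptions: the `entry` type and the `reduce...` functions are invented for this sketch, and the real sml-yacc stack also holds parser states and positions.

```sml
datatype Re
  = empty of int
  | simple of string * int
  | concat of Re * Re
  | closure of Re
  | union of Re * Re

val count = ref 0
fun next () = (count := !count + 1; !count)

(* What one stack slot can hold in this simplified model. *)
datatype entry
  = Attr of string   (* the string attribute of a SINGLE token *)
  | Void             (* a terminal with no attribute, e.g. STAR or BAR *)
  | Exp of Re        (* the value of the nonterminal exp *)

(* exp: SINGLE ( simple(SINGLE,next()) ) *)
fun reduceSingle (Attr s :: rest) = Exp (simple (s, next ())) :: rest
  | reduceSingle _ = raise Fail "rhs of exp: SINGLE not on the stack"

(* exp: exp STAR ( closure exp ) *)
fun reduceStar (Void :: Exp e :: rest) = Exp (closure e) :: rest
  | reduceStar _ = raise Fail "rhs of exp: exp STAR not on the stack"

(* exp: exp BAR exp ( union(exp1,exp2) ); exp2 is nearest the top. *)
fun reduceUnion (Exp e2 :: Void :: Exp e1 :: rest) = Exp (union (e1, e2)) :: rest
  | reduceUnion _ = raise Fail "rhs of exp: exp BAR exp not on the stack"

(* One possible sequence of shifts (::) and reduces for the text a|b* *)
val s1 = reduceSingle (Attr "a" :: [])           (* shift a, reduce   *)
val s2 = reduceSingle (Attr "b" :: Void :: s1)   (* shift |, b, reduce *)
val s3 = reduceStar (Void :: s2)                 (* shift *, reduce   *)
val s4 = reduceUnion s3                          (* reduce the union  *)
```

Each reduce pops exactly the entries for its rhs and pushes one Exp entry whose value is computed by the rule's action.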

  9. Example Showing Grammar Rules

        (* user declarations (Re) omitted here *)
        %%
        (* declarations about the grammar *)
        %name XXX
        %term EOF | STAR | BAR | LP | RP | HASH | SINGLE of string
        %nonterm exp of Re
        %pos int
        %%
        exp: SINGLE       ( simple(SINGLE,next()) )
           | HASH         ( empty(next()) )
           | LP exp RP    ( exp )
           | exp STAR     ( closure exp )
           | exp exp      ( concat(exp1,exp2) )
           | exp BAR exp  ( union(exp1,exp2) )

  10. Complete Example

        datatype Re
          = empty of int
          | simple of string * int
          | concat of Re * Re
          | closure of Re
          | union of Re * Re;

        (* counter used by the actions, as defined on slide 5 *)
        val count = ref 0;
        fun next() = (count := (!count)+1; !count);
        %%
        %name XXX
        %term EOF | STAR | DUMMY | BAR | LP | RP | HASH | SINGLE of string
        %nonterm go of Re | exp of Re
        %pos int
        %start go
        %eop EOF
        %verbose
        %left LP SINGLE HASH
        %left BAR
        %left DUMMY
        %right STAR

  11. Complete Example continued

        %%
        go: exp EOF                ( exp )

        exp: SINGLE                ( simple(SINGLE,next()) )
           | HASH                  ( empty(next()) )
           | LP exp RP             ( exp )
           | exp STAR              ( closure exp )
           | exp exp %prec DUMMY   ( concat(exp1,exp2) )
           | exp BAR exp           ( union(exp1,exp2) )

  12. Boiler Plate • To get this all to work we need a lexical analyzer that can produce terminal symbols with the correct attributes for the %term directive. • We can use sml-lex to do this, but instead of defining our own token type we will use the one which is automatically defined by the %term declaration in sml-yacc. • In order to do this we need the following BOILER-PLATE in the user declarations part of the sml-lex source file.

     Boiler plate in Sml-Lex source file

        type pos = int
        type svalue = Tokens.svalue
        type ('a,'b) token = ('a,'b) Tokens.token
        type lexresult = (svalue,pos) token
        open Tokens
        val lineno = ref 0
        val reset_lineno = fn () => lineno := 1
        val eof = fn () => EOF(!lineno,!lineno)
        fun error (e,l : int,_) = . . .
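The shape of the tokens the lexer must return can be sketched in plain SML without running either tool. Everything here is a stand-in: `MockTokens` is a hypothetical, simplified imitation of the Tokens structure that sml-yacc generates, `tokenize` is a hand-written substitute for the sml-lex rules, and the characters chosen for STAR, BAR, LP, RP, and HASH are assumptions based on the token names. The point is only that each token function takes its attribute (if any) followed by two positions, as in EOF(!lineno,!lineno).

```sml
(* Hypothetical stand-in for the generated Tokens structure. *)
structure MockTokens =
struct
  datatype svalue = VOID | STRING of string
  (* token name, then (attribute, left position, right position) *)
  datatype ('a, 'b) token = TOKEN of string * ('a * 'b * 'b)

  fun EOF  (l, r) = TOKEN ("EOF",  (VOID, l, r))
  fun STAR (l, r) = TOKEN ("STAR", (VOID, l, r))
  fun BAR  (l, r) = TOKEN ("BAR",  (VOID, l, r))
  fun LP   (l, r) = TOKEN ("LP",   (VOID, l, r))
  fun RP   (l, r) = TOKEN ("RP",   (VOID, l, r))
  fun HASH (l, r) = TOKEN ("HASH", (VOID, l, r))
  (* SINGLE is declared "of string", so it takes its attribute first. *)
  fun SINGLE (s, l, r) = TOKEN ("SINGLE", (STRING s, l, r))
end

(* The same type abbreviations as the boiler plate, against the mock. *)
type pos = int
type svalue = MockTokens.svalue
type lexresult = (svalue, pos) MockTokens.token

(* Hand-written substitute for the lexer rules (character choices assumed).
   Every token is given line number 1 as both of its positions. *)
fun tokenize (s : string) : lexresult list =
  let
    val line = 1
    fun tok #"*" = MockTokens.STAR (line, line)
      | tok #"|" = MockTokens.BAR (line, line)
      | tok #"(" = MockTokens.LP (line, line)
      | tok #")" = MockTokens.RP (line, line)
      | tok #"#" = MockTokens.HASH (line, line)
      | tok c    = MockTokens.SINGLE (String.str c, line, line)
  in
    map tok (String.explode s) @ [MockTokens.EOF (line, line)]
  end

val example = tokenize "a*"
```

In the real setup the lexer's pos type and the grammar's %pos type must agree, which is why both are int here.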

  13. More Boiler Plate • We must also place the following as the FIRST line in the ML-lex definitions section.

        %header (functor XXXLexFun (structure Tokens: XXX_TOKENS));

     • It is very important that the "type pos = int" be the same type as the %pos declaration in the sml-yacc source file, and that the "XXX" in the %header declaration in the sml-lex source file BE THE SAME as the %name declaration in the sml-yacc source file.

     yacc file

        %%
        %pos int
        %name XXX
        %%

     lex file

        type pos = int
        %%
        %header (functor XXXLexFun (structure Tokens: XXX_TOKENS));

  14. Tying it all together • The file "XXX.cm" ties all the pieces together. • This file has many occurrences of the string XXX; they must all be changed to the same string as in the %name directive of the sml-yacc source file. • To build a parser, start up sml and then use the compile-manager (CM.make).

  15. New Boiler plate for Parser

     CommonTypes.sml

        structure CommonTypes =
        struct
          (* Put type declarations here that you *)
          (* want to appear in both the parser *)
          (* and lexer. You can open this structure *)
          (* elsewhere inside your application as well *)
        end;

     XXX.cm

        group is
          CommonTypes.sml
          XXX.lex
          XXX.grm
          driver.sml
          (* Other user defined sml files go here *)
          $/basis.cm     (* system library files *)
          $/smlnj-lib.cm
          $/ml-yacc-lib.cm

  16. The Driver file

     Driver.sml

        (* ************** Driver file **************** *)
        structure Driver =
        struct
          (* ******* Tie all the libraries together ******** *)
          structure regexpLrVals =
            regexpLrValsFun(structure Token = LrParser.Token);
          structure regexpLex =
            regexpLexFun(structure Tokens = regexpLrVals.Tokens);
          structure regexpParser =
            Join(structure ParserData = regexpLrVals.ParserData
                 structure Lex = regexpLex
                 structure LrParser = LrParser);

          (* ******** Build a lexer and Parser *************** *)
          val verboselex = ref false;
          fun parse s fromfile = . . .
        end (* struct Driver *)

  17. The .lex files

     XXX.lex

        open CommonTypes;
        type pos = int
        type svalue = Tokens.svalue
        (* the type token is from the %term in XXX.grm *)
        type ('a,'b) token = ('a,'b) Tokens.token
        type lexresult = (svalue,pos) token
        (* Defines constructor functions for "token" *)
        open Tokens
        val lineno = ref 0
        val reset_lineno = fn () => lineno := 1
        . . .
        (* YOUR USER DECLARATIONS (if any) GO HERE *)
        %%
        %header (functor XXXLexFun(structure Tokens:XXX_TOKENS));
        (* YOUR Lex-Definitions (if any) GO HERE *)
        %%
        (* YOUR RULES GO HERE *)

  18. The .grm file

     XXX.grm

        open CommonTypes;
        (* YOUR USER DECLARATIONS (if any) GO HERE *)
        %%
        (* declarations about the grammar *)
        %name XXX
        %term EOF | ...
        %nonterm go of ? | ...
        %pos int
        %start go
        %eop EOF
        %verbose
        (* YOUR GRAMMAR DECLARATIONS LIKE %left ETC. (if any) GO HERE *)
        %%
        go: ... EOF ( ... )
        (* YOUR ADDITIONAL GRAMMAR RULES GO HERE *)

  19. Putting it all together • Start sml in the directory where all the files are. • Then type: CM.make "XXX.cm"; • Then open the Driver structure: open Driver; • This imports the function • parse : string -> bool -> answer_type

  20. Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]

        - CM.make "regexp.cm";
        [scanning regexp.cm]
        [D:\programs\SML110.57\bin\ml-lex regexp.lex]
        Number of states = 12
        Number of distinct rows = 2
        Approx. memory size of trans. table = 258 bytes
        [parsing (regexp.cm):regexp.lex.sml]
        [library $/ml-yacc-lib.cm is stable]
        [library $SMLNJ-ML-YACC-LIB/ml-yacc-lib.cm is stable]
        [loading (regexp.cm):regexp.grm.sig]
        [loading (regexp.cm):CommonTypes.sml]
        [loading (regexp.cm):regexp.grm.sml]
        [compiling (regexp.cm):regexp.lex.sml]
        [code: 9617, data: 705, env: 1871 bytes]
        [loading (regexp.cm):driver.sml]
        [New bindings added.]
        val it = true : bool
        - open Driver;
        opening Driver
          val parse : string -> bool -> Driver.regexpParser.result
          val verboselex : bool ref
        end
        -

  21. Boiler Plate Files • The final BOILER PLATE files, which you can fill in by replacing XXX with the name of your parser and filling in the ...'s with some code or rules, can be found in the directory: http://www.cs.pdx.edu/~sheard/course/Cs321/LexYacc/boilerplate/ SML-version/boilerplate • You will find the 5 files • "XXX.lex" • "XXX.grm" • "XXX.cm" • CommonTypes.sml • Driver.sml • The outline of these files is included here for your convenience. • The complete example is in the directory http://www.cs.pdx.edu/~sheard/course/Cs321/LexYacc/regexpParser/
