Overview of summer student work at CERN PH-ED, focusing on the structure of the Confluence compiler. Includes example code and a description of the file organisation, the parser, and the abstract syntax trees.
CONFLUENCE Compiler Structure CERN PH-ED Summer Student Work
Overview
.CF -> .FNF -> .VHDL, .C
Example

Confluence source:

    ccounter scounter is
    component ccounter +Width -output with constone next is
      constone <- {const Width 1 $}
      next <- (output '+' constone)
      output <- {reg Width next $}
    end
    scounter <- {ccounter _ _}
    scounter.Width <- 10
    scounter.output <- {output "la sortie" $}

Generated C interface:

    struct simulator_s {
      struct {
        struct {
          unsigned long * sortie; // output sortie : 10 bits, 1 words
          unsigned long * clock;  // input clock : 1 bits, 1 words
        } top;
      } signals;
      unsigned long memory[11];
    };
    typedef struct simulator_s *simulator_t;
    // Simulator Initialization
    void init_simulator(simulator_t);
    // Simulator Cycle Calculation
    void calc_simulator(simulator_t);
Example

    ccounter scounter is
    component ccounter +Width -output with constone next is
      constone <- {const Width 1 $}
      next <- (output '+' constone)
      output <- {reg Width next $}
    end
    scounter <- {ccounter _ _}
    scounter.Width <- 10
    scounter.output <- {output "la sortie" $}

Annotations, from top to bottom:
• Global variables declaration
• Component declaration: ports are declared without types
• Component description
• Component instantiation = evaluation of the component + hardware generation
File organisation of the Confluence source code
• Confluence 0.10.5/src/
    • Cfeval: contains the core of the compiler
        • Parser.mly is the parser description, written for ocamlyacc
        • Lexer.mll is the lexer description, written for ocamllex (see the OCaml documentation)
        • Cf.ml is the main function (the one you call when you want to compile your Confluence code)
        • CfAst.ml contains the data structures used to build the abstract syntax tree
        • CfParserUtil.ml contains tools to translate a Confluence program into an AST
        • CfCompiler contains the recursive compilation functions
        • CfTypes contains the utility functions and the data types for the compiler
        • Cf_fnf contains the functions that write the .fnf file, with their specific data structures
    • Fnflib: contains the program converting fnf into VHDL etc.
    • Misc: contains several tools used in the compiler source code
• A few libraries from OCaml are used (see the manual): Hashtbl, List, Array …
The Cf function

Command line: cf [options] [file] [arguments]
Example: $ cf test_counter.cf

In the cf.ml file we find:
• Function parse_cmd_args: checks the options, finds the file name, and handles the output file settings. It returns (file name, compile_only, output file).
• Function main (): takes the result of parse_cmd_args and acts in 3 stages:
    1. Parse the text of the program to build the syntax tree.
    2. Compile: translate the tree into instructions to be executed by the computer itself.
    3. Execute the tasks and write the output file.

    let parse_cmd_args ...
    let main () = ...
      ast = CfParserUtil.parse_program program CfLexer.token CfParser.file
      ...
      task = CfCompiler.compileApplication ast
      ...
      CfTypes.readyTask task;
      ...
      CfTypes.executeTasks ();
      ...
Parser and Abstract Syntax Trees
• The Abstract Syntax Tree (AST) is the data structure into which the text of your program is translated, so that the compiler can work on it.
• The structure of the AST comes from the parser itself, which recognizes the tokens (keywords) of the language. The tree data structure is defined in CfAst.
• Below is an example of what the parser looks like.
• Tokens (keywords of the language) are in upper case; references to other grammar rules are in lower case.
• Tokens are linked with keywords in the lexer.
• The code between braces gives the instructions for building the AST.

    statements :                       { [] }
               | statements statement  { $1 @ [$2] }
               ;
    statement : name_space       { CfAst.ApplyStmt $1 }
              | ifelse           { $1 }
              | component_named  { $1 }
              | application      { CfAst.ApplyStmt $1 }
              | connect          { CfAst.ConnectStmt $1 }
              ;
    name_space : LOCAL locals IS statements END { CfParserUtil.app... }
               ;
Parsing what the file contains

    ast = CfParserUtil.parse_program program CfLexer.token CfParser.file

Parse_Program (parser, lexer, file) does the following:
• Creates a FIFO of files to parse and adds the program to it (this list will contain all the library locations in addition to your main program).
• The parser and lexer are generated from their descriptions by ocamlyacc and ocamllex during installation (at the beginning of the installation, you can see ocamlyacc …).
• The recursive Parse_all_files function takes a Parse_channel function as argument.
    • It keeps track of information about position, parent file, etc.
    • It takes a file from the To_Parse list and checks whether the file is already bound in the hash table ASTs. If not, it parses the file with Parse_channel and binds the result to the file in ASTs.
    • It links the file with its parent list in the hash table subs.
    • It stores the parsed files in the Files list.
• Builds and returns the main AST.
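The traversal described on this slide can be sketched as a worklist loop. This is only an illustration under stated assumptions, not the code of CfParserUtil: the toy ast record, the in-memory sources table and the functions parse_channel and parse_all_files are hypothetical stand-ins, and no file is actually read from disk.

```ocaml
(* Hypothetical sketch: a toy AST that only records the files it refers to. *)
type ast = { file : string; deps : string list }

(* Simulated file system (assumption): file name -> files it references. *)
let sources =
  [ ("main.cf", [ "base.cf"; "lib.cf" ]);
    ("lib.cf", [ "base.cf" ]);
    ("base.cf", []) ]

(* Stand-in for the real parser: looks the file up in [sources]. *)
let parse_channel file = { file; deps = List.assoc file sources }

let parse_all_files parse main =
  let to_parse = Queue.create () in   (* FIFO of files still to parse *)
  let asts = Hashtbl.create 16 in     (* file -> parsed AST *)
  let subs = Hashtbl.create 16 in     (* file -> first parent that referenced it *)
  let files = ref [] in               (* parsed files, most recent first *)
  Queue.add main to_parse;
  while not (Queue.is_empty to_parse) do
    let file = Queue.pop to_parse in
    (* Parse each file once, even if several files refer to it. *)
    if not (Hashtbl.mem asts file) then begin
      let ast = parse file in
      Hashtbl.add asts file ast;
      files := file :: !files;
      List.iter
        (fun dep ->
          if not (Hashtbl.mem subs dep) then Hashtbl.add subs dep file;
          Queue.add dep to_parse)
        ast.deps
    end
  done;
  (List.rev !files, asts, subs)

let parsed_files, _, _ = parse_all_files parse_channel "main.cf"
```

In this sketch base.cf is queued twice but parsed once, because the hash table is checked before parsing.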
Building the main AST
• Calls the recursive function Insert_sub_asts, which is a mutually recursive function in 2 parts:
    • Insert_sub_asts itself: takes an ast_apply (of type expr, constructor Apply) and a file as arguments. It calls add_sub_env on (ast_apply, buildsubast file, "_file_" ^ file).
        • add_sub_env (apply_parent, apply_sub, sub_name) takes the component part of the apply and adds to the statements of the component a new Connect statement named sub_name, which connects to the apply_sub expression.
    • Build_sub_asts: builds the ASTs for subfiles and libraries.
        • This function actually builds the AST: it inserts into ast_apply the AST corresponding to the head of the files list, and continues recursively…
• The tools for building the AST are provided in CfAst and CfParserUtil.

    type expr =
        Apply   of Loc.loc * string * expr * expr list
      | Connect of Loc.loc * expr * expr
      | Name    of Loc.loc * string
      | Comp    of Loc.loc * string * string list * stmt list
      | DotName of Loc.loc * expr * string
      ...
      | Vector  of Loc.loc * string
      | Record  of Loc.loc * (string * expr) list
Examples of ASTs

Source:

    component f +a -b is … b <- {output 6 $}

AST:

    ConnectStmt
      Connect
        Name (f)
        Comp (f, [a;b], stmts)
          Connect
            Name (b)
            DotPosition
              Apply ("", Name output, [Integer 6, Free])
              2
Examples of ASTs

Source:

    c = a + b

AST:

    Connect
      Name (c)
      DotPosition
        Apply ("", Name (+), [a;b;Free])
        3

Name (+) refers to the component (+), which has been defined in Base.cf.

Infix operators in Confluence: they are defined as Confluence components in Base.cf, from the primitives. The parser contains a list of all the infix operators. When one is encountered during parsing, it is translated into exactly the same AST as if the component had been invoked in the standard way. This translation is done by application_of_infix in CfParserUtil (have a look at Base.cf).
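The translation of an infix expression can be sketched as follows. The expr type here is a hypothetical, simplified version of the one in CfAst (locations are dropped), and the signature given to application_of_infix is an assumption made for the illustration, not the one in CfParserUtil.

```ocaml
(* Hypothetical, simplified AST: the real CfAst.expr also carries locations. *)
type expr =
  | Name of string
  | Integer of int
  | Free
  | Apply of string * expr * expr list
  | DotPosition of expr * int

(* Rewrite [a op b] as the application {(op) a b $}, then select the
   position of the free result port (positions are 1-based, so 3 here). *)
let application_of_infix op a b =
  let args = [ a; b; Free ] in
  DotPosition (Apply ("", Name op, args), List.length args)

let sum = application_of_infix "+" (Name "a") (Name "b")
```

The value sum has the same shape as the tree shown on the slide: an Apply of the component named + to a, b and a free port, wrapped in a DotPosition that picks port 3.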
Operators definition: the Base.cf library
• Base.cf is set as the default environment during installation. It is the default library of commands. When a basic operator is called in a Confluence program, it refers to one of the Base.cf components.
• It is possible to use other libraries: begin a Confluence program with the keyword ENVIRONMENT "name.cf", which adds name.cf to the FIFO of files to parse. It is a way to include other programs in Confluence.
• Base.cf contains all the functions corresponding to infix or prefix operators, defined using the primitives. The primitives themselves are defined in CfPrims, and they are implemented in OCaml.

Layers, from source code level to user level: Primitives -> Base.cf -> component and variable names.
Infix Operators Management
• In Confluence there is a static, predefined list of infix operators (line 317, CfParser).
• The infix operator syntax is expr INFIXOP expr.
• From an expression like a + b the parser uses CfParserUtil.application_of_infix to generate the same thing as {(+) a b $}.
• So we have all the tools needed to define our own infix operators.
• It is not possible to write anything like expr IDENTIFIER expr, because it would create ambiguities in the parsing (look at the expression definition).

In OCaml, an infix operator is defined by writing:

    let (#$) a b = a + b;;

Then #$ can be used as an infix operator in expressions like a #$ b. The name of an infix operator cannot be a standard name made of lowercase characters, because that could generate ambiguities and conflicts: one has to use symbol identifiers.
OCaml parser for infix operators

    val_ident:
        LIDENT                   { $1 }
      | LPAREN operator RPAREN   { $2 }
    ;
    operator:
        PREFIXOP     { $1 }
      | INFIXOP0     { $1 }
      | INFIXOP1     { $1 }
      | INFIXOP2     { $1 }
      | INFIXOP3     { $1 }
      | INFIXOP4     { $1 }
      | PLUS         { "+" }
      | MINUS        { "-" }
      | MINUSDOT     { "-." }
      | STAR         { "*" }
      | EQUAL        { "=" }
      | LESS         { "<" }
      | GREATER      { ">" }
      | OR           { "or" }
      | BARBAR       { "||" }
      | AMPERSAND    { "&" }
      | AMPERAMPER   { "&&" }
      | COLONEQUAL   { ":=" }
    ;

Here we see two totally different ways to name functions. They are disjoint, because no operator occurrence can refer to a standard identifier. In principle one could do the same in Confluence, because the Confluence identifier definition includes all the lowercase character strings but excludes all the symbol strings, except those corresponding to Base.cf components.
Compilation

    task = CfCompiler.compileApplication ast

    let compileApplication astApply =
      try
        let expr = compExpr EnvRoot astApply in ...

compileApplication:
• Defines the root environment.
• Defines expr as compExpr (EnvRoot, AstApply).
• Defines the run () function: expr (renv, CfTypesNewFree), which just initialises the recursion, and returns the pair (renv, run ()).

compExpr is the recursive function that traverses the AST. Each node and each leaf is converted into tasks to be executed. Building a task consists of 3 steps:
• call for the tasks of the child trees (recursive traversal);
• create a new task for the current node from the tasks of the child nodes;
• express this task as a taskedExpr, which means: make this task ready to be added to the list.

When this point of the compiled tree is eventually evaluated, the function only puts the prepared task into the list, so that it is executed in the next loop of execution. In that way there is only one task in the task list at the start, because the following tasks are generated by the previous ones. It is also a way to manage the order of execution of instructions.
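The idea that each task prepares the tasks that follow it can be sketched with a plain queue. The names ready_task, execute_tasks and instruction are hypothetical and only echo the vocabulary of the slide; the actual CfTypes functions take an environment as well and are not reproduced here.

```ocaml
(* Hypothetical task queue; it mirrors the slide's vocabulary, not CfTypes. *)
let tasks : (unit -> unit) Queue.t = Queue.create ()

(* Order in which the tasks ran, most recent first. *)
let log : int list ref = ref []

(* Append a task to the list of tasks waiting to run. *)
let ready_task task = Queue.add task tasks

(* Run tasks until none is left; running a task may ready new ones. *)
let execute_tasks () =
  while not (Queue.is_empty tasks) do
    (Queue.pop tasks) ()
  done

(* Each task records its own work, then readies its successor. *)
let rec instruction n last () =
  log := n :: !log;
  if n < last then ready_task (instruction (n + 1) last)

let () =
  ready_task (instruction 1 3);   (* only one task is queued initially *)
  execute_tasks ()

let executed = List.rev !log
```

Only the first task is queued by hand; the other two enter the queue while the loop is already running, which is also what fixes their order.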
The compilation function

    let rec compExpr cenv ast =
      let expr =
        match ast with
        | CfAst.Apply (loc, ann, comp, args)      -> compApply cenv loc ann comp args
        | CfAst.Connect (loc, expr0, expr1)       -> compConnect cenv loc expr0 expr1
        | CfAst.Cond (loc, p, t, f)               -> compCond cenv loc p t f
        | CfAst.Name (loc, name)                  -> compName cenv loc name
        | CfAst.DotPosition (loc, sys, position)  -> compDotPosition cenv loc sys position
        | CfAst.Comp (loc, ann, ports, stmts)     -> compComponent cenv loc ann ports stmts
        …
        | CfAst.Integer (loc, i)                  -> compInteger cenv loc i
        | CfAst.Vector (loc, s)                   -> compVector cenv loc s
        | CfAst.Record (loc, fields)              -> compRecord cenv loc fields
      in
      let taskedExpr renv variable =
        CfTypes.readyTask (renv, fun () -> expr renv variable)
      in
      taskedExpr
    and compStmts = …
    and compComp = …
    and compName = …
    …

(Slide annotations on the code: 1st step, 2nd step, 3rd step.)
The CfTypes module
• The CfTypes module contains the auxiliary functions for the compiler, such as functions manipulating types or creating structures from raw data. Most of these functions deal with error management, various checks, or the management of specific types.
• It also contains the types of the data the tasks deal with. These types are quite similar to those of the AST, but they have a different nature: they are part of the result itself, so they describe the results of compiling the AST.
• They are also trees:

    value =
        Free     of slot list ref * variable list ref
      | Integer  of Intbig.intbig
      | Float    of float
      | Boolean  of bool
      | Vector   of Cf_fnf.producer
      | Vector0
      | Record   of int * string array * variable array
      | System   of renv
      | Comp     of renv * Loc.loc * int * string * string array * (renv -> unit)
      | Property of property

• CfTypes also handles environment management: the type env is defined, and the structure of the environment is described.
Environment management

    type renv = {
      renvId      : Cf_fnf.system;
      renvParent  : renv;          (* parent env *)
      renvCompLoc : Loc.loc;       (* loc of the component description *)
      renvAppLocs : Loc.loc list;  (* loc of the component evaluations *)
      renvPorts   : variable;      (* ports with names and values *)
    }

• The environments are organised in layers, with parents and children. The root environment is its own parent (recursive reference).
• The renvPorts variable is usually a Record consisting of two arrays: one of variable names, one of values.
• The environment is the place where systems are translated into netlists. It is also the place where information about the fnf netlist is stored.
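The recursive reference of the root environment can be written directly in OCaml. The env record below is a hypothetical, cut-down version of renv with only a name and a parent; it is a sketch of the idea, not the definition used in CfTypes.

```ocaml
(* Hypothetical, cut-down environment: the real renv has more fields. *)
type env = { name : string; parent : env }

(* The root is defined recursively so that it is its own parent. *)
let rec root = { name = "root"; parent = root }

let child = { name = "top"; parent = root }

(* Walk up the parent chain and stop at the root, detected with physical
   equality (==); structural equality (=) may not terminate on a cycle. *)
let rec depth e = if e.parent == e then 0 else 1 + depth e.parent
```

With this layout no option type or special "no parent" value is needed, at the price of having to test for the root explicitly when walking up the chain.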
The unification function
• The unification of 2 terms is an operation that tries to substitute an expression for a variable. This process is used in the compilation to effectively replace an instruction by its value.
• The unification function is the central point of the CfTypes module (see The Functional Approach to Programming by Guy Cousineau and Michel Mauny).

    let rec unify var0 var1 upSet =
      ...
      match (val0, val1) with
        (Free (slots0, frees0), Free (slots1, frees1)) ->
          let slotsNew = List.rev_append !slots0 !slots1 in
          ...
      | (Free (slots, frees), value)
      | (value, Free (slots, frees)) ->
          List.iter (fun var -> var := value; freeVariableDetermined ()) !frees;
          List.iter (fun slot -> sync slot) !slots
      | (Record (arity0, names0, variables0), Record (arity1, names1, variables1)) ->
          ...
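A reduced version of this unification can be written in a few lines. The sketch below is an assumption-laden simplification, not the CfTypes code: the value type keeps only three constructors, Free carries the list of aliased variables but no slots, and new_free and Unify_error are hypothetical names.

```ocaml
(* Hypothetical, reduced value type: the real one has more constructors,
   and its Free constructor also carries slots to synchronise. *)
type value =
  | Free of variable list ref   (* variables aliased to this unknown *)
  | Integer of int
  | Record of variable array
and variable = value ref

exception Unify_error

(* Create an undetermined variable that lists itself as its only alias. *)
let new_free () : variable =
  let v = ref (Integer 0) in
  v := Free (ref [ v ]);
  v

let rec unify (var0 : variable) (var1 : variable) =
  match (!var0, !var1) with
  | (Free frees0, Free frees1) ->
      if frees0 != frees1 then begin
        (* Merge the two alias groups and make them share one Free value. *)
        let merged = Free (ref (List.rev_append !frees0 !frees1)) in
        List.iter (fun v -> v := merged) (!frees0 @ !frees1)
      end
  | (Free frees, value) | (value, Free frees) ->
      (* Determine every variable aliased to the unknown. *)
      List.iter (fun v -> v := value) !frees
  | (Integer a, Integer b) -> if a <> b then raise Unify_error
  | (Record vars0, Record vars1) ->
      if Array.length vars0 <> Array.length vars1 then raise Unify_error;
      Array.iteri (fun i v -> unify v vars1.(i)) vars0
  | _ -> raise Unify_error
```

For instance, after unifying two fresh variables x and y and then unifying y with a variable holding Integer 10, both x and y hold Integer 10, because they belong to the same alias group.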
Execution of the tasks
• Calls readyTask on the task, which adds the task (renv, run ()) defined during compilation to the task list
• Runs recursively all the tasks of the list
• Checks errors

[Figure: the instruction list (instruction n-1, n, n+1) before and after each task. Executing a task evaluates functions and modifies the instruction list.]
Output file
• Calls Cf_fnf.output_fnf channel: this function translates the components, slots and cells that have been defined into a text file in .fnf format.

    (scope "top" "top" (
      (dangle 0)
      (const 1 "0")
      (const 2 "1")
      (buf 3 10 6)
      (const 4 "0000000001")
      (add 5 10 3 4)
      (ff 6 10 11 7)
      (mux 7 10 1 8 9)
      (mux 8 10 2 6 5)
      (const 9 "0000000000")
      (output 10 "la sortie" 10 3)
      (input 11 "clock" 1)
    ))
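Writing such a listing amounts to printing one parenthesised line per cell. The sketch below is not Cf_fnf: the cell type, string_of_cell and output_scope are hypothetical, only three kinds of cell are covered, and the meaning given to each field (id, width, inputs) is a reading of the listing above rather than a documented format.

```ocaml
(* Hypothetical cell type covering a few of the cells in the listing.
   Field meanings are inferred from the listing, not from a specification. *)
type cell =
  | Const of int * string          (* id, bit string *)
  | Add of int * int * int * int   (* id, width, left input id, right input id *)
  | Input of int * string * int    (* id, name, width *)

(* Print one cell as a parenthesised line; %S adds the double quotes. *)
let string_of_cell = function
  | Const (id, bits) -> Printf.sprintf "(const %d %S)" id bits
  | Add (id, width, a, b) -> Printf.sprintf "(add %d %d %d %d)" id width a b
  | Input (id, name, width) -> Printf.sprintf "(input %d %S %d)" id name width

(* Wrap the cells of one scope, one indented cell per line. *)
let output_scope name cells =
  let lines = List.map (fun c -> "  " ^ string_of_cell c) cells in
  Printf.sprintf "(scope %S %S (\n%s\n))" name name (String.concat "\n" lines)
```

Calling string_of_cell (Add (5, 10, 3, 4)) gives the text (add 5 10 3 4), the same shape as the adder line in the listing.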
Type system of Confluence
• There is no type system in Confluence, except the one coming from OCaml itself: this means that there is no typing algorithm in Confluence.
• The data structures in the Confluence compiler are OCaml types. They do not correspond to Confluence data types, even if they are very similar.
• A way to define a typing algorithm in Confluence is to start from the AST and reuse the tree traversal function defined for the compilation. Equipped with that, we have to implement:
    • Confluence types: type cf_type = etc…
    • The type constraints for primitives
    • The algorithm generating type constraints
    • The algorithm solving the type constraints
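The four items to implement can be illustrated with a very small constraint solver. Everything below is a hypothetical sketch of what such a type system might look like: the constructors of cf_type, the constraint produced for an adder and the names solve and resolve are assumptions, since the slides leave the type definition open.

```ocaml
(* Hypothetical Confluence-like types; not part of the existing compiler. *)
type cf_type =
  | TInt
  | TBool
  | TVector of int   (* bit vector of a known width *)
  | TVar of int      (* unknown type, to be solved *)

(* A constraint states that its two sides must be the same type. *)
type constr = cf_type * cf_type

exception Type_error

(* Constraints for a primitive adder: inputs and output all agree. *)
let constraints_of_add a b out : constr list = [ (a, b); (out, a) ]

(* Replace the variable [n] by [t] in a (flat) type. *)
let subst n t = function TVar m when m = n -> t | other -> other

(* Solve equality constraints one at a time, building a substitution. *)
let rec solve (cs : constr list) : (int * cf_type) list =
  match cs with
  | [] -> []
  | (a, b) :: rest when a = b -> solve rest
  | (TVar n, t) :: rest | (t, TVar n) :: rest ->
      let rest' = List.map (fun (x, y) -> (subst n t x, subst n t y)) rest in
      (n, t) :: solve rest'
  | _ -> raise Type_error

(* Follow the substitution until a non-variable or an unbound variable. *)
let rec resolve sol t =
  match t with
  | TVar n ->
      (match List.assoc_opt n sol with Some t' -> resolve sol t' | None -> t)
  | _ -> t

(* Counter example: variables 0 = output, 1 = next, 2 = constone. *)
let solution =
  solve ((TVar 2, TVector 10) :: constraints_of_add (TVar 0) (TVar 2) (TVar 1))
```

In this example the width of constone propagates through the adder, so resolve solution (TVar 1) yields TVector 10 for next. Because the types are flat, no occurs check is needed here; a richer cf_type with nested types would require one.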
Conclusion
• The type system remains to be done. (I'm working on it)
• From it we could work on overloading for operators and components
• References:
    • www.confluent.org (Confluence home page, with manual and links)
    • http://caml.inria.fr/pub/docs/manual-ocaml/index.html (OCaml manual, contains the lexer and parser description, libraries documentation ...)
    • The Functional Approach to Programming by Guy Cousineau and Michel Mauny
    • Modern Compiler Implementation in ML by Andrew Appel