Daniela da Cruz Pedro Rangel Henriques Universidade do Minho Braga - Portugal gEPL / DI-UM
Project Context
This project emerged in the context of another project: CoCo/RF
• Partners: Universidade da Beira Interior (UBI), University of Linz, and Universidade do Minho (UM)
• Aim: to port the compiler generator CoCo/R (developed at ULinz) to OCaml (more precisely, to F#)

Project Aims
• To explore as thoroughly as possible one of the existing implementations of CoCo/R (we chose the C# version), in order to understand the generator, the generated compiler, and the CoCoL specifications.
• To develop a complete compiler for an imperative, structured programming language.

Compiler
Translates a source program, written in LISS, into assembly code for the target machine
• First version (simple LISS spec)
  • MSP (very simple virtual stack machine)
• Second version (full LISS spec)
  • VM (powerful virtual stack machine)

Compiler
• Top-down parser
  • Pure recursive descent
  • LL() conflicts solved with a lookahead(n) strategy
• Syntax-directed translator (static / dynamic semantic rules executed during parsing), supporting inherited and synthesized attributes
• Implemented in C#

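The parsing strategy on this slide can be illustrated with a minimal sketch in Python. It is not the C# code that CoCo/R generates: the toy expression grammar, the names `tokenize`, `Parser` and `translate`, and the `PUSH`/`LOAD`/`ADD`-style instruction names are all assumptions made for illustration. Each nonterminal becomes one method, a single token of lookahead (`peek`) selects the alternative, and the returned instruction list plays the role of a synthesized attribute.

```python
import re


def tokenize(text):
    """Split an arithmetic expression into number, identifier and operator tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


class Parser:
    """Recursive-descent parser with one method per nonterminal.

    Every method returns the stack-machine code for the phrase it recognised,
    so translation happens while parsing (syntax-directed translation).
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """One-token lookahead; None marks the end of the input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # Expr = Term { ("+" | "-") Term }
        code = self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            code = code + self.term() + ["ADD" if op == "+" else "SUB"]
        return code

    def term(self):
        # Term = Factor { ("*" | "/") Factor }
        code = self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            code = code + self.factor() + ["MUL" if op == "*" else "DIV"]
        return code

    def factor(self):
        # Factor = number | identifier | "(" Expr ")"
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if tok == "(":
            code = self.expr()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.pos += 1
            return code
        if tok.isdigit():
            return [f"PUSH {tok}"]
        if tok[0].isalpha() or tok[0] == "_":
            return [f"LOAD {tok}"]
        raise SyntaxError(f"unexpected token {tok!r}")


def translate(text):
    """Parse a whole expression and return its stack-machine code."""
    parser = Parser(tokenize(text))
    code = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return code


print(translate("6 + 8 * 5"))
# ['PUSH 6', 'PUSH 8', 'PUSH 5', 'MUL', 'ADD']
```

Because the code for both operands is emitted before the operator, the output is already in the postfix order a stack machine expects.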
Compiler
• Generated by the compiler-compiler
  • CoCo/R (C# implementation)
• LISS syntax and semantics were specified by an AG written in CoCoL
• Input files
  • Liss.ATG + SymbolTable.cs + VMcodegen.cs
• Output file
  • Liss.exe

The Programming Language LISS
LISS stands for Language of Integer, Sequences and Sets.
LISS is a high-level toy language, suited to teaching the basic skills of imperative (procedural) structured programming.
The language follows the verbose Pascal style, with long, case-insensitive keywords.

The Programming Language LISS
LISS was designed with the main goal of being a challenging case study for compiler courses.
With a proper syntax (simplifying the syntactic analysis), LISS has an unusual semantic definition!
It requires powerful semantic analysis and clever machine-code generation.

The Programming Language LISS
The design of LISS emphasizes:
• Static (compile-time) and dynamic (run-time) type checking
• Scope analysis
• Code generation strategies (translation schemas) to support non-standard data types, I/O and control statements

Errors Detection
program Errors {
  Declarations
    a := 4, b -> Integer;
    d := true, flag -> boolean;
    array1 := [[1,2],[2,3]], vector -> Array size 4,3;
    seq1 := <<1,2,3,4>> -> Sequence;
  Statements
    // "8/d" is invalid here, because "d" is of boolean type
    b = 6 + 8 * 5 - 8/d;
    // "array1" is a two-dimensional array, so a single index is not enough
    a = array1[1];
    // the result of indexing array1 cannot be given a boolean type
    b = array1[0,0];
    // must flag an error, because the final tail will be empty
    a = head(tail(tail(tail(tail(seq1)))));
}

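The first error in the program above (`8/d` with a boolean `d`) is the kind of mistake a static type check can catch. The following Python sketch shows one way to do it. The tuple-based expression encoding, the `type_of` function and the `LissTypeError` name are assumptions for illustration and are not taken from the LISS compiler.

```python
class LissTypeError(Exception):
    """Compile-time type error (hypothetical name)."""


ARITHMETIC = {"+", "-", "*", "/"}


def type_of(expr, symbols):
    """Return the type of an expression or raise LissTypeError.

    Assumed encoding: int and bool literals, identifier strings,
    or (operator, left, right) tuples.
    """
    if isinstance(expr, bool):  # test bool first: in Python, bool is a subclass of int
        return "boolean"
    if isinstance(expr, int):
        return "integer"
    if isinstance(expr, str):
        if expr not in symbols:
            raise LissTypeError(f"undeclared identifier {expr!r}")
        return symbols[expr]
    op, left, right = expr
    left_type = type_of(left, symbols)
    right_type = type_of(right, symbols)
    if op in ARITHMETIC:
        if left_type != "integer" or right_type != "integer":
            raise LissTypeError(
                f"operator {op!r} needs integer operands, got {left_type} and {right_type}"
            )
        return "integer"
    raise LissTypeError(f"unknown operator {op!r}")


# Symbol table matching the declarations of the Errors program
symbols = {"a": "integer", "b": "integer", "d": "boolean", "flag": "boolean"}

# 6 + 8 * 5 - 8/d
bad = ("-", ("+", 6, ("*", 8, 5)), ("/", 8, "d"))
try:
    type_of(bad, symbols)
except LissTypeError as err:
    print("type error:", err)
```

The last error in the program (taking the head of an empty sequence) depends on run-time values, so it would need a dynamic check instead.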
Variable Declarations
Variables of any type can be declared. All variables are initialized with default values (0, empty, false). Initial values can be assigned in declarations:
  i, j, count=100 -> INT;
  vec=[[[1,2,3],[4,5],[6]]] -> ARRAY size 4,3,6
  lst1, lst2=<<9,8,7,6>>, lst=<<11>> -> SEQ;
  Codes, C3, C2, C1={y|y>250} -> SET;
  flag=True, exists, found=True -> Bool

Data Types: Integers
• Algebraic (+ - * /)
• Successor (suc) & Predecessor (pred)
• Relational Operators ( = != < <= > >= )
• I/O:
  • Read: read( i )
  • Write: write( i ); writeLn( a*b/2 )

Data Types: Integers
program IntegerTest {
  Declarations
    intA := 4, intB, intC := 6 -> Integer;
    i, j, k -> Integer;
  Statements
    // arithmetic operations
    intA = -3 + intB * (7 + intc);
    writeLn(intA);
    // input
    read(i);
    read(j);
    writeLn( i/j );
    /* Inc/Dec operations */
    writeLn( pred(INTc) );
    writeLn( suc(INTc) );
}

Data Types: Integers
Constraints
• Division by zero is not defined

Data Types: Static Sequences (Multi-Dimensional Arrays)
• Indexing
  vec[i] = vec[2] + vec[j-4]
  vec3[i,j,k] = i*j*k
• Assignment
  vec2 = vec1
• Length
  length( vec2 )
• I/O:
  • Write: write( vec )

Data Types: Static Sequences
Constraints
• Every index must lie within the range of valid values
• The number of indexes must agree with the number of dimensions

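The two constraints above can be sketched in Python as a check on the number of indexes (possible at compile time, since the dimensions are declared) and a range check on each index (needed at run time when indexes are computed). Zero-based indexing and a row-major memory layout are assumptions here, and the function names are hypothetical.

```python
def check_index_count(dims, indexes):
    """Static check: exactly one index per declared dimension."""
    if len(indexes) != len(dims):
        raise TypeError(
            f"array has {len(dims)} dimension(s) but {len(indexes)} index(es) were given"
        )


def flat_offset(dims, indexes):
    """Dynamic range check plus offset computation.

    Assumes zero-based indexes and row-major storage.
    """
    check_index_count(dims, indexes)
    offset = 0
    for size, index in zip(dims, indexes):
        if not 0 <= index < size:
            raise IndexError(f"index {index} outside range 0..{size - 1}")
        offset = offset * size + index
    return offset


# An array declared with "Array size 4,3,2"
print(flat_offset((4, 3, 2), (1, 0, 1)))  # 7
```

Under these assumptions, a single index on a two-dimensional array is rejected by `check_index_count`, while an access such as `[1,2,3]` on an array of size 4,3,2 fails the range check on the last index.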
Data Types: Static Sequences
program ArrayTest {
  Declarations
    vector1 := [1,2,3], vector2 -> Array size 5;
    array1 := [[1,2],[4]] -> Array size 4,2;
    array2 := [ [[1],[5]], [[2,2],[3]] ] -> Array size 4,3,2;
  Statements
    a = array2[1,2,3];
    b = array2[1,0,a*2];
    array2[2,b,a] = 15;
    writeLn(array2);
    vector2 = vector1;
}

Data Types: Dynamic Sequences
• Linked Lists
  • Empty List
  • List with some elements

Data Types: Dynamic Sequences
Operations:
• Insert & Delete
  cons( 2,lst ); del( 2,lst )
• Head & Tail
  i = head(lst); list = tail(lst);
• Member & IsEmpty
  if ( isMember(2,lst) )…; while ( isEmpty(lst) )…;

Data Types: Dynamic Sequences (cont.)
• Indexing
  lst[3]; names[2*i-j]
• Length
  length( lst );
• Assignment
  lst2 = lst1; copy( lstSrc , lstDest );
• I/O:
  • Write: write( lst )

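Since dynamic sequences are linked lists, the operations listed on these two slides can be sketched as follows in Python. The class and method names are hypothetical. Two behaviours are assumptions rather than facts about LISS: `tail` returns a fresh copy, and `delete` removes only the first occurrence of a value.

```python
class SeqError(Exception):
    """Run-time error on a sequence operation (hypothetical name)."""


class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class Seq:
    """Singly linked list offering the sequence operations from the slides."""

    def __init__(self, values=()):
        self.first = None
        for value in reversed(list(values)):
            self.first = Node(value, self.first)

    def __iter__(self):
        node = self.first
        while node is not None:
            yield node.value
            node = node.next

    def is_empty(self):
        return self.first is None

    def length(self):
        return sum(1 for _ in self)

    def is_member(self, value):
        return any(item == value for item in self)

    def cons(self, value):
        """Insert a value at the front of the sequence."""
        self.first = Node(value, self.first)

    def head(self):
        if self.first is None:
            raise SeqError("head of an empty sequence")
        return self.first.value

    def tail(self):
        """Return a copy of the sequence without its first element."""
        if self.first is None:
            raise SeqError("tail of an empty sequence")
        return Seq(list(self)[1:])

    def delete(self, value):
        """Unlink the first node holding the value, if there is one."""
        prev, node = None, self.first
        while node is not None:
            if node.value == value:
                if prev is None:
                    self.first = node.next
                else:
                    prev.next = node.next
                return
            prev, node = node, node.next


seq1 = Seq([10, 20, 30, 40, 50])
print(seq1.tail().tail().head())  # 30
seq1.delete(30)
print(list(seq1))  # [10, 20, 40, 50]
```

With this sketch, four successive `tail` calls on a four-element sequence leave it empty, so a following `head` raises `SeqError`, which is the situation a run-time check has to detect.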
Data Types: Dynamic Sequences
program Seq {
  Declarations
    seq1 := <<10,20,30,40,50>>, seq3 := <<1,2>>, seq2 -> Sequence;
  Statements
    // Selection Operations
    a = head(tail(tail(seq1)));
    seq2 = TAIL(seq3);
    seq1 = tail(seq1);
    // add & delete an element of a sequence
    cons(3*4+a,seq2);
    cons(a*int1*int2,seq2);
    del(30,seq1);
    // tests (empty & void)
    b = isEmpty(seq2);
    writeLn(b);
    if ( member(1,seq3) ) then writeLn("Is member list");
    // indexing a sequence
    int1 = seq1[2*head(tail(seq3))];
    // write the sequence
    seq3 = seq1;
    write(seq3);
}

Data Types: Dynamic Sequences
Constraints
Data Types: Sets
• Sets are:
  • Defined in comprehension
    Codes = { x | x>=100 && x<500 }
  • Represented (in memory) in a binary tree

Data Types: Sets
• Union (++) & Intersection (**)
  C3 = C1++C2; C3 = C1**C2
• Member (in)
  while ( N in Codes )
• I/O:
  • Write: write( C3 )

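A set defined in comprehension cannot be stored element by element, so one plausible reading of "represented in a binary tree" is a tree of the defining predicate. The Python sketch below follows that reading. It is an assumption, not the documented LISS representation, and the names `Cmp`, `Both`, `Either`, `member`, `union` and `intersection` are hypothetical. Union and intersection simply build a new tree node, and membership evaluates the tree.

```python
import operator
from dataclasses import dataclass

COMPARE = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


@dataclass(frozen=True)
class Cmp:
    """Leaf: x <op> bound."""
    op: str
    bound: int


@dataclass(frozen=True)
class Both:
    """Inner node for && (also used for intersection)."""
    left: object
    right: object


@dataclass(frozen=True)
class Either:
    """Inner node for || (also used for union)."""
    left: object
    right: object


def member(n, tree):
    """Evaluate the predicate tree for the candidate value n."""
    if isinstance(tree, Cmp):
        return COMPARE[tree.op](n, tree.bound)
    if isinstance(tree, Both):
        return member(n, tree.left) and member(n, tree.right)
    if isinstance(tree, Either):
        return member(n, tree.left) or member(n, tree.right)
    raise TypeError(f"not a set tree: {tree!r}")


def union(a, b):
    return Either(a, b)


def intersection(a, b):
    return Both(a, b)


# Codes = { x | x>=100 && x<500 }
codes = Both(Cmp(">=", 100), Cmp("<", 500))
# C1 = { y | y>250 }
c1 = Cmp(">", 250)

print(member(250, codes))                    # True
print(member(200, intersection(codes, c1)))  # False
print(member(900, union(codes, c1)))         # True
```

A consequence of this design is that sets may be infinite, which is why membership is tested by evaluation; printing such a set would have to print the predicate instead of the elements.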
Data Types: Sets
program Sets {
  Declarations
    bool := true, flag, flag2 -> BOOLEAN;
    e, f := { x | x > 7}, g := {x | x < 8 || x > 15 && x < 13 } -> Set;
  Statements
    -- sets
    e = f ++ g;
    f = g ** f;
    g = g ** { x | x > 6 };
    flag = a << e;
    flag2 = a << g;
    bool = 8 << {x | x < 10 && x > 7 };
}

Data Types: Booleans
• Boolean Operators
  • &&
  • ||
  • not

Data Types: Booleans
program BoolTest {
  Declarations
    intA := 4, intB, intC := 6 -> Integer;
    bool, flag := false -> booLEaN;
  Statements
    bool = intA < 8;
    writeLn(bool);
    /* logic operations */
    flag = (intB != intA) && (intA > 7) || bool;
    writeLn(flag);
    bool = !( (intA == intB)||(intA != intC)&&(intC < 6) ) || flag;
    writeLn(bool)
}

SubPrograms
Subprograms, with zero or more parameters, can be
• Functions (return a value)
• Procedures (don't return a value)
They can be declared
• at the same level
• nested (to any depth)
They can be called anywhere they are visible.

SubPrograms
subProgram calculate() :: integer {
  Declarations
    res := 6 -> integer;
    index -> INTeger;
    subprogram factorial(n -> integer) :: integer {
      Declarations
        res := 1 -> integer;
      Statements
        while (n > 0) {
          res = res * n;
          n = n -1;
        }
        return res;
    }
  for ( index in 0..4 ) {
    res = factorial(a);
    writeLn(res);
  }
}

SubPrograms
• At the top level, given the previous example, we can write:
  intA = calculate();
• But we can't write the following, because factorial is nested inside calculate and is not visible at the top level:
  intA = factorial(6);

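The visibility rule behind this example can be sketched with a chain of scopes, which is a common way to organise a symbol table for nested subprograms. The Python below is an illustration only: the `Scope` class and its methods are hypothetical and are not taken from the SymbolTable.cs file of the project.

```python
class Scope:
    """One level of a symbol table; lookups walk outwards through parent scopes."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}

    def declare(self, ident, kind):
        if ident in self.symbols:
            raise NameError(f"{ident!r} is already declared in scope {self.name!r}")
        self.symbols[ident] = kind

    def lookup(self, ident):
        scope = self
        while scope is not None:
            if ident in scope.symbols:
                return scope.symbols[ident]
            scope = scope.parent
        raise NameError(f"{ident!r} is not visible in scope {self.name!r}")


top = Scope("program")
top.declare("intA", "integer variable")
top.declare("calculate", "function")

inner = Scope("calculate", parent=top)
inner.declare("res", "integer variable")
inner.declare("factorial", "function")  # nested: known only inside calculate

print(top.lookup("calculate"))    # function
print(inner.lookup("factorial"))  # function
try:
    top.lookup("factorial")
except NameError as err:
    print("scope error:", err)
```

Names declared in an enclosing scope stay visible from inner scopes (for example `intA` from inside `calculate`), while the reverse lookup fails.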
SubPrograms
subprogram factorial(n -> integer) :: integer {
  Declarations
    res := 1 -> integer;
  Statements
    while (n > 0) {
      res = res * n;
      pred(n);
    }
    return res;
}
subProgram calculate(m -> array size 4) :: array size 4 {
  Declarations
    fac := 6 -> integer;
    res := -16 -> integer;
  Statements
    for (a in 0..3) stepUP 1 {
      m[a] = factorial(fac + a);
    }
    return m;
}

Control Statements
• If () Then {} [ Else {} ]
  if d == true then {
    if( flag ) tHen { a = 6; }
    else { b = 9; }
  }
  if !(c[2]==0) then { a = 5; write(a); }
  else { b=7; c[5] = 5; write(b); }

Control Statements
• While () {}
  while(isMember(10,seq2)) {
    delete(10,seq2);
    writeLn(seq2);
  }
  while(length(array) != 10) {
    array[i] = i;
    suc(i);
  }

Control Statements
• For
  For i in v1[..v2 [stepup/stepdown N]] [satisfying Exp]

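One possible reading of this loop form is sketched below as a Python generator. Several points are assumptions, since the slide gives only the syntax: the upper bound `v2` is taken as inclusive, `stepup` and `stepdown` are taken to add or subtract `N`, and the `satisfying` expression is taken as a filter applied to each value of the control variable. The form with `v1` alone is not covered. The name `liss_for` is hypothetical.

```python
def liss_for(v1, v2, step=1, down=False, satisfying=None):
    """Yield the values a loop 'for i in v1..v2 stepup/stepdown N satisfying Exp' would visit.

    Assumptions: v2 is inclusive, step is a positive integer, and the
    satisfying predicate only filters values (it does not stop the loop).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    i = v1
    while (i >= v2) if down else (i <= v2):
        if satisfying is None or satisfying(i):
            yield i
        i += -step if down else step


print(list(liss_for(0, 3)))                                  # [0, 1, 2, 3]
print(list(liss_for(10, 1, step=3, down=True)))              # [10, 7, 4, 1]
print(list(liss_for(0, 9, satisfying=lambda i: i % 2 == 0))) # [0, 2, 4, 6, 8]
```

A code generator could lower the same loop to an initialisation, a bound test, a conditional jump around the body when the filter is false, and an increment or decrement.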
VM Architecture
• Virtual Machine with:
  • Instruction stack (program)
  • Calling stack: saves pairs of pointers (i, f)
    • i: saved pc
    • f: saved fp
  • Execution stack (global/local/working memory)
  • Two heaps
  • Four registers (pc, sp, fp, gp)

VM Architecture (diagram)

VM - Instruction Set
• Data Transfer
  • Push / Load
  • Pop
  • Store
  • Alloc / Free
• IO
  • Read
  • Write

VM - Instruction Set
• Control
  • Jump
  • Call
  • Return
• Miscellaneous
  • Type Conversion
  • Check
  • Start
  • Stop
  • Err

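To make the architecture concrete, here is a minimal stack-machine sketch in Python. It is not the project's VM: the instruction names (`PUSH`, `LOADG`, `STOREG`, `LOADL`, `JZ`, and so on), their operands and their encoding as tuples are assumptions, the heaps are left out, and `sp` is implicit in the length of the Python list. What it does show is the role of the instruction stack, the execution stack, the calling stack of saved (pc, fp) pairs, and a run-time check for division by zero.

```python
class VMError(Exception):
    """Run-time error in the sketch VM (stands in for an Err instruction)."""


def run(program, n_globals=0):
    """Execute a list of instruction tuples and return the values written."""
    stack = [0] * n_globals       # execution stack; globals sit at the bottom
    calls = []                    # calling stack of (saved pc, saved fp) pairs
    output = []
    pc, fp, gp = 0, n_globals, 0  # sp is implicit: len(stack)
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(args[0])
        elif op == "LOADG":       # read a global, addressed relative to gp
            stack.append(stack[gp + args[0]])
        elif op == "STOREG":      # write a global, addressed relative to gp
            stack[gp + args[0]] = stack.pop()
        elif op == "LOADL":       # read a local, addressed relative to fp
            stack.append(stack[fp + args[0]])
        elif op in ("ADD", "SUB", "MUL", "DIV"):
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            elif op == "MUL":
                stack.append(left * right)
            else:
                if right == 0:
                    raise VMError("division by zero")
                stack.append(left // right)  # floor division: an assumption
        elif op == "JUMP":
            pc = args[0]
        elif op == "JZ":          # jump when the popped value is zero
            if stack.pop() == 0:
                pc = args[0]
        elif op == "CALL":
            calls.append((pc, fp))  # save the return address and the caller's frame
            fp = len(stack)
            pc = args[0]
        elif op == "RETURN":
            pc, fp = calls.pop()
        elif op == "WRITE":
            output.append(stack.pop())
        elif op == "STOP":
            return output
        else:
            raise VMError(f"unknown instruction {op!r}")


program = [
    ("PUSH", 6), ("PUSH", 8), ("PUSH", 5), ("MUL",), ("ADD",),  # 6 + 8 * 5
    ("STOREG", 0),   # keep the result in global 0
    ("CALL", 8),     # call the routine that starts at address 8
    ("STOP",),
    ("LOADG", 0),    # address 8: write global 0, then return
    ("WRITE",),
    ("RETURN",),
]
print(run(program, n_globals=1))  # [46]
```

Keeping return addresses and frame pointers on a separate calling stack, as the architecture slide describes, means `RETURN` never has to search the execution stack for them.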
Project Documentation
• Technical Report on LISS Compiler Development, written in NoWeb, includes
  • LISS Specification (AG in CoCoL)
  • Sample LISS Programs (Tests)
  • Compiler's Internal Data Structures
  • Target Machine Description (VM)
  • Translation Schemas