Translation to Intermediate Code
Chapter 7
Cheng-Chia Chen
Intermediate Representation (IR)
• A kind of abstract machine language
• Expresses target-machine operations without machine-specific detail
• Independent of the details of the source language
• Compiler front end:
  • lexical analysis, parsing, semantic analysis, translation to IR
• Compiler back end:
  • optimization on IR, code generation (instruction selection; register allocation)
Compiler for several languages and machines
[Figure: source languages (Java, ML, Pascal, C, C++) connected to target machines (Sparc, MIPS, Pentium, Itanium), first directly and then through a common IR]
• Number of translators: M × N without an IR, M + N with one
• A good IR
  • must be convenient for the semantic analysis phase to produce
  • must be convenient to translate into real machine code
  • has a clear and simple meaning
  • i.e., SIMPLE but GENERAL!
Intermediate Representation Tree
• package tree; // IR tree
• abstract class Exp
  • computation of some value (possibly with side effects)
• CONST(int value)
  • the integer constant value
• NAME(Label label) // program location
  • the symbolic constant label (corresponding to an assembly-language label, i.e., a reference to a program location)
• TEMP(temp.Temp temp) // data location in a register
  • a temporary (i.e., a location for storing data); like a register in a real machine, but the abstract machine has infinitely many temporaries
• BINOP(int binop, Exp left, Exp right)
  • applies the binary operator binop (e.g. +, *) to left and right; left is evaluated before right
tree.Exp (continued)
• MEM(Exp exp) // data location in memory
  • the contents of wordSize bytes of memory starting at address exp
  • means "store" when it is the left child of a MOVE, and "fetch" otherwise
• CALL(Exp func, List<Exp> args) // func is a NAME
  • procedure call func(args)
  • func is evaluated before args, which are evaluated left to right
  • has side effects (caller-save registers, RV, argument registers, etc. are affected)
• ESEQ(Stm stm, Exp exp) // abbreviated as [stm : exp]
  • evaluate stm for its side effects, then evaluate exp and return its value
  • has side effects
abstract class tree.Stm
• performs side effects and control flow
• MOVE(Exp dst, Exp src) // two forms
  • MOVE(TEMP t, e) // t ← e
    • evaluate e and move it into temporary t
  • MOVE(MEM(e1), e2) // M[e1] ← e2
    • evaluate e1 to get an address a, then evaluate e2 and store its result into wordSize bytes of memory starting at address a
• EXP(Exp exp) // ExpStm(Exp exp) or [ e : ]
  • evaluate exp and discard the result
  • useful only if exp has a side effect
• LABEL(Label label) // declare the location that label denotes
  • defines the constant label "label" to be the current machine code address
  • like a label definition in assembly
  • NAME(label) can then be used as the target of jumps, calls, etc.
• Why both LABEL and NAME? So that no object has to be both an Exp and a Stm.
abstract class tree.Stm (continued)
• JUMP(Exp exp, List<Label> targets) // also used for switch
  • transfer control to address exp, where exp must evaluate to one of the labels in targets
  • exp may be a label reference, i.e., NAME(lb), or an address calculated by any other Exp
  • to jump to a known Label lb, use JUMP(NAME lb, [ lb ])
• CJUMP(int relop, Exp left, Exp right, Label iftrue, Label iffalse)
  • evaluate left, then right; compare the results using the relation relop
  • if the result is true go to iftrue, else go to iffalse
  • possible relops: EQ, NE for integers; LT, GT, LE, GE for signed integers
  • unsigned: ULT, ULE, UGT, UGE
• SEQ(Stm left, Stm right) // abbreviated as [left, right]
  • execute left and then execute right
Intermediate Representation Tree (cont'd)
• Other classes
  • ExpList(Exp head, ExpList tail) // replaced by List<Exp>
  • StmList(Stm head, StmList tail) // replaced by List<Stm>
• Other constants:
    public class BINOP { final static int PLUS, MINUS, MUL, DIV, AND, OR, LSHIFT, RSHIFT, ARSHIFT, XOR; … }
    public class CJUMP { final static int EQ, NE, LT, GT, LE, GE, ULT, ULE, UGT, UGE; … }
• Notes:
  • The IR can specify only the body of each function; there is no provision for procedure and function definitions.
  • Procedure entry/exit will be added later as special glue that is different for each target machine.
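To make the node kinds above concrete, here is a minimal sketch of a few of them in Java. It is not the tree package itself: the class names (IrNode, Const, TempReg, BinOp, Mem, Move, IrTreeDemo) are hypothetical, operators and temporaries are plain strings, and expressions and statements share one base class for brevity. The Move constructor enforces the "two forms" rule by rejecting any destination that is not a TEMP or a MEM.

```java
// Simplified, hypothetical stand-ins for a few IR tree classes.
// Unlike the tree package, expressions and statements share one base class here.
abstract class IrNode {
    abstract String show();
}

class Const extends IrNode {
    final int value;
    Const(int value) { this.value = value; }
    String show() { return "CONST(" + value + ")"; }
}

class TempReg extends IrNode {
    final String name;
    TempReg(String name) { this.name = name; }
    String show() { return "TEMP(" + name + ")"; }
}

class BinOp extends IrNode {
    final String op;
    final IrNode left, right;
    BinOp(String op, IrNode left, IrNode right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    String show() { return "BINOP(" + op + ", " + left.show() + ", " + right.show() + ")"; }
}

class Mem extends IrNode {
    final IrNode addr;
    Mem(IrNode addr) { this.addr = addr; }
    String show() { return "MEM(" + addr.show() + ")"; }
}

class Move extends IrNode {
    final IrNode dst, src;
    Move(IrNode dst, IrNode src) {
        // Only the two forms MOVE(TEMP t, e) and MOVE(MEM(e1), e2) are legal.
        if (!(dst instanceof TempReg) && !(dst instanceof Mem)) {
            throw new IllegalArgumentException("MOVE destination must be TEMP or MEM");
        }
        this.dst = dst;
        this.src = src;
    }
    String show() { return "MOVE(" + dst.show() + ", " + src.show() + ")"; }
}

class IrTreeDemo {
    // Builds the store M[fp + 8] <- 1.
    static IrNode storeOneAtFpPlus8() {
        IrNode address = new BinOp("PLUS", new TempReg("fp"), new Const(8));
        return new Move(new Mem(address), new Const(1));
    }

    public static void main(String[] args) {
        System.out.println(storeOneAtFpPlus8().show());
    }
}
```

A real implementation would keep Exp and Stm as separate class hierarchies, which is what makes ESEQ and EXP necessary as bridges between the two.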
Translate into Tree
    syntaxtree.Exp  ->  translate.Exp     ->  tree.Exp, tree.Stm
                        Ex(), Nx(), Cx()      Exp(), Stm()
    IdentifierExp() ->  Ex.unEx()         ->  tree.MEM()
    Assign()        ->  Nx.unNx()         ->  tree.MOVE()
    Plus()          ->  Ex.unEx()         ->  tree.BINOP()
    Plus()          ->  Ex.unCx()         ->  tree.CJUMP()
    LessThan()      ->  Cx.unCx()         ->  tree.CJUMP()
    LessThan()      ->  Cx.unEx()         ->  tree.ESEQ()
    And()           ->  Cx.unCx()         ->  tree.CJUMP()
The Idea
• How might an expression be used?
  • Different expressions are used in different ways.
  • Some return a value, some return none, some transfer control.
• We need to define methods that give an object-oriented interface to expressions.
    package translate;
    public abstract class Exp {
        abstract tree.Exp unEx();                            // Exp2Exp
        abstract tree.Stm unNx();                            // Exp2Non
        abstract tree.Stm unCx(temp.Label t, temp.Label f);  // Exp2ConditionalJump
        …
    }
Some subclasses of translate.Exp
• Ex: wraps a tree.Exp
• Nx: wraps a tree.Stm
• Cx: wraps a conditional, or a combination of Exps and Stms
  • e.g., the comparison operation <(e1, e2)
  • If you pass it two labels, say t for true and f for false, it returns a tree.Stm that jumps to t if the expression evaluates to true and to f if it evaluates to false.
  • e.g., a > b || c < d
Example: a > b || c < d
[Figure: flow graph. a > b: true goes to tt, false goes to zz. At zz, c < d: true goes to tt, false goes to ff.]
    CJUMP(>, a, b, tt, zz)
    LABEL(zz)
    CJUMP(<, c, d, tt, ff)

    tree.Stm unCx(Label tt, Label ff) {
        Label zz = new Label();
        return new SEQ(new CJUMP(CJUMP.GT, a, b, tt, zz),
               new SEQ(new LABEL(zz),
                       new CJUMP(CJUMP.LT, c, d, tt, ff)));
    }
Ex: subclass of translate.Exp
• an ordinary expression yielding a single value
    class Ex extends Exp {
        tree.Exp exp;
        Ex(tree.Exp e) { exp = e; }
        tree.Exp unEx() { return exp; }
        // if (exp != 0) goto t else goto f
        tree.Stm unCx(Label t, Label f) {
            if (exp instanceof CONST) {
                // a constant condition needs only an unconditional jump
                if (((CONST) exp).value != 0) return new JUMP(t);
                else return new JUMP(f);
            } else {
                return new CJUMP(CJUMP.NE, exp, new CONST(0), t, f);
            }
        }
        tree.Stm unNx() { return new tree.ExpStm(exp); }
    }
Nx: subclass of translate.Exp
• an expression that yields no value
    class Nx extends Exp {
        tree.Stm stm;
        Nx(tree.Stm s) { stm = s; }
        // should never happen
        tree.Exp unEx() { return new tree.ESEQ(stm, new CONST(0)); }
        tree.Stm unCx(Label t, Label f) {
            throw new Error("Nx.unCx"); // should never happen
        }
        tree.Stm unNx() { return stm; }
    }
Cx: subclass of translate.Exp
• a "conditional" expression that jumps to either t or f
• e.g., flag := (a < b && c < d)
  • new Cx(a < b && c < d).unEx() = 1 if true, 0 if false
    abstract class Cx extends Exp {
        abstract tree.Stm unCx(Label t, Label f);
        // evaluate the condition for its side effects only: both outcomes go to the same label
        tree.Stm unNx() {
            Label tf = new Label();
            return new tree.SEQ(unCx(tf, tf), new tree.LABEL(tf));
        }
        tree.Exp unEx() {
            Temp r = new Temp();
            Label t1 = new Label();
            Label f1 = new Label();
            return new ESEQ(
                new SEQ(new MOVE(new TEMP(r), new CONST(1)),          // r ← 1
                new SEQ(unCx(t1, f1),                                 // if (?) goto t1 else goto f1
                new SEQ(new LABEL(f1),                                // f1:
                new SEQ(new MOVE(new TEMP(r), new CONST(0)),          // r ← 0
                        new LABEL(t1))))),                            // t1:
                new TEMP(r));                                         // return r
        }
    }
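The three conversions can be tried out with a much smaller model. The sketch below is an illustration under simplifying assumptions, not the translate package: IR statements are plain strings appended to a list, and the names TrExp, TrEx, TrRelCx, TrDemo and fresh are hypothetical. It follows the same recipes as the slides: an Ex used as a condition compares against 0, and a Cx used as a value sets r to 1, branches, and overwrites r with 0 on the false path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical string-based model: each IR statement is one line of text.
abstract class TrExp {
    static int counter = 0;
    static String fresh(String prefix) { return prefix + (counter++); }

    abstract String unEx(List<String> out);                    // emits code, returns the value operand
    abstract void unNx(List<String> out);                      // emits code, discards any value
    abstract void unCx(String t, String f, List<String> out);  // emits code that jumps to t or f
}

// An ordinary value, like Ex.
class TrEx extends TrExp {
    final String operand;
    TrEx(String operand) { this.operand = operand; }
    String unEx(List<String> out) { return operand; }
    void unNx(List<String> out) { out.add("EXP " + operand); }
    void unCx(String t, String f, List<String> out) {
        out.add("CJUMP NE " + operand + " 0 " + t + " " + f);
    }
}

// A comparison, like RelCx, with the generic Cx conversions written out.
class TrRelCx extends TrExp {
    final String op, left, right;
    TrRelCx(String op, String left, String right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    void unCx(String t, String f, List<String> out) {
        out.add("CJUMP " + op + " " + left + " " + right + " " + t + " " + f);
    }
    // r <- 1; if cond goto t else goto f; f: r <- 0; t: ; the result is r
    String unEx(List<String> out) {
        String r = fresh("r"), t = fresh("L"), f = fresh("L");
        out.add("MOVE " + r + " 1");
        unCx(t, f, out);
        out.add("LABEL " + f);
        out.add("MOVE " + r + " 0");
        out.add("LABEL " + t);
        return r;
    }
    // Side effects only: both outcomes fall through to one label.
    void unNx(List<String> out) {
        String join = fresh("L");
        unCx(join, join, out);
        out.add("LABEL " + join);
    }
}

class TrDemo {
    public static void main(String[] args) {
        List<String> out = new ArrayList<String>();
        String r = new TrRelCx("LT", "a", "b").unEx(out);
        for (String line : out) System.out.println(line);
        System.out.println("value is in " + r);
    }
}
```

Returning strings keeps the sketch short; the real classes return tree.Exp and tree.Stm so that later phases can pattern-match on the tree.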
Translate Simple Variables
• A simple variable v declared in the current stack frame (i.e., a formal parameter or a local) is translated to
    MEM(BINOP(PLUS, TEMP fp, CONST k))
  abbreviated as
    MEM(+(TEMP fp, CONST k))
  where
  • k is the offset of v within the frame
  • TEMP fp is the frame pointer register
Frame class: all machine-dependent stuff
    package frame;
    public abstract class Frame {
        …
        abstract public Temp FP();
        abstract public int wordSize();
    }
    public abstract class Access {
        public abstract tree.Exp exp(tree.Exp framePtr);
    }
• The exp() method of frame.Access is used to turn a frame.Access into a tree expression.
Tree to access the local variable in Frame
• The framePtr argument of exp(.) is the address of the stack frame that the Access lives in.
• For MiniJava:
  1. If v is InFrame(k), then
     • v.exp(new TEMP(frame.FP())) yields
     • MEM(+(TEMP fp, CONST k))
  2. If v is InReg(Temp txxx), then
     • v.exp(…) yields TEMP(txxx).
• Note: both InFrame(k) and InReg(t) are frame.Access objects. By calling their common exp() method (simply v.exp(.)), the translator gets the right result for each without having to distinguish between them.
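The dispatch on the kind of Access can be sketched in a few lines. This is only an illustration: the class names VarAccess, InFrameAccess and InRegAccess are hypothetical stand-ins for frame.Access, InFrame and InReg, and tree expressions are represented as strings in the abbreviated notation used above.

```java
// Hypothetical string-based stand-in for frame.Access.
abstract class VarAccess {
    // framePtr is the expression for the frame the variable lives in.
    abstract String exp(String framePtr);
}

// A variable kept in the frame at a fixed offset from the frame pointer.
class InFrameAccess extends VarAccess {
    final int offset;
    InFrameAccess(int offset) { this.offset = offset; }
    String exp(String framePtr) {
        return "MEM(+(" + framePtr + ", CONST " + offset + "))";
    }
}

// A variable kept in a temporary; the frame pointer is irrelevant.
class InRegAccess extends VarAccess {
    final String temp;
    InRegAccess(String temp) { this.temp = temp; }
    String exp(String framePtr) {
        return "TEMP " + temp;
    }
}

class AccessDemo {
    public static void main(String[] args) {
        VarAccess[] vars = { new InFrameAccess(8), new InRegAccess("t42") };
        // The caller never checks which kind of access it has.
        for (VarAccess v : vars) System.out.println(v.exp("TEMP fp"));
    }
}
```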
Array Variables
• In MiniJava, array variables behave like pointers (scalars).
  • New array values are created and initialized by the constructor new int[n], where n is the number of elements and 0 is the initial value of each element.
• MiniJava objects are also pointers (scalars).
  • Object assignment is pointer assignment.
• Cf. arrays in C and Pascal (structure values)
  • int[20] a, b; …
  • a = b // C: address copy; Pascal: content copy
• Cf. records in C and Pascal (structure values)
  • like the above
L-value vs. R-value
• Not all Exps can appear on the left-hand side of an assignment (or as the destination of a MOVE in the IR).
• Examples:
  • x+1 ← f(2) - 3; 5 ← x - 2; f(x) + 2 ← 10 (not allowed)
  • MOVE(5, e); MOVE(+(2, fp), e) (not allowed)
  • f(x) ← x; (?)
  • x ← 5; a[j+2] ← 4; p.age ← 5 (allowed)
  • MOVE(MEM(…), e); MOVE(TEMP(t), e) (allowed)
• L-values are (results of) expressions that can appear on the left-hand side of an assignment.
  • x, a[j+1], p.age
• R-values are expressions that are not L-values and hence can ONLY appear on the right-hand side of an assignment.
  • 5, f(2) - 3
• Expression = L-value ∪ R-value
Structured L-values
• The meaning of an L-value is context-sensitive.
  • x = x + 1;
  • on the left: a location; on the right: its contents
• l-value: left of assignment -> location
  • e.g., x, p.y, a[i+2]
• r-value + l-value: right of assignment -> contents
  • e.g., a+3, f(x), plus all l-values
• All the variables and l-values in MiniJava are scalar.
  • There is no case like A := B that copies array B into array A.
  • If there were, we would need the size of each variable in Access,
  • and we would need to change MEM(Exp) to MEM(Exp, Size).
• C structs and Pascal arrays and records are structured l-values.
Subscripting and field selection
• a[i] ⇒ (i - low) × int_size + a // = int_size × i + (a - low × int_size)
  • a : array[low .. high] of Integer // Pascal style
• a.f ⇒ a + offset(f)
  • a : record { f1: Type1, f2: Type2, … };
• If a is a global constant address, W = a - low × int_size can be precomputed.
• If low = 0 and a = MEM(e), then a[i] = MEM(+(MEM(e), BINOP(MUL, i, CONST(int_size)))).
• Technically, an l-value should be represented as an address:
  • as an r-value ==> Content(addr); as an l-value ==> ADDR(addr)
• In the book, an l-value is represented as MEM[addr], with the understanding that
  • MEM means both store (on the left-hand side) and fetch (elsewhere).
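The two equivalent forms of the subscript address can be checked numerically. The sketch below uses hypothetical names (SubscriptDemo and its methods) and plain integers for addresses; it only illustrates the arithmetic, not how the translator emits it.

```java
class SubscriptDemo {
    // Direct form: a + (i - low) * size.
    static int addressDirect(int a, int low, int size, int i) {
        return a + (i - low) * size;
    }

    // The part a - low * size does not depend on i, so it can be
    // computed once when a and low are known at compile time.
    static int precomputedBase(int a, int low, int size) {
        return a - low * size;
    }

    // Rearranged form: size * i + base.
    static int addressFromBase(int base, int size, int i) {
        return size * i + base;
    }

    public static void main(String[] args) {
        int a = 1000, low = 3, size = 4;
        int base = precomputedBase(a, low, size);
        for (int i = low; i < low + 3; i++) {
            System.out.println(addressDirect(a, low, size, i) + " " + addressFromBase(base, size, i));
        }
    }
}
```

With a = 1000, low = 3 and 4-byte integers, element 5 lives at 1000 + 2 × 4 = 1008, and the rearranged form gives 4 × 5 + 988 = 1008 as well.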
Array boundary check
• The compiler needs to emit code to check array bounds,
  • but if it can guarantee safety, it may omit the checking code.
• In MiniJava, the length of an array must be stored:
  • int len = …;
  • int[] a = new int[len];
• We can use the first word of the array block to store the length of a, and store the elements of a from the second word on.
  • a.length ⇒ MEM(a)
  • a[0] ⇒ MEM(a + int_size)
  • a[k] ⇒ MEM(a + (k+1) × int_size) = MEM((a + int_size) + k × int_size)
  • if k < 0 or k ≥ MEM(a): array index out of bounds!
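A checked load under this layout might look like the following sketch. It makes two simplifying assumptions that are not in the slides: the heap is modelled as a word-addressed int[] (so int_size is 1), and the names BoundsCheckDemo and load are hypothetical. An index k is treated as valid when 0 ≤ k < length.

```java
class BoundsCheckDemo {
    // heap is word-addressed; the array block starts at word index a.
    // Word a holds the length, words a+1 .. a+length hold the elements.
    static int load(int[] heap, int a, int k) {
        int length = heap[a];                       // MEM(a)
        if (k < 0 || k >= length) {
            throw new ArrayIndexOutOfBoundsException("index " + k + ", length " + length);
        }
        return heap[a + 1 + k];                     // MEM(a + (k+1) * int_size)
    }

    public static void main(String[] args) {
        int[] heap = { 3, 10, 20, 30 };             // one array of length 3 at address 0
        System.out.println(load(heap, 0, 2));       // last valid element
    }
}
```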
Arithmetic
• AST operator ⇒ IR Tree operator
  • Plus(e1, e2) ⇒ BINOP(+, |[ e1 ]|, |[ e2 ]|)
  • Times(e1, e2) ⇒ ?
• There is no unary arithmetic operator in the IR.
  • e.g., -4 ==> 0 - 4
    ==> BINOP(MINUS, CONST(0), CONST(4))
  • ~n ==> n XOR 111…11 (all ones)
    ==> BINOP(XOR, CONST(-1), |[ n ]|)
  • where |[ n ]| means the translation of n.
Boolean operations on logical expressions
• In MiniJava
  • true is stored as 1
  • false is stored as 0
• AST to IR translations:
  • true ⇒ new Ex(CONST(1))
    • .unEx() = CONST(1); .unCx(tt, ff) = JUMP tt
  • false ⇒ new Ex(CONST(0))
  • op(b1, b2) ⇒ BINOP(op, |[ b1 ]|, |[ b2 ]|)
    • op is AND, OR, XOR, etc.
  • not(b1) ⇒ BINOP(XOR, CONST(111…11), |[ b1 ]|)
• Bit operations: <<, >>, >>>, &, |, ~
  • analogous to boolean operations
Conditionals
• What is the result of a comparison like "e1 < e2"?
  1. It may be stored in a boolean temporary. // rlt = (e1 < e2);
  2. It may be used to change control flow (it will be a Cx expression).
     • if (e1 < e2) then s1 else s2;
     • a statement that will JUMP to either a true or a false destination
• AST to IR:
  • A simple Cx expression is built from an AST comparison operator using the CJUMP operator.
  • x < 5 ⇒ RelCx(<, |[ x ]|, CONST(5)) = cx with
    • cx.unCx(tt, ff) returns CJUMP(LT, |[ x ]|, CONST(5), tt, ff)
    • cx.unEx() returns [ r ← 1, unCx(t, f), f: r ← 0, t: : r ]
Relational Operator: RelCx()
    package translate;
    import temp.Label;

    public class RelCx extends Cx {
        int op;
        tree.Exp left, right;
        RelCx(int o, tree.Exp l, tree.Exp r) {
            op = o; left = l; right = r;
        }
        tree.Stm unCx(Label t, Label f) {
            return new tree.CJUMP(op, left, right, t, f);
        }
    }
• For relational operators we have seen the RelCx class, which holds a relop, a left Exp and a right Exp.
• & (and), | (or) and ! (not) are translated into If-Then-Else ASTs:
  • (A & B) ⇒ A ? B : false
  • (A | B) ⇒ A ? true : B
  • not A ⇒ A ? false : true
• if e1 then e2 else e3, where
  • e1 is a Cx, and e2 and e3 are Ex.
IfThenElse(Exp e1, Exp e2, Exp e3).unEx()
• Make two labels t and f to which the conditional will branch.
  • tr(e1).unCx(t, f)
• Allocate a temp r; after label t, move e2 to r, then jump to label join.
  • SEQ(LABEL(t), MOVE(TEMP(r), tr(e2).unEx()), JUMP(join))
• After label f, move e3 to r, then jump to label join.
  • SEQ(LABEL(f), MOVE(TEMP(r), tr(e3).unEx()), JUMP(join))
• join: TEMP(r)
• If e2 and e3 are statements, replace unEx() by unNx() (and omit the moves into r).
IfThenElse(Exp e1, Exp e2, Exp e3).unCx(tt, ff)
• e1, e2 and e3 are (boolean) Exps.
• Make two labels t and f to which the conditional will branch.
  • tr(e1).unCx(t, f)
• Add a new Label t + …
  • [ t: , tr(e2).unCx(tt, ff), f: , tr(e3).unCx(tt, ff) ]
• As a result: [ tr(e1).unCx(t, f), t: , tr(e2).unCx(tt, ff), f: , tr(e3).unCx(tt, ff) ]
Ex: IfThenElse(e1, e2, e3).unCx(tt, ff)
• if (x < 5) then a > b else false
• SEQ(CJUMP(LT, x, CONST(5), t, f),
  SEQ(LABEL(t),
  SEQ(CJUMP(GT, a, b, tt, ff),
  SEQ(LABEL(f),
      JUMP(ff)))))
• or
  [ CJUMP(LT, x, 5, t, f),
    t: CJUMP(GT, a, b, tt, ff),
    f: JUMP ff
  ]
• Note: tt and ff come from the unCx(tt, ff) method call.
If Then Else Exp
    class IfThenElseExp extends Exp {
        Exp cond, a, b;
        Label t = new Label();
        Label f = new Label();
        Label join = new Label();
        IfThenElseExp(Exp cc, Exp aa, Exp bb) { cond = cc; a = aa; b = bb; }

        // both branches are conditions themselves and jump on to tt or ff
        tree.Stm unCx(Label tt, Label ff) {
            return new tree.SEQ(cond.unCx(t, f),
                   new tree.SEQ(new tree.LABEL(t),
                   new tree.SEQ(a.unCx(tt, ff),
                   new tree.SEQ(new tree.LABEL(f),
                                b.unCx(tt, ff)))));
        }
        // [ cond ? r ← a.unEx() : r ← b.unEx() ; return r ]
        tree.Exp unEx() {
            Temp r = new Temp();
            return new ESEQ(
                new SEQ(cond.unCx(t, f),
                new SEQ(new LABEL(t),
                new SEQ(new MOVE(new TEMP(r), a.unEx()),
                new SEQ(new JUMP(join),
                new SEQ(new LABEL(f),
                new SEQ(new MOVE(new TEMP(r), b.unEx()),
                        new LABEL(join))))))),
                new TEMP(r));
        }
        // cond ? a.unNx() : b.unNx()
        tree.Stm unNx() {
            return new SEQ(cond.unCx(t, f),
                   new SEQ(new LABEL(t),
                   new SEQ(a.unNx(),
                   new SEQ(new JUMP(join),
                   new SEQ(new LABEL(f),
                   new SEQ(b.unNx(),
                           new LABEL(join)))))));
        }
    }
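One way to convince yourself that the jump sequence for if (x < 5) then a > b else false is right is to execute it. The sketch below is a toy interpreter under stated assumptions: instructions are string arrays, only LT and GT are modelled, and the names CondFlowDemo, PROGRAM, run and env are hypothetical. It runs the flattened form [ CJUMP(LT, x, 5, t, f), t: CJUMP(GT, a, b, tt, ff), f: JUMP ff ] and reports whether control leaves through tt or ff.

```java
import java.util.HashMap;
import java.util.Map;

class CondFlowDemo {
    // [ CJUMP(LT, x, 5, t, f), t: CJUMP(GT, a, b, tt, ff), f: JUMP ff ]
    static final String[][] PROGRAM = {
        {"CJUMP", "LT", "x", "5", "t", "f"},
        {"LABEL", "t"},
        {"CJUMP", "GT", "a", "b", "tt", "ff"},
        {"LABEL", "f"},
        {"JUMP", "ff"},
    };

    // An operand is either a variable bound in env or an integer literal.
    static int value(String operand, Map<String, Integer> env) {
        Integer v = env.get(operand);
        return v != null ? v : Integer.parseInt(operand);
    }

    // Runs the fragment and returns the label through which control leaves it,
    // i.e. the first jump target that has no LABEL inside the fragment.
    static String run(String[][] prog, Map<String, Integer> env) {
        Map<String, Integer> labels = new HashMap<String, Integer>();
        for (int i = 0; i < prog.length; i++) {
            if (prog[i][0].equals("LABEL")) labels.put(prog[i][1], i);
        }
        int pc = 0;
        while (pc < prog.length) {
            String[] ins = prog[pc];
            String target = null;
            if (ins[0].equals("JUMP")) {
                target = ins[1];
            } else if (ins[0].equals("CJUMP")) {
                int l = value(ins[2], env), r = value(ins[3], env);
                boolean holds = ins[1].equals("LT") ? l < r : l > r; // only LT and GT are modelled
                target = holds ? ins[4] : ins[5];
            }
            if (target == null) { pc++; continue; }          // LABEL: fall through
            if (!labels.containsKey(target)) return target;  // control leaves the fragment
            pc = labels.get(target);
        }
        return "fell off the end";
    }

    static Map<String, Integer> env(int x, int a, int b) {
        Map<String, Integer> m = new HashMap<String, Integer>();
        m.put("x", x);
        m.put("a", a);
        m.put("b", b);
        return m;
    }

    public static void main(String[] args) {
        System.out.println(run(PROGRAM, env(3, 9, 2)));
    }
}
```

The fragment reaches tt only when both x < 5 and a > b hold, which is the short-circuit behaviour the IfThenElse encoding is meant to give.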
String
• For a string literal lit, the translator needs to
  • make a new Label lab, and
  • return the tree tree.NAME(lab).
  • It should also put the assembly fragment frame.string(lab, lit) onto a global list of data fragments to be handled by the code emitter.
• All string operations are performed by runtime-system functions,
  • which allocate space and return a pointer.
  • The compiler doesn't need to know the string representation.
Record and Array construction
• new a { f1 = e1, f2 = e2, …, fn = en }
  • creates and initializes an n-element record
• Records may outlive the procedure activation
  • => allocate space in the heap area
  • => no freeing action; that is delegated to the GC
• Create an n-word memory block by calling an external function, and get a pointer TEMP(r) to the heap area. Then move the n values e1, …, en to offsets 0, 1w, 2w, …, (n-1)w relative to r.
• ESEQ(
    TEMP(r) ← CALL(NAME(malloc), CONST(n*w));
    M[TEMP(r) + 0w] ← e1;
    …
    M[TEMP(r) + (n-1)w] ← en;
    TEMP(r) )
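The statement part of that ESEQ can be generated mechanically from the list of field initializers. The following sketch is illustrative only: IR statements are strings, the field initializers are assumed to be already-translated expression strings, and the names RecordInitDemo and build are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class RecordInitDemo {
    // Returns the statements that allocate a record and fill its fields.
    // r is the name of the temporary holding the record pointer,
    // w is the word size in bytes.
    static List<String> build(String r, String[] fieldInits, int w) {
        List<String> out = new ArrayList<String>();
        int n = fieldInits.length;
        out.add("MOVE(TEMP " + r + ", CALL(NAME malloc, CONST " + (n * w) + "))");
        for (int i = 0; i < n; i++) {
            // field i lives at offset i * w from the start of the record
            out.add("MOVE(MEM(+(TEMP " + r + ", CONST " + (i * w) + ")), " + fieldInits[i] + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : build("r", new String[] { "CONST 7", "TEMP t1" }, 4)) {
            System.out.println(s);
        }
    }
}
```

The value of the whole ESEQ is then TEMP r, the pointer to the freshly initialized record.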
Array Allocation: a = new int[e]
• Determine how much space is needed for the array:
  • (length of array + 1) × (size of integer)
  • keep the length of the array with the array itself
  • for bound checking (a[n]?) and for array length access (a.length)
• Call an external function to allocate the space on the heap.
• Generate code for saving the length of the array at offset 0.
• Generate code for initializing each of the values in the array to zero, starting at the 1st element.
  ESEQ(
    TEMP(r) ← CALL(NAME(Label("malloc")), e*w + w);
    M[TEMP(r) + 0w] ← e;
    { for (i = 1 to e) M[TEMP(r) + i*w] ← CONST(0); }
    TEMP(r) )
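The effect of the generated allocation code can be simulated directly. This sketch assumes a word-addressed heap modelled as an int[] and a trivial bump allocator standing in for malloc; the names ArrayAllocDemo, heap, next, malloc and newIntArray are hypothetical, and running out of heap space is not handled.

```java
class ArrayAllocDemo {
    static int[] heap = new int[64];   // word-addressed heap
    static int next = 0;               // next free word

    // Stand-in for the external allocation function: returns the start of a fresh block.
    static int malloc(int words) {
        int p = next;
        next += words;
        return p;
    }

    // Mirrors the ESEQ above: allocate e + 1 words, store the length, zero the elements.
    static int newIntArray(int e) {
        int r = malloc(e + 1);
        heap[r] = e;                            // length at offset 0
        for (int i = 1; i <= e; i++) {
            heap[r + i] = 0;                    // elements start at offset 1
        }
        return r;                               // the value of the ESEQ
    }

    public static void main(String[] args) {
        int a = newIntArray(3);
        System.out.println("array at " + a + ", length " + heap[a]);
    }
}
```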
Calling runtime-system function
• e.g., call the external function named "initArray" for array initialization with arguments a, b.
• Generate a general call such as
    static Label initArray = new Label("initArray");
    new CALL(new NAME(initArray), Arrays.asList(a, b));
• or encapsulate the target-machine-specific details in a function provided by the frame:
    public abstract class Frame {
        …
        abstract public tree.Exp externalCall(String func, List<tree.Exp> args);
    }
    aFrame.externalCall("initArray", args)
Implementing externalCall
• It depends on the relationship between MiniJava's procedure call convention and that of the external function.
• The simplest version looks like
    tree.Exp externalCall(String s, List<tree.Exp> args) {
        return new tree.CALL(new tree.NAME(new temp.Label(s)), args);
    }
• It may need to adjust static links, label underscores, etc.
  • => new tree.CALL(new tree.NAME(new temp.Label("__" + s)), [slink | args]);
while(cond, body).unNx()
• while (cond) do body
• AST2IR translation:
        JUMP(test)
    top:
        <body>                                   // tr(<body>).unNx()
    test:
        if <cond> then top else done             // tr(<cond>).unCx(top, done)
    done:
• If break appears within the body, its translation is JUMP(done).
• If continue appears within the body, its translation is JUMP(test).
The translation
    // outerDone is the break target of the enclosing loop
    Exp transStm(WhileStm w, Label outerDone) {
        Label done = new Label();                        // break target for this loop's body
        Exp condition = transExp(w.cond, outerDone);
        Exp body = transStm(w.body, done);
        return whileExp(condition, body, done);
    }
    Exp transStm(BreakStm b, Label done) {
        return new Nx(new JUMP(done));
    }
While Loop
    public Exp whileExp(Exp cond, Exp body, Label done) {
        Label test = new Label();
        Label top = new Label();
        return new Nx(
            new tree.SEQ(new tree.JUMP(test),
            new tree.SEQ(new tree.LABEL(top),
            new tree.SEQ(body.unNx(),
            new tree.SEQ(new tree.LABEL(test),
            new tree.SEQ(cond.unCx(top, done),
                         new tree.LABEL(done)))))));
    }
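The layout produced by whileExp is easier to see as a flat instruction list. The sketch below is a simplified model with hypothetical names (WhileLayoutDemo, whileLoop): the condition and the body are given as pre-rendered strings, and labels are passed in rather than generated, so that a break inside the body can be written as a jump to the same done label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WhileLayoutDemo {
    // Lays out: JUMP test; top: body; test: if cond goto top else done; done:
    static List<String> whileLoop(String cond, List<String> body,
                                  String top, String test, String done) {
        List<String> out = new ArrayList<String>();
        out.add("JUMP " + test);                          // enter at the test
        out.add("LABEL " + top);
        out.addAll(body);                                 // body.unNx()
        out.add("LABEL " + test);
        out.add("CJUMP " + cond + " " + top + " " + done); // cond.unCx(top, done)
        out.add("LABEL " + done);
        return out;
    }

    public static void main(String[] args) {
        // A body that increments i and contains a break, translated as JUMP done.
        List<String> body = Arrays.asList("MOVE i (i + 1)", "JUMP done");
        for (String s : whileLoop("LT i n", body, "top", "test", "done")) {
            System.out.println(s);
        }
    }
}
```

Placing the test after the body means each iteration executes one conditional jump rather than a conditional jump plus an unconditional jump back to the top.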
Function Call
• Function call f(a1, …, an) ⇒ CALL(NAME lf, [e1, e2, …, en])
  • where lf is the label for f
• In general, add the static link as an extra argument:
  • CALL(NAME lf, [sl, e1, e2, …, en])
• In an OOP language, we need this (the current object pointer) as an argument.
  • For a private method p.m(a1, …, an): CALL(NAME lc$m, [p, e1, e2, …, en])
  • For dynamic methods, we need dispatch tables…
Static Links
• When a variable x is declared at an outer level of static scope, the static link must be used:
    MEM(+(CONST kn, MEM(+(CONST kn-1, … MEM(+(CONST k1, TEMP FP)) …))))
  where
  • k1, …, kn-1 are the static link offsets
  • kn is the offset of variable x in its own frame
• The exp() method in frame.Access needs to calculate the chain of static links to dereference,
  • i.e., simply call env.get("x").getAccess().exp().
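The nested MEM expression can be built by a simple loop that starts at the frame pointer and wraps one dereference per static link. The sketch below uses strings for tree expressions; the names StaticLinkDemo and access are hypothetical, and the offsets used in the demo are arbitrary.

```java
class StaticLinkDemo {
    // linkOffsets holds k1 .. k(n-1), innermost frame first;
    // varOffset is kn, the offset of the variable in the frame that declares it.
    static String access(int[] linkOffsets, int varOffset) {
        String addr = "TEMP FP";
        for (int k : linkOffsets) {
            // follow one static link: load the enclosing frame's address
            addr = "MEM(+(CONST " + k + ", " + addr + "))";
        }
        return "MEM(+(CONST " + varOffset + ", " + addr + "))";
    }

    public static void main(String[] args) {
        // variable declared two levels out
        System.out.println(access(new int[] { 8, 8 }, 12));
        // variable in the current frame: no static links to follow
        System.out.println(access(new int[0], 12));
    }
}
```

With no static links to follow, the result degenerates to the plain frame access MEM(+(CONST k, TEMP FP)) used for local variables.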
Function call with Static Links
• function call f(a1, a2, …, an)
  • add the static link as an extra parameter
  • CALL(new NAME(labf), [sl, a1, a2, …, an])
Declarations
• For each variable declaration within a function body (including parameters)
  • allocate additional space in the current frame (of the current level)
• For each function declaration
  • keep a new "fragment" of Tree code for the function's body
Variable Definition
• The translator should return an augmented type environment:
  • update the symbol table
• Initializing variables:
  • translated into a Tree exp that must be put just before the body of the function
  • => return a translate.Exp containing assignment expressions
• The translator returns a "no-op" (e.g., Ex(CONST(0))) when applied to function and type declarations.
Function Definition
• function => prologue, body, epilogue
• prologue
  • pseudo-instruction to announce the beginning of the function (assembly dependent)
  • label definition for the function name
  • instructions to adjust the stack pointer (allocate a new frame)
  • instructions to save escaping arguments, including the static link, and to move non-escaping arguments into fresh temporary registers
  • save return address, callee-saved registers, …