An exploration of code generation techniques and register tracking algorithms for optimizing machine code. Includes examples and explanations.
CODE GENERATION
Chuen-Liang Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, TAIWAN
Introduction (1/2)
• simplest -- macro expansion
  • expand each intermediate tuple into an equivalent sequence of target machine instructions
• example
    tuple            generated code
    (*,B,C,T1)       ( Load B, R1 )   ( * C, R1 )
    (*,D,E,T2)       ( Load D, R2 )   ( * E, R2 )
    (+,T1,T2,T3)     ( + R2, R1 )
    (:=,T3,A)        ( Store A, R1 )
    (-,D,B,T4)       ( Load D, R1 )   ( - B, R1 )
    (+,C,T4,T5)      ( Load C, R2 )   ( + R1, R2 )
    (:=,T5,D)        ( Store D, R2 )
    (+,E,A,T6)       ( Load E, R1 )   ( + A, R1 )
    (+,T6,C,T7)      ( + C, R1 )
    (:=,T7,F)        ( Store F, R1 )
    (+,D,E,T8)       ( Load D, R1 )   ( + E, R1 )
    (:=,T8,A)        ( Store A, R1 )
• assuming:
  • cost = 2 for (LOAD,S,R), (STORE,S,R), (OP,S,R)
  • cost = 1 for (OP,R,R)
• total cost = 34
• not optimized, e.g., swapping C and T4 in (+,C,T4,T5) would avoid the separate load of C
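The cost figure on this slide can be checked mechanically. The following is a minimal sketch in C, not part of the original slides: it encodes the eighteen generated instructions and applies the stated cost model. The names `struct instr`, `is_register`, `instr_cost`, `total_cost`, and `example_code` are illustrative assumptions, as is the restriction to the two registers R1 and R2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One target instruction in the slide's notation: ( op src, reg ). */
struct instr {
    const char *op;   /* "Load", "Store", "+", "-" or "*" */
    const char *src;  /* a variable in storage, or a register name */
    const char *reg;  /* the register operand */
};

/* Assumption: the machine has only the two registers used on the slide. */
static int is_register(const char *operand)
{
    return strcmp(operand, "R1") == 0 || strcmp(operand, "R2") == 0;
}

/* Cost model from the slide: 1 for (OP,R,R), 2 when an operand is in storage. */
static int instr_cost(const struct instr *i)
{
    return is_register(i->src) ? 1 : 2;
}

/* Sum the cost of a whole instruction sequence. */
static int total_cost(const struct instr *code, size_t n)
{
    int sum = 0;
    assert(code != NULL || n == 0);
    for (size_t k = 0; k < n; k++)
        sum += instr_cost(&code[k]);
    return sum;
}

/* The macro-expanded code for the twelve example tuples, in order. */
static const struct instr example_code[] = {
    {"Load", "B", "R1"}, {"*", "C", "R1"},    /* (*,B,C,T1)   */
    {"Load", "D", "R2"}, {"*", "E", "R2"},    /* (*,D,E,T2)   */
    {"+", "R2", "R1"},                        /* (+,T1,T2,T3) */
    {"Store", "A", "R1"},                     /* (:=,T3,A)    */
    {"Load", "D", "R1"}, {"-", "B", "R1"},    /* (-,D,B,T4)   */
    {"Load", "C", "R2"}, {"+", "R1", "R2"},   /* (+,C,T4,T5)  */
    {"Store", "D", "R2"},                     /* (:=,T5,D)    */
    {"Load", "E", "R1"}, {"+", "A", "R1"},    /* (+,E,A,T6)   */
    {"+", "C", "R1"},                         /* (+,T6,C,T7)  */
    {"Store", "F", "R1"},                     /* (:=,T7,F)    */
    {"Load", "D", "R1"}, {"+", "E", "R1"},    /* (+,D,E,T8)   */
    {"Store", "A", "R1"},                     /* (:=,T8,A)    */
};
static const size_t example_len = sizeof example_code / sizeof example_code[0];
```

Calling `total_cost(example_code, example_len)` sums sixteen storage-operand instructions at cost 2 and two register-to-register additions at cost 1, which gives the slide's total of 34.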
Introduction (2/2)
• optimized code
  • difficult to achieve (even define)
• alternatives
  • left to code optimizer (global optimization)
  • heuristics (local optimization)
    • within a basic block
• problems (at least)
  • instruction selection
  • addressing mode selection
  • register allocation
Live or dead
• function( space, time )
  • space -- variable, register
  • time -- execution sequence
• Live -- whose value will be used later
• Dead -- whose value is useless later
• determination -- by a backward pass
• usage -- e.g., to free a dead register is cheaper than to free a live register
• example of Live/Dead
    ( *, B, C, T1 )
    ( *, D, E, T2 )
    ( +, T1, T2, T3 )
    ( :=, T3, A )
    ( -, D, B, T4 )
    ( +, C, T4, T5 )
    ( :=, T5, D )
    ( +, E, A, T6 )
    ( +, T6, C, T7 )
    ( :=, T7, F )
    ( +, D, E, T8 )
    ( :=, T8, A )
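The backward pass mentioned on this slide can be sketched in C as follows. This is an illustration under stated assumptions, not the lecturer's code: the names `enum var`, `struct tuple`, `liveness`, and `block` are hypothetical, and the caller decides what is live at the end of the basic block (a common choice is program variables A to F live, temporaries dead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Variable ids for the example: program variables first, then temporaries. */
enum var { A, B, C, D, E, F, T1, T2, T3, T4, T5, T6, T7, T8, NVARS, NONE = -1 };

/* (op, a, b, r); an assignment (:=,X,Y) is stored as a = X, b = NONE, r = Y. */
struct tuple { char op; int a, b, r; };

/* For every tuple, record whether each operand is still live AFTER the tuple. */
static void liveness(const struct tuple *t, size_t n, const bool live_out[NVARS],
                     bool a_live_after[], bool b_live_after[])
{
    bool live[NVARS];
    for (int v = 0; v < NVARS; v++)
        live[v] = live_out[v];

    for (size_t k = n; k-- > 0; ) {            /* scan from last tuple to first */
        assert(t[k].r != NONE);
        a_live_after[k] = t[k].a != NONE && live[t[k].a];
        b_live_after[k] = t[k].b != NONE && live[t[k].b];
        live[t[k].r] = false;                  /* a definition kills the old value */
        if (t[k].a != NONE) live[t[k].a] = true;   /* a use makes the operand live */
        if (t[k].b != NONE) live[t[k].b] = true;
    }
}

/* The twelve tuples of the Live/Dead example. */
static const struct tuple block[] = {
    {'*', B, C, T1}, {'*', D, E, T2}, {'+', T1, T2, T3}, {'=', T3, NONE, A},
    {'-', D, B, T4}, {'+', C, T4, T5}, {'=', T5, NONE, D},
    {'+', E, A, T6}, {'+', T6, C, T7}, {'=', T7, NONE, F},
    {'+', D, E, T8}, {'=', T8, NONE, A},
};
```

With A to F live on exit, the pass marks T1 and T2 dead after (+,T1,T2,T3), and D dead after (-,D,B,T4), because D is redefined by (:=,T5,D) before it is read again.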
Example code generation (1/5)
• Generate code for integer add: (+,A,B,C)
Possible operand modes for A and B are:
  (1) Literal (stored in value field)
  (2) Indexed (stored in adr field as (Reg,Displacement) pair, indirect=F)
  (3) Indirect (stored in adr field as (Reg,Displacement) pair, indirect=T)
  (4) Live register (stored in Reg field)
  (5) Dead register (stored in Reg field)
Possible operand modes for C are:
  (1) Indexed (stored in adr field as (Reg,Displacement) pair, indirect=F)
  (2) Indirect (stored in adr field as (Reg,Displacement) pair, indirect=T)
  (3) Live register (stored in Reg field)
  (4) Unassigned register (stored in Reg field, when assigned)
(a) Swap operands (knowing addition is commutative)
    if (B.mode == DEAD_REGISTER || A.mode == LITERAL)
        Swap A and B;
        /* This may save a load or store since addition overwrites the first operand. */
Example code generation (2/5)
(b) "Target" the result of the addition directly into C (if possible).
    switch (C.mode) {
    case LIVE_REGISTER:
        Target = C.reg;
        break;
    case UNASSIGNED_REGISTER:
        if (A.mode == DEAD_REGISTER)
            C.reg = A.reg;  /* Compute into A's reg, then assign it to C. */
        else
            Assign a register to C.reg;
        C.mode = LIVE_REGISTER;
        Target = C.reg;
        break;
    case INDIRECT:
    case INDEXED:
        if (A.mode == DEAD_REGISTER)
            Target = A.reg;
        else
            Target = v2;    /* vi is the i-th volatile register. */
        break;
    }
Example code generation (3/5)
(c) Map operand B to right operand of add instruction (the "Source")
    if (B.mode == INDIRECT) {
        /* Use indexing to simulate indirection. */
        generate(LOAD,B.adr,v1,"");  /* v1 is a volatile register. */
        B.mode = INDEXED;
        B.adr = (address) { .reg = v1, .displacement = 0 };
    }
    Source = B;
Example code generation (4/5)
(d) Now generate the add instruction
    switch (A.mode) {
    /* "Fold" the addition: after the swap in (a), A is a literal only if B is too. */
    case LITERAL:
        generate(LOAD,#(A.val+B.val),Target,"");
        break;
    /* Load operand A (if necessary). */
    case INDEXED:
        generate(LOAD,A.adr,Target,"");
        break;
    case LIVE_REGISTER:
        generate(LOAD,A.reg,Target,"");
        break;
    case INDIRECT:
        generate(LOAD,A.adr,v2,"");
        t.reg = v2; t.displacement = 0;
        generate(LOAD,t,Target,"");
        break;
    case DEAD_REGISTER:
        if (Target != A.reg)
            generate(LOAD,A.reg,Target,"");
        break;
    }
    if (A.mode != LITERAL)  /* A folded addition already holds the sum. */
        generate(ADD,Source,Target,"");
Example code generation (5/5)
(e) Store result into C (if necessary)
    if (C.mode == INDEXED)
        generate(STORE,C.adr,Target,"");
    else if (C.mode == INDIRECT) {
        generate(LOAD,C.adr,v3,"");
        t.reg = v3; t.displacement = 0;
        generate(STORE,t,Target,"");
    }
Register tracking (1/8)
• associating more than one variable with a register when they have the same value
• LOAD -- when valuable
• STORE -- postponed as late as possible
• status of value on register
  • Live or Dead
  • Store or NotStore
• extra cost to free a register = Σ cost_V over all associated variables V
  • 0 -- ( D, NS ) or ( D, S )
  • 2 -- ( L, NS )
  • 4 -- ( L, S )
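The cost rule above can be written down directly. The following is a minimal sketch in C; the names `struct assoc`, `assoc_cost`, and `free_cost` are illustrative and do not come from the slides. One plausible reading of the numbers, given the earlier cost of 2 per load or store, is that a live value costs a later reload (2) and a live value that still has to be stored costs a store plus a reload (4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One variable or temporary associated with a register. */
struct assoc {
    const char *name;
    bool live;    /* Live (true) or Dead (false) */
    bool store;   /* Store (true) or NotStore (false) */
};

/* Per-variable cost from the slide: (D,*) = 0, (L,NS) = 2, (L,S) = 4. */
static int assoc_cost(const struct assoc *v)
{
    if (!v->live)
        return 0;
    return v->store ? 4 : 2;
}

/* Extra cost to free a register: the sum over its association list. */
static int free_cost(const struct assoc *list, size_t n)
{
    int sum = 0;
    assert(list != NULL || n == 0);
    for (size_t k = 0; k < n; k++)
        sum += assoc_cost(&list[k]);
    return sum;
}
```

For example, a register holding A with status (L,S) and T1 with status (D,NS) costs 4 to free, while a register whose association list is empty costs 0.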
Register tracking (2/8)
• procedure Assignment (:=,X,Y):
    if (X is not already in a register)
        Call get_reg() and generate a load of X
    if (Y, after this tuple, has a status of (D,S))
        generate(STORE,Y,Reg,"")
    else
        Append Y to Reg's association list with a status of (L,S)
        /* The generation of the STORE instruction has been postponed */
Register tracking (3/8)
    machine_reg get_reg(void)
    {
        /*
         * Any register already allocated to the current tuple is NOT AVAILABLE
         * for allocation during this call.
         */
        if (there exists some register R with cost(R) == 0)
            Choose R
        else {
            C = 2;
            while (TRUE) {
                if (there exists at least one register with cost == C) {
                    Choose that register, R, with cost C that has the most distant
                    next reference to an associated variable or temporary
                    break;
                }
                C += 2;
            }
            for each associated variable or temporary V with status == (L,S) or (D,S)
                generate(STORE,V,R,"");
        }
        return R;
    }
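The selection step of get_reg() can be sketched as runnable C. This is an illustration only: `struct reg_info` and `choose_reg` are hypothetical names, the costs and next-reference distances are assumed to be precomputed by the caller, and the STORE generation for the chosen register is left out. Taking the minimum cost directly is equivalent to the slide's loop over C = 2, 4, ... because it also finds the cheapest available register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct reg_info {
    int  cost;      /* extra cost to free the register (0, 2, 4, ...) */
    int  next_ref;  /* distance to the next reference of an associated value */
    bool busy;      /* already allocated to the current tuple */
};

/* Returns the index of the chosen register, or -1 if every register is busy. */
static int choose_reg(const struct reg_info *r, int n)
{
    int best = -1;
    assert(r != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        if (r[i].busy)
            continue;                       /* not available during this call */
        if (best < 0 || r[i].cost < r[best].cost ||
            (r[i].cost == r[best].cost && r[i].next_ref > r[best].next_ref))
            best = i;                       /* cheaper, or same cost but used later */
    }
    return best;
}
```

Given registers with (cost, next reference) of (4, 10), (2, 3), (2, 7) and a busy register of cost 0, the sketch picks the third register: it ties for the lowest available cost and its next reference is the most distant.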
Register tracking (4/8)
(OP,U,V,W) where OP is noncommutative:
    if (U is not in some register, R1) {
        R1 = get_reg()
        generate(LOAD,U,R1,"");
    } else  /* R1's current value will be destroyed */
        for each associated variable or temporary X with status == (L,S) or (D,S)
            generate(STORE,X,R1,"");
    if (V is in a register, R2)  /* including the possibility that U == V */
        generate(OP,R2,R1,"")
    else if (get_reg_cost() > 0 || V is dead after this tuple)
        generate(OP,V,R1,"")
    else {
        /* Invest 1 unit of cost so that V is in a register for later use */
        R2 = get_reg()
        generate(LOAD,V,R2,"")
        generate(OP,R2,R1,"")
    }
    Update R1's association list to include W only.
Register tracking (5/8)
(OP,U,V,W) where OP is commutative:
    if (cost((OP,U,V,W)) <= cost((OP,V,U,W)))
        generate(OP,U,V,W);  /* using noncommutative code generator */
    else
        generate(OP,V,U,W);  /* using noncommutative code generator */
with
    cost((OP,U,V,W)) =
        (U is in a register ? 0 : get_reg_cost() + 2)  /* Cost to load U into R1 */
      + cost(R1)                                       /* Cost of losing U */
      + (V is in a register || U == V ? 1 : 2)         /* Cost of reg-to-reg vs. storage-to-reg */
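The operand-order decision for commutative operators can be illustrated with a small C sketch. The names `struct operand_state`, `tuple_cost`, and `should_swap` are hypothetical, and the sketch assumes the caller already knows, for each operand, the value of cost(R1) that would apply if that operand were the left one; the slides do not spell out how that value is obtained for an operand that is not yet in a register.

```c
#include <assert.h>
#include <stdbool.h>

struct operand_state {
    bool in_reg;     /* operand already sits in a register */
    int  lose_cost;  /* cost(R1): cost of overwriting the register holding it */
};

/* cost((OP,U,V,W)) as given on the slide. */
static int tuple_cost(struct operand_state u, struct operand_state v,
                      bool same_operand, int get_reg_cost)
{
    assert(get_reg_cost >= 0);
    int load_u = u.in_reg ? 0 : get_reg_cost + 2;     /* load U into R1 */
    int use_v  = (v.in_reg || same_operand) ? 1 : 2;  /* reg-to-reg vs. storage-to-reg */
    return load_u + u.lose_cost + use_v;
}

/* True when generating (OP,V,U,W) is strictly cheaper than (OP,U,V,W). */
static bool should_swap(struct operand_state u, struct operand_state v,
                        bool same_operand, int get_reg_cost)
{
    return tuple_cost(u, v, same_operand, get_reg_cost) >
           tuple_cost(v, u, same_operand, get_reg_cost);
}
```

As a worked case, let U be in a register with status (L,NS), so losing it costs 2, let V be in storage and dead, and let a free register be available (get_reg_cost is 0). Then cost((OP,U,V,W)) = 0 + 2 + 2 = 4 and cost((OP,V,U,W)) = 2 + 0 + 1 = 3, so the swapped order is chosen.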
Register tracking (6/8) • example
Pascal’s P-code (1/4)
• (in, input, output), M: machine code, A: Assembly, Pc: P-code, Pa: Pascal
• P-code interpreter (A, Pc, *)
• Fibonacci generator (A, , *)
• output: 1, 1, 2, 3, 5, 8, ...
• Assembler (M, A, M)
• P-code interpreter (M, Pc, *)
• Fibonacci generator (M, , *)
• hardware
Pascal’s P-code (2/4)
• (in, input, output), M: machine code, A: Assembly, Pc: P-code, Pa: Pascal
• Pascal compiler (Pa, Pa, Pc)
• Fibonacci generator (Pa, , *)
• output: 1, 1, 2, 3, 5, 8, ...
• P-code interpreter (A, Pc, *)
• Pascal compiler (Pc, Pa, Pc)
• Fibonacci generator (Pc, , *)
• Assembler (M, A, M)
• P-code interpreter (M, Pc, *)
• hardware
Pascal’s P-code (3/4)
• (in, input, output), M: machine code, A: Assembly, Pc: P-code, Pa: Pascal
• Pascal compiler (Pa, Pa, Pc)
• Pascal compiler’ (Pa, Pa, M)
• Fibonacci generator (Pa, , *)
• P-code interpreter (A, Pc, *)
• Pascal compiler (Pc, Pa, Pc)
• Pascal compiler’ (Pc, Pa, M)
• output: 1, 1, 2, 3, 5, 8, ...
• Assembler (M, A, M)
• P-code interpreter (M, Pc, *)
• Fibonacci generator (M, , *)
• hardware
Pascal’s P-code (4/4)
• (in, input, output), M: machine code, A: Assembly, Pc: P-code, Pa: Pascal
• Pascal compiler (Pa, Pa, Pc)
• Pascal compiler’ (Pa, Pa, M)
• Fibonacci generator (Pa, , *)
• P-code interpreter (A, Pc, *)
• Pascal compiler (Pc, Pa, Pc)
• Pascal compiler’ (Pc, Pa, M)
• output: 1, 1, 2, 3, 5, 8, ...
• Assembler (M, A, M)
• P-code interpreter (M, Pc, *)
• Pascal compiler’ (M, Pa, M)
• Fibonacci generator (M, , *)
• hardware
Java’s byteCode (1/2)
• (in, input, output), M: machine code, C: C or C++, B: byteCode, J: Java
• javac.exe = java.exe + sun.tools.javac.Main()
• API [m. indep] [m. dep] (C) (J)
• byteCode interpreter (java.exe + *.dll) (M, B, *)
• Java VM, byteCode interpreter [m. indep] [m. dep] (C, B, *)
• Java compiler (J, J, B)
• Fibonacci generator (J, , *)
• output: 1, 1, 2, 3, 5, 8, ...
• Java compiler (B, J, B)
• Fibonacci generator (B, , *)
• API (B)
• cc (M, C, M)
• hardware
Java’s byteCode (2/2)
• (in, input, output), M: machine code, C: C or C++, B: byteCode, J: Java
• javac.exe = java.exe + sun.tools.javac.Main()
• API [m. indep] [m. dep] (C) (J)
• byteCode interpreter (java.exe + *.dll) (M, B, *)
• Java VM, byteCode interpreter [m. indep] [m. dep] (C, B, *)
• Java compiler (J, J, B)
• Fibonacci generator (J, , *)
• Java compiler (B, J, B)
• Fibonacci generator (B, , *)
• output: 1, 1, 2, 3, 5, 8, ...
• API (B)
• JIT
• cc (M, C, M)
• Fibonacci generator (M, , *)
• hardware
QUIZ • QUIZ: Term Project