This chapter explores various levels of Intermediate Representations (IRs) in compilers, covering High-Level IRs, Medium-Level IRs, Low-Level IRs, and Multi-Level IRs. It discusses the design issues, examples from MIPS and PA-RISC compilers, the need for multiple representations, and the significance of internal versus external representations. Additionally, it delves into abstract syntax trees, different HIRs, medium-level IR characteristics, and low-level IR specifics in compiler development.
Intermediate Representations (IRs, Chapter 4) Mooly Sagiv Schreiber 313 03-640-7606 http://www.math.tau.ac.il/~sagiv/courses/acd.html
Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions
Compiler Structure String of characters → Scanner → tokens → Parser → AST → Semantic analyzer → IR → Code generator → Object code. Supporting components: symbol table and access routines, OS interface
Intermediate Language Selection • Low Vs. High level control flow structures • Flat Vs. Hierarchical (tree) • Machine Vs. High level of instructions • (Symbolic) Registers Vs. Stack • Normal forms (SSA) • Intermediate forms: Control Flow Graph, Call Graph, Program Dependence Graph • Issues: Engineering, efficiency, portability, optimization level, taste
IRs in the Book
HIR:
    for v ← v1 by v2 to v3 do
        a[i] := 2
    endfor
MIR:
        v ← v1
        t2 ← v2
        t3 ← v3
    L1: if v > t3 goto L2
        t4 ← addr a
        t5 ← 4*i
        t6 ← t4+t5
        *t6 ← 2
        v ← v + t2
        goto L1
    L2:
LIR:
        s2 ← s1
        s4 ← s3
        s6 ← s5
    L1: if s2 > s6 goto L2
        s7 ← addr a
        s8 ← 4*s9
        s10 ← s7+s8
        [s10] ← 2
        s2 ← s2 + s4
        goto L1
    L2:
Issues in IR Design • Portability • Optimization level • Complexity of the compiler • Reuse of legacy compiler parts • Compilation cost • Multi vs. One IR levels • Compiler maintenance
Example: MIPS Compiler UCODE (stack-based IR) → Translator → Medium-level IR → Optimizer → Medium-level IR → Translator → UCODE (stack-based IR) → Code generator → Load/store-based architecture
Example: PA-RISC (HP-RISC) UCODE (stack-based IR) → Translator → Very low IR (SLLIC) → Optimizer → Very low IR (SLLIC) → Code generator → Load/store-based architecture
Why do we need multiple representations? • Lower representations expose more computations • more effective “standard” optimizations • examples: strength reduction, loop-invariant code motion, ... • Higher representations provide more “non-determinism” • more effective parallelization (reordering) • data cache optimizations
Example: Arrays
C code:
    float a[20][10];
    ...
    ... a[i][j+2] ...
Address computed: addr(a) + 4 * (i*20 + j + 2)
HIR:
    t ← a[i, j+2]
MIR:
    t1 ← j+2
    t2 ← i*20
    t3 ← t1+t2
    t4 ← 4*t3
    t5 ← addr a
    t6 ← t5+t4
    t7 ← *t6
LIR:
    r1 ← [fp-4]
    r2 ← r1+2
    r3 ← [fp-8]
    r4 ← r3*20
    r5 ← r2+r4
    r6 ← 4*r5
    r7 ← fp-216
    f7 ← [r7+r6]
External Representation • Internal IR representation is used in the compiler • External representation is needed for: • Compiler debugging • Cross-module integration • Design issues • Representing pointers • Unique representation of temporaries • Compaction
Abstract Syntax Trees • Compact source representation • No punctuation symbols • Tree defines hierarchy • Used for Front-Ends • Sometimes include symbol table pointers • Can be translated into HIR • Can also be used for compaction
Example AST
C code:
    int f(int a, int b)
    { int c;
      c = a + 2;
      print(c);
    }
AST:
    function
        ident f
        paramlist
            ident a
            paramlist
                ident b
                end
        body
            declist
                ident c
                end
            stmtList
                =
                    ident c
                    +
                        ident a
                        const 2
                stmtList
                    call
                        ident print
                        arglist
                            ident c
                            end
                    end
Other HIRs • Lambda expressions • Normal linear forms: • Preserve control flow structures and arrays • Simplified control flow structures • Eliminate GOTOs • Continuations
Medium Level IR • Source and target language independent • Machine independent representation for program variables and temporaries • Simplified control flow constructs • Portable • Sufficient in many optimizing compilers: MIR, Sun-IR
Low Level IR • One to one correspondence with machine • Deviations from the machine • Alternative code, e.g., MULTIPLY • Addressing modes • Side effects? • Instruction selection in the last phase • Appropriate compiler data structure can hide dependence
Side Effect Operations (PA-RISC)
MIR:
    L1: t2 ← *t1
        t1 ← t1+4
        ...
        t3 ← t3+1
        t5 ← t3 < t4
        if t5 goto L1
PA-RISC (Option 1):
        LDWM 4(0, r2), r3
        ...
        ADDI 1, r4, r4
        COMB,< r4, r5, L1
PA-RISC (Option 2):
        LDWX r2(0, r1), r3
        ...
        ADDIB,< 4, r2, r5, L1
LIR in Tiger
/* assem.h */
typedef enum {I_OPER, I_LABEL, I_MOVE} AS_instr_kind;
struct AS_instr_ {
    AS_instr_kind kind;
    union {
        struct {string assem; Temp_tempList dst, src; AS_targets jumps;} OPER;
        struct {string assem; Temp_label label;} LABEL;
        struct {string assem; Temp_tempList dst, src;} MOVE;
    } u;
};
Multi-Level Intermediate Representations • Multiple representations in the same language • Compromise between computation exposure and high-level description • SUN-IR: Arrays can be represented with multiple subscripts • SLLIC: MULTIPLY and DIVIDE operations
Example
C code:
    void make_node(p, n)
        struct node *p;
        int n;
    { struct node *q;
      q = malloc(sizeof(struct node));
      q->next = nil;
      q->value = n;
      p->next = q;
    }
MIR:
    make_node:
    begin
        receive p(val)
        receive n(val)
        q ← call malloc, (8, int)
        *q.next ← nil
        *q.value ← n
        *p.next ← q
        return
    end
C code:
    void insert_node(n, l)
        int n;
        struct node *l;
    { if (n > l->value)
          if (l->next == nil) make_node(l, n);
          else insert_node(n, l->next);
    }
MIR:
    insert_node:
    begin
        receive n(val)
        receive l(val)
        t1 ← *l.value
        if n <= t1 goto L1
        t2 ← *l.next
        if t2 != nil goto L2
        call make_node, (l, type1; n, int)
        return
    L2: t4 ← *l.next
        call insert_node, (n, int; t4, type1)
        return
    L1: return
    end
MIR Issues
• MIN does not usually exist
    MIR:
        t1 ← t2 min t3
    PA-RISC:
        MOVE r2, r1
        COM,>= r3, r2
        MOVE r3, r1
• Both value and “location” computation for Boolean conditions
    Value form:
        t3 ← t1 < t2
        if t3 goto L1
    Location (branch) form:
        if t1 < t2 goto L1
HIR • Obtained from MIR • Extra constructs • Array references • High level constructs
HIR:
    for v ← opd1 by opd2 to opd3
        instructions
    endfor
MIR:
        v ← opd1
        t2 ← opd2
        t3 ← opd3
        if t2 > 0 goto L2
    L1: if v < t3 goto L3
        instructions
        v ← v + t2
        goto L1
    L2: if v > t3 goto L3
        instructions
        v ← v + t2
        goto L2
    L3:
C code:
    void insert_node(n, l)
        int n;
        struct node *l;
    { if (n > l->value)
          if (l->next == nil) make_node(l, n);
          else insert_node(n, l->next);
    }
HIR:
    insert_node:
    begin
        receive n(val)
        receive l(val)
        t1 ← *l.value
        if n > t1 then
            t2 ← *l.next
            if t2 = nil then
                call make_node, (l, type1; n, int)
                return
            else
                t4 ← *l.next
                call insert_node, (n, int; t4, type1)
                return
            fi
        fi
    end
LIR • Obtained from MIR • Extra features: • Low level addressing • Load/Store • Eliminated constructs • Variables • Selectors • Parameters
C code:
    void insert_node(n, l)
        int n;
        struct node *l;
    { if (n > l->value)
          if (l->next == nil) make_node(l, n);
          else insert_node(n, l->next);
    }
LIR:
    insert_node:
    begin
        s800 ← s1
        s801 ← s2
        s802 ← [s801+0]
        if s800 <= s802 goto L1
        s803 ← [s801+4]
        if s803 != nil goto L2
        s1 ← s801
        s2 ← s800
        call make_node, ra
        return
    L2: s1 ← s800
        s2 ← [s801+4]
        call insert_node, ra
        return
    L1: return
    end
Representing MIR in ICAN • An MIR program can be (internally) represented as an abstract syntax tree • The general construction • A (union) type for every non-terminal • An enumerated type “kind” for every production • A tuple for every production • Other ideas • Flatten the hierarchy in some cases • Use functions to abstract MIR properties (simplifies semantic manipulations)
ICAN Tuples for MIR Instructions (Table 4.7)
    Label:
        <kind:label, lbl:Label>
    receive VarName(ParamType)
        <kind:receive, left:VarName, ptype:ParamType>
    VarName ← Operand1 Binop Operand2
        <kind:binasgn, left:VarName, opr:Binop, opd1:Operand1, opd2:Operand2>
    VarName ← Unop Operand
        <kind:unasgn, left:VarName, opr:Unop, opd:Operand>
    VarName ← Operand
        <kind:valasgn, left:VarName, opd:Operand>
    ...
IRoper = enum {
    add,        || +
    sub,        || - (binary)
    mul,        || * (binary)
    div,        || /
    mod, min, max,
    eql, neql, less, lseq, grtr, gteq,      || =, !=, <, <=, >, >=
    shl, shr, shra, and, or, xor,
    ind,        || * pointer dereference
    indelt,     || *. dereference to a field
    neg,        || - (unary)
    not,        || !
    addr, val,
    cast        || (type cast)
    ...
(Table 4.6)
MIRKind = enum {label, receive, binasgn, unasgn, ..., sequence}
Opkind = enum {var, const, type}
ExpKind = enum {binexp, unexp, noexp, listexp}
Exp_Kind: MIRKind → ExpKind
Has_Left: MIRKind → boolean
Exp_Kind := {<label, noexp>, <receive, noexp>, <binasgn, binexp>, <unasgn, unexp>, ..., <callexp, listexp>, ..., <sequence, noexp>}
Has_Left := {<label, false>, <receive, true>, <binasgn, true>, <unasgn, true>, <valasgn, true>, <condasgn, true>, <castasgn, true>, ..., <unif, false>, ...}