
Intermediate Representations in Compiler Design

This chapter explores various levels of Intermediate Representations (IRs) in compilers, covering High-Level IRs, Medium-Level IRs, Low-Level IRs, and Multi-Level IRs. It discusses the design issues, examples from MIPS and PA-RISC compilers, the need for multiple representations, and the significance of internal versus external representations. Additionally, it delves into abstract syntax trees, different HIRs, medium-level IR characteristics, and low-level IR specifics in compiler development.


Presentation Transcript


  1. Intermediate Representations (IRs, Chapter 4) • Mooly Sagiv • Schreiber 313 • 03-640-7606 • http://www.math.tau.ac.il/~sagiv/courses/acd.html

  2. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  3. Compiler Structure • String of characters → Scanner → tokens → Parser → AST → Semantic analyzer → IR → Code generator → Object code • Shared by the phases: symbol table and access routines, OS interface

  4. Intermediate Language Selection • Low Vs. High level control flow structures • Flat Vs. Hierarchical (tree) • Machine Vs. High level of instructions • (Symbolic) Registers Vs. Stack • Normal forms (SSA) • Intermediate forms: Control Flow Graph, Call Graph, Program Dependence Graph • Issues: Engineering, efficiency, portability, optimization level, taste

  5. IRs in the Book
     HIR:
       for v ← v1 by v2 to v3 do
         a[i] := 2
       endfor
     MIR:
           v ← v1
           t2 ← v2
           t3 ← v3
       L1: if v > t3 goto L2
           t4 ← addr a
           t5 ← 4*i
           t6 ← t4+t5
           *t6 ← 2
           v ← v + t2
           goto L1
       L2:
     LIR:
           s2 ← s1
           s4 ← s3
           s6 ← s5
       L1: if s2 > s6 goto L2
           s7 ← addr a
           s8 ← 4*s9
           s10 ← s7+s8
           [s10] ← 2
           s2 ← s2 + s4
           goto L1
       L2:

  6. Issues in IR Design • Portability • Optimization level • Complexity of the compiler • Reuse of legacy compiler parts • Compilation cost • Multiple IR levels vs. a single IR level • Compiler maintenance

  7. Example: MIPS Compiler • UCODE (stack-based IR) • Load/store-based architecture

  8. Example: MIPS Compiler • UCODE (stack-based IR) → Translator → Medium-level IR → Optimizer → Medium-level IR → Translator → UCODE (stack-based IR) → Code generator → Load/store-based architecture

  9. Example: PA-RISC (HP-RISC) • UCODE (stack-based IR) • Load/store-based architecture

  10. Example: PA-RISC (HP-RISC) • UCODE (stack-based IR) → Translator → Very low IR (SLLIC) → Optimizer → Very low IR (SLLIC) → Code generator → Load/store-based architecture

  11. Why do we need multiple representations? • Lower representations expose more computations: more effective “standard” optimizations (examples: strength reduction, loop invariants, ...) • Higher representations provide more “non-determinism”: more effective parallelization (reordering) and data cache optimizations
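The first bullet can be made concrete with a minimal sketch. It assumes a loop that stores into an `int` array. Once the subscript is lowered to explicit address arithmetic, the multiplication `4*i` becomes visible, and strength reduction can replace it with a running addition. The function names `fill_naive` and `fill_reduced` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the lowered subscript multiplies the index by the element size
   on every iteration (the MIR pattern t5 <- 4*i; t6 <- t4+t5; *t6 <- value). */
void fill_naive(int *a, size_t n, int value) {
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        char *addr = (char *)a + sizeof(int) * i;  /* multiplication per iteration */
        *(int *)addr = value;
    }
}

/* After strength reduction: the address is advanced by a constant step,
   so the loop body contains an addition instead of a multiplication. */
void fill_reduced(int *a, size_t n, int value) {
    assert(a != NULL || n == 0);
    char *addr = (char *)a;
    char *end = addr + sizeof(int) * n;  /* loop-invariant bound, computed once */
    for (; addr != end; addr += sizeof(int)) {
        *(int *)addr = value;
    }
}
```

Both functions store the same values. The point is only that the second form cannot be derived while the access is still written as `a[i]`, because the multiplication is hidden inside the subscript.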

  12. Example: Arrays
      C-code:
        float a[20][10];
        ...
        ... a[i][j+2] ...
      Address: addr(a) + 4*(i*20 + j + 2)
      HIR:
        t ← a[i, j+2]
      MIR:
        t1 ← j+2
        t2 ← i*20
        t3 ← t1+t2
        t4 ← 4*t3
        t5 ← addr a
        t6 ← t5+t4
        t7 ← *t6
      LIR:
        r1 ← [fp-4]
        r2 ← r1+2
        r3 ← [fp-8]
        r4 ← r3*20
        r5 ← r2+r4
        r6 ← 4*r5
        r7 ← fp-216
        f7 ← [r7+r6]
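The address computation in this example can be checked with a small sketch. It assumes standard C row-major layout, where the row stride of `float a[20][10]` is 10 elements. The sketch therefore multiplies `i` by 10, whereas the slide's `i*20` would correspond to rows of 20 elements. The names `ROWS`, `COLS`, `elem_offset`, and `load_lowered` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 20, COLS = 10 };

/* Byte offset of a[i][j] from addr(a) for float a[ROWS][COLS] (row-major). */
size_t elem_offset(size_t i, size_t j) {
    assert(i < ROWS && j < COLS);
    size_t t2 = i * COLS;        /* elements skipped by whole rows */
    size_t t3 = j + t2;          /* linear element index */
    return sizeof(float) * t3;   /* scale by the element size */
}

/* Loads a[i][j+2] through explicit address arithmetic, as MIR/LIR would. */
float load_lowered(float (*a)[COLS], size_t i, size_t j) {
    char *base = (char *)a;                            /* t5 <- addr a */
    return *(float *)(base + elem_offset(i, j + 2));   /* t7 <- *t6 */
}
```

Comparing `load_lowered(a, i, j)` with the plain subscript `a[i][j+2]` for a few index pairs is a quick way to confirm that the lowered arithmetic matches what the HIR array reference means.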

  13. External Representation • Internal IR representation is used in the compiler • External representation is needed for: compiler debugging, cross-module integration • Design issues: representing pointers, unique representation of temporaries, compaction

  14. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  15. Abstract Syntax Trees • Compact source representation • No punctuation symbols • Tree defines hierarchy • Used for front ends • Sometimes include symbol table pointers • Can be translated into HIR • Can also be used for compaction

  16. Example AST
      C-code:
        int f(int a, int b)
        { int c;
          c = a + 2;
          print(c);
        }
      AST:
        function
          ident f
          paramlist
            ident a
            paramlist
              ident b
              end
          body
            declist
              ident c
              end
            stmtList
              =
                ident c
                +
                  ident a
                  const 2
              stmtList
                call
                  ident print
                  arglist
                    ident c
                    end
                end

  17. Other HIRs • Lambda expressions • Normal linear forms: • Preserve control flow structures and arrays • Simplified control flow structures • Eliminate GOTOs • Continuations

  18. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  19. Medium Level IR • Source and target language independent • Machine independent representation for program variables and temporaries • Simplified control flow constructs • Portable • Sufficient in many optimizing compilers: MIR, Sun-IR

  20. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  21. Low Level IR • One to one correspondence with machine • Deviations from the machine • Alternative code, e.g., MULTIPLY • Addressing modes • Side effects? • Instruction selection in the last phase • Appropriate compiler data structure can hide dependence

  22. Side Effect Operations (PA-RISC)
      MIR:
        L1: t2 ← *t1
            t1 ← t1+4
            ...
            t3 ← t3+1
            t5 ← t3 < t4
            if t5 goto L1
      PA-RISC (Option 1):
        LDWM   4(0, r2), r3
        ...
        ADDI   1, r4, r4
        COMB,< r4, r5, L1
      PA-RISC (Option 2):
        LDWX   r2(0, r1), r3
        ...
        ADDIB,< 4, r2, r5, L1

  23. LIR in Tiger
      /* assem.h */
      /* string, Temp_tempList, Temp_label, and AS_targets are declared elsewhere in the Tiger sources. */
      typedef enum {I_OPER, I_LABEL, I_MOVE} AS_instr_kind;
      struct AS_instr_ {
        AS_instr_kind kind;   /* selects the active member of u */
        union {
          struct {string assem; Temp_tempList dst, src; AS_targets jumps;} OPER;
          struct {string assem; Temp_label label;} LABEL;
          struct {string assem; Temp_tempList dst, src;} MOVE;
        } u;
      };

  24. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  25. Multi-Level Intermediate Representations • Multiple representations in the same language • Compromise between computation exposure and high-level description • SUN-IR: Arrays can be represented with multiple subscripts • SLLIC: MULTIPLY and DIVIDE operations

  26. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  27. XBNF for MIR

  28. XBNF for Receive Instruction

  29. XBNF for Assignments

  30. XBNF for Control Flow Instructions

  31. XBNF for Call/Return Instruction

  32. XBNF for Sequence(Volatile Instructions)

  33. XBNF for Constants

  34. XBNF for Identifiers

  35. Example
      C-code:
        void make_node(p, n)
          struct node *p;
          int n;
        { struct node *q;
          q = malloc(sizeof(struct node));
          q->next = nil;
          q->value = n;
          p->next = q;
        }
      MIR:
        make_node:
        begin
          receive p(val)
          receive n(val)
          q ← call malloc, (8, int)
          *q.next ← nil
          *q.value ← n
          *p.next ← q
          return
        end

  36. Example
      C-code:
        void insert_node(n, l)
          int n;
          struct node *l;
        { if (n > l->value)
            if (l->next == nil) make_node(l, n);
            else insert_node(n, l->next);
        }
      MIR:
        insert_node:
        begin
          receive n(val)
          receive l(val)
          t1 ← *l.value
          if n <= t1 goto L1
          t2 ← *l.next
          if t2 != nil goto L2
          call make_node, (l, type1; n, int)
          return
        L2: t4 ← *l.next
          call insert_node, (n, int; t4, type1)
          return
        L1: return
        end
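For readers who want to run the two routines, here is a minimal sketch in standard C. It makes three assumptions that are not in the slides. `nil` is taken to be `NULL`. The field order of `struct node` (value first, then next) is inferred from the LIR offsets `[s801+0]` and `[s801+4]` used later in the deck. The `assert` on the result of `malloc` is added only to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: value at offset 0, next after it. */
struct node {
    int value;
    struct node *next;
};

/* Appends a fresh node holding n directly after p. */
void make_node(struct node *p, int n) {
    struct node *q = malloc(sizeof(struct node));
    assert(q != NULL);   /* simplification: no error recovery in this sketch */
    q->next = NULL;
    q->value = n;
    p->next = q;
}

/* Walks the list while n is larger than the current value and appends n
   at the end; does nothing as soon as n <= l->value (the L1 exit in MIR). */
void insert_node(int n, struct node *l) {
    if (n > l->value) {
        if (l->next == NULL)
            make_node(l, n);
        else
            insert_node(n, l->next);
    }
}
```

The early-exit branch in the C code corresponds to `if n <= t1 goto L1` in the MIR: the source condition is inverted so that the fall-through path is the nested `if`.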

  37. MIR Issues
      • MIN does not usually exist
          MIR:     t1 ← t2 min t3
          PA-RISC: MOVE r2, r1
                   COM,>= r3, r2
                   MOVE r3, r1
      • Both value and “location” computation for Boolean conditions
          t3 ← t1 < t2
          if t3 goto L1
        versus
          if t1 < t2 goto L1

  38. HIR • Obtained from MIR • Extra constructs • Array references • High level constructs

  39. XBNF for HIR

  40. MIR v opd1 t2 opd2 t3 opd3 if t2 > 0 goto L2 L1: if v < t3 goto L3 instructions; v  v + t2 goto L1 L2: if v > t3 goto L3 instructions; v  v + t2 goto L2 L3: HIR for v opd1 by opd2 to opd3 instructions endfor

  41. Example
      C-code:
        void insert_node(n, l)
          int n;
          struct node *l;
        { if (n > l->value)
            if (l->next == nil) make_node(l, n);
            else insert_node(n, l->next);
        }
      HIR:
        insert_node:
        begin
          receive n(val)
          receive l(val)
          t1 ← *l.value
          if n > t1 then
            t2 ← *l.next
            if t2 = nil then
              call make_node, (l, type1; n, int)
              return
            else
              t4 ← *l.next
              call insert_node, (n, int; t4, type1)
              return
            fi
          fi
        end

  42. LIR • Obtained from MIR • Extra features: • Low level addressing • Load/Store • Eliminated constructs • Variables • Selectors • Parameters

  43. XBNF for LIR

  44. XBNF for LIR (Contd.)

  45. Example
      C-code:
        void insert_node(n, l)
          int n;
          struct node *l;
        { if (n > l->value)
            if (l->next == nil) make_node(l, n);
            else insert_node(n, l->next);
        }
      LIR:
        insert_node:
        begin
          s800 ← s1
          s801 ← s2
          s802 ← [s801+0]
          if s800 <= s802 goto L1
          s803 ← [s801+4]
          if s803 != nil goto L2
          s1 ← s801
          s2 ← s800
          call make_node, ra
          return
        L2: s1 ← s800
          s2 ← [s801+4]
          call insert_node, ra
          return
        L1: return
        end

  46. Outline • Issues in IR design • High-Level IRs • Medium-Level IRs • Low-Level IRs • Multi-Level IRs • MIR, HIR, and LIR • ICAN Representations • Other IRs • Conclusions

  47. Representing MIR in ICAN • An MIR program can be (internally) represented as an abstract syntax tree • The general construction: a (union) type for every non-terminal, an enumerated type “kind” for every production, a tuple for every production • Other ideas: flatten the hierarchy in some cases, use functions to abstract MIR properties (simplifies semantic manipulations)

  48. ICAN Tuples for MIR Instructions (Table 4.7)
      Label:
        <kind:label, lbl:Label>
      receive VarName(ParamType)
        <kind:receive, left:VarName, ptype:ParamType>
      VarName ← Operand1 Binop Operand2
        <kind:binasgn, left:VarName, opr:Binop, opd1:Operand1, opd2:Operand2>
      VarName ← Unop Operand
        <kind:unasgn, left:VarName, opr:Unop, opd:Operand>
      VarName ← Operand
        <kind:valasgn, left:VarName, opd:Operand>
      ...

  49. IRoper (Table 4.6)
      IRoper = enum {
        add,                                 || +
        sub,                                 || - (binary)
        mul,                                 || * (binary)
        div,                                 || /
        mod, min, max,
        eql, neql, less, lseq, grtr, gteq,   || =, !=, <, <=, >, >=
        shl, shr, shra, and, or, xor,
        ind,                                 || * (pointer dereference)
        indelt,                              || *. (dereference to a field)
        neg,                                 || - (unary)
        not,                                 || !
        addr, val,
        cast,                                || (type cast)
        ...

  50. MIRKind = enum {label, receive, binasgn, unasgn, ..., sequence}
      Opkind = enum {var, const, type}
      ExpKind = enum {binexp, unexp, noexp, listexp}
      Exp_Kind: MIRKind → ExpKind
      Has_Left: MIRKind → boolean
      Exp_Kind := {<label, noexp>, <receive, noexp>, <binasgn, binexp>,
                   <unasgn, unexp>, ..., <callexp, listexp>, ..., <sequence, noexp>}
      Has_Left := {<label, false>, <receive, true>, <binasgn, true>,
                   <unasgn, true>, <valasgn, true>, <condasgn, true>,
                   <castasgn, true>, ..., <unif, false>, ...}
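The construction described on the last few slides (a kind enumeration, a tuple per production, and functions such as Exp_Kind and Has_Left that abstract over kinds) can be sketched in C. This is an illustration under stated assumptions, not the ICAN definition. It models only the four kinds whose Exp_Kind and Has_Left entries both appear on the slide, flattens every tuple into one record as the "flatten the hierarchy" bullet suggests, and stores names as strings. All C identifiers here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { MIR_LABEL, MIR_RECEIVE, MIR_BINASGN, MIR_UNASGN, MIR_KIND_COUNT } MirKind;
typedef enum { EXP_BINEXP, EXP_UNEXP, EXP_NOEXP, EXP_LISTEXP } ExpKind;

/* One flat record standing in for the per-production tuples. */
typedef struct {
    MirKind kind;
    const char *lbl;    /* label name (kind:label) */
    const char *left;   /* target variable, if the kind has one */
    const char *opr;    /* operator (binasgn, unasgn) */
    const char *opd1;   /* first or only operand */
    const char *opd2;   /* second operand (binasgn) */
} MirInst;

/* Exp_Kind and Has_Left as lookup tables indexed by kind. */
static const ExpKind exp_kind_table[MIR_KIND_COUNT] = {
    [MIR_LABEL] = EXP_NOEXP,
    [MIR_RECEIVE] = EXP_NOEXP,
    [MIR_BINASGN] = EXP_BINEXP,
    [MIR_UNASGN] = EXP_UNEXP,
};
static const bool has_left_table[MIR_KIND_COUNT] = {
    [MIR_LABEL] = false,
    [MIR_RECEIVE] = true,
    [MIR_BINASGN] = true,
    [MIR_UNASGN] = true,
};

ExpKind exp_kind_of(MirKind k) { assert(k < MIR_KIND_COUNT); return exp_kind_table[k]; }
bool has_left(MirKind k) { assert(k < MIR_KIND_COUNT); return has_left_table[k]; }

/* Number of operands read by an instruction, derived from Exp_Kind alone. */
int operand_count(const MirInst *inst) {
    assert(inst != NULL);
    switch (exp_kind_of(inst->kind)) {
    case EXP_BINEXP: return 2;
    case EXP_UNEXP:  return 1;
    default:         return 0;   /* noexp; listexp kinds are not modelled in this sketch */
    }
}
```

The benefit of the table-driven functions is that a pass such as liveness analysis can ask `has_left` and `operand_count` without a case for every instruction kind, which is the simplification the "use functions to abstract MIR properties" bullet refers to.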
