CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 12: Intermediate Representation and the Translation Function
Administration
• Prelim 1 on Monday in class
  • topics covered: regular expressions, tokenizing, context-free grammars, LL & LR parsers, static semantics
• No class Wednesday March 3
• Programming Assignment 2 due Friday March 5
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
Where we are
Source code (character stream)
  → Lexical analysis (regular expressions)
Token stream
  → Syntactic analysis (grammars)
Abstract syntax tree
  → Semantic analysis (static semantics)
Abstract syntax tree + types
  → Intermediate code generation (translation functions)
Intermediate code
Intermediate Code
• Abstract machine code (simpler)
• Allows machine-independent code generation and optimization
Diagram: AST → IR (optimize) → Pentium, Java bytecode, Alpha
Intermediate Code
• High-level vs. low-level IR
• High-level IR preserves high-level language constructs
  • structured flow, variables, methods
  • essentially an extended version of the AST
  • allows high-level optimization
• Low-level IR (à la Appel) is abstract machine code
  • unstructured jumps, registers, memory locations
  • allows low-level optimization
  • convenient for translation to real machine code
Translations
• Goal: get the program closer to machine code without losing the information needed to do useful optimizations
Diagram: AST → high-level IR (optimize) → low-level IR (optimize) → Pentium, Java bytecode, Alpha
Intermediate Code
• High-level IR is essentially the AST
  • Some AST nodes may be replaced with other AST nodes
  • New kinds of nodes that do not occur in the parse tree may be added
• Will be considered later for high-level optimizations (e.g. inlining)
Low-level IR
• Last time: variables in high-level code are mapped to stack locations or registers
• Stack locations live in the function's stack frame
• The stack frame is the region between the frame pointer (fp) and the stack pointer (sp)
• Local variables and temporaries are at negative offsets from fp
• Arguments and the static link are temporaries in the previous stack frame
• If the stack frame has constant size, then (fp - sp) = const, so (positive) offsets from sp can be used instead
Stack layout, from higher to lower addresses: args, fp, locals, temps, sp
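The last bullet can be made concrete with a little arithmetic. The sketch below assumes a downward-growing stack and a fixed frame size, so that fp = sp + frameSize; the class and method names are hypothetical and not part of the lecture.

```java
// Illustrative only: FrameOffsets and spOffset are made-up names.
class FrameOffsets {
    // Converts an fp-relative offset into an sp-relative one, assuming
    // fp - sp == frameSize throughout the function body.
    static int spOffset(int fpOffset, int frameSize) {
        return frameSize + fpOffset;
    }

    public static void main(String[] args) {
        // A local stored at fp - 8 in a 32-byte frame is found at sp + 24.
        System.out.println(spOffset(-8, 32));
    }
}
```

Because the difference is a compile-time constant, the translation can be done once per variable rather than at run time.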
Low-Level IR code
• The intermediate representation is a tree of nodes representing abstract machine instructions
• Statement-like instructions return no value and are executed in a particular order (e.g. MOVE, SEQ, CJUMP)
• Expression-like instructions return a value and may be evaluated in an unspecified order (non-determinism) (e.g. PLUS, MINUS)
IR expressions
• CONST(i) : the integer constant i
• TEMP(t) : a temporary register t. The abstract machine has an infinite number of these
• OP(e1, e2) : one of the following operations
  • PLUS, MINUS, MUL, DIV, MOD
  • AND, OR, XOR, LSHIFT, RSHIFT, ARSHIFT
• MEM(e) : contents of the memory location with address e
• CALL(f, l) : result of function f applied to arguments l
• ESEQ(s, e) : result of e after statement s is executed
• NAME(n) : address of the statement labeled n
IR statements
• MOVE(e, dest) : move the result of e into dest
  • dest = TEMP(t) : assign to temporary t
  • dest = MEM(e) : assign to memory location e
• EXP(e) : evaluate e, discard the result
• SEQ(s1, s2) : execute s1 and then s2
• JUMP(e) : jump to address e
• CJUMP(e, l1, l2) : jump to l1 or l2 depending on whether e is true or false
• LABEL(n) : a labeled statement (may be used in NAME, JUMP, CJUMP)
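One possible way to represent such IR trees in Java is sketched here. It is not the course's IRNode hierarchy: the class Ir, its factory methods, and the single generic node shape are assumptions chosen to keep the example short. Only a subset of the node kinds is shown, and MOVE keeps the (source, destination) argument order used on the slides.

```java
// Hypothetical generic IR node: an operator name plus ordered children.
final class Ir {
    final String op;
    final Ir[] kids;

    Ir(String op, Ir... kids) {
        this.op = op;
        this.kids = kids;
    }

    // Expression constructors (a subset of the list above).
    static Ir constant(int i) { return new Ir("CONST(" + i + ")"); }
    static Ir temp(String t) { return new Ir("TEMP(" + t + ")"); }
    static Ir plus(Ir a, Ir b) { return new Ir("PLUS", a, b); }
    static Ir mem(Ir addr) { return new Ir("MEM", addr); }

    // Statement constructors; MOVE takes (source, destination).
    static Ir move(Ir src, Ir dest) { return new Ir("MOVE", src, dest); }
    static Ir seq(Ir s1, Ir s2) { return new Ir("SEQ", s1, s2); }

    // Prints the tree in the OP(child, child) notation of the slides.
    @Override
    public String toString() {
        if (kids.length == 0) return op;
        StringBuilder sb = new StringBuilder(op).append('(');
        for (int i = 0; i < kids.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(kids[i]);
        }
        return sb.append(')').toString();
    }
}

class IrDemo {
    public static void main(String[] args) {
        // t1 := 1; then store t1 at address t1 + 4
        Ir prog = Ir.seq(
            Ir.move(Ir.constant(1), Ir.temp("t1")),
            Ir.move(Ir.temp("t1"), Ir.mem(Ir.plus(Ir.temp("t1"), Ir.constant(4)))));
        System.out.println(prog);
    }
}
```

A real compiler would more likely use one class per node kind so that later passes can dispatch on the type; the generic node is used here only for brevity.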
Translation
• How do we translate an AST or high-level IR into this low-level IR?
Variables
• Local variables and arguments are mapped to offsets from the frame pointer (negative and positive, respectively)
• If local variable v is located at offset k, a reference to v in the AST becomes the IR expression MEM(PLUS(FP, CONST(k)))
Diagram: IR tree MEM → PLUS → (FP, CONST(k)), shown next to the stack layout (args, fp, locals, temps, sp) with v among the locals
Assignment
• Assignment v = E translates to a MOVE(e, dest) node, where e is the translation of expression E and dest is the location of v
• Example: x = 2; translates to MOVE(CONST(2), MEM(PLUS(FP, CONST(k))))
Statements
• A sequence of two statements translates to a SEQ node:
  • If s1 translates to IR tree T1 and s2 to T2
  • Then s1;s2 translates to SEQ(T1, T2)
Translation functions
• Introduce a function T[E] to represent the translated version of an expression or statement E
• Translation rule for a sequence: T[s1; s2] = SEQ(T[s1], T[s2])
• Assignment: T[v = E] = ?
Translation function
• How to translate a variable? T[v] = MEM(PLUS(FP, CONST(k)))
• Where does k come from?
Need environment
• Answer: put it in the symbol table
• The translation function takes another argument, the environment A: T[E, A] or T[S, A]
• For each variable v in scope, the environment contains an entry "location v : k"
• Rule: if (location v : k) ∈ A, then T[v, A] = MEM(PLUS(FP, CONST(k)))
Translation rules
• Can write rules for the remainder of the AST
• Like type-checking: the rules provide a recursive recipe for the translation code
• If (location v : k) ∈ A, then T[v, A] = MEM(PLUS(FP, CONST(k)))
• T[s1; s2, A] = SEQ(T[s1, A], T[s2, A])
• T[v = E, A] = MOVE(T[E, A], T[v, A])
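The three rules can be turned into a small recursive translator, sketched below. To stay short it renders IR trees as plain strings rather than IRNode objects and uses a Map from variable name to offset as the environment A. The class names (AstNode, Var, IntLit, Assign, Seq2) are hypothetical, and the rule for integer literals (a literal n becomes CONST(n)) is an assumption that the slides do not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mini-AST; env plays the role of A (variable name -> offset k).
interface AstNode {
    String translate(Map<String, Integer> env);
}

final class Var implements AstNode {
    final String name;
    Var(String name) { this.name = name; }

    // T[v, A] = MEM(PLUS(FP, CONST(k))) where "location v : k" is in A
    public String translate(Map<String, Integer> env) {
        Integer k = env.get(name);
        if (k == null) throw new IllegalStateException("no location for " + name);
        return "MEM(PLUS(FP, CONST(" + k + ")))";
    }
}

final class IntLit implements AstNode {
    final int value;
    IntLit(int value) { this.value = value; }

    // Assumed rule: an integer literal n translates to CONST(n)
    public String translate(Map<String, Integer> env) {
        return "CONST(" + value + ")";
    }
}

final class Assign implements AstNode {
    final Var target;
    final AstNode value;
    Assign(Var target, AstNode value) { this.target = target; this.value = value; }

    // T[v = E, A] = MOVE(T[E, A], T[v, A])
    public String translate(Map<String, Integer> env) {
        return "MOVE(" + value.translate(env) + ", " + target.translate(env) + ")";
    }
}

final class Seq2 implements AstNode {
    final AstNode first;
    final AstNode second;
    Seq2(AstNode first, AstNode second) { this.first = first; this.second = second; }

    // T[s1; s2, A] = SEQ(T[s1, A], T[s2, A])
    public String translate(Map<String, Integer> env) {
        return "SEQ(" + first.translate(env) + ", " + second.translate(env) + ")";
    }
}

class TranslateDemo {
    public static void main(String[] args) {
        Map<String, Integer> env = new HashMap<>();
        env.put("x", -4);
        env.put("y", -8);
        // x = 2; y = x
        AstNode prog = new Seq2(
            new Assign(new Var("x"), new IntLit(2)),
            new Assign(new Var("y"), new Var("x")));
        System.out.println(prog.translate(env));
    }
}
```

Each translate method mirrors exactly one rule, which is what makes the rule notation a convenient specification for the code.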
Translation Code
• Like type-checking: add a method to AST nodes that does the translation
    abstract class ASTNode {
        IRNode translate(SymTab A) { … }
    }
Translating a block
T[s1; s2, A] = SEQ(T[s1, A], T[s2, A])
    class Block {
        Stmt[] stmts;

        // Builds the right-nested chain SEQ(s0, SEQ(s1, ... s(n-1))).
        // Assumes the block holds at least one statement; a single
        // statement is returned without a SEQ wrapper.
        IRNode translate(SymTab A) {
            int n = stmts.length;
            IRNode ret = stmts[n - 1].translate(A);
            for (int i = n - 2; i >= 0; i--) {
                ret = new Seq(stmts[i].translate(A), ret);
            }
            return ret;
        }
    }
Array index expressions
• Array index expressions may be used both as the LHS and the RHS of an assignment
• Translate to a MEM node pointing to the appropriate array element
• Question: how to find the appropriate element?
• Need to decide on a representation for arrays
Array representation
• Arrays can’t change in size, so store them as a contiguous series of elements
• Also need to store the length of the array
• Example: v: array[int] = new int[3]; v[0] = 0; v[1] = 5; v[2] = 10;
• Memory layout for v: 3 (length), 0, 5, 10
Translating array index
T[E1[E2], A] = MEM(PLUS(T[E1, A], MUL(PLUS(CONST(1), T[E2, A]), CONST(4))))
i.e. MEM(E1 + 4*(E2+1))
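The rule above can be checked with a short sketch that builds the IR text for an index expression and evaluates the same address arithmetic on concrete numbers. It assumes 4-byte words and the length-first array layout; ArrayIndexSketch and its methods are hypothetical names, and the IR is again rendered as a string.

```java
// Illustrative only: helper names are not from the lecture.
class ArrayIndexSketch {
    // T[E1[E2], A], given the already-translated subtrees e1 and e2.
    static String index(String e1, String e2) {
        return "MEM(PLUS(" + e1 + ", MUL(PLUS(CONST(1), " + e2 + "), CONST(4))))";
    }

    // The same address computation on numbers: base + 4*(i+1).
    // The +1 skips the length word stored in front of the elements.
    static int elementAddress(int base, int i) {
        return base + 4 * (i + 1);
    }

    public static void main(String[] args) {
        // v[2], with v stored at fp - 4
        System.out.println(index("MEM(PLUS(FP, CONST(-4)))", "CONST(2)"));
        // An array at address 1000: element 2 lives at 1000 + 4*3 = 1012
        System.out.println(elementAddress(1000, 2));
    }
}
```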
Bounds checking
• Need to check that the array index is within bounds!
• Use an ESEQ node
• E1[E2] translates to ESEQ(bounds checking code, MEM(PLUS(T[E1], ...)))
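The slide leaves the bounds-checking code unspecified, so the following is only one plausible shape for it. It assumes an unsigned less-than operator ULT (not in the OP list above), a runtime routine named outOfBounds that does not return, and fresh temporaries so that E1 and E2 are each evaluated once. All of these names are hypothetical, and the IR is rendered as text.

```java
// Illustrative only: ULT and outOfBounds are assumptions, not lecture material.
class BoundsCheckSketch {
    private static int counter = 0;

    // Returns a fresh name for a temporary or a label.
    static String fresh(String prefix) {
        return prefix + (counter++);
    }

    // Right-nests statements: seq(a, b, c) = SEQ(a, SEQ(b, c)).
    static String seq(String... stmts) {
        String ret = stmts[stmts.length - 1];
        for (int i = stmts.length - 2; i >= 0; i--) {
            ret = "SEQ(" + stmts[i] + ", " + ret + ")";
        }
        return ret;
    }

    // e1 and e2 are the already-translated array and index expressions.
    static String checkedIndex(String e1, String e2) {
        String a = "TEMP(" + fresh("a") + ")";
        String i = "TEMP(" + fresh("i") + ")";
        String ok = fresh("ok");
        String fail = fresh("fail");
        String check = seq(
            "MOVE(" + e1 + ", " + a + ")",
            "MOVE(" + e2 + ", " + i + ")",
            // The length is the first word of the array; an unsigned
            // comparison also rejects negative indices.
            "CJUMP(ULT(" + i + ", MEM(" + a + ")), " + ok + ", " + fail + ")",
            "LABEL(" + fail + ")",
            // Assumed not to return, so control never falls into LABEL(ok).
            "EXP(CALL(NAME(outOfBounds), []))",
            "LABEL(" + ok + ")");
        String element = "MEM(PLUS(" + a + ", MUL(PLUS(CONST(1), " + i + "), CONST(4))))";
        return "ESEQ(" + check + ", " + element + ")";
    }

    public static void main(String[] args) {
        System.out.println(checkedIndex("MEM(PLUS(FP, CONST(-4)))", "CONST(2)"));
    }
}
```

The ESEQ wrapper is what lets the check, which is a statement, sit inside an expression: the statement part runs first, and the MEM node supplies the value.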
Conclusions
• Language constructs can be translated to a small IR
• The translation process can be described conveniently as a translation function T[E, A]
• The translation function corresponds to a natural recursive implementation that builds IR nodes bottom-up
• Next time: translating structured statements