Languages and Compiler Design II Re-Introduction from CS 321

Material provided by Prof. Jingke Li
Stolen with pride and modified by Herb Mayer
PSU Spring 2010, rev.: 4/16/2010
CS322

Agenda
• The Competent Compiler Designer
• Your Background
• Overall Compiler Organization
• Front-End vs. Back-End
• Topics Discussed
• Course Objectives and Load
• Intermediate Representation (IR)
• IR versus other Artificial Languages (various)
• IR Samples (various)

The Competent Compiler Designer
• Areas of expertise of a highly qualified compiler designer:
  • Mastery of at least one high-level language; e.g. able to write a program that determines whether reference parameters are passed by address or by copy-in, copy-out
  • Mastery of some assembly language, real or hypothetical
  • Very thorough understanding of parsing
  • Thorough comprehension of the run-time stack
  • Competence emitting object code, whether IR, asm source, or direct binary; deep knowledge of the target computer
• This is what you learn in CS 322, Spring 2010:
  • Detailed coverage of the run-time stack; so rigorous that you know how to address names that are not local but are declared in some lexically enclosing scope
  • Code generation; it is acceptable to assume an unlimited number of registers, and to try again when running out
  • Interpretation of machine instructions
  • Thorough comprehension of several architectures: Sparc, IR, quads, JVM

Your Background
• At the start of CS 322, you:
  • Have a Java front-end that reads MINI source, parses the source, emits syntax errors, and builds an AST
  • Have written a scanner, or used a scanner generator (FE)
  • Have designed a parser, or are experienced using parser generators (FE)
  • Understand the programming language MINI quite well
  • Can emit suitable error messages for MINI programs when lexical or syntax errors occur in MINI sources
  • Have a Java programming infrastructure in place to emit IR
• Next in CS 322, you learn how to:
  • Emit intermediate code (IR), and improve it
  • Translate the intermediate representation (IR Tree) of the MINI source into Sparc machine code
  • Interpret the intermediate representation (IR Tree) of MINI
  • Understand object code: Quads, Basic Blocks, CFG, Stack Frames

Overall Compiler Organization
Source Program
  -> Scanner -> Tokens
  -> Syntax Analyzer -> Syntax Tree
  -> Semantic Analyzer -> Abstract Syntax Tree
  (CS 321 ends here; CS 322 starts here)
  -> Intermed. Code Gen -> Raw IR Code
  -> IR Optimizer -> Better IR Code
  -> Code Generator -> Raw Object Code
  -> Optimizer -> Optimized Object Code
Alongside the phases: Symbol Table, Error Handler

Front-End vs. Back-End
• Front-End (FE) input: ASCII character sequence
  • FE tokenizes
  • FE parses, builds and uses symbol table, emits errors
  • FE generates abstract syntax tree (AST)
• Middle-End (ME) input: AST and symbol table
  • ME generates raw intermediate representation (IR)
• Back-End (BE) input: intermediate representation
  • BE improves IR
  • BE generates real machine code
  • BE optimizes machine code
  • BE allocates registers, or uses stack instructions instead
  • BE generates (optimized) object code
    • Either asm ASCII, from which an assembler then generates binary
    • Or BE generates binary code directly

Topics Discussed
• Intermediate representations (IR)
• Mapping source -> AST -> IR -> better IR -> code
  • can also be done directly
• Stack, AKA Run-time stack
• Stack Frame, AKA Activation Record
• Stack Marker: the fixed portion of the Activation Record
• Stack Pointer AKA TOS, Frame Pointer AKA Base Pointer
• Interpretation
• Instruction Selection
• Register Allocation via graph coloring
• Basic Block, Control Flow Graph (CFG) (optional)
• Data Flow Analysis (optional)
• Garbage collection (optional)

Course Objectives
• Different IRs: three-address code, stack code, tree code
• Program interpretation
• Run-time organization: stack frames, recursive calls, return, function return value, parameter passing
• Different parameter passing mechanisms
  • E.g. by value, by reference, by value-result, by name (thunk)
  • Though MINI uses solely passing by value, AKA value-parameter
• Understand scoping rules, and use of the dynamic as well as the static link in the run-time stack
• Simple register allocation, e.g. via graph coloring
• BE implementation, with instruction selection and code generation (CG)

Course Workload
• Programming Projects, 100 points each except Sparc
  • 2 on IR generation
  • 1 on IR -> asm language generation
  • 1 on IR interpretation
  • 1 on Sparc asm language: small, 50 points total
• Exams
  • Midterm 250 points (25%)
  • Final 300 points (30%)
• Extra Credit
  • Up to 50 points (5%) max

Intermediate Representation (IR)
• Definition: IR is a semantics-preserving, language- and target-independent representation of the source program, used by the compiler to enable transformations that lead to better object code
• Desirable Properties:
  • Source independence: allows sharing of ME or BE among compilers for multiple languages
  • Target independence: allows optimizations that apply to multiple target architectures, though not target-specific optimizations
  • Intermediate level: enables several crucial optimizations, e.g. those based on dataflow analysis
• Typical IR Forms:
  • Three-address code: 1 operation, 1 target, 2 source operands
  • Stack code: 1 operation, 0 or 1 implied target (the stack), 1 or more sources on the stack (with a few exceptions that locate operands in memory)
  • Tree code: Abstract Syntax Tree (AST) form, a root node with sub-trees; usually a binary tree

IR Versus High-Level Languages
• Limited control constructs
  • Control flow limited to jump, conditional jump, call, return
• Limited expressions
  • Typically the number of operands is fixed
  • If more are needed, use multiple expressions
• Limited types
  • Few scalar types, but directly mappable onto target
  • Arrays, structures mapped onto linear, abstract addressing space
  • Booleans, enumerations, sub-ranges not represented
• Limited scopes
  • Few addressing spaces, e.g. static addresses or offsets from base pointer
• No more symbolic names
  • Functions and procedures have no names, just addresses and parameter profiles/specifications
• Many temps
  • Hold intermediate, temporary values; location in target machine to be determined

IR Versus Assembly Language
• IR does not reflect real target memory
  • No reference to static space, top-of-stack, machine registers
  • Some target machines have “sections”, others linear memory
• Not tied to a concrete function/procedure interface
  • Parameter passing handled abstractly
  • But IR can be mapped onto many target machines
• Not limited by number or type of registers
  • Uses virtual registers; contrast with Intel x86 and its 4 GPRs
  • Virtual registers are later mapped to real registers, or onto the stack
  • If needed, IR can use Float, Int, Double, Control, other regs
• No use of HW regs for temps in IR
  • Temps can later be mapped onto machine registers or onto the stack

IR Versus Assembly Language, Cont’d
Assembly Language (ASM), in contrast:
• Directly reflects target machine language
• Asm ASCII source program is a sequence of instruction mnemonics
• Mapped by the assembler from ASCII source into a binary object
• Such binaries are yet to be linked, from multiple objects into a single, relocatable executable object
• IO routines and other functions (math, transcendental, trigonometric, built-ins) are also to be linked in
• Generally, one ASM instruction maps into one machine opcode

Three-Address IR
• Three-address code is also referred to as Quadruples, or Quads
• Three-address code consists of:
  • 1 operation, AKA opcode, instruction, or action
  • 1 target, AKA destination: an address or reg
  • 2 source operands: can be memory, regs, literals, or omitted
• Target and sources must be addressable objects, but sources can also be literals
• One or both sources could be missing, or be implied
  • E.g. increment instruction: implies one integer operand, 1
  • Or jump instruction: only needs target, operands skipped
• For debugging purposes, IR operands may carry added info

Three-Address IR, Cont’d
• Quads are numbered, and can be jump destinations
• Sample three-address IRs (Quads):
    t1 := b * vreg5    -- vreg5 is virtual register
    t2 := a + b        -- a, b addressable objects
    a := t1 + t2       -- re-use of temps t1 and t2
    vreg5 := -c        -- only one operand
• Properties of Quads
  • Simple and readable
  • Lack physical constraints, i.e. can have 100s of regs
  • Explicit operands and intermediate results, suitable for local optimizations
  • Program structure is implicit, hence Quads are not suitable for all global optimizations
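The sample quads above could be held in a compiler written in Java roughly as follows. This is only a minimal sketch: the class names `Quad` and `QuadDemo`, the use of strings for operands, and the `null` convention for an omitted second source are assumptions made for illustration, not part of the course infrastructure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical quad: one operation, one target, up to two source operands.
class Quad {
    final String op;      // operator symbol, e.g. "*", "+", "-"
    final String target;  // temp, virtual register, or addressable object
    final String src1;    // first source operand
    final String src2;    // second source operand, or null if omitted

    Quad(String op, String target, String src1, String src2) {
        this.op = op;
        this.target = target;
        this.src1 = src1;
        this.src2 = src2;
    }

    @Override
    public String toString() {
        if (src2 == null) {
            return target + " := " + op + src1;   // unary form, e.g. vreg5 := -c
        }
        return target + " := " + src1 + " " + op + " " + src2;
    }
}

class QuadDemo {
    // Builds the four sample quads shown on the slide.
    static List<Quad> sample() {
        List<Quad> quads = new ArrayList<>();
        quads.add(new Quad("*", "t1", "b", "vreg5"));
        quads.add(new Quad("+", "t2", "a", "b"));
        quads.add(new Quad("+", "a", "t1", "t2"));
        quads.add(new Quad("-", "vreg5", "c", null));
        return quads;
    }

    public static void main(String[] args) {
        List<Quad> quads = sample();
        // The list index serves as the quad number, usable as a jump destination.
        for (int i = 0; i < quads.size(); i++) {
            System.out.println(i + ": " + quads.get(i));
        }
    }
}
```

Keeping the operands as plain strings keeps the sketch short; a real compiler would more likely use typed operand objects so that temps, literals, and addresses can be told apart.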

Concrete Three-Address Code
Notational conventions:
    t     represents a temporary (AKA temp)
    a     represents some memory address a
    n     represents literal n, AKA constant, of some type, e.g. integer
    x, y  represent operands x and y; each can be temp, literal, or address
    l     represents a label, a symbolic quad address (lower-case ell)
    Lm    represents a location ‘L’ in memory at address ‘m’
Arithmetic operations:
    t1 := x op y             -- target is t1, x and y operands
    t2 := op x               -- unary operator
    t3 := y                  -- assignment, maps into move
Control flow operations:
    goto l                   -- jump, AKA branch; target quad l
    if x relop y goto l      -- conditional jump to l

Concrete Three-Address Code, Cont’d
Method call, return op, and return value:
    param x              -- first parameter is x
    param y              -- next and last parameter is y
    call func1, 2        -- call void func1, pass 2 params
    call func2, 2, t     -- call func2, return value in t
    return x             -- return value x, bound to t in the caller
Method parameter use and local variable use:
    t1 := paramPtr[1]    -- 1st arg into temp t1
    t2 := paramPtr[2]    -- 2nd arg into temp t2
    t3 := varPtr[1]      -- 1st local into temp t3
    t4 := varPtr[2]      -- 2nd local into temp t4
Two built-in pointers, paramPtr and varPtr, refer to the parameter array and the local variables array associated with a method
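The param/call sequence above is mechanical enough to sketch as a small emitter. The class `CallEmitter`, the method `emitCall`, and the convention that a `null` result temp means a void call are all hypothetical names and choices for this illustration; only the textual quad format is taken from the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CallEmitter {
    // Emits one "param" quad per actual, left to right, followed by the "call" quad.
    // resultTemp may be null for a void call (assumption of this sketch).
    static List<String> emitCall(String func, List<String> actuals, String resultTemp) {
        List<String> code = new ArrayList<>();
        for (String actual : actuals) {
            code.add("param " + actual);
        }
        String call = "call " + func + ", " + actuals.size();
        if (resultTemp != null) {
            call += ", " + resultTemp;   // the temp that receives the return value
        }
        code.add(call);
        return code;
    }

    public static void main(String[] args) {
        for (String quad : emitCall("func2", Arrays.asList("x", "y"), "t")) {
            System.out.println(quad);
        }
    }
}
```

The actuals are assumed to be already evaluated into temps, literals, or addresses before `emitCall` is invoked, which matches the rule that operations cannot be nested.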

Concrete Three-Address Code, Cont’d
Memory access:
    t1 := a[0]    -- load content of memory at a, offset 0, into t1
    t2 := a[y]    -- load content of memory at a, offset y, into t2
    b[0] := x     -- store x into memory at b, offset 0
    b[y] := x     -- store x into memory at b, offset y
Conventions for calls and arithmetic ops:
• Arithmetic ops cannot be nested
• Relational operations occur only in conditional jumps
• There are only limited control flow operations
• Actual parameters to functions are listed left to right, one at a time, immediately preceding the call
• The number of actuals matches the number of formals. Note: this differs from mechanisms that let a call supply fewer or a varying number of actuals, e.g. C++ default arguments or Java varargs
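Because arithmetic ops cannot be nested, a nested source expression has to be broken into a sequence of quads, one temp per intermediate result. The following is a minimal sketch of that flattening; the classes `Expr`, `Leaf`, `Bin`, and `Flattener` are hypothetical and stand in for whatever AST classes a real front-end provides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical expression tree: leaves are names or literals, inner nodes are binary ops.
abstract class Expr { }

class Leaf extends Expr {
    final String name;
    Leaf(String name) { this.name = name; }
}

class Bin extends Expr {
    final String op;
    final Expr left, right;
    Bin(String op, Expr left, Expr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
}

class Flattener {
    private final List<String> code = new ArrayList<>();
    private int nextTemp = 1;

    // Emits quads for e and returns the operand (name, literal, or temp) holding its value.
    String gen(Expr e) {
        if (e instanceof Leaf) {
            return ((Leaf) e).name;          // already a simple operand, no quad needed
        }
        Bin b = (Bin) e;
        String l = gen(b.left);              // operands are evaluated left to right
        String r = gen(b.right);
        String t = "t" + nextTemp++;         // fresh temp for this intermediate result
        code.add(t + " := " + l + " " + b.op + " " + r);
        return t;
    }

    List<String> code() { return code; }

    // Flattens (a + b) * (c - 4).
    static List<String> demo() {
        Expr e = new Bin("*",
                new Bin("+", new Leaf("a"), new Leaf("b")),
                new Bin("-", new Leaf("c"), new Leaf("4")));
        Flattener f = new Flattener();
        f.gen(e);
        return f.code();
    }

    public static void main(String[] args) {
        for (String quad : demo()) {
            System.out.println(quad);
        }
    }
}
```

The sketch never reuses a temp; reusing temps, or mapping them onto registers, is left to later phases.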

Three Address Example
    class test08 {
        public static void main( String[] a ) {
            System.out.println( new Body().go() );
        }
    } //end test08
    class Body {
        public int value( int i, int j, int k ) { return i+j+k; }
        public int go() { return value( 1, 1, 1 ) + this.value( 2, 2, 2 ); }
    } //end Body
Corresponding three-address code:
    main:
        t1 := 1 * wdsize;
        param t1;
        call malloc, 1, t2;
        param t2;
        call Body_go, 1, t3;
        param t3;
        call prInt, 1;
        return;
    Body_value:
        t4 := paramPtr[1];
        t5 := paramPtr[2];
        t6 := paramPtr[3];
        t7 := t4 + t5;
        t8 := t7 + t6;
        return t8;
    Body_go:
        t9 := paramPtr[0];
        param t9;
        param 1;
        param 1;
        param 1;
        call Body_value, 4, t10;
        param t9;
        param 2;
        param 2;
        param 2;
        call Body_value, 4, t11;
        t12 := t10 + t11;
        return t12;

Stack Machine IR
• For a stack machine (AKA stack architecture, SA) all memory is the stack; SA accesses memory only via push and pop operations
• Operations are executed on top-of-stack (TOS) elements
• SA operands are implied, except for push and pop
• An operation with 2 operands pops 2 elements off TOS, performs the operation, then pushes the result onto TOS
• All such operations could be inherently slow, due to memory access; in reality, the topmost n elements (n is a small number, n = 4..16) are held in cache
• SA is ideally suited for code generation, as no register detail needs to be tracked
• Temps are also on the stack, as many as needed

Stack Machine IR, Cont’d
SA code sample shows the simplicity of this IR:
Source:
    sum = a + 4 * b ^ 2
SA IR:
    push a          -- find a in mem
    push_lit #4     -- tos holds a, 4
    push b          -- 3 elems on tos
    push_lit #2     -- now 4 on tos
    expo            -- a, 4, and b*b
    mult            -- a and 4*b*b
    add             -- 1 element on tos
    pop sum         -- tos empty now
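To see the SA sample run, a tiny interpreter is enough. The sketch below assumes integer values, a name-to-value map standing in for memory, and exactly the eight instructions of the sample; the class `StackMachine` and its methods are hypothetical and not the course interpreter.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class StackMachine {
    private final Deque<Integer> stack = new ArrayDeque<>();
    final Map<String, Integer> memory = new HashMap<>();   // variable name -> value

    // Executes one instruction given in the textual form used on the slide.
    void exec(String instr) {
        String[] parts = instr.trim().split("\\s+");
        String op = parts[0];
        switch (op) {
            case "push":                       // push the value of a variable
                stack.push(memory.get(parts[1]));
                break;
            case "push_lit":                   // push a literal, written as #n
                stack.push(Integer.parseInt(parts[1].substring(1)));
                break;
            case "pop":                        // pop TOS into a variable
                memory.put(parts[1], stack.pop());
                break;
            default: {                         // binary op: pop 2 operands, push result
                int right = stack.pop();
                int left = stack.pop();
                stack.push(apply(op, left, right));
            }
        }
    }

    private static int apply(String op, int left, int right) {
        switch (op) {
            case "add":  return left + right;
            case "mult": return left * right;
            case "expo": {
                int result = 1;                // left ^ right, for right >= 0
                for (int i = 0; i < right; i++) result *= left;
                return result;
            }
            default: throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    // Runs the slide's program for sum = a + 4 * b ^ 2 and returns sum.
    static int demo(int a, int b) {
        StackMachine m = new StackMachine();
        m.memory.put("a", a);
        m.memory.put("b", b);
        String[] program = {
            "push a", "push_lit #4", "push b", "push_lit #2",
            "expo", "mult", "add", "pop sum"
        };
        for (String instr : program) m.exec(instr);
        return m.memory.get("sum");
    }

    public static void main(String[] args) {
        System.out.println(demo(1, 3));        // 1 + 4 * 3^2
    }
}
```

Note that the right operand is popped first, since it was pushed last; this matters for non-commutative operations such as expo.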

Tree IR
• Abstract syntax tree (AST) IR has a root node and subtrees
• AST is a binary tree
• Root is the program node PROG, effectively a sequence of function nodes, represented as a tree
• Note that the AST model used for MINI is modified from the textbook
• Other nodes include:
    expression node      EXP        subtrees: operands
    statement node       STMT       subtrees: stmt or exp
    function node        FUNC       subtrees: stmt or func
    function list node   FUNClist   subtrees: func

Tree IR, Cont’d
For MINI we use tree IR
A sample high-level source:
    a := b * c + b * d
Corresponding abstract syntax tree:
            :=
           /  \
          a    +
             /   \
            *     *
           / \   / \
          b   c b   d
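The sample tree can be built and printed with a few lines of Java. This is a sketch only: the class `Node`, its `prefix` method, and the parenthesized prefix notation are assumptions chosen for illustration and are not the MINI IR classes.

```java
// Hypothetical binary tree node: a label plus optional left and right subtrees.
class Node {
    final String label;
    final Node left, right;

    Node(String label) { this(label, null, null); }

    Node(String label, Node left, Node right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }

    // Prints the tree in prefix form: the operator first, then both subtrees.
    String prefix() {
        if (left == null && right == null) {
            return label;                      // leaf: a variable name
        }
        return "(" + label + " " + left.prefix() + " " + right.prefix() + ")";
    }
}

class TreeDemo {
    // Builds the tree for: a := b * c + b * d
    static Node sample() {
        Node bc = new Node("*", new Node("b"), new Node("c"));
        Node bd = new Node("*", new Node("b"), new Node("d"));
        return new Node(":=", new Node("a"), new Node("+", bc, bd));
    }

    public static void main(String[] args) {
        System.out.println(sample().prefix());
    }
}
```

The operator precedence of the source (multiplication before addition, assignment last) is no longer a matter of syntax here; it is encoded in the shape of the tree.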

MINI Tree IR Nodes, List
    Binop:  +  -  *  /  &&  ||
    Relop:  ==  !=  <  <=  >  >=

MINI Tree IR Nodes, Detail
• EXP: BINOP( int binop, EXP left, EXP right )
  Dyadic operation; binop is language defined, e.g. + for add, etc.
• EXP: CALL( NAME func, EXPlist args )
  Argument list (actual parameters) is evaluated and then passed to the called function; first arg is # of actuals, second and last is target for return value; note that actuals are already passed via PARAM
• EXP: MEM( EXP exp )
  Fetch value from the address specified by exp
• EXP: NAME( String label )
  Symbolic label used as jump target; if called without String, a label is created
• EXP: TEMP( int num )
  Represents a temporary, can map into register or memory; num is the temp’s number. If called without arg, a temp is created

MINI Tree IR Nodes, Detail Cont’d
• EXP: PARAM( int idx )
  Actual parameter passed to a method; idx is its index in the list
• EXP: VAR( int idx )
  Local variable of a method; idx is its index in the var-list. First index is 1
• EXP: CONST( int val )
  Integer constant of value val
• EXP: ESEQ( STMT stmt, EXP exp )
  stmt is executed first, then exp is evaluated. The resulting value of ESEQ is that of exp
• EXP: EXPlist()
  List of expressions, used only for the arguments of CALL or CALLST

MINI Tree IR Nodes, Detail Cont’d
• STMT: MOVE( EXP dest, EXP src )
  Compute src and store it in dest, which is a MEM or TEMP node
• STMT: JUMP( NAME target )
  Transfer control to the fixed label target, defined by a NAME node
• STMT: CJUMP( int relop, EXP l, EXP r, NAME target )
  Transfer control to target, provided the relation holds between l and r
• STMT: LABEL( NAME n )
  A LABEL is being defined; it will serve as target for jumps
• STMT: CALLST( NAME func, EXPlist args )
  Call func. Args have been passed before and have been evaluated
• STMT: RETURN( EXP exp )
  Transfer control back to the caller, whose address is found on the stack, and return exp to the calling environment
  Where is the location of the return value?

MINI Tree IR Nodes, Detail Cont’d
• STMT: STMTlist()
  List of statements, processed in left-to-right (l-2-r) order
• FUNC: FUNC( String id, STMTlist stmts, int vcnt, int acnt )
  A function definition named id. vcnt is the number of locals, and acnt is the number of formals. stmts is the statement list
• FUNClist: FUNClist()
  List of multiple functions, defined in l-2-r order
• PROG: PROG( FUNClist funcs )
  The IR root node, defining the MINI program. funcs is the sequence of functions defined in the program. Note: there are no separate compilation units in MINI!
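One plausible way to realize a few of these node kinds in Java is a small class hierarchy with one class per node. The sketch below is not the course's IR package: it covers only CONST, PARAM, MEM, BINOP, and RETURN, it represents the operator as a String rather than the int code named in the slides, and its dump format only approximates the bracketed notation used in the samples.

```java
// Hypothetical base classes for expression and statement nodes.
abstract class EXP { }
abstract class STMT { }

class CONST extends EXP {
    final int val;                       // integer constant of value val
    CONST(int val) { this.val = val; }
    @Override public String toString() { return "(CONST " + val + ")"; }
}

class PARAM extends EXP {
    final int idx;                       // index of the parameter in the list
    PARAM(int idx) { this.idx = idx; }
    @Override public String toString() { return "(PARAM " + idx + ")"; }
}

class MEM extends EXP {
    final EXP exp;                       // address whose content is fetched
    MEM(EXP exp) { this.exp = exp; }
    @Override public String toString() { return "(MEM " + exp + ")"; }
}

class BINOP extends EXP {
    final String op;                     // assumption: String instead of int binop
    final EXP left, right;
    BINOP(String op, EXP left, EXP right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    @Override public String toString() {
        return "(BINOP " + op + " " + left + " " + right + ")";
    }
}

class RETURN extends STMT {
    final EXP exp;                       // value returned to the caller
    RETURN(EXP exp) { this.exp = exp; }
    @Override public String toString() { return "[RETURN " + exp + "]"; }
}

class NodeDemo {
    // Builds the body of value( i, j, k ): return i + j + k
    static STMT bodyValue() {
        EXP iPlusJ = new BINOP("+", new MEM(new PARAM(1)), new MEM(new PARAM(2)));
        return new RETURN(new BINOP("+", iPlusJ, new MEM(new PARAM(3))));
    }

    public static void main(String[] args) {
        System.out.println(bodyValue());
    }
}
```

Giving every node its own `toString` makes it cheap to dump an IR tree while debugging the IR generator.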

MINI Tree IR Sample
    class test08 {
        public static void main( String[] a ) {
            System.out.println( new Body().go() );
        }
    } //end test08
    class Body {
        public int value( int i, int j, int k ) { return i+j+k; }
        public int go() { return value( 1, 1, 1 ) + this.value( 2, 2, 2 ); }
    } //end Body
Corresponding tree code:
    main( locals=0, max_args=1 ) {
      [ CALLST( NAME prInt )
          ( ( CALL (NAME Body_go)
                ( ( ESEQ [ MOVE( TEMP 101 ) ( CALL ( NAME malloc ) ( ( NAME wSZ ))) ]
                         ( TEMP 101 ))))) ]
    } //end main
    Body_value( locals=0, max_args=0 ) {
      [ RETURN( BINOP + ( BINOP + ( MEM( PARAM 1 )) ( MEM( PARAM 2 )))
                        ( MEM( PARAM 3 ))) ]
    } //end Body_value
    Body_go( locals=0, max_args=4 ) {
      [ RETURN( BINOP + ( CALL( NAME Body_value ) ( ( MEM( PARAM 0 ) ) ( CONST 1 ) ( CONST 1 ) ( CONST 1 )))
                        ( CALL( NAME Body_value ) ( ( MEM( PARAM 0 ) ) ( CONST 2 ) ( CONST 2 ) ( CONST 2 )))) ]
    } //end Body_go

Canonicalized IR Trees
MINI’s IR tree language canonicalizes IR trees according to the following constraints:
• No nested CALL nodes. A call inside an expression is factored out into its own MOVE, which stores the result in a temp; the expression then uses the temp:
    [ MOVE( TEMP 101 ) ( BINOP + ( CALL( NAME A )( ( CONST 1 ))) ( CONST 2 )) ]
    =>
    [ MOVE( TEMP 102 )( CALL( NAME A )( ( CONST 1 ))) ]
    [ MOVE( TEMP 101 )( BINOP + ( TEMP 102 )( CONST 2 )) ]
• No ESEQ nodes. Statements embedded in expressions are extracted and placed ahead of the enclosing statement:
    [ MOVE( TEMP 104 ) ( ESEQ [ MOVE( TEMP 103 )( CALL( NAME malloc )( ( CONST 4 ))) ] ( TEMP 103 )) ]
    =>
    [ MOVE( TEMP 103 )( CALL( NAME malloc )( ( CONST 4 ))) ]
    [ MOVE( TEMP 104 )( TEMP 103 ) ]
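The first constraint, lifting CALL nodes out of expressions, can be sketched as a recursive rewrite. Everything named here (`IrExp`, `IrConst`, `IrTemp`, `IrBinop`, `IrCall`, `Canon`) is hypothetical, statements are kept as plain strings, and ESEQ removal is not covered; the sketch illustrates the idea and is not the course's canonicalizer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

abstract class IrExp { }

class IrConst extends IrExp {
    final int val;
    IrConst(int val) { this.val = val; }
    @Override public String toString() { return "(CONST " + val + ")"; }
}

class IrTemp extends IrExp {
    final int num;
    IrTemp(int num) { this.num = num; }
    @Override public String toString() { return "(TEMP " + num + ")"; }
}

class IrBinop extends IrExp {
    final String op;
    final IrExp left, right;
    IrBinop(String op, IrExp left, IrExp right) {
        this.op = op; this.left = left; this.right = right;
    }
    @Override public String toString() {
        return "(BINOP " + op + " " + left + " " + right + ")";
    }
}

class IrCall extends IrExp {
    final String func;
    final List<IrExp> args;
    IrCall(String func, List<IrExp> args) { this.func = func; this.args = args; }
    @Override public String toString() {
        StringBuilder sb = new StringBuilder("(CALL " + func + " (");
        for (int i = 0; i < args.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(args.get(i));
        }
        return sb.append("))").toString();
    }
}

class Canon {
    final List<String> stmts = new ArrayList<>();   // canonical statements, in order
    private int nextTemp;

    Canon(int firstFreeTemp) { nextTemp = firstFreeTemp; }

    // Returns an equivalent expression without CALL nodes; every CALL found
    // is emitted as its own MOVE into a fresh temp before the current statement.
    IrExp lift(IrExp e) {
        if (e instanceof IrBinop) {
            IrBinop b = (IrBinop) e;
            IrExp l = lift(b.left);                 // keep left-to-right evaluation order
            IrExp r = lift(b.right);
            return new IrBinop(b.op, l, r);
        }
        if (e instanceof IrCall) {
            IrCall c = (IrCall) e;
            List<IrExp> args = new ArrayList<>();
            for (IrExp a : c.args) args.add(lift(a));   // calls nested in args go first
            IrTemp t = new IrTemp(nextTemp++);
            stmts.add("[MOVE " + t + " " + new IrCall(c.func, args) + "]");
            return t;
        }
        return e;                                   // CONST and TEMP are already canonical
    }

    // Canonicalizes MOVE(dest, src).
    void move(IrTemp dest, IrExp src) {
        IrExp s = lift(src);                        // emits the lifted calls first
        stmts.add("[MOVE " + dest + " " + s + "]");
    }

    // The slide's first example: MOVE(TEMP 101, A(1) + 2), with 102 as next free temp.
    static List<String> demo() {
        Canon c = new Canon(102);
        IrExp callA = new IrCall("A", Collections.<IrExp>singletonList(new IrConst(1)));
        c.move(new IrTemp(101), new IrBinop("+", callA, new IrConst(2)));
        return c.stmts;
    }

    public static void main(String[] args) {
        for (String s : demo()) System.out.println(s);
    }
}
```

For simplicity a CALL that sits directly under a MOVE is lifted as well, which costs one extra temp-to-temp MOVE; a fuller version would leave such a call in place.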

Sample of Canonicalized IR Trees
Original IR tree code:
    main( locals=0, max_args=1 ) {
      [ CALLST( NAME prInt )
          ( ( CALL (NAME Body_go)
                ( ( ESEQ [ MOVE( TEMP 101 ) ( CALL ( NAME malloc ) ( ( NAME wSZ ))) ]
                         ( TEMP 101 ))))) ]
    } // main
    Body_value( locals=0, max_args=0 ) {
      [ RETURN( BINOP + ( BINOP + ( MEM( PARAM 1 )) ( MEM( PARAM 2 )))
                        ( MEM( PARAM 3 ))) ]
    } // Body_value
    Body_go( locals=0, max_args=4 ) {
      [ RETURN( BINOP + ( CALL( NAME Body_value ) ( ( MEM( PARAM 0 )) ( CONST 1 ) ( CONST 1 ) ( CONST 1 )))
                        ( CALL( NAME Body_value ) ( ( MEM( PARAM 0 )) ( CONST 2 ) ( CONST 2 ) ( CONST 2 )))) ]
    } // Body_go
Canonicalized IR tree code:
    main( locals=0, max_args=1 ) {
      [ MOVE( TEMP 101 )( CALL( NAME malloc )( ( NAME wSZ ))) ]
      [ MOVE( TEMP 100 )( CALL( NAME Body_go )( ( TEMP 101 ))) ]
      [ CALLST( NAME prInt ) ( ( TEMP 100 )) ]
    } // main
    Body_value( locals=0, max_args=0 ) {
      [ RETURN( BINOP + ( BINOP + ( MEM( PARAM 1 ) ) ( MEM( PARAM 2 )))
                        ( MEM( PARAM 3 ))) ]
    } // Body_value
    Body_go( locals=0, max_args=4 ) {
      [ MOVE( TEMP 102 )( CALL( NAME Body_value )( ( MEM( PARAM 0 )) ( CONST 1 )( CONST 1 )( CONST 1 ))) ]
      [ MOVE( TEMP 103 )( CALL( NAME Body_value )( ( MEM( PARAM 0 )) ( CONST 2 )( CONST 2 )( CONST 2 ))) ]
      [ RETURN( BINOP + ( TEMP 102 ) ( TEMP 103 )) ]
    } // Body_go
