R Byte Code Optimization Compiler (1)
March 7, 2012
Outline
• Basic Compiling Process
• Compiler Structure
• Decoding Pass
• Type Annotation Pass
• Unbox Opportunity Identification Pass
Basic Byte-Code Optimization Compiling Process
• Typical Compiler Structure
[Diagram: Org Byte-Code seq (INTSXP) -> Decode pass -> Base Stmts (BC_STMTP) -> Several Passes (using Optimization Rules and the Profile Table) -> Opt Stmts (BC_STMTP) -> Encode pass -> New Byte-Code seq (INTSXP)]
[Diagram: the original and new byte-code sequences are both shown as: Addr of GETVAR, 1, Addr of LDCONST, 2, Addr of ADD, 3, Addr of SETVAR, 4, Addr of POP, Addr of GETFUN, 5, Addr of MAKEPROM, 6, Addr of CALL, 7, Addr of RETURN]
• Work on a new IR for the R byte-code
Compiler Structure – New Components Required
• A new IR is needed
  • The current R byte-code is just instructions
    • Opcode and operands in an int array structure
    • No additional information attached
    • Hard to manipulate
  • A very simple IR for the current optimization requirements
    • Attach profile information
    • Attach type information
    • Support simple type inference
    • Support simple unbox opportunity identification
• Basic Compiler Infrastructure
  • Pass definitions
  • Engine to run all the passes
  • Stmt printer: prints a stmt as human-readable text
Compiler Structure – IR
• IR and Stmts

    typedef struct {
        enum OP_CODE opcode;   //the op_code
        char* op_name;         //name of the instruction
        unsigned operand_num;  //number of operands
        unsigned stack_use;    //number of operands consumed on stack
        unsigned stack_gen;    //number of operands produced on stack
        int need_profile;      //whether the instruction need profile
        void* addr;            //used for decode and encode
    } BC_INSTR, *BC_INSTRP;    //the instruction

    typedef struct {
        BC_INSTRP instr;   //pointer to the instruction
        unsigned pc;       //pc value, relative pc value
        int operands[4];   //we only support max 4 operands. In fact, only switch has 4 operands
        int type;          //e.g. 0, unknown, logic, int, real, non-scalar. If the code gen more than one stac
        int type_source;   //e.g. fixed or from const, profile based, derived(from reasoning)
        int output_shape;  //Whether the output need box/unbox. 2bits, [need box][may unbox]
    } BC_STMT, *BC_STMTP;
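The stmt printer mentioned among the infrastructure components could look roughly like the sketch below. It is only an illustration: the function name `roc_format_stmt`, the enum names, and the name tables are hypothetical (the slides list the type and type-source values only informally), and the structs are reduced to the fields the printer reads.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical codes; the slides only describe these values informally. */
enum STMT_TYPE { TYPE_UNKNOWN, TYPE_LOGICAL, TYPE_INT, TYPE_REAL, TYPE_NON_SIMPLE };
enum TYPE_SOURCE { SRC_FIXED, SRC_CONSTANT, SRC_PROFILED, SRC_DERIVED };

/* Reduced versions of BC_INSTR / BC_STMT with only the fields printed here. */
typedef struct {
    const char *op_name;
    unsigned operand_num;
} BC_INSTR, *BC_INSTRP;

typedef struct {
    BC_INSTRP instr;
    unsigned pc;
    int operands[4];
    int type;
    int type_source;
} BC_STMT, *BC_STMTP;

static const char *type_names[] = {
    "Unknown", "Logical Scalar", "Int Scalar", "Real Scalar", "Non-Simple Type"};
static const char *source_names[] = {"Fixed", "Constant", "Profiled", "Derived"};

/* Formats "PC NAME, operands Type:..., Type Source:..." into buf.
   Returns the number of characters written (excluding the terminator). */
int roc_format_stmt(const BC_STMT *stmt, char *buf, size_t size) {
    assert(stmt->instr->operand_num <= 4);
    assert(stmt->type >= TYPE_UNKNOWN && stmt->type <= TYPE_NON_SIMPLE);
    assert(stmt->type_source >= SRC_FIXED && stmt->type_source <= SRC_DERIVED);

    int n = snprintf(buf, size, "%u %s", stmt->pc, stmt->instr->op_name);
    for (unsigned i = 0; i < stmt->instr->operand_num; i++) {
        if (n < 0 || (size_t)n >= size) return n;  /* error or buffer full */
        n += snprintf(buf + n, size - (size_t)n, ", %d", stmt->operands[i]);
    }
    if (n < 0 || (size_t)n >= size) return n;
    n += snprintf(buf + n, size - (size_t)n, " Type:%s, Type Source:%s",
                  type_names[stmt->type], source_names[stmt->type_source]);
    return n;
}
```

Formatting into a caller-supplied buffer keeps the sketch free of any R API; a real printer would more likely write through R's own output routines.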
Compiler Structure – Passes and Engine
• Pass function type and examples

    /* Type of a pass function */
    int (*pass_fun)(BC_STMTP, SEXP, unsigned *, PT_STACKP);

    /* Example: the type annotation pass */
    int roc_type_annotate_pass(BC_STMTP stmt,            //The current statement
                               SEXP constants,           //Constant table
                               unsigned *profile_table,  //Profile table
                               PT_STACKP type_stack);    //Current simulated stack

• Skeleton code to run a pass

    void roc_run_pass(PT_LISTP stmts, SEXP constants, unsigned *profile_table,
                      int (*pass_fun)(BC_STMTP, SEXP, unsigned *, PT_STACKP)) {
        PT_STACKP type_stack = rou_create_pointer_stack(); //Prepare the simulated stack
        for (unsigned i = 0; i < stmts->length; i++) {
            BC_STMTP stmt = rou_pointer_arraylist_get(stmts, i);
            (*pass_fun)(stmt, constants, profile_table, type_stack); //call the pass function
            //Update the stack: pop what the stmt consumes, push what it produces
            int stack_use = stmt->instr->stack_use;
            int stack_gen = stmt->instr->stack_gen;
            for (int j = 0; j < stack_use; j++) {
                rou_pointer_stack_pop(type_stack);
            }
            for (int j = 0; j < stack_gen; j++) {
                rou_pointer_stack_push(type_stack, stmt); //each produced slot points back to its producer
            }
        }
        rou_remove_pointer_stack(type_stack);
    }
Compiler Structure – One Pass
• Skeleton code for one pass

    int roc_type_annotate_pass(BC_STMTP stmt, SEXP constants,
                               unsigned *profile_table, PT_STACKP type_stack) {
        SEXP arg0;
        unsigned *prof_cell;
        BC_STMTP op1_stmt, op2_stmt;
        enum OP_CODE op_code = stmt->instr->opcode;
        switch (op_code) { //one case per opcode the pass knows about
        case LDCONST_OP:
            ... ...
            break;
        case GETVAR_OP:
            ... ...
            break;
        case DDVAL_OP:
            ... ...
            break;
        default:
            ... ...
            break;
        }
        return xxx;
    }
Decoding Pass
• Transform the original int array into stmts
  • Transform addr back to opcode
  • Organize opcode and operands into stmts

Input:

    #Instructions Vec
    7L, # code version
    Addr of LDCONST.OP, 1L,
    Addr of SETVAR.OP, 2L,
    Addr of POP.OP,
    Addr of GETVAR.OP, 2L,
    Addr of LDCONST.OP, 3L,
    Addr of ADD.OP, 4L,
    Addr of SETVAR.OP, 5L,
    Addr of POP.OP,
    Addr of GETFUN.OP, 6L,
    Addr of MAKEPROM.OP, 7L,
    Addr of CALL.OP, 8L,
    Addr of RETURN.OP

Output:

    PC  STMT
    1   LDCONST, 1    Type:Unknown, Type Source:Fixed
    3   SETVAR, 2     Type:Unknown, Type Source:Fixed
    5   POP           Type:Unknown, Type Source:Fixed
    6   GETVAR, 2     Type:Unknown, Type Source:Fixed
    8   LDCONST, 3    Type:Unknown, Type Source:Fixed
    10  ADD, 4        Type:Unknown, Type Source:Fixed
    12  SETVAR, 5     Type:Unknown, Type Source:Fixed
    14  POP           Type:Unknown, Type Source:Fixed
    15  GETFUN, 6     Type:Unknown, Type Source:Fixed
    17  MAKEPROM, 7   Type:Unknown, Type Source:Fixed
    19  CALL, 8       Type:Unknown, Type Source:Fixed
    21  RETURN        Type:Unknown, Type Source:Fixed
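A minimal sketch of the decoding step, under stated assumptions: the instruction addresses are replaced by made-up integer tokens (`FAKE_ADDR`), the instruction table covers only the opcodes in the example with the operand counts shown on the slide, and `roc_decode` and `find_instr_by_addr` are hypothetical names. It shows how mapping each address back to its instruction entry yields both the opcode and the number of operands to skip, which is what produces the PC column.

```c
#include <assert.h>
#include <stddef.h>

enum OP_CODE { RETURN_OP, POP_OP, LDCONST_OP, GETVAR_OP, SETVAR_OP,
               GETFUN_OP, MAKEPROM_OP, CALL_OP, ADD_OP, OP_COUNT };

/* Assumption: a fake integer token stands in for the real instruction address. */
#define FAKE_ADDR(op) (1000 + (int)(op))

typedef struct {
    enum OP_CODE opcode;
    const char *op_name;
    unsigned operand_num;
    int addr;
} BC_INSTR;

typedef struct {
    const BC_INSTR *instr;
    unsigned pc;
    int operands[4];
} BC_STMT;

/* Operand counts follow the example on the slide, not the full R opcode set. */
static const BC_INSTR instr_table[OP_COUNT] = {
    {RETURN_OP,   "RETURN",   0, FAKE_ADDR(RETURN_OP)},
    {POP_OP,      "POP",      0, FAKE_ADDR(POP_OP)},
    {LDCONST_OP,  "LDCONST",  1, FAKE_ADDR(LDCONST_OP)},
    {GETVAR_OP,   "GETVAR",   1, FAKE_ADDR(GETVAR_OP)},
    {SETVAR_OP,   "SETVAR",   1, FAKE_ADDR(SETVAR_OP)},
    {GETFUN_OP,   "GETFUN",   1, FAKE_ADDR(GETFUN_OP)},
    {MAKEPROM_OP, "MAKEPROM", 1, FAKE_ADDR(MAKEPROM_OP)},
    {CALL_OP,     "CALL",     1, FAKE_ADDR(CALL_OP)},
    {ADD_OP,      "ADD",      1, FAKE_ADDR(ADD_OP)},
};

/* Transform addr back to its instruction entry (linear search for brevity). */
static const BC_INSTR *find_instr_by_addr(int addr) {
    for (int i = 0; i < OP_COUNT; i++) {
        if (instr_table[i].addr == addr) return &instr_table[i];
    }
    return NULL;
}

/* Decodes code[1..length-1] into out[]; code[0] is the code version.
   Returns the number of stmts, or -1 on an unknown address or overflow. */
int roc_decode(const int *code, unsigned length, BC_STMT *out, unsigned max_stmts) {
    assert(code != NULL && out != NULL);
    unsigned pc = 1;
    unsigned count = 0;
    while (pc < length) {
        const BC_INSTR *instr = find_instr_by_addr(code[pc]);
        if (instr == NULL || count == max_stmts) return -1;
        if (pc + instr->operand_num >= length) return -1;  /* truncated operands */
        out[count].instr = instr;
        out[count].pc = pc;
        for (unsigned i = 0; i < instr->operand_num; i++) {
            out[count].operands[i] = code[pc + 1 + i];
        }
        count++;
        pc += 1 + instr->operand_num;  /* skip the opcode slot and its operands */
    }
    return (int)count;
}
```

Running this on the example vector gives 12 stmts with PCs 1, 3, 5, 6, 8, 10, 12, 14, 15, 17, 19, 21, matching the listing above. The encode pass would walk the stmts in the opposite direction, writing each instruction's address followed by its operands.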
Type Annotation Pass
• A very simple type inference engine
  • Input: the profile table and some simple rules
  • Output: the type of the object on top of the stack after executing the stmt
• Optimize the runtime profile by simple type inference
  • Reduces the runtime profile requirements
[Diagram: the simulated stack of SEXPs after GETVAR, 2; LDCONST, 1; ADD, 4; and CALL, 8. The stack top after CALL is marked "???"]
• GETVAR, 2 and CALL, 8: must profile to get the type
• LDCONST, 1: just check the constant's type statically
• ADD, 4: if we know the types of the objects on top of the stack before the add, we can derive the output's type statically
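The static rule for ADD can be sketched as follows. This is an illustration under assumptions, not the pass from the slides: the enum, `roc_derive_add_type`, `roc_annotate_add`, and the small array-based stack are hypothetical names, and the stack here stores types directly, whereas the engine shown earlier stores pointers to the producing stmts. The coercion rule follows R's arithmetic on scalars: a real operand makes the result real, and logical or integer operands give an integer result.

```c
#include <assert.h>

/* Hypothetical type codes for the simulated stack. */
enum STMT_TYPE { TYPE_UNKNOWN, TYPE_LOGICAL, TYPE_INT, TYPE_REAL, TYPE_NON_SIMPLE };

#define TYPE_STACK_MAX 64

typedef struct {
    enum STMT_TYPE items[TYPE_STACK_MAX];
    int top;  /* number of entries currently on the stack */
} TYPE_STACK;

static void ts_push(TYPE_STACK *s, enum STMT_TYPE t) {
    assert(s->top < TYPE_STACK_MAX);
    s->items[s->top++] = t;
}

static enum STMT_TYPE ts_pop(TYPE_STACK *s) {
    assert(s->top > 0);
    return s->items[--s->top];
}

/* Derive the result type of ADD from its two operand types. */
enum STMT_TYPE roc_derive_add_type(enum STMT_TYPE lhs, enum STMT_TYPE rhs) {
    if (lhs == TYPE_UNKNOWN || rhs == TYPE_UNKNOWN) return TYPE_UNKNOWN;
    if (lhs == TYPE_NON_SIMPLE || rhs == TYPE_NON_SIMPLE) return TYPE_NON_SIMPLE;
    if (lhs == TYPE_REAL || rhs == TYPE_REAL) return TYPE_REAL;
    return TYPE_INT;  /* logical/int operands: R arithmetic yields an integer */
}

/* Simulate ADD on the type stack: pop two operands, push the derived type. */
enum STMT_TYPE roc_annotate_add(TYPE_STACK *s) {
    enum STMT_TYPE rhs = ts_pop(s);
    enum STMT_TYPE lhs = ts_pop(s);
    enum STMT_TYPE result = roc_derive_add_type(lhs, rhs);
    ts_push(s, result);
    return result;
}
```

With a profiled real scalar from GETVAR and a constant real scalar from LDCONST on the stack, `roc_annotate_add` leaves a single real scalar entry, so ADD itself needs no profiling.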
Type Annotation Pass – Output Example

    PC  STMT
    1   LDCONST, 1    Type:Real Scalar, Type Source:Constant
    3   SETVAR, 2     Type:Real Scalar, Type Source:Derived
    5   POP           Type:Non-Simple Type, Type Source:Derived
    6   GETVAR, 2     Type:Real Scalar, Type Source:Profiled[0, 0, 1, 0]
    8   LDCONST, 3    Type:Real Scalar, Type Source:Constant
    10  ADD, 4        Type:Real Scalar, Type Source:Derived
    12  SETVAR, 5     Type:Real Scalar, Type Source:Derived
    14  POP           Type:Non-Simple Type, Type Source:Derived
    15  GETFUN, 6     Type:Non-Simple Type, Type Source:Fixed
    17  MAKEPROM, 7   Type:Non-Simple Type, Type Source:Fixed
    19  CALL, 8       Type:Real Scalar, Type Source:Profiled[0, 0, 1, 0]
    21  RETURN        Type:Real Scalar, Type Source:Derived

• Profiled: the type comes from the profile; the values are counts of [Logical Scalar, Int Scalar, Real Scalar, Non-Simple Type]
• Derived (e.g. ADD at PC 10): derived type, because the two objects on the stack are both real scalars
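One way to turn a profile cell such as [0, 0, 1, 0] into a type is sketched below. The function name `roc_type_from_profile` and the enum are hypothetical, and the policy for a cell with mixed or empty counts (fall back to unknown) is an assumption; the slides do not say how that case is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes. */
enum STMT_TYPE { TYPE_UNKNOWN, TYPE_LOGICAL, TYPE_INT, TYPE_REAL, TYPE_NON_SIMPLE };

/* counts holds the profile cell in the order
   [Logical Scalar, Int Scalar, Real Scalar, Non-Simple Type]. */
enum STMT_TYPE roc_type_from_profile(const unsigned counts[4]) {
    static const enum STMT_TYPE bucket_type[4] = {
        TYPE_LOGICAL, TYPE_INT, TYPE_REAL, TYPE_NON_SIMPLE};
    assert(counts != NULL);

    int seen = -1;  /* index of the only non-zero bucket found so far */
    for (int i = 0; i < 4; i++) {
        if (counts[i] == 0) continue;
        if (seen >= 0) return TYPE_UNKNOWN;  /* assumption: mixed types stay unknown */
        seen = i;
    }
    /* assumption: a cell that was never hit also stays unknown */
    return seen < 0 ? TYPE_UNKNOWN : bucket_type[seen];
}
```

For the cells in the table above, every observation fell in the third bucket, so both GETVAR at PC 6 and CALL at PC 19 come out as real scalars.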
Unbox Opportunity Identification Pass
• Two bits mark whether an opcode's output should be boxed or unboxed
  • [Must box, May unbox]
  • If only "May unbox" is set, we can add an unbox after the stmt
• Identify opcodes (e.g. ADD) that can work on unboxed objects
  • Identify the source stmts that generate the objects on top of the stack
  • Mark those stmts as "May unbox"
• Identify opcodes (e.g. SETVAR, RETURN) that must work on boxed objects
  • Identify the source stmts that generate the object on top of the stack
  • Mark those stmts as "Must box"
[Diagram: GETVAR, 2 and LDCONST, 3 each produce one SEXP on the simulated stack; ADD, 4 consumes both]
• ADD can work on unboxed objects, because the two objects on the stack are real scalars
• Mark the source stmts (GETVAR, 2 and LDCONST, 3) as "May unbox"
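The marking steps above can be sketched as a single walk over a basic block with a simulated stack of producer indices. Everything named here is hypothetical (`STMT`, `DEMAND_*`, `SHAPE_*`, `roc_mark_unbox`), the per-stmt fields are a reduced stand-in for the IR, and the sketch assumes SETVAR consumes one stack slot and produces one, which the slides do not state explicitly.

```c
#include <assert.h>

/* Two-bit output shape, in the order [need box][may unbox]. */
#define SHAPE_MAY_UNBOX 0x1
#define SHAPE_NEED_BOX  0x2

/* How a stmt treats the values it consumes from the stack. */
enum DEMAND { DEMAND_NONE, DEMAND_UNBOXED_OK, DEMAND_BOXED };

typedef struct {
    const char *name;
    int stack_use;       /* values consumed from the stack */
    int stack_gen;       /* values produced on the stack */
    enum DEMAND demand;
    int is_real_scalar;  /* output type, as found by the type annotation pass */
    int output_shape;    /* filled in by roc_mark_unbox */
} STMT;

#define SIM_STACK_MAX 64

void roc_mark_unbox(STMT *stmts, int n) {
    int stack[SIM_STACK_MAX];  /* index of the stmt that produced each slot */
    int top = 0;

    for (int i = 0; i < n; i++) {
        STMT *s = &stmts[i];
        assert(s->stack_use <= top);

        /* An unboxed-capable consumer only helps if all operands are real scalars. */
        int all_scalar = 1;
        for (int k = 0; k < s->stack_use; k++) {
            if (!stmts[stack[top - 1 - k]].is_real_scalar) all_scalar = 0;
        }

        /* Mark the source stmts of the consumed slots. */
        for (int k = 0; k < s->stack_use; k++) {
            STMT *producer = &stmts[stack[top - 1 - k]];
            if (s->demand == DEMAND_BOXED) {
                producer->output_shape |= SHAPE_NEED_BOX;
            } else if (s->demand == DEMAND_UNBOXED_OK && all_scalar) {
                producer->output_shape |= SHAPE_MAY_UNBOX;
            }
        }

        /* Update the simulated stack. */
        top -= s->stack_use;
        for (int k = 0; k < s->stack_gen; k++) {
            assert(top < SIM_STACK_MAX);
            stack[top++] = i;
        }
    }
}
```

A stmt whose output ends up with only `SHAPE_MAY_UNBOX` set is a candidate for unboxing; if `SHAPE_NEED_BOX` is also set, some consumer still needs the boxed value. As with the run-pass engine in the slides, this walk is only valid within a single basic block.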
Unbox Opportunity Identification Pass – Output Example

    PC  STMT
    1   LDCONST, 1    Type:Real Scalar, Type Source:Constant                Output shape:Box
    3   SETVAR, 2     Type:Real Scalar, Type Source:Derived
    5   POP           Type:Non-Simple Type, Type Source:Derived
    6   GETVAR, 2     Type:Real Scalar, Type Source:Profiled[0, 0, 1, 0]    Output shape:Unbox
    8   LDCONST, 3    Type:Real Scalar, Type Source:Constant                Output shape:Unbox
    10  ADD, 4        Type:Real Scalar, Type Source:Derived                 Output shape:Box
    12  SETVAR, 5     Type:Real Scalar, Type Source:Derived
    14  POP           Type:Non-Simple Type, Type Source:Derived
    15  GETFUN, 6     Type:Non-Simple Type, Type Source:Fixed
    17  MAKEPROM, 7   Type:Non-Simple Type, Type Source:Fixed
    19  CALL, 8       Type:Real Scalar, Type Source:Profiled[0, 0, 1, 0]    Output shape:Box
    21  RETURN        Type:Real Scalar, Type Source:Derived

• PC 6 (GETVAR) and PC 8 (LDCONST): could unbox, because the output is only used in an unbox situation
• PC 10 (ADD): must box after the add, because the result will be used by "SETVAR"
• PC 19 (CALL): must box, because the result will be used by "RETURN"
Implementation Status and Challenges
• Status
  • Implemented the simple compiler infrastructure as described
  • Implemented the three passes as described
  • Works well on the first example (RealAdd)
• Challenges
  • There is no compiler infrastructure in R: everything needs to be implemented
    • R is implemented in pure ANSI C
    • Not C++, no OO, no STL, …
  • My current simple run-pass engine can only support a single basic block
    • Handling control flow needs a lot of additional effort
    • Identify all basic blocks,
    • Reverse postorder traversal,
    • Iteration until stable, …
• Questions?
  • Is there a stack-VM-based compiler infrastructure available?