Basic Blocks Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://www.math.tau.ac.il/~msagiv/courses/wcc.html
Already Studied • Source program (string) → lexical analysis → Tokens → syntax analysis → Abstract syntax tree → semantic analysis → Abstract syntax tree → Translate → Tree IR
Mismatches between IR and Machine Languages • CJUMP jumps to one of two labels • But typical machine instructions name one target and fall through otherwise: BEQ Ri, Rj, L • Optimizing IR programs is difficult due to side effects in expressions • ESEQ nodes • Call nodes • Call nodes within expressions prevent passing arguments in registers
Mismatches between IR and Machine Languages • Call nodes within expressions prevent passing arguments in registers • Example: BINOP(PLUS, CALL(NAME f1, exp1), CALL(NAME f2, exp2))
Why can’t we be smarter? • Avoid two-way jumps • Do not use ESEQ expressions
Three Phase Solution • Rewrite the tree into a list of canonical trees without SEQ or ESEQ nodes • Group the list into basic blocks • Order basic blocks into a set of traces • CJUMP is immediately followed by false label
nfact example function nfactor (n: int): int= if n = 0 then 1 else n * nfactor(n-1)
Tree IR for the body of nfactor, with nested ESEQ nodes:
MOVE(TEMP t103,
     ESEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
     ESEQ(LABEL l0,
     ESEQ(MOVE(TEMP t129, CONST 1),
     ESEQ(JUMP(NAME l2),
     ESEQ(LABEL l1,
     ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
     ESEQ(LABEL l2,
          TEMP t129))))))))
The same body with only SEQ nodes:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
SEQ(LABEL l0,
SEQ(MOVE(TEMP t129, CONST 1),
SEQ(JUMP(NAME l2),
SEQ(LABEL l1,
SEQ(MOVE(TEMP t131, TEMP t128),
SEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
SEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130)),
SEQ(LABEL l2,
    MOVE(TEMP t103, TEMP t129))))))))))
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL(l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL(l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL(l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
/* JUMP(NAME l2) */
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
Outline • The Canon Interface • Phase 1: Removal of ESEQ nodes • Phase 2: Basic Blocks • Phase 3: Order traces • CJUMP is followed by a false label
/* canon.h */
typedef struct C_stmListList_ *C_stmListList;
struct C_block { C_stmListList stmLists; Temp_label label; };
struct C_stmListList_ { T_stmList head; C_stmListList tail; };
T_stmList C_linearize(T_stm stm);                  /* Eliminate ESEQs */
struct C_block C_basicBlocks(T_stmList stmList);   /* Group into basic blocks */
T_stmList C_traceSchedule(struct C_block b);       /* Order blocks into traces */

/* main.c */
static void doProc(FILE *out, F_frame frame, T_stm body)
{
    T_stmList stmList;
    AS_instrList iList;

    stmList = C_linearize(body);                   /* Removes ESEQs */
    stmList = C_traceSchedule(C_basicBlocks(stmList));
    iList = F_codegen(frame, stmList);             /* 9 */
}
Canonical Trees (Phase 1) • Rewrite the tree • No SEQ or ESEQ nodes • The parent of each CALL is either EXP or MOVE(TEMP t, …) • Apply “meaning preserving” rewriting rules • Sometimes generates temporaries
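The CALL condition above can be illustrated with a minimal sketch. It assumes a CALL found in an arbitrary position is rewritten to ESEQ(MOVE(TEMP t, CALL(...)), TEMP t), so that its parent becomes MOVE(TEMP t, …). The `Node` type, `node()`, `wrap_call()` and the temp counter are hypothetical names for this illustration, not the types in tree.h or the code in canon.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature tree: one node type for statements and expressions. */
typedef enum { N_TEMP, N_CALL, N_MOVE, N_ESEQ } NodeKind;
typedef struct Node Node;
struct Node {
    NodeKind kind;
    int temp;      /* temp number, used by N_TEMP only */
    Node *a, *b;   /* MOVE: a = dst, b = src; ESEQ: a = stm, b = exp */
};

static Node *node(NodeKind kind, int temp, Node *a, Node *b)
{
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->kind = kind; n->temp = temp; n->a = a; n->b = b;
    return n;
}

static int next_temp = 200;   /* hypothetical fresh-temp counter */

/* CALL(...)  =>  ESEQ(MOVE(TEMP t, CALL(...)), TEMP t)
 * Any node that is not a CALL is returned unchanged. */
Node *wrap_call(Node *e)
{
    assert(e != NULL);
    if (e->kind != N_CALL)
        return e;
    int t = next_temp++;
    Node *move = node(N_MOVE, 0, node(N_TEMP, t, NULL, NULL), e);
    return node(N_ESEQ, 0, move, node(N_TEMP, t, NULL, NULL));
}
```

The ESEQ introduced here is then removed by the ordinary ESEQ rewriting rules, which is why this step can be folded into the same pass. Nodes are never freed in this sketch.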
ESEQ(s1, ESEQ(s2, e)) ⇒ ESEQ(SEQ(s1, s2), e)
BINOP(op, ESEQ(s, e1), e2) ⇒ ESEQ(s, BINOP(op, e1, e2))
MEM(ESEQ(s, e)) ⇒ ESEQ(s, MEM(e))
JUMP(ESEQ(s, e)) ⇒ SEQ(s, JUMP(e))
CJUMP(op, ESEQ(s, e1), e2, l1, l2) ⇒ SEQ(s, CJUMP(op, e1, e2, l1, l2))
BINOP(op, e1, ESEQ(s, e2)) ⇒ ESEQ(s, BINOP(op, e1, e2)), when s and e1 commute
Which statements commute? • In general very difficult • Example: MOVE(MEM(e1), e2) commutes with MEM(e3) only if e1 ≠ e3 • The compiler decides conservatively
Which statements commute?
static bool commute(T_stm x, T_exp y)
{
    if (isNop(x)) return TRUE;
    if (y->kind == T_NAME || y->kind == T_CONST) return TRUE;
    return FALSE;
}
When s and e1 may not commute
BINOP(op, e1, ESEQ(s, e2)) ⇒ ESEQ(MOVE(TEMP t, e1), ESEQ(s, BINOP(op, TEMP t, e2)))
When s and e1 may not commute
CJUMP(op, e1, ESEQ(s, e2), l1, l2) ⇒ SEQ(MOVE(TEMP t, e1), SEQ(s, CJUMP(op, TEMP t, e2, l1, l2)))
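The two BINOP rules for an ESEQ in the right operand can be sketched as one function that uses the conservative commute test to choose between them. This is only an illustration on a hypothetical miniature IR: `Exp`, `Stm`, `new_exp()`, `new_stm()`, `commutes()`, `lift_right_eseq()` and the temp counter are assumptions made for the example, not the T_exp/T_stm types or the routines in canon.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct Exp Exp;
typedef struct Stm Stm;
typedef enum { E_CONST, E_NAME, E_TEMP, E_BINOP, E_ESEQ } ExpKind;
typedef enum { S_NOP, S_MOVE, S_OTHER } StmKind;

/* id: constant value, temp number, or operator, depending on kind.
 * ESEQ keeps its statement in stm and its value expression in left. */
struct Exp { ExpKind kind; int id; Exp *left, *right; Stm *stm; };
struct Stm { StmKind kind; Exp *dst, *src; };

static Exp *new_exp(ExpKind kind, int id, Exp *left, Exp *right, Stm *stm)
{
    Exp *e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = kind; e->id = id; e->left = left; e->right = right; e->stm = stm;
    return e;
}

static Stm *new_stm(StmKind kind, Exp *dst, Exp *src)
{
    Stm *s = malloc(sizeof *s);
    assert(s != NULL);
    s->kind = kind; s->dst = dst; s->src = src;
    return s;
}

/* Same conservative decision as commute() on the slide. */
static bool commutes(const Stm *s, const Exp *e)
{
    if (s->kind == S_NOP) return true;
    return e->kind == E_NAME || e->kind == E_CONST;
}

static int next_temp = 100;   /* hypothetical fresh-temp counter */

/* Rewrites BINOP(op, e1, ESEQ(s, e2)); other shapes are returned unchanged. */
static Exp *lift_right_eseq(Exp *b)
{
    assert(b != NULL);
    if (b->kind != E_BINOP || b->right->kind != E_ESEQ)
        return b;
    Exp *e1 = b->left;
    Stm *s = b->right->stm;
    Exp *e2 = b->right->left;
    if (commutes(s, e1))      /* ESEQ(s, BINOP(op, e1, e2)) */
        return new_exp(E_ESEQ, 0, new_exp(E_BINOP, b->id, e1, e2, NULL), NULL, s);
    /* ESEQ(MOVE(TEMP t, e1), ESEQ(s, BINOP(op, TEMP t, e2))) */
    Exp *t = new_exp(E_TEMP, next_temp++, NULL, NULL, NULL);
    Exp *inner = new_exp(E_ESEQ, 0, new_exp(E_BINOP, b->id, t, e2, NULL), NULL, s);
    return new_exp(E_ESEQ, 0, inner, NULL, new_stm(S_MOVE, t, e1));
}
```

With a constant as e1 the cheap rule applies; with a TEMP as e1 the test answers "may not commute" and a fresh temporary is introduced, even though the statement might in fact be harmless. That is the price of deciding conservatively.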
[Figures: twelve successive snapshots of the nfactor tree as the rewriting rules are applied one at a time, starting from the tree with nested ESEQ nodes and ending with a tree that contains only SEQ nodes; the CALL result is moved into TEMP t130 and the left operand TEMP t128 is saved in TEMP t131]
A Theoretical Solution • Apply rewriting rules until convergence • The result need not be unique • Raises questions about the efficiency and termination of the compiler
A Practical Solution • Apply rewriting rules in “one” pass • Two mutually recursive routines • do_stm(s) applies rewritings to s • do_exp(e) applies rewritings to e • reorder(expRefList) • Returns the side effect statements in expRefList • Replaces expressions by temporaries • Code distributed in “canon.c”
Taming Conditional Branches • Reorder statements so that CJUMP is followed by a false label • Two subphases: • Partition the statement list into basic blocks (straight-line programs starting with a label and ending with a branch) • Reorder basic blocks (Traces)
Phase 2: Basic Blocks • The compiler does not know which branch will be taken • Conservatively analyze the control flow of the program • A basic block • The first statement is a LABEL • The last statement is JUMP or CJUMP • There are no other LABELs, JUMPs, or CJUMPs
An Algorithm for Basic Blocks: C_basicBlocks() • Applied to each function body • Scan the statement list from left to right • Whenever a LABEL is found • a new block begins (and the previous block ends) • Whenever a JUMP or CJUMP is found • the current block ends (and the next block begins) • When a block ends without a JUMP or CJUMP • append a JUMP to the following LABEL • When a block does not start with a LABEL • add a fresh LABEL • At the end of the function body, jump to the beginning of the epilogue
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL(l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL(l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
Traces • Reorder basic blocks • Every CJUMP is followed by a false label • Many of the unconditional jumps are followed by the corresponding labels • can be eliminated • A trace • a sequence of basic blocks that are executed sequentially • A program has many overlapping traces • Find a set of traces that exactly covers the program • every block appears in exactly one trace • Minimize the number of traces
An Algorithm for Generating Traces: C_traceSchedule()
Put all the blocks of the program into a list Q
while Q is not empty do
    Start a new (empty) trace, call it T
    Remove the head element b of Q
    while b is not marked do
        mark b
        append b to the end of the current trace T
        if there is an unmarked successor c of b
            b := c
    end of current trace T
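The loop above can be sketched on a block graph given as successor indices. This is an illustration under stated assumptions, not the C_traceSchedule() of canon.c: blocks are numbered 0..n-1 in list order, each block has at most two successors with the false target listed first, -1 means "no successor", and `make_traces()` and `MAX_BLOCKS` are hypothetical names. A head block that is already marked is skipped rather than producing an empty trace.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 64   /* arbitrary limit for this sketch */

/* Fills order[] with the blocks in trace order and trace_of[b] with the
 * index of the trace containing block b. Returns the number of traces. */
int make_traces(int n, int succ[][2], int order[], int trace_of[])
{
    bool marked[MAX_BLOCKS] = { false };
    int traces = 0, k = 0;

    assert(n >= 0 && n <= MAX_BLOCKS);
    for (int q = 0; q < n; q++) {        /* Q is the list 0, 1, ..., n-1 */
        if (marked[q])
            continue;                    /* already placed in an earlier trace */
        int b = q;
        while (b >= 0 && !marked[b]) {
            marked[b] = true;
            order[k++] = b;              /* append b to the current trace */
            trace_of[b] = traces;
            int next = -1;
            for (int j = 0; j < 2; j++) {
                int c = succ[b][j];
                if (c >= 0 && !marked[c]) { next = c; break; }
            }
            b = next;                    /* -1 ends the current trace */
        }
        traces++;
    }
    return traces;
}
```

Listing the false target first is a design choice of this sketch: it makes the block after a CJUMP its false label whenever that block is still unmarked. For nfactor, with blocks l3, l0, l1, l2 numbered 0 to 3, the result is the order l3, l1, l2, l0 in two traces.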
T1  LABEL(l3)
    CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
T2  LABEL(l0)
    MOVE(TEMP t129, CONST 1)
    JUMP(NAME l2)
    LABEL(l1)
    MOVE(TEMP t131, TEMP t128)
    MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
    MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
    JUMP(NAME l2)
    LABEL(l2)
    MOVE(TEMP t103, TEMP t129)
    JUMP(NAME lend)
Finishing-Up • CJUMP followed by its false label is left alone • JUMP(NAME l) that is followed by LABEL(l) is removed • CJUMP followed by its true label • swap the true and false labels and negate the condition • If CJUMP(cond, a, b, lt, lf) is followed by neither lt nor lf • Replace it by: CJUMP(cond, a, b, lt, l'f) LABEL(l'f) JUMP(NAME lf) • At the end of the process, flatten the basic blocks into one statement list (trade simplicity for efficiency of the compiler and of the generated code)
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL(l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL(l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
/* JUMP(NAME l2) */
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
Optimal Traces • Optimizing compilers locate traces for frequently executed instructions • Minimize the (dynamic) number of jumps • Improve instruction cache performance • Improve register allocation • Optimize loops • Sometimes use • Static heuristics • Profiling information • Dynamic compilation