Learn how to optimize compilers with the SSA Construction Algorithm in CISC 673 for Spring 2011 by John Cavazos at the University of Delaware. Topics include inserting Φ-functions and dominance frontiers.
Optimizing Compilers
CISC 673, Spring 2011
Static Single Assignment III
John Cavazos, University of Delaware
SSA Construction Algorithm
1. Insert Φ-functions
a.) calculate dominance frontiers
b.) find global names; for each name, build a list of blocks that define it
c.) insert Φ-functions
Insert Φ-functions
for each global name n
    worklist ← Block(n) // blocks in which n is assigned
    for each block b ∈ worklist
        for each block d in b's dominance frontier
            insert a Φ-function for n in d
            add d to worklist
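The worklist loop above can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the function name `insert_phis` and the dictionary layout are assumptions, and the `has_phi` guard is an addition so that a block is queued only once and the loop terminates on cyclic frontiers. The frontier values are the ones quoted on the dominance-frontier slide (DF(4) = {6}, DF(6) = {7}, DF(7) = {1}, DF(1) = Ø).

```python
def insert_phis(def_blocks, df):
    """Return the blocks that need a Φ-function for one global name.

    def_blocks: blocks in which the name is assigned (Block(n) on the slide).
    df: dominance frontier of each block, as {block: set_of_blocks}.
    """
    worklist = list(def_blocks)
    has_phi = set()                      # guard added in this sketch
    while worklist:
        b = worklist.pop()
        for d in df.get(b, ()):
            if d not in has_phi:
                has_phi.add(d)           # "insert a Φ-function for n in d"
                worklist.append(d)       # the Φ is itself a new definition
    return has_phi


# Frontiers quoted on the slides; blocks not listed are not needed here.
df = {1: set(), 4: {6}, 6: {7}, 7: {1}}
phi_blocks = insert_phis({4}, df)        # x is defined in block 4
```

A definition in block 4 therefore cascades through blocks 6, 7, and 1, which is the chain the slide walks through.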
Dominance Frontiers & Φ-Function Insertion
• A definition at n forces a Φ-function at m iff n ∉ DOM(m) but n ∈ DOM(p) for some p ∈ preds(m)
• DF(n) is the fringe just beyond the region n dominates
• For each assignment, we insert the Φ-functions
[Figure: CFG with blocks B0 to B7, showing the definition x ← ... and the inserted x ← Φ(...) in B1, B6, and B7]
• DF(4) is {6}, so def in 4 forces a Φ-function in 6
• def in 6 forces a Φ-function in DF(6) = {7}
• def in 7 forces a Φ-function in DF(7) = {1}
• def in 1 forces a Φ-function in DF(1) = Ø (halt)
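The "iff" condition on this slide can be turned directly into code. The sketch below computes dominator sets with the simple iterative set-intersection method and then applies the slide's condition literally. The CFG edges are an assumption: they are inferred from the Φ-parameter counts in the later renaming example, since the figure itself did not survive. The function names are hypothetical, and every non-entry block is assumed to be reachable.

```python
def dominators(succs, entry):
    """Iteratively compute DOM(n) for every block (n is in its own DOM set)."""
    nodes = set(succs)
    preds = {n: set() for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


def dominance_frontiers(succs, entry):
    """m is in DF(n) iff n is not in DOM(m) but n is in DOM(p), p in preds(m)."""
    dom, preds = dominators(succs, entry)
    df = {n: set() for n in succs}
    for m in succs:
        for p in preds[m]:
            for n in dom[p]:
                if n not in dom[m]:
                    df[n].add(m)
    return df


# Assumed edges for the B0..B7 example (B7 loops back to B1).
cfg = {0: [1], 1: [2, 3], 2: [7], 3: [4, 5], 4: [6], 5: [6], 6: [7], 7: [1]}
df = dominance_frontiers(cfg, 0)
```

Under these assumed edges the result agrees with the frontiers quoted on the slide for blocks 1, 4, 6, and 7.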
Example with all the Φ-functions inserted (assume a, b, c, & d defined before B0)
B0: i ← ...
B1: a ← Φ(a,a); b ← Φ(b,b); c ← Φ(c,c); d ← Φ(d,d); i ← Φ(i,i); a ← ...; c ← ...
B2: b ← ...; c ← ...; d ← ...
B3: a ← ...; d ← ...
B4: d ← ...
B5: c ← ...
B6: d ← Φ(d,d); c ← Φ(c,c); b ← ...
B7: a ← Φ(a,a); b ← Φ(b,b); c ← Φ(c,c); d ← Φ(d,d); y ← a+b; z ← c+d; i ← i+1
(branch label in the figure: i > 100)
• Excluding local names avoids Φ's for y & z
• With all the Φ-functions
• Lots of new ops
• Renaming is next
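One common way to find the global names, and to build the per-name list of defining blocks at the same time, is to collect every name that is used in some block before that block defines it. The slides do not spell out their method, so the sketch below is an assumption in that textbook style; `find_globals` and the `(target, uses)` encoding are hypothetical, and each elided right-hand side `...` is treated as using no names.

```python
def find_globals(blocks):
    """blocks: {block: [(target, [used names]), ...]} before Φ insertion."""
    global_names = set()
    def_blocks = {}
    for b, ops in blocks.items():
        killed = set()                   # names already defined in this block
        for target, uses in ops:
            for u in uses:
                if u not in killed:      # value must come from another block
                    global_names.add(u)
            killed.add(target)
            def_blocks.setdefault(target, set()).add(b)
    return global_names, def_blocks


# The B0..B7 example; "..." right-hand sides are modeled as no uses.
blocks = {
    0: [("i", [])],
    1: [("a", []), ("c", [])],
    2: [("b", []), ("c", []), ("d", [])],
    3: [("a", []), ("d", [])],
    4: [("d", [])],
    5: [("c", [])],
    6: [("b", [])],
    7: [("y", ["a", "b"]), ("z", ["c", "d"]), ("i", ["i"])],
}
global_names, def_blocks = find_globals(blocks)
```

Here y and z are defined but never read from another block, so they stay local and receive no Φ-functions.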
SSA Construction Algorithm (Less high-level sketch)
2. Rename variables in a pre-order walk over the dominator tree (uses a counter and a stack per global name)
Starting with the root block, b:
a.) generate unique names for each Φ-function and push them on the appropriate stacks
SSA Construction Algorithm (Less high-level sketch)
• Rename variables (cont'd)
b.) rewrite each operation in the block
    i. Rewrite uses of global names with the current version (from the stack)
    ii. Rewrite definition by creating & pushing new name
c.) fill in Φ-function parameters of successor blocks
d.) recurse on b's children in the dominator tree
e.) <on exit from block b> pop names generated in b from stacks
SSA Construction Algorithm: adding all the details ...

for each global name i
    counter[i] ← 0
call Rename(n0)

NewName(n)
    i ← counter[n]
    counter[n] ← counter[n] + 1
    push n_i onto stack[n]
    return n_i

Rename(b)
    for each Φ-function in b, x ← Φ(…)
        rename x as NewName(x)
    for each operation "x ← y op z" in b
        rewrite y as top(stack[y])
        rewrite z as top(stack[z])
        rewrite x as NewName(x)
    for each successor of b in the CFG
        rewrite appropriate Φ parameters
    for each successor s of b in dom. tree
        Rename(s)
    for each operation "x ← y op z" in b, and each Φ-function "x ← Φ(…)" in b
        pop(stack[x])
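The renaming pass can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the lecture's code: the data layout (`phis` as a list of names, `ops` as `(target, uses)` pairs), the function name `rename`, and the CFG and dominator-tree edges for the B0..B7 example are all assumptions. Only global names are renamed, and names pushed in a block (including Φ targets) are popped on exit.

```python
from collections import defaultdict


def rename(blocks, succs, dom_children, entry, global_names, predefined=()):
    preds = defaultdict(list)            # predecessor order fixes Φ slots
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)
    counter = defaultdict(int)
    stack = defaultdict(list)

    def new_name(n):
        name = f"{n}{counter[n]}"
        counter[n] += 1
        stack[n].append(name)
        return name

    for n in predefined:                 # names assumed defined before entry
        new_name(n)

    phi_target = {b: {} for b in blocks}
    phi_args = {b: {n: [None] * len(preds[b]) for n in blocks[b]["phis"]}
                for b in blocks}
    new_ops = {b: [] for b in blocks}

    def walk(b):
        pushed = []
        for n in blocks[b]["phis"]:      # a.) name each Φ target
            phi_target[b][n] = new_name(n)
            pushed.append(n)
        for target, uses in blocks[b]["ops"]:   # b.) rewrite operations
            uses = [stack[u][-1] if u in global_names else u for u in uses]
            if target in global_names:
                pushed.append(target)
                target = new_name(target)
            new_ops[b].append((target, uses))
        for s in succs[b]:               # c.) fill Φ parameters in successors
            slot = preds[s].index(b)
            for n in blocks[s]["phis"]:
                phi_args[s][n][slot] = stack[n][-1]
        for child in dom_children.get(b, []):   # d.) recurse in dom. tree
            walk(child)
        for n in pushed:                 # e.) pop names generated in b
            stack[n].pop()

    walk(entry)
    return phi_target, phi_args, new_ops, dict(counter)


G = {"a", "b", "c", "d", "i"}
blocks = {
    0: {"phis": [], "ops": [("i", [])]},
    1: {"phis": ["a", "b", "c", "d", "i"], "ops": [("a", []), ("c", [])]},
    2: {"phis": [], "ops": [("b", []), ("c", []), ("d", [])]},
    3: {"phis": [], "ops": [("a", []), ("d", [])]},
    4: {"phis": [], "ops": [("d", [])]},
    5: {"phis": [], "ops": [("c", [])]},
    6: {"phis": ["d", "c"], "ops": [("b", [])]},
    7: {"phis": ["a", "b", "c", "d"],
        "ops": [("y", ["a", "b"]), ("z", ["c", "d"]), ("i", ["i"])]},
}
succs = {0: [1], 1: [2, 3], 2: [7], 3: [4, 5], 4: [6], 5: [6], 6: [7], 7: [1]}
dom_children = {0: [1], 1: [2, 3, 7], 3: [4, 5, 6]}
phi_target, phi_args, new_ops, counters = rename(
    blocks, succs, dom_children, 0, G, predefined=["a", "b", "c", "d"])
```

Run on the example, the sketch reproduces the names shown in the walkthrough slides, for instance a1 ← Φ(a0,a4) in B1 and i2 ← i1+1 in B7. A use of a global name that has no definition on the stack would raise an IndexError, which this sketch does not try to handle.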
Renaming example: before processing B0
• Assume a, b, c, & d defined before B0; i has not been defined
• Counters: a = 1, b = 1, c = 1, d = 1, i = 0
• Stacks: a: a0 | b: b0 | c: c0 | d: d0 | i: (empty)
Renaming example: end of B0
• B0: i0 ← ...
• First Φ-function parameters filled in B1: a ← Φ(a0,a); b ← Φ(b0,b); c ← Φ(c0,c); d ← Φ(d0,d); i ← Φ(i0,i)
• Counters: a = 1, b = 1, c = 1, d = 1, i = 1
• Stacks: a: a0 | b: b0 | c: c0 | d: d0 | i: i0
Renaming example: end of B1
• B1: a1 ← Φ(a0,a); b1 ← Φ(b0,b); c1 ← Φ(c0,c); d1 ← Φ(d0,d); i1 ← Φ(i0,i); a2 ← ...; c2 ← ...
• Counters: a = 3, b = 2, c = 3, d = 2, i = 2
• Stacks: a: a0 a1 a2 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 | i: i0 i1
Renaming example: end of B2
• B2: b2 ← ...; c3 ← ...; d2 ← ...
• First Φ-function parameters filled in B7: a ← Φ(a2,a); b ← Φ(b2,b); c ← Φ(c3,c); d ← Φ(d2,d)
• Counters: a = 3, b = 3, c = 4, d = 3, i = 2
• Stacks: a: a0 a1 a2 | b: b0 b1 b2 | c: c0 c1 c2 c3 | d: d0 d1 d2 | i: i0 i1
Renaming example: before starting B3
• The names pushed in B2 (b2, c3, d2) have been popped
• Counters: a = 3, b = 3, c = 4, d = 3, i = 2
• Stacks: a: a0 a1 a2 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 | i: i0 i1
Renaming example: end of B3
• B3: a3 ← ...; d3 ← ...
• Counters: a = 4, b = 3, c = 4, d = 4, i = 2
• Stacks: a: a0 a1 a2 a3 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 d3 | i: i0 i1
Renaming example: end of B4
• B4: d4 ← ...
• First Φ-function parameters filled in B6: d ← Φ(d4,d); c ← Φ(c2,c)
• Counters: a = 4, b = 3, c = 4, d = 5, i = 2
• Stacks: a: a0 a1 a2 a3 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 d3 d4 | i: i0 i1
Renaming example: end of B5
• B5: c4 ← ...
• Second Φ-function parameters filled in B6: d ← Φ(d4,d3); c ← Φ(c2,c4)
• Counters: a = 4, b = 3, c = 5, d = 5, i = 2
• Stacks: a: a0 a1 a2 a3 | b: b0 b1 | c: c0 c1 c2 c4 | d: d0 d1 d3 | i: i0 i1
Renaming example: end of B6
• B6: d5 ← Φ(d4,d3); c5 ← Φ(c2,c4); b3 ← ...
• Second Φ-function parameters filled in B7: a ← Φ(a2,a3); b ← Φ(b2,b3); c ← Φ(c3,c5); d ← Φ(d2,d5)
• Counters: a = 4, b = 4, c = 6, d = 6, i = 2
• Stacks: a: a0 a1 a2 a3 | b: b0 b1 b3 | c: c0 c1 c2 c5 | d: d0 d1 d3 d5 | i: i0 i1
Renaming example: before B7
• The names pushed in B3 and its dominator-tree children have been popped
• Counters: a = 4, b = 4, c = 6, d = 6, i = 2
• Stacks: a: a0 a1 a2 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 | i: i0 i1
Renaming example: end of B7
• B7: a4 ← Φ(a2,a3); b4 ← Φ(b2,b3); c6 ← Φ(c3,c5); d6 ← Φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1
• Second Φ-function parameters filled in B1: a1 ← Φ(a0,a4); b1 ← Φ(b0,b4); c1 ← Φ(c0,c6); d1 ← Φ(d0,d6); i1 ← Φ(i0,i2)
• Counters: a = 5, b = 5, c = 7, d = 7, i = 3
• Stacks: a: a0 a1 a2 a4 | b: b0 b1 b4 | c: c0 c1 c2 c6 | d: d0 d1 d6 | i: i0 i1 i2
After renaming
B0: i0 ← ...
B1: a1 ← Φ(a0,a4); b1 ← Φ(b0,b4); c1 ← Φ(c0,c6); d1 ← Φ(d0,d6); i1 ← Φ(i0,i2); a2 ← ...; c2 ← ...
B2: b2 ← ...; c3 ← ...; d2 ← ...
B3: a3 ← ...; d3 ← ...
B4: d4 ← ...
B5: c4 ← ...
B6: d5 ← Φ(d4,d3); c5 ← Φ(c2,c4); b3 ← ...
B7: a4 ← Φ(a2,a3); b4 ← Φ(b2,b3); c6 ← Φ(c3,c5); d6 ← Φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1
• Semi-pruned SSA form
• We're done …
Semi-pruned: only names live in 2 or more blocks are "global names".
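SSA Deconstruction
At some point, we need executable code
• Few machines implement Φ operations
• Need to fix up the flow of values
Basic idea
• Insert copies in the Φ-function's predecessors
• Simple algorithm
• Works in most cases
• Adds lots of copies
• Many of them coalesce away
[Figure: before, the predecessors compute x10 and x11 and the join block holds x17 ← Φ(x10,x11) followed by uses of x17; after, the predecessors end with the copies x17 ← x10 and x17 ← x11, and the join block only uses x17]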
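The copy-insertion idea can be sketched as follows. This is a naive sketch matching the "simple algorithm" bullet, with hypothetical names (`remove_phis`, the block labels P1, P2, J) and an assumed layout in which each block's `ops` list excludes its final branch, so appending a copy places it before the branch. It does not handle the cases the slide alludes to with "works in most cases".

```python
def remove_phis(blocks, preds):
    """Replace each Φ with one copy at the end of every predecessor.

    blocks: {label: {"phis": [(target, [args])], "ops": [str, ...]}}
    preds:  {label: [predecessor labels]}, in Φ-argument order.
    The blocks are modified in place and also returned.
    """
    for label, body in blocks.items():
        for target, args in body["phis"]:
            for p, arg in zip(preds[label], args):
                blocks[p]["ops"].append(f"{target} ← {arg}")
        body["phis"] = []
    return blocks


# The x17 example from the slide; "..." stands for elided code.
blocks = {
    "P1": {"phis": [], "ops": ["x10 ← ..."]},
    "P2": {"phis": [], "ops": ["x11 ← ..."]},
    "J": {"phis": [("x17", ["x10", "x11"])], "ops": ["... ← x17"]},
}
preds = {"P1": [], "P2": [], "J": ["P1", "P2"]}
remove_phis(blocks, preds)
```

After the call, each predecessor ends with a copy into x17 and the join block no longer contains a Φ-function. A later copy-coalescing pass would be expected to remove many of these copies.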
Constant Propagation
• Along every path to point p, variable v has the same "known" value
Constant Prop Example
• Set Boundary Conditions
[Figure: CFG with nodes 1 to 4; the (X, Y, Z) state at the entry and at every node starts as ⊥⊥⊥]
Constant Prop Example
• We are propagating information through each node.
[Figure: CFG with nodes 1 to 4, annotated with (X, Y, Z) states]
• out1 = 1⊥⊥
• out2 = 023
• out3 = 12⊥
• out4 = ⊤23
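The way two states combine at a merge point can be sketched with a small meet function. The sketch follows the slide's notation, where ⊥ means nothing is known yet and ⊤ means the variable is not a constant. It assumes that node 4 merges the outputs of nodes 2 and 3 and assigns none of X, Y, Z itself, which the extracted text does not confirm. The names `meet` and `meet_states` are hypothetical.

```python
BOT = "⊥"   # no information yet (slide notation)
TOP = "⊤"   # conflicting values: not a constant (slide notation)


def meet(a, b):
    """Combine the facts about one variable arriving along two paths."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def meet_states(s, t):
    """Combine two (X, Y, Z) states element by element."""
    return tuple(meet(a, b) for a, b in zip(s, t))


out2 = (0, 2, 3)
out3 = (1, 2, BOT)
in4 = meet_states(out2, out3)
```

X arrives as 0 on one path and 1 on the other, so it becomes ⊤. Y is 2 on both paths and stays 2. Z is 3 on one path and unknown on the other, so it stays 3. That gives ⊤23, the value shown for out4.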
Sparse Constant Prop Example
• Consider what happens when a variable gets updated during constant prop using a worklist
• Put all successors of the CFG node into the worklist
• But what if "x" is not used in the immediate successors?
• A lot of time is wasted propagating data
• The update of "x" only matters at the last node
Sparse Constant Prop Example
• Instead of propagating data along the CFG, what if we just propagate along use-def edges?
• When x is updated
• propagate data directly to the last node
• bypasses all intermediate nodes!
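Propagating along def-use edges can be sketched on a tiny SSA program. When a name's value changes, only the instructions that use that name are revisited. This is a simplified illustration under stated assumptions: the instruction encoding, the three instruction kinds (`const`, `add`, `phi`), and all function names are hypothetical, and the sketch ignores branch feasibility. It uses the slide's lattice notation, with ⊥ for "nothing known yet" and ⊤ for "not a constant".

```python
from collections import defaultdict

BOT, TOP = "⊥", "⊤"


def meet(a, b):
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def evaluate(ins, value):
    kind, *args = ins
    if kind == "const":
        return args[0]
    vals = [value[a] for a in args]
    if kind == "phi":                    # merge the incoming definitions
        result = BOT
        for v in vals:
            result = meet(result, v)
        return result
    if TOP in vals:                      # kind == "add"
        return TOP
    if BOT in vals:
        return BOT
    return vals[0] + vals[1]


def sparse_const_prop(instrs):
    """instrs: {ssa_name: ("const", v) | ("add", x, y) | ("phi", x, y)}"""
    value = {n: BOT for n in instrs}
    users = defaultdict(list)            # def -> instructions that use it
    for n, (kind, *args) in instrs.items():
        if kind != "const":
            for a in args:
                users[a].append(n)
    worklist = list(instrs)
    while worklist:
        n = worklist.pop()
        new = evaluate(instrs[n], value)
        if new != value[n]:
            value[n] = new
            worklist.extend(users[n])    # revisit only the uses of n
    return value


program = {
    "x1": ("const", 1),
    "x2": ("const", 0),
    "y1": ("const", 2),
    "x3": ("phi", "x1", "x2"),
    "y2": ("phi", "y1", "y1"),
    "z1": ("add", "y1", "y2"),
    "w1": ("add", "x3", "y2"),
}
result = sparse_const_prop(program)
```

In this example y2 and z1 are found to be constants, while x3 merges two different constants and so x3 and its user w1 end up as ⊤.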
Problems with U-D chains
• Can be expensive to represent
• Each use can have multiple defs
• Makes it difficult to keep u/d information accurate
• Multiple defs make optimization harder
• Use SSA!
SSA vs U-D Chains
• We have 16 u-d chains
• In SSA form: place a Φ-node in the middle
Problems with u-d chains
• What happens if we statically know the direction of a branch?
• We do not need to propagate information along the untaken path
• Easy to do with CFGs
• With U-D chains, it is hard to tell which definitions to ignore
U-D with SSA
• SSA form shortens u-d chains
• Chains terminate at merge points, rather than crossing them
• Can simply ignore information merged from un-taken branches
• Much easier to account for irrelevant information