OPTIMIZATION
Chuen-Liang Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, TAIWAN
Introduction
• local optimization
  – works within a single basic block
  – may be performed together with code generation
  – e.g., peephole optimization
• global optimization
  – spans more than one basic block
  – e.g., loop optimization, data flow analysis (a technique)
Peephole optimization (1/2)
• modifies particular patterns in a small window (the peephole; 2-3 instructions)
• may be applied to intermediate or target code
• constant folding (evaluate constant expressions in advance)
  – ( +, Lit1, Lit2, Result ) ⇒ ( :=, Lit1+Lit2, Result )
  – ( :=, Lit1, Result1 ), ( +, Lit2, Result1, Result2 ) ⇒ ( :=, Lit1, Result1 ), ( :=, Lit1+Lit2, Result2 )
• strength reduction (replace slow operations with faster equivalents)
  – ( *, Operand, 2, Result ) ⇒ ( ShiftLeft, Operand, 1, Result )
  – ( *, Operand, 4, Result ) ⇒ ( ShiftLeft, Operand, 2, Result )
• null sequences (delete useless operations)
  – ( +, Operand, 0, Result ) ⇒ ( :=, Operand, Result )
  – ( *, Operand, 1, Result ) ⇒ ( :=, Operand, Result )
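The quadruple rules on this slide can be sketched as a small rewriting pass. This is only an illustration: the tuple layout `(op, arg1, arg2, result)`, the use of `None` for the unused field of `:=`, and the function names are assumptions of the sketch, not something the slides define.

```python
# Hypothetical quadruple layout: (op, arg1, arg2, result); ":=" leaves arg2 as None.

def is_lit(x):
    """Literals are modelled as Python ints; operands and results are strings."""
    return isinstance(x, int)

def peephole_quad(q):
    """Apply one single-instruction rule to a quadruple, if any rule matches."""
    op, a, b, res = q
    if op == "+" and is_lit(a) and is_lit(b):
        return (":=", a + b, None, res)               # constant folding
    if op == "*" and is_lit(b) and b > 1 and b & (b - 1) == 0:
        # strength reduction: multiplying by 2**k becomes a left shift by k
        return ("ShiftLeft", a, b.bit_length() - 1, res)
    if (op == "+" and b == 0) or (op == "*" and b == 1):
        return (":=", a, None, res)                   # null sequence
    return q

def optimize(quads):
    """One left-to-right pass with a two-instruction window."""
    out = []
    for q in quads:
        prev = out[-1] if out else None
        op, a, b, res = q
        # ( :=, Lit1, R1 ), ( +, Lit2, R1, R2 )  =>  second becomes ( :=, Lit1+Lit2, R2 )
        if (prev and prev[0] == ":=" and is_lit(prev[1])
                and op == "+" and is_lit(a) and b == prev[3]):
            q = (":=", prev[1] + a, None, res)
        out.append(peephole_quad(q))
    return out
```

For instance, `optimize([(":=", 5, None, "T1"), ("+", 3, "T1", "T2")])` keeps the first quadruple and turns the second into an assignment of the literal 8 to `T2`.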
Peephole optimization (2/2)
• combine operations (replace several operations with one equivalent)
  – Load A, Rj; Load A+1, Rj+1 ⇒ DoubleLoad A, Rj
  – BranchZero L1, R1; Branch L2; L1: ⇒ BranchNotZero L2, R1
  – Subtract #1, R1; BranchZero L1, R1 ⇒ SubtractOneBranch L1, R1
• algebraic laws (use algebraic laws to simplify or reorder instructions)
  – ( +, Lit, Operand, Result ) ⇒ ( +, Operand, Lit, Result )
  – ( -, 0, Operand, Result ) ⇒ ( Negate, Operand, Result )
• special case instructions (use instructions designed for special operand cases)
  – Subtract #1, R1 ⇒ Decrement R1
  – Add #1, R1 ⇒ Increment R1
  – Load #0, R1; Store A, R1 ⇒ Clear A
• address mode operations (use address modes to simplify code)
  – Load A, R1; Add 0(R1), R2 ⇒ Add @A, R2
  – Subtract #2, R1; Clear 0(R1) ⇒ Clear -(R1)
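A few of the target-code patterns above can be sketched as a sliding-window matcher. The instruction encoding (tuples such as `("BranchZero", "L1", "R1")` and `("Label", "L1")`) is hypothetical, and only three rule families are covered. Unlike the slide, the sketch keeps the label `L1` after rewriting the branch pair, on the assumption that other branches may still target it.

```python
def peephole_target(code):
    """Single left-to-right pass; longer patterns are tried before shorter ones."""
    out, i = [], 0
    while i < len(code):
        w = code[i:i + 3]                      # the peephole: up to 3 instructions
        if (len(w) == 3 and w[0][0] == "BranchZero" and w[1][0] == "Branch"
                and w[2] == ("Label", w[0][1])):
            # BranchZero L1, R; Branch L2; L1:  =>  BranchNotZero L2, R (label kept)
            out += [("BranchNotZero", w[1][1], w[0][2]), w[2]]
            i += 3
        elif (len(w) >= 2 and w[0][:2] == ("Subtract", "#1")
                and w[1][0] == "BranchZero" and w[1][2] == w[0][2]):
            # Subtract #1, R; BranchZero L, R  =>  SubtractOneBranch L, R
            out.append(("SubtractOneBranch", w[1][1], w[0][2]))
            i += 2
        elif w[0][:2] == ("Subtract", "#1"):
            out.append(("Decrement", w[0][2]))  # special case instruction
            i += 1
        elif w[0][:2] == ("Add", "#1"):
            out.append(("Increment", w[0][2]))  # special case instruction
            i += 1
        else:
            out.append(w[0])
            i += 1
    return out
```

Trying the two-instruction `SubtractOneBranch` pattern before the one-instruction `Decrement` pattern matters: otherwise the shorter rule would fire first and hide the better combination.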
Loop optimization (1/6)
• motivated by the 90/10 rule (about 90% of execution time is typically spent in about 10% of the code, mostly loops)
• example --

    for I in 1..100 loop
      for J in 1..100 loop
        for K in 1..100 loop
          A(I)(J)(K) := ( I * J ) * K;
        end loop;
      end loop;
    end loop;

⇒ loop invariant expression factorization

    for I in 1..100 loop
      for J in 1..100 loop
        T1 := Adr( A(I)(J) );
        T2 := I * J;
        for K in 1..100 loop
          T1(K) := T2 * K;
        end loop;
      end loop;
    end loop;

⇒ loop invariant expression factorization

    for I in 1..100 loop
      T3 := Adr( A(I) );
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := I * J;
        for K in 1..100 loop
          T1(K) := T2 * K;
        end loop;
      end loop;
    end loop;
Loop optimization (2/6)

    for I in 1..100 loop
      T3 := Adr( A(I) );
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := I * J;
        for K in 1..100 loop
          T1(K) := T2 * K;
        end loop;
      end loop;
    end loop;

⇒ induction variable elimination

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;                  -- Initial value of I*J
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := T4;               -- T4 holds I*J
        T5 := T2;               -- Initial value of T2*K
        for K in 1..100 loop
          T1(K) := T5;          -- T5 holds T2*K = I*J*K
          T5 := T5 + T2;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;
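The induction variable elimination on this slide replaces each multiplication by a running sum. A quick way to convince oneself that the two versions agree is to run both. The sketch below does that in Python with a small bound `N` instead of 100 and a dictionary standing in for the array `A`; both choices are assumptions made only to keep the check short.

```python
N = 4  # small bound for a quick check; the slides use 100

def with_multiplication():
    """Version before the transformation: T2 * K is recomputed every iteration."""
    out = {}
    for I in range(1, N + 1):
        for J in range(1, N + 1):
            T2 = I * J
            for K in range(1, N + 1):
                out[(I, J, K)] = T2 * K
    return out

def with_induction_variables():
    """Version after the transformation: T4 tracks I*J and T5 tracks T2*K."""
    out = {}
    for I in range(1, N + 1):
        T4 = I                       # initial value of I*J (J = 1)
        for J in range(1, N + 1):
            T2 = T4                  # T4 holds I*J
            T5 = T2                  # initial value of T2*K (K = 1)
            for K in range(1, N + 1):
                out[(I, J, K)] = T5  # T5 holds T2*K = I*J*K
                T5 = T5 + T2
            T4 = T4 + I
    return out
```

Both functions fill the same table, so the additions `T5 := T5 + T2` and `T4 := T4 + I` really do stand in for the multiplications they replace.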
Loop optimization (3/6)

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := T4;
        T5 := T2;
        for K in 1..100 loop
          T1(K) := T5;
          T5 := T5 + T2;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

⇒ copy propagation

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;                  -- Initial value of I*J
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T5 := T4;               -- Initial value of T4*K
        for K in 1..100 loop
          T1(K) := T5;          -- T5 holds T4*K = I*J*K
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;
Loop optimization (4/6)

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T5 := T4;
        for K in 1..100 loop
          T1(K) := T5;
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

⇒ subscripting code expansion

    for I in 1..100 loop
      T3 := A0 + ( 10000 * I ) - 10000;
      T4 := I;                  -- Initial value of I*J
      for J in 1..100 loop
        T1 := T3 + ( 100 * J ) - 100;
        T5 := T4;               -- Initial value of T4*K
        for K in 1..100 loop
          (T1+K-1)↑ := T5;      -- store through the address; T5 holds T4*K = I*J*K
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;
Loop optimization (5/6)

    for I in 1..100 loop
      T3 := A0 + ( 10000 * I ) - 10000;
      T4 := I;
      for J in 1..100 loop
        T1 := T3 + ( 100 * J ) - 100;
        T5 := T4;
        for K in 1..100 loop
          (T1+K-1)↑ := T5;      -- store through the address
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

⇒ induction variable elimination

    T6 := A0;                   -- Initial value of Adr(A(I))
    for I in 1..100 loop
      T3 := T6;
      T4 := I;                  -- Initial value of I*J
      T7 := T3;                 -- Initial value of Adr(A(I)(J))
      for J in 1..100 loop
        T1 := T7;
        T5 := T4;               -- Initial value of T4*K
        T8 := T1;               -- Initial value of Adr(A(I)(J)(K))
        for K in 1..100 loop
          T8↑ := T5;            -- store through T8; T5 holds T4*K = I*J*K
          T5 := T5 + T4;
          T8 := T8 + 1;
        end loop;
        T4 := T4 + I;
        T7 := T7 + 100;
      end loop;
      T6 := T6 + 10000;
    end loop;
Loop optimization (6/6)

    T6 := A0;
    for I in 1..100 loop
      T3 := T6;
      T4 := I;
      T7 := T3;
      for J in 1..100 loop
        T1 := T7;
        T5 := T4;
        T8 := T1;
        for K in 1..100 loop
          T8↑ := T5;            -- store through T8
          T5 := T5 + T4;
          T8 := T8 + 1;
        end loop;
        T4 := T4 + I;
        T7 := T7 + 100;
      end loop;
      T6 := T6 + 10000;
    end loop;

⇒ copy propagation

    T6 := A0;                   -- Initial value of Adr(A(I))
    for I in 1..100 loop
      T4 := I;                  -- Initial value of I*J
      T7 := T6;                 -- Initial value of Adr(A(I)(J))
      for J in 1..100 loop
        T5 := T4;               -- Initial value of T4*K
        T8 := T7;               -- Initial value of Adr(A(I)(J)(K))
        for K in 1..100 loop
          T8↑ := T5;            -- store through T8; T5 holds T4*K = I*J*K
          T5 := T5 + T4;
          T8 := T8 + 1;
        end loop;
        T4 := T4 + I;
        T7 := T7 + 100;
      end loop;
      T6 := T6 + 10000;
    end loop;
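The final version contains no multiplications and no subscript arithmetic, only additions. As a sanity check, the sketch below compares it with the original triple loop on a simulated flat memory. It assumes row-major layout with one storage unit per element (which is what the strides 10000, 100 and 1 on the slides suggest), uses a small bound `N` in place of 100, and picks an arbitrary base address `A0`; the dictionary `mem` is a hypothetical stand-in for storage.

```python
N = 4        # small bound for a quick check; the slides use 100
A0 = 1000    # hypothetical base address of A

def reference():
    """Original loop nest: explicit subscripting and multiplication."""
    mem = {}
    for I in range(1, N + 1):
        for J in range(1, N + 1):
            for K in range(1, N + 1):
                addr = A0 + (I - 1) * N * N + (J - 1) * N + (K - 1)
                mem[addr] = (I * J) * K
    return mem

def optimized():
    """Final version from the slides, with 100 and 10000 replaced by N and N*N."""
    mem = {}
    T6 = A0                          # initial value of Adr(A(I))
    for I in range(1, N + 1):
        T4 = I                       # initial value of I*J
        T7 = T6                      # initial value of Adr(A(I)(J))
        for J in range(1, N + 1):
            T5 = T4                  # initial value of T4*K
            T8 = T7                  # initial value of Adr(A(I)(J)(K))
            for K in range(1, N + 1):
                mem[T8] = T5         # store through T8; T5 holds I*J*K
                T5 = T5 + T4
                T8 = T8 + 1
            T4 = T4 + I
            T7 = T7 + N
        T6 = T6 + N * N
    return mem
```

The two functions write the same values to the same addresses, so the chain of transformations preserved the meaning of the loop.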
Global data flow analysis (1/2)
• gathers information about the global structure of a program, not only about a single basic block
• data flow graph
  – node -- basic block
• example --

    Read ( Limit ) ;
    for I in 1 .. Limit loop
      Read ( J ) ;
      if I = 1 then
        Sum := J ;
      else
        Sum := Sum + J ;
      end if ;
    end loop ;
    Write ( Sum ) ;

• basic blocks of the example (from the flow graph figure; the edges are not reproduced here)

    0: Read(Limit); I := 1
    1: I > Limit
    2: Read(J); I = 1
    3: Sum := J
    4: Sum := Sum + J
    5: I := I + 1
    6: Write(Sum)
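A data flow graph like this one is easy to hold as adjacency lists. In the sketch below the block numbers and contents come from the slide, while the edges are an assumption inferred from the program text (loop test in block 1, `if` test in block 2, join in block 5), since the drawing itself is not available. `SUCC`, `BLOCKS` and `predecessors` are names chosen for this illustration.

```python
# Contents of each basic block, as listed on the slide.
BLOCKS = {
    0: ["Read(Limit)", "I := 1"],
    1: ["I > Limit"],
    2: ["Read(J)", "I = 1"],
    3: ["Sum := J"],
    4: ["Sum := Sum + J"],
    5: ["I := I + 1"],
    6: ["Write(Sum)"],
}

# Successor lists; inferred from the program, not read off the figure.
SUCC = {0: [1], 1: [2, 6], 2: [3, 4], 3: [5], 4: [5], 5: [1], 6: []}

def predecessors(succ):
    """Invert the successor relation: P(b) is needed by forward-flow analyses."""
    pred = {b: [] for b in succ}
    for b, targets in succ.items():
        for t in targets:
            pred[t].append(b)
    return pred
```

Forward-flow analyses read the predecessor lists and backward-flow analyses read the successor lists, so it is convenient to keep both.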
Global data flow analysis (2/2)
• classification of data flow analyses
  – any-path vs. all-path
  – forward-flow vs. backward-flow
  – the choice depends on the type of information needed
• data flow equations
  – each basic block has 4 sets, IN, OUT, KILLED, and GEN, whose relationships are specified by data flow equations
  – the equations for all basic blocks must be satisfied simultaneously
  – the solution may not be unique
• solution methods
  – iterative method
  – structure method
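The remark that the solution may not be unique can be made concrete with a two-block graph in which block 0 loops back to itself. The sketch below uses the any-path backward-flow equations that appear later in these slides (OUT(b) is the union of IN over the successors, IN(b) = GEN(b) ∪ (OUT(b) - KILLED(b))). The graph, the variable name `x` and the function names are made up for the illustration. Starting the iteration from empty sets is one common choice for any-path problems and yields the smallest solution.

```python
SUCC = {0: [0, 1], 1: []}            # block 0 loops to itself, then exits to block 1
GEN = {0: set(), 1: set()}           # neither block uses x
KILLED = {0: set(), 1: set()}        # neither block defines x

def satisfies(IN, OUT):
    """Check the any-path backward-flow equations for every block."""
    for b in SUCC:
        expected_out = set().union(*(IN[s] for s in SUCC[b]))
        if OUT[b] != expected_out or IN[b] != GEN[b] | (OUT[b] - KILLED[b]):
            return False
    return True

def iterate():
    """Iterative method: start from empty sets and repeat until nothing changes."""
    IN = {b: set() for b in SUCC}
    OUT = {b: set() for b in SUCC}
    changed = True
    while changed:
        changed = False
        for b in SUCC:
            new_out = set().union(*(IN[s] for s in SUCC[b]))
            new_in = GEN[b] | (new_out - KILLED[b])
            if (new_in, new_out) != (IN[b], OUT[b]):
                IN[b], OUT[b], changed = new_in, new_out, True
    return IN, OUT
```

Both "x is live around the loop" and "x is not live anywhere" satisfy the equations, because the self-loop lets block 0 justify its own IN set. The iteration from empty sets returns the second, smaller solution, which is the useful one here since nothing ever uses `x`.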
Any-path forward-flow analysis
(figure: basic block b with its predecessors p and successors s)
• example -- uninitialized variable (used but undefined)
  – IN -- uninitialized just before this basic block
  – OUT -- uninitialized at the end of this basic block (the block itself included)
  – KILLED -- defined
  – GEN -- out of scope
• data flow equations --
  – IN(b) = ∪ OUT(i), over all i ∈ P(b)
  – OUT(b) = GEN(b) ∪ ( IN(b) - KILLED(b) )
  – IN(first) = universal set
• the initial condition, i.e., IN(first), depends on the particular analysis
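These equations can be solved iteratively on the example program (Read(Limit) ... Write(Sum)) from the earlier slide. The sketch below is one possible encoding, not the slides' own solution: the predecessor lists are inferred from the program text, KILLED lists the variables each block defines, and GEN is empty because no variable goes out of scope in this program.

```python
# Predecessor lists P(b); the edges are inferred from the example program (assumption).
PRED = {0: [], 1: [0, 5], 2: [1], 3: [2], 4: [2], 5: [3, 4], 6: [1]}
# KILLED(b): variables defined in block b.
KILLED = {0: {"Limit", "I"}, 1: set(), 2: {"J"}, 3: {"Sum"},
          4: {"Sum"}, 5: {"I"}, 6: set()}
GEN = {b: set() for b in PRED}        # nothing goes out of scope here
UNIVERSE = {"Limit", "I", "J", "Sum"}

def uninitialized():
    """Any-path forward flow: iterate until no OUT set changes."""
    IN = {b: set() for b in PRED}
    OUT = {b: set() for b in PRED}
    IN[0] = set(UNIVERSE)             # IN(first) = universal set
    changed = True
    while changed:
        changed = False
        for b in PRED:
            if PRED[b]:               # IN(b) = union of OUT(i) over predecessors
                IN[b] = set().union(*(OUT[p] for p in PRED[b]))
            new_out = GEN[b] | (IN[b] - KILLED[b])
            if new_out != OUT[b]:
                OUT[b], changed = new_out, True
    return IN, OUT

IN, OUT = uninitialized()
```

With this encoding `Sum` is in IN(6), so `Write(Sum)` may read an uninitialized variable (this really happens when `Limit` is less than 1). `Sum` is also in IN(4): the analysis follows any path and cannot see that the test `I = 1` sends the first iteration through block 3, so it conservatively flags the use in block 4 as well.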
Any-path backward-flow analysis
(figure: basic block b with its predecessors p and successors s)
• example -- live variable
  – OUT -- will be used just after this basic block
  – IN -- will be used from the start of this basic block onward (the block itself included)
  – KILLED -- defined
  – GEN -- used
• data flow equations --
  – OUT(b) = ∪ IN(i), over all i ∈ S(b)
  – IN(b) = GEN(b) ∪ ( OUT(b) - KILLED(b) )
  – OUT(last) = ∅
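As an illustration, the live variable equations can be run on the example program (Read(Limit) ... Write(Sum)) from the earlier slide. In this sketch the successor lists are inferred from the program text, GEN(b) is taken to mean the variables used in b before any definition in b, and KILLED(b) the variables defined in b. These readings are assumptions of the sketch.

```python
# Successor lists S(b); the edges are inferred from the example program (assumption).
SUCC = {0: [1], 1: [2, 6], 2: [3, 4], 3: [5], 4: [5], 5: [1], 6: []}
# GEN(b): variables used in b before being defined in b.
GEN = {0: set(), 1: {"I", "Limit"}, 2: {"I"}, 3: {"J"},
       4: {"Sum", "J"}, 5: {"I"}, 6: {"Sum"}}
# KILLED(b): variables defined in b.
KILLED = {0: {"Limit", "I"}, 1: set(), 2: {"J"}, 3: {"Sum"},
          4: {"Sum"}, 5: {"I"}, 6: set()}

def live_variables():
    """Any-path backward flow: iterate until no IN set changes."""
    IN = {b: set() for b in SUCC}
    OUT = {b: set() for b in SUCC}     # OUT(last) stays empty: block 6 has no successors
    changed = True
    while changed:
        changed = False
        for b in sorted(SUCC, reverse=True):   # visiting blocks backwards converges faster
            OUT[b] = set().union(*(IN[s] for s in SUCC[b]))
            new_in = GEN[b] | (OUT[b] - KILLED[b])
            if new_in != IN[b]:
                IN[b], changed = new_in, True
    return IN, OUT

IN, OUT = live_variables()
```

Under these assumptions `Sum` is live on entry to block 0, i.e. some path reaches a use of `Sum` without passing a definition first, which matches what the uninitialized variable analysis reports for the same program.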
All-path forward-flow analysis
(figure: basic block b with its predecessors p and successors s)
• example -- available expression (to detect redundant computation)
  – IN -- already computed just before this basic block
  – OUT -- already computed at the end of this basic block (the block itself included)
  – KILLED -- one of the operands is redefined
  – GEN -- computed subexpression
• data flow equations --
  – IN(b) = ∩ OUT(i), over all i ∈ P(b)
  – OUT(b) = GEN(b) ∪ ( IN(b) - KILLED(b) )
  – IN(first) = ∅
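A small diamond-shaped graph shows how the intersection works. The four blocks below are hypothetical: block 0 computes `a+b`, block 1 computes `c*d`, block 2 redefines `a` and then computes `c*d`, and block 3 is the join. Starting every OUT set at the universal set is a common convention for all-path problems and is an assumption of this sketch; the slides do not specify the starting values.

```python
PRED = {0: [], 1: [0], 2: [0], 3: [1, 2]}            # hypothetical diamond
GEN = {0: {"a+b"}, 1: {"c*d"}, 2: {"c*d"}, 3: set()}
KILLED = {0: set(), 1: set(), 2: {"a+b"}, 3: set()}  # block 2 redefines a
UNIVERSE = {"a+b", "c*d"}

def available_expressions():
    """All-path forward flow: iterate until no OUT set changes."""
    IN = {b: set() for b in PRED}                    # IN(first) = empty set
    OUT = {b: set(UNIVERSE) for b in PRED}           # optimistic start (assumption)
    changed = True
    while changed:
        changed = False
        for b in PRED:
            if PRED[b]:                              # IN(b) = intersection over predecessors
                IN[b] = set.intersection(*(OUT[p] for p in PRED[b]))
            new_out = GEN[b] | (IN[b] - KILLED[b])
            if new_out != OUT[b]:
                OUT[b], changed = new_out, True
    return IN, OUT

IN, OUT = available_expressions()
```

At the join only `c*d` is available: it is computed on both branches, while `a+b` is killed on the path through block 2. A recomputation of `c*d` in block 3 would therefore be redundant, but a recomputation of `a+b` would not.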
All-path backward-flow analysis
(figure: basic block b with its predecessors p and successors s)
• example -- very busy expression (worth keeping in a register)
  – OUT -- will be used on all paths just after this basic block
  – IN -- will be used on all paths from the start of this basic block onward (the block itself included)
  – KILLED -- defined
  – GEN -- used
• data flow equations --
  – OUT(b) = ∩ IN(i), over all i ∈ S(b)
  – IN(b) = GEN(b) ∪ ( OUT(b) - KILLED(b) )
  – OUT(last) = ∅
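The same kind of diamond illustrates very busy expressions. In this hypothetical graph block 0 branches to blocks 1 and 2, which both flow into the exit block 3. Block 1 uses `a+b` and `c*d`, block 2 uses only `a+b`, and no block redefines an operand. Starting every IN set at the universal set is again an assumed convention for all-path problems.

```python
SUCC = {0: [1, 2], 1: [3], 2: [3], 3: []}            # hypothetical diamond
GEN = {0: set(), 1: {"a+b", "c*d"}, 2: {"a+b"}, 3: set()}
KILLED = {b: set() for b in SUCC}                    # no operand is redefined
UNIVERSE = {"a+b", "c*d"}

def very_busy_expressions():
    """All-path backward flow: iterate until no IN set changes."""
    OUT = {b: set() for b in SUCC}                   # OUT(last) = empty set
    IN = {b: set(UNIVERSE) for b in SUCC}            # optimistic start (assumption)
    changed = True
    while changed:
        changed = False
        for b in sorted(SUCC, reverse=True):
            if SUCC[b]:                              # OUT(b) = intersection over successors
                OUT[b] = set.intersection(*(IN[s] for s in SUCC[b]))
            new_in = GEN[b] | (OUT[b] - KILLED[b])
            if new_in != IN[b]:
                IN[b], changed = new_in, True
    return IN, OUT

IN, OUT = very_busy_expressions()
```

Only `a+b` is very busy at the end of block 0, because it is used on every path that leaves the block. `c*d` is used on one branch only, so computing it early could be wasted work.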
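Structure method of data flow solution (1/4)
(figures: S1 followed by S2; S1 and S2 as alternative branches)
(I, O, K, G abbreviate IN, OUT, KILLED, GEN; subscripts 1 and 2 refer to S1 and S2)
• for backward analysis -- exchange the roles of I and O (I ⇔ O)
• for forward analysis
• sequence (S1 followed by S2)
    I = I1
    O = ( I2 - K2 ) ∪ G2
      = ( ((I1 - K1) ∪ G1) - K2 ) ∪ G2
      = ( I - (K1 ∪ K2) ) ∪ (G1 - K2) ∪ G2
    K = K1 ∪ K2
    G = ( G1 - K2 ) ∪ G2
• branch (S1 or S2), any path
    I = I1 = I2
    O = O1 ∪ O2
      = ((I1 - K1) ∪ G1) ∪ ((I2 - K2) ∪ G2)
      = ( I - (K1 ∩ K2) ) ∪ (G1 ∪ G2)
    K = K1 ∩ K2
    G = G1 ∪ G2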
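The point of these formulas is that a composite statement can be summarized by a single (K, G) pair and then treated like one block with O = (I - K) ∪ G. A minimal sketch, with made-up set contents and hypothetical function names, that applies the summaries and compares them with a block-by-block computation:

```python
def apply(I, K, G):
    """Transfer function of a block or region: O = (I - K) ∪ G."""
    return (I - K) | G

def seq(s1, s2):
    """Summary (K, G) of S1 followed by S2."""
    (K1, G1), (K2, G2) = s1, s2
    return (K1 | K2, (G1 - K2) | G2)

def branch_any(s1, s2):
    """Summary (K, G) of a branch between S1 and S2 for an any-path analysis."""
    (K1, G1), (K2, G2) = s1, s2
    return (K1 & K2, G1 | G2)

# Made-up example sets, only to exercise the formulas.
I = {"a", "b", "c"}
S1 = ({"a"}, {"d"})          # K1, G1
S2 = ({"b", "d"}, {"e"})     # K2, G2

seq_direct = apply(apply(I, *S1), *S2)         # run S1, then S2
seq_summary = apply(I, *seq(S1, S2))           # run the combined summary once
branch_direct = apply(I, *S1) | apply(I, *S2)  # union of the two branch outputs
branch_summary = apply(I, *branch_any(S1, S2))
```

In both cases the summary gives the same OUT set as the direct computation, which is what allows the structure method to work bottom-up over the syntax of the program instead of iterating over the flow graph.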
Structure method of data flow solution (2/4)
(figures: S1 and S2 as alternative branches; loops built from S1 and S2)
• branch (S1 or S2), all path
    I = I1 = I2
    O = O1 ∩ O2
      = ((I1 - K1) ∪ G1) ∩ ((I2 - K2) ∪ G2)
      = ···
      = ( I - (K1 ∪ K2) ) ∪ (G1 ∩ G2)
    K = K1 ∪ K2
    G = G1 ∩ G2
• loop of S1 and S2
  – any path
      K = K1
      G = ( G2 - K1 ) ∪ G1
  – all path
      K = K1 ∪ K2
      G = G1
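These identities can be checked by brute force over a tiny universe. The sketch below makes two assumptions that the slide does not state. First, each block's K and G are taken to be disjoint, which is what makes the all-path equalities exact (without it the right-hand side is still a subset of the true result). Second, the loop is assumed to have the shape S1 (S2 S1)*, i.e. S1 runs once and then S2; S1 repeats zero or more times; this shape is inferred from the formulas because the figure is not available.

```python
from itertools import combinations, product

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def apply(I, K, G):
    """Transfer function of a block or region: O = (I - K) ∪ G."""
    return (I - K) | G

def branch_all(s1, s2):
    (K1, G1), (K2, G2) = s1, s2
    return (K1 | K2, G1 & G2)

def loop_any(s1, s2):
    (K1, G1), (K2, G2) = s1, s2
    return (K1, (G2 - K1) | G1)

def loop_all(s1, s2):
    (K1, G1), (K2, G2) = s1, s2
    return (K1 | K2, G1)

def unrolled_outputs(I, s1, s2, max_iters=3):
    """Outputs of S1, S1 S2 S1, S1 S2 S1 S2 S1, ... computed block by block."""
    cur = apply(I, *s1)
    outs = [cur]
    for _ in range(max_iters):
        cur = apply(apply(cur, *s2), *s1)
        outs.append(cur)
    return outs

def check(universe=("x", "y")):
    """Return True if every summary agrees with the direct computation."""
    sets = subsets(universe)
    summaries = [(K, G) for K in sets for G in sets if not (K & G)]  # K ∩ G = ∅
    for I, s1, s2 in product(sets, summaries, summaries):
        if apply(I, *branch_all(s1, s2)) != apply(I, *s1) & apply(I, *s2):
            return False
        outs = unrolled_outputs(I, s1, s2)
        if apply(I, *loop_any(s1, s2)) != frozenset().union(*outs):
            return False
        if apply(I, *loop_all(s1, s2)) != frozenset.intersection(*outs):
            return False
    return True
```

Calling `check()` enumerates every combination of IN set and block summaries over a two-element universe. Unrolling the loop a few times is enough here because the output stops changing after the first repetition of S2; S1.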
Structure method of data flow solution (3/4)
• example -- uninitialized variable
• basic blocks of the flow graph (figure; the edges are not reproduced here)

    0: Read(Limit); I := 1
    1: I > Limit
    2: Read(J); I = 1
    3: Sum := J
    4: Sum := Sum + J
    5: I := I + 1
    6: Write(Sum)