Explore the efficient optimization technique of Interprocedural Symbolic Range Propagation in compiler analysis. Learn the background, terminology, and algorithms involved in propagating symbolic ranges interprocedurally.
Interprocedural Symbolic Range Propagation for Optimizing Compilers
Hansang Bae and Rudolf Eigenmann
Purdue University
2005. 10. 22

Outline
• Motivation
• Symbolic Range Propagation
• Interprocedural Symbolic Range Propagation
• Experiments
• Conclusions

Motivation
• Symbolic analysis is key to static analysis
  • X=Y+1 is better than X=?
• Range analysis has been successful
  • X=[0,10] is better than X=?
• Relevant questions
  • How much can we achieve interprocedurally?
  • Can interprocedural range propagation outperform other alternatives?

Symbolic Range Propagation (Background)
• Has been effective for compiler analyses
• Abstract interpretation
• Provides lower/upper bounds for variables
• Sources of information
  • Variable definition
  • IF conditionals
  • Loop variables
• Intersect with new source, union at merge

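The last bullet (intersect with a new source, union at a merge) can be sketched as two operations on a range. This is a minimal illustration, not the Polaris implementation: the class name `Range` and its methods are hypothetical, and bounds are plain numbers here, whereas symbolic range propagation keeps expressions as bounds. Empty intersections (lower bound above upper bound) are not handled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical value range [lb, ub] with numeric bounds (assumption)."""
    lb: float = -math.inf
    ub: float = math.inf

    def intersect(self, other):
        # Narrow the range with a new source of information
        # (a definition, an IF condition, or loop bounds).
        return Range(max(self.lb, other.lb), min(self.ub, other.ub))

    def union(self, other):
        # Merge at a control-flow join: smallest range covering both.
        return Range(min(self.lb, other.lb), max(self.ub, other.ub))


unknown = Range()                              # X = [-INF, INF]
after_if = unknown.intersect(Range(ub=10))     # inside IF (X.LE.10)
merged = Range(2, 2).union(Range(3, 3))        # join of two branches
print(after_if.ub, merged.lb, merged.ub)
```

Intersection can only shrink a range and union can only grow it, which is why conditionals and loop headers add precision while merge points lose some.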
SRP – Example
Example Code            Ranges
                        X=[-INF,INF]
X = 1
                        X=[1,1]
IF (X.LE.N) THEN
                        X=[1,1]
  X = 2*X
                        X=[2,2]
ELSE
                        X=[1,1]
  X = X+2
                        X=[3,3]
ENDIF
                        X=[2,3]
…

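The ranges in this example can be reproduced with a few lines of interval arithmetic. The sketch below is only an approximation of the technique: ranges are numeric tuples, the helper names are made up, and N is treated as completely unknown, whereas a symbolic analysis would keep N as an expression in the bounds.

```python
import math

INF = math.inf


def intersect(a, b):
    # Narrow range a with the information in b.
    return (max(a[0], b[0]), min(a[1], b[1]))


def union(a, b):
    # Merge two ranges at a control-flow join.
    return (min(a[0], b[0]), max(a[1], b[1]))


def add(r, k):
    return (r[0] + k, r[1] + k)


def scale(r, k):
    # Assumes k > 0 so the bounds keep their order.
    return (r[0] * k, r[1] * k)


n = (-INF, INF)                            # N is unknown (assumption)
x = (1, 1)                                 # X = 1

x_then = intersect(x, (-INF, n[1]))        # IF (X.LE.N): X <= upper bound of N
x_then = scale(x_then, 2)                  # X = 2*X
x_else = intersect(x, (n[0] + 1, INF))     # ELSE: X > N, integers
x_else = add(x_else, 2)                    # X = X+2
x_merge = union(x_then, x_else)            # ENDIF

print(x_then, x_else, x_merge)
```

Because N is unknown, the IF condition does not narrow X in either branch; the precision of the result comes from the definition `X = 1` alone.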
Interprocedural Symbolic Range Propagation (ISRP)
• Propagates ranges across procedure calls
• Collects interprocedural symbolic ranges (ISRs) at important program points
  • Entry to a subroutine
  • After a call site
• SRP as the source of information
• Iterative algorithm
• Context sensitivity by procedure cloning

ISRP – Terminology
• Symbolic Range
  Mapping from a variable to its value range, V = [LB, UB], where LB and UB are expressions
• Interprocedural Symbolic Range
  Symbolic range valid at relevant program points: subroutine entries and call sites (forward/backward)
• Jump Function
  Set of symbolic ranges expressed in terms of input variables to a called subroutine (actual parameters, global variables)
• Return Jump Function
  Set of symbolic ranges expressed in terms of return variables to a calling subroutine (formal parameters, global variables)
[Diagram: Caller and Callee, connected by Jump Function / Forward ISR and Return Jump Function / Backward ISR]

ISRP – Algorithm
Propagate_Interprocedural_Ranges()
{
  Initialize_Call_Graph()
  while (there is any change in ISR) {
    foreach Subroutine (bottom-up) {
      Get_Backward_Interprocedural_Ranges()
      Compute_Jump_Functions()
      Compute_Return_Jump_Functions()
    }
    Get_Forward_Interprocedural_Ranges()
  }
}

ISRP – Algorithm
• Get_Forward_Interprocedural_Ranges()
  • Transforms jump functions to ISRs
  • Clones procedures if necessary
  • Keeps track of any changes
• Get_Backward_Interprocedural_Ranges()
  • Transforms return jump functions to ISRs
  • Does nothing for leaf nodes of the call graph
• Compute_Jump_Functions()
  • Computes intraprocedural ranges
  • Discards non-input-variables to the callee
• Compute_Return_Jump_Functions()
  • Discards non-return-variables to the caller

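The iterate-until-no-change structure of the algorithm can be modeled in a few lines. The following is a simplified sketch, not the Polaris implementation: it covers only the forward half (jump functions turned into entry ISRs), uses numeric bounds instead of symbolic expressions, merges multiple call sites by union instead of cloning, and assumes a non-recursive call graph. The names `propagate`, `CALL_SITES`, `BOTTOM_UP`, and the two jump-function helpers are invented for illustration; the data mirrors the MAIN/A/B program used as the running example in these slides.

```python
def jump_main_to_a(entry):
    # CALL A(X, Y) with X = 1, Y = 2 in MAIN.
    return {"T": (1, 1), "U": (2, 2)}


def jump_a_to_b(entry):
    # CALL B(T, U, N) inside DO N = 10, 40; T is modified by B, so only
    # U (if known at the entry of A) and N are passed on.
    ranges = {"M": (10, 40)}
    if "U" in entry:
        ranges["W"] = entry["U"]
    return ranges


# (caller, callee, jump function over the caller's entry ISR)
CALL_SITES = [("MAIN", "A", jump_main_to_a), ("A", "B", jump_a_to_b)]
BOTTOM_UP = ["B", "A", "MAIN"]


def propagate(call_sites, bottom_up):
    entry = {sub: {} for sub in bottom_up}   # forward ISR at each entry
    iterations = 0
    changed = True
    while changed:
        iterations += 1
        # Bottom-up pass: evaluate the jump function at every call site.
        jumps = []
        for sub in bottom_up:
            for caller, callee, jump in call_sites:
                if caller == sub:
                    jumps.append((callee, jump(entry[sub])))
        # Forward pass: transform jump functions into entry ISRs.
        new_entry = {sub: None for sub in bottom_up}
        for callee, ranges in jumps:
            prev = new_entry[callee]
            if prev is None:
                new_entry[callee] = dict(ranges)
            else:
                # Several call sites: keep common variables, union ranges.
                new_entry[callee] = {
                    v: (min(prev[v][0], r[0]), max(prev[v][1], r[1]))
                    for v, r in ranges.items() if v in prev
                }
        new_entry = {s: (r or {}) for s, r in new_entry.items()}
        changed = new_entry != entry          # keep track of any changes
        entry = new_entry
    return entry, iterations


entry, iterations = propagate(CALL_SITES, BOTTOM_UP)
print(iterations, entry)
```

In this model the range of W at the entry of B only becomes known in the second pass, because the entry ISR of A is produced at the end of the first pass; a third pass then confirms that nothing changes.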
ISRP – Example (1st iteration)
foreach subroutine (bottom-up)
  Get_Backward_ISRs
  Compute_Jump_Functions
  Compute_Return_Jump_Functions
Get_Forward_ISRs (for call graph)

Code                      Ranges                     J / ISR / RJ
PROGRAM MAIN
  INTEGER X, Y
  X = 1                   X=[1]
  Y = 2                   X=[1],Y=[2]
α CALL A(X, Y)                                       J :X=[1],Y=[2]
END

SUBROUTINE A(T, U)                                   ISR:T=[1],U=[2]
  INTEGER N, T, U
  DO N = 10, 40           N=[10,40]
β   CALL B(T, U, N)       N=[10,40],T=[U+N]          J :N=[10,40]   ISR:T=[U+N]
  ENDDO
END

SUBROUTINE B(V, W, M)                                ISR:M=[10,40]
  INTEGER M, V, W
  V = W + M               V=[W+M]
END                                                  RJ :V=[W+M]

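Turning a jump function into an ISR amounts to re-expressing ranges in the other procedure's variables. The sketch below illustrates this for the two call sites of the example, under stated assumptions: ranges are stored as pairs of bound strings, the helpers `rename` and `translate` are hypothetical, and a bound that mentions a variable with no counterpart is not invalidated as a real analysis would require.

```python
import re


def rename(expr, mapping):
    # Rename whole identifiers in a symbolic bound, e.g. "W+M" -> "U+N".
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: mapping.get(m.group(0), m.group(0)),
                  expr)


def translate(ranges, mapping):
    # Re-express ranges in the other procedure's variables and drop
    # variables that have no counterpart there.
    return {
        mapping[var]: tuple(rename(bound, mapping) for bound in bounds)
        for var, bounds in ranges.items()
        if var in mapping
    }


# Call site α: CALL A(X, Y) binds actuals X, Y to formals T, U.
jump_alpha = {"X": ("1", "1"), "Y": ("2", "2")}
forward_isr = translate(jump_alpha, {"X": "T", "Y": "U"})

# Call site β: CALL B(T, U, N); B's return jump function is V=[W+M].
return_jump_b = {"V": ("W+M", "W+M")}
backward_isr = translate(return_jump_b, {"V": "T", "W": "U", "M": "N"})

print(forward_isr)
print(backward_isr)
```

The first translation corresponds to the ISR T=[1],U=[2] at the entry of A, and the second to the ISR T=[U+N] after call site β.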
ISRP – Example (2nd iteration)
foreach subroutine (bottom-up)
  Get_Backward_ISRs
  Compute_Jump_Functions
  Compute_Return_Jump_Functions
Get_Forward_ISRs (for call graph)

Code                      Ranges                     J / ISR / RJ
PROGRAM MAIN
  INTEGER X, Y
  X = 1                   X=[1]
  Y = 2                   X=[1],Y=[2]
α CALL A(X, Y)            Y=[2]                      J :X=[1],Y=[2]   ISR:Y=[2]
END

SUBROUTINE A(T, U)        T=[1],U=[2]                ISR:T=[1],U=[2]
  INTEGER N, T, U
  DO N = 10, 40           T=[1],U=[2],N=[10,40]
β   CALL B(T, U, N)       N=[10,40],U=[2]            J :U=[2],N=[10,40]   ISR:N=[10,40],T=[U+N]
  ENDDO
END                       U=[2]                      RJ :U=[2]

SUBROUTINE B(V, W, M)     M=[10,40]                  ISR:M=[10,40]
  INTEGER M, V, W
  V = W + M               M=[10,40],V=[W+M]
END                                                  RJ :M=[10,40],V=[W+M]

After Get_Forward_ISRs:   A: ISR:T=[1],U=[2]   B: ISR:M=[10,40],W=[2]

ISRP – Example (3rd iteration)
foreach subroutine (bottom-up)
  Get_Backward_ISRs
  Compute_Jump_Functions
  Compute_Return_Jump_Functions
Get_Forward_ISRs (for call graph)

Code                      Ranges                     J / ISR / RJ
PROGRAM MAIN
  INTEGER X, Y
  X = 1                   X=[1]
  Y = 2                   X=[1],Y=[2]
α CALL A(X, Y)            Y=[2]                      J :X=[1],Y=[2]   ISR:Y=[2]
END

SUBROUTINE A(T, U)        T=[1],U=[2]                ISR:T=[1],U=[2]
  INTEGER N, T, U
  DO N = 10, 40           T=[1],U=[2],N=[10,40]
β   CALL B(T, U, N)       N=[10,40],U=[2]            J :U=[2],N=[10,40]   ISR:N=[10,40],U=[2],T=[U+N]
  ENDDO
END                       U=[2]                      RJ :U=[2]

SUBROUTINE B(V, W, M)     M=[10,40],W=[2]            ISR:M=[10,40],W=[2]
  INTEGER M, V, W
  V = W + M               M=[10,40],W=[2],V=[W+M]
END                                                  RJ :M=[10,40],W=[2],V=[W+M]

After Get_Forward_ISRs:   A: ISR:T=[1],U=[2]   B: ISR:M=[10,40],W=[2]

Experiments
• Efficacy of ISRP for an optimizing compiler (Polaris)
  • Test elision and dead-code elimination
  • Data dependence analysis
  • Other optimizations for parallelization
• 21 Fortran codes from SPEC CFP and Perfect
• Best available optimizations in Polaris as Base
  • Interprocedural expression propagation
  • Forward substitution
  • Intraprocedural symbolic range propagation
  • Automatic partial inlining

Result – Test Elision
• ISRP found as many or more cases for 20 codes
• Base made an aggressive decision with hard-wired test elision for fpppp

Result – Data Dependence Analysis
• ISRP disproved more data dependences for 20 codes
• Base benefits from forward substitution for FLO52Q
• Data dependence analysis benefits from other improved optimizations

Result – Other Optimizations
Reduction Translation
Induction Variable Substitution
• “Yes” to the questions helps the compiler generate better codes
• ISRP helped the compiler make better decisions for 5 codes

IF (((-num)+(-num**2))/2.LE.0.AND.(-num).LE.0) THEN
  ALLOCATE (xrsiq00(1:morb, 1:num, 1:numthreads))
!$OMP PARALLEL
!$OMP+IF(6+((-1)*num+(-1)*num**2)/2.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MY_CPU_ID,MRS,MRSIJ1,MI0,MJ0,VAL,MQ,MP,XIJ00,MI,MJ)
  my_cpu_id = omp_get_thread_num()+1
!$OMP DO
  DO mrs = 1, (num*(1+num))/2, 1
    IF ((num*(1+num))/2.NE.mrs) THEN
      DO mq = 1, num, 1
        DO mi0 = 1, num, 1
          10 CONTINUE
          xrsiq00(mi0, mq, my_cpu_id) = zero
        ENDDO
      ENDDO
      DO mp = 1, num, 1
        DO mq = 1, mp, 1
          val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
          IF (zero.NE.val) THEN
            DO mi0 = 1, num, 1
              xrsiq00(mi0, mq, my_cpu_id) = xrsiq00(mi0, mq, my_cpu_id)+v(mp, mi0)*val
              xrsiq00(mi0, mp, my_cpu_id) = xrsiq00(mi0, mp, my_cpu_id)+v(mq, mi0)*val
              20 CONTINUE
            ENDDO
          ENDIF
          30 CONTINUE
        ENDDO
        40 CONTINUE
      ENDDO
      mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
      DO mi0 = 1, num, 1
        DO mj0 = 1, mi0, 1
          50 CONTINUE
          xij00(mj0) = zero
        ENDDO
        DO mq = 1, num, 1
          val = xrsiq00(mi0, mq, my_cpu_id)
          IF (zero.NE.val) THEN
            DO mj0 = 1, mi0, 1
              60 CONTINUE
              xij00(mj0) = xij00(mj0)+v(mq, mj0)*val
            ENDDO
          ENDIF
          70 CONTINUE
        ENDDO
        DO mj0 = 1, mi0, 1
          80 CONTINUE
          xrsij(mj0+(mi0**2+(-mi0))/2+((-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij00(mj0)
        ENDDO
        90 CONTINUE
      ENDDO
      100 CONTINUE
    ELSE
      DO mq = 1, num, 1
        DO mi = 1, num, 1
          316 CONTINUE
          xrsiq(mi, mq) = zero
        ENDDO
      ENDDO
      DO mp = 1, num, 1
        DO mq = 1, mp, 1
          val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
          IF (zero.NE.val) THEN
            DO mi = 1, num, 1
              xrsiq(mi, mq) = xrsiq(mi, mq)+v(mp, mi)*val
              xrsiq(mi, mp) = xrsiq(mi, mp)+v(mq, mi)*val
              317 CONTINUE
            ENDDO
          ENDIF
          318 CONTINUE
        ENDDO
        319 CONTINUE
      ENDDO
      mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
      DO mi = 1, num, 1
        DO mj = 1, mi, 1
          320 CONTINUE
          xij(mj) = zero
        ENDDO
        DO mq = 1, num, 1
          val = xrsiq(mi, mq)
          IF (zero.NE.val) THEN
            DO mj = 1, mi, 1
              321 CONTINUE
              xij(mj) = xij(mj)+v(mq, mj)*val
            ENDDO
          ENDIF
          322 CONTINUE
        ENDDO
        DO mj = 1, mi, 1
          323 CONTINUE
          xrsij(mj+(mi**2+(-mi))/2+((-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij(mj)
        ENDDO
        324 CONTINUE
      ENDDO
      325 CONTINUE
    ENDIF
  ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
  DEALLOCATE (xrsiq00)
ELSE
  DO mrs = 1, (num*(1+num))/2, 1
!$OMP PARALLEL
!$OMP+IF(6+(-1)*num.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MI)
!$OMP DO
    DO mq = 1, num, 1
      DO mi = 1, (num), 1
        306 CONTINUE
        xrsiq(mi, mq) = zero
      ENDDO
    ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
    DO mp = 1, num, 1
      DO mq = 1, mp, 1
        mrspq = 1+mrspq
        val = xrspq(mrspq)
        IF (zero.NE.val) THEN
!$OMP PARALLEL
!$OMP+IF(6+(-1)*num.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP DO
          DO mi = 1, (num), 1
            xrsiq(mi, mq) = xrsiq(mi, mq)+v(mp, mi)*val
            xrsiq(mi, mp) = xrsiq(mi, mp)+v(mq, mi)*val
            307 CONTINUE
          ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
        ENDIF
        308 CONTINUE
      ENDDO
      309 CONTINUE
    ENDDO
    mrsij = mrsij0
    DO mi = 1, (num), 1
!$OMP PARALLEL
!$OMP+IF(6+(-1)*mi.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP DO
      DO mj = 1, mi, 1
        310 CONTINUE
        xij(mj) = zero
      ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
      ALLOCATE (xij1(1:mi, 1:numthreads))
!$OMP PARALLEL
!$OMP+IF(6+(-1)*num.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MY_CPU_ID,MQ,TPINIT,VAL,MJ)
      my_cpu_id = omp_get_thread_num()+1
      DO tpinit = 1, mi, 1
        xij1(tpinit, my_cpu_id) = 0.0
      ENDDO
!$OMP DO
      DO mq = 1, num, 1
        val = xrsiq(mi, mq)
        IF (zero.NE.val) THEN
          DO mj = 1, mi, 1
            311 CONTINUE
            xij1(mj, my_cpu_id) = xij1(mj, my_cpu_id)+v(mq, mj)*val
          ENDDO
        ENDIF
        312 CONTINUE
      ENDDO
!$OMP END DO NOWAIT
!$OMP CRITICAL
      DO tpinit = 1, mi, 1
        xij(tpinit) = xij(tpinit)+xij1(tpinit, my_cpu_id)
      ENDDO
!$OMP END CRITICAL
!$OMP END PARALLEL
      DEALLOCATE (xij1)
      DO mj = 1, mi, 1
        mrsij = mrsij+1
        313 CONTINUE
        xrsij(mrsij) = xij(mj)
      ENDDO
      314 CONTINUE
    ENDDO
    mrsij0 = mrsij0+(num*(num+1))/2
    315 CONTINUE
  ENDDO
ENDIF

!$OMP PARALLEL
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MRSIJ1,MI0,MJ0,VAL,MQ,MP,XIJ00,XRSIQ00)
!$OMP DO
DO mrs = 1, (num+num**2)/2, 1
  IF ((num+num**2)/2.NE.mrs) THEN
    DO mq = 1, num, 1
      DO mi0 = 1, num, 1
        10 CONTINUE
        xrsiq00(mi0, mq) = zero
      ENDDO
    ENDDO
    DO mp = 1, num, 1
      DO mq = 1, mp, 1
        val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
        IF (zero.NE.val) THEN
          DO mi0 = 1, num, 1
            xrsiq00(mi0, mq) = xrsiq00(mi0, mq)+v(mp, mi0)*val
            xrsiq00(mi0, mp) = xrsiq00(mi0, mp)+v(mq, mi0)*val
            20 CONTINUE
          ENDDO
        ENDIF
        30 CONTINUE
      ENDDO
      40 CONTINUE
    ENDDO
    mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
    DO mi0 = 1, num, 1
      DO mj0 = 1, mi0, 1
        50 CONTINUE
        xij00(mj0) = zero
      ENDDO
      DO mq = 1, num, 1
        val = xrsiq00(mi0, mq)
        IF (zero.NE.val) THEN
          DO mj0 = 1, mi0, 1
            60 CONTINUE
            xij00(mj0) = xij00(mj0)+v(mq, mj0)*val
          ENDDO
        ENDIF
        70 CONTINUE
      ENDDO
      DO mj0 = 1, mi0, 1
        80 CONTINUE
        xrsij(mj0+(mi0**2+(-mi0)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij00(mj0)
      ENDDO
      90 CONTINUE
    ENDDO
    100 CONTINUE
  ELSE
    DO mq = 1, num, 1
      DO mi = 1, num, 1
        306 CONTINUE
        xrsiq(mi, mq) = zero
      ENDDO
    ENDDO
    DO mp = 1, num, 1
      DO mq = 1, mp, 1
        val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
        IF (zero.NE.val) THEN
          DO mi = 1, num, 1
            xrsiq(mi, mq) = xrsiq(mi, mq)+v(mp, mi)*val
            xrsiq(mi, mp) = xrsiq(mi, mp)+v(mq, mi)*val
            307 CONTINUE
          ENDDO
        ENDIF
        308 CONTINUE
      ENDDO
      309 CONTINUE
    ENDDO
    mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
    DO mi = 1, num, 1
      DO mj = 1, mi, 1
        310 CONTINUE
        xij(mj) = zero
      ENDDO
      DO mq = 1, num, 1
        val = xrsiq(mi, mq)
        IF (zero.NE.val) THEN
          DO mj = 1, mi, 1
            311 CONTINUE
            xij(mj) = xij(mj)+v(mq, mj)*val
          ENDDO
        ENDIF
        312 CONTINUE
      ENDDO
      DO mj = 1, mi, 1
        313 CONTINUE
        xrsij(mj+(mi**2+(-mi)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij(mj)
      ENDDO
      314 CONTINUE
    ENDDO
    315 CONTINUE
  ENDIF
ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL

Conclusions
• Interprocedural analysis of symbolic ranges
  • Based on intraprocedural analysis
  • Iterative algorithm
• ISRP enhances other optimizations
• Compilation time increases up to 150%
  • Exceptions: OCEAN and TRACK

Thank you.