330 likes | 503 Views
A B C D : Eliminating A rray- B ounds C hecks on D emand. Rastislav Bod í k Rajiv Gupta Vivek Sarkar U of Wisconsin U of Arizona IBM TJ Watson recent experiments by Denis Gopan , U of Wisconsin. Motivation: type safety.
E N D
ABCD:Eliminating Array-Bounds Checks on Demand Rastislav Bodík Rajiv Gupta Vivek Sarkar U of Wisconsin U of Arizona IBM TJ Watson recent experiments by Denis Gopan, U of Wisconsin
Motivation: type safety Pro:type-safe programs don’t “crash” Con:some violations checked at run time Direct cost:executing the checks checks are frequent, expensive Indirect cost: preventingoptimization checks block code motion of side-effect instructions Our goal: safety without performance penalty How? remove redundant checks
Talk outline • Why remove bounds checks? safety without performance penalty • The need for dynamic optimization • existing optimizers not suitable • an ideal dynamic optimizer • ABCD • Experiments • Summary
Existing check optimizers • Emphasis: precision • goal: allchecks removed = staticallytype-safe • theorem prover: [Necula, Lee], [Xu, Miller, Reps] • range propagation: [Harrison, Patterson] • types: [Xi, Phenning] • Properties • too heavy-weight • limited notion of control flow: • how to add profile feedback?
An ideal dynamic optimizer? • A balance between power and economy • powerful just enough only common cases • minimize analysis work efficient IR • reduce IR overhead reuse the IR • Scalable • optimize only hot checks profile-directed +demand-driven • no whole-program analysisuse “local” info + insert (cold) checks
Why optimize on demand? • optimize only the fewhot checks 80 mpegaudio % of dynamic checks checks analyzed(not necessarily removed) 0 10 20 30 40 50 60 70 80 90 100 number of staticchecks
Talk outline • Motivation • An ideal dynamic optimizer • ABCD “tutorial” • Simple... standard SSA • Full ... extended SSA • PRE... profile-directed • ABCDE... work in progress • Experiments • Summary
this talk High-level algorithm for each hot array accessA[i]do -- optimize upper-bound check ABCD( i < A.length ) -- optimize lower-bound check ABCD( 0 i ) end for
1. Simple ABCD i A.length while ( ) { --i ..A[i].. } Simple ABCD = SSA + shortest path
1. Simple ABCD • build SSA • label edges with constraints • analyzeA[ik]: is ik < A.length always true?
A.length 0 i0 A.length –0 i0 i1 i0–0 0 i1 ø 1 0 i2 i1– 1 i2 weight(A.length i2) = 1 i2 A.length –1 Simple ABCD i0 A.length i1ø(i0,i2) i2 i1–1 .. A[i2] ..
1. Simple ABCD • build SSA • label edges with constraints • analyzeA[ik]: input: a bounds checkik < A.length algorithm:find shortest pathp from A.lengthtoik output:check is redundant ifweight(p) > 0
2. Full ABCD for (i=0; i < A.length; i++) ..A[i].. Full ABCD = SSA+++ “shortest” path
2. Full ABCD • build extended SSA: naming of standard SSA not fine-grain enough add dummy -assignments • label edges with constraints • analyze: shortest path optimal path in a hyper-graph
– i A.length–1 Extending the SSA form i0 0 0 A.length 0 i1ø(i0,i2) i0 i1 A.length–1 F 0 T i1 ø –1 0 .. A[i1] .. i2 i1+1 i2
=shortest =longest hyper-graph:has two kinds of nodes Extending the SSA form i0 0 0 A.length 0 i1ø(i0,i3) i0 i1 A.length–1 F 0 T 1 i1 i2 (i1) ø 0 0 .. A[i2] .. i3 i2+1 i2 –1 i3
false if (n <= A.length) unoptimizedloop 3. ABCD with PRE PRE = partial redundancy elimination f(int A[], int n) { for (i=0; i < n; i++) ..A[i].. } ABCD with PRE = Full ABCD + profile feedback
3. ABCD with PRE • build extended SSA • label edges with constraints • analyzeA[ik]: Algorithm:i) find paths with “bad” length ii) fix their length by inserting run-time checks Issues: • What check to insert? • Where to insert the check? • When is insertion profitable?
0 if (n <= A.length – 0) DONE! constraint edges can be addedby inserting run-time checks ABCD with PRE A.length n 0 0 i0 f(int A[], int n) { for (i=0; i<n; i++) .. A[i] .. } 0 1 i1 ø 0 0 –1 i3 i2
if ( n <= A.length–(j-i)) unoptimizedloop 4. ABCDE f(int A[], int n, i, j) { for ( ; i < n; i++, j++) .. A[j] .. } When does ABCD fail?
0 c = j0-i0 c = j2-i2 if (n <= A.length-(j0-i0)) ABCDE A.length n i0 j0 0 0 i1 j1 1 0 0 0 0 i3 j3 –1 –1 i2 j2 for (i=j=0 ; i<n; i++,j++) { .. A[j] .. } if (n <= A.length-(j2-i2))
ABCD is simple • yet powerful ...
A complex example from paper limit = a.length st = –1 while (st < limit) st++ limit – – for (j = st; j < limit; j++) { A[j] += A[j+1] } }
A complex example from paper limit = a.length st = –1 while (st < limit) st++ limit –– for (j = st; j < limit; j++) { A[j] += A[j+1] } }
How powerful is ABCD? SPECjvm Symantec Misc. Java 100% checks removed [% of all dynamic checks]
Classification of hot checks examined “removable” ABCDE checks removed [% of all dynamic checks] ABCD JIT-like AVG
Summary • Current speedup: modest, up toabout 5% • direct cost: in Jalapeno, checks already very efficient • indirect cost: few global optimizations implemented (11/99) • Analysis time • 4 ms / check = visit 10 SSA nodes / check • recall 20 checks yields 80% dynamic coverage • 80 ms to analyze a large benchmark ! • Precision: • can be improved with few extensions (ABCD ABCDE) • remaining checks appear beyond compiler analysis Use ABCD for your bounds-check optimization
Classification of hot checks • for () ...
Summary • optimize only hot statements • demand-driven • reduce analysis work • sparse IR • profile-driven • powerful just enough • tuned simplicity • minimize IR overhead via IR reuse • use SSA • not interprocedural • PRE
Lower-bounds checks examined ABCDE ABCD checks removed [% of all checks (dynamic count)] JIT-esque AVG
Summary of ABCD local constraints constraint format useful statements global constraint system extended SSA inequality graph solving the constraint system traversing the graph partially redundant checks