1 / 20

Global Value Numbering using Random Interpretation

Global Value Numbering using Random Interpretation. Sumit Gulwani George C. Necula CS Department University of California, Berkeley. Global Value Numbering. Problem To detect equivalences of expressions in a program To obtain a complete algorithm under the assumptions:

corby
Download Presentation

Global Value Numbering using Random Interpretation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Global Value Numbering usingRandom Interpretation Sumit Gulwani George C. Necula CS Department University of California, Berkeley

  2. Global Value Numbering • Problem • To detect equivalences of expressions in a program • To obtain a complete algorithm under the assumptions: • Conditionals are non-deterministic • Operators are uninterpreted • F(e1,e2) = F’(e1’,e2’) , F=F’, e1=e1’, e2=e2’ • Existing algorithms • Precise but expensive • Efficient but imprecise • Use randomization to obtain a precise, efficient but probabistically sound algorithm • Complements our POPL ’03 algorithm, which handles only arithmetic

  3. Outline • Two key ideas in the algorithm • The affine join operation • K-linear interpretations • Correctness of the algorithm • Termination of the algorithm

  4. Example x = (a,b) y = (a,b) z = (F(a),F(b)) F(y) = F((a,b)) * x := b; y := b; z := F(b); x := a; y := a; z := F(a); assert(x = y); assert(z = F(y)); • Typical algorithms treat  as uninterpreted • Hence cannot verify the second assertion • The randomized algorithm interprets  • Similar to the randomized algorithm for linear arithmetic

  5. Review: Randomized Algorithm for Linear Arithmetic F T • Between random testing and abstract interpretation • Choose random values for input variables • Execute both branches • Combine the values of a variable at join points using a random affine combination a := 0; b := 1; a := 1; b := 0; T F c := b – a; d := 1 – 2b; c := 2a + b; d := b – 2; assert (c + d = 0); assert (c = a + 1)

  6. Review: The Affine Join Operation • Affine combination of v1 and v2 w.r.t. weight w w(v1,v2)´w v1 + (1-w) v2 • Affine join preserves common linear relationships (e.g. a+b=5) • It does not introduce false relationships w.h.p. • Unfortunately, non-linear relationships are not preserved (e.g. a (1+b) = 8) a := 4; b := 1; a := 2; b := 3; a = 7(2,4) = -10 b = 7(3,1) = 15 (w = 7)

  7. Review: Example • Choose a random weight for each join independently. • All choices of random weights verify the first assertion • Almost all choices contradict the second assertion F T a := 0; b := 1; a := 1; b := 0; w1 = 5 a = 0, b = 1 a = 1, b = 0 a = -4, b = 5 T F c := b – a; d := 1 – 2b; c := 2a + b; d := b – 2; w2 = -3 a = -4, b = 5 c = 9, d = -9 a = -4, b = 5 c = -3, d = 3 a = -4, b = 5 c = -39, d = 39 assert (c + d = 0); assert (c = a + 1)

  8. Uninterpreted Functions e := y | F(e1,e2) • Choose a random interpretation for F • Non-linear interpretation • E.g. F(e1,e2) = r1e12 + r2e22 • Preserves all equivalences in straight-line code • But not across join points • Lets try linear interpretation

  9. e= F F F a b c d (Naïve) Linear Interpretation • Encode F(e1,e2) = r1e1 + r2e2 • Preserves all equivalences across a join point • Introduces false equivalences in straight-line code Encodings e = r1(r1a+r2b) + r2(r1c+r2d) = r12(a)+r1r2(b)+r2r1(c)+r22(d) e’ = r12(a)+r1r2(c)+r2r1(b)+r22(d) e’ = F F F a c b d • E.g. e and e’ have same encodings even though e  e’ • Problem: too few random coefficients!

  10. k-linear Interpretations • Encode F(e1,e2) = R1e1 + R2e2 • Every expression evaluates to a vector of length k • R1 and R2 are random k £ k matrices • 2k2 random variables, k = o(n) • Works since matrix multiplication is not commutative • e = R12(a) + R1R2(b) + R2R1(c) + R22(d) • e’ = R12(a) + R1R2(c) + R2R1(b) + R22(d) F(e1,e2)1 … F(e1,e2)k e11 … e1k e21 … e2k

  11. The Random Interpreter R V: Variables ! Vectors V(e): defined inductively as V(F(e1,e2)) = R1V(e1) + R2V(e2) Vj(e): the jth component of vector V(e) V V V1 V2 False True y := e * V V1 V2 V1 j j V1 = V[y à V(e)] V1 = V V2 = V Vj(y) = w(V1 (y),V2(y)) for all y,j

  12. Outline • Two key ideas in the algorithm • The affine join operation • K-linear interpretations • Correctness of the algorithm • Termination of the algorithm

  13. Completeness and soundness of R • We compare the random interpreter R with a suitable abstract interpreter A • R mimics A with high probability • R is as complete as A • R is (probabilistically) as sound as A

  14. The Abstract Interpreter A S: set of symbolic equivalences S S S1 S2 y := e True False * S1 S S2 S1 S1 = S[y’/y] [ { y = e[y’/y] } S = { e1=e2 | S1) e1=e2, S2) e1=e2 } S1 = S S2 = S

  15. Completeness Theorem • If S ) e1 = e2, then V(e1) = V(e2) • Proof: • Uninterpreted operators are modeled as linear functions • The affine join operation preserves linear relationships

  16. Soundness Theorem • If S ) e1 = e2, then with high probability V(e1)  V(e2) • Error probability · • n: number of function applications • d: size of set from which random values are chosen • t : number of repetitions • If n = 100, d ¼ 232, t = 5, then error probability ·

  17. Outline • Two key ideas in the algorithm • The affine join operation • K-linear interpretations • Correctness of the algorithm • Termination of the algorithm

  18. Loops and Fixed Point Computation • The lattice of sets of equivalences has finite height n. Thus, the abstract interpreter A converges to a fixed point. • Thus, the random interpreter R also converges (probabilistically) • We can detect convergence by comparing the set of symbolic relationships implied by vectors in two successive iterations

  19. Related Work • Efficient but imprecise algorithms • Congruence partitioning [Rosen, Wegman, Zadeck, POPL 88] • Rewrite rules [Ruthing, Knoop, Steffen, SAS 99] - Balanced algorithms [Gargi PLDI 2002] • Precise but inefficient algorithms • Abstract interpretation on uninterpreted functions [Kildall 73] • Affine join operation • Random interpretation for linear arithmetic [Gulwani, Necula POPL 03]

  20. Conclusion and Future Work • Key ideas in the paper (e1,e2) = w e1 + (1-w) e2 • Linearity , Preserves equivalences across a join point • F(e1,e2) = R1e1 + R2 e2 • Vectors) Introduce no false equivalence • Random interpretation vs. deterministic algorithms • Linear arithmetic • O(n2) vs. O(n4) [POPL 2003] • Uninterpreted functions • O(n3) vs. O(n5 log n) [this talk] • Future work • Inter-procedural analysis using random interpretation • Random interpretation for other theories • Combining two random interpreters

More Related