
Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions

This paper presents a symbolic bounds analysis technique for extracting symbolic bounds of accessed memory regions in programs using pointers and array indices. It formulates and solves systems of symbolic inequality constraints to infer the bounds. The approach is demonstrated using examples of sorting algorithms.


Presentation Transcript


  1. Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions
     Radu Rugina and Martin Rinard
     Laboratory for Computer Science, Massachusetts Institute of Technology

  2. Outline • Examples • Key Problem: Extracting Symbolic Bounds for Accessed Memory Regions • Key Technology: Formulating and Solving Systems of Symbolic Inequality Constraints • Results • Conclusion

  3-7. Example - Divide and Conquer Sort
       Input:    7 4 6 1 3 5 8 2
       Divide:   7 4 | 6 1 | 3 5 | 8 2
       Conquer:  4 7 | 1 6 | 3 5 | 2 8
       Combine:  1 4 6 7 | 2 3 5 8
                 1 2 3 4 5 6 7 8

  8. “Sort n Items in d, Using t as Temporary Storage”
       void sort(int *d, int *t, int n) {
         if (n > CUTOFF) {
           sort(d,t,n/4);
           sort(d+n/4,t+n/4,n/4);
           sort(d+2*(n/4),t+2*(n/4),n/4);
           sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
           merge(d,d+n/4,d+n/2,t);
           merge(d+n/2,d+3*(n/4),d+n,t+n/2);
           merge(t,t+n/2,t+n,d);
         } else insertionSort(d,d+n);
       }

  9. “Sort n Items in d, Using t as Temporary Storage” (same code as slide 8)
     Motivating Problem: Exploit parallelism in this code

  10. “Recursively Sort Four Quarters of d” (same code as slide 8)
      Divide array into subarrays and recursively sort subarrays

  11-12. “Recursively Sort Four Quarters of d” (same code as slide 8)
      Subproblems Identified Using Pointers Into Middle of Array
      d: 7 4 6 1 3 5 8 2, with quarters starting at d, d+n/4, d+n/2, d+3*(n/4)

  13. “Recursively Sort Four Quarters of d” (same code as slide 8)
      Sorted Results Written Back Into Input Array
      d: 4 7 1 6 3 5 2 8, with quarters starting at d, d+n/4, d+n/2, d+3*(n/4)

  14. “Merge Sorted Quarters of d Into Halves of t” (same code as slide 8)
      d: 4 7 1 6 3 5 2 8
      t: 1 4 6 7 2 3 5 8, with halves starting at t and t+n/2

  15. “Merge Sorted Halves of t Back Into d” (same code as slide 8)
      t: 1 4 6 7 2 3 5 8, with halves starting at t and t+n/2
      d: 1 2 3 4 5 6 7 8

  16-17. “Use a Simple Sort for Small Problem Sizes” (same code as slide 8)
      Array shown: 7 4 6 1 3 5 8 2, then 7 4 1 6 3 5 8 2, with markers at d and d+n

  18. Parallel Sort
       void sort(int *d, int *t, int n) {
         if (n > CUTOFF) {
           spawn sort(d,t,n/4);
           spawn sort(d+n/4,t+n/4,n/4);
           spawn sort(d+2*(n/4),t+2*(n/4),n/4);
           spawn sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
           sync;
           spawn merge(d,d+n/4,d+n/2,t);
           spawn merge(d+n/2,d+3*(n/4),d+n,t+n/2);
           sync;
           merge(t,t+n/2,t+n,d);
         } else insertionSort(d,d+n);
       }

  19-20. What Do You Need To Know To Exploit This Form of Parallelism?
      Symbolic Information About Accessed Memory Regions

  21. Information Needed To Exploit Parallelism
      • Calls to sort access disjoint parts of d and t
      • Together, calls access [d,d+n-1] and [t,t+n-1]
        sort(d,t,n/4);
        sort(d+n/4,t+n/4,n/4);
        sort(d+n/2,t+n/2,n/4);
        sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
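The disjointness claim on slide 21 can be checked mechanically for a concrete n. The following is a minimal sketch and not part of the original slides: `Region`, `sort_call_regions`, `regions_disjoint`, and `sort_calls_independent` are hypothetical names, regions are element offsets relative to d (the same offsets apply to t), and the sketch assumes n >= 4 with quarter boundaries at multiples of n/4 and that a call sort(p, ., k) accesses exactly [p, p+k-1].

```c
#include <assert.h>
#include <stdbool.h>

/* Closed interval [lo, hi] of element offsets relative to d (hypothetical helper type). */
typedef struct { long lo, hi; } Region;

/* Regions touched by the four recursive sort calls, assuming n >= 4 and that
   sort(p, ., k) accesses exactly [p, p+k-1]. */
static void sort_call_regions(long n, Region out[4]) {
    long q = n / 4;
    assert(n >= 4);
    out[0] = (Region){0,     q - 1};
    out[1] = (Region){q,     2 * q - 1};
    out[2] = (Region){2 * q, 3 * q - 1};
    out[3] = (Region){3 * q, n - 1};      /* last call gets n - 3*(n/4) items */
}

/* Two closed intervals are disjoint when one ends before the other starts. */
static bool regions_disjoint(Region a, Region b) {
    return a.hi < b.lo || b.hi < a.lo;
}

/* True if the four regions are pairwise disjoint and together tile [0, n-1]. */
static bool sort_calls_independent(long n) {
    Region r[4];
    sort_call_regions(n, r);
    for (int i = 0; i < 4; i++)
        for (int j = i + 1; j < 4; j++)
            if (!regions_disjoint(r[i], r[j])) return false;
    return r[0].lo == 0 && r[1].lo == r[0].hi + 1 &&
           r[2].lo == r[1].hi + 1 && r[3].lo == r[2].hi + 1 &&
           r[3].hi == n - 1;
}
```

A check like this only covers one value of n at a time; the point of the symbolic analysis in the slides is to establish the same fact for all n at compile time.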

  22. Information Needed To Exploit Parallelism
      • First two calls to merge access disjoint parts of d, t
      • Together, calls access [d,d+n-1] and [t,t+n-1]
        merge(d,d+n/4,d+n/2,t);
        merge(d+n/2,d+3*(n/4),d+n,t+n/2);
        merge(t,t+n/2,t+n,d);

  23. Information Needed To Exploit Parallelism
      • Calls to insertionSort access [d,d+n-1]
        insertionSort(d,d+n);

  24. What Do You Need To Know To Exploit This Form of Parallelism?
      Symbolic Information About Accessed Memory Regions:
      • sort(d,t,n) accesses [d,d+n-1] and [t,t+n-1]
      • insertionSort(l,h) accesses [l,h-1]
      • merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]

  25-26. How Hard Is It To Figure These Things Out?
      Challenging

  27. How Hard Is It To Figure These Things Out?
       void insertionSort(int *l, int *h) {
         int *p, *q, k;
         for (p = l+1; p < h; p++) {
           for (k = *p, q = p-1; l <= q && k < *q; q--)
             *(q+1) = *q;
           *(q+1) = k;
         }
       }
      Not immediately obvious that insertionSort(l,h) accesses [l,h-1]
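One way to build confidence in the claimed region [l,h-1] is to instrument the routine and record every address it touches. The sketch below is an illustration added here, not the analysis from the slides: `touch`, `insertionSortTraced`, and `accesses_within_bounds` are hypothetical names, and the helper assumes 1 <= from <= to <= 8 so that the pointer l-1 formed by the inner loop still lies inside the sample array.

```c
#include <assert.h>
#include <stddef.h>

/* Lowest and highest addresses touched so far (hypothetical instrumentation). */
static int *min_touched, *max_touched;

static void touch(int *a) {
    if (min_touched == NULL || a < min_touched) min_touched = a;
    if (max_touched == NULL || a > max_touched) max_touched = a;
}

/* insertionSort from slide 27 with every load and store recorded via touch().
   The inner loop is unrolled into an explicit test so the read of *q is visible. */
static void insertionSortTraced(int *l, int *h) {
    int *p, *q, k;
    for (p = l + 1; p < h; p++) {
        touch(p);
        k = *p;
        for (q = p - 1; l <= q; q--) {
            touch(q);
            if (!(k < *q)) break;
            touch(q + 1);
            *(q + 1) = *q;
        }
        touch(q + 1);
        *(q + 1) = k;
    }
}

/* Sorts a[from..to-1] of a fixed sample and reports whether every access
   stayed inside [a+from, a+to-1]. Assumes 1 <= from <= to <= 8. */
static int accesses_within_bounds(int from, int to) {
    int a[8] = {7, 4, 6, 1, 3, 5, 8, 2};
    assert(1 <= from && from <= to && to <= 8);
    min_touched = max_touched = NULL;
    insertionSortTraced(a + from, a + to);
    if (min_touched == NULL) return 1;   /* fewer than two elements: nothing accessed */
    return min_touched >= a + from && max_touched <= a + to - 1;
}
```

A dynamic trace like this only covers the inputs that were run, which is why the slides call the static version of the question challenging.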

  28. How Hard Is It To Figure These Things Out?
       void merge(int *l1, int *m, int *h2, int *d) {
         int *h1 = m;
         int *l2 = m;
         while ((l1 < h1) && (l2 < h2))
           if (*l1 < *l2) *d++ = *l1++;
           else *d++ = *l2++;
         while (l1 < h1) *d++ = *l1++;
         while (l2 < h2) *d++ = *l2++;
       }
      Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
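The extent of the region written by merge can be observed on a small input. This is a hedged sketch rather than anything from the original deck: `mergeTraced` is the slide's merge changed only to return the final value of d, and `merged_count` is a hypothetical driver that uses guard cells around the destination to detect out-of-range writes.

```c
#include <assert.h>

/* merge from slide 28, changed only to return the final value of d so the
   extent of the written region can be inspected. */
static int *mergeTraced(int *l1, int *m, int *h2, int *d) {
    int *h1 = m;
    int *l2 = m;
    while ((l1 < h1) && (l2 < h2))
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
    return d;
}

/* Merges two sorted runs of a sample array into a padded buffer and returns
   the number of elements written. */
static long merged_count(void) {
    int src[8]  = {1, 4, 6, 7, 2, 3, 5, 8};          /* sorted runs [0,3] and [4,7] */
    int dst[10] = {-1, 0, 0, 0, 0, 0, 0, 0, 0, -1};  /* dst[0], dst[9] are guards */
    int *end = mergeTraced(src, src + 4, src + 8, dst + 1);
    assert(dst[0] == -1 && dst[9] == -1);            /* guards untouched */
    assert(dst[1] == 1 && dst[8] == 8);              /* output starts at 1, ends at 8 */
    return (long)(end - (dst + 1));
}
```

For this input the count equals h-l, which matches the written region [d,d+(h-l)-1] stated on the slide.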

  29. Issues
      • Heavy Use of Pointers
        • Pointers into Middle of Arrays
        • Pointer Arithmetic
        • Pointer Comparison
      • Multiple Procedures
        • sort(int *d, int *t, int n)
        • insertionSort(int *l, int *h)
        • merge(int *l, int *m, int *h, int *t)
      • Recursion

  30. How the Compiler Does It

  31. Compiler Structure
      • Pointer Analysis: Disambiguate References at Granularity of Allocation Blocks
      • Bounds Analysis: Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure
      • Region Analysis: Symbolic Regions Accessed By Execution of Each Procedure
      • Parallelization: Independent Procedure Calls That Can Execute in Parallel

  32. Example – Array Increment
       void f(char *p, int n) {
         if (n > CUTOFF) {
           f(p, n/2);      /* increment first half */
           f(p+n/2, n/2);  /* increment second half */
         } else {
           /* base case: increment small array */
           int i = 0;
           while (i < n) { *(p+i) += 1; i++; }
         }
       }

  33. Intra-procedural Bounds Analysis
      • For each integer variable at each program point, derive lower and upper bounds
      • Bounds are symbolic expressions
        • variables represent initial values of parameters of enclosing procedure
        • bounds are linear combinations of variables
      • Example expression for f(p,n): p+n-1

  34. Bounds Analysis
      What are upper and lower bounds for region accessed by while loop in base case?
       int i = 0;
       while (i < n) { *(p+i) += 1; i++; }

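  35. Bounds Analysis, Step 1: Build control flow graph
      Block 1: i = 0
      Block 2: i < n
      Block 3: *(p+i) += 1; i = i+1
      Edges: block 1 to block 2, block 2 to block 3 when i < n, block 3 back to block 2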

  36. Bounds Analysis, Step 2: Set up bounds at beginning of basic blocks
      Block 1 (i = 0):                 l1 ≤ i ≤ u1
      Block 2 (i < n):                 l2 ≤ i ≤ u2
      Block 3 (*(p+i) += 1; i = i+1):  l3 ≤ i ≤ u3

  37-38. Bounds Analysis, Step 3: Compute transfer functions
      Block 1: l1 ≤ i ≤ u1 before i = 0; 0 ≤ i ≤ 0 after
      Block 2: l2 ≤ i ≤ u2 before i < n; l2 ≤ i ≤ n-1 on the true branch, l2 ≤ i ≤ u2 on the false branch
      Block 3: l3 ≤ i ≤ u3 before *(p+i) += 1 and i = i+1; l3+1 ≤ i ≤ u3+1 after

  39-40. Bounds Analysis, Step 4: Set up constraints for bounds
      Entry to block 1: -∞ ≤ i ≤ +∞
      Lower bounds: l2 ≤ 0, l2 ≤ l3+1, l3 ≤ l2
      Upper bounds: 0 ≤ u2, u3+1 ≤ u2, n-1 ≤ u3
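For one concrete n, the bounds that these constraints describe can be found by simply iterating the transfer functions until nothing changes. The sketch below is an illustration under stated assumptions, not the symbolic method of the slides: `Interval`, `join`, and `loop_bounds` are hypothetical names, n is a fixed integer with n >= 1, and plain iteration (no widening) is used, so it takes about n rounds.

```c
#include <assert.h>

typedef struct { long lo, hi; } Interval;   /* closed interval; hypothetical helper type */

/* Smallest interval containing both a and b. */
static Interval join(Interval a, Interval b) {
    Interval r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Iterates the transfer functions of the loop
       i = 0; while (i < n) { *(p+i) += 1; i = i+1; }
   to a fixed point for one concrete n >= 1.
   b2 is the range of i at the test i < n, b3 the range at the loop body. */
static void loop_bounds(long n, Interval *b2, Interval *b3) {
    assert(n >= 1);
    Interval after_init = {0, 0};            /* effect of i = 0 */
    *b2 = after_init;
    for (;;) {
        /* true branch of i < n: clamp the upper bound to n-1 */
        Interval body = { b2->lo, b2->hi < n - 1 ? b2->hi : n - 1 };
        /* effect of i = i+1 */
        Interval after_inc = { body.lo + 1, body.hi + 1 };
        /* block 2 is reached from block 1 and from block 3 */
        Interval next = join(after_init, after_inc);
        *b3 = body;
        if (next.lo == b2->lo && next.hi == b2->hi) break;
        *b2 = next;
    }
}
```

For n = 5 this settles at 0 ≤ i ≤ 5 at the loop test and 0 ≤ i ≤ 4 in the body, that is, [0,n] and [0,n-1]. The symbolic formulation in the following steps obtains the same shape of answer for all n at once.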

  41. Bounds Analysis, Step 5: Generate symbolic expressions for bounds
      Goal: express bounds in terms of parameters
        l2 = c1*p + c2*n + c3
        l3 = c4*p + c5*n + c6
        u2 = c7*p + c8*n + c9
        u3 = c10*p + c11*n + c12

  42. Bounds Analysis, Step 6: Substitute expressions into constraints
        c1*p + c2*n + c3 ≤ 0
        c1*p + c2*n + c3 ≤ c4*p + c5*n + c6 + 1
        c4*p + c5*n + c6 ≤ c1*p + c2*n + c3
        0 ≤ c7*p + c8*n + c9
        c10*p + c11*n + c12 + 1 ≤ c7*p + c8*n + c9
        n - 1 ≤ c10*p + c11*n + c12

  43. Goal: Solve Symbolic Constraint System
      • Find values for constraint variables c1, ..., c12 that satisfy the inequality constraints
      • Maximize Lower Bounds
      • Minimize Upper Bounds

  44. Bounds Analysis, Step 7: Reduce symbolic inequalities to linear inequalities
      c1*p + c2*n + c3 ≤ c4*p + c5*n + c6 if c1 ≤ c4, c2 ≤ c5, and c3 ≤ c6
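The reduction rule can be written as a coefficient-wise comparison. The following sketch is an assumption-laden illustration: `SymExpr`, `sym_le`, and `sym_eval` are hypothetical names, doubles stand in for the rational coefficients, and the rule is treated as valid only when p and n are non-negative (the slide does not state this condition explicitly).

```c
#include <assert.h>
#include <stdbool.h>

/* Symbolic bound cp*p + cn*n + c0 over the parameters p and n
   (hypothetical representation; doubles stand in for rational coefficients). */
typedef struct { double cp, cn, c0; } SymExpr;

/* Sufficient test for a <= b: compare coefficient by coefficient.
   Assumed valid only for p >= 0 and n >= 0. */
static bool sym_le(SymExpr a, SymExpr b) {
    return a.cp <= b.cp && a.cn <= b.cn && a.c0 <= b.c0;
}

/* Value of the expression for concrete p and n. */
static double sym_eval(SymExpr e, double p, double n) {
    return e.cp * p + e.cn * n + e.c0;
}
```

The test is sufficient but not necessary: it can answer false for a pair of expressions that are in fact ordered for every non-negative p and n, which is the price of turning each symbolic inequality into three linear ones.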

  45. Bounds Analysis, Step 7: Apply reduction and generate a linear program
      Lower bounds:
        c1 ≤ 0     c2 ≤ 0     c3 ≤ 0
        c1 ≤ c4    c2 ≤ c5    c3 ≤ c6+1
        c4 ≤ c1    c5 ≤ c2    c6 ≤ c3
      Upper bounds:
        0 ≤ c7     0 ≤ c8     0 ≤ c9
        c10 ≤ c7   c11 ≤ c8   c12+1 ≤ c9
        0 ≤ c10    1 ≤ c11    -1 ≤ c12
      Objective Function: max: (c1 + ... + c6) - (c7 + ... + c12)
      (the first sum covers the lower bounds, the second the upper bounds)

  46. Bounds Analysis, Step 7 • Apply reduction and generate a linear program • This is a linear program (LP), not an integer linear program (ILP) • The coefficients in the symbolic expressions are rational numbers • Rational coefficients are needed for expressions like middle of an array: low+(high - low)/2

  47-49. Bounds Analysis, Step 8: Solve linear program to extract bounds
      c1 = 0    c2 = 0    c3 = 0
      c4 = 0    c5 = 0    c6 = 0
      c7 = 0    c8 = 1    c9 = 0
      c10 = 0   c11 = 1   c12 = -1
      so l2 = 0, l3 = 0, u2 = n, u3 = n-1
      Bounds in the control flow graph after substitution:
        Block 1: -∞ ≤ i ≤ +∞ before i = 0; 0 ≤ i ≤ 0 after
        Block 2: 0 ≤ i ≤ n before i < n; 0 ≤ i ≤ n-1 on the true branch, 0 ≤ i ≤ n on the false branch
        Block 3: 0 ≤ i ≤ n-1 before *(p+i) += 1 and i = i+1; 1 ≤ i ≤ n after
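The reported coefficients can be checked against a reduced constraint system by direct evaluation. The sketch below is not an LP solver and is not from the slides: `satisfies_lp` and `objective` are hypothetical names, and the constraint set it encodes is an assumption, namely the coefficient-wise reduction of l2 ≤ 0, l2 ≤ l3+1, l3 ≤ l2, 0 ≤ u2, u3+1 ≤ u2, and n-1 ≤ u3, with l2, l3, u2, u3 expressed through c1 to c12 as in Step 5.

```c
#include <assert.h>
#include <stdbool.h>

/* c[1..12] are the coefficients: l2 = c1*p + c2*n + c3, l3 = c4*p + c5*n + c6,
   u2 = c7*p + c8*n + c9, u3 = c10*p + c11*n + c12. c[0] is unused.
   The constraint set is the assumption stated in the lead-in. */
static bool satisfies_lp(const double c[13]) {
    bool lower =
        c[1] <= 0    && c[2] <= 0    && c[3] <= 0 &&          /* l2 <= 0    */
        c[1] <= c[4] && c[2] <= c[5] && c[3] <= c[6] + 1 &&   /* l2 <= l3+1 */
        c[4] <= c[1] && c[5] <= c[2] && c[6] <= c[3];         /* l3 <= l2   */
    bool upper =
        0 <= c[7]     && 0 <= c[8]     && 0 <= c[9] &&          /* 0 <= u2    */
        c[10] <= c[7] && c[11] <= c[8] && c[12] + 1 <= c[9] &&  /* u3+1 <= u2 */
        0 <= c[10]    && 1 <= c[11]    && -1 <= c[12];          /* n-1 <= u3  */
    return lower && upper;
}

/* Objective from slide 45: sum of lower-bound coefficients minus sum of
   upper-bound coefficients (to be maximized). */
static double objective(const double c[13]) {
    double s = 0;
    for (int i = 1; i <= 6; i++)  s += c[i];
    for (int i = 7; i <= 12; i++) s -= c[i];
    return s;
}
```

With the coefficients from Step 8 the check succeeds; changing c12 from -1 to 0 (that is, claiming u3 = n) violates u3+1 ≤ u2 and is rejected. Finding the optimum rather than checking a candidate would require an actual LP solver.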

  50. Region Analysis
      Goal: Compute Accessed Regions of Memory
      • Intra-Procedural
        • Use bounds at each load or store
        • Compute accessed region
      • Inter-Procedural
        • Use intra-procedural results
        • Set up another symbolic constraint system
        • Solve to find regions accessed by entire execution of the procedure
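For the array-increment example f(p,n), the two levels of region analysis can be mimicked for a concrete n by recursion over the call structure. This is a minimal sketch under stated assumptions, not the symbolic constraint system the slides refer to: `Region` and `f_region` are hypothetical names, offsets are relative to p, the base-case region comes from the loop bounds 0 ≤ i ≤ n-1, and the sketch assumes n >= 1 and cutoff >= 1.

```c
#include <assert.h>

typedef struct { long lo, hi; } Region;  /* closed offset interval relative to p; hypothetical */

/* Region accessed by f(p+off, n) for a concrete n >= 1.
   Base case (intra-procedural): the loop accesses [off, off+n-1].
   Recursive case (inter-procedural): hull of the regions of the two calls
   f(p, n/2) and f(p+n/2, n/2). */
static Region f_region(long off, long n, long cutoff) {
    assert(n >= 1 && cutoff >= 1);
    if (n > cutoff) {
        Region a = f_region(off, n / 2, cutoff);
        Region b = f_region(off + n / 2, n / 2, cutoff);
        Region r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
        return r;
    }
    Region base = { off, off + n - 1 };
    return base;
}
```

For every n the result stays inside [0, n-1], which corresponds to the symbolic region [p, p+n-1]; with odd sizes the integer division n/2 can leave the accessed region strictly smaller than that bound.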
