Design-Driven Compilation Radu Rugina and Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology
Goal: Parallelization
[Diagram labels: Computation, +, Fully Automatic, Design Driven]
Overview
• Analysis Problems: Points-to Analysis, Region Analysis
• Two Potential Solutions
• Evaluation
Example - Divide and Conquer Sort
    7 4 6 1 3 5 8 2          input
    7 4 | 6 1 | 3 5 | 8 2    Divide
    4 7 | 1 6 | 3 5 | 2 8    Conquer
    1 4 6 7 | 2 3 5 8        Combine
    1 2 3 4 5 6 7 8          result
Divide and Conquer Algorithms
• Lots of Recursively Generated Concurrency
• Recursively Solve Subproblems in Parallel
• Combine Results in Parallel
“Sort n Items in d, Using t as Temporary Storage”
    void sort(int *d, int *t, int n) {
        if (n > CUTOFF) {
            sort(d,t,n/4);
            sort(d+n/4,t+n/4,n/4);
            sort(d+n/2,t+n/2,n/4);
            sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
            merge(d,d+n/4,d+n/2,t);
            merge(d+n/2,d+3*(n/4),d+n,t+n/2);
            merge(t,t+n/2,t+n,d);
        } else insertionSort(d,d+n);
    }
“Recursively Sort Four Quarters of d”
    sort(d,t,n/4);
    sort(d+n/4,t+n/4,n/4);
    sort(d+n/2,t+n/2,n/4);
    sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
• Divide array into subarrays and recursively sort subarrays
• Subproblems Identified Using Pointers Into Middle of Array: d, d+n/4, d+n/2, d+3*(n/4)
• Sorted Results Written Back Into Input Array
    before: 7 4 | 6 1 | 3 5 | 8 2
    after:  4 7 | 1 6 | 3 5 | 2 8
“Merge Sorted Quarters of d Into Halves of t”
    merge(d,d+n/4,d+n/2,t);
    merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    d: 4 7 | 1 6 | 3 5 | 2 8
    t: 1 4 6 7 | 2 3 5 8       (halves start at t and t+n/2)
“Merge Sorted Halves of t Back Into d”
    merge(t,t+n/2,t+n,d);
    t: 1 4 6 7 | 2 3 5 8
    d: 1 2 3 4 5 6 7 8
“Use a Simple Sort for Small Problem Sizes”
    } else insertionSort(d,d+n);
[Figure: insertionSort applied to the subarray between d and d+n; the array 7 4 6 1 3 5 8 2 becomes 7 4 1 6 3 5 8 2]
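The sequential algorithm walked through above can be put together as one compilable unit. This is a minimal sketch, not the authors' code: the CUTOFF value is an assumption (the slides give none), insertionSort is written so that it never forms a pointer below l, and the midpoint is taken as 2*(n/4) rather than n/2 so that the four quarters tile the array even when n is not a multiple of 4.

```c
#include <assert.h>

#define CUTOFF 4  /* assumed threshold; the slides do not give a value */

/* Sort the range [l, h) in place. */
void insertionSort(int *l, int *h) {
    if (h - l < 2) return;
    for (int *p = l + 1; p < h; p++) {
        int k = *p;
        int *q = p;
        while (q > l && k < *(q - 1)) {
            *q = *(q - 1);
            q--;
        }
        *q = k;
    }
}

/* Merge the sorted runs [l1, m) and [m, h2) into the buffer starting at d. */
void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2)
        *d++ = (*l1 < *l2) ? *l1++ : *l2++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Sort n items in d, using t (also n items) as temporary storage. */
void sort(int *d, int *t, int n) {
    assert(n >= 0);
    if (n > CUTOFF) {
        int q = n / 4;   /* size of the first three quarters */
        int h = 2 * q;   /* start of the third quarter (assumption: 2*(n/4)) */
        sort(d, t, q);
        sort(d + q, t + q, q);
        sort(d + h, t + h, q);
        sort(d + 3 * q, t + 3 * q, n - 3 * q);
        merge(d, d + q, d + h, t);              /* quarters 1,2 -> first part of t */
        merge(d + h, d + 3 * q, d + n, t + h);  /* quarters 3,4 -> rest of t */
        merge(t, t + h, t + n, d);              /* both parts of t -> d */
    } else {
        insertionSort(d, d + n);
    }
}

/* Helper for checking results: returns 1 when a[0..n) is nondecreasing. */
int is_sorted(const int *a, int n) {
    for (int i = 1; i < n; i++)
        if (a[i - 1] > a[i]) return 0;
    return 1;
}
```

Calling sort(a, t, 8) on the slide's input 7 4 6 1 3 5 8 2 with CUTOFF 4 reproduces the walkthrough: four two-element quarters go to insertionSort, then two merges into t and one merge back into a.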
Parallel Sort
    void sort(int *d, int *t, int n) {
        if (n > CUTOFF) {
            spawn sort(d,t,n/4);
            spawn sort(d+n/4,t+n/4,n/4);
            spawn sort(d+n/2,t+n/2,n/4);
            spawn sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
            sync;
            spawn merge(d,d+n/4,d+n/2,t);
            spawn merge(d+n/2,d+3*(n/4),d+n,t+n/2);
            sync;
            merge(t,t+n/2,t+n,d);
        } else insertionSort(d,d+n);
    }
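The spawn and sync keywords are not standard C. As a rough illustration only, the same structure can be approximated with OpenMP tasks, where "#pragma omp task" stands in for spawn and "#pragma omp taskwait" for sync. This substitution is an assumption of the sketch, not something the slides prescribe. CUTOFF, psort and parallelSort are hypothetical names, and the quarter boundaries use 2*(n/4) as the midpoint so that any n is handled. Compiled without OpenMP support the pragmas are ignored and the code simply runs sequentially.

```c
#include <assert.h>

#define CUTOFF 4  /* assumed threshold; the slides do not give a value */

static void insertionSort(int *l, int *h) {
    if (h - l < 2) return;
    for (int *p = l + 1; p < h; p++) {
        int k = *p;
        int *q = p;
        while (q > l && k < *(q - 1)) { *q = *(q - 1); q--; }
        *q = k;
    }
}

static void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2)
        *d++ = (*l1 < *l2) ? *l1++ : *l2++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Recursive worker: each task touches a disjoint part of d and t. */
static void psort(int *d, int *t, int n) {
    if (n > CUTOFF) {
        int q = n / 4, h = 2 * q;
        #pragma omp task
        psort(d, t, q);
        #pragma omp task
        psort(d + q, t + q, q);
        #pragma omp task
        psort(d + h, t + h, q);
        #pragma omp task
        psort(d + 3 * q, t + 3 * q, n - 3 * q);
        #pragma omp taskwait   /* plays the role of the first sync */
        #pragma omp task
        merge(d, d + q, d + h, t);
        #pragma omp task
        merge(d + h, d + 3 * q, d + n, t + h);
        #pragma omp taskwait   /* plays the role of the second sync */
        merge(t, t + h, t + n, d);
    } else {
        insertionSort(d, d + n);
    }
}

/* Entry point: one thread creates the root task, the team executes tasks. */
void parallelSort(int *d, int *t, int n) {
    assert(n >= 0);
    #pragma omp parallel
    #pragma omp single
    psort(d, t, n);
}
```

The tasks are safe to run concurrently only because of the facts the following slides set out to establish: d and t are different blocks, and the calls between two synchronization points access disjoint regions.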
What Do You Need To Know To Exploit This Form of Parallelism?
• Points-to Information (data blocks that pointers point to)
• Region Information (accessed regions within data blocks)
Information Needed To Exploit Parallelism
• d and t point to different memory blocks
• Calls to sort access disjoint parts of d and t
• Together, calls access [d,d+n-1] and [t,t+n-1]
    sort(d,t,n/4);
    sort(d+n/4,t+n/4,n/4);
    sort(d+n/2,t+n/2,n/4);
    sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
[Figure: for each call, the accessed sub-range of [d,d+n-1] and of [t,t+n-1]]
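To make the disjointness claim concrete, the regions of the four recursive calls can be instantiated for a given n and checked directly. This is only a sketch under stated assumptions: Region, sort_call_region and the other names are hypothetical, offsets are element indices relative to d (the same offsets apply to t), and the k-th call is assumed to start at k*(n/4). It checks one concrete n at a time, whereas the analysis described in the talk reasons symbolically.

```c
#include <assert.h>
#include <stdbool.h>

/* Closed interval of element offsets [lo, hi] within one memory block. */
typedef struct { int lo, hi; } Region;

/* Region accessed by the k-th recursive call (k = 0..3):
   start k*(n/4), length n/4, except the last call, which takes the rest. */
Region sort_call_region(int k, int n) {
    assert(k >= 0 && k < 4 && n >= 4);
    int q = n / 4;
    int start = k * q;
    int len = (k < 3) ? q : n - 3 * q;
    Region r = { start, start + len - 1 };
    return r;
}

bool regions_disjoint(Region a, Region b) {
    return a.hi < b.lo || b.hi < a.lo;
}

/* True when the four regions are pairwise disjoint and together
   cover exactly [0, n-1]. */
bool sort_calls_independent(int n) {
    Region r[4];
    int covered = 0;
    for (int k = 0; k < 4; k++) {
        r[k] = sort_call_region(k, n);
        covered += r[k].hi - r[k].lo + 1;
    }
    for (int i = 0; i < 4; i++)
        for (int j = i + 1; j < 4; j++)
            if (!regions_disjoint(r[i], r[j])) return false;
    return covered == n && r[0].lo == 0 && r[3].hi == n - 1;
}
```

For n = 11 the regions come out as [0,1], [2,3], [4,5] and [6,10], which are pairwise disjoint and sum to 11 elements.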
Information Needed To Exploit Parallelism
• d and t point to different memory blocks
• First two calls to merge access disjoint parts of d and t
• Together, calls access [d,d+n-1] and [t,t+n-1]
    merge(d,d+n/4,d+n/2,t);
    merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    merge(t,t+n/2,t+n,d);
[Figure: for each call, the accessed sub-range of [d,d+n-1] and of [t,t+n-1]]
Information Needed To Exploit Parallelism
• Calls to insertionSort access [d,d+n-1]
    insertionSort(d,d+n);
What Do You Need To Know To Exploit This Form of Parallelism?
• Points-to Information (d and t point to different data blocks)
• Symbolic Region Information (accessed regions within d and t blocks)
How Hard Is It To Figure These Things Out? Challenging
How Hard Is It To Figure These Things Out?
    void insertionSort(int *l, int *h) {
        int *p, *q, k;
        for (p = l+1; p < h; p++) {
            for (k = *p, q = p-1; l <= q && k < *q; q--)
                *(q+1) = *q;
            *(q+1) = k;
        }
    }
Not immediately obvious that insertionSort(l,h) accesses [l,h-1]
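One way to build confidence in the [l,h-1] claim, short of the static analysis the talk is about, is to instrument the routine and record the lowest and highest address it dereferences. The sketch below does that; touch, lo_seen, hi_seen and accesses_within_bounds are hypothetical names introduced for illustration. A passing run shows the bound held for that input only, not for all inputs. Because the slide's loop can compute l-1 (without dereferencing it), the caller here is assumed to pass a range strictly inside a larger buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Lowest and highest addresses dereferenced so far (NULL until first access). */
static int *lo_seen, *hi_seen;

/* Record an access to *p and hand the pointer back for use. */
static int *touch(int *p) {
    if (lo_seen == NULL || p < lo_seen) lo_seen = p;
    if (hi_seen == NULL || p > hi_seen) hi_seen = p;
    return p;
}

/* insertionSort from the slide, with every dereference routed through touch(). */
void insertionSortTraced(int *l, int *h) {
    int *p, *q, k;
    for (p = l + 1; p < h; p++) {
        for (k = *touch(p), q = p - 1; l <= q && k < *touch(q); q--)
            *touch(q + 1) = *touch(q);
        *touch(q + 1) = k;
    }
}

/* Sort [l, h) and report whether every recorded access fell inside [l, h-1]. */
int accesses_within_bounds(int *l, int *h) {
    assert(l <= h);
    lo_seen = hi_seen = NULL;
    insertionSortTraced(l, h);
    if (lo_seen == NULL) return 1;  /* fewer than two elements: nothing touched */
    return l <= lo_seen && hi_seen <= h - 1;
}
```

The test q >= l guards every dereference of q, which is why the lowest address touched is l even though q itself may step one position below it.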
How Hard Is It To Figure These Things Out?
    void merge(int *l1, int *m, int *h2, int *d) {
        int *h1 = m;
        int *l2 = m;
        while ((l1 < h1) && (l2 < h2))
            if (*l1 < *l2) *d++ = *l1++;
            else *d++ = *l2++;
        while (l1 < h1) *d++ = *l1++;
        while (l2 < h2) *d++ = *l2++;
    }
Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
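The destination half of that claim, that merge writes exactly [d,d+(h-l)-1], can be spot-checked by surrounding the output buffer with guard cells. The following is a sketch under assumptions: GUARD is an arbitrary sentinel assumed not to occur in the data, merge_stays_in_region is a hypothetical helper, and inputs are limited to 64 elements. It does not check the reads from [l,h-1], and like any test it covers only the inputs it is run on.

```c
#include <assert.h>

#define GUARD (-12345)  /* arbitrary sentinel, assumed absent from the data */
#define MAX_N 64        /* assumed upper bound on input size for this sketch */

/* merge as shown on the slide. */
void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m;
    int *l2 = m;
    while ((l1 < h1) && (l2 < h2))
        if (*l1 < *l2) *d++ = *l1++;
        else *d++ = *l2++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Merge src[0..mid) and src[mid..n) into a guarded buffer. Returns 1 when
   all n output slots were written and both guard cells are untouched. */
int merge_stays_in_region(int *src, int mid, int n) {
    int out[MAX_N + 2];
    assert(n >= 0 && n <= MAX_N && mid >= 0 && mid <= n);
    for (int i = 0; i < n + 2; i++) out[i] = GUARD;
    merge(src, src + mid, src + n, out + 1);
    for (int i = 1; i <= n; i++)
        if (out[i] == GUARD) return 0;  /* a slot inside the region was skipped */
    return out[0] == GUARD && out[n + 1] == GUARD;
}
```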
Issues
• Heavy Use of Pointers
  • Pointers into Middle of Arrays
  • Pointer Arithmetic
  • Pointer Comparison
• Multiple Procedures
  • sort(int *d, int *t, int n)
  • insertionSort(int *l, int *h)
  • merge(int *l, int *m, int *h, int *t)
• Recursion
Fully Automatic Solution
• Whole-program pointer analysis
  • Context-sensitive, flow-sensitive
  • Rugina and Rinard, PLDI 1999
• Whole-program region analysis
  • Symbolic constraint systems
  • Solve by reducing to linear programs
  • Rugina and Rinard, PLDI 2000
Key Complication
Need for sophisticated interprocedural analyses
• Pointer analysis
  • Propagate analysis results through call graph
  • Fixed-point algorithm for recursive programs
• Region analysis
  • Formulation avoids fixed-point algorithms
  • Single constraint system for each strongly connected component
• Need to have whole program in analyzable form
Bigger Picture
• Points-to and region information is (implicitly) part of the interface of each procedure
• Programmer understands procedure interfaces
• Programmer knows
  • Points-to relationships on entry
  • Effect of procedure on points-to relationships
  • Regions of memory blocks that procedure accesses
Idea
Enhance procedure interface to make points-to and region information explicit
• Points-to language
  • Points-to graphs at entry and exit
  • Effect on points-to relationships
• Region language
  • Symbolic specification of accessed regions
• Programmer provides information
• Analysis verifies that it is correct
Points-to Language
    f(p, q, n) {
        context {
            entry: p->_a, q->_b;
            exit:  p->_a, _a->_c, q->_b, _b->_d;
        }
        context {
            entry: p->_a, q->_a;
            exit:  p->_a, _a->_c, q->_a;
        }
    }
[Figure: Contexts for f(p,q,n), drawn as entry and exit points-to graphs over p and q]
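The two declared contexts can be written down as plain data, which makes it easier to see what each one asserts. This is an illustrative encoding, not the representation used by the authors' compiler: Edge, Graph, Context, has_edge and may_alias are hypothetical names, and a points-to graph is stored as a flat list of "src -> dst" edges.

```c
#include <assert.h>
#include <string.h>

#define MAX_EDGES 8

typedef struct { const char *src, *dst; } Edge;          /* src -> dst */
typedef struct { Edge edges[MAX_EDGES]; int n; } Graph;  /* points-to graph */
typedef struct { Graph on_entry, on_exit; } Context;

/* The two contexts declared for f(p, q, n) on the slide. */
const Context f_contexts[2] = {
    {   /* context 0: p and q point to different blocks */
        { { {"p", "_a"}, {"q", "_b"} }, 2 },
        { { {"p", "_a"}, {"_a", "_c"}, {"q", "_b"}, {"_b", "_d"} }, 4 }
    },
    {   /* context 1: p and q point to the same block */
        { { {"p", "_a"}, {"q", "_a"} }, 2 },
        { { {"p", "_a"}, {"_a", "_c"}, {"q", "_a"} }, 3 }
    }
};

/* True when g contains the edge src -> dst. */
int has_edge(const Graph *g, const char *src, const char *dst) {
    assert(g && src && dst);
    for (int i = 0; i < g->n; i++)
        if (strcmp(g->edges[i].src, src) == 0 &&
            strcmp(g->edges[i].dst, dst) == 0)
            return 1;
    return 0;
}

/* True when x and y have a common target in g, i.e. they may alias. */
int may_alias(const Graph *g, const char *x, const char *y) {
    assert(g && x && y);
    for (int i = 0; i < g->n; i++)
        for (int j = 0; j < g->n; j++)
            if (strcmp(g->edges[i].src, x) == 0 &&
                strcmp(g->edges[j].src, y) == 0 &&
                strcmp(g->edges[i].dst, g->edges[j].dst) == 0)
                return 1;
    return 0;
}
```

Read this way, the first context covers callers that pass pointers to distinct blocks and the second covers callers that pass two pointers to the same block.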
Verifying Points-to Information
One (flow sensitive) analysis per context
    f(p,q,n) { . . . }
• Start with entry points-to graph
• Analyze procedure
• Check result against exit points-to graph
• Similarly for other context: start with its entry points-to graph, analyze procedure, check result against its exit points-to graph
[Figure: Contexts for f(p,q,n), entry and exit points-to graphs, with the graph computed by the analysis shown next to the declared exit graph]
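The final step, checking the analysis result against the declared exit graph, amounts to a comparison of edge sets. The sketch below shows only that comparison. The flow-sensitive analysis of the procedure body is out of scope here, so the "computed" graph is supplied by hand. It also assumes a may-points-to reading in which the declaration is acceptable when every computed edge appears in the declared exit graph. The slides do not spell out the exact acceptance rule, so treat that as an assumption, along with all type and function names.

```c
#include <assert.h>
#include <string.h>

#define MAX_EDGES 8

typedef struct { const char *src, *dst; } Edge;          /* src -> dst */
typedef struct { Edge edges[MAX_EDGES]; int n; } Graph;  /* points-to graph */

/* True when g contains the edge e (order of edges in g is irrelevant). */
static int contains_edge(const Graph *g, Edge e) {
    for (int i = 0; i < g->n; i++)
        if (strcmp(g->edges[i].src, e.src) == 0 &&
            strcmp(g->edges[i].dst, e.dst) == 0)
            return 1;
    return 0;
}

/* Assumed acceptance rule: every edge the analysis computed at procedure
   exit must be present in the programmer's declared exit graph. */
int exit_graph_ok(const Graph *computed, const Graph *declared) {
    assert(computed && declared);
    for (int i = 0; i < computed->n; i++)
        if (!contains_edge(declared, computed->edges[i]))
            return 0;
    return 1;
}
```

The same check would be repeated once per declared context, each time starting the analysis from that context's entry graph.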
Analysis of Call Statements
    g(r,n) {
        . .
        f(r,s,n);
        . .
    }
• Analysis produces points-to graph before call (over r and s)
• Retrieve declared contexts from callee
• Find context with matching entry graph
[Figure: points-to graph for r and s at the call site next to the declared contexts for f(p,q,n), entry and exit]
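For a callee with two pointer parameters, finding the matching context reduces to comparing aliasing patterns: do the actuals r and s point to the same block, and do the formals p and q in the context's entry graph? The sketch below implements just that special case. EntryPattern, find_context and the caller block names "x" and "y" are hypothetical, and the placeholder names _a and _b are assumed to denote distinct blocks. A general matcher would need to build a mapping between placeholder blocks and caller blocks over whole graphs.

```c
#include <assert.h>
#include <string.h>

/* Entry graph of one context, reduced to the block each formal points to. */
typedef struct { const char *p_target, *q_target; } EntryPattern;

/* Entry graphs of the two contexts declared for f(p, q, n). */
const EntryPattern f_entries[2] = {
    { "_a", "_b" },  /* context 0: p->_a, q->_b */
    { "_a", "_a" }   /* context 1: p->_a, q->_a */
};

/* Return the index of the first context whose formals alias exactly when
   the actuals do, or -1 when no declared context matches the call site. */
int find_context(const EntryPattern *ctxs, int nctx,
                 const char *r_block, const char *s_block) {
    assert(ctxs && r_block && s_block);
    int actuals_alias = strcmp(r_block, s_block) == 0;
    for (int i = 0; i < nctx; i++) {
        int formals_alias = strcmp(ctxs[i].p_target, ctxs[i].q_target) == 0;
        if (formals_alias == actuals_alias)
            return i;
    }
    return -1;
}
```

Once a context is selected, its declared exit graph can be used to continue analyzing the caller without re-analyzing the callee.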