1 / 24

Programming by Sketching

Programming by Sketching. Armando Solar-Lezama, Liviu Tancau, Gilad Arnold, Rastislav Bodik , Sanjit Seshia UC Berkeley, Rodric Rabbah MIT, Kemal Ebcioglu, Vijay Saraswat, Vivek Sarkar IBM. Merge sort. int[] mergeSort (int[] input, int n) { return merge ( mergeSort (input[0::n/2]),

alair
Download Presentation

Programming by Sketching

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Programming by Sketching Armando Solar-Lezama, Liviu Tancau, Gilad Arnold, Rastislav Bodik, Sanjit Seshia UC Berkeley, Rodric Rabbah MIT, Kemal Ebcioglu, Vijay Saraswat, Vivek Sarkar IBM

  2. Merge sort int[] mergeSort (int[] input, int n) { return merge(mergeSort (input[0::n/2]), mergeSort (input[n/2+1::n]) , n); } int[] merge (int[] a, int b[], int n) { int j=0, k=0; for (int i = 0; i < n; i++) if ( a[j] < b[k] ) { result[i] = a[j++]; } else { result[i] = b[k++]; } } return result; } looks simple to code, but there is a bug

  3. Merge sort int[] mergeSort (int[] input, int n) { return merge( mergeSort (input[0::n/2]), mergeSort (input[n/2+1::n]) , n); } int[] merge (int[] a, int b[], int n) { int j, k; for (int i = 0; i < n; i++) if ( j<n&& ( !(k<n) || a[j] < b[k]) ) { result[i] = a[j++]; } else { result[i] = b[k++]; } } return result; }

  4. + spec specification implementation (completed sketch) The sketching experience sketch

  5. The spec: bubble sort int[] sort (int[] input, int n) { for (int i=0; i<n; ++i) for (int j=i+1; j<n; ++j) if (input[j] < input[i]) swap(input, j, i); }

  6. Merge sort: sketched int[] mergeSort (int[] input, int n) { return merge( mergeSort (input[0::n/2]), mergeSort (input[n/2+1::n]) , n); } int[] merge (int[] a, int b[], int n) { int j, k; for (int i = 0; i < n; i++) if ( expression( ||, &&, <, !, [] ) ) { result[i] = a[j++]; } else { result[i] = b[k++]; } } return result; } hole

  7. Merge sort: synthesized int[] mergeSort (int[] input, int n) { return merge( mergeSort (input[0::n/2]), mergeSort (input[n/2::n]) ); } int[] merge (int[] a, int b[], int n) { int j, k; for (int i = 0; i < n; i++) if ( j<n&& ( !(k<n) || a[j] < b[k]) ) { result[i] = a[j++]; } else { result[i] = b[k++]; } } return result; }

  8. Sketching: spec vs. sketch • Specification • executable: easy to debug, serves as a prototype • a reference implementation: simple and sequential • written by domain experts: crypto, bio, MPEG committee • Sketched implementation • program with holes: filled in by synthesizer • programmer sketches strategy: machine provides details • written by performance experts: vector wizard; cache guru

  9. How sketching fits into autotuning • Autotuning: two methods for obtaining code variants • optimizing compiler: transform a “spec” in various ways • custom generator: for a specific algorithm • We seek to simplify the second approach • Scenario 1: library of variants stores resolved sketches • as if written by hand • Scenario 2: library has unresolved,flexible sketches • sketch works for a variety of specifications: e.g., a class of stencils

  10. SKETCH • A language with support for sketching-based synthesis • like C without pointers • two simple synthesis constructs • restricted to finite programs: • input size known at compile time, terminates on all inputs • most high-performance kernels are finite: • matrix multiply: yes • binary search tree: no • we’re already working on relaxing the fineteness restriction • later in this talk

  11. Ex1: Isolate rightmost 0-bit. 1010 0111  0000 1000 bit[W] isolate0 (bit[W] x) { // W: word sizebit[W] ret = 0;for (int i = 0; i < W; i++) if (!x[i]) { ret[i] = 1; break; } return ret; } bit[W] isolate0Fast (bit[W] x) implements isolate0 { return ~x & (x+1); } bit[W] isolate0Sketched (bit[W] x) implements isolate0 { return ~(x + ??) & (x + ??); }

  12. Programmer’s view of sketches • the ?? operator replaced with a suitable constant • as directed by the implementsclause. • the ?? operator introduces non-determinism • the implementsclause constrains it.

  13. Beyond synthesis of literals • Synthesizing values of ?? already very useful • parallelization machinery: bitmasks, tables in crypto codes • array indices: A[i+??,j+??] • We can synthesize more than constants • semi-permutations: functions that select and shuffle bits • polynomials: over one or more variables • actually, arbitrary expressions, programs

  14. Synthesizing polynomials int spec (int x) { return 2*x*x*x*x + 3*x*x*x + 7*x*x + 10; } int p (int x) implements spec { return (x+1)*(x+2)*poly(3,x); } int poly(int n, int x) { if (n==0) return ??; elsereturn x * poly(n-1, x) + ??; }

  15. Karatsuba’s multiplication x = x1*b + x0 y = y1*b + y0 b=2k x*y = b2*x1*y1 + b*(x1*y0 + x0*y1) + x0*y0 x*y = poly(??,b) * x1*y1 + + poly(??,b) * poly(1,x1,x0,y1,y0)*poly(1,x1, x0, y1, y0) + poly(??,b) * x0*y0 x*y = (b2 +b) * x1*y1 + b *(x1 - x0)*(y1 - y0) + (b+1) * x0*y0

  16. Sketch of Karatsuba bit[N*2] k<int N>(bit[N] x, bit[N] y) implements mult { if (N<=1) return x*y; bit[N/2] x1 = x[0:N/2-1]; bit[N/2+1] x2 = x[N/2:N-1]; bit[N/2] y1 = y[0:N/2-1]; bit[N/2+1] y2 = y[N/2:N-1]; bit[2*N] t11 = x1 * y1; bit[2*N] t12 = poly(1, x1, x2, y1, y2) * poly(1, x1, x2, y1, y2); bit[2*N] t22 = x2 * y2; return multPolySparse<2*N>(2, N/2, t11) // log b = N/2 + multPolySparse<2*N>(2, N/2, t12) + multPolySparse<2*N>(2, N/2, t22); } bit[2*N] poly<int N>(int n, bit[N] x0, x1, x2, x3) { if (n<=0) return ??; else return (??*x0 + ??*x1 + ??*x2 + ??*x3) * poly<N>(n-1, x0, x1, x2, x3); } bit[2*N] multPolySparse<int N>(int n, int x, bit[N] y) { if (n<=0) return 0; else return y << x*?? + multPolySparse<N>(n-1, x, y); }

  17. Semantic view of sketches • a sketch represents a set of functions: • the ?? operator modeled as reading from an oracle int f (int y) { int f (int y, bit[][K] oracle) { x = ??; x = oracle[0]; loop (x) { loop (x) { y = y + ??; y = y + oracle[1]; } } return y; return y; } } Synthesizer must find oracle satisfying f implements g

  18. Synthesis algorithm: overview • translation: represent spec and sketch as circuits • synthesis: find suitable oracle • code generation: specialize sketch wrt oracle

  19. 0 0 0 0 0 0 0 1 + mux + mux + mux + mux Ex : Population count. 0010 0110  3 x count one int pop (bit[W] x) { int count = 0; for (int i = 0; i < W; i++) { if (x[i]) count++; } return count; } count count count F(x) = count

  20. Synthesis as generalized SAT • The sketch synthesis problem is an instance of 2QBF:  o . x . P(x) = S(x,o) • Counter-example driven solver: I = {} x = random() do I = I U {x} c = synthesizeForSomeInputs(I) if c = nil then exit(“buggy sketch'') x = verifyForAllInputs(c) // x: counter-example while x !=nil return c • S(x1, c)=P(x1)  …  S(xk, c)=P(xk) • I ={ x1, x2, …, xk } S(x, c)  P(x)

  21. Case study • Implemented AES • the modern block-cipher standard • 14 rounds: each has table lookup, permutation, GF-multiply • a good implementation collapses each round into table lookups • Our results • we synthesized 32Kbit oracle! • synthesis time: about 1 hour • counterexample-driven synthesizer iterated 655 times • performance of synthesized code within 10% of hand-tuned

  22. Finite programs • In theory, SKETCH is complete for all finite programs: • specification can specify any finite program • sketch can describe any implementation over given instructions • synthesizer can resolve any sketch • In practice, SKETCH scales for small finite programs • small finite programs: block ciphers, small kernels • large finite: big-integer multiplication, matrix multiplication • Solution: • synthesize for a small input size • prove (or examine) that result of synthesis works for bigger inputs

  23. Lossless abstraction • Problem • does result of synthesis for a small matrix work for all matrices? • Approach • spec, sketch have unbounded-input/output • abstract them into finite functions, with the same abstraction • synthesize • obtained oracle works for original sketch • Stencil kernels • concrete: matrix A[N]  matrix B[N] • abstract: A[e(i)], i, N  B[i]

  24. Example: divide and conquer parallelization • Parallel algorithm: • Data rearrangement + parallel computation • spec: • sequential version of the program • sketch: • parallel computation • automatically synthesized: • Rearranging the data (dividing the data structure)

More Related