1 / 128

A Unified Framework for Schedule and Storage Optimization

This paper presents a unified framework for optimizing schedule and storage in a computational system, considering the tradeoff between parallelism and storage requirements.

lgatlin
Download Presentation

A Unified Framework for Schedule and Storage Optimization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Unified Framework forSchedule and Storage Optimization William Thies, Frédéric Vivien*, Jeffrey Sheldon, and Saman Amarasinghe MIT Laboratory for Computer Science * ICPS/LSIIT, Université Louis Pasteur http://compiler.lcs.mit.edu/aov

  2. Motivating Example for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  3. Motivating Example t = 1 (i, j) = (1, 1) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  4. Motivating Example t = 2 (i, j) = (1, 2) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  5. Motivating Example t = 3 (i, j) = (1, 3) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  6. Motivating Example t = 4 (i, j) = (1, 4) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  7. Motivating Example t = 5 (i, j) = (1, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  8. Motivating Example t = 6 (i, j) = (2, 1) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  9. Motivating Example t = 7 (i, j) = (2, 2) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  10. Motivating Example t = 8 (i, j) = (2, 3) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  11. Motivating Example t = 9 (i, j) = (2, 4) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  12. Motivating Example t = 10 (i, j) = (2, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  13. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

  14. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  15. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 1 (i, j) = (1, 1) init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  16. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 2 (i, j) = {(1, 2), (2, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  17. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 3 (i, j) = {(1, 3), (2, 2), (3, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  18. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 4 (i, j) = {(1, 4), (2, 3), (3, 2), (4, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  19. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 5 (i, j) = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  20. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 6 (i, j) = {(2, 5), (3, 4), (4, 3), (5, 2)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  21. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 7 (i, j) = {(3, 5), (4, 4), (5, 3)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  22. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 8 (i, j) = {(4, 5), (5, 4)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  23. Motivating Example t = 25 (i, j) = (5, 5) for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 9 (i, j) = (5, 5) init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

  24. Parallelism/Storage Tradeoff • Increasing storage can enable parallelism • But storage can be expensive • Phase ordering problem • Optimizing for storage restricts parallelism • Maximizing parallelism restricts storage options • Too complex to consider all combinations • Need efficient framework to integrate schedule and storage optimization Cache RAM Disk

  25. Outline • Abstract problem • Simplifications • Concrete problem • Solution Method • Conclusions

  26. Abstract Problem • Given DAG of dependent operations • Must execute producers before consumers • Must store a value until all consumers execute • Two parameters control execution: • 1. A scheduling function  • Maps each operation to execution time • Parallelism is implicit • 2. A fully associative store of size m

  27. Abstract Problem • We can ask three questions: • Two parameters control execution: • 1. A scheduling function  • Maps each operation to execution time • Parallelism is implicit • 2. A fully associative store of size m

  28. Abstract Problem • We can ask three questions: 1. Given , what is the smallest m? • Two parameters control execution: • 1. A scheduling function  • Maps each operation to execution time • Parallelism is implicit • 2. A fully associative store of size m

  29. Abstract Problem • We can ask three questions: 1. Given , what is the smallest m? 2. Given m, what is the “best” ? • Two parameters control execution: • 1. A scheduling function  • Maps each operation to execution time • Parallelism is implicit • 2. A fully associative store of size m

  30. Abstract Problem • We can ask three questions: 1. Given , what is the smallest m? 2. Given m, what is the “best” ? 3. What is the smallest m that is valid for all legal ? • Two parameters control execution: • 1. A scheduling function  • Maps each operation to execution time • Parallelism is implicit • 2. A fully associative store of size m

  31. Outline • Abstract problem • Simplifications • Concrete problem • Solution Method • Conclusions

  32. Simplifying the Schedule • Real programs aren’t DAG’s • Dependence graph is parameterized by loops • Too many nodes to schedule • Size could even be unknown (symbolic constants) • Use classical solution: affine schedules • Each statement has a scheduling function  • Each  is an affine function of the enclosing loop counters and symbolic constants • To simplify talk, ignore symbolic constants: (i) = B• i

  33. Simplifying the Storage Mapping • Programs use arrays, not associative maps • If size decreases, need to specify which elements are mapped to the same location

  34. Simplifying the Storage Mapping • Programs use arrays, not associative maps • If size decreases, need to specify which elements are mapped to the same location

  35. Simplifying the Storage Mapping • Programs use arrays, not associative maps • If size decreases, need to specify which elements are mapped to the same location ?

  36. Simplifying the Storage Mapping Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  37. Simplifying the Storage Mapping Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  38. Simplifying the Storage Mapping Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  39. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  40. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  41. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  42. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  43. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  44. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  45. Simplifying the Store Occupancy Vectors (Strout et al.) • Specifies unit of overwriting within an array • Locations collapsed if separated by a multiple of v v = (1, 1) j i

  46. Simplifying the Store Occupancy Vectors (Strout et al.) • For a given schedule, v is valid if semantics are unchanged using transformed array • Shorter vectors require less storage v = (1, 1) j i

  47. Outline • Abstract problem • Simplifications • Concrete problem • Solution Method • Conclusions

  48. Answering Question #1 • Given (i, j) = i + j, what is the shortest valid occupancy vector v? i j

  49. Answering Question #1 • Given (i, j) = i + j, what is the shortest valid occupancy vector v? i j

  50. Answering Question #1 • Given (i, j) = i + j, what is the shortest valid occupancy vector v? • Solution: v = (1, 1) i j

More Related