
Priority Scheduling: An Application for the Permutahedron

Priority Scheduling: An Application for the Permutahedron. Ethan Bolker, UMass-Boston and BMC Software. AMS Toronto meeting, September 24, 2000. Plan: brief introduction to queueing theory; priority scheduling; conservation laws and the permutahedron; specifying CPU shares.

Presentation Transcript


  1. Priority Scheduling: An Application for the Permutahedron. Ethan Bolker, UMass-Boston and BMC Software. AMS Toronto meeting, September 24, 2000

  2. Plan • Brief introduction to queueing theory • Priority scheduling • Conservation laws and the permutahedron • Specifying CPU shares • Interesting pictures and open questions • References: www.cs.umb.edu/~eb/goalmode • Acknowledgements: Jeff Buzen, Yiping Ding, Dan Keefe, Oliver Chen, Aaron Ball, Tom Larard

  3. Queueing theory • Workload: stream of jobs visiting a server (ATM, time shared CPU, printer, …) • Jobs queue when server is busy • Input: • Arrival rate: λ job/sec • Service demand: s sec/job • Performance metrics: • Utilization: u = λs (must be ≤ 1) • Response time: r = ??? • Degradation: d = r/s • Queue length: q = λr (Little's law)

  4. Response time computations • r, d, q measure queueing delay: r ≥ s (d ≥ 1), unless parallel processing possible • Randomness really matters: r = s (d = 1) if arrivals scheduled (best case, no waiting); r >> s for bulk arrivals (worst case, maximum delays) • Theorem. d = 1/(1-u) if arrivals are Poisson and service is exponentially distributed (M/M/1). ⇒ r = s/(1-u) (think virtual server with speed 1-u) ⇒ q = u/(1-u) (convention: job in service is on queue)

  5. M/M/1 • Essential nonlinearity often counterintuitive • at u = 90% average queue length is 0.9/(1-0.9) = 9, • average response time is s/(1-0.9) = 10s, • but 1 customer in 10 has no wait at all (10% idle time) • A useful guide even when hypotheses fail • accurate enough (± 20%) for real computer systems • d depends only on u: many small jobs have same impact as few large jobs • faster system ⇒ smaller s ⇒ smaller u; r = s/(1-u) ⇒ double win: less service, less wait • waiting costly, server cheap (telephones): want u ≈ 0 • server costly (doctors): want u ≈ 1 but scheduled
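The M/M/1 formulas on the two slides above can be checked numerically. The following is a minimal sketch; the function name `mm1_metrics` and its parameter names are illustrative, not part of the talk.

```python
def mm1_metrics(arrival_rate, service_demand):
    """Return (u, d, r, q) for an M/M/1 queue.

    arrival_rate is lambda in jobs/sec, service_demand is s in sec/job.
    """
    u = arrival_rate * service_demand          # utilization u = lambda * s
    if not 0 <= u < 1:
        raise ValueError("need 0 <= u < 1 for a stable queue")
    d = 1.0 / (1.0 - u)                        # degradation d = 1/(1-u)
    r = service_demand * d                     # response time r = s/(1-u)
    q = arrival_rate * r                       # Little's law q = lambda * r
    return u, d, r, q


# The slide's example: 90% utilization with s = 1 sec/job.
u, d, r, q = mm1_metrics(arrival_rate=0.9, service_demand=1.0)
print(f"u={u:.2f} d={d:.2f} r={r:.2f} q={q:.2f}")
```

With s = 1 the response time equals the degradation, and Little's law reproduces q = u/(1-u) = 9.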

  6. Multiple Job Streams • Multiple workloads, utilizations u1, u2, … • U = Σ ui < 1 • All degradations equal: di = 1/(1-U) • Suppose priority scheduling possible • Study degradation vector V = (d1, d2, …)

  7. Priority Scheduling • Priority state: order workloads by priority (ties OK) • two workloads, 3 states: 12, 21, [12] • three workloads, 13 states: • 123 (6 = 3! of these ordered states), • [12]3 (3 of these), • 1[23] (3 of these), • [123] (1 state with no priorities) • n wkls, f(n) states, n! ordered (Simplex lock combos) • p(s) = prob( state = s ) = fraction of time in state s • V(s) = degradation vector when state = s (measure this, or compute it using queueing theory) • V = Σ_s p(s) V(s) (time avg is convex combination) • Achievable region is convex hull of vectors V(s)
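The state counts on this slide can be reproduced by enumerating orderings with ties. A small sketch follows; the function names are hypothetical, and a state is represented as a tuple of tied groups, highest priority first.

```python
from itertools import combinations


def priority_states(workloads):
    """Yield every priority state as a tuple of tied groups, highest first."""
    workloads = tuple(workloads)
    if not workloads:
        yield ()
        return
    # Pick the non-empty top-priority group, then order the rest recursively.
    for size in range(1, len(workloads) + 1):
        for top in combinations(workloads, size):
            rest = tuple(w for w in workloads if w not in top)
            for tail in priority_states(rest):
                yield (top,) + tail


def count_states(n):
    """f(n): number of priority states for n workloads, ties allowed."""
    return sum(1 for _ in priority_states(range(1, n + 1)))


def count_ordered(n):
    """Number of states with no ties (should equal n!)."""
    return sum(
        1
        for state in priority_states(range(1, n + 1))
        if all(len(group) == 1 for group in state)
    )


for n in (2, 3, 4):
    print(n, count_states(n), count_ordered(n))
```

For n = 2 and n = 3 this gives the 3 and 13 states listed above. For n = 4 it gives 75, which counts the fully tied state [1234] in addition to the 74 proper faces of the four workload permutahedron.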

  8. Two workloads [figure: (d1, d2) plane with the line d1 = d2; the achievable region is the segment from V(12) (wkl 1 high prio) to V(21), containing V([12]) (no priorities)]

  9. Two workloads [figure: same plot, with the midpoint marked: 0.5 V(12) + 0.5 V(21) ≠ V([12])]

  10. Two workloads [figure: same plot; note: u1 < u2 ⇒ wkl 2 effect on wkl 1 large]

  11. Conservation • No Free Lunch Theorem. Weighted average degradation is constant, independent of priority scheduling scheme: Σ_i (ui/U) di = 1/(1-U) • Provable from some hypotheses • Observable in some real systems • Sometimes false: shortest job first minimizes average response time (printer queues, supermarket express checkout lines)

  12. Conservation • For any proper set A of workloads: imagine giving those workloads top priority. Then can pretend other wkls don't exist. In that case Σ_{i∈A} (ui/U(A)) di = 1/(1-U(A)) • When wkls in A have lower priorities they have higher degradations, so in general Σ_{i∈A} (ui/U(A)) di ≥ 1/(1-U(A)) • These 2^n - 2 linear inequalities determine the convex achievable region R • R is a permutahedron: only n! vertices
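The conservation constraints can be explored with a short sketch. It assumes that the vertex for a strict priority order is obtained by making the constraint tight for every top-priority prefix of that order, which is one reading of "imagine giving those workloads top priority"; under that assumption the workload at position k gets d = 1/((1-H)(1-H-u)), where H is the total utilization of the workloads above it. The function names and the sample utilizations are illustrative only.

```python
from itertools import combinations, permutations


def vertex(u, order):
    """Degradation vector for a strict priority order (highest first).

    Assumption: each prefix of the order meets its conservation law with
    equality, i.e. sum over the prefix of u_i * d_i = U(A) / (1 - U(A)).
    """
    d = [0.0] * len(u)
    higher = 0.0  # utilization of workloads with strictly higher priority
    for w in order:
        d[w] = 1.0 / ((1.0 - higher) * (1.0 - higher - u[w]))
        higher += u[w]
    return d


def satisfies_conservation(u, d, tol=1e-9):
    """Check the equality for all workloads and the 2^n - 2 inequalities."""
    n = len(u)
    total = sum(u)
    weighted = sum(ui * di for ui, di in zip(u, d))
    if abs(weighted - total / (1.0 - total)) > tol:
        return False
    for size in range(1, n):
        for subset in combinations(range(n), size):
            ua = sum(u[i] for i in subset)
            lhs = sum(u[i] * d[i] for i in subset)
            if lhs < ua / (1.0 - ua) - tol:
                return False
    return True


u = [0.2, 0.3, 0.4]  # sample utilizations, U = 0.9
vertices = [vertex(u, order) for order in permutations(range(len(u)))]
center = [sum(v[i] for v in vertices) / len(vertices) for i in range(len(u))]
print(all(satisfies_conservation(u, v) for v in vertices))
print(satisfies_conservation(u, center))
```

The constraints are written in the equivalent unnormalized form Σ_{i∈A} ui di ≥ U(A)/(1-U(A)), which is the form used on the two workload permutahedron slides.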

  13. Two workload permutahedron [figure: the line u1 d1 + u2 d2 = U/(1-U) in the (d1, d2) plane]

  14. Two workload permutahedron [figure: same line, with the constraint d2 ≥ 1/(1-u2) and the point V(21)]

  15. Two workload permutahedron [figure: the achievable region is the part of the line u1 d1 + u2 d2 = U/(1-U) between V(12) and V(21), cut off by d1 ≥ 1/(1-u1) and d2 ≥ 1/(1-u2)]

  16. Three workload permutahedron [figure: axes d1, d2, d3; the permutahedron lies in the plane u1 d1 + u2 d2 + u3 d3 = U/(1-U); vertices V(123) and V(213) marked]

  17. Experimental evidence

  18. Four workload permutahedron • 4! = 24 vertices (ordered states) • 2^4 - 2 = 14 facets (proper subsets) (conservation constraints) • 74 faces (states) • Simplicial geometry and transportation polytopes, Trans. Amer. Math. Soc. 217 (1976) 138.

  19. Scheduling for performance • Administrator specifies performance goals • desired degradations (IBM OS/390) (not today) • CPU shares (UNIX offerings from HP, IBM, Sun) • Operating system dispatches jobs in an attempt to meet goals • Model predicts degradations by constructing map: workload performance goals → permutahedron

  20. Specifying CPU shares • Administrator specifies workload CPU shares • Share f (0 < f < 1) means workload guaranteed fraction f of CPU when at least one of its jobs is queued for service, can get more if some competition is absent • share ≠ utilization • share ≠ cap • share should be renamed guarantee

  21. Map shares to degradations - two workloads - • Suppose f1 and f2 > 0, f1 + f2 = 1 • Model: System operates in state • 12 with probability f1 • 21 with probability f2 (independent of who is on queue) • Average degradation vector: V = f1 V(12) + f2 V(21)
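A worked instance of the two workload map may help. The slides say V(12) and V(21) can be measured or computed from queueing theory; the sketch below assumes they are the endpoints of the two workload permutahedron, so that the high priority workload sees d = 1/(1-u) and the other is fixed by u1 d1 + u2 d2 = U/(1-U). The function name and the sample numbers are illustrative.

```python
def two_workload_degradations(u1, u2, f1):
    """Return (d1, d2) = f1 * V(12) + f2 * V(21) with f2 = 1 - f1.

    Assumption: V(12) and V(21) are the endpoints of the conservation
    segment, not measured values.
    """
    f2 = 1.0 - f1
    total = u1 + u2
    # Workload 1 on top: d1 = 1/(1-u1), d2 follows from conservation.
    v12 = (1.0 / (1.0 - u1), 1.0 / ((1.0 - u1) * (1.0 - total)))
    # Workload 2 on top: the symmetric case.
    v21 = (1.0 / ((1.0 - u2) * (1.0 - total)), 1.0 / (1.0 - u2))
    return tuple(f1 * a + f2 * b for a, b in zip(v12, v21))


u1, u2 = 0.2, 0.6
for f1 in (0.0, 0.3, 1.0):
    d1, d2 = two_workload_degradations(u1, u2, f1)
    print(f1, round(d1, 3), round(d2, 3), round(u1 * d1 + u2 * d2, 3))
```

Every share split gives the same weighted sum u1 d1 + u2 d2 = U/(1-U), so the map only moves the point along the conservation segment.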

  22. Model validation

  23. Model validation

  24. Map shares to degradations - three (n) workloads - • prob(123) = (f1 f2 f3) / ((f1 + f2 + f3)(f2 + f3)(f3)) • Theorem: These n! probabilities sum to 1 • interesting identity generalizing adding fractions • prove by induction, or by coupon collecting • V = Σ_{ordered states s} prob(s) V(s) • Θ(n!) work (n! terms), good enough for n ≤ 9 (12) • Searching for fast (approximate) algorithm ...
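The product formula and the "sum to 1" theorem on this slide can be checked exactly with rational arithmetic. This is a sketch only; `state_probability` and the sample shares are illustrative names and values.

```python
from fractions import Fraction
from itertools import permutations


def state_probability(shares, order):
    """prob(order) = product over positions k of f_k / (f_k + ... + f_n).

    shares maps workload -> share; order lists workloads, highest first.
    """
    remaining = sum(shares[w] for w in order)   # f1 + f2 + ... + fn
    prob = Fraction(1)
    for w in order:
        prob *= Fraction(shares[w]) / remaining
        remaining -= shares[w]                  # drop the workload just placed
    return prob


shares = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
probs = {order: state_probability(shares, order) for order in permutations(shares)}
for order, p in sorted(probs.items()):
    print(order, p)
print("total:", sum(probs.values()))
```

Each factor is the chance of drawing the next workload from those not yet placed, with weights equal to the shares, which is the coupon collecting view mentioned on the slide. The enumeration visits all n! orders, so it shares the factorial cost noted above.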

  25. Model validation

  26. Model validation

  27. Map shares to degradations (geometry) • Interpret shares as barycentric coordinates in the (n-1)-simplex • Study the geometry of the map from the simplex to the (n-1)-dimensional permutahedron • Easy when n=2: each is a line segment and map is linear

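  28. Mapping a triangle to a hexagon [figure: map M from the share triangle (labels f1 = 0, f1 = 1, f3 = 0, f3 = 1) to the hexagon with vertices 123, 132, 312, 321, 231, 213; annotations: wkl 1 high priority, wkl 1 low priority]

  29. Mapping a triangle to a hexagon [figure: labels f1 = 0, f1 = 1 and {23}]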

  30. Mapping a triangle to a hexagon

  31. Implementing fair share scheduling • Actual Sun Solaris implementation is subtle • HP and IBM are black boxes (for me) • Stochastic solution: randomly choose queued job to dispatch (implement the model rather than model an implementation) • May require prior computation of priodist(w, p) = prob(wkl w runs at prio p) • workload priority probabilities, not state probabilities
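One way to read the stochastic solution is to draw a priority order at random with the model's state probabilities and dispatch the highest ranked workload that has a queued job. The sketch below follows that reading; it is an assumption, not the Sun, HP or IBM implementation, and all names in it are hypothetical.

```python
import random


def draw_priority_order(shares, rng):
    """Draw an ordered priority state by successive share-weighted picks."""
    remaining = dict(shares)
    order = []
    while remaining:
        names = list(remaining)
        weights = [remaining[w] for w in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        order.append(pick)
        del remaining[pick]
    return tuple(order)


def dispatch(shares, queued, rng):
    """Return the workload to run next: the top queued one in a random order.

    The order is drawn independently of who is on the queue, as in the model.
    """
    order = draw_priority_order(shares, rng)
    return next(w for w in order if w in queued)


rng = random.Random(0)  # fixed seed so the run is repeatable
shares = {1: 0.5, 2: 1 / 3, 3: 1 / 6}
draws = 20000
hits = sum(draw_priority_order(shares, rng) == (1, 2, 3) for _ in range(draws))
print("empirical prob(123):", hits / draws)
print("next to run:", dispatch(shares, queued={2, 3}, rng=rng))
```

With these shares the product formula gives prob(123) = 1/3, and the empirical frequency should land close to that value.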

  32. Priority distributions • Given degradations, compute a priodist • A priodist is an n×n matrix with row sums 1 • {priodists} = cartesian product of n (n-1)-simplices • Map is surjective, not injective • Look for a well behaved inverse image • priodist space (dim n(n-1)) → permutahedron (dim n-1)

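  33. Three workload permutahedron [figure: hexagon in the (d1, d2) view with vertices 123, 132, 312, 321, 231, 213, edge states 1[23], [13]2, 3[12], [23]1, 2[13], [12]3, and center [123]; lines d1 = d2, d2 = d3, d1 = d3]

  34. … dissected into 3! quadrilaterals [figure: the quadrilateral with corners 123, 1[23], [123], [12]3, bounded by the lines d1 = d2 and d2 = d3]

  35. … each mapped to from a skew quadrilateral of priodists • Corner priodists (rows separated by semicolons): P123 = [1 0 0; 0 1 0; 0 0 1], P1[23] = [1 0 0; 0 .5 .5; 0 .5 .5], P[12]3 = [.5 .5 0; .5 .5 0; 0 0 1], P[123] = [.33 .33 .33; .33 .33 .33; .33 .33 .33] • (x,y) → xy P123 + x(1-y) P1[23] + (1-x)y P[12]3 + (1-x)(1-y) P[123] • [figure: the square of priodists with corners P123, P1[23], P[12]3, P[123] mapped to the quadrilateral with corners 123, 1[23], [12]3, [123]; annotation: degradation vector in this corner of permutahedron]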

  36. Skew quadrilaterals • Given 4 points P00, P01, P10, P11 ∈ R^m, map unit square: (x,y) → xy P00 + x(1-y) P01 + (1-x)y P10 + (1-x)(1-y) P11 • Easy to generalize to 2^k points • Analogous to convex hull, which maps barycentric coordinates on a simplex • Reference for this construction?
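The skew quadrilateral map is bilinear interpolation between four corner points. Here is a minimal sketch applied to 3×3 priodists; the corner matrices are my reading of the preceding slide, with 1/3 used in place of the rounded .33 so that rows sum to exactly 1, and the function name is illustrative.

```python
def skew_quadrilateral(p00, p01, p10, p11, x, y):
    """Map (x, y) in the unit square to
    x*y*P00 + x*(1-y)*P01 + (1-x)*y*P10 + (1-x)*(1-y)*P11, entry by entry."""
    weights = (x * y, x * (1 - y), (1 - x) * y, (1 - x) * (1 - y))
    corners = (p00, p01, p10, p11)
    rows, cols = len(p00), len(p00[0])
    return [
        [sum(w * c[i][j] for w, c in zip(weights, corners)) for j in range(cols)]
        for i in range(rows)
    ]


third = 1 / 3
P123 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]                 # strict order 123
P1_23 = [[1, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]]        # state 1[23]
P12_3 = [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 1]]        # state [12]3
P_123 = [[third] * 3 for _ in range(3)]                  # state [123]

mixed = skew_quadrilateral(P123, P1_23, P12_3, P_123, x=0.3, y=0.7)
for row in mixed:
    print([round(v, 4) for v in row], "row sum:", round(sum(row), 4))
```

The four weights are non-negative and sum to 1 on the unit square, so every image point is again a matrix with row sums 1, that is, a priodist.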

  37. Inversion • Try to locate * = (d1, d2) on coordinate grid [figure: target point in the (d1, d2) plane]

  38. Sequential bisection [figure: first step, (d1, d2) plane]

  39. Sequential bisection [figure: next step, (d1, d2) plane]

  40. Sequential bisection [figure: next step, (d1, d2) plane]

  41. Sequential bisection [figure: next step, (d1, d2) plane]

  42. Sequential bisection [figure: next step, (d1, d2) plane]

  43. … may fail to converge [figure: (d1, d2) plane]

  44. Tempered sequential bisection [figure: first step, (d1, d2) plane]

  45. Tempered sequential bisection [figure: next step, (d1, d2) plane]

  46. Tempered sequential bisection [figure: next step, (d1, d2) plane] • prove that this converges...
