Bandwidth-centric scheduling of independent tasks on a heterogeneous grid
Olivier Beaumont (ENS), Larry Carter (UCSD), Jeanne Ferrante (UCSD), Arnaud Legrand (ENS), Yves Robert (ENS)
PCL seminar, July 2001
Grid Computing
• Distributed heterogeneous computing
• Large number of independent tasks
• Data begins and ends at a specific site
• Examples: SETI@home, factoring numbers, animated films, drug screening
Computational grid (figure)
• Data starts at "My computer" and flows over the Internet through a gateway to a cluster host, a partner site, a supercomputer, and participating PCs and workstations
• Intermediate nodes can compute too
"Base Model" Node
• Processor takes w0 time to do one task
• Takes c-1 time to receive a task from its parent
• Takes ci time to send a task to its i-th child
• These three activities can be done concurrently, but only one "send" at a time
Example
• A is the root of the tree; all tasks start at A
• Node labels give the time to compute one task at that node; edge labels give the time to send one task from parent to child
• A computes a task in 3; sending a task from A to B takes 2 and from A to C takes 1; B computes a task in 2, C in 6; C has one child D (send time 1, compute time 2)
• Examples assume no communication of results back to the root
Example (Gantt chart, figure)
• The schedule is built up step by step: rows for A compute, A send, B receive, B compute, C receive, C compute, C send, D receive, D compute, filled in over successive time units
Steady-state
• The full schedule for the example tree (figure) has a startup phase, a repeated steady-state pattern, and a clean-up phase
• Steady-state: 7 tasks every 6 time units
• Total time: 16 tasks in 16 time units
Steady State Problem
• One-level "fork graph": a node and k leaf children; the node computes a task in w0, child i computes a task in wi and is reached over a link with send time ci
• Concurrent activities:
  • w0 time to execute a task
  • ci time to send to the i-th child (only one send at a time)
• Let Ri denote steady-state rates:
  • R0 = tasks/second executed by the node itself
  • Ri = tasks/second sent to and executed by child i
• Constraints:
  • ∑i=1..k Ri ci ≤ 1
  • Ri ≤ 1/wi for i = 0, ..., k
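Written out, the slide's constraints form a small linear program; a LaTeX restatement (the throughput-maximization objective is implicit on the slide rather than stated):

\[
\max\; R_0 + \sum_{i=1}^{k} R_i
\quad\text{s.t.}\quad
\sum_{i=1}^{k} R_i\, c_i \le 1,
\qquad
0 \le R_i \le \frac{1}{w_i}\quad (i = 0,\dots,k)
\]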
Solution
• Sort the children by communication time: c1 ≤ c2 ≤ ...
• Find the largest p such that ∑i=1..p ci/wi ≤ 1
• For i = 1, ..., p, set Ri = 1/wi
  • keep the first p children busy
  • note that ∑i=1..p Ri ci ≤ 1 so far
• Set Rp+1 = e/cp+1, where e = 1 - ∑i=1..p Ri ci
  • give the (p+1)-st child any leftover work
• Set R0 = 1/w0
  • keep the root's processor busy
(Constraints: Ri ≤ 1/wi for i = 0, ..., k, and ∑i=1..k Ri ci ≤ 1)
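A minimal sketch of this closed-form rule in Python; the function name fork_rates and the (send time, compute time) argument layout are illustrative, not from the slides:

```python
def fork_rates(w0, children):
    """Steady-state rates for a one-level fork graph.

    w0       : time for the node to execute one task
    children : list of (c_i, w_i) pairs -- send time and compute time of child i
    Returns (R0, rates), with rates ordered as the input children.
    """
    # Visit children in order of increasing communication time,
    # remembering their original positions.
    order = sorted(range(len(children)), key=lambda i: children[i][0])
    rates = [0.0] * len(children)
    used = 0.0                        # fraction of the send channel already committed
    for i in order:
        c, w = children[i]
        if used + c / w <= 1.0:       # this child can be kept fully busy
            rates[i] = 1.0 / w
            used += c / w
        else:                         # give it the leftover send capacity e = 1 - used ...
            rates[i] = (1.0 - used) / c
            break                     # ... and nothing to the remaining, slower-link children
    return 1.0 / w0, rates            # the node also keeps its own processor busy
```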
New law of efficient management
• Delegate work to whomever it takes you the least time to explain the problem to!
  • Provided the worker's desk isn't overloaded.
• It doesn't matter if that person is a slow worker.
  • Of course, slow workers will have full desktops more often.
With communication from above
• Three concurrent activities:
  • w0 time to execute a task
  • c-1 time to receive a task from the parent
  • ci time to send to the i-th child
• New constraints:
  • R-1 = ∑i=0..k Ri and R-1 c-1 ≤ 1, where R-1 is the receive rate
• Solution for the one-level tree:
  • R-1 = min( 1/c-1 , 1/w0 + ∑i=1..p 1/wi + e/cp+1 )
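In code, the parent link simply caps the total rate produced by the hypothetical fork_rates sketch above (again an illustrative helper, not the authors' code):

```python
def fork_rate_with_parent(c_parent, w0, children):
    """Total steady-state rate R_-1 of a fork whose tasks arrive over a link
    that needs c_parent time per task."""
    r0, rates = fork_rates(w0, children)
    return min(1.0 / c_parent, r0 + sum(rates))
```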
Steady-state for the whole tree (figure)
• Process the root last: repeatedly reduce each bottom-level fork graph to a single "summary" node with the equivalent work-time
• The figure shows the grid tree (my computer, Internet, gateway, cluster host, partner site, supercomputer) collapsing level by level until only the root and summary nodes remain
Example
• First find the equivalent work-time for the subtree rooted at C (C computes a task in 6; its child D has send time 1 and compute time 2)
• The subtree's rate is 1/6 + 1/2 = 2/3, equivalent to a node with w = 3/2
Example (continued)
• Replace the subtree with the equivalent node C' (w = 1.5); the root A (w = 3) now has two children, B (c = 2, w = 2) and C' (c = 1, w = 1.5)
Example (continued)
• Solve the root tree:
  • Keep C' (the cheapest link) busy: RC' = 1/1.5 = 2/3
  • Leftover send capacity: e = 1 - 1/1.5 = 1/3
  • RB = e/2 = 1/6
  • rate = 1/3 + 1/1.5 + 1/6 = 7/6
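Feeding the reduced tree to the hypothetical fork_rates sketch above reproduces this number:

```python
r0, (rb, rc) = fork_rates(3, [(2, 2), (1, 1.5)])  # A: w=3; B: c=2, w=2; C': c=1, w=1.5
print(r0 + rb + rc)                               # 1/3 + 1/6 + 2/3 = 7/6 ≈ 1.1667
```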
"Base Model" Notation
• Three concurrent activities:
  • w0 time to do one task
  • c-1 time to receive a task from the parent
  • ci time to send to the i-th child (only one send at a time)
• Figure: a node is drawn as a "receive" box (link c-1 to the parent), the node's processor (w0), and a "send" box (links c1, ..., ck to children w1, ..., wk); things stacked vertically can be done concurrently
Concurrent receive model
• Two concurrent activities:
  • receive a task from the parent
  • EITHER send to the i-th child OR execute one task
• Figure: as before, except the node's processor (w0) now shares the "send" box with c1, ..., ck
Reduce to previous model
• Replace the node's processor by a new child: the equivalent base-model node gets an infinitely slow processor (w = ∞), and a new child with send time w0 and compute time w0 stands in for the original processor
• Figure: the concurrent receive model and its equivalent base-model node side by side
Concurrent work model
• Two concurrent activities: execute, and one communication (a receive or a send, not both)
• Reduce to the base model (figure): the equivalent node again gets w = ∞; the original processor becomes a child with send time c-1 and compute time w0, and child i's send time becomes c-1 + ci, since each delegated task must first be received and then forwarded on the single communication resource
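A tiny sketch of these two reductions, reusing the hypothetical fork_rates sketch above (encoding each model as a node work-time plus (send time, compute time) pairs is my own convention, not notation from the slides):

```python
INF = float('inf')   # a node with w = infinity never computes locally

def concurrent_receive_as_base(w0, children):
    # The processor becomes one more child: feeding it a task costs w0 on the
    # send resource (it displaces a send) and w0 of compute.
    return INF, children + [(w0, w0)]

def concurrent_work_as_base(c_parent, w0, children):
    # Receives and sends share one resource, so each delegated task charges
    # c_parent + c_i to it; executing locally charges c_parent only.
    return INF, [(c_parent + c, w) for (c, w) in children] + [(c_parent, w0)]

# Either result can be passed straight to fork_rates:
#   w_eq, kids = concurrent_receive_as_base(3, [(2, 2), (1, 1.5)])
#   print(fork_rates(w_eq, kids))
```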
Other models have similar reductions
• Fully sequential model
• Fully parallel model
• Concurrent send model
(Figure: the receive / w0 / send box diagram for each model)
Revised law of efficient management (if you can't work and delegate at the same time)
• Delegate work to whomever it takes you the least time to explain the problem to!
  • Provided the worker's desk isn't overloaded.
• Do it yourself only if that's faster than explaining it to any available worker.
• And fire anyone who isn't working when you are!
  • Or reassign them to a manager who communicates with them faster, or who works more slowly
OGO model of communication
• Recall the LogP model: Latency, Overhead, Gap
  • latency is irrelevant to steady-state
  • we allow different overheads at each end
• OGO model:
  • the channel can send one task every G time units
  • the node's processor is interrupted for time O per task
• Easy to extend our methods:
  • use the no-concurrency model with the O's
  • take the min with 1/G
Flaky example
• Communicate with Alice via fax
  • takes me 5 minutes to send a message
  • ties up the fax machine for 2 minutes
• Communicate with Bob via US mail
  • takes me 1 minute to address a letter
  • doesn't reach him for 3 days
• Communicate with Carol via courier
  • takes me 30 seconds to summon the courier
  • the courier can only deliver two tasks per day
Bandwidth-centric scheduling
• Prioritize children by communication times
• Request tasks from the parent:
  • initially, request enough to fill your buffers
  • make more requests as you delegate and execute
• On receiving a task:
  • execute it locally if idle
  • otherwise satisfy children's requests according to their priorities
• Occasionally re-adjust priorities
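A schematic sketch of the per-task decision each node makes under this protocol; the function, its argument names, and the return convention are illustrative, not the authors' implementation:

```python
def dispatch(task_buffered, node_idle, waiting_children, priority):
    """Decide what to do with the next buffered task, bandwidth-centric style.

    task_buffered    : True if at least one task sits in the local buffer
    node_idle        : True if the local processor is free
    waiting_children : children with an outstanding request for a task
    priority         : all children sorted by communication time (fastest link first)
    Returns ('execute', None), ('send', child), or ('wait', None).
    """
    if not task_buffered:
        return ('wait', None)
    if node_idle:                                   # execute locally if idle
        return ('execute', None)
    if waiting_children:                            # otherwise serve the fastest-link requester
        best = min(waiting_children, key=priority.index)
        return ('send', best)
    return ('wait', None)                           # keep the task buffered for later
```

Every decision that consumes a buffered task would also trigger a fresh request to the parent, per the "make more requests as you delegate and execute" rule.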
Simulations
• 100 randomly generated trees
  • up to 5 children per node
  • up to 10 non-leaf nodes
• Three strategies:
  • all children, equal priority (first-come, first-served)
  • theorem's subset, equal priority
  • theorem's subset, low priority to the last child
• Two buffer settings: one task per node (buffer starts full), and four tasks per node
Start-up issues
• Bandwidth-centric scheduling is optimal to within an additive constant
  • can execute N tasks in N/w-1 + k time
• Bound on start-up and clean-up time: tree depth × the longest steady-state period over all nodes
• Bound on a node's period: lcm(w0, w1, ..., wp, c-1, cp+1)
Start-up (figure)
• A second example tree and its Gantt chart (rows for A, B, C, D), showing the start-up phase before the schedule settles into the repeated steady-state pattern
• Steady-state: 8 tasks every 8 time units
Modeling networks as trees
• Leiserson: fat trees
  • recursively partition the graph into two parts
• Alpern, Carter, Ferrante: PMH model
  • group nodes connected by fast links into a new node, with the original nodes as its children
  • successively relax the definition of "fast"
• Shao: ENV model
  • measure the bandwidth from each node to you; group nodes with similar speeds
Summary
• Explicit solution for steady-state throughput
  • heterogeneous speeds
  • heterogeneous models
• Simple computation
  • uniform tasks
  • no dependences
• Suggests simple dynamic scheduling
Current work
• Application model: bag of tasks
• Performance model: application's workflow
• Type of schedule
• Application
• Grid model: very heterogeneous tree
Future work
• Application model: bag of tasks → parallel & pipelined tasks; unequal sizes
• Performance model: application's workflow → total execution time
• Type of schedule
• Application
• Grid model: very heterogeneous tree → memory & latency