Bandwidth-centric scheduling of independent tasks on a heterogeneous grid
Olivier Beaumont (ENS), Larry Carter (UCSD), Jeanne Ferrante (UCSD), Arnaud Legrand (ENS), Yves Robert (ENS)
PCL seminar, July 2001
Grid Computing
• Distributed heterogeneous computing
• Large number of independent tasks
• Data begins and ends at a specific site
• Examples: SETI@home, factoring numbers, animated films, drug screening
Computational grid (figure)
• Data starts at "My computer" and flows over the Internet through a gateway to a cluster host, a partner site, a supercomputer, and participating PCs and workstations
• Intermediate nodes can compute too
"Base Model" Node
• Processor takes w0 time to do one task
• Takes c-1 time to receive a task from its parent
• Takes ci time to send a task to its i-th child
• These three activities can be done concurrently, but only one "send" at a time
Example
• A is the root of the tree; all tasks start at A
• Node labels give the time to compute one task at that node; edge labels give the time to send one task from parent to child
• A computes a task in 3; sending a task from A to B takes 2 and from A to C takes 1; B computes a task in 2, C in 6; C has one child D (send time 1, compute time 2)
• Examples assume no communication of results back to the root
Example (Gantt chart, figure)
• The schedule is built up step by step: rows for A compute, A send, B receive, B compute, C receive, C compute, C send, D receive, D compute, filled in over successive time units
Steady-state
• The full schedule for the example tree (figure) has a startup phase, a repeated steady-state pattern, and a clean-up phase
• Steady-state: 7 tasks every 6 time units
• Total time: 16 tasks in 16 time units
Steady State Problem
• One-level "fork graph": a node and k leaf children; the node computes a task in w0, child i computes a task in wi and is reached over a link with send time ci
• Concurrent activities:
  • w0 time to execute a task
  • ci time to send to the i-th child (only one send at a time)
• Let Ri denote steady-state rates:
  • R0 = tasks/second executed by the node itself
  • Ri = tasks/second sent to and executed by child i
• Constraints:
  • ∑i=1..k Ri ci ≤ 1
  • Ri ≤ 1/wi for i = 0, ..., k
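Written out, the slide's constraints form a small linear program; a LaTeX restatement (the throughput-maximization objective is implicit on the slide rather than stated):

\[
\max\; R_0 + \sum_{i=1}^{k} R_i
\quad\text{s.t.}\quad
\sum_{i=1}^{k} R_i\, c_i \le 1,
\qquad
0 \le R_i \le \frac{1}{w_i}\quad (i = 0,\dots,k)
\]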
Solution
• Sort the children by communication time: c1 ≤ c2 ≤ ...
• Find the largest p such that ∑i=1..p ci/wi ≤ 1
• For i = 1, ..., p, set Ri = 1/wi
  • keep the first p children busy
  • note that ∑i=1..p Ri ci ≤ 1 so far
• Set Rp+1 = e/cp+1, where e = 1 - ∑i=1..p Ri ci
  • give the (p+1)-st child any leftover work
• Set R0 = 1/w0
  • keep the root's processor busy
(Constraints: Ri ≤ 1/wi for i = 0, ..., k, and ∑i=1..k Ri ci ≤ 1)
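A minimal sketch of this closed-form rule in Python; the function name fork_rates and the (send time, compute time) argument layout are illustrative, not from the slides:

```python
def fork_rates(w0, children):
    """Steady-state rates for a one-level fork graph.

    w0       : time for the node to execute one task
    children : list of (c_i, w_i) pairs -- send time and compute time of child i
    Returns (R0, rates), with rates ordered as the input children.
    """
    # Visit children in order of increasing communication time,
    # remembering their original positions.
    order = sorted(range(len(children)), key=lambda i: children[i][0])
    rates = [0.0] * len(children)
    used = 0.0                        # fraction of the send channel already committed
    for i in order:
        c, w = children[i]
        if used + c / w <= 1.0:       # this child can be kept fully busy
            rates[i] = 1.0 / w
            used += c / w
        else:                         # give it the leftover send capacity e = 1 - used ...
            rates[i] = (1.0 - used) / c
            break                     # ... and nothing to the remaining, slower-link children
    return 1.0 / w0, rates            # the node also keeps its own processor busy
```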
New law of efficient management
• Delegate work to whomever it takes you the least time to explain the problem to!
  • Provided the worker's desk isn't overloaded.
• It doesn't matter if that person is a slow worker.
  • Of course, slow workers will have full desktops more often.
With communication from above
• Three concurrent activities:
  • w0 time to execute a task
  • c-1 time to receive a task from the parent
  • ci time to send to the i-th child
• New constraints:
  • R-1 = ∑i=0..k Ri and R-1 c-1 ≤ 1, where R-1 is the receive rate
• Solution for the one-level tree:
  • R-1 = min( 1/c-1 , 1/w0 + ∑i=1..p 1/wi + e/cp+1 )
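In code, the parent link simply caps the total rate produced by the hypothetical fork_rates sketch above (again an illustrative helper, not the authors' code):

```python
def fork_rate_with_parent(c_parent, w0, children):
    """Total steady-state rate R_-1 of a fork whose tasks arrive over a link
    that needs c_parent time per task."""
    r0, rates = fork_rates(w0, children)
    return min(1.0 / c_parent, r0 + sum(rates))
```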
Steady-state for the whole tree (figure)
• Process the root last: repeatedly reduce each bottom-level fork graph to a single "summary" node with the equivalent work-time
• The figure shows the grid tree (my computer, Internet, gateway, cluster host, partner site, supercomputer) collapsing level by level until only the root and summary nodes remain
Example
• First find the equivalent work-time for the subtree rooted at C (C computes a task in 6; its child D has send time 1 and compute time 2)
• The subtree's rate is 1/6 + 1/2 = 2/3, equivalent to a node with w = 3/2
Example (continued)
• Replace the subtree with the equivalent node C' (w = 1.5); the root A (w = 3) now has two children, B (c = 2, w = 2) and C' (c = 1, w = 1.5)
Example (continued)
• Solve the root tree:
  • Keep C' (the cheapest link) busy: RC' = 1/1.5 = 2/3
  • Leftover send capacity: e = 1 - 1/1.5 = 1/3
  • RB = e/2 = 1/6
  • rate = 1/3 + 1/1.5 + 1/6 = 7/6
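Feeding the reduced tree to the hypothetical fork_rates sketch above reproduces this number:

```python
r0, (rb, rc) = fork_rates(3, [(2, 2), (1, 1.5)])  # A: w=3; B: c=2, w=2; C': c=1, w=1.5
print(r0 + rb + rc)                               # 1/3 + 1/6 + 2/3 = 7/6 ≈ 1.1667
```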
"Base Model" Notation
• Three concurrent activities:
  • w0 time to do one task
  • c-1 time to receive a task from the parent
  • ci time to send to the i-th child (only one send at a time)
• Figure: a node is drawn as a "receive" box (link c-1 to the parent), the node's processor (w0), and a "send" box (links c1, ..., ck to children w1, ..., wk); things stacked vertically can be done concurrently
Concurrent receive model
• Two concurrent activities:
  • receive a task from the parent
  • EITHER send to the i-th child OR execute one task
• Figure: as before, except the node's processor (w0) now shares the "send" box with c1, ..., ck
Reduce to previous model
• Replace the node's processor by a new child: the equivalent base-model node gets an infinitely slow processor (w = ∞), and a new child with send time w0 and compute time w0 stands in for the original processor
• Figure: the concurrent receive model and its equivalent base-model node side by side
Concurrent work model
• Two concurrent activities: execute, and one communication (a receive or a send, not both)
• Reduce to the base model (figure): the equivalent node again gets w = ∞; the original processor becomes a child with send time c-1 and compute time w0, and child i's send time becomes c-1 + ci, since each delegated task must first be received and then forwarded on the single communication resource
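A tiny sketch of these two reductions, reusing the hypothetical fork_rates sketch above (encoding each model as a node work-time plus (send time, compute time) pairs is my own convention, not notation from the slides):

```python
INF = float('inf')   # a node with w = infinity never computes locally

def concurrent_receive_as_base(w0, children):
    # The processor becomes one more child: feeding it a task costs w0 on the
    # send resource (it displaces a send) and w0 of compute.
    return INF, children + [(w0, w0)]

def concurrent_work_as_base(c_parent, w0, children):
    # Receives and sends share one resource, so each delegated task charges
    # c_parent + c_i to it; executing locally charges c_parent only.
    return INF, [(c_parent + c, w) for (c, w) in children] + [(c_parent, w0)]

# Either result can be passed straight to fork_rates:
#   w_eq, kids = concurrent_receive_as_base(3, [(2, 2), (1, 1.5)])
#   print(fork_rates(w_eq, kids))
```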
Other models have similar reductions
• Fully sequential model
• Fully parallel model
• Concurrent send model
(Figure: the receive / w0 / send box diagram for each model)
Revised law of efficient management (if you can't work and delegate at the same time)
• Delegate work to whomever it takes you the least time to explain the problem to!
  • Provided the worker's desk isn't overloaded.
• Do it yourself only if that's faster than explaining it to any available worker.
• And fire anyone who isn't working when you are!
  • Or reassign them to a manager who communicates with them faster, or who works more slowly
OGO model of communication
• Recall the LogP model: Latency, Overhead, Gap
  • latency is irrelevant to steady-state
  • we allow different overheads at each end
• OGO model:
  • the channel can send one task every G time units
  • the node's processor is interrupted for time O per task
• Easy to extend our methods:
  • use the no-concurrency model with the O's
  • take the min with 1/G
Flaky example
• Communicate with Alice via fax
  • takes me 5 minutes to send a message
  • ties up the fax machine for 2 minutes
• Communicate with Bob via US mail
  • takes me 1 minute to address a letter
  • doesn't reach him for 3 days
• Communicate with Carol via courier
  • takes me 30 seconds to summon the courier
  • the courier can only deliver two tasks per day
Bandwidth-centric scheduling
• Prioritize children by communication times
• Request tasks from the parent:
  • initially, request enough to fill your buffers
  • make more requests as you delegate and execute
• On receiving a task:
  • execute it locally if idle
  • otherwise satisfy children's requests according to their priorities
• Occasionally re-adjust priorities
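A schematic sketch of the per-task decision each node makes under this protocol; the function, its argument names, and the return convention are illustrative, not the authors' implementation:

```python
def dispatch(task_buffered, node_idle, waiting_children, priority):
    """Decide what to do with the next buffered task, bandwidth-centric style.

    task_buffered    : True if at least one task sits in the local buffer
    node_idle        : True if the local processor is free
    waiting_children : children with an outstanding request for a task
    priority         : all children sorted by communication time (fastest link first)
    Returns ('execute', None), ('send', child), or ('wait', None).
    """
    if not task_buffered:
        return ('wait', None)
    if node_idle:                                   # execute locally if idle
        return ('execute', None)
    if waiting_children:                            # otherwise serve the fastest-link requester
        best = min(waiting_children, key=priority.index)
        return ('send', best)
    return ('wait', None)                           # keep the task buffered for later
```

Every decision that consumes a buffered task would also trigger a fresh request to the parent, per the "make more requests as you delegate and execute" rule.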
Simulations
• 100 randomly generated trees
  • up to 5 children per node
  • up to 10 non-leaf nodes
• Three strategies:
  • all children, equal priority (first-come, first-served)
  • theorem's subset, equal priority
  • theorem's subset, low priority to the last child
• Two buffer settings: one task per node (buffer starts full), and four tasks per node
Start-up issues
• Bandwidth-centric scheduling is optimal to within an additive constant
  • can execute N tasks in N/w-1 + k time
• Bound on start-up and clean-up time: tree depth × the longest steady-state period over all nodes
• Bound on a node's period: lcm(w0, w1, ..., wp, c-1, cp+1)
Start-up (figure)
• A second example tree and its Gantt chart (rows for A, B, C, D), showing the start-up phase before the schedule settles into the repeated steady-state pattern
• Steady-state: 8 tasks every 8 time units
Modeling networks as trees
• Leiserson: fat trees
  • recursively partition the graph into two parts
• Alpern, Carter, Ferrante: PMH model
  • group nodes connected by fast links into a new node, with the original nodes as its children
  • successively relax the definition of "fast"
• Shao: ENV model
  • measure the bandwidth from each node to you; group nodes with similar speeds
Summary
• Explicit solution for steady-state throughput
  • heterogeneous speeds
  • heterogeneous models
• Simple computation
  • uniform tasks
  • no dependences
• Suggests simple dynamic scheduling
Current work
• Application model: bag of tasks
• Performance model: application's workflow
• Type of schedule
• Application
• Grid model: very heterogeneous tree
Future work
• Application model: bag of tasks → parallel & pipelined tasks; unequal sizes
• Performance model: application's workflow → total execution time
• Type of schedule
• Application
• Grid model: very heterogeneous tree → memory & latency