
Computer Science 320

Computer Science 320. Load Balancing with Clusters. Mandelbrot on a Cluster. Each node has a portion of the pixel data matrix. After computation, each portion of the matrix is sent to one process (process 0), which then writes the data to the image file.

nona

Presentation Transcript


  1. Computer Science 320: Load Balancing with Clusters

  2. Mandelbrot on a Cluster
     • Each node has a portion of the pixel data matrix
     • After computation, each portion of the matrix is sent to one process (process 0), which then writes the data to the image file
     • The pixel data matrix is partitioned into K row slices, which are gathered after computation finishes
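The row-slice partitioning described on this slide can be sketched in plain Java. This is a minimal illustration, not the library's implementation: `RowSlices` and `split` are hypothetical names, and the choice to give the leftover rows to the earliest slices is an assumption that may differ from how the course's `Range.subranges` distributes them.

```java
// Hypothetical helper (not part of the course library): splits rows
// 0..height-1 into k contiguous slices of nearly equal length.
final class RowSlices {
    /** Returns k pairs {lb, ub} with inclusive bounds. */
    static int[][] split(int height, int k) {
        int[][] slices = new int[k][2];
        int base = height / k;   // minimum rows per slice
        int extra = height % k;  // the first `extra` slices get one more row (assumption)
        int lb = 0;
        for (int i = 0; i < k; i++) {
            int len = base + (i < extra ? 1 : 0);
            slices[i][0] = lb;
            slices[i][1] = lb + len - 1;
            lb += len;
        }
        return slices;
    }
}
```

For example, `RowSlices.split(10, 4)` yields the slices 0..2, 3..5, 6..7, and 8..9.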

  3. Row slicing among K = 4 processes

  4. Process 0 Matrix After Gather

  5. Execution Timeline

  6. Data for Cluster Program
         // Start timing here
         // Initialize world and get world, size, and rank
         // Parse command line data here
         static int xoffset;
         static int yoffset;
         static int[][] matrix;
         static PJColorImage image;
         static Range[] ranges;
         static Range myRange;
         static int mylb;
         static int myub;
         static IntegerBuf[] slices;
         static IntegerBuf mySlice;
         static int[] hueTable;

  7. Initializing the Data
         // Same command line data as in SMP program go here
         matrix = new int[height][];
         ranges = new Range(0, height - 1).subranges(size);
         myRange = ranges[rank];
         mylb = myRange.lb();
         myub = myRange.ub();
         if (rank == 0)
             Arrays.allocate(matrix, width);
         else
             Arrays.allocate(matrix, myRange, width);
     Process 0 gets the whole matrix; the others get only their own range

  8. Set Up the Buffers and Compute!
         slices = IntegerBuf.rowSliceBuffers(matrix, ranges);
         mySlice = slices[rank];
         // Set up the hue table as in SMP program
         for (int r = mylb; r <= myub; ++r)
             // Test for membership in Mandelbrot as before and
             // store the iteration count in the matrix
             // at position (r, c)
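The membership test that the loop body refers to is not shown in these slides. A minimal sketch of the usual escape-time iteration follows, assuming the standard recurrence z = z² + c with an escape radius of 2; the class and method names are hypothetical, and the mapping from pixel (r, c) to the point (x, y) is left out because it depends on the command line data.

```java
// Hypothetical sketch of the per-pixel computation; not taken from the slides.
final class MandelbrotPoint {
    /**
     * Iterates z = z^2 + c from z = 0 for c = x + yi and returns how many
     * iterations ran before |z| exceeded 2, capped at maxiter.
     */
    static int iterations(double x, double y, int maxiter) {
        double a = 0.0, b = 0.0;  // real and imaginary parts of z
        int i = 0;
        while (i < maxiter && a * a + b * b <= 4.0) {
            double t = a * a - b * b + x;
            b = 2.0 * a * b + y;
            a = t;
            i++;
        }
        return i;
    }
}
```

A point that reaches `maxiter` is treated as a member of the set. Rows that cross the set therefore cost far more than rows that miss it, which is what makes equal-sized row slices unbalanced.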

  9. Gather the Data and Write It Out
         world.gather(0, mySlice, slices);
         if (rank == 0)
             // Same code as in SMP program for output to image file

  10. Performance

  11. Load Balancing
     • For the SMP situation, use a parallel for loop with a dynamic or guided schedule
     • For a cluster, use a master-worker pattern
     • The master node schedules agenda tasks for the other nodes
     • A separate thread on the master runs a task as well
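To see why handing out rows on demand can help, the following sketch compares the finish time of fixed equal slices with that of a simple on-demand schedule for a given array of per-row costs. It is an idealized model under stated assumptions: communication time is ignored, the on-demand schedule hands out one row at a time, and all names are hypothetical.

```java
// Hypothetical cost model; ignores message-passing overhead entirely.
final class ScheduleCompare {
    /** Finish time with k contiguous equal slices (assumes k divides the row count). */
    static int staticMakespan(int[] cost, int k) {
        int per = cost.length / k;
        int worst = 0;
        for (int w = 0; w < k; w++) {
            int sum = 0;
            for (int r = w * per; r < (w + 1) * per; r++) sum += cost[r];
            worst = Math.max(worst, sum);
        }
        return worst;
    }

    /** Finish time when each row, in order, goes to the worker that becomes free first. */
    static int dynamicMakespan(int[] cost, int k) {
        int[] busyUntil = new int[k];
        for (int c : cost) {
            int w = 0;  // index of the earliest-free worker; ties go to the lowest index
            for (int i = 1; i < k; i++) if (busyUntil[i] < busyUntil[w]) w = i;
            busyUntil[w] += c;
        }
        int worst = 0;
        for (int t : busyUntil) worst = Math.max(worst, t);
        return worst;
    }
}
```

With row costs {1, 1, 1, 1, 9, 9, 9, 9} and two workers, the fixed slices finish at time 36 because one worker gets all the expensive rows, while the on-demand schedule finishes at time 20.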

  12. Master-Worker Pattern

  13. One Master and K Workers

  14. The Communication Sequence
     • Master thread sends the first K row slices to the worker threads; master thread waits for results
     • Master uses a wildcard receive for results
     • A range object arrives first, then the data
     • Master sends the range of the next rows to this worker thread
     • Master sends a null message when the last row is received
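The sequence above can be imitated on a single machine with threads and queues standing in for cluster processes and messages. This is a sketch under several assumptions: a shared result queue plays the role of the wildcard receive, the range and its data travel in one message instead of two, an empty array stands in for the null message, and the per-row "work" is a placeholder (r * r). None of these names come from the course library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical single-machine imitation of the master-worker protocol.
final class MasterWorkerSketch {
    // Plays the role of the null range that tells a worker to stop.
    static final int[] STOP = new int[0];

    // Next chunk of at most `chunk` rows starting at `next`, or STOP when rows run out.
    static int[] nextRange(int next, int height, int chunk) {
        if (next >= height) return STOP;
        return new int[] { next, Math.min(next + chunk - 1, height - 1) };
    }

    // Worker: take a range, compute its rows, reply with {id, lb, ub, values...}.
    static void worker(int id, BlockingQueue<int[]> in, BlockingQueue<int[]> out) {
        try {
            for (;;) {
                int[] range = in.take();
                if (range == STOP) break;
                int lb = range[0], ub = range[1];
                int[] msg = new int[3 + (ub - lb + 1)];
                msg[0] = id; msg[1] = lb; msg[2] = ub;
                for (int r = lb; r <= ub; r++) msg[3 + r - lb] = r * r; // placeholder work
                out.put(msg);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Master: hands out chunks on demand and returns one computed value per row.
    static int[] run(int height, int k, int chunk) {
        int[] matrix = new int[height];
        List<BlockingQueue<int[]>> inbox = new ArrayList<>();
        BlockingQueue<int[]> results = new LinkedBlockingQueue<>();
        List<Thread> threads = new ArrayList<>();
        try {
            for (int w = 0; w < k; w++) {
                BlockingQueue<int[]> q = new LinkedBlockingQueue<>();
                inbox.add(q);
                final int id = w;
                Thread t = new Thread(() -> worker(id, q, results));
                threads.add(t);
                t.start();
            }
            int next = 0;    // first row not yet handed out
            int active = k;  // workers that have not been told to stop
            for (int w = 0; w < k; w++) {  // initial round: one chunk per worker
                int[] range = nextRange(next, height, chunk);
                if (range == STOP) active--; else next = range[1] + 1;
                inbox.get(w).put(range);
            }
            while (active > 0) {
                int[] msg = results.take();  // accepts a result from any worker
                System.arraycopy(msg, 3, matrix, msg[1], msg[2] - msg[1] + 1);
                int[] range = nextRange(next, height, chunk);
                if (range == STOP) active--; else next = range[1] + 1;
                inbox.get(msg[0]).put(range);  // reply to the worker that just reported
            }
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return matrix;
    }
}
```

Because a worker is only told to stop in reply to its own result (or in the initial round), every result has been stored by the time the count of active workers reaches zero.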

  15. Master-worker execution timeline

  16. Types of Messages and Tags
     • Three types of messages: master sends a range, master receives a range, master receives pixel data
     • Distinguish these by tagging them with message tags
     • A receive with a tag will only match a send with the same tag
         world.send(toRank, tag, buffer);
         ...
         world.receive(fromRank, tag, buffer);
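The matching rule can be pictured as one queue per tag. The sketch below is a toy model with hypothetical names, not how the message-passing layer is implemented: in particular, its `receive` returns null when nothing is pending, whereas a real receive would block until a matching message arrives.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical toy mailbox: messages are only ever matched within their own tag.
final class TaggedMailbox {
    private final Map<Integer, Queue<Object>> byTag = new HashMap<>();

    /** Queues a non-null message under the given tag. */
    void send(int tag, Object message) {
        byTag.computeIfAbsent(tag, t -> new ArrayDeque<>()).add(message);
    }

    /** Returns the oldest pending message with this tag, or null if none is pending. */
    Object receive(int tag) {
        Queue<Object> q = byTag.get(tag);
        return (q == null) ? null : q.poll();
    }
}
```

If pixel data is sent with tag 2 and then a range with tag 1, a receive on tag 1 still returns the range, because the earlier tag 2 message is never considered a match.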

  17. New Data for MandelbrotSetClu2
         // Message tags
         static final int WORKER_MSG = 0;
         static final int MASTER_MSG = 1;
         static final int PIXEL_DATA_MSG = 2;
         // Number of chunks the worker completed
         static int chunkCount;

  18. Master and Worker Sections
         if (rank == 0)
             // Rank 0 runs the master and a worker side by side in two threads
             new ParallelTeam(2).execute(new ParallelRegion(){
                 public void run() throws Exception {
                     execute(
                         new ParallelSection(){
                             public void run() throws Exception { masterSection(); }
                         },
                         new ParallelSection(){
                             public void run() throws Exception { workerSection(); }
                         });
                 }
             });
         else
             workerSection();

  19. The Master Section: Sending the Data
         private static void masterSection() throws Exception{
             int worker;
             Range range;
             matrix = new int[height][width];
             IntegerSchedule schedule = IntegerSchedule.runtime();
             schedule.start(size, new Range(0, height - 1));
             int activeWorkers = size;
             // Send each worker its first range; a null range means no work is left
             for (worker = 0; worker < size; ++worker){
                 range = schedule.next(worker);
                 world.send(worker, WORKER_MSG, ObjectBuf.buffer(range));
                 if (range == null) --activeWorkers;
             ...

  20. The Master Section: Receiving Results
         private static void masterSection() throws Exception{
             ...
             while (activeWorkers > 0){
                 // Wildcard receive: accept a finished range from any worker
                 ObjectItemBuf<Range> rangeBuf = ObjectBuf.buffer();
                 CommStatus status = world.receive(null, MASTER_MSG, rangeBuf);
                 worker = status.fromRank;
                 range = rangeBuf.item;
                 // Receive that worker's pixel data directly into the matching rows
                 world.receive(worker, PIXEL_DATA_MSG,
                     IntegerBuf.rowSliceBuffer(matrix, range));
                 // Hand the same worker its next range (null when none remain)
                 range = schedule.next(worker);
                 world.send(worker, WORKER_MSG, ObjectBuf.buffer(range));
                 if (range == null) --activeWorkers;
             }
         }

  21. The Worker Section: Doing the Rows
         private static void workerSection() throws Exception{
             int[][] slice = null;
             for (;;){
                 ObjectItemBuf<Range> rangeBuf = ObjectBuf.buffer();
                 world.receive(0, WORKER_MSG, rangeBuf);
                 Range range = rangeBuf.item;
                 if (range == null) break;   // null range: no more work
                 int lb = range.lb();
                 int ub = range.ub();
                 int len = range.length();
                 ++chunkCount;
                 // Reuse the slice storage unless this chunk needs more rows
                 if (slice == null || slice.length < len)
                     slice = new int[len][width];
                 for (int r = lb; r <= ub; ++r){
                     int[] slice_r = slice[r - lb];
                     // yahdah, yahdah, yahdah
                 ...

  22. The Worker Section: Send Results
         private static void workerSection() throws Exception{
             int[][] slice = null;
             for (;;){
                 ...
                 for (int r = lb; r <= ub; ++r){
                     int[] slice_r = slice[r - lb];
                     // yahdah, yahdah, yahdah
                 }
                 // Report the finished range first, then the pixel data for it
                 world.send(0, MASTER_MSG, rangeBuf);
                 world.send(0, PIXEL_DATA_MSG,
                     IntegerBuf.rowSliceBuffer(slice, new Range(0, len - 1)));
             }
         }

  23. Running the Program
         $ java -Dpj.schedule="dynamic(10)" . . .

  24. Running the Program
         $ java -Dpj.schedule="dynamic(10)" . . .
