Adding numbers

Adding numbers: n data items, p processors. t_s = O(n). t_p = O(n/p) if the data is already on each processor => S = t_s/t_p = O(p). t_p = O(n + n/p) if the data needs broadcasting => S = t_s/t_p = O(1). Sequential recursion. Parallel recursion: t_comm = O(n/2 + n/4 + ... + n/p) = O(n), so S = O(1).

rlindner

Presentation Transcript


  1. Adding numbers: n data items, p processors. t_s = O(n). t_p = O(n/p) if the data is already on each proc. => S = t_s/t_p = O(p). t_p = O(n + n/p) if the data needs broadcasting => S = t_s/t_p = O(1).

  2. Sequential Recursion

  3. Parallel Recursion: t_comm = O(n/2 + n/4 + ... + n/p) = O(n); t_comp = O(n/2 + n/4 + ... + n/p) = O(n); S = O(1)

  4. t_comm = O(1 + 1 + ... + 1) = O(log p); t_comp = O(1 + 1 + ... + 1) = O(log p); S = O(n / log p)
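The O(log p) round count in slide 4 can be illustrated with a small sequential simulation of a pairwise (tree) reduction. This is only a sketch: the function name `tree_sum` and the convention of one simulated processor per list element are assumptions made for illustration, not part of the slides.

```python
def tree_sum(values):
    """Sum `values` by pairwise combination, one simulated processor per value.

    Returns (total, rounds), where rounds counts the parallel steps needed.
    """
    partial = list(values)
    rounds = 0
    stride = 1
    # In each round, processor i (a multiple of 2 * stride) receives the
    # partial sum held by processor i + stride and adds it to its own.
    while stride < len(partial):
        for i in range(0, len(partial) - stride, 2 * stride):
            partial[i] += partial[i + stride]
        stride *= 2
        rounds += 1
    return (partial[0] if partial else 0), rounds
```

With 8 values the loop runs 3 rounds, each costing one message and one addition per active processor, which matches the 1 + 1 + ... + 1 terms on the slide.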

  5. Parallel Bucket Sort

  6. Sequential: m buckets, n numbers. t_s = O(n + m((n/m) log(n/m))) = O(n log(n/m))

  7. m buckets, n numbers, p = m processors. t_p = O(n + (n/p) log(n/p))

  8. t_p = O(n/p + (n/p) log(n/p)) = O((n/p) log(n/p)) => S = O(p)
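A minimal sequential sketch of the bucket sort analysed in slides 6 to 8 follows. It assumes the inputs are floats in [0, 1) so that equal-width buckets can be indexed directly; that assumption, and the name `bucket_sort`, are illustrative and not taken from the slides.

```python
def bucket_sort(numbers, m):
    """Sort numbers drawn from [0, 1) using m equal-width buckets."""
    buckets = [[] for _ in range(m)]
    for x in numbers:
        # O(n) distribution step; min() guards against rounding to index m.
        buckets[min(int(x * m), m - 1)].append(x)
    result = []
    for bucket in buckets:
        # Each bucket holds about n/m items for uniformly distributed input,
        # so sorting it costs roughly O((n/m) log(n/m)).
        result.extend(sorted(bucket))
    return result
```

In the parallel version with p = m, each iteration of the second loop would run on its own processor, which is where the (n/p) log(n/p) term comes from.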

  9. Det. Sample Sort

  10. Det. Sample Sort • sort locally and create p-sample

  11. Det. Sample Sort • send all p-samples to processor 1

  12. Det. Sample Sort • proc.1: sort all received samples and compute global p-sample

  13. Det. Sample Sort • broadcast global p-sample • bucket locally according to global p-sample • send bucket i to proc. i • re-sort locally

  14. Det. Sample Sort Lemma: Each proc. receives at most 2n/p data items. [Figure: local samples spaced n/p^2 apart, shown against the global sample]

  15. Det. Sample Sort Post-Processing: “Array Balancing” [Figure: processors 1 to 8, each ending with n/p items] 2 Rounds: • Each proc. sends its received data size to all other procs. • Move data to the right location via one h-relation

  16. Det. Sample Sort • 5 MPI_Alltoallv for n/p > p^2 • O((n/p) log n) local comp. • Goodrich (FOCS'98): O(1) rounds for n/p > p^ε
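The steps in slides 10 to 13 can be sketched as a single-process simulation, with a list of lists standing in for the processors. The sketch assumes equal-sized blocks with at least p items each and distinct keys; the sampling positions chosen here are one plausible reading of "p-sample", and the name `det_sample_sort` is hypothetical.

```python
import bisect

def det_sample_sort(blocks):
    """Simulate deterministic sample sort; blocks[i] is processor i's data."""
    p = len(blocks)
    # Sort locally and take p evenly spaced elements as the local p-sample.
    local = [sorted(b) for b in blocks]
    samples = []
    for b in local:
        step = len(b) // p
        samples.extend(b[j * step] for j in range(p))
    # "Processor 1" sorts all p * p samples and picks p - 1 global splitters.
    samples.sort()
    splitters = [samples[j * p] for j in range(1, p)]
    # Bucket locally by splitter, route bucket i to processor i, re-sort.
    received = [[] for _ in range(p)]
    for b in local:
        for x in b:
            received[bisect.bisect_right(splitters, x)].append(x)
    return [sorted(r) for r in received]
```

Concatenating the returned lists gives the sorted sequence, and under the stated assumptions no list exceeds 2n/p items, in line with the lemma on slide 14. A real implementation would replace the list shuffling with collective operations such as MPI_Alltoallv.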

  17. Performance: Det. Sample Sort

  18. Numerical Integration

  19. Static assignment of processors to segments of [a,b]. Area of one trapezoid of width d between points p and q: area = d (f(p) + f(q)) / 2

  20. Problem: precision depends on curve’s shape
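The static scheme from slide 19 might look like the following sketch, in which a loop iteration stands in for each processor. The function name `trapezoid_static` and the `steps_per_proc` parameter are assumptions introduced for illustration.

```python
def trapezoid_static(f, a, b, p, steps_per_proc=100):
    """Integrate f over [a, b]; processor i handles the i-th of p equal segments."""
    width = (b - a) / p
    partial = []
    for i in range(p):  # each iteration stands in for one processor
        lo = a + i * width
        d = width / steps_per_proc
        area = 0.0
        for k in range(steps_per_proc):
            x0 = lo + k * d
            # Trapezoid rule from the slide: area = d (f(p) + f(q)) / 2.
            area += d * (f(x0) + f(x0 + d)) / 2
        partial.append(area)
    return sum(partial)  # final reduction of the per-processor areas
```

Every processor uses the same strip width regardless of how sharply f bends in its segment, which is the precision problem slide 20 points out.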

  21. Adaptive Quadrature: terminate when C is sufficiently small. Problem: different parts of the curve need different resolution

  22. [Figure: curve divided into segments 1 to 5]
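A recursive sketch of adaptive quadrature is given below. The slides do not define C; here it is assumed to be the difference between the one-trapezoid estimate and the two-trapezoid estimate over the same interval. The name `adaptive_trapezoid` and the tolerance-halving rule are likewise illustrative choices.

```python
def adaptive_trapezoid(f, a, b, tol=1e-6):
    """Integrate f over [a, b], subdividing only where the estimate is poor."""
    mid = (a + b) / 2
    whole = (b - a) * (f(a) + f(b)) / 2
    left = (mid - a) * (f(a) + f(mid)) / 2
    right = (b - mid) * (f(mid) + f(b)) / 2
    c = abs(whole - (left + right))  # assumed meaning of "C" on the slide
    if c <= tol:
        return left + right
    # Split the tolerance between the halves and refine each independently.
    return (adaptive_trapezoid(f, a, mid, tol / 2)
            + adaptive_trapezoid(f, mid, b, tol / 2))
```

The recursion depth differs from one part of the curve to another, so segments assigned statically to processors can carry very different amounts of work.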

  23. Gravitational N-Body Problem

  24. t_s = O(n^2) per time step
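The quadratic cost per time step comes from evaluating every pair of bodies. A minimal 2-D sketch of that all-pairs force computation follows; the function name `direct_forces`, the unit gravitational constant, and the softening term `eps` are assumptions for illustration.

```python
def direct_forces(positions, masses, G=1.0, eps=1e-9):
    """Return the net 2-D gravitational force on each body by all-pairs summation."""
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # n(n-1)/2 pair evaluations
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r2 = dx * dx + dy * dy + eps  # eps avoids division by zero
            r = r2 ** 0.5
            f = G * masses[i] * masses[j] / r2
            fx, fy = f * dx / r, f * dy / r
            forces[i][0] += fx
            forces[i][1] += fy
            forces[j][0] -= fx  # equal and opposite force on body j
            forces[j][1] -= fy
    return forces
```

Tree-based methods reduce this cost by approximating distant groups of bodies, at the price of per-object traversals whose lengths vary.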

  25. for each time step: for each object: traverse the tree to determine the forces on it. Problem: traversals have different lengths

  26. [Figure: objects 1 to 5]
