Extreme Models

Explore the diverse range of parallel computation models, from abstract PRAM to concrete circuit models. Get insights into PRAM submodels and key algorithms like data broadcasting and matrix multiplication. Understand the assumptions and submodels in the PRAM model for efficient parallel processing. Discover the power of semigroup computation, fan-in computation, and parallel prefix computation. Learn about ranking elements in linked lists and performing matrix multiplication efficiently in this insightful chapter.

Presentation Transcript


  1. Extreme Models Part II

  2. The models of parallel computation range from very abstract to very concrete, with real parallel machine implementations and user models falling somewhere between these two extremes. • At one extreme lies the abstract shared-memory PRAM model. • At the other extreme lies the circuit model of parallel processing. • Intermediate models fall between these two extremes.

  3. PRAM and Basic Algorithms

  4. In this chapter, we examine • the relative computational powers of several PRAM submodels • five key building-block algorithms: • data broadcasting • semigroup (fan-in) computation • parallel prefix computation • ranking the elements of a linked list • matrix multiplication

  6. PRAM Submodels and Assumptions (1) • The PRAM model prescribes the concurrent operation of p processors (in SIMD or MIMD mode) on data that are accessible to all of them in an m-word shared memory. • In the synchronous SIMD or SPMD version of PRAM, processor i can do the following in the three phases of one cycle (not all three phases need to be present in every cycle): read a value from a shared-memory location, perform a computation on data held in local registers, and write a value into a shared-memory location.
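
To make the synchronous cycle concrete, here is a minimal Python sketch (an illustrative sequential simulation, not part of the original slides; the function and parameter names are my own) in which p processors execute one lockstep cycle: all reads happen before any computation, and all writes happen after, so every read in a cycle sees the memory contents from before that cycle.

    # Minimal sequential simulation of one synchronous PRAM cycle (illustrative only).
    # Assumed setup: p processors, a shared memory given as a Python list, and
    # per-processor read/compute/write choices supplied as simple functions.

    def pram_cycle(shared, read_addr, compute, write_addr, p):
        """Execute one 3-phase cycle: read, local compute, write."""
        # Phase 1: every processor reads its source word (all reads see old values).
        regs = [shared[read_addr(i)] for i in range(p)]
        # Phase 2: every processor computes on the data in its local register.
        regs = [compute(i, regs[i]) for i in range(p)]
        # Phase 3: every processor writes its result to its destination word.
        for i in range(p):
            shared[write_addr(i)] = regs[i]
        return shared

    # Example: 4 processors each increment the word at their own index.
    mem = [10, 20, 30, 40]
    pram_cycle(mem, read_addr=lambda i: i, compute=lambda i, x: x + 1,
               write_addr=lambda i: i, p=4)
    print(mem)  # [11, 21, 31, 41]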

  7. PRAM Submodels and Assumptions (2) • It is possible that several processors may want to read data from the same memory location or write their values into a common location. • Four submodels of the PRAM model have been defined, according to whether concurrent reads and concurrent writes are allowed: EREW (exclusive-read, exclusive-write), CREW (concurrent-read, exclusive-write), ERCW (exclusive-read, concurrent-write), and CRCW (concurrent-read, concurrent-write).

  9. PRAM Submodels and Assumptions (3) • Here are a few example submodels based on the semantics of concurrent writes in CRCW PRAM: common (the write succeeds only if all processors writing to the location write the same value), arbitrary (one of the competing writes succeeds, chosen arbitrarily), and priority (the write of the processor with the smallest index succeeds).
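
As an illustration of these write-conflict rules (a hedged sketch that is not from the slides; the helper name and its arguments are assumptions), the snippet below resolves a set of simultaneous writes aimed at one shared-memory cell under the common, arbitrary, and priority semantics.

    # Resolve concurrent writes to a single shared-memory cell under different
    # CRCW semantics (illustrative simulation; names are assumptions).

    def resolve_concurrent_write(writes, rule):
        """writes: list of (processor_id, value) pairs aimed at the same cell."""
        if rule == "common":
            # All processors must write the same value; otherwise the result is undefined.
            values = {v for _, v in writes}
            if len(values) != 1:
                raise ValueError("common-CRCW violation: differing values written")
            return values.pop()
        if rule == "arbitrary":
            # Any one of the writes may succeed (here: simply the first in the list).
            return writes[0][1]
        if rule == "priority":
            # The processor with the smallest index wins.
            return min(writes, key=lambda w: w[0])[1]
        raise ValueError("unknown rule")

    writes = [(3, 7), (0, 5), (6, 9)]
    print(resolve_concurrent_write(writes, "priority"))   # 5  (processor 0 wins)
    print(resolve_concurrent_write(writes, "arbitrary"))  # 7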

  10. PRAM Submodels and Assumptions (4) • The following relationships have been established between some of the PRAM submodels: in computational power, EREW ≤ CREW ≤ CRCW, and a p-processor priority-CRCW PRAM can be simulated by a p-processor EREW PRAM with a slowdown factor of Θ(log p).

  11. DATA BROADCASTING (1) • One-to-all broadcasting is used when one processor needs to send a data value to all other processors. • In the CREW or CRCW submodels, broadcasting is trivial: the sending processor writes the data value into a memory location, and all processors read that value in the following machine cycle, taking Θ(1) time. • All-to-all broadcasting, where each of the p processors needs to send a data value to all other processors, can be done through p separate broadcast operations in Θ(p) steps, which is optimal.

  12. DATA BROADCASTING (2) • The above scheme is clearly inapplicable to one-to-all broadcasting in the EREW model. • The simplest scheme for EREW broadcasting is to make p copies of the data value, say in a broadcast vector B of length p, and then let each processor read its own copy by accessing B[j]. • A method known as recursive doubling is used to copy B[0] into all elements of B in ⌈log₂ p⌉ steps.

  14. DATA BROADCASTING (3) • The complete EREW broadcast algorithm with this provision is given below.
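
A minimal Python sketch of the recursive doubling broadcast follows (a sequential simulation of the EREW steps, with function and variable names of my own choosing): in each step, every cell of B that already holds the value copies it a fixed distance ahead, the copy count doubles per step, and the boundary check keeps writes within B[0..p-1], so all p copies exist after ⌈log₂ p⌉ steps and each processor can then read its own B[j] without conflicts.

    import math

    # EREW one-to-all broadcast by recursive doubling (sequential simulation).
    # B[0] initially holds the value to broadcast; after ceil(log2 p) doubling
    # steps every B[j] holds a private copy that processor j can read exclusively.

    def erew_broadcast(value, p):
        B = [None] * p
        B[0] = value
        k = 1                      # number of copies currently available
        while k < p:
            # Cells 0..k-1 each copy their value k positions ahead (distinct
            # sources and distinct destinations, so no concurrent access),
            # stopping at the end of B when p is not a power of 2.
            for j in range(min(k, p - k)):
                B[j + k] = B[j]
            k *= 2
        return B

    print(erew_broadcast(42, 6))    # [42, 42, 42, 42, 42, 42]
    print(math.ceil(math.log2(6)))  # 3 doubling steps suffice for p = 6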

  16. DATA BROADCASTING (4) • To perform all-to-all broadcasting, so that each processor broadcasts a value that it holds to each of the other p - 1 processors, we let Processor j write its value into B[j], rather than into B[0].
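
A hedged sketch of this all-to-all scheme as a sequential simulation: after every processor j writes its value into B[j], the reads are skewed so that in step k processor j reads B[(j + k) mod p]; in each step the p processors touch p distinct cells, respecting the exclusive-read rule.

    # EREW all-to-all broadcast (sequential simulation, illustrative only).
    # Processor j publishes its value in B[j]; reads are skewed so that in each
    # step the p processors touch p distinct cells (no concurrent access).

    def erew_all_to_all(values):
        p = len(values)
        B = list(values)                 # step 0: processor j writes B[j]
        received = [[None] * p for _ in range(p)]
        for k in range(p):               # p read steps (including one's own cell)
            for j in range(p):           # conceptually simultaneous
                src = (j + k) % p
                received[j][src] = B[src]
        return received

    recv = erew_all_to_all([5, 8, 1])
    print(recv[0])  # [5, 8, 1] -- processor 0 now holds everyone's value
    print(recv[2])  # [5, 8, 1]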

  17. DATA BROADCASTING (5) • Given a data vector S of length p, a naive sorting algorithm can be designed based on the above all-to-all broadcasting scheme.
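
One way to realize such a naive sort is sketched below (an illustrative simulation under my own naming, not necessarily the slides' exact formulation): while receiving the other elements, processor j counts how many of them must precede S[j], breaking ties by processor index, and that count is the position into which S[j] is finally written, for Θ(p) steps overall.

    # Naive PRAM sort built on all-to-all broadcasting (sequential simulation).
    # Processor j computes the rank of S[j] -- the number of elements that must
    # come before it, with ties broken by index -- and writes S[j] to that slot.

    def naive_pram_sort(S):
        p = len(S)
        result = [None] * p
        for j in range(p):                        # conceptually simultaneous
            rank = sum(1 for i in range(p)
                       if S[i] < S[j] or (S[i] == S[j] and i < j))
            result[rank] = S[j]                   # ranks are distinct, so writes are exclusive
        return result

    print(naive_pram_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]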

  18. SEMIGROUP OR FAN-IN COMPUTATION • This computation is trivial for a CRCW PRAM of the reduction variety, provided its reduction operator matches the semigroup operation of interest. • Here too the recursive doubling scheme can be used to do the computation on an EREW PRAM.
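
A hedged sketch of the recursive doubling semigroup computation (sequential simulation; addition stands in for an arbitrary associative operator ⊕): in step s, every cell i with i ≥ s combines the value s positions to its left into itself, with s doubling each step, so after ⌈log₂ p⌉ steps the last cell holds the combination of all p inputs.

    # Semigroup (fan-in) computation on an EREW PRAM by recursive doubling
    # (sequential simulation; op may be any associative operation).

    def erew_semigroup(values, op):
        x = list(values)
        p = len(x)
        s = 1
        while s < p:
            # New values depend only on old ones, as in a synchronous PRAM step.
            old = list(x)
            for i in range(s, p):        # conceptually simultaneous
                x[i] = op(old[i - s], old[i])
            s *= 2
        return x[p - 1]                  # combination of all p inputs

    print(erew_semigroup([2, 7, 1, 8, 2, 8], lambda a, b: a + b))  # 28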

  20. PARALLEL PREFIX COMPUTATION • The parallel prefix computation consists essentially of the first phase of the semigroup computation: the recursive doubling phase leaves the prefix result over elements 0 through i in cell i. • Alternative parallel prefix algorithms can be derived using the divide-and-conquer paradigm.
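
The doubling loop above already leaves all prefix results in place; the sketch below instead illustrates a divide-and-conquer formulation (a hedged reconstruction, not necessarily the slides' exact algorithm): compute prefixes recursively on each half, then combine the left half's total into every prefix of the right half.

    # Divide-and-conquer parallel prefix (sequential simulation of the recursion).
    # prefix(x)[i] = x[0] op x[1] op ... op x[i] for an associative op.

    def prefix_dc(x, op):
        n = len(x)
        if n == 1:
            return list(x)
        mid = n // 2
        left = prefix_dc(x[:mid], op)          # the two halves can run in parallel
        right = prefix_dc(x[mid:], op)
        carry = left[-1]                       # total of the left half
        # Combining step: all right-half prefixes are updated in parallel.
        right = [op(carry, r) for r in right]
        return left + right

    print(prefix_dc([2, 7, 1, 8, 2, 8], lambda a, b: a + b))  # [2, 9, 10, 18, 20, 28]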

  22. RANKING THE ELEMENTS OF A LINKED LIST
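
As a hedged illustration of list ranking (the standard PRAM technique is pointer jumping; variable names here are my own): each node starts with rank 1 (0 for the terminal node), and in each of ⌈log₂ p⌉ rounds it adds its successor's rank to its own and jumps its pointer over that successor, ending with every node's distance from the end of the list.

    import math

    # List ranking by pointer jumping (sequential simulation of the PRAM rounds).
    # nxt[i] is the index of node i's successor (None for the last node);
    # the result rank[i] is node i's distance from the end of the list.

    def list_rank(nxt):
        p = len(nxt)
        rank = [0 if nxt[i] is None else 1 for i in range(p)]
        for _ in range(max(1, math.ceil(math.log2(p)))):
            old_rank, old_nxt = list(rank), list(nxt)   # synchronous update
            for i in range(p):                          # conceptually simultaneous
                if old_nxt[i] is not None:
                    rank[i] = old_rank[i] + old_rank[old_nxt[i]]
                    nxt[i] = old_nxt[old_nxt[i]]
        return rank

    # List stored in arbitrary array order: node 2 -> node 0 -> node 3 -> node 1.
    print(list_rank([3, None, 0, 1]))   # [2, 0, 3, 1]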

  25. MATRIX MULTIPLICATION
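
As a hedged illustration of PRAM matrix multiplication (a sketch under my own naming, not necessarily the slides' formulation): with p = m² processors, processor (i, j) computes C[i][j] as the dot product of row i of A and column j of B in Θ(m) steps, matching the Θ(m³) work of the sequential algorithm.

    # PRAM matrix multiplication sketch (sequential simulation): with p = m*m
    # processors, processor (i, j) computes C[i][j] in Theta(m) steps.

    def pram_matmul(A, B):
        m = len(A)
        C = [[0] * m for _ in range(m)]
        for i in range(m):            # all (i, j) pairs are conceptually simultaneous
            for j in range(m):
                acc = 0
                for k in range(m):    # the Theta(m) steps done by processor (i, j)
                    acc += A[i][k] * B[k][j]
                C[i][j] = acc
        return C

    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(pram_matmul(A, B))   # [[19, 22], [43, 50]]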

  29. Skipped Chapters • Chapter 6: More Shared-Memory Algorithms • Chapter 7: Sorting and Selection Networks • Chapter 8: Other Circuit-Level Examples
