Parallel Processing

yanka
Presentation Transcript


  1. Parallel Processing I’ve gotta spend at least 10 hours studying for the IT 344 final! I’m going to study with 9 friends… we’ll be done in an hour.

  2. Next up: TIPS • Mega- = 10^6, Giga- = 10^9, Tera- = 10^12, Peta- = 10^15 • BOPS, anyone? • Light travels about 1 ft per 10^-9 s (1 ns) in free space. • A terahertz uniprocessor could have no clock-to-clock path longer than about 300 microns… • We already know of problems that require more than a TIP (simulations of weather, weapons, brains)
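
The 300-micron figure follows from the speed of light and the clock period. A minimal sketch of that arithmetic, assuming signals travel at the free-space speed of light (real on-chip signals are slower); the function name is illustrative only:

```python
# Speed of light in vacuum, in metres per second (exact SI value).
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def light_distance_per_cycle_microns(clock_hz: float) -> float:
    """Distance light covers in free space during one clock period, in microns."""
    period_s = 1.0 / clock_hz
    return SPEED_OF_LIGHT_M_PER_S * period_s * 1e6


# At 1 GHz the period is 1 ns: roughly 0.3 m, which is about one foot.
one_ghz = light_distance_per_cycle_microns(1e9)
# At 1 THz the period is 1 ps: roughly 300 microns.
one_thz = light_distance_per_cycle_microns(1e12)
```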

  3. Solution: Parallelism • Pipelining – reasonable for a small number of stages (5-10); beyond that, bypassing and stalls become unmanageable. • Superscalar – replicate data paths and design control logic to discover parallelism in traditional programs. • Explicit parallelism – must learn how to write programs that run on multiple CPUs.

  4. Pipelining
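
As a rough illustration of why pipelining helps, the sketch below counts cycles for an idealized pipeline. It assumes every stage takes one cycle and that there are no stalls or hazards, which real pipelines do not achieve; the function names are hypothetical.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """Ideal pipeline: the first instruction needs n_stages cycles to complete,
    then one more instruction completes on every following cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def ideal_speedup(n_instructions: int, n_stages: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts under the assumptions above."""
    return cycles_unpipelined(n_instructions, n_stages) / cycles_pipelined(
        n_instructions, n_stages
    )


# 100 instructions on a 5-stage pipeline: 500 cycles versus 104 cycles.
speedup = ideal_speedup(100, 5)
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows, but never reaches it.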

  5. Superscalar – How far can it go? • Multiple functional units (ALUs, Addr, Floating point, etc.) • Instruction dispatch • Dynamic scheduling • Pipelines • Speculative execution

  6. Explicit Parallelism • Distributed • Transaction-oriented • Geographically dispersed locations • E.g. SETI@home • Parallel • Single goal computing • Computing intense and/or data-intense • High-speed data exchange • Often on custom hardware • E.g. Geochemical surveys

  7. Challenges • For distributed processing, parallelism is given and usually cannot easily change. Programming is relatively easy. • For parallel processing, the programmer defines parallelism by partitioning the serial program(s). Parallel programming in general is more difficult than transaction applications.

  8. Other vocabulary • Decomposition • The way that a program can be broken up for parallel processing • Coarse-grain • Breaks into big chunks (fewer processors) • SMP • Distributed (often) • Fine-grain • Breaks into small chunks (more processors) • Image processing
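
The difference between coarse-grain and fine-grain decomposition can be sketched as a choice of chunk count. This is an illustrative helper, not something from the slides; `decompose` and the `pixels` data are made up for the example.

```python
def decompose(items, n_chunks):
    """Split items into n_chunks contiguous pieces whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks absorb the remainder, one item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


pixels = list(range(12))          # stand-in for image data
coarse = decompose(pixels, 2)     # big chunks, suited to a few processors
fine = decompose(pixels, 6)       # small chunks, suited to many processors
```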

  9. Inter-processor communications (Diagram: a spectrum from loosely-coupled to tightly-coupled, with the labels Custom supercomputers, Distributed processors, Beowulf clusters)

  10. More Terminology • SIMD (Single Instruction Multiple Data) • MIMD (Multiple Instruction Multiple Data) • MISD (Pipeline)

  11. SIMD • Same instruction executed in multiple units, on different data • Examples: Vector processors, AltiVec (Diagram: one instruction I applied to data items D1, D2, D3, D4)
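
A minimal model of the SIMD idea: one instruction, several data lanes. Python runs the lanes one after another here, so this only shows the structure, not the hardware parallelism of a real vector unit; `simd_apply` is a hypothetical name.

```python
from typing import Callable, List


def simd_apply(instruction: Callable[[int], int], lanes: List[int]) -> List[int]:
    """Apply the same instruction to every data lane and return one result per lane."""
    return [instruction(d) for d in lanes]


data = [1, 2, 3, 4]                           # D1..D4
doubled = simd_apply(lambda d: d * 2, data)   # the single instruction I
```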

  12. MIMD • Each unit executes its own instruction on its own data • Examples: Mercury, Beowulf, etc. (Diagram: instruction/data pairs I1 D1, I2 D2, I3 D3, I4 D4)
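
The MIMD arrangement can be sketched with one worker per instruction/data pair. This is an illustration using Python threads, which are an assumption of the example rather than anything the slides prescribe; in CPython, threads show the structure but do not give true CPU parallelism for pure-Python code.

```python
from concurrent.futures import ThreadPoolExecutor


def mimd_run(programs):
    """Run each (instruction, data) pair on its own worker; results keep input order."""
    with ThreadPoolExecutor(max_workers=len(programs)) as pool:
        futures = [pool.submit(instruction, data) for instruction, data in programs]
        return [future.result() for future in futures]


# Four units, each with a different instruction and different data.
programs = [
    (sum, [1, 2, 3]),
    (max, [4, 9, 2]),
    (len, "abcd"),
    (sorted, [3, 1, 2]),
]
results = mimd_run(programs)
```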

  13. MISD (pipeline) (Diagram: data items D1 through D4 pass in sequence through instructions I1, I2, I3, I4)
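
One way to picture the pipeline arrangement is a chain of stages, each applying its own instruction to whatever the previous stage hands it. The sketch below uses Python generators for the chaining; the stage functions are arbitrary examples, and the stages run interleaved rather than truly in parallel.

```python
def stage(instruction, upstream):
    """One pipeline stage: apply its instruction to each item arriving from upstream."""
    for item in upstream:
        yield instruction(item)


def run_pipeline(instructions, data):
    """Chain the stages in order (I1, I2, ...) and drain the final stage."""
    stream = iter(data)
    for instruction in instructions:
        stream = stage(instruction, stream)
    return list(stream)


instructions = [
    lambda d: d + 1,   # I1
    lambda d: d * 2,   # I2
    lambda d: d - 3,   # I3
    lambda d: d * d,   # I4
]
result = run_pipeline(instructions, [1, 2, 3, 4])   # D1..D4
```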

  14. Distributed Programming Tools • C/C++ with TCP/IP • Perl with TCP/IP • Java • CORBA • ASP • .NET

  15. Parallel Programming Tools • PVM • MPI • Synergy • Others (proprietary hardware)

  16. Parallel Programming Difficulties • Program partition and allocation • Data partition and allocation • Program(process) synchronization • Data access mutual exclusion • Dependencies • Process(or) failures • Scalability…
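
Two of the difficulties listed above, process synchronization and mutual exclusion on shared data, can be shown in a few lines. This is a sketch with Python threads and a lock; the class and function names are invented for the example.

```python
import threading


class SharedCounter:
    """A counter whose updates are protected by a lock (mutual exclusion)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute the read-modify-write.
        with self._lock:
            self.value += 1


def count_in_parallel(n_threads: int, increments_per_thread: int) -> int:
    counter = SharedCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    # Synchronization point: wait for every worker before reading the result.
    for t in threads:
        t.join()
    return counter.value


total = count_in_parallel(4, 1000)
```

Without the lock, concurrent increments could interleave and lose updates; with it, the final count equals the number of increments requested.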

  17. Software techniques • Shared Memory Buffers: areas of memory that any node can read or write. • Sockets: provide full-duplex message passing between processes. • Semaphores and Spinlocks: provide locking and synchronization functions. • Mailbox Interrupts: provide an interrupt-driven communication mechanism. • Direct Memory Access: provides asynchronous shared memory buffer I/O.
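
To make "full-duplex message passing" concrete, the sketch below sends a short message in each direction over one connected socket pair. It uses `socket.socketpair()` inside a single process for simplicity, which is an assumption of the example; between real processes or machines the endpoints would typically be connected over TCP/IP instead.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since a stream socket may deliver data in pieces."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return buf


def exchange(message_a: bytes, message_b: bytes):
    """Send one short message in each direction over the same connection."""
    a, b = socket.socketpair()
    try:
        a.sendall(message_a)   # a -> b
        b.sendall(message_b)   # b -> a, on the same pair of endpoints
        received_by_b = recv_exact(b, len(message_a))
        received_by_a = recv_exact(a, len(message_b))
        return received_by_a, received_by_b
    finally:
        a.close()
        b.close()


received_by_a, received_by_b = exchange(b"ping", b"pong")
```

The messages are kept short so that both sends fit in the socket buffers before either side reads.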

  18. Hardware configurations – Interconnects and Memory

  19. Interconnects

  20. Crossbar

  21. Mesh

  22. Interconnects

  23. What it really looks like Note: this computer would rank well on www.top500.org

  24. Summary • Prospects for future CPU architectures: • Pipelining - Well understood, but mined-out • Superscalar - Nearing its practical limits • SIMD - Limited use for special applications • VLIW - Returns control to software. The future? • Prospects for future computer system architectures: • SMP - Limited scalability. Harder than it appears. • MIMD/message-passing - It’s been the future for over 20 years now. How to program?
