
Parallel Processing: Architecture Overview



  1. WW Grid. Parallel Processing: Architecture Overview. Subject Code: 433-498. Grid Computing and Distributed Systems (GRIDS) Lab, The University of Melbourne, Melbourne, Australia. www.gridbus.org. Rajkumar Buyya

  2. Overview of the Talk • Why Parallel Processing? • Parallel Hardware • Parallel Operating Systems • Parallel Programming Paradigms • Grand Challenges

  3. Computing Elements [Figure: layered view of a Multi-Processor Computing System. Layer labels: Applications, Programming paradigms, Threads Interface, Operating System (Microkernel), Hardware (processors P). Legend: Processor, Process, Thread.]

  4. Two Eras of Computing [Figure: timeline from 1940 to 2030 showing the Sequential Era followed by the Parallel Era. In each era, Architectures, System Software/Compiler, Applications, and P.S.Es pass through the phases R & D, Commercialization, and Commodity.]

  5. History of Parallel Processing • Parallel processing (PP) can be traced to a tablet dated around 100 BC. • The tablet has 3 calculating positions. • We infer that the multiple positions were meant to provide reliability and/or speed.

  6. Motivating Factors • We learned to fly not by constructing a machine that flaps its wings like a bird, but by applying the aerodynamic principles demonstrated by nature. • In the same way, PP is modeled after the principles at work in biological species.

  7. Motivating Factors • The aggregate speed with which neurons carry out complex calculations, even though each individual neuron responds slowly (on the order of milliseconds), demonstrates the feasibility of PP.

  8. Why Parallel Processing? • Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (e.g. earthquakes), etc. • Sequential architectures are reaching physical limits (speed of light, thermodynamics).

  9. Human Architecture! [Figure: Growth and Performance plotted against Age (5 to 45 and beyond), with regions labeled Vertical and Horizontal growth.]

  10. Computational Power Improvement [Figure: C.P.I. plotted against No. of Processors (1, 2, ...), comparing a Multiprocessor curve with a Uniprocessor curve.]

  11. Why Parallel Processing? • The technology of PP is mature and can be exploited commercially; there is significant R & D work on the development of tools and environments. • Significant developments in networking technology are paving the way for heterogeneous computing.

  12. Why Parallel Processing? • Hardware improvements such as pipelining and superscalar execution are non-scalable and require sophisticated compiler technology. • Vector processing works well only for certain kinds of problems.

  13. A Parallel Program Has & Needs ... • Multiple “processes” active simultaneously, solving a given problem, generally on multiple processors. • Communication and synchronization among its processes (this forms the core of parallel programming efforts).
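
The two ingredients on this slide, simultaneous workers plus communication and synchronization, can be illustrated with a minimal sketch. It assumes Python threads as stand-ins for the slide's "processes"; the function name `parallel_sum` and the parameter `n_workers` are illustrative and do not come from the talk. In CPython the global interpreter lock prevents a real speedup for this workload, so the sketch shows the structure only.

```python
import queue
import threading


def parallel_sum(data, n_workers=4):
    """Sum `data` with n_workers threads that report partial sums over a queue."""
    results = queue.Queue()  # communication channel: workers -> parent
    chunk = (len(data) + n_workers - 1) // n_workers  # ceiling division

    def worker(part):
        results.put(sum(part))  # communicate the partial result

    threads = [
        threading.Thread(target=worker, args=(data[i * chunk:(i + 1) * chunk],))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronization: wait until every worker has finished

    total = 0
    while not results.empty():  # safe here: all producers have been joined
        total += results.get()
    return total
```

The queue carries the communication and `join()` provides the synchronization; everything else is ordinary sequential code.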

  14. Processing Elements Architecture

  15. Processing Elements • Simple classification by Flynn (by number of instruction and data streams): • SISD - conventional • SIMD - data parallel, vector computing • MISD - systolic arrays • MIMD - very general, multiple approaches. • Current focus is on the MIMD model, using general-purpose processors (no shared memory).
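
Flynn's classification depends on only two counts, so it can be written down as a small lookup. The following is a sketch; the function name `flynn_class` and its argument names are hypothetical and not part of the talk.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn class for the given numbers of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    i = "S" if instruction_streams == 1 else "M"  # Single or Multiple Instruction
    d = "S" if data_streams == 1 else "M"         # Single or Multiple Data
    return f"{i}I{d}D"
```

For example, `flynn_class(1, 8)` returns `"SIMD"` (one instruction stream applied to eight data streams), while `flynn_class(4, 4)` returns `"MIMD"`.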

  16. SISD: A Conventional Computer [Figure: a single Processor receiving Instructions and Data Input and producing Data Output.] • Speed is limited by the rate at which the computer can transfer information internally. Ex: PC, Macintosh, workstations.

  17. The MISD Architecture [Figure: Instruction Streams A, B, and C drive Processors A, B, and C, which share one Data Input Stream and one Data Output Stream.] • More of an intellectual exercise than a practical configuration. A few have been built, but none are commercially available.

  18. SIMD Architecture [Figure: a single Instruction Stream drives Processors A, B, and C; each processor has its own Data Input stream and Data Output stream.] Operation: Ci <= Ai * Bi. Ex: CRAY machine vector processing, Thinking machine cm*, Intel MMX (multimedia support).
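
The operation Ci <= Ai * Bi means that one multiply instruction is applied to every pair of elements. A minimal sketch of those semantics follows; the name `simd_multiply` is illustrative, and plain Python evaluates the lanes one after another, whereas SIMD hardware would process them in lockstep.

```python
def simd_multiply(a, b):
    """Apply the single instruction C[i] = A[i] * B[i] to every data lane."""
    if len(a) != len(b):
        raise ValueError("both operands need the same number of lanes")
    # One instruction (multiply), many data elements (the lanes of a and b).
    return [x * y for x, y in zip(a, b)]
```

Calling `simd_multiply([1, 2, 3], [4, 5, 6])` returns `[4, 10, 18]`.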

  19. MIMD Architecture [Figure: Instruction Streams A, B, and C each drive their own processor (A, B, C); each processor has its own Data Input stream and Data Output stream.] • Unlike SISD and MISD machines, an MIMD computer works asynchronously. • Two variants: shared memory (tightly coupled) MIMD and distributed memory (loosely coupled) MIMD.

  20. Shared Memory MIMD Machine [Figure: Processors A, B, and C, each connected by a memory bus to a Global Memory System.] • Communication: the source PE writes data to global memory (GM) and the destination PE retrieves it. • Easy to build; conventional OSes for SISD machines can easily be ported. • Limitations: reliability and expandability. A failure of a memory component or of any processor affects the whole system. • Increasing the number of processors leads to memory contention. Ex.: Silicon Graphics supercomputers.
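
The communication pattern on this slide (source writes to global memory, destination retrieves it) can be sketched as follows. This assumes Python threads as the processing elements; `GlobalMemory`, `write`, `read`, and `demo` are hypothetical names for illustration, not an API from the talk.

```python
import threading


class GlobalMemory:
    """A shared store that every processing element (thread) can access."""

    def __init__(self):
        self._cells = {}
        self._cond = threading.Condition()  # guards _cells and signals writes

    def write(self, address, value):
        with self._cond:
            self._cells[address] = value
            self._cond.notify_all()  # wake any reader waiting for new data

    def read(self, address, timeout=5.0):
        with self._cond:
            # Block until some PE has written to this address.
            if not self._cond.wait_for(lambda: address in self._cells, timeout=timeout):
                raise TimeoutError(f"nothing written to {address!r}")
            return self._cells[address]


def demo():
    gm = GlobalMemory()
    received = []
    dest = threading.Thread(target=lambda: received.append(gm.read("x")))
    src = threading.Thread(target=lambda: gm.write("x", 42))
    dest.start()
    src.start()
    dest.join()
    src.join()
    return received
```

Every access goes through the single lock inside `GlobalMemory`, which is a small-scale picture of the memory contention the slide warns about as processors are added.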

  21. Distributed Memory MIMD [Figure: Processors A, B, and C, each connected by its own memory bus to a private memory system (Memory System A, B, C); neighboring processors are linked by IPC channels.] • Communication: IPC over a high-speed network. • The network can be configured as a tree, mesh, cube, etc. • Unlike shared memory MIMD, it is: • easily/readily expandable • highly reliable (a CPU failure does not affect the whole system).
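
In the distributed memory model, processing elements exchange data only through IPC channels. A minimal sketch follows, assuming a ring topology with one queue per channel; the slide lists tree, mesh, and cube, so the ring is chosen here only for brevity. The name `ring_sum` is illustrative, and the `totals` list exists only so the caller can inspect the outcome.

```python
import queue
import threading


def ring_sum(values):
    """Each PE owns one value; after n-1 message rounds every PE knows the total."""
    n = len(values)
    channels = [queue.Queue() for _ in range(n)]  # channels[i]: IPC channel into PE i
    totals = [None] * n                           # result collection for the demo only

    def pe(rank):
        local = values[rank]  # private memory of this PE
        total = local
        token = local
        for _ in range(n - 1):
            channels[(rank + 1) % n].put(token)      # send to the right neighbor
            token = channels[rank].get(timeout=5.0)  # receive from the left neighbor
            total += token
        totals[rank] = total

    threads = [threading.Thread(target=pe, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

With `ring_sum([1, 2, 3, 4])` every element of the returned list is `10`: no PE ever reads another PE's value directly, only the messages that arrive on its own channel.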

  22. Laws of Caution • The speed of computers is proportional to the square of their cost: speed = cost², i.e. cost = √speed. • The speedup of a parallel computer increases as the logarithm of the number of processors: speedup = log₂(no. of processors). [Figures: speed S plotted against cost C (S = C²); speedup S plotted against number of processors P (S = log₂ P).]
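
Both laws are simple enough to evaluate directly. The sketch below is only an illustration of the two formulas on this slide, which are commonly attributed to Grosch and Minsky respectively; the function names and the proportionality constant `k` are assumptions introduced here.

```python
import math


def grosch_speed(cost, k=1.0):
    """Speed predicted from cost when speed is proportional to cost squared."""
    return k * cost ** 2


def minsky_speedup(processors):
    """Speedup predicted as the base-2 logarithm of the processor count."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return math.log2(processors)
```

Under the first law, doubling the budget quadruples the speed (`grosch_speed(4) / grosch_speed(2)` is `4.0`). Under the second, 1024 processors yield a speedup of only `minsky_speedup(1024)`, which is `10.0`.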

  23. Caution.... • Very fast development in PP and related areas has blurred concept boundaries, causing a lot of terminological confusion: concurrent computing/programming, parallel computing/processing, multiprocessing, distributed computing, etc.

  24. It’s hard to imagine a field that changes as rapidly as computing.

  25. Caution.... • Computer science is an immature science (it lacks a standard taxonomy and terminology).

  26. Caution.... • Even well-defined distinctions such as shared memory versus distributed memory are merging due to new advances in technology. • Good environments for development and debugging are yet to emerge.

  27. Caution.... • There are no strict delimiters for contributors to the area of parallel processing: CA, OS, HLLs, databases, and computer networks all have a role to play. • This makes it a hot topic of research.

  28. Operating Systems for High Performance Computing

  29. Types of Parallel Systems • Shared Memory Parallel: the smallest extension to existing systems; program conversion is incremental. • Distributed Memory Parallel: completely new systems; programs must be reconstructed. • Clusters: a slow-communication form of distributed memory parallel.

  30. Operating Systems for PP • MPP systems with thousands of processors require an OS radically different from current ones. • Every CPU needs an OS: • to manage its resources • to hide its details. • Traditional systems are heavy and complex, and are not suitable for MPP.

  31. Operating System Models • A framework that unifies the features, services, and tasks performed. • Three approaches to building an OS: • Monolithic OS • Layered OS • Microkernel-based OS (client-server OS; suitable for MPP systems). • Simplicity, flexibility, and high performance are crucial for an OS.

  32. Monolithic Operating System [Figure: Application Programs run in User Mode; System Services run in Kernel Mode directly above the Hardware.] • Better application performance • Difficult to extend. Ex: MS-DOS

  33. Layered OS [Figure: Application Programs run in User Mode; in Kernel Mode the layers are System Services, Memory & I/O Device Mgmt, and Process Schedule, above the Hardware.] • Easier to enhance • Each layer of code accesses the interface of the layer below it • Lower application performance. Ex: UNIX

  34. Traditional OS [Figure: Application Programs run in User Mode; the OS runs in Kernel Mode above the Hardware. The kernel-mode OS is labeled as the domain of the OS Designer.]

  35. New Trend in OS Design [Figure: Servers and Application Programs run in User Mode; only the Microkernel runs in Kernel Mode above the Hardware.]

  36. Microkernel/Client-Server OS (for MPP Systems) [Figure: in User mode, a Client Application with a Thread lib. alongside a File Server, Network Server, and Display Server; in Kernel mode, the Microkernel above the Hardware. Clients and servers interact through Send and Reply messages.] • A tiny OS kernel providing basic primitives (process, memory, IPC) • Traditional services become subsystems • Application performance competitive with a monolithic OS • OS = Microkernel + User Subsystems. Ex: Mach, PARAS, Chorus, etc.
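
The Send/Reply structure of a client-server OS can be sketched in a few lines. This is a toy model under stated assumptions: the class names `Microkernel` and `FileServer`, the dictionary message format, and the `"file"` server name are all hypothetical, and a real microkernel such as Mach exposes a very different interface.

```python
class Microkernel:
    """Provides only message passing; every service lives in a user-level server."""

    def __init__(self):
        self._servers = {}

    def register(self, name, handler):
        self._servers[name] = handler

    def send(self, server, request):
        # Send: deliver the request to the named server. Reply: its return value.
        if server not in self._servers:
            raise LookupError(f"no such server: {server}")
        return self._servers[server](request)


class FileServer:
    """A traditional OS service implemented as a user-level subsystem."""

    def __init__(self):
        self._files = {}

    def __call__(self, request):
        op = request.get("op")
        if op == "write":
            self._files[request["path"]] = request["data"]
            return {"status": "ok"}
        if op == "read":
            if request["path"] not in self._files:
                return {"status": "error", "reason": "not found"}
            return {"status": "ok", "data": self._files[request["path"]]}
        return {"status": "error", "reason": "unknown op"}
```

A client would call `kernel.send("file", {"op": "read", "path": "/a"})` after `kernel.register("file", FileServer())`. Adding a network or display server means registering another handler; the kernel itself does not change, which is the extensibility argument behind OS = Microkernel + User Subsystems.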

  37. A Few Popular Microkernel Systems • Mach (CMU) • PARAS (C-DAC) • Chorus • QNX • (Windows)
