
Programming Multicore Processors



Presentation Transcript


  1. Programming Multicore Processors • Aamir Shafi • High Performance Computing Lab • http://hpc.seecs.nust.edu.pk

  2. Serial Computation • Traditionally, software has been written for serial computation: • To be run on a single computer having a single Central Processing Unit (CPU) • A problem is broken into a discrete series of instructions

  3. Parallel Computation • Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem: • Also known as High Performance Computing (HPC) • The prime focus of HPC is performance: the ability to solve the largest possible problems in the least possible time

  4. Traditional Usage of Parallel Computing: Scientific Computing • Traditionally, parallel computing has been used to solve challenging scientific problems through simulation: • For this reason, it is also called “Scientific Computing”: • Computational science

  5. Emergence of Multi-core Processors • In the last decade, processor performance has no longer been improved by raising the clock speed: • Increasing the clock speed directly increases power consumption • That power is dissipated as heat, making the processors impractical to cool • Intel canceled a project to produce a 4 GHz processor! • This led to the emergence of multi-core processors: • Performance is increased by adding processing cores that run at a lower clock speed: • Implies better power usage • A disruptive technology!

  6. Moore’s Law is Alive and Well

  7. Power Wall

  8. Why Multi-core Processors Consume Less Power • Dynamic power is proportional to C·V²·f (capacitance × supply voltage squared × clock frequency) • Increasing the frequency (f) also requires increasing the supply voltage (V): a more than linear effect on power • Increasing the number of cores increases the capacitance (C), but that has only a linear effect
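As a rough illustration (the scaling factors below are assumed for the sake of the example, not measured values), the following C snippet plugs numbers into P ∝ C·V²·f to show why two slower cores can draw less power than one faster core:

/* Illustrative only: compares dynamic power P = C * V^2 * f for one fast
 * core versus two slower cores; all factors are normalized assumptions. */
#include <stdio.h>

static double dynamic_power(double c, double v, double f) {
    return c * v * v * f;   /* P is proportional to C * V^2 * f */
}

int main(void) {
    double base   = dynamic_power(1.0, 1.0, 1.0);   /* one core, baseline          */
    double faster = dynamic_power(1.0, 1.2, 1.2);   /* +20% f forces ~+20% V too   */
    double dual   = dynamic_power(2.0, 0.8, 0.6);   /* two cores, lower V and f    */

    printf("baseline         : %.2f\n", base);      /* 1.00                        */
    printf("20%% higher clock : %.2f\n", faster);   /* ~1.73: roughly cubic growth */
    printf("dual core, slower: %.2f\n", dual);      /* ~0.77: below the baseline   */
    return 0;
}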

  9. Software in the Multi-core Era • The challenge has been thrown to the software industry: • Parallelism is perhaps the answer • The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software: • http://www.gotw.ca/publications/concurrency-ddj.htm • Some excerpts: • The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency • This essentially means every software programmer will be a parallel programmer: • The main motivation behind conducting this “Programming Multicore Processors” workshop

  10. About the “Programming Multicore Processors” Workshop

  11. Course Contents … A little background on Parallel Computing Approaches

  12. Parallel Hardware • Three main classifications: • Shared Memory Multi-processors: • Symmetric Multi-Processors (SMP) • Multi-core Processors • Distributed Memory Multi-processors • Massively Parallel Processors (MPP) • Clusters: • Commodity and custom clusters • Hybrid Multi-processors: • Mixture of shared and distributed memory technologies

  13. First Type: Shared Memory Multi-processors • All processors have access to shared memory: • Notion of “Global Address Space”
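As a quick illustration of the global address space (a minimal sketch, not taken from the slides; the file name and gcc -lpthread build line are assumed), two POSIX threads below increment the same counter directly because they share one memory:

/* Build (assumed): gcc shared_counter.c -lpthread */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;   /* one variable, visible to every thread */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* serialize access to the shared variable */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* 200000: both threads wrote the same memory */
    return 0;
}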

  14. Symmetric Multi-Processors (SMP) • An SMP is a parallel processing system with a shared-everything approach: • The term signifies that each processor shares the main memory and possibly the cache • Typically an SMP can have 2 to 256 processors • Also called Uniform Memory Access (UMA) • Examples include AMD Athlon, AMD Opteron 200 and 2000 series, Intel Xeon, etc.

  15. Multi-core Processors

  16. Second Type: Distributed Memory • Each processor has its own local memory • Processors communicate with each other by message passing on an interconnect

  17. Cluster Computers • A group of PCs, workstations, or Macs (called nodes) connected to each other via a fast (and private) interconnect: • Each node is an independent computer • Each cluster has one head-node and multiple compute-nodes: • Users log on to the head-node and start parallel jobs on the compute-nodes • Two popular cluster classifications: • Beowulf Clusters (http://www.beowulf.org) • Rocks Clusters (http://www.rocksclusters.org)

  18. Cluster Computer [diagram: processors Proc 0 through Proc 7 connected in a cluster]

  19. Third Type: Hybrid • Modern clusters have hybrid architecture: • Distributed memory for inter-node (between nodes) communications • Shared memory for intra-node (within a node) communications
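To make the hybrid model concrete, here is a minimal sketch (my own illustration under assumed setup, not from the slides): MPI carries inter-node messages while OpenMP threads share memory within each node. The mpicc -fopenmp build command is an assumption.

/* Build (assumed): mpicc -fopenmp hybrid.c */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* typically one MPI process per node */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    #pragma omp parallel                    /* threads inside the node share memory */
    printf("process %d of %d, thread %d of %d\n",
           rank, size, omp_get_thread_num(), omp_get_num_threads());

    MPI_Finalize();
    return 0;
}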

  20. SMP and Multi-core clusters • Most modern commodity clusters have SMP and/or multi-core nodes: • Processors not only communicate via interconnect, but shared memory programming is also required • This trend is likely to continue: • Even a new name “constellations” has been proposed

  21. Classification of Parallel Computers [diagram: Parallel Hardware divides into Shared Memory Hardware (SMPs, Multicore Processors) and Distributed Memory Hardware (MPPs, Clusters)] • In this workshop, we will learn how to program shared memory parallel hardware (Parallel Hardware → Shared Memory Hardware)

  22. Writing Parallel Software • There are mainly two approaches to writing parallel software • The first approach is to use libraries (packages) written in already existing languages: • Economical • The second and more radical approach is to provide new languages: • Parallel computing has a history of novel parallel languages • These languages provide high-level parallelism constructs

  23. Shared Memory Languages and Libraries • Designed to support parallel programming on shared memory platforms: • OpenMP: • Consists of a set of compiler directives, library routines, and environment variables • The runtime uses the fork-join model of parallel execution • Cilk++: • A design goal was to support asynchronous parallelism • A set of keywords: • cilk_for, cilk_spawn, cilk_sync … • POSIX Threads (PThreads) • Threads Building Blocks (TBB)
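For a flavour of OpenMP's fork-join model, here is a minimal sketch in C (an illustration, not from the slides; the gcc -fopenmp build line is assumed): the master thread forks a team of threads at the parallel directive and joins them when the loop ends.

/* Build (assumed): gcc -fopenmp harmonic.c */
#include <omp.h>
#include <stdio.h>

int main(void) {
    double sum = 0.0;

    /* fork: iterations are split among the team; reduction(+:sum) gives each
     * thread a private copy of sum and combines the copies at the join */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 1000; i++) {
        sum += 1.0 / i;
    }
    /* join: only the master thread runs past this point */

    printf("harmonic sum = %f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}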

  24. Distributed Memory Languages and Libraries • Libraries: • Message Passing Interface (MPI): the de facto standard • PVM • Languages: • High Performance Fortran (HPF) • Fortran M • HPJava
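As a minimal sketch of message passing with MPI (an illustration under assumed setup; the mpirun launch line is an assumption), rank 0 below sends an integer from its own local memory to rank 1 over the interconnect:

/* Run (assumed): mpirun -np 2 ./send_recv */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;   /* exists only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d via message passing\n", value);
    }

    MPI_Finalize();
    return 0;
}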

  25. Our Focus • Shared Memory and Multi-core Processor Machines: • Using POSIX Threads • Using OpenMP • Using Cilk++ (covered briefly) • Disruptive Technology: • Using Graphics Processing Units (GPUs) by NVIDIA for general-purpose computing • We are assuming that all of us know the C programming language…

  26. Day One

  27. Day Two

  28. Day Three

  29. Learning Objectives • To become aware of the multicore revolution and its impact on the computer software industry • To program multicore processors using POSIX Threads • To program multicore processors using OpenMP and Cilk++ • To program Graphics Processing Units (GPUs) for general purpose computation (using NVIDIA CUDA API) You may download the tentative agenda from http://hpc.seecs.nust.edu.pk/~aamir/res/mc_agenda.pdf

  30. Next Session • Review of important and relevant Operating Systems and Computer Architecture concepts by Akbar Mehdi ….
