
Ubiquitous Parallelism




Presentation Transcript


  1. Ubiquitous Parallelism: Are You Equipped To Code For Multi- and Many-Core Platforms?

  2. Agenda • Introduction/Motivation • Why Parallelism? Why now? • Survey of Parallel Hardware • CPUs vs. GPUs • Conclusion • How Can I Start?

  3. Talk Goal • Encourage undergraduates to answer the call to the era of parallelism • Education • Software Engineering

  4. Why Parallelism? Why now? • You’ve already been exposed to parallelism • Bit Level Parallelism • Instruction Level Parallelism • Thread Level Parallelism

  5. Why Parallelism? Why now? • Single-threaded performance has plateaued • Silicon Trends • Power Consumption • Heat Dissipation

  6. Why Parallelism? Why now?

  7. Power Chart: P = C·V²·F

  8. Heat Chart (Feature Size)

  9. Why Parallelism? Why now? • Issue: Power & Heat • Good: Cheaper to have more, but individually slower, cores • Bad: Breaks hardware/software contract
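
A rough worked example of why several slower cores can be cheaper in power than one fast core, using the dynamic power relation from the Power Chart slide. This is only a sketch: it assumes, as a simplification the slides do not state, that supply voltage V can be lowered in proportion to clock frequency F, and that the workload splits perfectly across two cores.

```latex
\begin{align*}
P_{\text{one fast core}}  &= C V^2 F \\
P_{\text{one slow core}}  &= C \left(\tfrac{V}{2}\right)^{2} \tfrac{F}{2} = \tfrac{1}{8}\, C V^2 F \\
P_{\text{two slow cores}} &= 2 \cdot \tfrac{1}{8}\, C V^2 F = \tfrac{1}{4}\, C V^2 F
\end{align*}
```

Under these assumptions, two cores at F/2 offer the same aggregate clock rate (2 × F/2 = F) for roughly a quarter of the power. Real chips cannot scale voltage this freely, so the actual saving is smaller.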

  10. Why Parallelism? Why now? • Hardware/Software Contract • Maintain backwards-compatibility with existing codes

  11. Why Parallelism? Why now?

  12. Agenda • Introduction/Motivation • Why Parallelism? Why now? • Survey of Parallel Hardware • CPUs vs. GPUs • Conclusion • How Can I Start?

  13. Personal Mobile Device Space • iPhone 5 • Galaxy S3

  14. Personal Mobile Device Space • iPhone 5: 2 CPU cores / 3 GPU cores • Galaxy S3

  15. Personal Mobile Device Space • iPhone 5: 2 CPU cores / 3 GPU cores • Galaxy S3: 4 CPU cores / 4 GPU cores

  16. Desktop Space

  17. Desktop Space • AMD Opteron 6272: 16 CPU cores • Rare To Have “Single Core” CPU • Clock Speeds < 3.0 GHz • Power Wall • Heat Dissipation

  18. Desktop Space • AMD Radeon 7970: 2048 GPU cores • General Purpose • Power Efficient • High Performance • Not All Problems Can Be Done on GPU

  19. Warehouse Space (HokieSpeed) • Each node: • 2x Intel Xeon 5645 (6 cores each) • 2x NVIDIA C2050 (448 GPU cores each)

  20. Warehouse Space (HokieSpeed) • Each node: • 2x Intel Xeon 5645 (6 cores each) • 2x NVIDIA C2050 (448 GPU cores each) • 209 nodes

  21. Warehouse Space (HokieSpeed) • Each node: • 2x Intel Xeon 5645 (6 cores each) • 2x NVIDIA C2050 (448 GPU cores each) • 209 nodes • 2,508 CPU cores • 187,264 GPU cores

  22. All Spaces

  23. Convergence in Computing • Three Classes: • Warehouse • Desktop • Personal Mobile Device • Main Criteria • Power, Performance, Programmability

  24. Agenda • Introduction/Motivation • Why Parallelism? Why now? • Survey of Parallel Hardware • CPUs vs. GPUs • Conclusion • How Can I Start?

  25. What is a CPU? • CPU: SR-71 jet • Capacity: 2 passengers • Top speed: 2200 mph

  26. What is the GPU? • GPU: Boeing 747 • Capacity: 605 passengers • Top speed: 570 mph

  27. CPU vs. GPU

  28. CPU Architecture • Latency Oriented (Speculation)

  29. GPU Architecture

  30. APU = CPU + GPU • Accelerated Processing Unit • Both CPU + GPU on the same die

  31. CPUs, GPUs, APUs • How to handle parallelism? • How to extract performance? • Can I just throw processors at a problem?

  32. CPUs, GPUs, APUs • CPUs: multi-threading (2-16 threads) • GPUs: massive multi-threading (100,000+ threads) • Which to use depends on your problem

  33. Agenda • Introduction/Motivation • Why Parallelism? Why now? • Survey of Parallel Hardware • CPUs vs. GPUs • Conclusion • How Can I Start?

  34. How Can I Start? • CUDA Programming • You most likely have a CUDA-enabled GPU if you have a recent NVIDIA card

  35. How Can I Start? • CPU or GPU Programming • Use OpenCL (your laptop can potentially run it)

  36. How Can I start? • Undergraduate research • Senior/Grad Courses: • CS 4234 – Parallel Computation • CS 5510 – Multiprocessor Programming • ECE 4504/5504 – Computer Architecture • CS 5984 – Advanced Computer Graphics

  37. In Summary … • Parallelism is here to stay • How does this affect you? • How fast is fast enough? • Are we content with current computer performance?

  38. Thank you! • Carlo del Mundo • Senior, Computer Engineering • Website: http://filebox.vt.edu/users/cdel/ • E-mail: cdel@vt.edu • Previous internships: [logos not preserved]

  39. Appendix

  40. Programming Models • pthreads • MPI • CUDA • OpenCL

  41. pthreads • A UNIX API to create and destroy threads
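
A minimal sketch of the create/destroy lifecycle this slide describes, using the POSIX calls `pthread_create` and `pthread_join`. The names `worker`, `worker_arg`, `run_workers` and `MAX_WORKERS` are hypothetical and chosen only for illustration, and the squaring step is a stand-in for real work.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16  /* illustrative upper bound, not a pthreads limit */

/* Hypothetical per-thread argument: each worker gets its own slot. */
typedef struct {
    int id;      /* input: index of this worker */
    int result;  /* output: written by the worker thread */
} worker_arg;

/* Thread entry point: pthreads passes and returns untyped pointers. */
static void *worker(void *p) {
    worker_arg *arg = p;
    arg->result = arg->id * arg->id;  /* stand-in for real work */
    return NULL;
}

/* Creates `count` threads, waits for all of them, and returns the sum of
 * their results. Returns -1 if a thread could not be created. */
int run_workers(int count) {
    pthread_t threads[MAX_WORKERS];
    worker_arg args[MAX_WORKERS];
    assert(count >= 0 && count <= MAX_WORKERS);

    int started = 0;
    int failed = 0;
    for (int i = 0; i < count; i++) {
        args[i].id = i;
        args[i].result = 0;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Joining waits for each thread to finish and releases its resources. */
    int sum = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        sum += args[i].result;
    }
    return failed ? -1 : sum;
}
```

Calling `run_workers(4)` from a `main` function starts four threads and sums 0 + 1 + 4 + 9. With GCC or Clang the file is typically built with the `-pthread` flag.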

  42. MPI • A communications protocol • “Send and Receive” messages between nodes

  43. CUDA • Massive multi-threading (100,000+) • Thread-level parallelism

  44. OpenCL • Heterogeneous programming model that is catered to several devices (CPUs, GPUs, APUs)

  45. Comparisons † Productivity is subjective and draws from my experiences

  46. Parallel Applications • Vector Add • Matrix Multiplication

  47. Vector Add

  48. Vector Add • Serial • Loop N times • N cycles† • Parallel • Assume you have N cores • 1 cycle† • † Assume 1 add = 1 cycle
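
The serial and parallel cases above can be sketched in C with pthreads. This is one possible decomposition, not necessarily the one used in the talk: the N elements are split into contiguous chunks, one per thread, so with p threads each thread performs about ceil(N / p) adds (the slide's ideal case is p = N). The names `chunk`, `add_chunk`, `vector_add_serial`, `vector_add_parallel` and `MAX_THREADS` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* illustrative upper bound */

/* Hypothetical description of one thread's share of the vectors. */
typedef struct {
    const float *a, *b;  /* inputs */
    float *c;            /* output */
    size_t begin, end;   /* half-open index range [begin, end) */
} chunk;

/* Serial version: one loop of n adds. */
void vector_add_serial(const float *a, const float *b, float *c, size_t n) {
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Thread entry point: adds only the elements in this thread's chunk. */
static void *add_chunk(void *p) {
    const chunk *ch = p;
    for (size_t i = ch->begin; i < ch->end; i++)
        ch->c[i] = ch->a[i] + ch->b[i];
    return NULL;
}

/* Parallel version: returns 0 on success, -1 if a thread could not be
 * created (in which case c is only partially filled). */
int vector_add_parallel(const float *a, const float *b, float *c,
                        size_t n, int nthreads) {
    pthread_t tid[MAX_THREADS];
    chunk work[MAX_THREADS];
    assert(nthreads >= 1 && nthreads <= MAX_THREADS);

    size_t per = (n + (size_t)nthreads - 1) / (size_t)nthreads;  /* ceil(n / nthreads) */
    int started = 0;
    int status = 0;
    for (int t = 0; t < nthreads; t++) {
        size_t begin = (size_t)t * per;
        size_t end = begin + per;
        if (begin > n) begin = n;  /* surplus threads get an empty range */
        if (end > n) end = n;
        work[t] = (chunk){a, b, c, begin, end};
        if (pthread_create(&tid[t], NULL, add_chunk, &work[t]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return status;
}
```

No locking is needed because each thread writes a disjoint range of `c`. In practice, thread start-up costs far more than one add, so the "1 cycle" figure is an idealized bound rather than a measured speedup.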

  49. Matrix Multiplication

  50. Matrix Multiplication
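
The figures for these two slides are not preserved in the transcript. As a hedged illustration only, the sketch below shows one common way to parallelize a square matrix product with pthreads: each thread computes a contiguous block of output rows. The row-wise split, the row-major layout, and the names `row_block`, `multiply_rows`, `matmul_parallel` and `MAX_THREADS` are all assumptions, not taken from the talk.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* illustrative upper bound */

/* Hypothetical description of the output rows one thread computes. */
typedef struct {
    const double *a, *b;        /* inputs: n x n, row-major */
    double *c;                  /* output: n x n, row-major */
    size_t n;
    size_t row_begin, row_end;  /* half-open row range */
} row_block;

/* Thread entry point: standard triple loop restricted to this row range. */
static void *multiply_rows(void *p) {
    const row_block *rb = p;
    size_t n = rb->n;
    for (size_t i = rb->row_begin; i < rb->row_end; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += rb->a[i * n + k] * rb->b[k * n + j];
            rb->c[i * n + j] = sum;
        }
    }
    return NULL;
}

/* Computes c = a * b. Returns 0 on success, -1 if a thread could not be
 * created (in which case c is only partially filled). */
int matmul_parallel(const double *a, const double *b, double *c,
                    size_t n, int nthreads) {
    pthread_t tid[MAX_THREADS];
    row_block work[MAX_THREADS];
    assert(nthreads >= 1 && nthreads <= MAX_THREADS);

    size_t per = (n + (size_t)nthreads - 1) / (size_t)nthreads;  /* rows per thread */
    int started = 0;
    int status = 0;
    for (int t = 0; t < nthreads; t++) {
        size_t begin = (size_t)t * per;
        size_t end = begin + per;
        if (begin > n) begin = n;  /* surplus threads get no rows */
        if (end > n) end = n;
        work[t] = (row_block){a, b, c, n, begin, end};
        if (pthread_create(&tid[t], NULL, multiply_rows, &work[t]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return status;
}
```

Output rows are independent of one another, so the threads never write to the same element and need no synchronization beyond the final joins.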
