
Tesla: Fastest Processor Adoption in HPC History

mikkel

Presentation Transcript


  1. Tesla: Fastest Processor Adoption in HPC History http://www.nvidia.com/tesla

  2. GPU Computing: CPU + GPU Co-Processing (Heterogeneous Computing). CPU: 4 cores. GPU: 240 cores.

  3. Computation Discontinuity Double Precision debut

  4. 50x to 150x speedups: 146X Medical Imaging (U of Utah); 36X Molecular Dynamics (U of Illinois, Urbana); 18X Video Transcoding (Elemental Tech); 50X Matlab Computing (AccelerEyes); 100X Astrophysics (RIKEN); 149X Financial simulation (Oxford); 47X Linear Algebra (Universidad Jaime); 20X 3D Ultrasound (Techniscan); 130X Quantum Chemistry (U of Illinois, Urbana); 30X Gene Sequencing (U of Maryland)

  5. NVIDIA Tesla 10-Series GPU: Massively parallel, many-core architecture; 240 Processor Cores; 1 Teraflops (1,000 times Cray X-MP); IEEE Compliant Double Precision Floating Point; Designed for Scientific Computing. [Block diagram: Processors, each with an L1 cache; Processor Communication Fabric; Memory & I/O; Fixed Function Acceleration]

  6. Tesla GPU Computing Products: Tesla C1060 Computing Board; Tesla S1070 1U System

  7. New Class of Hybrid CPU-GPU Servers: SuperMicro 1U GPU Server (2 Tesla M1060 GPUs); Bull Bullx Blade Enclosure (up to 18 Tesla M1060 GPUs)

  8. [Chart: performance (1x, 100x, 10,000x) against cost (K$ to M$)] Tesla Co-processing Cluster; Tesla Personal Supercomputer; Traditional CPU Cluster; CPU Workstation

  9. UPenn: Finding a Better Shampoo. Tesla PSC vs. 32 CPU Servers • Equal Performance (1 : 1) • No Data Center Required • 13x Lower Cost: ~$7 K vs. $128 K • 9.6x Lower Power: 1 kWatt vs. 19.2 kWatts

  10. Finance: Equity Pricing. 2 Tesla S1070s vs. 500 CPU Servers • Equal Performance (1 : 1) • 16x Less Space • 10x Lower Cost: $24 K vs. $250 K • 13x Lower Power: 2.8 kWatts vs. 37.5 kWatts

  11. Oil & Gas: Seismic Processing. 32 Tesla S1070s vs. 2000 CPU Servers • Equal Performance (1 : 1) • 31x Less Space • 27x Lower Power: 45 kWatts vs. 1200 kWatts • 20x Lower Cost: ~$400 K vs. ~$8 M

  12. Workstation Supercomputing Tesla Personal Supercomputer ~5000 Customers

  13. Tesla Cluster Installations 2008 2009

  14. Supercomputing for the Masses: Large Clusters ($10M+), 100s of researchers; Tesla Preconfigured Clusters ($50K-$1M), 100,000s of researchers; Tesla Personal Supercomputer (< $5K), Millions of researchers

  15. CUDA Parallel Computing Architecture. GPU Computing Applications are written in C, C++, Fortran, OpenCL™, DirectX Compute, Java, or Python and run on an NVIDIA GPU with the CUDA Parallel Computing Architecture. OpenCL is a trademark of Apple Inc., used under license to the Khronos Group Inc.

  16. CUDA: Widely Adopted Parallel Programming Model • 120 Million CUDA GPUs • 60,000+ Active Developers • 1000+ Research Papers • 200+ universities teaching CUDA

  17. CUDA Ecosystem. Over 200 Universities Teaching CUDA: IIT Delhi, Tsinghua, Dortmund, ETH Zurich, Moscow, NTU, …, UIUC, MIT, Harvard, Berkeley, Cambridge, Oxford, … Compilers: PGI Fortran, CAPS HMPP, MCUDA, MPI, NOAA Fortran2C, OpenMP. Languages: C, C++, DirectX, Fortran, Java, OpenCL, Python. Applications: Oil & Gas, Finance, Medical, Biophysics, Numerics, DSP, EDA, Imaging, CFD. Libraries: FFT, BLAS, LAPACK, Image processing, Video processing, Signal processing, Vision. OEMs. Consultants: ANEO, GPU Tech.

  18. Released Applications

  19. More Information: http://www.nvidia.com/tesla (Products; Vertical Solutions; CUDA GPU Programming Training) • GPU Developer Conference, Sept 30 – Oct 2, 2009 • San Jose, CA • http://www.nvidia.com/gtc

  20. Programming the GPU

  21. Compiling C for CUDA Applications. The key kernels of a C application are modified into parallel CUDA code (C CUDA) and compiled by NVCC (Open64) into CUDA object files; the rest of the C application goes through the CPU compiler into CPU object files; the linker combines both into a CPU-GPU executable. Application source shown on the slide: void serial_function(… ) { ... } void other_function(int ... ) { ... } void saxpy_serial(float ... ) { for(int i = 0; i<n; ++i) y[i] = a*x[i] + y[i]; } void main( ) { float x; saxpy_serial(..); ... }

  22. C for CUDA: C with a few keywords. Side-by-side comparison: Standard C Code vs. Parallel C Code

  23. CUDA Programming Effort / Performance Source : MIT CUDA Course

  24. Computed Tomography (CT). Medical (Source: Batenburg, Sijbers, et al); Science (Source: Ufimtsev, Martinez); Manufacturing (Source: Tolke, Krafczyk); Finance (Source: CUDA SDK, NAG)

  25. FFT Performance: CPU vs GPU. GPU: cuFFT 2.3 on NVIDIA Tesla C1060 GPU. CPU: MKL 10.1r1 on Quad-Core Intel Core i7 (Nehalem), 3.2GHz

  26. BLAS Performance: CPU vs GPU. GPU: CUBLAS (CUDA 2.2) on Tesla C1060. CPU: MKL 10.0.3 on Intel Core2 Extreme, 3.00GHz

  27. Heterogeneous Computing Domains. GPU (Parallel Computing): Graphics; Highly Parallel Computation; Data Intensive Applications. CPU (Sequential Computing): Control and Communication; Productivity Applications. Domains: Oil & Gas, Finance, Medical, Biophysics, Numerics, Audio, Video, Imaging

  28. 5000+ Customers / ISVs