By, Sandeep A. Ganage M.Tech (IT) Guided by Dr. R. C. Thool

CUDA (Compute Unified Device Architecture)

Terms
• What is GPGPU?
  • General-purpose computing on a graphics processing unit
  • Using graphics hardware for non-graphics computations
• What is CUDA?
  • Compute Unified Device Architecture
  • A software architecture for managing data-parallel programming

Introduction: What is a GPU?
• It is a processor optimized for 2D/3D graphics, video, visual computing, and display.
• It is a highly parallel, highly multithreaded multiprocessor optimized for visual computing.
• It provides real-time visual interaction with computed objects via graphics, images, and video.
• It serves as both a programmable graphics processor and a scalable parallel computing platform.
• Heterogeneous systems combine a GPU with a CPU.

GPU Architecture

Processing Element
• Processing element = thread processor = ALU

Memory Architecture
• Constant memory
• Texture memory
• Device memory

CPU vs. GPU
• CPU
  • Fast caches
  • Branching adaptability
  • High performance
  • Multicore
• GPU
  • Multiple ALUs
  • Fast onboard memory
  • High throughput on parallel tasks
  • Executes a program on each fragment/vertex
• CPUs are great for task parallelism
• GPUs are great for data parallelism

Computing Capability: GPU 369 GIPS > CPU 177,730 IPS. Why?

CPU vs. GPU
• GPUs contain a much larger number of dedicated ALUs than CPUs.
• GPUs also provide extensive support for the stream processing paradigm, which is related to SIMD (Single Instruction, Multiple Data) processing.
• Each processing unit on a GPU contains local memory that improves data manipulation and reduces fetch time.

CPU vs. GPU
• More transistors devoted to data processing

What is CUDA?
• CUDA is a set of development tools for creating applications that execute on a GPU (Graphics Processing Unit).
• The CUDA compiler uses a variation of C, with future support for C++.
• CUDA was developed by NVIDIA and as such runs only on NVIDIA GPUs of the G8x series and later.
• CUDA was released on February 15, 2007 for PC, and a beta version for Mac OS X followed on August 19, 2008.

Why CUDA?
• CUDA makes it possible to use high-level languages such as C to develop applications that take advantage of the high performance and scalability that GPU architectures offer.
• GPUs allow the creation of a very large number of concurrently executing threads at very low system resource cost.
• CUDA also exposes fast shared memory (16 KB) that can be shared between the threads of a block.
• Full support for integer and bitwise operations.
• Compiled code runs directly on the GPU.
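
To make the shared-memory point concrete, here is a minimal sketch, assuming a single block of at most 64 threads. The names reverse_in_block, reverse_on_gpu and BLOCK_SIZE are hypothetical and chosen only for illustration. The threads of the block stage the data in shared memory, synchronize, and then each thread reads an element that a different thread wrote.

```cuda
#include <cuda_runtime.h>

#define BLOCK_SIZE 64  // illustrative size: 64 ints = 256 bytes of shared memory

// Hypothetical kernel: reverses the first n elements of data_d in place.
// All threads of the block see the same `tile` array.
__global__ void reverse_in_block(int *data_d, int n)
{
    __shared__ int tile[BLOCK_SIZE];
    int t = threadIdx.x;

    if (t < n) {
        tile[t] = data_d[t];          // each thread stages one element
    }
    __syncthreads();                  // wait until every element is staged
    if (t < n) {
        data_d[t] = tile[n - 1 - t];  // read an element written by another thread
    }
}

// Host helper (assumption: 1 <= n <= BLOCK_SIZE). Reverses `data` on the GPU.
cudaError_t reverse_on_gpu(int *data, int n)
{
    if (n < 1 || n > BLOCK_SIZE) {
        return cudaErrorInvalidValue;
    }
    size_t bytes = (size_t)n * sizeof(int);
    int *data_d = nullptr;

    cudaError_t err = cudaMalloc((void **)&data_d, bytes);
    if (err != cudaSuccess) {
        return err;
    }
    err = cudaMemcpy(data_d, data, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        reverse_in_block<<<1, BLOCK_SIZE>>>(data_d, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(data, data_d, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(data_d);
    return err;
}
```

The call to __syncthreads() is the important design point: without it, a thread could read a slot of tile before the owning thread has written it.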

Software Requirements/Tools
• CUDA device driver
• CUDA Software Development Kit
• Emulator
• CUDA Toolkit
• Occupancy calculator
• Visual profiler

How CUDA works
• Allocate space in the GPU's memory for the variables.
• The video card has no I/O devices, so copy the input data from the host computer's memory into the GPU's memory, using the variables allocated in the previous step.
• Specify the code to execute.
• Copy the results back to the host computer's memory.

Initially:
[Figure: Host's Memory (array) | GPU Card's Memory]

Allocate memory in the GPU card
[Figure: Host's Memory (array) | GPU Card's Memory (array_d)]

Copy content from the host's memory to the GPU card's memory
[Figure: Host's Memory (array) | GPU Card's Memory (array_d)]

Execute code on the GPU
[Figure: GPU MPs | Host's Memory (array) | GPU Card's Memory (array_d)]

Copy results back to the host memory
[Figure: Host's Memory (array) | GPU Card's Memory (array_d)]
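
The four steps above can be sketched in host code roughly as follows. This is a minimal sketch, not the code from the original demo: the variable names array and array_d follow the figures, while the kernel double_elements and the helper double_on_gpu are hypothetical and exist only to give the steps something to run. It assumes n is greater than zero.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles each element in place.
__global__ void double_elements(float *array_d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        array_d[i] *= 2.0f;
    }
}

// Runs the four steps on `array` (host memory, n > 0 elements).
// Returns cudaSuccess, or the first CUDA error encountered.
cudaError_t double_on_gpu(float *array, int n)
{
    size_t bytes = (size_t)n * sizeof(float);
    float *array_d = nullptr;

    // Step 1: allocate space in the GPU card's memory.
    cudaError_t err = cudaMalloc((void **)&array_d, bytes);
    if (err != cudaSuccess) {
        return err;
    }

    // Step 2: copy the input from the host's memory to the GPU card's memory.
    err = cudaMemcpy(array_d, array, bytes, cudaMemcpyHostToDevice);

    // Step 3: execute code on the GPU, one thread per array element.
    if (err == cudaSuccess) {
        int threads = 256;                          // threads per block (arbitrary choice)
        int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
        double_elements<<<blocks, threads>>>(array_d, n);
        err = cudaGetLastError();                   // catches launch errors
    }

    // Step 4: copy the results back; cudaMemcpy waits for the kernel to finish.
    if (err == cudaSuccess) {
        err = cudaMemcpy(array, array_d, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(array_d);
    return err;
}
```

A caller would pass an ordinary host array, for example a float array of four values, and check that the return value equals cudaSuccess before reading the doubled results.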

The Kernel
• The code that will execute on the stream processors of the GPU card has to be written separately.
• That code, called the kernel, is downloaded to the card and executed simultaneously, in lock-step fashion, on several (all?) of its stream processors.
• How does each instance of the kernel know which piece of data it is working on?

In the GPU:
[Figure: processing elements grouped into Block 0 and Block 1, mapped to array elements]

In the GPU:
[Figure: Threads 0 to 3 of Block 0 and Threads 0 to 3 of Block 1 (processing elements), each mapped to one array element]
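
One common answer to the question of how a kernel instance finds its data is that each thread combines the built-in variables blockIdx, blockDim and threadIdx into a global index. The sketch below mirrors the figure's layout of two blocks with four threads each. The names tag_elements and tag_on_gpu are hypothetical, and the value written (block number times 10 plus thread number) is just a device to make the mapping visible.

```cuda
#include <cuda_runtime.h>

// Every thread runs this same code but sees different values of
// blockIdx.x and threadIdx.x.
__global__ void tag_elements(int *array_d, int n)
{
    // Global index of this thread, which is also the array element it owns.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Encode "block, thread" as block * 10 + thread.
        array_d[i] = blockIdx.x * 10 + threadIdx.x;
    }
}

// Launches `blocks` blocks of `threads` threads and copies the tags into
// `out`, which must hold blocks * threads ints (assumed to be > 0).
cudaError_t tag_on_gpu(int *out, int blocks, int threads)
{
    int n = blocks * threads;
    size_t bytes = (size_t)n * sizeof(int);
    int *array_d = nullptr;

    cudaError_t err = cudaMalloc((void **)&array_d, bytes);
    if (err != cudaSuccess) {
        return err;
    }
    tag_elements<<<blocks, threads>>>(array_d, n);
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(out, array_d, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(array_d);
    return err;
}
```

With 2 blocks of 4 threads, out holds 0 1 2 3 10 11 12 13: elements 0 to 3 were written by Block 0 and elements 4 to 7 by Block 1.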

To compile:
• nvcc simple.c simple.cu -o simple
• The compiler generates the code for both the host and the GPU.
• Demo on cuda.littlefe.net …

Testing - Matrices
• Test the multiplication of two matrices.
• Create two matrices with random floating-point values.
• We tested with matrices of various dimensions…
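
The test program itself is not reproduced in the slides, so the following is only a sketch of what a straightforward matrix multiplication on the GPU might look like. It assumes square n x n matrices stored in row-major order and one thread per output element; the names matmul_kernel, matmul_on_gpu and matmul_cpu are hypothetical, and the 16 x 16 block shape is an arbitrary choice. The CPU version is included only as a reference for checking GPU results.

```cuda
#include <cuda_runtime.h>

// One thread computes one element of C = A * B.
__global__ void matmul_kernel(const float *a, const float *b, float *c, int n)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k) {
            sum += a[row * n + k] * b[k * n + col];
        }
        c[row * n + col] = sum;
    }
}

// Host helper: multiplies two n x n host matrices (n > 0) on the GPU.
cudaError_t matmul_on_gpu(const float *a, const float *b, float *c, int n)
{
    size_t bytes = (size_t)n * n * sizeof(float);
    float *a_d = nullptr, *b_d = nullptr, *c_d = nullptr;

    cudaError_t err = cudaMalloc((void **)&a_d, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&b_d, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&c_d, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(a_d, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(b_d, b, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        dim3 threads(16, 16);                             // 256 threads per block
        dim3 blocks((n + 15) / 16, (n + 15) / 16);        // enough blocks to cover n x n
        matmul_kernel<<<blocks, threads>>>(a_d, b_d, c_d, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaMemcpy(c, c_d, bytes, cudaMemcpyDeviceToHost);

    // cudaFree on a null pointer performs no operation, so this is safe
    // even if an earlier allocation failed.
    cudaFree(a_d);
    cudaFree(b_d);
    cudaFree(c_d);
    return err;
}

// CPU reference with the same loop order, for comparing results.
void matmul_cpu(const float *a, const float *b, float *c, int n)
{
    for (int row = 0; row < n; ++row) {
        for (int col = 0; col < n; ++col) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k) {
                sum += a[row * n + k] * b[k * n + col];
            }
            c[row * n + col] = sum;
        }
    }
}
```

For a test in the spirit of the slide, the two inputs could be filled with random floating-point values and the GPU output compared against matmul_cpu within a small tolerance, since the GPU and CPU may round intermediate sums differently.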

Results:

Applications of CUDA
• Electrodynamics and electromagnetics
• Nuclear physics, molecular dynamics, and computational chemistry
• Video, imaging, and vision applications
• Game industry
• MATLAB, LabVIEW, Mathematica, R
• Weather and ocean modeling
• Financial computing and options pricing
• Medical imaging, CT, MRI
• Government and defence
• Geophysics

Conclusions
• GPGPU extends the GPU so that it can execute computing operations that were traditionally performed on the CPU.
• With the Runtime API and Driver API provided by NVIDIA's device drivers and the NVCC compiler, massively parallel computations can run as much as a hundred times faster than on a CPU.
• CUDA is the best platform for SIMD-style architectures.
• Many scientific applications, simulators, high-end computations, and research-oriented simulations can be performed easily on a GPU with CUDA.

Courtesy: Supercomputing 2008 Education Program

Thank you. Queries?
