CUDA Library and Demo
Yafeng Yin, Lei Zhou, Hong Man
07/21/2010
Outline
• Basic CUDA computation libraries
  • GPULib, CUBLAS, CUFFT
• Advanced CUDA computation libraries
  • CULA/MAGMA, VSIPL
• CUDA FIR demo (UMD)
• Discussion and future work
Basic lib - GPULib
• GPULib provides a library of mathematical functions
  • Addition, subtraction, multiplication, and division
  • Unary functions, including sin(), cos(), gamma(), and exp()
  • Interpolation, array reshaping, array slicing, and reduction operations
Basic lib - CUBLAS
• BLAS: Basic Linear Algebra Subprograms
• CUBLAS provides a set of functions for basic vector and matrix operations, such as copy, swap, dot product, and Euclidean norm
• Real data
  • Level 1 (vector-vector, O(N))
  • Level 2 (matrix-vector, O(N^2))
  • Level 3 (matrix-matrix, O(N^3))
• Complex data
  • Level 1
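The three BLAS levels above differ mainly in how much work they do per element. The sketch below is a plain C++ CPU reference, not CUBLAS code: it shows one naive operation per level so the O(N), O(N^2), and O(N^3) loop structure is visible. The function names are borrowed loosely from BLAS naming for illustration, and the matrices are stored row-major for readability, whereas CUBLAS itself follows the column-major (Fortran) convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Level 1, vector-vector, O(N): dot product.
float dot(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float acc = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) acc += x[i] * y[i];
    return acc;
}

// Level 1, O(N): Euclidean norm, built naively on the dot product
// (no scaling against overflow, unlike a production BLAS).
float nrm2(const std::vector<float>& x) {
    return std::sqrt(dot(x, x));
}

// Level 2, matrix-vector, O(N^2): y = A * x for an n x n row-major A.
std::vector<float> gemv(const std::vector<float>& A, const std::vector<float>& x) {
    const std::size_t n = x.size();
    assert(A.size() == n * n);
    std::vector<float> y(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            y[i] += A[i * n + j] * x[j];
    return y;
}

// Level 3, matrix-matrix, O(N^3): C = A * B for n x n row-major matrices.
std::vector<float> gemm(const std::vector<float>& A, const std::vector<float>& B,
                        std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
    return C;
}
```

Level 3 routines do O(N^3) arithmetic on O(N^2) data, which is why matrix-matrix products usually benefit most from being moved to the GPU.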
Basic lib - CUFFT
• CUFFT is the CUDA FFT library
• Provides a simple interface for computing parallel FFTs on an NVIDIA GPU
• Lets users exploit the floating-point power and parallelism of the GPU without having to develop a GPU-based FFT implementation
• cufftPlan1d(), cufftPlan2d(), and cufftPlan3d() create a 1D, 2D, or 3D FFT plan configuration for a specified signal size
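The plan idea above (configure once for a signal size, then execute) can be sketched on the CPU. This is not the CUFFT API: `Plan1D`, `make_plan_1d`, and `exec_forward` are hypothetical names, and the transform is a naive O(N^2) DFT rather than a fast algorithm. It only illustrates why size-dependent setup (here, the twiddle factors) is separated from execution, and it could serve as a small reference for checking an unnormalized forward transform.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical plan object: everything that depends only on the signal size.
struct Plan1D {
    std::size_t n;
    std::vector<cfloat> twiddle;  // twiddle[k] = exp(-2*pi*i*k/n)
};

// Build the plan once for a given size.
Plan1D make_plan_1d(std::size_t n) {
    assert(n > 0);
    const double pi = std::acos(-1.0);
    Plan1D plan{n, std::vector<cfloat>(n)};
    for (std::size_t k = 0; k < n; ++k) {
        const double angle = -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        plan.twiddle[k] = cfloat(static_cast<float>(std::cos(angle)),
                                 static_cast<float>(std::sin(angle)));
    }
    return plan;
}

// Execute an unnormalized forward DFT using the precomputed twiddles.
// Naive O(N^2) reference; a real FFT computes the same result in O(N log N).
std::vector<cfloat> exec_forward(const Plan1D& plan, const std::vector<cfloat>& in) {
    assert(in.size() == plan.n);
    std::vector<cfloat> out(plan.n);
    for (std::size_t k = 0; k < plan.n; ++k) {
        cfloat acc(0.0f, 0.0f);
        for (std::size_t j = 0; j < plan.n; ++j)
            acc += in[j] * plan.twiddle[(j * k) % plan.n];
        out[k] = acc;
    }
    return out;
}
```

A plan built with `make_plan_1d(4)` can be reused for any number of length-4 signals, which mirrors the reason for creating a plan before running transforms.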
Advanced lib - CULA and MAGMA
• CULA: GPU-accelerated linear algebra
  • Provides LAPACK (Linear Algebra PACKage) functions on CUDA GPUs
• MAGMA: Matrix Algebra on GPU and Multicore Architectures
  • Aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures and "Multicore+GPU" systems
Advanced lib - CULA functions
• Linear equation routines
  • Solve a general system of linear equations AX = B
• Orthogonal factorizations
  • LQ and RQ factorization
• Least squares routines
• Symmetric and nonsymmetric eigenvalue routines
• Singular value decomposition (SVD) routines
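To make the first item above concrete, here is a minimal CPU sketch of solving a general dense system by Gaussian elimination with partial pivoting, the approach LAPACK-style general solvers are built on. It is not CULA code: `solve_linear` is a hypothetical name, and it handles a single right-hand-side vector b rather than a full matrix B.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Solves A x = b for a dense n x n row-major A (hypothetical helper).
// A and b are taken by value because elimination overwrites them.
std::vector<double> solve_linear(std::vector<double> A, std::vector<double> b) {
    const std::size_t n = b.size();
    assert(A.size() == n * n);

    for (std::size_t k = 0; k < n; ++k) {
        // Partial pivoting: pick the row with the largest entry in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        assert(A[p * n + k] != 0.0);  // a zero pivot means A is singular
        if (p != k) {
            for (std::size_t j = 0; j < n; ++j) std::swap(A[k * n + j], A[p * n + j]);
            std::swap(b[k], b[p]);
        }
        // Eliminate column k from the rows below the pivot.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = A[i * n + k] / A[k * n + k];
            for (std::size_t j = k; j < n; ++j) A[i * n + j] -= f * A[k * n + j];
            b[i] -= f * b[k];
        }
    }

    // Back substitution on the resulting upper-triangular system.
    std::vector<double> x(n, 0.0);
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= A[i * n + j] * x[j];
        x[i] = s / A[i * n + i];
    }
    return x;
}
```

For example, with A = [[2, 1], [1, 3]] and b = [3, 5], the solution is x = [0.8, 1.4].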
Advanced lib - MAGMA
• LAPACK on CUDA GPUs
  • LU, QR, and Cholesky factorizations in both real and complex arithmetic (single and double)
  • Linear solvers based on LU, QR, and Cholesky in real arithmetic (single and double)
  • Mixed-precision iterative refinement solvers based on LU, QR, and Cholesky in real arithmetic
  • Reduction to upper Hessenberg form in real arithmetic (single and double)
  • MAGMA BLAS in real arithmetic (single and double)
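Mixed-precision iterative refinement, mentioned above, does the expensive factorization in single precision and then recovers double-precision accuracy with cheap residual corrections. The following is a CPU sketch of the LU-based variant under simplifying assumptions: the helper names are hypothetical, the number of passes is fixed instead of being driven by a convergence test, and none of this is MAGMA code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// In-place LU factorization with partial pivoting, single precision.
// piv[k] records the row swapped with row k at step k.
void lu_factor(std::vector<float>& A, std::vector<std::size_t>& piv, std::size_t n) {
    piv.assign(n, 0);
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        assert(A[p * n + k] != 0.0f);  // singular to single precision
        piv[k] = p;
        if (p != k)
            for (std::size_t j = 0; j < n; ++j) std::swap(A[k * n + j], A[p * n + j]);
        for (std::size_t i = k + 1; i < n; ++i) {
            A[i * n + k] /= A[k * n + k];  // multiplier, stored in the L part
            for (std::size_t j = k + 1; j < n; ++j)
                A[i * n + j] -= A[i * n + k] * A[k * n + j];
        }
    }
}

// Solves with the stored factors: row swaps, then L (unit diagonal), then U.
std::vector<float> lu_solve(const std::vector<float>& LU,
                            const std::vector<std::size_t>& piv,
                            std::vector<float> b) {
    const std::size_t n = b.size();
    for (std::size_t k = 0; k < n; ++k) std::swap(b[k], b[piv[k]]);
    for (std::size_t i = 1; i < n; ++i)
        for (std::size_t j = 0; j < i; ++j) b[i] -= LU[i * n + j] * b[j];
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = i + 1; j < n; ++j) b[i] -= LU[i * n + j] * b[j];
        b[i] /= LU[i * n + i];
    }
    return b;
}

// Factor once in float, then refine the double-precision solution.
// With x = 0, the first pass is the plain single-precision solve.
std::vector<double> solve_mixed(const std::vector<double>& A,
                                const std::vector<double>& b, int passes = 4) {
    const std::size_t n = b.size();
    assert(A.size() == n * n);
    std::vector<float> LU(A.begin(), A.end());  // rounded copy of A
    std::vector<std::size_t> piv;
    lu_factor(LU, piv, n);

    std::vector<double> x(n, 0.0);
    for (int it = 0; it < passes; ++it) {
        std::vector<float> r(n);
        for (std::size_t i = 0; i < n; ++i) {
            double s = b[i];  // residual r = b - A x, accumulated in double
            for (std::size_t j = 0; j < n; ++j) s -= A[i * n + j] * x[j];
            r[i] = static_cast<float>(s);
        }
        const std::vector<float> d = lu_solve(LU, piv, r);  // cheap correction
        for (std::size_t i = 0; i < n; ++i) x[i] += d[i];
    }
    return x;
}
```

The design point is that the O(N^3) factorization runs in the faster precision, while each refinement pass costs only O(N^2). This works for reasonably well-conditioned matrices; badly conditioned ones may not converge.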
Advanced lib - VSIPL
• VSIPL: Vector Signal Image Processing Library
• Generalized matrix product
• Fast FIR filtering
• Correlation
• Fast Fourier Transform
• QR decomposition
• Random number generation
• Elementwise arithmetic, logical, and comparison operators; linear algebra procedures
CUDA library summary
• Basic vector or matrix computation
  • GPULib, CUBLAS, CUFFT
  • Vector or matrix addition, subtraction, multiplication, and division; sin(), cos(); sort, dot product
• Libraries that can be used for signal processing
  • CULA/MAGMA, VSIPL
  • LU, QR, and Cholesky factorizations
  • SVD
CUDA Demo (FIR)
• GPU: NVIDIA GeForce 8600 GT
• CPU: Intel Duo CPU, 2.33 GHz
• Software: Visual Studio 2005
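The slides do not show the demo's source, so the following is only a CPU reference for what a direct-form FIR filter computes, y[n] = sum over k of h[k] * x[n-k], with samples before the start of the signal treated as zero. The function name `fir` and the zero-padding choice are assumptions made for illustration. Each output sample depends only on the inputs, so a CUDA port would typically assign one GPU thread per value of n in the outer loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct-form FIR filter: y[n] = sum_{k=0}^{M-1} h[k] * x[n-k],
// where x[m] is taken as 0 for m < 0 and M = h.size().
std::vector<float> fir(const std::vector<float>& x, const std::vector<float>& h) {
    assert(!h.empty());
    std::vector<float> y(x.size(), 0.0f);
    // Iterations over n are independent of each other, which is what
    // makes this loop a natural candidate for one-thread-per-output.
    for (std::size_t n = 0; n < x.size(); ++n) {
        float acc = 0.0f;
        for (std::size_t k = 0; k < h.size() && k <= n; ++k)
            acc += h[k] * x[n - k];
        y[n] = acc;
    }
    return y;
}
```

A reference like this is also useful for checking GPU output: run both versions on the same input and compare the results within a small tolerance.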
Discussion and future work
• How to connect CUDA to the SSP re-hosting demo
• How to convert the sequentially executed code in the signal processing system into CUDA code
• How to translate the XML code into CUDA code to generate the CUDA input