Introduction to Parallel Computing: Intel Math Kernel Library. Huan-Ting Yen, Department of Mathematics, National Taiwan University, 2011/07/22.
Parallel Computing
What is parallel computing?
• Traditionally, software has been written for serial computation.
• In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.
Resource
• The compute resource can be:
  • A single computer with multiple processors;
  • An arbitrary number of computers connected by a network;
  • A combination of both.
[Figure: four cores (Core 1 to Core 4), each running one thread (thread 1 to thread 4)]
[Figure: four cores (core1 to core4), each running several threads]
Why use parallel computing?
• The primary reasons for using parallel computing:
  • Save time (wall-clock time)
  • Solve larger problems
  • Provide concurrency (do many things at the same time)
• Other reasons might include:
  • Taking advantage of non-local resources
  • Cost savings
  • Overcoming memory constraints
Amdahl's Law
• The speedup of a parallel program is limited by the amount of serial work.
Flynn's Taxonomy
• Classification for parallel computers and programs
Intel Math Kernel Library
Overview • The Intel® Math Kernel Library (Intel® MKL) provides Fortran routines and functions that perform a wide variety of operations on vectors and matrices including sparse matrices. The library also includes fast Fourier transform (FFT) functions, as well as vector mathematical and vector statistical functions with Fortran and C interfaces. • The versions of Intel MKL intended for Windows* and Linux* operating systems also include ScaLAPACK software and Cluster FFT software for solving respective computational problems on distributed-memory parallel computers. Intel MKL Quickstart
Intel MKL: Intel Math Kernel Library
• Functionality
  • BLAS and Sparse BLAS Routines
  • LAPACK Routines: Linear Equations
  • LAPACK Routines: Eigenvalue Problems
  • ScaLAPACK
  • Sparse Solver Routines
  • Fast Fourier Transforms
  • Cluster Fast Fourier Transforms
System Requirements (Hardware)
• Hardware:
  • Intel® Core™ processor family
  • Intel® Xeon® processor family
  • Intel® Pentium® 4 processor family
  • Intel® Pentium® III processor
  • Intel® Pentium® processor (300 MHz or faster)
  • Intel® Celeron® processor
  • AMD Athlon* and Opteron* processors
• How do you find information about the CPUs?
  • $ cat /proc/cpuinfo
System Requirements (Software)
• Supported operating systems:
  • Red Hat* Enterprise Linux* 3, 4, 5
  • Red Hat* Fedora* 9
  • Debian* GNU/Linux 4.0
  • Ubuntu* 8.04
• How do you find information about the operating system?
  • $ cat /etc/*release
• Supported C/C++ and Fortran compilers:
  • Intel® Fortran Compiler 10.1 for Linux*
  • Intel® C++ Compiler 10.1 for Linux*
  • GNU Compiler Collection (gcc, g77, gfortran 4.2.0)
Installing Intel MKL on a Linux* System
• Tools & Downloads
• http://software.intel.com/en-us/ (search for "intel software")
• Download and unpack the package:
  • user@host:~/software$ wget "URL"
  • user@host:~/software$ ll
  • $ tar -zxvf l_mkl_p_10.2.x.yyy.tar.gz
• Run the installer:
  • $ cd l_mkl_p_10.2.x.yyy
  • $ ./install.sh
Some Examples
Example
• Brief examples of:
  • BLAS Level 1 Routines (vector-vector operations)
  • BLAS Level 2 Routines (matrix-vector operations)
  • BLAS Level 3 Routines (matrix-matrix operations)
  • Compute the LU factorization of a matrix (LAPACK)
  • Solve a linear system (LAPACK)
  • Solve an eigensystem (LAPACK)
  • Fast Fourier Transforms
Ex1. The complex dot product

#include <stdio.h>
#include "mkl_blas.h"

#define N 5

int main()
{
    int n = N, incx = 1, incy = 1, i;
    /* MKL_Complex16 is MKL's double-precision complex type (fields real, imag) */
    MKL_Complex16 x[N], y[N], res;

    for (i = 0; i < n; i++) {
        x[i].real = (double)i;
        x[i].imag = (double)i * 2.0;
        y[i].real = (double)(n - i);
        y[i].imag = (double)i * 2.0;
    }

    /* res = sum over i of conj(x[i]) * y[i] */
    zdotc(&res, &n, x, &incx, y, &incy);
    printf("The complex dot product is: (%6.2f, %6.2f)\n", res.real, res.imag);
    return 0;
}
?dotc
• Computes the dot product of a conjugated vector with another vector.
• Description: the routine is declared in
  • Fortran 77: mkl_blas.fi
  • Fortran 95: blas.f90
  • C: mkl_blas.h
• Input parameters: zdotc(&res, &n, x, &incx, y, &incy)
  • n: the number of elements in the vectors x and y.
  • incx: specifies the increment for the elements of x.
  • incy: specifies the increment for the elements of y.
• Output parameters: zdotc(&res, &n, x, &incx, y, &incy)
  • res: the final result, the sum of conj(x[i]) * y[i].
Makefile (Sequential)

Test: blas_c
CC = icc
MKL_HOME = /home/opt/intel/mkl/10.2.2.025
MKL_INCLUDE = $(MKL_HOME)/include
MKL_PATH = $(MKL_HOME)/lib/em64t
EXE = blas_c.exe
# The recipe line below must start with a tab character.
blas_c:
	$(CC) -o $(EXE) blas_c.c -I$(MKL_INCLUDE) -L$(MKL_PATH) -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread
Makefile (Parallel)

Test = blas_c
CC = icc
MKL_HOME = /home/opt/intel/mkl/10.2.2.025
MKL_INCLUDE = $(MKL_HOME)/include
MKL_PATH = $(MKL_HOME)/lib/em64t
EXE = blas_c.exe
# The recipe line below must start with a tab character.
blas_c:
	$(CC) -o $(EXE) blas_c.c -I$(MKL_INCLUDE) -L$(MKL_PATH) -Wl,--start-group -lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -Wl,--end-group -liomp5 -lpthread
BLAS Routines
• Routine naming conventions
• BLAS routine names have the following structure: <character><name><mode>()
• The <character> field indicates the data type:
    s   real, single precision
    c   complex, single precision
    d   real, double precision
    z   complex, double precision
• In BLAS Level 1, the <mode> field, when present, gives additional details of the operation:
    c   conjugated vector
    u   unconjugated vector
    g   Givens rotation
• In BLAS Level 2 and 3, the <name> field indicates the matrix type:
    ge  general matrix
    gb  general band matrix
    sy  symmetric matrix
    sb  symmetric band matrix
    he  Hermitian matrix
    hb  Hermitian band matrix
    tr  triangular matrix
    tb  triangular band matrix
BLAS Level 1 Routines
Ex2. Matrix-vector product

#include <stdlib.h>
#include "mkl_blas.h"

int main()
{
    int m, n, incx, incy, lda, idxi, idxj;
    double alpha, beta, *x, *y, *A;
    char trans;

    m = 3; n = 3;
    incx = 1; incy = 1;
    lda = m;
    alpha = 1.0; beta = 1.0;
    trans = 'n';

    x = (double*)malloc(sizeof(double)*n);      /* x has n elements when trans = 'n' */
    y = (double*)malloc(sizeof(double)*m);      /* y has m elements when trans = 'n' */
    A = (double*)malloc(sizeof(double)*lda*n);  /* column-major: lda rows by n columns */

    for (idxi = 0; idxi < n; idxi++)
        x[idxi] = 1.0;
    for (idxi = 0; idxi < m; idxi++)
        y[idxi] = 1.0;

    /* Element A(i,j) is stored at A[i + j*lda] */
    for (idxi = 0; idxi < m; idxi++)
        for (idxj = 0; idxj < n; idxj++)
            A[idxi + idxj*lda] = (double)(idxi+1) + idxj;

    /* y := alpha*A*x + beta*y */
    dgemv(&trans, &m, &n, &alpha, A, &lda, x, &incx, &beta, y, &incy);

    free(x); free(y); free(A);
    return 0;
}
?gemv
• Computes a matrix-vector product using a general matrix.
• Description: the routine is declared in
  • Fortran 77: mkl_blas.fi
  • Fortran 95: blas.f90
  • C: mkl_blas.h
• Input parameters: dgemv(&trans, &m, &n, &alpha, A, &lda, x, &incx, &beta, y, &incy)
  • trans: selects the operation:
      if trans = 'N' or 'n', then $y := \alpha A x + \beta y$
      if trans = 'T' or 't', then $y := \alpha A^{T} x + \beta y$
      if trans = 'C' or 'c', then $y := \alpha A^{H} x + \beta y$
  • m: the number of rows of the matrix A.
  • n: the number of columns of the matrix A.
  • lda: the first (leading) dimension of A; lda must be at least max(1, m).
  • incx: specifies the increment for the elements of x.
  • incy: specifies the increment for the elements of y.
• Output parameters
  • y: updated vector y.
Ex2. Result
[Figure: output of the Ex2 program]
BLAS Level 2 Routines
Ex3-1. Matrix-matrix product

#include "mkl_blas.h"

int main()
{
    int m, n, k, lda, ldb, ldc, idxi, idxj;
    double alpha, beta, *A, *B, *C;
    char transa, transb;

    m = 3; n = 3; k = 3;
    lda = m; ldb = k; ldc = m;
    alpha = 1.0; beta = 1.0;
    transa = 'n'; transb = 'n';