MATLAB*P

MATLAB*P. Alan Edelman, MIT Lab for Computer Science, March 30, 2001. Joint work with Parry Husbands, C. Isbell, S. Kubo, T. Wen. Already a MATLAB fan? Then no need to justify why a parallel MATLAB would be cool! Especially one that allows software reuse.

Presentation Transcript


  1. MATLAB*P
     Alan Edelman, MIT Lab for Computer Science
     March 30, 2001
     Joint work with Parry Husbands, C. Isbell, S. Kubo, T. Wen

  2. Already a MATLAB fan?
     • Then no need to justify why a parallel MATLAB would be cool!
     • Especially one that allows software reuse.
     • Especially one for large matrices!
     • Research topic: how can we do this?
     Not a MATLAB fan?
     • Probably you’ve never used MATLAB!

  3. Eigenvalue Records Made Easy
       >> a=randn(12000*p);
       >> tic;e=eig(a);toc
     “68 minutes”, 64 processors on T3E

  4. MATLAB*P in action
       >> a=randn(512,512*p); a2=ones(512*p,512); m=sprand(10000,1000*p,0.01);
       >> whose
       Your variables are:
         Name   Size            Bytes     Class
         a      512 x 512p      1048576   ddense array
         a2     512p x 512      1048576   ddense array
         m      10000 x 1000p   810176    dsparse array
       Grand total is 624560 elements using 2907328 bytes
       >> b=inv(a); c=a*b; c(1:3,1:3)
       ans =
           1.0000    0.0000   -0.0000
          -0.0000    1.0000    0.0000
           0.0000   -0.0000    1.0000
       >> e=eig(a);plot(e,'*');axis([-30 30 -30 30]);axis('square')
       >> [u,s,v]=svds(m,5);s'
       ans =
           7.7153    7.7342    7.7447    7.7831   16.9842
       >> id=eye(1000*p);x=cumsum(id,1);y=cumsum(x,1);
       >> imagesc(y+y')
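     A small serial check, in plain MATLAB with no MATLAB*P server involved, of two details in the session above. The size n = 4 is an assumption chosen for illustration; on MATLAB*P the same lines would presumably be written with 1000*p as on the slide.

```matlab
% Serial sketch: n is an illustrative size, not a value from the slides.
n = 4;
id = eye(n);
x = cumsum(id, 1);     % running sum down each column: lower-triangular ones
y = cumsum(x, 1);      % second running sum: y(i,j) = i-j+1 for i >= j, else 0
s = y + y';            % the symmetric matrix that the slide passes to imagesc

% Bytes per element implied by the whose listing for the 512 x 512p array a.
bytesPerElement = 1048576 / (512*512);
disp(y)
disp(bytesPerElement)
```

     Four bytes per element would be consistent with single-precision storage, which slide 7 lists as an option; the slides do not state which precision was used here.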

  5. MATLAB*P in action: Parallelism through Polymorphism!!! (session as on slide 4)

  6. The Product
     [Diagram labels: Parallel Problems Server, Matlab, MATLAB*P Classes & Methods]
     • Parallel Problems Server
       • Number crunching
     • MATLAB*P
       • Interactive benefits

  7. The Parallel Problems Server
     • Standalone program encapsulating data and algorithms (NetSolve (UTK), MatPar (NASA JPL), PSI (UT Austin))
     • Dense and sparse matrices indexed by ids
     • Row and column distributed in single or double precision
     • Functions indexed by strings
     • Written in MPI (portability & functionality)
     • Handles extremely large datasets
     • Client communication protocol
     • Extensible via package system (for functionality and optimisation)
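     The two lookup tables named above (matrices indexed by ids, functions indexed by strings) can be pictured with a toy, single-process sketch. Everything here is hypothetical: the names store, funcs and req, and the use of containers.Map, are illustration only and say nothing about how the real server, written in MPI, is organised.

```matlab
% Toy stand-in for the server's two tables (hypothetical names throughout).
store = containers.Map('KeyType', 'double', 'ValueType', 'any');  % id   -> matrix
funcs = containers.Map('KeyType', 'char',   'ValueType', 'any');  % name -> function handle

funcs('plus')  = @(a, b) a + b;
funcs('times') = @(a, b) a * b;

store(1) = magic(3);
store(2) = eye(3);

% A client request names a function by string and its operands by id;
% the matrix data itself never travels with the request.
req = struct('fn', 'times', 'args', [1 2]);
f = funcs(req.fn);
store(3) = f(store(req.args(1)), store(req.args(2)));

result = store(3);     % magic(3) * eye(3)
disp(result)
```

     The point of the indirection is that the client only needs to hold an id and a size for each matrix, which matches the later slide saying that only matrix ids and sizes are stored in Matlab.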

  8. Extensibility
     [Diagram labels:]
     • Libraries: S3L, ScaLAPACK
     • Computational & Interface Routines: mathfun.cc, s3l.cc, scalapack.cc
     • Packages (dynamically loaded): mathfun.pp, s3l.pp, scalapack.pp
     • Server
     • Matlab
     • Matlab Scripts

  9. MATLAB*P: Delivering the Benefits
     • Uses MATLAB’s classes and objects
       • No source code changes!
     • New classes
       • ddense and dsparse
         • Distributed dense and sparse matrices
         • Only matrix ids and sizes are stored in Matlab
         • Operator overloading gives transparency
       • layout
         • Enables further integration by allowing re-use of Matlab code

  10. Code Reuse example
     • MATLAB’s hilb routine: hilb(n) = n x n Hilbert matrix, h(i,j) = 1/(i+j-1)
     1) We want this to work!
       >> hilb(1000*p);
       >> type hilb
       function H=hilb(n)
       J=1:n;
       J=J(ones(n,1),:);
       I=J';
       E=ones(n,n);
       H=E./(I+J-1);
     Slide annotations on the listing:
     2) This must be parallel:
     3) So that this can be parallel
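     The vectorised body of hilb quoted above can be checked serially against the elementwise definition h(i,j) = 1/(i+j-1). This is a sketch in plain MATLAB with an ordinary integer n (n = 5 is an assumption for illustration); it does not exercise MATLAB*P.

```matlab
% Serial check of the vectorised Hilbert construction quoted on the slide.
n = 5;                      % illustrative size; the slide uses 1000*p
J = 1:n;                    % row vector 1..n
J = J(ones(n,1),:);         % index row 1 n times: every row of J is 1..n
I = J';                     % I(r,c) = r
E = ones(n,n);
H = E ./ (I + J - 1);       % H(r,c) = 1/(r+c-1), no loops

% Reference built directly from the definition with explicit loops.
Href = zeros(n,n);
for r = 1:n
    for c = 1:n
        Href(r,c) = 1/(r + c - 1);
    end
end
maxdiff = max(max(abs(H - Href)));
disp(maxdiff)
```

     Because the routine is written entirely in whole-array operations, every step is something an overloaded class can intercept, which is presumably why it is used as the code reuse example.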

  11. The magic of layout variables
     • p=layout(1)
     • Specify parallelism: randn(100*p,100) or ones(100,100*p)
     • More importantly, allows propagation of parallelism through code:
       >> a=randn(1000*p,1000);
       >> [m,n]=size(a);
       >> m
       1000p
       >> b=ones(3*m,n);
     • b is now a parallel object!
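     How a tag like p can survive arithmetic is easiest to see with a tiny class that overloads multiplication. The class below is a hypothetical stand-in, not MATLAB*P's layout class: the name LayoutSketch, its single property and its char method are assumptions made for illustration. It would need to be saved as LayoutSketch.m to run.

```matlab
classdef LayoutSketch
    % Hypothetical stand-in for a layout-style class (not MATLAB*P source).
    properties
        n      % the integer size carried alongside the "parallel" tag
    end
    methods
        function obj = LayoutSketch(n)
            obj.n = n;
        end
        function r = mtimes(a, b)
            % Called for k*p, p*k and p*p; the result keeps the tag.
            r = LayoutSketch(LayoutSketch.value(a) * LayoutSketch.value(b));
        end
        function s = char(obj)
            % Text form in the style of the slide, e.g. 1000p.
            s = sprintf('%dp', obj.n);
        end
    end
    methods (Static)
        function v = value(x)
            % Unwrap a LayoutSketch, or pass a plain number through.
            if isa(x, 'LayoutSketch')
                v = x.n;
            else
                v = x;
            end
        end
    end
end
```

     With this sketch, p = LayoutSketch(1); m = 1000*p; char(3*m) returns '3000p'. A constructor such as ones would then only have to check whether a size argument carries the tag to decide between a local and a distributed result, which is the propagation the slide describes.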

  12. Nick Higham’s Toolbox
       >> a=clement(100*p)
       a = ddense object: 100-by-100
       >> a2=clement(100);
       >> norm(a-a2,'fro')
       ans = 0
       >> b=fiedler(500*p) % b(i,j)=abs(i-j);
       b = ddense object: 500-by-500
       >> c=inv(b);
       >> imagesc(b);
       >> imagesc(c);
     Nearly 70% of the functions work without modification.

  13. MATLAB*P vs. NetSolve
     • Differing goals:
       • MATLAB*P - transparency, code reuse
       • NetSolve - resource management, load balancing
     [Diagram labels: Server #1, MATLAB, Agent, MATLAB, Server, Server #2]
     MATLAB*P call: x=A\b;
     NetSolve call: x=netsolve('solve',A,b);

  14. Iterative methods
     • Find x=A\b when you don’t have A!
     • Tested MATLAB’s pcg.m and gmres.m. They worked without modification
     • Performed better with some modification (that also improved the serial code):
       • Use dot(a,b) instead of a'*b
       • Sample the residual less frequently
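     A minimal serial sketch of solving without holding A as a matrix: the stock pcg accepts a function handle, so the solver only ever sees the action of the operator. The tridiagonal test operator, the tolerance and the sizes are assumptions chosen for illustration, and this is not the MATLAB*P run from the talk.

```matlab
% Matrix-free solve with pcg: the solver is given afun, never T itself.
n = 50;
e = ones(n,1);
T = spdiags([-e 2*e -e], -1:1, n, n);   % symmetric positive definite test matrix
afun = @(v) T*v;                        % stands in for "A you don't have"
xtrue = ones(n,1);
b = T*xtrue;                            % right-hand side with a known solution
[x, flag, relres, iter] = pcg(afun, b, 1e-10, 100);
err = norm(x - xtrue, inf);

% dot(a,c) and a'*c give the same inner product for real column vectors.
a = (1:4)';
c = [2; 0; -1; 3];
d1 = a' * c;      % 1*2 + 2*0 + 3*(-1) + 4*3 = 11
d2 = dot(a, c);
disp([flag iter err d1 d2])
```

     One plausible reason dot helps on distributed vectors is that it avoids forming the transposed vector as a separate step; the slides do not spell out the reason.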

  15. Parallelism through Polymorphism
     [Diagram labels: Automatic, Polymorphism, Directives, Rewriting; axis label: Programmer Effort]
     • Parallelism delivered through MATLAB classes and operator overloading
     • Extensibility
     • Minimal user-visible changes
     • Support for interactive environments
     • Disadvantage: Can’t infer parallelism from language constructs (e.g. for loops)

  16. Performance Summary
     • Get performance of libraries!
     • Complex operations on large matrices do very well
     • Elementwise code fares poorly
     2 procs., DEC Alpha:
                        Matlab    MATLAB*P
       m=moler(2000)    343s      29s
       chol(m)          52s       14s
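     The ratios implied by the timing table can be worked out directly. This only divides the numbers quoted on the slide; it makes no claim about how the runs were set up.

```matlab
% Timings copied from the slide, in seconds.
t_matlab = [343 52];           % m=moler(2000), chol(m) in Matlab
t_star   = [29 14];            % the same two operations in MATLAB*P
ratio = t_matlab ./ t_star;    % Matlab time divided by MATLAB*P time
fprintf('moler(2000): %.1fx, chol(m): %.1fx\n', ratio(1), ratio(2));
```

     The first ratio (about 11.8) is far larger than the processor count of 2, so most of that gain cannot come from the extra processor alone; the second (about 3.7) is the more modest of the two.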

  17. IRLAB
     • Building term/document and query matrices handled by standalone C programs
     • Everything else is written in MATLAB:
       • Viewing documents
       • Retrieval schemes
       • Evaluation

  18. Support for PDEs
     • Specify domains and equations à la PDE toolbox
     • Find solution quickly in parallel on server
     • Use domain decomposition (KeLP) transparently

  19. New button!

  20. New button!

  21. Parallelism for “Workgroups”
       >> whose
       a   512x512p   ddense array
       b   512px512   ddense array
     [Diagram labels: cumulvs?, MATLAB*P, AVS, VTK, Mathematica, MAPLE?, ...]

  22. Conclusion
     • MATLAB*P provides users with:
       • Convenience
         • Interactivity and familiar syntax
       • Reliability
         • Few changes to existing system
       • Expressiveness
         • MATLAB’s data parallel syntax
       • Compatibility
         • Can easily interface to libraries
         • MATLAB code executed in parallel
         • Access to MATLAB’s environment

  25. MATLAB*P uses
     • Teaching Scientific Computing
     • A PGAPack (Parallel Genetic Algorithms) toolbox
     • Implementing fast cshifts, eoshifts, and sections
     • A Bezier Curve and Surface package
     • A Binary Image Processing package
     • Large Scale Information Retrieval
     • Ocean Modeling
     • Machine Learning

  26. What’s the point?
     • What benefits? (think MATLAB)
       • Ease of use and interactivity
       • Ease of development of applications
       • Lots of functionality available
       • Visualisation
     The benefits of interactive tools can be enjoyed in supercomputer installations without an appreciable loss in performance.
