Virtual Machines for HPC
Paul Lu, Cam Macdonell
Dept. of Computing Science
The Problems
• Making applications run faster
  • Not discussed today
  • Parallelism is not always the answer
• Making it easier to use different clusters
  • Packaging of applications, scripts, and libraries
  • Dealing with differences in environment
• Making it easier to manage your files
  • Distributed file systems
Making Use of Clusters
• Heterogeneity creates complexity
• How can a scientist make use of all these clusters without becoming a computing scientist?
[Figure: software that differs from cluster to cluster. Labels: GROMACS, BLAST, Library X, FFTW, Python 2.3.5, Python 2.2, Trellis, Globus, Red Hat Linux, Scientific Linux]
Shrink-Wrapped VMs
• Package once
  • OS (e.g., Linux)
  • Libraries
  • Application(s)
• Run many places
  • Busby
  • Glacier
  • Favourite workstation
[Figure: a VM containing GROMACS, Trellis, and Linux, hosted on Linux, Windows, or Mac OS]
HPC using VMs
• Packaged once, run on many x86 clusters
• Using Trellis, data is automatically moved from local to remote, and back
[Figure: four identical VM stacks (GROMACS, Trellis, Linux). Labels: Local, Remote, File Server, Laptop, Glacier, Busby, AICT]
Concluding Remarks
• Small performance hit with VMs
• Much easier to package and use
• Potentially, access to many more compute nodes
There is hope!
• Virtualization!
What is Computing Science?
• “So…you…like…write programs or something?”
• “Can you fix my printer?”
Scientific Computing
• Scientific applications are on the leading edge of computing
• Lots of resources
• Complex interactions
• Huge amounts of data
Fastest Supercomputer
• Fastest supercomputer
  • IBM BlueGene/L at LLNL
• Previously fastest
  • NEC Earth Simulator
• Are computers good at solving problems in natural science?
Computing in Canada
• Canada lacks world-class computing facilities
• We have to be able to aggregate resources from numerous institutions
• The CISS experiments explored aggregating computing resources
  • 4000 CPUs, 19 administrative domains (ADs)
Aggregating is difficult
• Different administrative domains
• Running GROMACS
  • Requires FFTW
  • Doesn’t like new compilers
  • Files must be in certain locations
• And this is just for one application!
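The per-cluster differences listed above (a required library, a usable compiler, files in expected locations) can be checked before a job is submitted. The following is a minimal sketch of such a preflight check, not part of the original talk. The function name `check_environment` and the library name, executable name, and path in the usage section are hypothetical placeholders chosen for illustration.

```python
import os
import shutil
from ctypes.util import find_library


def check_environment(libraries, executables, required_paths):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for lib in libraries:
        # find_library searches the platform's usual shared-library locations.
        if find_library(lib) is None:
            problems.append(f"missing library: {lib}")
    for exe in executables:
        # shutil.which looks the name up on PATH, like the shell would.
        if shutil.which(exe) is None:
            problems.append(f"missing executable: {exe}")
    for path in required_paths:
        if not os.path.exists(path):
            problems.append(f"missing path: {path}")
    return problems


if __name__ == "__main__":
    # Hypothetical requirements for one application (assumed, not from the talk).
    issues = check_environment(
        libraries=["fftw3"],
        executables=["cc"],
        required_paths=["/hypothetical/app/input"],
    )
    for issue in issues:
        print(issue)
```

A check like this has to be written and maintained per application and per cluster, which is the burden the slide describes.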
Virtualization
• Is it appropriate for Scientific Computing?
• Performance has improved
• Pricing has improved (it’s become free)
Virtual Images
• Positives
  • Completely portable
  • Less administration
  • Control entire environment within Virtual Image
  • We can run any application in them
  • We can bundle data control software within them
Virtual Images
• Negatives
  • Large size
    • GBs for virtual disks
  • Performance loss
    • Virtualization is slower than running directly on hardware
VMware on Busby
• GROMACS test run on Busby1
Future Directions
• Resolve performance anomaly
• More accurate timings of phases
• Run other applications
• Get all 4 nodes running concurrently
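One way to get more accurate per-phase timings is to wrap each phase of a run in a timer that accumulates wall-clock time by name. This is a minimal sketch under stated assumptions, not the instrumentation used in the talk. The class `PhaseTimer` and the phase names in the usage section are hypothetical, and the `time.sleep` calls stand in for real work.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock seconds per named phase."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def phase(self, name):
        # perf_counter is a monotonic, high-resolution clock for measuring intervals.
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the time even if the phase raises an exception.
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed


if __name__ == "__main__":
    timer = PhaseTimer()
    # Hypothetical phases of a VM-based run (assumed for illustration).
    with timer.phase("stage_in"):
        time.sleep(0.01)   # placeholder for moving input data to the remote site
    with timer.phase("compute"):
        time.sleep(0.02)   # placeholder for the application run
    with timer.phase("stage_out"):
        time.sleep(0.01)   # placeholder for moving results back
    for name, seconds in timer.totals.items():
        print(f"{name}: {seconds:.3f} s")
```

Running the same phases natively and inside the VM, then comparing the per-phase totals, would show where any slowdown is concentrated.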