Purdue RP Highlights
TeraGrid Round Table, May 20, 2010
Preston Smith
Manager, HPC Grid Systems
Rosen Center for Advanced Computing, Purdue University
More Steele Nodes for TG Users
• Purdue Condor resource now in excess of 30,000 cores
• Recent active users:
  • Fixation Tendencies of the H3N2 Influenza Virus
  • N-body simulations: Planets to Cosmology
  • De Novo RNA Structures with Experimental Validation
  • Planet-planet scattering in planetesimal disks
  • Robetta Gateway
New Developments in the Condor Pool
• Running on student Windows labs today, with VMware
• Integrating now: KVM and libvirt on cluster (Steele) nodes
  • Virtual Machine "Universe"
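The KVM/libvirt integration can be smoke-tested straight from Python. Below is a minimal sketch, assuming the libvirt Python bindings are installed on a Steele node; it only confirms that the local KVM hypervisor is reachable before Condor's VM universe is pointed at it, and is not part of Condor itself.

```python
# Minimal sketch: verify that libvirt can reach the local KVM/QEMU hypervisor.
# Assumes the libvirt Python bindings and libvirtd are installed on the node.
import libvirt

conn = libvirt.open("qemu:///system")    # connect to the local hypervisor
info = conn.getInfo()                    # [model, memory MB, CPUs, ...]
print(f"Host: {conn.getHostname()}")
print(f"Model: {info[0]}, memory: {info[1]} MB, CPUs: {info[2]}")

# "kvm" appearing in the capabilities XML indicates hardware virtualization
caps = conn.getCapabilities()
print("KVM available:", "kvm" in caps)

conn.close()
```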
Condor VM Use Cases
• "VMGlide"
  • Using Condor to submit, transfer, and boot Linux VMs as cluster nodes on Windows systems
  • More usable for the end user!
  • Tested and demonstrated with roughly 800 VMs over a weekend
    • All running real user jobs inside the VM container
  • Working with the Condor team to minimize the network impact of transferring hundreds of VM images (see the back-of-the-envelope sketch below)
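To make the network concern concrete, here is a back-of-the-envelope sketch; the image size and link speed are illustrative assumptions, not measured figures from the VMGlide tests.

```python
# Rough estimate of the data volume and link time needed to stage VM images
# for a glide-in run. Image size and link speed are assumptions.
NUM_VMS = 800                 # VMs started over the weekend (from the slide)
IMAGE_SIZE_GB = 4             # assumed size of one Linux VM image
LINK_GBPS = 1.0               # assumed shared uplink to the Windows labs

total_gb = NUM_VMS * IMAGE_SIZE_GB
transfer_hours = (total_gb * 8) / LINK_GBPS / 3600   # GB -> Gb, seconds -> hours

print(f"Total data staged: {total_gb} GB")
print(f"Serial transfer time on a {LINK_GBPS} Gb/s link: {transfer_hours:.1f} hours")
# ~3.2 TB and ~7 hours of link time under these assumptions -- which is why
# caching or reusing images on the execute hosts matters.
```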
Condor VM Use Cases
• User-submitted virtual machines
  • For example: a user has code written in Visual Basic that runs for weeks at a time on their PC
  • Submitting to Windows Condor is an option, but the long runtime, coupled with the inability to checkpoint, limits its utility
• Solution:
  • Submit the entire Windows PC as a VM universe job, which can be suspended, checkpointed, and moved to a new machine until execution completes (a sketch of such a submit description follows below)
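A VM-universe submit description for this case might look roughly like the sketch below. The image name, memory size, and file names are placeholders, and the exact keywords should be checked against the VM universe documentation for the installed Condor version.

```python
# Sketch: generate a Condor VM-universe submit description for a
# user-supplied VMware image. Keywords follow the VM universe documentation
# as we understand it; the image directory and sizes are placeholders.
submit_description = """\
universe      = vm
vm_type       = vmware
vm_memory     = 2048
vmware_dir    = windows_pc_image
vm_networking = false
log           = vb_job.vm.log
queue
"""

with open("vb_job.submit", "w") as f:
    f.write(submit_description)
# Then submit with: condor_submit vb_job.submit
```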
Condor and Power
• In the economic climate of 2010, Purdue, like many institutions, is looking to cut power costs
• The campus Condor grid will help!
• By installing Condor on machines around campus, we will
  • Get useful computation out of the powered-on machines
• And if there's no work to be done?
  • Condor can hibernate the machines and wake them when work is waiting (see the wake-on-LAN sketch below)
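Waking a hibernated machine comes down to sending a wake-on-LAN "magic packet" on the campus network. The sketch below shows that mechanism on its own, independent of Condor's power-management configuration; the MAC address is a placeholder.

```python
# Minimal wake-on-LAN sketch: the magic-packet mechanism used to wake a
# hibernated execute node. The MAC address below is a placeholder.
import socket

def wake_on_lan(mac: str) -> None:
    """Broadcast a magic packet: 6 bytes of 0xFF followed by the MAC 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, ("255.255.255.255", 9))

wake_on_lan("00:11:22:33:44:55")
```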
Cloud Computing: Wispy
• Purdue staff operating an experimental cloud resource
• Built with Nimbus from the University of Chicago
• Current specs:
  • 32 nodes (128 cores)
    • 16 GB RAM per node
    • 4 cores per node
  • Public IP space for VM guests
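For a rough sense of scale, the sketch below works out how many single-core guests the specs above could host. The 2 GB-per-guest figure is an assumption for illustration, not a Wispy allocation policy.

```python
# Back-of-the-envelope guest capacity for Wispy under simple assumptions:
# one single-core VM per core, with an assumed 2 GB of RAM per guest.
NODES = 32
CORES_PER_NODE = 4
RAM_PER_NODE_GB = 16
GUEST_RAM_GB = 2          # assumption, not a Wispy policy

by_cores = NODES * CORES_PER_NODE                       # 128 guests
by_memory = NODES * (RAM_PER_NODE_GB // GUEST_RAM_GB)   # 256 guests
print(f"Max single-core guests: {min(by_cores, by_memory)}")   # core-bound: 128
```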
The Workspace Service
• Slide borrowed from Kate Keahey: http://www.cs.wisc.edu/condor/CondorWeek2010/condor-presentations/keahey-nimbus.pdf
Wispy – Use Cases
• Used in virtual clusters
  • Publications using Purdue's Wispy:
    • "CloudBLAST: Combining MapReduce and Virtualization on Distributed Resources for Bioinformatics Applications," A. Matsunaga, M. Tsugawa, and J. Fortes, eScience 2008
    • "Sky Computing," K. Keahey, A. Matsunaga, M. Tsugawa, and J. Fortes, IEEE Internet Computing, September 2009
• NEES project exploring using Wispy to provision on-demand clusters for quick turnaround of wide parallel jobs
• Working with faculty at Marquette University to use Wispy in a Fall 2010 course to teach cloud computing concepts
• With the OSG team, using Wispy (and Steele) to run VMs for the STAR project
HTPC
• High-Throughput Parallel Computing
• With OSG and Wisconsin, using Steele to submit ensembles of single-node parallel jobs
  • Package jobs with a parallel library (MPI, OpenMP, etc.)
  • Submit to many OSG sites as well as TeraGrid (a whole-node wrapper sketch follows below)
• Who's using it?
  • Chemistry: over 300,000 hours used in January
  • HTPC enabled 9 papers to be written in 10 months!
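The packaging step usually boils down to a small wrapper that claims the whole node and launches the parallel code at full width. A minimal sketch follows; ./my_solver is a placeholder for the packaged application, not a real HTPC code.

```python
# Sketch of the kind of whole-node wrapper an HTPC job might ship with:
# detect the core count on the execute host and run an OpenMP binary at
# full width. "./my_solver" is a placeholder for the packaged application.
import multiprocessing
import os
import subprocess

ncores = multiprocessing.cpu_count()          # all cores of the single node
env = dict(os.environ, OMP_NUM_THREADS=str(ncores))

# For an MPI code, the call would instead be something like
# ["mpirun", "-np", str(ncores), "./my_solver"] using the bundled MPI library.
subprocess.run(["./my_solver"], env=env, check=True)
```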
Storage
• DC-WAN mounted and used at Purdue
  • Working on Lustre LNET routers to reach compute nodes
• Distributed Replication Service (DRS)
  • Sharing spinning disk to DRS today
  • Investigating integration with the Hadoop Distributed File System (HDFS); see the sketch below
  • NEES project investigating using DRS to archive data
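As a rough illustration of the HDFS direction, the sketch below pushes an archive file into HDFS with the standard hadoop fs CLI. The paths are placeholders, and this is only the basic ingest operation such an integration would build on, not the DRS integration itself.

```python
# Sketch: hand an archive file to HDFS using the standard "hadoop fs" CLI.
# Paths are placeholders; assumes a Hadoop client is configured on the host.
import subprocess

local_file = "/scratch/nees/experiment_042.tar"   # placeholder path
hdfs_dir = "/archive/nees/"                       # placeholder path

subprocess.run(["hadoop", "fs", "-put", local_file, hdfs_dir], check=True)
subprocess.run(["hadoop", "fs", "-ls", hdfs_dir], check=True)
```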