Comparison of VM Deployment Methods for HPC Education Nicholas Robison University of Washington nrobison@u.washington.edu Thomas Hacker, PhD Purdue University tjhacker@purdue.edu
Overview • Motivation for Work • Introduction to OpenNebula • Overview of Cluster • Description of Testing Methodologies • Discussion of Results • Final Conclusions
Motivation for Work • New IT curriculum is beginning to add both HPC and virtualization experience • Education labs need to be able to rapidly deploy multiple, distinct virtualized environments • Need to support multiple users who may have limited to no experience in configuring cluster environments • Labs are often short on cash and computing resources and need to be able to repurpose equipment for multiple use scenarios
Why OpenNebula? • Open Source alternative to VMware, which many labs may be unable to afford • Actively supported by the community and various corporations • Supports management of VMware, KVM, Xen, and VirtualBox Hypervisors • Easy administration by command line and web application • Low impact installation and infrastructure
Introduction • For OpenNebula two distinct storage options exist, each with benefits and challenges • SSH • Deploy VM image to multiple nodes using local storage • No centralized disk usage • Reduced network traffic • No ability to live migrate VM between nodes • NFS • Copy VM image once/node to NFS share • Centralized disk usage, multiple copies of same image • Ability to live migrate VM between nodes • Potential for increased network traffic
Introduction cont. • OrangeFS* • Not an officially supported configuration • Functions identically to the NFS configuration, using a parallel file system as the share • Faster performance than a single disk • Ability to live migrate • Potentially separate I/O network • File system I/O requests physically separated from cluster network
Research Question Which storage method provides the optimum balance of performance and reliability for an HPC education laboratory?
Hardware Overview • 16 Dell Optiplex Computers • 2.16GHz Core2Duo Processors • 2GB RAM • 160GB 7200 RPM Hard Drive* • 1Gb NIC Note: OrangeFS was configured on a separate cluster of similar machines but with 10 storage nodes and networked with a Force-10 S50 Switch using a RHEL6 OS. * Head node contained an additional 160GB 7200 RPM hard drive for VM storage
Test Bed Setup • Four Primary Metrics • VM Deployment Times • OSU Microbenchmarks multi_latency • Aggregate latency between all active compute nodes • OSU Microbenchmarks multi_bandwidth • Aggregate uni-directional bandwidth between all active compute nodes • Iozone • 8K Record Size • 1.5G File Size • Read/Write/Random Read/Random Write • Test VM configuration • 1 CPU • 768 MB RAM • 8 GB VM Size
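If the Iozone results are wanted as operations per second rather than throughput, the conversion follows from the record size. The sketch below is only an illustration: it assumes throughput is reported in KB/sec, that each operation moves exactly one 8K record, and that 1G means 1024 × 1024 KB. The function names and the sample throughput value are hypothetical and are not taken from the test scripts.

```python
# Convert a throughput figure into I/O operations per second.
# Assumption: throughput is in KB/sec and each operation moves one record.

RECORD_SIZE_KB = 8                      # 8K record size from the test bed setup
FILE_SIZE_KB = int(1.5 * 1024 * 1024)   # 1.5G file size, assuming 1G = 1024 * 1024 KB


def throughput_to_iops(throughput_kb_per_s: float,
                       record_size_kb: int = RECORD_SIZE_KB) -> float:
    """Return operations per second for a throughput given in KB/sec."""
    if record_size_kb <= 0:
        raise ValueError("record size must be positive")
    return throughput_kb_per_s / record_size_kb


def records_per_file(file_size_kb: int = FILE_SIZE_KB,
                     record_size_kb: int = RECORD_SIZE_KB) -> int:
    """Return how many records one full pass over the test file touches."""
    return file_size_kb // record_size_kb


if __name__ == "__main__":
    sample_kb_per_s = 40_000.0  # hypothetical value, not a measured result
    print(throughput_to_iops(sample_kb_per_s))
    print(records_per_file())
```

With these settings, one sequential pass over the 1.5G file touches 196,608 records.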
Testing Methodology • Testing Methodology (identical for SSH and NFS configurations) • Node VMs deployed in batches of 2 • Time from PENDING to RUNNING computed from OpenNebula logs • Bash script executes the OSU and IOzone tests 3 times • Multi_latency • μsec • Multi_bandwidth • MB/sec • Iozone metrics • I/O Operations Per Second (IOPS)
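The PENDING-to-RUNNING measurement can be sketched as follows. This is a minimal illustration, not the script used in the study: the event format (`<ISO timestamp> <VM id> <STATE>`) is a hypothetical simplification, real OpenNebula log lines are laid out differently and would need their own parser, and the sample timestamps are invented.

```python
from datetime import datetime
from statistics import mean
from typing import Dict

# Hypothetical, simplified event format: "<ISO timestamp> <VM id> <STATE>".
# This is NOT the real OpenNebula log layout; the timestamps are made up.
SAMPLE_EVENTS = """\
2012-03-01T10:00:00 vm0 PENDING
2012-03-01T10:00:00 vm1 PENDING
2012-03-01T10:03:10 vm0 RUNNING
2012-03-01T10:03:30 vm1 RUNNING
"""


def deployment_seconds(event_text: str) -> Dict[str, float]:
    """Map each VM id to the seconds elapsed between PENDING and RUNNING."""
    pending: Dict[str, datetime] = {}
    elapsed: Dict[str, float] = {}
    for line in event_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stamp, vm_id, state = line.split()
        when = datetime.fromisoformat(stamp)
        if state == "PENDING":
            pending[vm_id] = when
        elif state == "RUNNING" and vm_id in pending:
            elapsed[vm_id] = (when - pending[vm_id]).total_seconds()
    return elapsed


if __name__ == "__main__":
    times = deployment_seconds(SAMPLE_EVENTS)
    print(times)                 # per-VM deployment time in seconds
    print(mean(times.values()))  # mean deployment time for the batch of 2
```

Averaging per batch keeps the figure comparable as the number of concurrent VMs grows.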
Bandwidth Results [Figures: SSH Bandwidth, NFS Bandwidth]
Latency Results [Figures: SSH Latency, NFS Latency]
Disk IO Results (64KB) [Figures: SSH 64KB, NFS 64KB]
Disk IO Results (32MB) [Figures: SSH 32MB, NFS 32MB]
Conclusions • SSH provided superior deployment characteristics • Uniform disk reads • Minimal differences in network characteristics observed between NFS and SSH • Test cluster too small to fully saturate network • At 10 concurrent VMs, NFS began to limit bandwidth • Performance needs to be balanced with reliability • For VMs that are not disk-heavy, NFS may be the preferred choice due to live migration • NFS storage becomes a single point of failure • The optimum storage configuration depends on your specific needs and requirements
OrangeFS Observations • Direct performance comparisons are not possible between different clusters • 43.2 MB/sec deployment throughput • 2x faster than SSH • 4x faster than NFS • Allows for completely separate I/O and communication networks • Retains the ability to live migrate VMs • Added system administration overhead
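To put the throughput figure in terms of deployment time, a rough back-of-the-envelope calculation may help. The sketch below assumes the full 8 GB test image is transferred, that 1 GB = 1024 MB, and that "2x" and "4x faster" mean two and four times the deployment throughput. None of these assumptions is stated on the slides. Because the OrangeFS cluster differed from the SSH/NFS cluster, the output is indicative only, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic from the figures on the slide.
# Assumptions: the full 8 GB image is transferred, 1 GB = 1024 MB, and
# "Nx faster" means N times the deployment throughput.

ORANGEFS_MB_PER_S = 43.2   # deployment throughput quoted on the slide
IMAGE_SIZE_MB = 8 * 1024   # 8 GB test VM image


def implied_throughput(reference_mb_per_s: float, speedup: float) -> float:
    """Throughput of a method that the reference is `speedup` times faster than."""
    return reference_mb_per_s / speedup


def transfer_seconds(size_mb: float, mb_per_s: float) -> float:
    """Seconds needed to move `size_mb` at a sustained rate of `mb_per_s`."""
    return size_mb / mb_per_s


if __name__ == "__main__":
    for name, speedup in [("OrangeFS", 1), ("SSH", 2), ("NFS", 4)]:
        rate = implied_throughput(ORANGEFS_MB_PER_S, speedup)
        secs = transfer_seconds(IMAGE_SIZE_MB, rate)
        print(f"{name}: {rate:.1f} MB/sec, about {secs:.0f} s per image")
```

Under these assumptions an 8 GB image takes a little over three minutes on OrangeFS, with the implied SSH and NFS rates at 21.6 and 10.8 MB/sec.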
Thank you for your time. Any questions?