Virtualization in Clusters and Grids

Virtualization in Clusters and Grids Dr. Lizhe Wang

Virtualization in Clusters/Grids • On-demand provisioning of computing resources • with the desired OS and software configuration • with “root” privilege • Easy management on the resource provider side • Resource accounting • Startup/shutdown/clone/migration

Topics • Virtualization for a cluster scheduler • Xen Grid Engine • COD: cluster on demand • In-VIGO @ UFL • Virtuoso @ NWU • SODA

Virtual machines in clusters: computing cluster context • The existing cluster scheduler distributes jobs to cluster nodes • Jobs may come from local users or remote users (grid) • Problems: • Jobs have different resource requirements: OS, software packages • Jobs may require QoS guarantees • Security issues

Virtual machines in clusters: solution • Prepare a set of virtual machine templates • Start virtual machines on demand when jobs arrive • The cluster scheduler distributes jobs to virtual machine nodes • No change to the existing cluster scheduler • Programming against the cluster scheduler interface
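
The template-based approach above can be illustrated with a minimal sketch. It is not the Karlsruhe implementation: the `VMTemplate`, `Job`, `pick_template` and `place_job` names are hypothetical, and "starting" a VM is reduced to recording a name so that the matching logic stays visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMTemplate:
    """A prepared VM image (hypothetical structure)."""
    name: str
    os: str
    packages: frozenset


@dataclass(frozen=True)
class Job:
    """A job with its OS and software requirements (hypothetical structure)."""
    job_id: str
    os: str
    packages: frozenset


def pick_template(job, templates):
    """Return the first template with the right OS that covers the job's packages."""
    for template in templates:
        if template.os == job.os and job.packages <= template.packages:
            return template
    return None


def place_job(job, templates, running):
    """Reuse a running VM built from a suitable template, or start one on demand."""
    template = pick_template(job, templates)
    if template is None:
        raise LookupError(f"no template satisfies job {job.job_id}")
    vm = running.get(template.name)
    if vm is None:
        # Assumption: a real system would boot the VM here and register it
        # with the cluster scheduler as an ordinary node.
        vm = f"vm-{template.name}-{len(running) + 1}"
        running[template.name] = vm
    return vm


templates = [
    VMTemplate("sl4-root", "scientific-linux-4", frozenset({"gcc", "root"})),
    VMTemplate("debian-mpi", "debian", frozenset({"gcc", "openmpi"})),
]
running = {}
node = place_job(Job("j1", "debian", frozenset({"openmpi"})), templates, running)
```

Because the VM is handed to the scheduler as a normal node, the scheduler itself needs no modification, which matches the "no change" bullet above.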

Virtual machines in clusters: implementation • With Maui/Torque • At the University of Karlsruhe, Germany • Used for the LCG Grid project • Computing jobs for huge data processing

XGE: Xen & cluster scheduling • A shared compute cluster • Improves the performance of cluster usage • Work from Marburg, Germany • Based on Sun Grid Engine

XGE: Cluster usage

XGE: Cluster scheduling • Parallel job submission • qsub with reservation • qsub without reservation • Backfilling • Problem: • My quota, why backfilling? • I did not get quick response!
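
As a reminder of what backfilling does, here is a minimal textbook-style sketch. It is not the Sun Grid Engine or XGE scheduler: the `QueuedJob` and `backfill` names are hypothetical, and runtimes are assumed to be user-supplied estimates in minutes. A waiting job is started on idle CPUs only if it fits and is expected to finish before the reserved parallel job is due to start.

```python
from typing import NamedTuple


class QueuedJob(NamedTuple):
    name: str
    cpus: int
    runtime: int  # estimated runtime in minutes (assumed to be supplied by the user)


def backfill(queue, idle_cpus, now, reservation_start):
    """Return the names of queued jobs that can run without delaying the reservation."""
    started = []
    for job in queue:
        fits = job.cpus <= idle_cpus
        ends_in_time = now + job.runtime <= reservation_start
        if fits and ends_in_time:
            started.append(job.name)
            idle_cpus -= job.cpus  # these CPUs are now busy
    return started


queue = [QueuedJob("a", 4, 30), QueuedJob("b", 2, 90), QueuedJob("c", 4, 20)]
started = backfill(queue, idle_cpus=8, now=0, reservation_start=60)
```

In this example job `b` is skipped because its estimated 90 minutes would overlap the reservation at minute 60, even though enough CPUs are idle.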

XGE: requirements • Users should be entitled to speedy job execution within their quotas. • Unused CPU time of a user may be consumed freely by other users when needed. • To maximize overall cluster performance, serial jobs should run whenever possible. • Parallel jobs should have waiting times as short as possible. • To minimize response time, parallel jobs should get as many CPUs as needed (definitely more than 32) without increasing the waiting time or reducing the overall cluster performance. • Any modification of the scheduling strategy should be easy to use and transparent for administrators and users to avoid arguments.

XGE: solution

XGE: implementation

Cluster on Demand: goals • Secure isolation of multiple user communities • Custom software environments • Dynamic policy-based resource provisioning • As a Grid site manager • Balancing local vs. global resource use • Controlled provisioning for grid services • Resource reservation

Node Management • As the node boots, the COD servers shape its view of its environment: • COD assigns node IP addresses within a subnet for each vcluster. • Each vcluster occupies a private DNS subdomain derived from the vcluster’s symbolic name assigned at creation time. • Each vcluster executes within a predefined NIS domain, which enables access for user identities and netgroups enabled for the vcluster. • COD exports NFS file storage volumes as groups and vclusters are defined.
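
The per-vcluster addressing and naming described above can be sketched with Python's standard `ipaddress` module. This is only an illustration, not COD's code: the site network `10.0.0.0/16`, the `/24` subnet size, the `cod.example.org` domain, and the `plan_vclusters` and `assign_ips` helpers are all assumptions.

```python
import ipaddress


def plan_vclusters(site_network, names, domain, prefix=24):
    """Give each vcluster its own subnet and a DNS subdomain derived from its name."""
    site = ipaddress.ip_network(site_network)
    subnets = site.subnets(new_prefix=prefix)  # generator of equally sized subnets
    plan = {}
    for name in names:
        plan[name] = {
            "subnet": next(subnets),       # next unused subnet of the site network
            "dns": f"{name}.{domain}",     # subdomain derived from the symbolic name
        }
    return plan


def assign_ips(subnet, node_count):
    """Hand out the first usable host addresses of a vcluster's subnet."""
    hosts = subnet.hosts()
    return [str(next(hosts)) for _ in range(node_count)]


plan = plan_vclusters("10.0.0.0/16", ["web", "batch"], "cod.example.org")
batch_ips = assign_ips(plan["batch"]["subnet"], 2)
```

NIS domains and NFS exports are left out of the sketch; they would be further per-vcluster entries in the same plan.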

COD architecture

Virtual Cluster Manager (VCM) of COD • One VCM for each vcluster that hosts a dynamic service • Contains the logic for monitoring load and changing membership in the active server set for the specific application environment • Handles the details of resource negotiation with the COD manager

VCM implemented for the SGE scheduler • Add_node • Remove_node • Resize
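
How the three operations could fit together is sketched below. This is a toy model rather than the actual COD or SGE code: the `VClusterManager` class, the `jobs_per_node` sizing rule and the node names are assumptions, and negotiation with the COD manager is replaced by simply adding or dropping a name from a list.

```python
import math


class VClusterManager:
    """Toy VCM that sizes a vcluster from the number of pending jobs."""

    def __init__(self, jobs_per_node=4):
        self.nodes = []
        self.jobs_per_node = jobs_per_node  # assumed sizing rule
        self._next_id = 0

    def add_node(self):
        """Add a node (a real VCM would first obtain it from the COD manager)."""
        name = f"node{self._next_id}"
        self._next_id += 1
        self.nodes.append(name)
        return name

    def remove_node(self):
        """Release the most recently added node, if any."""
        return self.nodes.pop() if self.nodes else None

    def resize(self, pending_jobs):
        """Grow or shrink the vcluster to match the current load."""
        target = math.ceil(pending_jobs / self.jobs_per_node)
        while len(self.nodes) < target:
            self.add_node()
        while len(self.nodes) > target:
            self.remove_node()
        return len(self.nodes)


vcm = VClusterManager(jobs_per_node=4)
size_under_load = vcm.resize(10)  # 10 pending jobs need ceil(10 / 4) = 3 nodes
```

Calling `vcm.resize(4)` afterwards would shrink the vcluster back to a single node.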

VMShop • Part of In-VIGO from UFL • A virtual machine management system • Provides VM-based application execution environments for Grid computing • http://www.acis.ufl.edu/~aganguly/vmshop/

VMShop operations • Create a new VM • Configure an existing VM • Estimate the cost of creating a new VM • Attribute-value based querying of VMs • Collect (or destroy) a VM

VMShop architecture

VM description • VMs are described using a DAG encoded in XML strings. • The VMPlant servers maintain a library of cached VM images, from which new VMs can be cloned. • The DAG for a new VM starts with the node identifying the cached image from which to clone, followed by nodes for steps such as configuring the network or mounting application data files.
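
To make the DAG idea concrete, here is a small sketch that parses an XML description and derives a valid execution order. The XML schema is invented for illustration (the element name `node` and the attributes `id`, `action`, `image` and `after` are assumptions, not the VMPlant format). It uses `graphlib.TopologicalSorter`, which requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Hypothetical VM description: clone a cached image, then configure it.
VM_DAG_XML = """
<vm>
  <node id="clone" action="clone-image" image="base-linux"/>
  <node id="net" action="configure-network" after="clone"/>
  <node id="data" action="mount-data" after="clone"/>
  <node id="app" action="install-app" after="net data"/>
</vm>
"""


def execution_order(xml_text):
    """Return the node ids in an order that respects every 'after' dependency."""
    root = ET.fromstring(xml_text)
    # Map each node id to the set of node ids that must run before it.
    deps = {
        node.get("id"): set(node.get("after", "").split())
        for node in root.iter("node")
    }
    return list(TopologicalSorter(deps).static_order())


order = execution_order(VM_DAG_XML)
```

The clone step always comes first and the application install last; the two middle steps are independent of each other, so either order between them is valid.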

In-VIGO • In-VIGO provides a distributed environment where multiple application instances can coexist in virtual or physical resources, such that clients are unaware of the complexities inherent to grid computing. • From UFL • http://invigo.acis.ufl.edu/

Three layers of virtualization • Virtual resources, “primitive” components: • virtual machines • virtual data • virtual applications • virtual networks • Virtual computing grids • grid applications are instantiated as services • Virtual interfaces • aggregated services (possibly presented to users via portals) export interfaces

Three layers of virtualization

Virtuoso • Distributed/Grid computing using VMs • A complete system with VM provisioning, scheduling, virtual networking, automatic application environment provisioning, and an information service • http://virtuoso.cs.northwestern.edu/ • From Northwestern Univ.

Complexity from User’s Perspective • Process or job model • Lots of complex state: connections, special shared libraries, licenses, file descriptors • Operating system specificity • Perhaps even version-specific • Symbolic supercomputer example • Need to buy into some Grid API • Install and learn potentially complex Grid software

Complexity from Resource Owner’s Perspective • Install and learn potentially complex Grid software • Deal with local accounts and privileges • Associated with global accounts or certificates • Protection/Isolation • Support users with different OS, library, license, and other needs

The Virtuoso Model (1) • User orders raw machine(s) • Specifies hardware and performance • Basic software installation available • Virtuoso creates raw image and returns reference • Image contains disk, memory, configuration, etc. • User “powers up” machine • Virtuoso chooses provider • Information service • Virtuoso migrates image to provider • Efficient network transfer

The Virtuoso Model (2) • Provider instantiates machine • Virtual networking ties machine back to user’s home network • Remote device support makes user’s desktop’s devices available on remote VM • Remote display support gives user the console of the machine (VNC) • Resource control to give user expected performance • User goes to his network admin to get address, routing for his new machine • User customizes machine • Feeds in CDs, floppies, ftp, up2date, etc.

The Virtuoso Model (3) • User uses machine • Shutdown, hibernate, power-off, throw away • Virtuoso continuously monitors and adapts • Virtual network as a monitoring platform • Various mechanisms, all invisible to user • Migrating the machine • Routing traffic between machines • Virtual network topology • Predictive scheduling versus reservations • Various goals • Price • Interactivity • Direct User Feedback

SODA • A Service-On-Demand Architecture for Application Service Hosting Utility Platforms • Utility computing concept • Application service • On-demand service provisioning on the hosting utility platform • From Purdue Univ.

SODA architecture