On-Demand Virtual Workspaces: Quality of Life in the Grid
Kate Keahey
keahey@mcs.anl.gov
Argonne National Laboratory
the Grid metaphor
• What happens if a power station fails?
• How do we store energy?
• How do we charge for energy?
• What elements make for a safe and efficient power Grid?
• How do we ensure quality of service?
• How do we reliably deliver energy?
• How do we make sure that supply meets demand?
5th Meeting of Spanish Initiative in Grid Middleware
computational Grids
[Diagram labels: TeraGrid (Caltech, ANL, SDSC, NCSA); Grid Middleware]
• What happens if a power station fails?
• How do we store computing?
• How do we charge for computing?
• What elements make for a safe and efficient power Grid?
• How do we ensure quality of service?
• How do we reliably deliver cycles?
• How do we make sure that supply meets demand?
• How can we manage different computing environments?
• What is the “unit” of resource usage?
• How can we negotiate for computation?
• How can we ensure that disk, CPUs, network are all available?
We need a vehicle that will enable us to use Grid resources as easily and intuitively as we use electrical power today.
what is virtualization?
• “Here is the environment I need to solve my problem -- deploy it on the Grid”
• “Let’s see what’s available and adapt my problem to use it”
Can we provide the middleware that will enable this change of approach?
virtual workspaces
• Focus on execution environments
• Two aspects of workspaces:
  • Environment definition: We get exactly the (software) environment we need on demand.
  • Resource allocation: Provision and guarantee all the resources the workspace needs to function correctly (CPU, memory, disk, bandwidth, availability), allowing for dynamic renegotiation to reflect changing requirements and conditions.
• Environment and resource allocation are now independent
[Slide labels: Quality of Life, Quality of Service]
how can we implement VWs?
• Configuring physical machines
  • Slow and invasive
  • Environments are hard to describe
  • Limited or no enforcement options
• Using environment management tools
• Virtual Machines
  • Fast to deploy, much less invasive
  • Environments are easy to describe
  • Bonus: isolation, serialize, redeploy, migrate
virtual machine primer
[Diagram: three VMs, each running applications (App) on a guest OS (Linux, NetBSD, Windows), on top of a Virtual Machine Monitor (VMM) / Hypervisor, on top of the hardware]
• VMM examples: Xen, VMware, UML, Denali, etc.
• Paravirtualization makes the performance overhead very acceptable
virtualizing other elements of an environment
• Virtual storage
  • Combining many distributed physical resources
• Virtual networks
  • Namespace management
  • Virtual private networks, ViNE, Virtuoso, VIOLIN
  • Quality of Service
  • Overlay networks
• Toward Virtual Grids
  • Putting all these elements together
GT4 workspace service
• The GT4 Virtual Workspace Service (VWS) allows an authorized client to deploy and manage workspaces on demand.
• GT4 WSRF front-end
• Leverages multiple GT services
• Currently implements workspaces as VMs
  • Uses the Xen VMM but others could also be used
• Current release 1.2 (September 2006)
• http://workspace.globus.org
workspace service backstage
[Diagram: the Trusted Computing Base (TCB) contains the VWS node running the VWS service, the image node, and the pool nodes]
• The VWS manages a set of nodes inside the TCB (typically a cluster). This is called the node pool.
• The workspace service has a WSRF front-end that allows users to deploy and manage virtual workspaces.
• Each node must have a VMM (Xen) installed, along with the workspace backend (software that manages individual nodes).
• VM images are staged to a designated image node inside the TCB.
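The node pool described above can be pictured with a small sketch. This is not the VWS implementation: the `PoolNode` and `NodePool` names, the memory-only accounting, and the first-fit placement rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PoolNode:
    """One node in the pool (hypothetical model; assumes Xen plus the backend are installed)."""
    hostname: str
    free_memory_mb: int
    workspaces: list = field(default_factory=list)


class NodePool:
    """A set of nodes inside the TCB that the service can place workspaces on."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def place(self, workspace_id, memory_mb):
        """First-fit placement (an assumption): use the first node with enough free memory."""
        for node in self.nodes:
            if node.free_memory_mb >= memory_mb:
                node.free_memory_mb -= memory_mb
                node.workspaces.append(workspace_id)
                return node.hostname
        raise RuntimeError("no pool node has enough free memory")
```

A real service would also account for CPU share, disk, and availability, which this sketch leaves out to keep the idea of a managed pool visible.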
deploying workspaces
[Diagram: a client sends a workspace (workspace metadata with image location) and a deployment request to the VWS service, which deploys onto pool nodes; images come from the image node]
Adapter-based implementation model
• Transport adapters
  • Default scp, then GridFTP
• Control adapters
  • Default ssh
  • Deprecated: PBS, SLURM
• VW deployment adapter
  • Xen
  • Previous versions: VMware
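One way to read the adapter-based model is as three small interfaces that the service composes. The sketch below is an illustration under assumptions: the class and function names are hypothetical, the hostnames and paths are made up, and the command lines (`scp`, `ssh`, `xm create`) are only built as argument lists, never executed. The commands the real service issues may differ.

```python
from abc import ABC, abstractmethod


class TransportAdapter(ABC):
    """Moves a VM image to a pool node."""
    @abstractmethod
    def stage_command(self, image_source, node, dest_path): ...


class ControlAdapter(ABC):
    """Runs a command on a pool node."""
    @abstractmethod
    def remote_command(self, node, argv): ...


class DeploymentAdapter(ABC):
    """Knows how to start a VM under a particular VMM."""
    @abstractmethod
    def start_argv(self, config_path): ...


class ScpTransport(TransportAdapter):
    def stage_command(self, image_source, node, dest_path):
        return ["scp", image_source, f"{node}:{dest_path}"]


class SshControl(ControlAdapter):
    def remote_command(self, node, argv):
        return ["ssh", node] + list(argv)


class XenDeployment(DeploymentAdapter):
    def start_argv(self, config_path):
        # Assumes the Xen 3-era "xm" management tool is present on the node.
        return ["xm", "create", config_path]


def plan_deployment(transport, control, deployment,
                    image_source, node, dest_path, config_path):
    """Return the commands (as argv lists) that would stage and start a workspace."""
    return [
        transport.stage_command(image_source, node, dest_path),
        control.remote_command(node, deployment.start_argv(config_path)),
    ]
```

Swapping `ScpTransport` for a GridFTP-based adapter, or `XenDeployment` for another VMM, would not change `plan_deployment`, which is the point of the adapter model.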
workspace request arguments
• A workspace, composed of:
  • VM image
  • Workspace metadata
    • XML document
    • Includes deployment-independent information:
      • VMM and kernel requirements
      • NICs + IP configuration
      • VM image location
    • Need not change between deployments
• Resource allocation
  • Specifies availability, memory, CPU%, disk
  • Changes during or between deployments
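The split between deployment-independent metadata and a per-deployment resource allocation can be sketched as two separate data structures. The XML element names below are invented for illustration and are not the actual workspace metadata schema; the field names, the example IP address, and the image location are likewise assumptions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical metadata document; NOT the real VWS schema.
METADATA_XML = """
<workspace>
  <vmm>Xen</vmm>
  <kernel>2.6</kernel>
  <nic name="eth0" ip="192.0.2.10"/>
  <image location="file:///images/example.img"/>
</workspace>
"""


@dataclass(frozen=True)
class WorkspaceMetadata:
    """Deployment-independent: need not change between deployments."""
    vmm: str
    kernel: str
    nics: tuple
    image_location: str


@dataclass
class ResourceAllocation:
    """Deployment-specific: may change during or between deployments."""
    availability_minutes: int  # assumption: availability expressed as a duration
    memory_mb: int
    cpu_percent: int
    disk_mb: int


def parse_metadata(xml_text):
    """Read the hypothetical metadata document into a WorkspaceMetadata."""
    root = ET.fromstring(xml_text.strip())
    nics = tuple((n.get("name"), n.get("ip")) for n in root.findall("nic"))
    return WorkspaceMetadata(
        vmm=root.findtext("vmm"),
        kernel=root.findtext("kernel"),
        nics=nics,
        image_location=root.find("image").get("location"),
    )
```

The same `WorkspaceMetadata` value can then be paired with different `ResourceAllocation` values, for example a short test run with little memory followed by a longer production run.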
interacting with workspaces
[Diagram: the Trusted Computing Base (TCB) with the VWS service, the image node, and the pool nodes]
• The workspace service publishes information on each workspace as standard WSRF Resource Properties.
• Users can query those properties to find out information about their workspace (e.g. what IP the workspace was bound to).
• Users can interact directly with their workspaces the same way they would with a physical machine.
workspace service interfaces
• Workspace Factory Service
  • Handles creation of workspaces. Also publishes information on what types of workspaces it can support.
  • Create() takes the workspace metadata/image and a deployment request (authorize & instantiate)
• Workspace Service
  • Handles management of each created workspace (start, stop, pause, migrate, inspecting VW state, ...)
  • inspect & manage, notify
• Workspace Resource Instance
  • Resource Properties publish the assigned resource allocation, how VW was bound to metadata (e.g. IP address), duration, and state
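The division between the factory, the management service, and the per-workspace resource can be sketched in memory as follows. This is a toy model, not the WSRF interface: the class names, the state names, the allowed transitions, and the dictionary-shaped resource properties are assumptions, and notification and migration are left out.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical state machine: operation -> {current state: next state}.
ALLOWED = {
    "start": {"created": "running", "paused": "running", "stopped": "running"},
    "pause": {"running": "paused"},
    "stop": {"running": "stopped", "paused": "stopped"},
}


@dataclass
class WorkspaceResource:
    workspace_id: str
    metadata: dict
    allocation: dict
    ip_address: str
    state: str = "created"


class WorkspaceFactory:
    """Authorizes clients and instantiates workspace resources."""

    def __init__(self, authorized_clients, ip_pool):
        self._authorized = set(authorized_clients)
        self._ips = list(ip_pool)
        self._ids = count(1)
        self.resources = {}

    def create(self, client, metadata, deployment_request):
        if client not in self._authorized:
            raise PermissionError(f"{client} is not authorized")
        if not self._ips:
            raise RuntimeError("no IP addresses left to bind")
        resource = WorkspaceResource(
            workspace_id=f"ws-{next(self._ids)}",
            metadata=dict(metadata),
            allocation=dict(deployment_request),
            ip_address=self._ips.pop(0),
        )
        self.resources[resource.workspace_id] = resource
        return resource.workspace_id


class WorkspaceService:
    """Manages workspaces that the factory has already created."""

    def __init__(self, factory):
        self._resources = factory.resources

    def manage(self, workspace_id, operation):
        resource = self._resources[workspace_id]
        try:
            resource.state = ALLOWED[operation][resource.state]
        except KeyError:
            raise ValueError(f"cannot {operation} from state {resource.state}") from None
        return resource.state

    def resource_properties(self, workspace_id):
        """What a client could query: allocation, bound IP, and state."""
        resource = self._resources[workspace_id]
        return {
            "allocation": dict(resource.allocation),
            "ip_address": resource.ip_address,
            "state": resource.state,
        }
```

A client would call `create` once on the factory, then use the returned identifier with the service to start, pause, or stop the workspace and to read back properties such as the bound IP address.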
status
• Latest release: 1.2, released 9/14
  • Significant improvement over 1.1.1
• At least one more release planned by the end of the year, to include a C client and better IP handling, among others
• To be included in the next VDT release
• VW is an incubator project in dev.globus
  • New governance model for Globus Toolkit
  • http://dev.globus.org
• All software released under Apache license 2.0
support
And that’s what we do to bugs!
applications: ESF
www.opensciencegrid.org/esf
ESF: division of labor
Paper: “Division of Labor: Tools for Growth and Scalability of Grids”, ICSOC 2006
applications: STAR
[Diagram labels: GRAM, VWS, nodes marked “STAR” and “no STAR”]
Provisioning STAR nodes on TeraPort (UC): demonstrated at SC06 show floor
are we there yet?
Yes. And No...
• YES: we do have reliable infrastructure that can implement the basic virtualization scenario
• NO: the basic scenario addresses about 10% of virtualization potential (on a good day)
a chicken and egg problem
[Slide labels: Chicken, Egg]
meet the chicken
• Overcoming Xenophobia
  • Hypervisor installations are “invasive”
  • We need flexible site resource management systems
• Security: the cure or the disease?
  • On the whole the cure, but it is a new tool
• Will it scale?
  • This is not a question that a simulation could answer!
  • We need more effort in this area
• Commercial deployments are moving faster
  • Hosting services, Amazon’s EC2, others…
  • There are more incentives
• Pioneering is hard!
  • OSG
meet the egg
• Suppose you have this infrastructure deployed, now what?
  • Where would iTunes be without music?
• Original idea: develop a library of VM images
  • Labor intensive
  • Images “age”
• “Assembly line” approach
  • rPath: scientific appliances and rBuilder
    • Appliance = application + its environment
  • BCFG2: configuration management tool
• Producing and managing images
  • How do we describe, identify, and query to find the right image?
virtualizing clusters
• How do we construct virtual clusters?
• How do we deploy virtual clusters on hardware resources? (overcoming xenophobia)
  • The overhead should be “invisible” to the client
• Can we take advantage of application-specific knowledge when we schedule VMs?
• What scheduler logic is appropriate and needed for scheduling workspaces?
• Papers
  • “Virtual Clusters for Grid Communities”, CCGrid06
  • “Overhead Matters: A Model for Virtual Resource Management”, VTDC 2006 (in SC06)
toward virtual grids
• Deploying workspaces across multiple sites
  • Remember the STAR application
• Virtualizing multiple aspects of a Grid
  • Combining networking and storage
  • Use case: combining QoS on data movement and execution
  • We want to get rid of workspace staging!
• These are good times to be a meta-scheduler!
details, details…
• Looking down the road
  • Assume we have resolved the “simple” problem…
• What if we succeed?
  • 100s of VMs per physical machine
  • Name management, storage, etc.
• On the bright side
  • There may also be pleasant surprises
conclusions
• We live in exciting times!
• Making progress is hard
  • We have useful infrastructure that is being used by projects today on a small scale -- we need to move to larger scales
  • There are still many open problems
• We have work to do!
Virtualization Workshop
• Virtualization Technology in Distributed Computing (VTDC) 2006
• Co-held with SC06
• http://workspace.globus.org/vtdc2006
credits
• Workspace team
  • Tim Freeman
  • Borja Sotomayor
• With guest appearances by:
  • Ian Foster, Elizeu Santos-Neto, Frank Siebenlist, and others