Grid & Virtualization Working Group. OGF20 gridvirt-wg. Erol Bozak, Chair SAP, Development Architect Wolfgang Reichert, Co-Chair IBM, Senior Technical Staff Member. May 7, 2007 Manchester, UK. OGF IPR Policies Apply.
OGF IPR Policies Apply
• “I acknowledge that participation in this meeting is subject to the OGF Intellectual Property Policy.”
• Intellectual Property Notices Note Well: All statements related to the activities of the OGF and addressed to the OGF are subject to all provisions of Appendix B of GFD-C.1, which grants to the OGF and its participants certain licenses and rights in such statements. Such statements include verbal statements in OGF meetings, as well as written and electronic communications made at any time or place, which are addressed to:
  • the OGF plenary session,
  • any OGF working group or portion thereof,
  • the OGF Board of Directors, the GFSG, or any member thereof on behalf of the OGF,
  • the ADCOM, or any member thereof on behalf of the ADCOM,
  • any OGF mailing list, including any group list, or any other list functioning under OGF auspices,
  • the OGF Editor or the document authoring and review process
• Statements made outside of an OGF meeting, mailing list or other function, that are clearly not intended to be input to an OGF activity, group or function, are not subject to these provisions.
• Excerpt from Appendix B of GFD-C.1: “Where the OGF knows of rights, or claimed rights, the OGF secretariat shall attempt to obtain from the claimant of such rights, a written assurance that upon approval by the GFSG of the relevant OGF document(s), any party will be able to obtain the right to implement, use and distribute the technology or works when implementing, using or distributing technology based upon the specific specification(s) under openly specified, reasonable, non-discriminatory terms. The working group or research group proposing the use of the technology with respect to which the proprietary rights are claimed may assist the OGF secretariat in this effort. The results of this procedure shall not affect advancement of document, except that the GFSG may defer approval where a delay may facilitate the obtaining of such assurances. The results will, however, be recorded by the OGF Secretariat, and made available. The GFSG may also direct that a summary of the results be included in any GFD published containing the specification.”
• OGF Intellectual Property Policies are adapted from the IETF Intellectual Property Policies that support the Internet Standards Process.
Agenda
• Status of GridVirt Activities
• Management of Virtual Machines in Grid Infrastructures
  • Speaker: Ruben S. Montero
• Use Cases
• Virtual Workspaces
• Work Streams for the GridVirt-Working Group
Status
• Introduction of the working group at OGF19
  • Virtualization Concepts
  • Goals and Scope Definition
  • Milestones Definition
• Milestones
  • Milestone 1 (OGF 19)
    • Introduction, scope definition & milestones definition
  • Milestone 2 (OGF 20)
    • Terminology definition
    • Collection of use cases
    • Determine relations to other OGF WGs and SDOs
  • Milestone 3 (OGF 21)
    • Requirements collection
    • Determine relation to other standards
    • First draft of a profile
EGA Reference Model: The role of Grid Management Entity (GME)
[Diagram. Labels: Enterprise; Accounting / Billing; Policies; External Events; Grid Management Entity; Manage; Monitor; Grid Component; Other GME]
EGA Reference Model: Enterprise Grid Components & Dependencies
[Diagram. Labels: Logical GME; Grid Management Entity; Accounting / Billing; Policies; Manage; Monitor; Grid Component; GC (repeated); Other GME]
Reference Model – Virtualization: Grid Components & Dependencies
[Diagram. Labels: Logical GME; Application GME; Application; Virtualization Platform GME; Hypervisor; System GME; VS (repeated); Scenario / Landscape; Accounting / Billing; Policies; Manage; Monitor]
EGA Reference Model: Service Level Management
[Diagram. Labels: Enterprise; Accounting / Billing; Policies; Grid Management Entity; Assigns / Provisions; Reconciles; Resources (other GCs); Metrics; Manage; Monitor; Consumed; Grid Component; Generates]
EGA Reference Model: Lifecycle of a Grid Component
[Diagram. Labels: Create / Discover; Destroy; Unconfigured; Configure; Unconfigure; Provision; Inactive; Decommission; Start; Stop; Active; Manage]
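The lifecycle labels on this slide can be read as a small state machine. The sketch below is one possible reading of the flattened diagram, not the EGA definition: it assumes Unconfigured, Inactive and Active are states, that the remaining labels are transitions, and that Provision and Decommission group the forward and backward transitions. The `ABSENT` state and all Python names are hypothetical.

```python
from enum import Enum


class State(Enum):
    ABSENT = "absent"              # hypothetical: before Create / after Destroy
    UNCONFIGURED = "unconfigured"
    INACTIVE = "inactive"
    ACTIVE = "active"


# (current state, action) -> next state. Assumed reading of the diagram.
TRANSITIONS = {
    (State.ABSENT, "create"): State.UNCONFIGURED,
    (State.UNCONFIGURED, "configure"): State.INACTIVE,
    (State.INACTIVE, "start"): State.ACTIVE,
    (State.ACTIVE, "manage"): State.ACTIVE,
    (State.ACTIVE, "stop"): State.INACTIVE,
    (State.INACTIVE, "unconfigure"): State.UNCONFIGURED,
    (State.UNCONFIGURED, "destroy"): State.ABSENT,
}

# Assumed groupings: Provision walks forward, Decommission walks back.
PROVISION = ["create", "configure", "start"]
DECOMMISSION = ["stop", "unconfigure", "destroy"]


def step(state, action):
    """Apply one lifecycle action, rejecting transitions not in the table."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not allowed in state {state.name}") from None


def run(state, actions):
    """Apply a sequence of lifecycle actions and return the final state."""
    for action in actions:
        state = step(state, action)
    return state
```

Keeping the transitions in a table makes illegal requests, such as starting an unconfigured component, fail explicitly instead of being silently ignored.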
Use Cases Structure
[Diagram. Labels: System Virtualization; Provision Virtual System(s); Create / Discover Image(s); Configure Image(s); Deploy Virtual System(s) from Image(s); Manage Virtual System(s); Decommission Virtual System(s)]
Use Cases Structure
[Diagram. Labels: System Virtualization; Provision Virtual System(s); Manage Virtual System(s); Migration; Dynamic Resizing; Monitoring; Snapshotting; Decommission Virtual System(s)]
Virtualization Use Cases: Migration of virtual system during runtime
• Power Saving: The resource manager may consolidate virtual systems onto a reduced number of physical systems in order to save power.
• Planned maintenance: The physical system as well as the hypervisor could require maintenance activities (e.g. installing a patch, a hardware upgrade, or a driver). The running job could be migrated to other machines without downtime.
• Adaptation to changing capacity requirements & conditions (availability or offering): Available capacity may change because completed jobs have freed resources or because additional physical systems have been introduced.
Virtualization Use Cases: Power Saving
• Policy-driven
• Monitor event
  • Resource utilization below threshold
  • Temperature above threshold
• External event
  • From hierarchically higher GME
• Resulting management actions
  • Communication with System Virtualization GME(s) for live migration
  • Resource allocation / deallocation
  • Notification of grid component (before and after live migration)
  • Accounting event
[Diagram. Labels: Enterprise; Accounting / Billing; Policies; Grid Management Entity; Assigns / Provisions; Reconciles; Resources (other GCs); Metrics; Manage; Monitor; Consumed; Grid Component; Generates]
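As an illustration of the two monitor events named on this slide, the following sketch flags physical hosts whose samples would trigger a live-migration decision. It is a minimal example under stated assumptions: the `HostSample` type, the function name and both default thresholds are hypothetical and do not come from the working group material.

```python
from dataclasses import dataclass


@dataclass
class HostSample:
    name: str
    utilization: float     # fraction of capacity in use, 0.0 to 1.0
    temperature_c: float   # degrees Celsius


def migration_triggers(samples, min_utilization=0.2, max_temperature_c=70.0):
    """Return (host name, reason) pairs for hosts that raise a monitor event.

    The thresholds are placeholder policy values, not recommended settings.
    """
    triggers = []
    for s in samples:
        if s.utilization < min_utilization:
            triggers.append((s.name, "utilization below threshold"))
        elif s.temperature_c > max_temperature_c:
            triggers.append((s.name, "temperature above threshold"))
    return triggers
```

In the terms of the slide, a GME would pass each returned host to the System Virtualization GME(s) to negotiate live migration, then notify the affected grid components and emit an accounting event.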
Virtualization Use Cases: Planned Maintenance
• External event
• Resulting management actions
  • Communication with System Virtualization GME(s) for live migration
  • Resource allocation / deallocation
  • Notification of grid component (before and after live migration)
  • Accounting event
[Diagram. Labels: Enterprise; Policies; Accounting / Billing; External Events; GME; Manage; Monitor; Grid Component; Other GME]
Virtualization Use Cases: Dynamic Resizing
• Dynamically changing capacity requirements: During runtime the job may require additional capacity (e.g. CPU capacity, memory capacity, I/O bandwidth). If the underlying physical system can serve the requirement, more capacity for the job / virtual system can be provided locally on the same physical system.
• Dynamically changing capacity offering / availability: Capacity availability on the physical system may change (e.g. CPU capacity, memory capacity, I/O bandwidth) because completed jobs have freed resources. In these situations the available capacity can be used by the running jobs.
Virtualization Use Cases: Dynamic Resizing
• Policy-driven
• Monitor event
  • SLO is going to be missed (progress indicator, trend analysis)
• Grid component event
  • Dynamic resource requirements
• External event
  • Dynamic resource availability
• Resulting management actions
  • Communication with System Virtualization GME for system resizing
  • Virtual resource reallocation
  • Changing system parameters / settings
  • Notification of grid component (before and after resizing)
  • Accounting event
[Diagram. Labels: Enterprise; Accounting / Billing; Policies; Grid Management Entity; Assigns / Provisions; Reconciles; Resources (other GCs); Metrics; Manage; Monitor; Consumed; Grid Component; Generates]
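The "SLO is going to be missed" monitor event relies on a progress indicator and trend analysis. One simple way to sketch that idea is a linear extrapolation of job progress, shown below. The function names and the sample format are hypothetical, and a linear trend is only an assumption chosen for brevity; the slides do not prescribe a method.

```python
def projected_finish(samples):
    """Extrapolate the completion time from (time_s, progress) samples.

    progress is a fraction between 0.0 and 1.0. Only the first and last
    samples are used, which amounts to a straight-line trend.
    """
    if len(samples) < 2:
        raise ValueError("need at least two progress samples")
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing time")
    rate = (p1 - p0) / (t1 - t0)
    if rate <= 0:
        return float("inf")  # no measurable progress
    return t1 + (1.0 - p1) / rate


def slo_at_risk(samples, deadline_s):
    """True when the projected finish time lies beyond the deadline."""
    return projected_finish(samples) > deadline_s
```

A job that reached 25% after 100 seconds is projected to finish at about 400 seconds, so a 300-second deadline would raise the event and could lead to a resize request to the System Virtualization GME.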
Virtualization Use Cases: Snapshotting
• Stateful cloning: The execution of a job may require costly preparation steps, e.g. retrieving data from the backend, which might be common to all jobs of an activity / application. Rather than repeating the preparation for each job, one job (for example the first) can be snapshotted after the preparation, and its state can be cloned and distributed.
• Reproducing situations: For the purpose of diagnosis (error or performance analysis), the user may repeatedly re-run the same job from a certain persisted state.
• Protecting (long-running) jobs from software or hardware failures: By providing recovery points that can be re-activated, (long-running) jobs can be restarted from a certain persisted state, potentially on a different physical system.
Virtualization Use Cases: Stateful Cloning
• Policy-driven or external request from top-level GME
  • Subsequent provisioning scenario using the snapshot
• Resulting management actions
  • Communication with System Virtualization GME to take snapshot
  • Notification when snapshot has been taken
  • Accounting event
• Top-level GME manages
  • Cloning of the snapshot (distribution, postprocessing)
  • Provisioning scenario
[Diagram. Labels: Enterprise; Policies; Accounting / Billing; External Events; GME; Manage; Monitor; Grid Component; Other GME]
Virtualization Use Cases: Reproducing Situations
• External event to take snapshot
• External event to restart from persisted state
• Resulting management actions
  • Communication with System Virtualization GME to take snapshot
  • Communication with System Virtualization GME to restart from snapshot
[Diagram. Labels: Enterprise; Policies; Accounting / Billing; External Events; GME; Manage; Monitor; Grid Component; Other GME]
Virtualization Use Cases: Isolation
• Metering of job resource consumption: For the purpose of accounting and billing.
• Resource consumption control: Through isolation, the amount / degree of resource consumption can be controlled and leveled, i.e. greedy jobs can be kept in check.
Virtualization Use Cases: Metering of Job Resource Consumption
• Data collection at deprovisioning event
• Resulting management actions
  • Communication with System Virtualization GME to get accurate metering data for the lifetime of the virtual system
[Diagram. Labels: Enterprise; Policies; Accounting / Billing; External Events; GME; Manage; Monitor; Grid Component; Other GME]
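To make "metering data for the lifetime of the virtual system" concrete, the sketch below totals CPU allocation over time from a list of allocation change points, closing the last interval at the deprovisioning event. The record format and the function name are hypothetical, and metering allocated vCPUs (rather than measured usage) is an assumption made to keep the example short.

```python
def cpu_seconds(allocations, deprovisioned_at):
    """Sum vCPU-seconds over the lifetime of a virtual system.

    allocations: list of (timestamp_s, vcpus) change points, sorted by time.
    deprovisioned_at: timestamp_s of the deprovisioning event.
    """
    points = list(allocations)
    # Pair each change point with the next one; the final interval ends
    # when the virtual system is deprovisioned.
    ends = points[1:] + [(deprovisioned_at, 0)]
    total = 0.0
    for (start, vcpus), (end, _) in zip(points, ends):
        total += vcpus * (end - start)
    return total
```

A virtual system that ran with 2 vCPUs for 100 seconds and was then resized to 4 vCPUs for a further 50 seconds is metered at 2 × 100 + 4 × 50 = 400 vCPU-seconds, which shows why resizing events need to be part of the accounting record.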
Virtualization Use Cases: Resource Consumption Control
• Policy-driven
• Resulting management actions
  • Communication with System Virtualization GME to set limits
[Diagram. Labels: Enterprise; Policies; Accounting / Billing; External Events; GME; Manage; Monitor; Grid Component; Other GME]
Virtualization Use Cases: Provisioning Scenarios
• Resource Provisioning: Definition and activation of the desired runtime environment of a job.
  • Rather than searching for and allocating resources for the job, resources can be “created” on demand.
  • Definition and provisioning of the required software stack (runtime environment)
• Emulation: Emulating an environment for legacy jobs.
  • Legacy applications / jobs may require certain physical resources or a certain runtime environment (e.g. operating system). In this situation a virtual system may emulate the legacy environment.
• Isolation: Avoiding conflicts. Ensuring security.
  • To protect the job from spyware, the job can be executed in its own dedicated and certified virtual system.
Virtual Workspaces
• A Virtual Workspace is an abstraction of an execution environment…
  • …that can be made dynamically available
  • …through well defined protocols,
  • …the software environment contained in the workspace and the user submitting the workspace are both trustworthy.
• Virtual Workspaces is not a new idea!
  • Dynamically setting up cluster nodes
    • CoD: http://www.cs.duke.edu/nicl/cod/
    • bcfg: http://trac.mcs.anl.gov/projects/bcfg2/
  • Providing access to existing installation
    • Dynamic Accounts: http://workspace.globus.org/da/
  • Refining site configuration
    • Pacman: http://www.archlinux.org/pacman/
Virtual Workspaces: Representation of a Virtual Workspace
[Diagram. Labels: Virtual Workspace Specification; Virtual Workspace Deployment; XML; XML; Deployment Request; VM Image; Metadata]
Virtual Workspaces
• Specification of a Virtual Workspace
  • VM Image
  • Metadata
    • XML Document
    • Includes deployment-independent information:
      • VMM and kernel requirements
      • NICs + IP configuration
      • VM image location
    • Does not change between deployments
• Deployment Request
  • Specifies availability, memory, CPU%, disk
  • Changes during or between deployments
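The split between deployment-independent metadata and a deployment request that may change can be illustrated with two immutable records. This is a sketch only: the class names, field names and units are hypothetical and do not reproduce the XML documents mentioned on the slide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WorkspaceMetadata:
    """Deployment-independent part: stays the same between deployments."""
    vmm: str               # required virtual machine monitor
    kernel: str            # required kernel
    nics: tuple            # NIC / IP configuration entries
    image_location: str    # where the VM image can be fetched from


@dataclass(frozen=True)
class DeploymentRequest:
    """Part that may change during or between deployments."""
    availability_s: int
    memory_mb: int
    cpu_percent: int
    disk_mb: int


@dataclass(frozen=True)
class Workspace:
    metadata: WorkspaceMetadata
    deployment: DeploymentRequest


def redeploy(workspace, **changes):
    """Return a workspace with an updated deployment request.

    The metadata object is reused unchanged, mirroring the slide's
    distinction between the two parts.
    """
    return replace(workspace, deployment=replace(workspace.deployment, **changes))
```

For example, `redeploy(ws, memory_mb=1024)` yields a new `Workspace` whose `metadata` is the very same object as before, while only the deployment request differs.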
Virtual Workspaces
[Diagram. Labels: Trusted Computing Base; VW Factory Service; VW Service; Image Node; Node Pool; VW Node; Node (repeated 12 times)]
Virtual Workspaces
[Diagram. Labels: Trusted Computing Base; VW Factory Service (Create: Metadata Instance, Deployment Req.); VW Service (Manage: Start / Stop / Suspend, Migrate, Monitor etc.); Image Node; Node Pool; VW Node; Agent; Hypervisor; Node (repeated 12 times)]
Virtual Workspaces
[Diagram. Labels: Trusted Computing Base; VW Factory Service (Create: Metadata Instance, Deployment Req.); VW Service; Image Node; Node Pool; VW Node; Node (repeated 12 times)]
Virtual Workspaces
[Diagram. Labels: Trusted Computing Base; VW Factory Service; VW Service (Manage: Start / Stop / Suspend, Migrate, Monitor etc.); Image Node; Node Pool; VW Node; Node (repeated 12 times)]
Workstreams
• Workstream 1: Refine Use Cases & align Grid Reference Architecture in the Context of System Virtualization
  • Define the requirements on the grid architecture for integration with system virtualization platforms
• Workstream 2: Refine the Provisioning Use Case
  • Define the interaction among the components in the architecture to create / discover, configure and start a Virtual System
  • Define an information model for the definition of Virtual Systems
  • Exploit the concept of “Virtual Workspaces”
Appendix
• Project Homepage
  • https://forge.gridforum.org/sf/projects/gridvirt-wg
• Mailing list
  • gridvirt-wg@ogf.org
  • Subscription: http://www.ogf.org/mailman/listinfo/gridvirt-wg