“Cloud bursting” on SZTAKI Cloud Attila Csaba Marosi Cloud Computing Research Group MTA SZTAKI LPDS marosi.attila@sztaki.mta.hu Summer School on Grid and Cloud Workflows and Gateways 2013
Outline • Terminology • Recap: SZTAKI Cloud and LPDS Cloud • Cloud-Manager • Cloud bursting definition, scalability in general • Scaling scenarios @ SZTAKI Cloud • Summary • Additional Reading and References
Terminology I. • Based on deployment model: • Public Cloud – “The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services.” [3] • Private Cloud – “The cloud infrastructure is operated solely for an organization. It may be managed by the organization or a third party and may exist on premise or off premise.” [3] • Hybrid Cloud – Environment created by the combination of public and private cloud offerings • (Community Cloud) [3]
Terminology II. • Based on location: • Internal Cloud – Subset of the Private Cloud model where it is offered by an IT organization to its own business [1] (“on premise” [3]). • External Cloud – Not hosted by the own organization but offered by a third party. It can be either public or private [1] (“off premise” [3]). • From the point of view of architectural service layers: • Software as a Service (SaaS) • Platform as a Service (PaaS) • Infrastructure as a Service (IaaS) – cloud bursting (scaling) happens at this level
Recap • SZTAKI Cloud* • Institutional IaaS cloud service by SZTAKI (private, internal) • 7 nodes (7 × 64 cores, 7 × 256 GB RAM), 2 × 32 TB storage • Based on OpenNebula 3.8.3 • Quotas for users • LPDS Cloud* • Similar, but smaller scale • Internal private cloud for LPDS • Typically we use the LPDS Cloud for internal needs and scale out to SZTAKI Cloud when needed. * Sándor Ács: “SZTAKI Cloud”. Monday, 1st July @ 12:00.
Definition, scalability • Cloud bursting: • “Cloud bursting is an application deployment model in which an application runs in a private cloud or data center and bursts into a public cloud when the demand for computing capacity spikes.” [4] • More generally, however, cloud bursting is a subset of the general scaling-out problem • It can be split into 2 parts: • The capability to scale out to a cloud to maintain QoS requirements (e.g., for handling short-term spikes in computing capacity demand). • Making the decision of (a) when, (b) how much, (c) how long and (d) where to scale out.
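The four parts of the scale-out decision, (a) when, (b) how much, (c) how long and (d) where, can be captured in one small record. The following is a minimal sketch and not the logic of any tool mentioned in this talk: the threshold rule, the fixed lease length and all names (`ScaleOutDecision`, `decide`, `jobs_per_vm`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ScaleOutDecision:
    scale_out: bool      # (a) when: is scaling out needed right now?
    extra_vms: int       # (b) how much: number of VMs to add
    lease_minutes: int   # (c) how long: planned lifetime of the new VMs
    target_cloud: str    # (d) where: which cloud receives the VMs


def decide(queued_jobs, running_vms, jobs_per_vm, clouds, lease_minutes=60):
    """Hypothetical rule: add VMs until every queued job has a slot.

    `clouds` is a list of (name, free_vm_slots) pairs in preference order.
    """
    needed = ceil(queued_jobs / jobs_per_vm)
    missing = max(0, needed - running_vms)
    if missing == 0:
        return ScaleOutDecision(False, 0, 0, "")
    for name, free in clouds:
        if free > 0:
            # Take as much as this cloud can offer, up to what is missing.
            return ScaleOutDecision(True, min(missing, free), lease_minutes, name)
    # No cloud has free capacity, so nothing can be started.
    return ScaleOutDecision(False, 0, 0, "")
```

The sketch picks a single target cloud per decision; a real decision maker would also take cost and QoS targets into account.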
The ability to scale out (to a cloud) + Making the decision • The ability to scale out: scaling out scenarios (with SZTAKI Cloud) – in this talk • Making the decision: auto-scaling techniques – “Cloud bursting from WS-PGRADE/gUSE”, Thursday, 11:00-11:30
Cloud-Manager • Part of the FCM [5] (“Federated Cloud Management”) architecture • We’ll now focus on the Cloud-Manager • For FCM, cf. Attila Kertesz: “Cloud Federation Approaches” – @ 11:00 today • Schedules service calls to VMs and manages VMs • REST/SOAP Web service interface for service call and VM queues • The Cloud Resource Manager (CRM) component is responsible for the scaling decision (when/ where/ … ) • Initially it was intended for scaling services in a single cloud • We use this component internally for different multi-cloud scaling (bursting) scenarios. [Figure: FCM architecture showing the Generic Meta-Broker Service, the FCM Repository (VAx..VAy) and the Cloud-Manager. Cloud-Manager components: queue Q1, Service Handler, VM queues VMQx and VMQy for Cloud a, the Cloud a VM Handler, and the VMs VMx1…VMxn and VMy1…VMym in Cloud a.]
Cloud-Manager • Single queue for incoming service calls (or tasks) • Multiple VM queues • A different one for each VA and resource combination • VM queues can be managed automatically (CRM) or manually • Manages the VM lifecycle (EC2 REST API) • Performs the scheduling of service calls to resources (Q1→VM) [Figure: Cloud-Manager with numbered callouts 1–4, showing VAx and VAy, queue Q1, the Service Handler, VM queues VMQx and VMQy for Cloud a, the Cloud a VM Handler, and the VMs VMx1…VMxn and VMy1…VMym in Cloud a.]
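The queue model described above (one queue for service calls, one VM queue per VA and resource combination, scheduling from Q1 to a VM) can be sketched as follows. This is an illustrative model only, not Cloud-Manager code: the class name, method names and the round-robin policy are assumptions.

```python
from collections import deque


class CloudManagerSketch:
    """Toy model: single service call queue plus per-(VA, cloud) VM queues."""

    def __init__(self):
        self.service_calls = deque()  # Q1: incoming service calls, FIFO
        self.vm_queues = {}           # (va, cloud) -> list of VM ids

    def add_vm(self, va, cloud, vm_id):
        # Each VA and resource (cloud) combination has its own VM queue.
        self.vm_queues.setdefault((va, cloud), []).append(vm_id)

    def submit(self, va, payload):
        # A service call is bound to the VA that can execute it.
        self.service_calls.append((va, payload))

    def schedule(self):
        """Assign queued calls to VMs of the matching VA (round robin).

        Calls without a matching VM stay in Q1 for a later round.
        """
        assignments = []
        waiting = deque()
        next_index = {}
        while self.service_calls:
            va, payload = self.service_calls.popleft()
            candidates = [vm
                          for (queue_va, _cloud), vms in self.vm_queues.items()
                          if queue_va == va
                          for vm in vms]
            if not candidates:
                waiting.append((va, payload))
                continue
            i = next_index.get(va, 0)
            assignments.append((payload, candidates[i % len(candidates)]))
            next_index[va] = i + 1
        self.service_calls = waiting
        return assignments
```

In this sketch a call that finds no VM for its VA remains queued, which is the situation in which a resource manager would start new VMs.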
Scenarios @ SZTAKI • Source: current infrastructure type (not necessarily cloud based!) • Destination: target cloud infrastructure type
Scenario A: Private → Public
Scenario A: Private → Public • Form a hybrid cloud: when local resources are insufficient, allocate resources from a public cloud provider • Real-world example: Prezi.com • Uses private resources with Amazon EC2 to handle peak traffic • Batch processing of tasks • Zip files for download, fetch images for presentations, conversion jobs • Prezi.com Scale Contest – http://prezi.com/scale/ • Jobs wait at most 5 seconds in the queue, VMs take 2 minutes to boot, instances are paid by the hour – minimize cost while honoring these requirements
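The three contest numbers above interact: because a VM boots in 2 minutes but a job may wait only 5 seconds, capacity has to be started before the demand arrives, and because instances are paid by the hour, every started hour counts in full. A small arithmetic sketch follows, assuming that each started hour is billed as a whole hour; the function names and the price are hypothetical and not taken from the contest rules.

```python
from math import ceil

BOOT_SECONDS = 120        # VM boot time from the slide (2 minutes)
MAX_QUEUE_SECONDS = 5     # maximum time a job may wait in the queue
BILLING_UNIT_SECONDS = 3600  # instances are paid by the hour


def billed_hours(uptime_seconds):
    """Hours charged for a VM, assuming each started hour is billed in full."""
    return ceil(uptime_seconds / BILLING_UNIT_SECONDS)


def cost(uptime_seconds, price_per_hour):
    # price_per_hour is a hypothetical rate supplied by the caller.
    return billed_hours(uptime_seconds) * price_per_hour


def latest_launch_time(demand_time):
    """Latest launch time (seconds) so the VM is up before the job times out."""
    return demand_time + MAX_QUEUE_SECONDS - BOOT_SECONDS
```

For example, a VM that runs for 61 minutes is charged for 2 hours under this assumption, and a job arriving at t = 1000 s needs its VM launched no later than t = 885 s.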
Scenario A: Private → Public • At SZTAKI we have the following possibilities for bursting: • OpenNebula based bursting • Cloud-Manager based bursting • However, we prefer to use private clouds over public ones – bursting to public clouds is set up as an absolute last resort
OpenNebula: Building a Hybrid Cloud (Scenario A)* • OpenNebula supports accessing multiple remote providers through the EC2 API – not necessarily just Amazon EC2 • A remote provider appears as a new host in OpenNebula • Resource limits set by the administrator for the number and type of instances • VMs can be started in EC2 or locally • The VM needs a counterpart at the remote provider – EC2 section in the VM template • Network connectivity via VPN * Sándor Ács: “OpenNebula”. Monday, 1st July @ 11:00.
OpenNebula: Hybrid Cloud Use Cases* • On-demand Scaling of Computing Clusters • E.g., elastic execution of a Condor computing cluster • Dynamic growth of the number of worker nodes to meet demands using EC2 • Private network with NIS and NFS • EC2 worker nodes connect via VPN • On-demand Scaling of Web Servers • E.g., elastic execution of the NGinx web server • The capacity of the elastic web application can be dynamically increased or decreased by adding or removing NGinx instances * Sándor Ács: “OpenNebula”. Monday, 1st July @ 11:00.
Cloud-Manager: multi-cloud (Scenario A) • Cloud-Manager supports multiple providers through the EC2 REST/SOAP API • OpenNebula, OpenStack, Eucalyptus and Amazon EC2 • Primarily for scaling Distributed Computing Infrastructures (DCIs) • Service calls are bound to VAs • Each configured provider must have the counterpart (AMI-ID) • Network connectivity via VPN when needed [Figure: Cloud-Manager with VAx and VAy, queue Q1, the Service Handler, VM queue VMQx for Cloud a and VM queue VMQx for Cloud b, the Cloud a Handler and Cloud b Handler, and the VMs VMx1…VMxn in Cloud a and VMx1…VMxm in Cloud b.]
Scenario B: Private → Private
Scenario B: Private → Private • Scale from a private infrastructure to another private infrastructure • E.g., scale from your local infrastructure (e.g., private internal) to another academic cloud (e.g., private external) • Typical use case for us: scaling out from LPDS Cloud to SZTAKI Cloud (however, both can be considered internal clouds)
SZTAKI: Scenario B+A (1/2.) • We scale primarily computing clusters (Condor, BOINC) with Cloud-Manager • We use the LPDS Cloud (private) • Scale out to SZTAKI cloud (private) • As last resort scale out to Amazon EC2 (public)
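The preference order above (LPDS Cloud first, then SZTAKI Cloud, Amazon EC2 only as a last resort) amounts to filling tiers in order and spilling over only when a tier is full. A minimal sketch, assuming the free capacity of each tier is known; the function name and the capacity numbers in the example are hypothetical.

```python
def allocate(needed_vms, tiers):
    """Spread a VM request over clouds in strict preference order.

    `tiers` is a list of (cloud_name, free_vm_slots), most preferred first.
    Returns (plan, unplaced): plan is a list of (cloud_name, vm_count) and
    unplaced is the number of VMs that no tier could host.
    """
    plan = []
    remaining = needed_vms
    for name, free in tiers:
        if remaining == 0:
            break
        take = min(remaining, free)
        if take > 0:
            plan.append((name, take))
            remaining -= take
    return plan, remaining


# Example with made-up capacities: the public cloud is touched only after
# both private clouds are full.
tiers = [("LPDS Cloud", 4), ("SZTAKI Cloud", 5), ("Amazon EC2", 100)]
plan, unplaced = allocate(10, tiers)
```

With these capacities a request for 10 VMs places 4 on LPDS Cloud, 5 on SZTAKI Cloud and 1 on Amazon EC2, while a request for 3 VMs never leaves LPDS Cloud.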
SZTAKI: Scenario B+A (2/2.) • The master node (1) and the Cloud-Manager (2) are usually hosted on a dedicated resource • The VPN head (3) must typically be on a node with a public IP • We use a patched version of TINC with public key authentication • The Cloud Resource Manager (4) is responsible for auto-scaling • New VM instances are created and destroyed through the EC2 REST/SOAP API (5) [Figure: architecture diagram with callouts 1–5 for the components named above.]
Example: Scaling a Condor cluster with Cloud-Manager • CM service calls → jobs for Condor • Through the REST/SOAP interface (e.g., WS-PGRADE/gUSE) • VPN head on a public IP • Manager node: Cloud-Manager and Condor Master • VAs are deployed at LPDS, SZTAKI, Amazon EC2 • Contextualization by Cloud-Manager: • Key for VPN • VPN head public IP • Condor Master IP on VPN [Figure: deployment diagram with numbered callouts.]
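The three contextualization items above (VPN key, VPN head public IP, Condor Master IP on the VPN) have to reach each new worker VM at boot time. One common way to do this with EC2-style APIs is to pass them as base64-encoded user data. The sketch below only illustrates that idea: the key-value layout, the variable names and the function are assumptions, not the format Cloud-Manager actually uses.

```python
import base64


def build_user_data(vpn_key, vpn_head_ip, condor_master_vpn_ip):
    """Pack the contextualization values into base64-encoded user data.

    The KEY=value layout is a hypothetical format for illustration only.
    """
    # The VPN key may span several lines, so it is encoded on its own first.
    key_b64 = base64.b64encode(vpn_key.encode("utf-8")).decode("ascii")
    lines = [
        f"VPN_HEAD_IP={vpn_head_ip}",
        f"CONDOR_MASTER_IP={condor_master_vpn_ip}",
        f"VPN_KEY_B64={key_b64}",
    ]
    text = "\n".join(lines) + "\n"
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Example with documentation-range addresses and a dummy key.
user_data = build_user_data("dummy-key\nsecond-line", "203.0.113.10", "10.0.0.1")
decoded = base64.b64decode(user_data).decode("utf-8")
```

A boot script inside the VA would decode these values, bring up the VPN connection to the VPN head and then point the Condor worker at the master's VPN address.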
Example: Scaling a Condor cluster with Cloud-Manager
Scenario C: Volunteer → {Public, Private}
Scenario C: Volunteer → {Public, Private} • LPDS runs multiple BOINC based volunteer computing projects – SZTAKI Desktop Grid, EDGeS@home • People donate their computers’ idle computing cycles to science • We do not own the resources • We do not have any control over the resources • These resources are “free”, but not very reliable • Jobs might be returned late or go missing • We burst to clouds to provide reliable computing resources for problematic jobs when needed • LPDS → SZTAKI → Academic Clouds → Amazon EC2 • Cf. Jozsef Kovacs: “Integrating clouds with grid systems – the SZTAKI-BOINC experience” @ 11:30
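Picking the “problematic” jobs mentioned above can be as simple as finding the jobs whose result has not come back by their deadline. The following sketch assumes each job carries a deadline and an optional return time; the `Job` record and `jobs_to_burst` are hypothetical names and do not reflect the BOINC data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    job_id: str
    deadline: float                      # time by which a result is expected
    returned_at: Optional[float] = None  # None while no result has arrived


def jobs_to_burst(jobs, now):
    """Return ids of jobs that are overdue and still have no result.

    These are the candidates for re-running on reliable cloud resources.
    """
    return [job.job_id
            for job in jobs
            if job.returned_at is None and now > job.deadline]


# Example with made-up timestamps (seconds).
jobs = [
    Job("a", deadline=100.0, returned_at=90.0),  # returned in time
    Job("b", deadline=100.0),                    # overdue, no result
    Job("c", deadline=500.0),                    # still within its deadline
]
overdue = jobs_to_burst(jobs, now=200.0)
```

Only job "b" is selected here; jobs that returned a result, even a late one, are left alone in this sketch.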
Summary • Bursting (scaling) consists of the capability + decision making • In this presentation I showed some scenarios from SZTAKI: • Private → {Public, Private}; Volunteer → {Private, Public} • OpenNebula and Cloud-Manager based • The decision making process (i.e., auto-scaling) will be the topic of my presentation on Thursday • “Cloud bursting from WS-PGRADE/gUSE” – Thursday, 11:00-11:30
References and Additional reading
[1] Nair, S. K., Porwal, S., Dimitrakos, T., Ferrer, A. J., Tordsson, J., Sharif, T., Sheridan, C., Rajarajan, M. & Khan, A. U. (2010). Towards secure cloud bursting, brokerage and aggregation. Paper presented at the IEEE European Conference on Web Services, 1 Dec 2010 – 3 Dec 2010, Cyprus.
[2] D. McDysan: Cloud Bursting Use Case. IETF. http://tools.ietf.org/html/draft-mcdysan-sdnp-cloudbursting-usecase-00
[3] National Institute of Standards and Technology (NIST): The NIST Definition of Cloud Computing. September, 2011. http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
[4] SearchCloudComputing. http://searchcloudcomputing.techtarget.com/definition/cloud-bursting
[5] A. Cs. Marosi, G. Kecskemeti, A. Kertesz and P. Kacsuk, FCM: an Architecture for Integrating IaaS Cloud Systems. In Proceedings of The Second International Conference on Cloud Computing, GRIDs, and Virtualization. Rome, Italy. September, 2011.
Thank you! Questions?