Issues and Opportunities of Cloud Federations Massimo Coppola in collaboration with Laura Ricci, Emanuele Carlini, Patrizio Dazzi, Ranieri Baraglia
Summary • Cloud Computing • Where do we come from : HPC, Parallel Computing, Grids, P2P • Federations of Clouds • What and why • What we inherit from our past experiences • Autonomic, P2P, Resource Scheduling • Cloud applied to virtual environments • Business models for cloud federations
Parallelism, to Grid, to Clouds ... • To approach today’s Clouds, and boldly go beyond them, many techniques and theoretical results can be reused • sometimes are reinvented with a different name... • Scheduling and resource management from Parallel and Grid Computing • P2P techniques to cheaply and widely spread information • Autonomic management based on performance models of applications
Grid and Cloud computing with XtreemOS Part 3 - Basic of System Administration Massimo Coppola ISTI-CNR, Italy with contributions by Christine Morin and countless collaborators within XtreemOS Eurosys 2010, Paris XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576
SRDS and RSS • SRDS (Service and Resource Discovery Service) as part of the XtreemOS releases • Requested for node selection by the AEM • New functionalities • Support of multiple underlying DHTs (Scalaris, Overlay Weaver) • Support of XACML policy filters • Support of the new multithreaded DIXI • Tested using up to 500 machines from Grid'5000
XtreemOS System
Contrail IaaS Federation A Contrail Federation integrates multiple Clouds, both public and private, in a common platform. User identities, data, and resources are interoperable within the federation, thanks to • common support for authentication and authorization • common mechanisms for policy definition, monitoring, and enforcement of all aspects of QoS: SLA, QoP, etc. • the basis of a common economic model
Federation Objectives • Develop a Federation support that integrates and actively coordinates SLA management provided by single Cloud providers • Do not disrupt provider’s business model • Cloud administration is not Federation management • Allow exploiting a Federation as a single Cloud • Cloudbursting to and from the Federation • Federation Support must be scalable • Number of apps running, providers, resources, users
Cloud revolutions • Is there a place for “small” Cloud providers? • they offer lower scalability, are not worldwide • Large Cloud providers are subject to contrasting forces • concentrating data centers where management is cheaper • scattering resources across the Internet structure, to reduce networking cost • multimedia streaming and real-time applications enjoy lower latencies and round-trip times, and use less overall bandwidth
Cloud revolutions • Federations as a way to flexibly merge separate providers • Smooth the size disadvantage • Increase the “market size” • Provide a competitive edge as small providers are already geographically distributed
Distributed Architecture • Abstract API is replicated onto each Federation access point (FAP) • FAPs act as brokers, but share a common view • Security, provider status, user actions • FAPs are not restricted to the “local” provider • Policies and authentication/authorization are common • Contention issues • Final resource allocation is on providers • Shared info helps management • Access points are either hosted by a provider, or on independent HW
Holistic approach to QoS • Extend the set of characteristics to be measured on the platform • Protection • Type of security mechanisms which are in place • Auth. Protocols, Encryption mechanisms, Isolation • Privacy • Guarantees offered by storage holder, network infrastructure • Geo-localization • Can have deep legal implications • More in the future • E.g. power consumption: overall power, efficiency
Planning for SLAs • Choose the best provider(s) and map the application on the virtual resources provided • Beside constraints, multiple criteria choice • Many user criteria • Federation has its own goals • balance user satisfaction • balance provider satisfaction • How do you choose the resources? • What if one provider is not enough?
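One simple way to picture the "constraints plus multiple criteria" choice is a filter followed by a weighted ranking. The sketch below is only an illustration under assumptions: the `Offer` fields, the `rank_offers` helper, and the weighted-sum scoring are hypothetical and are not taken from the Contrail federation.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    # Hypothetical description of what one provider offers.
    provider: str
    cost: float        # price per hour, lower is better (assumed > 0)
    latency_ms: float  # expected latency, lower is better (assumed > 0)
    cores: int         # capacity the provider can allocate

def rank_offers(offers, min_cores, weights):
    """Drop offers that violate the hard constraint, then sort the rest
    by a weighted sum of normalized criteria (best offer first)."""
    feasible = [o for o in offers if o.cores >= min_cores]
    if not feasible:
        return []  # no single provider is enough
    max_cost = max(o.cost for o in feasible)
    max_lat = max(o.latency_ms for o in feasible)

    def score(o):
        # Each criterion is scaled to (0, 1]; a lower score is better.
        return (weights["cost"] * o.cost / max_cost
                + weights["latency"] * o.latency_ms / max_lat)

    return sorted(feasible, key=score)

offers = [
    Offer("A", cost=1.0, latency_ms=80.0, cores=16),
    Offer("B", cost=2.0, latency_ms=20.0, cores=16),
    Offer("C", cost=0.5, latency_ms=50.0, cores=4),
]
ranking = rank_offers(offers, min_cores=8, weights={"cost": 0.5, "latency": 0.5})
print([o.provider for o in ranking])
```

An empty result corresponds to the last question on the slide: when no single provider satisfies the constraints, the application has to be split across providers.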
Application and SLA splitting • Application deployment on multiple providers : a federation is more than the sum of its providers • Type and amount of resources needed • Sudden elasticity • Peculiar resource dislocation • Tough issue • Multi-criteria and problem size • Both at SLA negotiation and at run-time • Matching application structure and SLA • Identifying suitable set of providers and mapping
Standard interoperation • Standards are still “flowing” in the Cloud • except de facto ones • Interoperation is mandatory • We are building an open-source OVF toolkit, a standards converter • with INRIA and XLAB • (de)serializes in-memory Java structures from/to OVF and other standards for VM and Application description • will be extended to deal with SLA standards
Future directions • Apply autonomic heuristics to Clouds and Federations, and develop new ones. • New business models to be applied in Cloud Federations • For Service Providers, Federation aggregators and/or end-users • W.r.t the security and trust counterpart: 24/7 UCON authorization and “geographic” SLA constraints
Digital Virtual Environments • Players can move and interact with the surrounding environment • Shared sense of space among players • Modifications of the environment are visible to every player • Area Of Interest (AOI)
Virtual Environments • Complex and challenging applications • High number of players • Near real-time constraints • Quadratic (or cubic) load (bandwidth, CPU) depending on the number of players: seasonal • QoS requirements depend on the user behavior • movements vs interactions
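The quadratic growth can be made concrete with a small count. This sketch assumes the worst case in which every player falls within every other player's AOI and each sends one update to all the others per tick; the function name is hypothetical.

```python
def update_messages_per_tick(players_in_aoi: int) -> int:
    """Messages per tick if each player sends one update to every
    other player in the same area (assumed worst case)."""
    return players_in_aoi * (players_in_aoi - 1)

for n in (10, 100, 1000):
    print(n, update_messages_per_tick(n))
```

Multiplying the number of players by 10 multiplies the message count by roughly 100, which is why peak loads are so expensive to provision for.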
Aim of the work • Distributed architecture for Virtual Environments • scalable in QoS and cost • Exploit the (illusion of) infinite resources of Cloud Computing and the free resources of user machines.
Hybrid Architecture? • Private server-racks are fine... but they are statically sized for the peak load • Pure P2P should scale up... but makes it hard to manage the QoS in limit situations • Only cloud? Costly for large instances • Combination of the Cloud and P2P to support the DVE in an inexpensive and QoS-aware fashion
Cloud & P2P Combination Letting the cloud manage the bootstrap and peak load
Concrete Architecture • State Action Manager (SAM) • manages the state. Medium rate, No error tolerance, Conflicts • Positional Action Manager (PAM) • manages the position. High rate, Some error tolerance, No conflicts
SAM • Cloud IaaS nodes run on a DHT together with user machines • Heuristics decide when to move load from users to the Cloud • Backups for user machines • [Figure: comparison without and with the heuristic]
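The slide does not say which heuristics are used. As a purely illustrative stand-in, the sketch below uses a fixed high-watermark rule: any load above a fraction of a user machine's capacity is moved to the Cloud. The `rebalance` function, the watermark value, and the load units are all assumptions.

```python
def rebalance(peer_load, peer_capacity, high_watermark=0.8):
    """Return (new_peer_load, moved_to_cloud).

    Any load above high_watermark * capacity on a user machine is
    handed over to Cloud nodes (a simple threshold rule, assumed here).
    """
    new_load = {}
    moved = {}
    for peer, load in peer_load.items():
        limit = high_watermark * peer_capacity[peer]
        excess = max(0.0, load - limit)
        new_load[peer] = load - excess
        if excess > 0:
            moved[peer] = excess
    return new_load, moved

new_load, moved = rebalance(
    peer_load={"p1": 50.0, "p2": 95.0},
    peer_capacity={"p1": 100.0, "p2": 100.0},
)
print(new_load, moved)
```

A real heuristic would also weigh the cost of Cloud instances and the churn of user machines; this sketch only shows the direction of the load movement.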
PAM (she likes to gossip!) • “Wisdom of the Crowds” • A best-effort gossip-based algorithm • Storage Cloud as support • Around 70-80% fewer requests to the Cloud • [Figure: percentage of objects retrieved using gossip, for the accurate, slower heuristic and the faster heuristic]
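The idea of asking peers first and using the storage Cloud only as a fallback can be sketched as follows. This is a simplified pull-style lookup over a few random peers, not the authors' gossip protocol; the `retrieve` function, the `caches` layout, and the `fanout` parameter are hypothetical.

```python
import random

def retrieve(obj, peer, caches, fanout, rng):
    """Return where obj was found: 'local', 'gossip' or 'cloud'.

    caches maps each peer name to the set of objects it currently holds.
    """
    if obj in caches[peer]:
        return "local"
    others = [p for p in caches if p != peer]
    # Best effort: ask only a few randomly chosen peers.
    for neighbor in rng.sample(others, min(fanout, len(others))):
        if obj in caches[neighbor]:
            caches[peer].add(obj)
            return "gossip"
    # Nobody asked had it: fall back to the storage Cloud.
    caches[peer].add(obj)
    return "cloud"

rng = random.Random(0)
caches = {"a": set(), "b": {"x"}, "c": {"x"}}
first = retrieve("x", "a", caches, fanout=2, rng=rng)   # both peers hold x
second = retrieve("y", "a", caches, fanout=2, rng=rng)  # nobody holds y
third = retrieve("y", "a", caches, fanout=2, rng=rng)   # now cached locally
print(first, second, third)
```

Every lookup answered by a peer is one request the Cloud does not have to serve, which is the kind of saving the slide reports.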
Workload for Simulations • [Figure: load and number of players] • [Figure: positions of objects/avatars]
What’s next? • Elastic provisioning and Prediction in SAM • Dynamic management of the AOI in PAM
Some References • Carlini, E., M. Coppola, P. Dazzi, L. Ricci, and G. Righetti. “Cloud Federations in Contrail”. Euro-Par 2011: Parallel Processing Workshops, LNCS 7155, 2012. • Carlini, E., M. Coppola, and L. Ricci. “Flexible Load Distribution for Hybrid Distributed Virtual Environments”. Submitted. • Carlini, E., M. Coppola, and L. Ricci. “Gossip-Based Best-Effort Interest Management for Distributed Virtual Environments”. Submitted. • Carlini, E., M. Coppola, and L. Ricci. “Integration of P2P and Clouds to Support Massively Multiuser Virtual Environments”. Network and Systems Support for Games (NetGames), 9th Annual Workshop, IEEE, 2010, pp. 1–6. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=5679660
Beware! • Backup slides behind.
Load Characterization • [Figure with regions labeled Cloud, P2P, Cloud]
PAM: Area Coverage • Finding a subset of areas that maximizes the coverage is an NP-hard problem • Two heuristics: • greedy: slower, but more accurate • score: faster, but less accurate
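For reference, the textbook greedy heuristic for maximum coverage repeatedly picks the area that adds the most still-uncovered items. Whether the "greedy" variant on the slide matches this exactly is an assumption; the `greedy_cover` function and the example data are hypothetical.

```python
def greedy_cover(areas, k):
    """Pick up to k areas, each time taking the one that covers the
    largest number of not-yet-covered items.

    areas maps an area name to the set of items it covers.
    """
    covered = set()
    chosen = []
    remaining = dict(areas)
    for _ in range(k):
        best = max(remaining,
                   key=lambda a: len(remaining[a] - covered),
                   default=None)
        # Stop when no area is left or nothing new can be covered.
        if best is None or not (remaining[best] - covered):
            break
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

areas = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}
chosen, covered = greedy_cover(areas, k=2)
print(chosen, sorted(covered))
```

Each round rescans all remaining areas, which accounts for the "slower" label; a one-pass scoring rule avoids the rescans at the price of a coarser selection.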