Comparing Grids and Clouds – evolution or revolution?

Presentation Transcript


1. Comparing Grids and Clouds – evolution or revolution? • Marc-Elian Bégin, Six² Sàrl, Geneva, Switzerland • www.sixsq.com • ECHOGRID, Athens, Greece, June 9, 2008

2. Background • This presentation is based on material developed for EGEE: www.eu-egee.org

3. Content • Context of comparative study • Grid: EGEE/gLite • Cloud: Amazon Web Services • Comparison summary • Conclusions • Recommendations

4. Context of comparative study • This presentation is a summary of the report: • “An EGEE Comparative Study: Grids and Clouds – evolution or revolution?”, by Marc-Elian Bégin • https://edms.cern.ch/file/925013/3/EGEE-Grid-Cloud.pdf • Objective: • As cloud computing gains popularity and traction, grid computing needs to be positioned with respect to cloud computing • Compare real implementations and production offerings • EGEE/gLite grid production service • Amazon Web Services, with focus on EC2 and S3 • Outcome: • Identified convergence paths • Recommendations for managing convergence going forward

5. Acknowledgment • Many people provided comments, suggestions and feedback • Special thanks go to: • Bob Jones, CERN • James Casey, CERN • Charles Loomis, CNRS and Six² partner

6. Archeology • Astronomy • Astrophysics • Civil Protection • Comp. Chemistry • Earth Sciences • Finance • Fusion • Geophysics • High Energy Physics • Life Sciences • Multimedia • Material Sciences • … • >250 sites • 48 countries • >50,000 CPUs • >20 PetaBytes • >10,000 users • >150 VOs • >150,000 jobs/day

7. Grid: EGEE/gLite • EGEE highlights: • Federated but separately administered resources (multiple sites, countries and continents) • Heterogeneous resources • Distributed, multiple research user communities grouped in Virtual Organisations (VO) • Mostly publicly funded at local, national and international levels • Range of data models, from massive data sources that are hard to replicate, to transient datasets composed of varied file sizes

8. Grid: EGEE/gLite (2) • Provided services: • Basic services (focus of comparison with AWS) • Computing Element (CE) • Storage Element (SE) • Higher-level services • Workload Management System (WMS) • File & Metadata Catalog Services • File Transfer Service (FTS) • Virtual Organization Management Service (VOMS) • For more info: • Bob Jones, EGEE Project Director, CERN, bob.jones@cern.ch

9. Amazon Web Services • EC2 (Elastic Compute Cloud) is the computing service of Amazon • Based on hardware virtualisation (Xen) • Users request virtual machine instances, pointing to an image (public or private) stored in S3 • Users have full control over each instance (e.g. access as root, if required) • Requests can be issued via SOAP and REST • S3 (Simple Storage Service) is a service for storing and accessing data on the Amazon cloud • From a user’s point of view, S3 is independent from the other Amazon services • Data is organised hierarchically, grouped into buckets (i.e. containers) and objects • Data is accessible via SOAP, REST and BitTorrent
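The slide above describes the basic EC2/S3 usage model: request an instance from a machine image, then store and fetch objects in buckets. Purely as an illustration, the sketch below shows that workflow with the modern boto3 SDK rather than the 2008-era SOAP/Query interfaces the slide refers to; the AMI ID, bucket name and object key are hypothetical placeholders.

```python
# A minimal sketch of the EC2 "request an instance" and S3 "store an object"
# workflow, using the modern boto3 SDK rather than the 2008-era SOAP/Query APIs.
# The AMI ID, bucket name and key below are placeholders, not real resources.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request a virtual machine instance from a (public or private) machine image.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical image identifier
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched instance {instance_id}")

s3 = boto3.client("s3")

# Store and retrieve an object: S3 groups data into buckets and objects.
s3.put_object(Bucket="example-bucket", Key="results/run-001.txt", Body=b"job output")
obj = s3.get_object(Bucket="example-bucket", Key="results/run-001.txt")
print(obj["Body"].read())
```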

10. Amazon Web Services (2) • Other AWS services: • SQS (Simple Queue Service) • SimpleDB • Billing services: DevPay • Elastic IP (Static IPs for Dynamic Cloud Computing) • Multiple Locations

11. Costs • Cost study for a computing upgrade at CERN for the LHC (by Ian Bird, Tony Cass, Bernd Panzer-Steindel and Les Robertson) • Cost summary for providing 40 MSI2000 of computing: • Custom data centre construction: 4.4 MCHF (~2.7 M€) • Using EC2: 92 MCHF (~56.9 M€) • The 4.4 MCHF figure doesn’t include software licence and manpower costs • Comparison is made difficult by the reference Amazon uses for its EC2 Compute Unit • e.g. “One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor” • Our own calculation for 40 MSI2000 on EC2: 57 MCHF (~35.3 M€)
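The slide's point that the comparison hinges on what benchmark rating one EC2 Compute Unit corresponds to can be made concrete with a back-of-envelope sketch. Every input below (SI2000 per ECU, hourly price, exchange rate, rental period) is an illustrative assumption and does not reproduce the report's 57 or 92 MCHF figures; the sketch only shows how strongly the estimate swings with the assumed ECU rating.

```python
# Back-of-envelope sketch of why the EC2 cost estimate depends so strongly on
# the SI2000-per-ECU conversion mentioned on the slide. Every input here is an
# illustrative assumption, not a figure from the EGEE report.

HOURS_PER_YEAR = 24 * 365
PRICE_PER_ECU_HOUR_USD = 0.10   # assumed 2008-era small-instance list price
USD_TO_CHF = 1.05               # assumed exchange rate

def ec2_cost_mchf(target_msi2000: float, si2000_per_ecu: float, years: float) -> float:
    """Cost in MCHF of renting enough ECUs to deliver `target_msi2000` MSI2000."""
    ecus_needed = (target_msi2000 * 1e6) / si2000_per_ecu
    usd = ecus_needed * HOURS_PER_YEAR * years * PRICE_PER_ECU_HOUR_USD
    return usd * USD_TO_CHF / 1e6

# The estimate swings widely with the assumed benchmark rating of one ECU.
for si2000_per_ecu in (500, 1000, 1500):
    cost = ec2_cost_mchf(target_msi2000=40, si2000_per_ecu=si2000_per_ecu, years=3)
    print(f"{si2000_per_ecu:>5} SI2000/ECU -> {cost:6.1f} MCHF over 3 years")
```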

12. Costs: EGEE workload in 2007 • CPU: 114 million hours • Data: 25 PB stored, 11 PB transferred • Estimated cost if performed with Amazon’s EC2 and S3: ~38 M€ ($58,688,679.08, calculated on 17/05/08) • Sources: http://gridview.cern.ch/GRIDVIEW/same_index.php and http://calculator.s3.amazonaws.com/calc5.html

13. High-level deployment of LCG grid resources • Where could the cloud be, given that transferring data across the cloud border costs money?

14. Can BitTorrent help? • Using BitTorrent, transfers are not metered by the cloud when many clients request the same files • Where could the cloud be, given that transferring data across the cloud border costs money?
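For context on the BitTorrent idea: S3 historically exposed a "?torrent" sub-resource that returned a .torrent descriptor for any publicly readable object, so many clients could swarm the payload peer-to-peer instead of each pulling it over the metered HTTP path. A minimal sketch, assuming a hypothetical public bucket and key (the feature has since been deprecated by AWS):

```python
# Minimal sketch of S3's (now deprecated) BitTorrent interface: appending
# "?torrent" to the GET URL of a publicly readable object returned a .torrent
# descriptor, so many clients could swarm the payload instead of each pulling
# it over the metered HTTP path. Bucket and key below are placeholders.
import requests

BUCKET = "example-public-bucket"   # hypothetical, must be publicly readable
KEY = "datasets/calibration.tar"   # hypothetical object key

url = f"https://{BUCKET}.s3.amazonaws.com/{KEY}?torrent"
resp = requests.get(url, timeout=30)
resp.raise_for_status()

with open("calibration.tar.torrent", "wb") as fh:
    fh.write(resp.content)         # hand this descriptor to any BitTorrent client
```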

15. Performance • EC2 and S3 bandwidth performance summary • The conclusion from [6] regarding EC2 -> EC2 transfers is that “basically we’re getting a full gigabit between the instances”.
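The "full gigabit between the instances" figure comes from simple instance-to-instance transfer tests. Purely as an illustration of that kind of measurement (not the methodology used in [6]), here is a minimal socket-based throughput sketch; the host address, port and payload size are arbitrary:

```python
# Sketch of a simple instance-to-instance throughput test: one VM runs
# server(), another runs client() against the first VM's address and reports
# the achieved rate. The address, port and payload size are arbitrary.
import socket
import time

PORT = 5001
CHUNK = b"\0" * (1 << 20)   # 1 MiB per send
TOTAL_MB = 1024             # stream 1 GiB in total

def server() -> None:
    """Accept one connection and drain whatever the client sends."""
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            while conn.recv(1 << 16):
                pass

def client(host: str) -> None:
    """Stream TOTAL_MB megabytes to the server and report the throughput."""
    with socket.create_connection((host, PORT)) as sock:
        start = time.time()
        for _ in range(TOTAL_MB):
            sock.sendall(CHUNK)
    elapsed = time.time() - start
    print(f"{TOTAL_MB / elapsed:.1f} MB/s ({TOTAL_MB * 8 / elapsed:.0f} Mbit/s)")

# Usage: run server() on one instance, then client("10.0.0.12") on another.
```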

16. Performance (2) • Like AWS, CERN has opted for a separation between storage and compute farms • CERN can deliver a sustained 70 GB/s data throughput between the storage and compute farms • A large-scale performance analysis is not available for AWS

17. Scale • Is EC2 (Elastic Compute Cloud) really “elastic”? • The scale of EGEE is already established and well documented • The scale of AWS is unknown, although recent experiments seem to indicate good scaling • Both systems now have SLAs in place, including penalties (partial refunds) from Amazon when they are not honoured • Elastic IP and Multiple Locations provide building blocks for users to deploy resilient services, while • EGEE is already massively distributed (>250 sites)

18. AWS Cloud interfaces • No middleware!! • Resource-side grid middleware?

19. Ease of Use • Key to the success of AWS is the choice of technologies: • HTTP(S)/REST and support for ROA (Resource Oriented Architecture) • Hardware virtualisation (Xen based) • X.509 certificates • This backs up Amazon’s claim that AWS requires “no middleware” (for the user!) • However, the level of service provided by AWS is lower than EGEE’s • For EGEE/gLite, several MB of client middleware are required to use the grid
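To illustrate the "no middleware" access pattern contrasted here, the sketch below performs a plain HTTPS/REST download authenticated with an X.509 client certificate, using only a standard HTTP library; the endpoint URL and certificate paths are hypothetical placeholders:

```python
# Sketch of the "no middleware" access pattern the slide refers to: a plain
# HTTPS/REST call authenticated with an X.509 client certificate, needing only
# a standard HTTP library. URL and certificate paths are placeholders.
import requests

resp = requests.get(
    "https://storage.example.org/rest/files/run-001.root",  # hypothetical endpoint
    cert=("/home/user/.globus/usercert.pem",                # client certificate
          "/home/user/.globus/userkey.pem"),                # and private key
    timeout=30,
)
resp.raise_for_status()

with open("run-001.root", "wb") as fh:
    fh.write(resp.content)
```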

20. Service Mapping • “Ease of use comes at a cost: ‘the cost of simplicity’” • The basic constructs that the EC2 and S3 services offer do not currently meet all the requirements of grid users and do not replace high-level services provided by gLite, e.g.: • File Transfer Service (FTS) • Workload Management System (WMS) • Grid catalogues such as the ARDA Metadata Catalogue (AMGA), the LCG File Catalog (LFC) or GANGA • Are all users using the grid the same way? • Should we revisit the way the grid is used and accessed? • Who should be responsible for providing the different levels of functionality?

21. Collaboration and Virtual Organisations • Grids are used by large and/or distributed communities of collaborators • Virtual Organisations support this concept, with services such as VOMS • Only primitive ACLs are provided by AWS; can we bridge the gap? • Scientific collaborations need resources to be contributed and “connected” to the grid. Can the cloud be “augmented” by custom data centres?
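As an illustration of the "primitive ACLs" contrast with VOMS, the sketch below grants read access on a hypothetical S3 bucket to two explicitly listed AWS accounts via a bucket policy; there is no notion of VO membership, groups or roles, so every collaborator (or a bridging service) has to be enumerated:

```python
# Sketch of the "primitive ACL" model on the AWS side: access is granted per
# bucket/object to individual AWS accounts via ACLs or bucket policies, with no
# notion of a virtual organisation, groups or roles as in VOMS. The bucket name
# and account IDs are placeholders.
import json
import boto3

s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        # Each collaborator's AWS account must be listed explicitly.
        "Principal": {"AWS": ["arn:aws:iam::111111111111:root",
                              "arn:aws:iam::222222222222:root"]},
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-vo-bucket/*",
    }],
}

s3.put_bucket_policy(Bucket="example-vo-bucket", Policy=json.dumps(policy))
```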

22. Application Software Deployment • Grid application software often needs to be installed at data centres for jobs to execute successfully • Several operating systems and platforms are required to host grid jobs • Hardware virtualisation could alleviate these burdens • Grid application software can be “baked” into a virtual image • Data centres do not have to provide a specific operating system – it is defined at the level of the VM • Hardware virtualisation gives the user a high level of control (e.g. root) while preserving control and security for the hosts
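A sketch of the "baking" workflow described above, using boto3: launch a worker from a base image, install the application stack on it (out of band), then snapshot it as a reusable private image. The image and instance identifiers are placeholders:

```python
# Sketch of "baking" grid application software into a virtual machine image:
# start from a base image, install the application stack on the running
# instance (not shown), then snapshot it as a reusable private image. The
# image ID and instance ID are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# 1. Launch a worker from a plain base image (placeholder AMI ID).
run = ec2.run_instances(ImageId="ami-0123456789abcdef0",
                        InstanceType="t3.micro", MinCount=1, MaxCount=1)
instance_id = run["Instances"][0]["InstanceId"]

# 2. Out of band: log in and install the experiment's software stack.

# 3. Snapshot the configured instance as a private image that any data centre
#    (or any EC2 region) can instantiate without pre-installing anything.
image = ec2.create_image(InstanceId=instance_id,
                         Name="experiment-sw-v1",
                         Description="Base OS plus grid application software")
print("New image:", image["ImageId"])
```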

23. Interoperability • Assuming that several cloud computing providers come to be… • Which interfaces matter? BOTH!!!

24. Standards • Since “simple is beautiful”, if the interfaces proposed by cloud services like AWS become popular with grid users, they might change the standardisation landscape • HTTP, REST, Xen and BitTorrent are already largely standardised • What is left at that level: • REST access to storage • Virtual image formats • Instantiation API (perhaps based on REST) • Metering interfaces (including monitoring) • A reference open source implementation is missing • What about higher-level services? Which ones?
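As a toy illustration of what a standardisable "REST access to storage" interface might minimally look like (objects addressed by URL path, stored with PUT, fetched with GET), the following in-memory sketch uses only the Python standard library; it is illustrative, not a proposed specification:

```python
# Toy sketch of a minimal "REST access to storage" interface: objects are
# addressed by URL path and manipulated with plain HTTP verbs (PUT to store,
# GET to fetch). In-memory only, no authentication; purely illustrative.
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE: dict[str, bytes] = {}

class ObjectStore(BaseHTTPRequestHandler):
    def do_PUT(self):
        # Store the request body under the request path.
        length = int(self.headers.get("Content-Length", 0))
        STORE[self.path] = self.rfile.read(length)
        self.send_response(201)
        self.end_headers()

    def do_GET(self):
        # Return the object stored under the request path, or 404.
        data = STORE.get(self.path)
        if data is None:
            self.send_response(404)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("", 8080), ObjectStore).serve_forever()
```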

25. Conclusions • Cloud computing is gaining traction, especially with the Amazon Web Services (AWS) commercial offering • The grid (e.g. EGEE) has a larger scope; however, technological choices and simple interfaces like those of AWS are relevant to the grid world • The question “what usage pattern will emerge in the coming years?” remains unanswered and will have to be carefully tracked • None of the resources contributed to the EGEE grid come from commercial offerings such as Amazon. Will this change? • Technologies such as REST, HTTP, hardware virtualisation and BitTorrent could displace existing ways of accessing grid resources

26. Conclusions (2) • EGEE has an opportunity to lead the next-generation e-Infrastructure by integrating new advancements such as cloud computing • Hardware virtualisation could lower the operations cost of large infrastructures • It is important that new development does not distract from ensuring the continuity of the current production grid • A roadmap should be defined to include cloud technology in current e-Infrastructures in an incremental and harmonious fashion

27. Recommendations • Promote/support the development of an open source cloud middleware distribution, based on interfaces similar to current commercial offerings • Promote the standardisation of the cloud, with the above-mentioned implementation as a potential reference • Identify a convergence path between cloud services such as EC2 and S3 and the current EGEE security model based on VOMS • Virtualise all key grid services (e.g. information system, metadata catalogues, security services) with the goal of being able to deploy them on EC2-like resources • Promote/lobby the need for experiments (e.g. LHC/HEP, life sciences) and other grid users to virtualise their applications, with the goal of being able to deploy them on EC2-like resources • As a follow-on to the previous point, promote/lobby the need for all service dependencies of grid user applications to be virtualised as well • Launch/support a feasibility study to verify that monitoring of cloud jobs can be performed at the hypervisor level, so that monitoring is independent of the virtualised applications • Upgrade current metadata catalogues to support HTTP(S) endpoints and S3-like metadata • Explore the feasibility of running BitTorrent on grid sites
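For the "S3-like metadata" recommendation, the sketch below shows the model being referred to: small key/value pairs attached to an object at upload time and returned on a HEAD request, which is roughly what an HTTP(S)-fronted grid catalogue would need to expose. The bucket, key and metadata values are hypothetical:

```python
# Sketch of the "S3-like metadata" model mentioned in the recommendations:
# small key/value pairs travel with each object and come back on a HEAD
# request. Bucket, key and metadata values are placeholders.
import boto3

s3 = boto3.client("s3")

s3.put_object(
    Bucket="example-catalogue",
    Key="lfn/grid/atlas/run-001.root",      # hypothetical logical file name
    Body=b"...file contents...",
    Metadata={"vo": "atlas", "checksum": "ad:12345678", "replicas": "2"},
)

head = s3.head_object(Bucket="example-catalogue", Key="lfn/grid/atlas/run-001.root")
print(head["Metadata"])   # {'vo': 'atlas', 'checksum': 'ad:12345678', 'replicas': '2'}
```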
