
Helix Nebula Big Science in the Cloud

Helix Nebula: Big Science in the Cloud. Micheál Higgins, Enterprise Solutions Architecture, CloudSigma. A Collaboration Initiative: European Commission & relevant projects; user organisations (demand side); European Cloud Computing Strategy; commercial service providers (supply side).


Presentation Transcript


  1. Helix Nebula: Big Science in the Cloud. Micheál Higgins, Enterprise Solutions Architecture, CloudSigma

  2. A Collaboration Initiative
     • European Commission & relevant projects
     • User organisations (demand side)
     • European Cloud Computing Strategy
     • Commercial Service Providers (supply side)

  3. Helix Nebula
     • ATLAS High Energy Physics Cloud Use: to support the computing capacity needs of the ATLAS experiment
     • Genomic Assembly in the Cloud: a new service to simplify large-scale genome analysis, for a deeper insight into evolution and biodiversity
     • SuperSites Exploitation Platform: to create an Earth Observation platform, focusing on earthquake and volcano research
     • Scientific challenges with societal impact
     • Sponsored by user organisations
     • Stretch what is possible with the cloud today

  4. Helix Nebula – The European Science Cloud
     • Concept
       • Big Science needs to begin to use Public Clouds
       • The European GRID is aging out
       • Cloud-burst has a much better TCO
       • Avoid lock-in and Procurement issues
       • Federation and Identity Management
       • Disintermediation of Cloud Vendor Solutions
         • APIs
         • Drive Image formats (KVM, ESX, Xen, etc.)
         • Cost/Billing models
         • Etc.
     • Initial Membership
       • Demand Side: CERN, ESA and EMBL
       • Supply Side: CloudSigma, Atos, T-Systems, Logica, Interoute and 4 non-provider SMEs

  5. Helix Nebula – The European Science Cloud
     • Phase I – Flagship Proof-of-Concepts
       • Q4 2011 to Q4 2012
       • CERN – LHC ATLAS jobs with PanDA/Condor
       • EMBL – De Novo (non-human) Genome with StarCluster
       • ESA – SSEP Earth Observation Site
       • Environment
         • One-to-one tests, vendor API, vendor drive format, etc.
         • Success criteria were binary; no performance or cost data
         • CloudSigma (CS) successful in all 3 and continuing to run ATLAS live
     • Phase II – PoCs with some Disintermediation
       • Q1 2013
       • Many-to-Many
       • Blue Box 0.9 – remove complexity for the Customer
       • enStratus + Integration
     • Phase III – Expanded Membership and Disintermediation
       • More Demand and Supply Side partners
       • Increased Blue Box Functionality – Federated Clouds

  6. Blue Box Maturity Model
     • Release 0.9 – enStratus – January 2013
       • Basic Services Catalog
       • Federation and Identity Management, Token pass-through
       • Web Portal and API Translations
       • Some Image Management and Cost reporting (not Billing), as time allows
     • Release 1.0
       • Service Catalog and Cube Filtering
       • Image Factory and Transport
       • Billing / Payment Module
       • Embedded Monitoring
       • On-screen Provisioning
       • Contextualization
     • Post 1.0
       • Cluster Management plug-ins (StarCluster, SGE, G-POD, etc.)
       • Payment gateways
       • PaaS for Science
       • Recipes and Golden Image Management
       • SLA / OLA Reporting
       • Data Movement and Open Networking
       • Ecosystems
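The Blue Box slide lists "API Translations" without describing a mechanism. As a rough illustration only, the sketch below shows one common way such a layer can work: a single common request is mapped to each supplier's own payload by a per-provider translator. The provider names and both payload layouts are hypothetical; they are not the enStratus, CloudSigma, or any other real vendor API.

```python
# Minimal sketch of an API translation layer (adapter pattern).
# All provider names and payload shapes below are hypothetical.

def to_provider_a(request):
    """Translate the common request into hypothetical provider A's layout (RAM in MB)."""
    return {"cpu": request["cores"], "mem": request["ram_gb"] * 1024}


def to_provider_b(request):
    """Translate the common request into hypothetical provider B's layout (RAM in GB)."""
    return {"vcpus": request["cores"], "memory_gb": request["ram_gb"]}


# Registry of translators; adding a supplier means adding one entry here.
TRANSLATORS = {
    "provider_a": to_provider_a,
    "provider_b": to_provider_b,
}


def translate(provider, request):
    """Return the provider-specific payload for a common request."""
    try:
        translator = TRANSLATORS[provider]
    except KeyError:
        raise ValueError(f"no translator registered for {provider!r}") from None
    return translator(request)


# One common request, expressed once by the customer.
common_request = {"cores": 4, "ram_gb": 8}
payload_a = translate("provider_a", common_request)
payload_b = translate("provider_b", common_request)
```

The customer writes the request once; only the registry knows about supplier differences, which is one way to read the "remove complexity for the Customer" goal on the previous slide.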

  7. The Blue Box 0.9 EC1/EC2

  8. The Blue Box - Production

  9. Helix Nebula – Learnings
     • Not All Things are Public Cloud Suitable
       • This is not the GRID
       • Some science middleware is not cloud friendly
       • IPs and UUIDs change at reboot
     • The Public Cloud is commodity hardware
       • 2.2 to 2.3 GHz CPU is common, less in older clouds
       • 32, 64 and 96 GB maximum RAM sizing – no 1 TB servers
       • Caution: You can’t eat the whole physical server
       • Parts of De Novo are single core, massive map reduce
     • Many science applications do not scale horizontally, yet
       • Software innovation by the Vendors is required
     • Putting Data in the Public Cloud is still perceived as a risk
       • My Firewall is better than your Firewall – or is it?
     • Burst utilization must be somewhat predictable
       • No, you may not have 65,000 large VMs for 2 hours later today
       • What do you mean I have to pay for it?!
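The commodity-hardware point on the learnings slide amounts to a simple sizing check, sketched below under stated assumptions. The function name `plan_vms` and the `splittable` flag are hypothetical; the 96 GB ceiling and the 1 TB figure are the ones quoted on the slide. A job that cannot be split across machines and needs more memory than the largest VM simply does not fit; a job that scales horizontally needs the memory divided across several VMs.

```python
import math


def plan_vms(total_ram_gb, max_vm_ram_gb, splittable):
    """Return how many VMs a job needs, or None if it cannot fit.

    total_ram_gb:  memory the job needs in total
    max_vm_ram_gb: largest VM the provider offers (e.g. 96 per the slide)
    splittable:    True if the job scales horizontally across VMs
    """
    if total_ram_gb <= 0 or max_vm_ram_gb <= 0:
        raise ValueError("memory sizes must be positive")
    if total_ram_gb <= max_vm_ram_gb:
        return 1  # fits on a single VM
    if not splittable:
        return None  # single-node job larger than any available VM
    return math.ceil(total_ram_gb / max_vm_ram_gb)


# A job needing 1 TB (1024 GB) against a 96 GB per-VM ceiling.
single_node = plan_vms(1024, 96, splittable=False)
scaled_out = plan_vms(1024, 96, splittable=True)
```

This ignores CPU, I/O and the hypervisor's own overhead ("you can’t eat the whole physical server"), so real limits are tighter than the sketch suggests.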

  10. CloudSigma
      • Public IaaS Provider since 2008
      • Locations (tier 4+ carrier neutral datacenters)
        • Zurich, Switzerland (Headquarters)
        • Las Vegas, USA
        • Amsterdam (in-work)
        • São Paulo (planned 2012)
      • Key Values:
        • Open Platform and Networking
        • Constant Innovation (big secret: all SSD storage!!)
        • High availability and Up Time SLAs
        • Customer Relationships / Enterprise Architecture Team
        • Standards Adherence (OpenNebula, jclouds, etc.)
        • We only sell IaaS, we partner for SaaS and PaaS
        • The Ecosystem Concept

  11. CloudSigma Features
      • Granular Resources not Bundles
        • CPU, Cores, RAM, Disk, SSD, GPUs, etc. all virtualized
        • Graphic Equalizer – reboot required
        • Allows for the tuning of the Server to the Work-load
        • E.g. Oracle requires 1.5x to 2x Memory of Standard config
        • HEP/HPC applications are also not Typical configurations
      • Open Architecture
        • KVM Hypervisor with full Virtualization, no sniffing, no root
        • Any x86 O/S and Application (no, you can’t run Mac)
        • Public Drives Library, Pay-per-use and Bring-your-own
      • Open Networking – 2x 10 Gb/s NICs
        • SigmaConnect and IX – private back-haul lines
        • Peering (e.g. SWITCH, GÉANT, etc.)
      • No Customer Lock-in – upload and download easily
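The "granular resources, not bundles" idea can be shown with a small sketch: each resource is an independent dial, so memory can be raised by the 1.5x to 2x the slide quotes for Oracle without touching CPU or disk. The `ServerSpec` class, the `tune_ram` helper and the baseline numbers are hypothetical illustrations and do not represent the CloudSigma API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical server specification with independently sized resources."""
    cpu_ghz: float
    ram_gb: float
    disk_gb: float


def tune_ram(spec, factor):
    """Return a copy of spec with RAM scaled by factor; other resources unchanged."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return replace(spec, ram_gb=spec.ram_gb * factor)


# Hypothetical "standard" configuration, used only for illustration.
standard = ServerSpec(cpu_ghz=8.0, ram_gb=16.0, disk_gb=200.0)

# The slide's 1.5x to 2x memory range for an Oracle workload.
oracle_low = tune_ram(standard, 1.5)
oracle_high = tune_ram(standard, 2.0)
```

With fixed bundles, getting the extra memory would typically mean moving to a larger bundle and paying for CPU and disk that the workload does not need.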

  12. CloudSigma Features
      • In-Work for the Future
        • All SSD Storage – only S3 is magnetic
        • JSON API
        • Virtualized GPUs
        • Virtualized H/W for Transcoding and Rendering
        • Virtual Desktop – Command and Control
        • Additional Science Applications: PanDA/Condor, etc.
        • PaaS for Media and PaaS for Science
      • The Ecosystem Concept
        • Hold the Data and the World will come to the Cloud
        • Low to Zero cost Data hosting
        • More margin in CPU and RAM than in Storage
        • Meta meta-data – joining the Databases
          • Known Point + Vector + distance vs. Lat/Lon
          • ESA EO + WHO = Mosquito outbreak predictions
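The "Known Point + Vector + distance vs. Lat/Lon" bullet is terse. One plausible reading, and it is only an assumption, is that some datasets record a location as a reference point plus a bearing and a distance, while others store latitude and longitude directly, so the two must be converted to a common form before they can be joined. The sketch below uses the standard great-circle destination formula on a spherical Earth; the function name and the 6371 km mean radius are choices made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumption)


def destination(lat_deg, lon_deg, bearing_deg, distance_km):
    """Convert known point + bearing + distance into (lat, lon) in degrees."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)          # bearing, clockwise from north
    delta = distance_km / EARTH_RADIUS_KM      # angular distance in radians

    sin_lat2 = (math.sin(lat1) * math.cos(delta)
                + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    sin_lat2 = max(-1.0, min(1.0, sin_lat2))   # guard against rounding drift
    lat2 = math.asin(sin_lat2)

    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * sin_lat2,
    )
    # Normalise longitude into the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg
```

Once both datasets are expressed as latitude and longitude, records can be matched on position, which is the kind of join the "ESA EO + WHO" bullet hints at. For survey-grade accuracy an ellipsoidal model would be needed instead of a sphere.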

  13. The Easiest Way to Understand The CS Ecosystem Concept

  14. Questions / Discussion
