1 / 10

Challenges in Compute Management and Provisioning

Challenges in Compute Management and Provisioning. Tim Bell CERN tim.bell@cern.ch 10 th December 2013. What Happened Last Time?. Now–Worldwide LHC Computing. Typical simultaneous jobs: more than 250,000 Global Transfer rate: 12-15 billion bytes per second. Current Models.

mercia
Download Presentation

Challenges in Compute Management and Provisioning

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Challenges inCompute Management and Provisioning Tim Bell CERN tim.bell@cern.ch 10th December 2013 Openlab Compute Challenges

  2. What Happened Last Time? Openlab Compute Challenges

  3. Now–Worldwide LHC Computing Typical simultaneous jobs: more than 250,000 Global Transfer rate: 12-15 billion bytes per second Openlab Compute Challenges

  4. Current Models • Software limits are being reached • Scheduling–7K workers handing 100K jobs • Flexibility–Linux migrations • Staffing limits • Significant effort to maintain code base • Operations cost is high • Migrations require major team effort • Expect continued growth of compute • 300,000 cores by 2015 Openlab Compute Challenges

  5. Learn from Others • Move towards a cloud model • OpenStack based IaaS • PaaS and SaaS to come • Exploit open source software • Strong communities • Staff motivation • Replication to other labs Openlab Compute Challenges

  6. Challenge: Scale • Running OpenStack with 300,000 cores will be non-trivial • Simple operations become major • Upgrading while running production workloads • Deploying kernel upgrades on hypervisors • Installing the latest hardware deliveries Openlab Compute Challenges

  7. Challenge: Efficiency • Need to measure efficiency • Get the most out of what we’ve got • Analytics for low efficiency identification • Minimise overhead • Containers • Hypervisors • Maximise usage • Quota vs spot market • Minimise cost • Pricing services (Private, Public) • Optimise for performance • Optimise for power Openlab Compute Challenges

  8. Challenge: Management • Service Levels • What is affordable and achievable ? • How can we measure it ? • Capacity Planning and Quota • Elastic resources • Delegation of resource management • Fair share allocation • Process optimisation • Hardware repair window vs redundancy Openlab Compute Challenges

  9. Challenge: Research Usability • Choose my service • Single federated account • Flexibility • OS version, CPUs, RAM, Disk • Self-Service • No tickets • CLIs and one-click interfaces • Support sharing of applications • Transparent costing • Simple model Openlab Compute Challenges

  10. Conclusions • Compute Management and Provisioning is undergoing dramatic transformation • Agility • Scale • Many challenges ahead for CERN • These are common with others • Solve them together Openlab Compute Challenges

More Related