140 likes | 172 Views
Explore the use of Amazon Web Services for offloading peak-load requests in processing experimental data and theoretical calculations at ESRF. Gain valuable insights and solutions for cloud-based HPC processing.
E N D
ESRF ExperienceUsing AWS for HPCR. Wilckewith contributions from: C. Ferrero, A. GötzESRF, Grenoble R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
ESRF: European Synchrotron Radiation Facility - international institute for research with (hard) X-rays - molecular biology, physics, chemistry, archaeology, … - electron storage ring with 6 GeV, 844 m circumference - X-ray spectrum: 10 to 120 KeV (0.10 to 0.01 nm wavelength) - 42 experimental stations - 6500 scientific users and » 2000 publications / year R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Data processing at ESRF Want to use cloud for offloading peak-load computing requests Two quite different types of tasks: 1) analysis of experimental data - processing time / dataset typically short (seconds) - but large amounts of data (10 TB / day) - difficult to transmit required data to cloud 2) theoretical calculations and modeling - small amounts of data (a few GB max) - but long processing times (80 cores for 5 days not uncommon) - not difficult to get processing power on cloud Þ move theoretical calculations and modeling to cloud R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Background of Tests Wanted to gain experience with commercial cloud before Helix Nebula becomes operational Main reason: ensure our processes run on commercial cloud: no showstoppers due to platforms, performance, object store, ... 1) approached a broker (Prologue) for advice - Prologue contacted multiple providers - based on their response they proposed Amazon Web Services - Prologue set up account and billing 2) our work was to - learn how to use AWS for HPC - port and run our HPC applications on AWS R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
AWS support for HPC AWS have invested in HPC: https://aws.amazon.com/hpc features: - 12 computing centers (2 in Europe) - Linux and Windows - pre-installation of MPI and Sun Grid Engine - wide range of large machines : < 64 cores + 8 GB RAM/thread - cost 0.05 - 0.10 $ / thread / hour ("on-demand" services) - mass storage space available, cost » 0.03 $ / GB / month - fast interconnect (dedicated 10 Gbit/s network) usage: - cfncluster: command to configure an HPC cluster in 10 min - good documentation - but relatively steep learning curve at the beginning R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Software candidates for Cloud processing R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Installing software on cloud problems encountered: - OS type and version different home and cloud - missing software on cloud (libraries, tools, …) lessons learned: - link programs statically, or - link non-standard libraries statically - avoid proprietary software (compiler, libraries, …) cook-book procedure: - use GNU compilers (gcc, gfortran) - use OpenMP / OpenMPI for parallel programs - pack needed non-standard shared libraries with executable - provide setup script to define required parameters (PATH, …) R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Virtual Machine Images (I) Motivation: - ESRF has its own local setup conventions for software - programs and startup-procedures are set up accordingly - want to have same environment on the cloud, including Debian - solution: create and use private cloud images Þ easier to use and maintain ESRF software on the cloud But: so far not able to make it work - images do not run on Amazon EC2 service - on S3 storage, image does not even start - on EBS storage, image starts but kernel boot does not complete We are working on it! R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Virtual Machine Images (II) Procedure to convert OpenStack images (QCOW2 format) to AMI format for Amazon 10 R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Benchmarking tests ESRF - AWS 11 R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Benchmarking results - good scaling for FDMNES and Geant4 - bad scaling for Quanty (unclear why) - good agreement between predicted and observed performance for programs that scale well - reasonable pricing: » 0.1 $ / hour CPU time (1 core) - easy to use (once you know how) - still some to issues to solve 12 R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
"Snowball" use 10 days: 50 TB 200 $, 80 TB 250 $ Open Issues - payment via purchase order (possible according to Amazon) - billing and user accounting: create individual user accounts to avoid overspending? - licenses for commercial software on cloud - optimize pricing: On-demand, Reserved Instances, Spot Instances - data transfer for large data quantities: ship disks R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016
Conclusions Experience has proved commercial cloud is adapted to HPC Þ do a real user service as next step Helix Nebula should be at least as good as AWS R. Wilcke, Helix Nebula 8th General Assembly, Frascati, 20-22/09/2016