Status of reference workloads, packaging and benchmarking
Domenico Giordano, Michele Michelotto, Andrea P. Sciabà
HOW Workshop, 20-03-2019
Outline • Reference workloads • Containerization of workloads • Benchmarking applications
Reference workloads • All LHC experiments provide and maintain instructions to manually submit jobs representative of their main workloads • Simulation, digitisation, reconstruction, … • Analysis is still missing • Instructions are currently in Google docs • Also in GitLab for some of them • The initial goal was to have fixed references to be used for performance studies • They became natural candidates for a HEP benchmark
Description • ALICE (link) (GitLab) • A simple p-p simulation job using Geant3 • ATLAS (link) • A ttbar simulation job using Geant4 (GitLab) • A digitization + reconstruction job • A DxAOD derivation job • CMS (link) • Generation + simulation of ttbar events (GitLab) • Digitization and trigger simulation with premixed pile-up • Reconstruction job producing AODSIM and MINIAODSIM • LHCb (link) • Generation + simulation using Geant4 of D*(→π(D0→Kπ))πππ events (GitLab)
Summary of performance metrics (reference machine: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz) • https://cernbox.cern.ch/index.php/apps/files/?dir=/__myprojects/wlcg-cost-model/Workloads/sciaba&
Status of benchmarking in WLCG • Still using HS06 after a decade, even though its scaling with “real” LHC applications is poor • First experiences with SPEC CPU 2017 show that it is no better in this respect • DB12 was proposed at some point, but found to be inadequate • Dominated by front-end calls and the branch prediction units in the Python interpreter code (M. Alef, KIT)
Quantitative comparison • Used Trident (see next talk) to quantitatively compare how the CPU is used by LHC applications vs. synthetic benchmarks • Percentages of time spent in the front-end vs. back-end vs. bad speculation • Used as coordinates • Measured the L1 distance in this 3D space (see the sketch below) • Memory transactions and bandwidth usage also compared
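For illustration, the L1 comparison boils down to a few lines of Python; the profile numbers below are made-up placeholders, not the Trident measurements.

# Minimal sketch: each workload is a point (front-end %, back-end %,
# bad-speculation %) and similarity is the L1 (Manhattan) distance
# between points. Values are illustrative only.

def l1_distance(a, b):
    """L1 distance between two equal-length tuples of percentages."""
    return sum(abs(x - y) for x, y in zip(a, b))

# (front-end, back-end, bad speculation) -- placeholder numbers
profiles = {
    "cms_gen_sim":  (30.0, 45.0, 10.0),
    "atlas_sim":    (28.0, 47.0, 11.0),
    "hs06_bench_x": (15.0, 60.0,  5.0),
}

ref = profiles["cms_gen_sim"]
for name, vec in profiles.items():
    print(f"{name:14s} L1 distance to cms_gen_sim: {l1_distance(ref, vec):5.1f}")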
Hierarchical clustering of benchmarks • All LHC applications cluster together (apart from ALICE gensim) • Rather far from the clusters consisting of HS06 or SPEC CPU 2017 benchmarks • A strong argument in favour of building an LHC benchmark from LHC applications
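A sketch of how such a clustering can be obtained with SciPy from the same three-component profiles; the workload names and profile values are again made-up placeholders, not the measured data.

# Sketch: hierarchical clustering of workload profiles with SciPy.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

names = ["cms_gen_sim", "atlas_sim", "lhcb_gen_sim", "hs06_a", "spec2017_b"]
profiles = np.array([
    [30.0, 45.0, 10.0],
    [28.0, 47.0, 11.0],
    [31.0, 44.0,  9.0],
    [15.0, 60.0,  5.0],
    [12.0, 63.0,  6.0],
])

# Pairwise L1 (cityblock) distances, then average-linkage clustering
dist = pdist(profiles, metric="cityblock")
tree = linkage(dist, method="average")

# Cut the dendrogram into two clusters and print the assignment
labels = fcluster(tree, t=2, criterion="maxclust")
for name, label in zip(names, labels):
    print(f"{name:12s} -> cluster {label}")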
Benchmark requirements • Requirements • No need for network connectivity • Package not too large • Easy to use • Reproducible results • Ingredients • SW repository (CVMFS) • Input data (event and conditions data) • An orchestrator script per workload (see the sketch below) • Sets the environment • Runs the application • Parses the output to generate scores • Packaging • Also contains the needed SW from CVMFS and the input data
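A minimal sketch of such an orchestrator, assuming a hypothetical wrapper script run_workload.sh, an illustrative log format and a /results mount point; the real hep-workloads orchestrators differ in detail.

# Illustrative orchestrator: set the environment, run the application,
# parse its log and emit a JSON score. All paths, commands and regular
# expressions are placeholders.
import json
import os
import re
import subprocess

RESULTS_DIR = "/results"          # bind-mounted by docker/singularity
APP_CMD = ["./run_workload.sh"]   # hypothetical wrapper around the experiment job

def main():
    # 1. Set the environment (e.g. point to the software exported from CVMFS)
    env = dict(os.environ, WL_SW_DIR="/cvmfs-local/sw")

    # 2. Run the application and capture its output
    proc = subprocess.run(APP_CMD, env=env, capture_output=True, text=True)

    # 3. Parse the output to extract an event throughput (events/s)
    match = re.search(r"events/s\s*=\s*([\d.]+)", proc.stdout)
    throughput = float(match.group(1)) if match else None

    # 4. Write the score in JSON for the benchmark driver to collect
    with open(os.path.join(RESULTS_DIR, "score.json"), "w") as f:
        json.dump({"throughput_score": throughput}, f)

if __name__ == "__main__":
    main()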
Repository and Continuous Integration (CI) • A suite of HEP benchmarks requires stable procedures to build and distribute the benchmarking workloads. We have built an effective and user-friendly infrastructure, leveraging • CVMFS Trace and Export [5] utilities to export the workloads’ software from cvmfs to a local folder • GitLab CI/CD for fully automated continuous integration • GitLab Registry for container distribution • Experts from the experiments focus on providing the HEP workloads: software, data, result parser • Experts on benchmarking focus on running the containers and profiling the compute resources • Two container solutions offered: Docker and Singularity • Simple instructions to run the benchmark and produce the results in JSON format • $ docker run -v /some_path:/results $IMAGE • $ singularity run -B /some_path:/results docker://$IMAGE
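As an illustration, the JSON written into the mounted results directory can then be collected with a few lines of Python; the file name pattern and the "througput_score" field are taken from the example output shown later, everything else is an assumption.

# Read the benchmark result JSON written into the bind-mounted /some_path.
import glob
import json

for path in glob.glob("/some_path/*.json"):
    with open(path) as f:
        result = json.load(f)
    score = result.get("througput_score", {}).get("score")
    print(f"{result.get('app', 'unknown workload')}: throughput score = {score}")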
Build procedure • Starting from the GitLab repo of the orchestrator and CI scripts • (1) Build the interim HEP WL image • (2) Generate cvmfs traces by running the HEP WL in the container, with cvmfs mounted • (3) Export the accessed cvmfs files to a local folder in the container (standalone) • (4) Validate the standalone container and publish the image (Docker and Singularity)
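A hedged sketch of the four steps as Python-driven docker calls: image names, tags, container names and mount points are placeholders, the cvmfs export command itself is omitted, and the real procedure lives in the GitLab CI of the hep-workloads repository.

# Illustrative sketch of the four build steps; not the actual CI pipeline.
import subprocess

INTERIM = "registry.example.cern/hep-workloads/cms-gen-sim:interim"  # placeholder
FINAL   = "registry.example.cern/hep-workloads/cms-gen-sim:latest"   # placeholder

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# (1) Build the interim image from the orchestrator and CI scripts
run(["docker", "build", "-t", INTERIM, "."])

# (2)+(3) Run the workload with /cvmfs mounted so the tracer records the
#         accessed files; the CVMFS Trace and Export utilities copy them
#         into a local folder inside the container (export command not shown)
run(["docker", "run", "--name", "wl-trace", "-v", "/cvmfs:/cvmfs:ro", INTERIM])
run(["docker", "commit", "wl-trace", FINAL])

# (4) Validate the standalone container (no /cvmfs mount) and publish it
run(["docker", "run", FINAL])
run(["docker", "push", FINAL])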
Current status of benchmark • All (GEN-)SIM workloads available as containers • https://gitlab.cern.ch/hep-benchmarks/hep-workloads • https://twiki.cern.ch/twiki/bin/view/HEPIX/HEP-Workloads • DIGI and RECO are WIP • Will also include a ROOT analysis job • Useful to select hardware for analysis workloads • Support from experiment experts • On containerizing the workloads • On how to best extract a score • Example output: {"copies":10, "threads_x_copy":4, "events_x_thread": 100, "througput_score": {"score": 0.9910, "avg": 0.0991, "median": 0.0990, "min": 0.0985, "max": 0.1000}, "CPU_score": {"score": 0.2497, "avg": 0.0250, "median": 0.0249, "min": 0.0248, "max": 0.0253}, "app": "CMSSW_10_2_9 GEN-SIM TTBAR"}
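Judging from the example output above, the throughput score appears to be the sum of the per-copy throughputs, with avg/median/min/max summarising their spread; the sketch below is a hedged reconstruction of that aggregation (the exact definition used by the suite may differ, and the input numbers are made up).

# Hedged reconstruction of the score aggregation suggested by the JSON above.
import json
import statistics

def aggregate(per_copy_throughputs):
    values = sorted(per_copy_throughputs)
    return {
        "score": round(sum(values), 4),
        "avg": round(statistics.mean(values), 4),
        "median": round(statistics.median(values), 4),
        "min": round(min(values), 4),
        "max": round(max(values), 4),
    }

# Ten parallel copies, each reporting an event throughput (illustrative values)
throughputs = [0.0985, 0.0990, 0.0991, 0.0988, 0.0992,
               0.0995, 0.1000, 0.0989, 0.0990, 0.0990]

print(json.dumps({"copies": len(throughputs),
                  "througput_score": aggregate(throughputs)}, indent=2))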
Open Data files • The experiments must provide data files to be inserted in the containers • E.g. a snapshot of the conditions database to run GEN and SIM • Minimum bias events for the DIGI benchmark • These files should be OPEN ACCESS if we need to distribute the benchmark outside HEPiX/WLCG • E.g. to vendors that want to participate in a tender for the procurement of future worker nodes • This is a policy issue, not a technical one
Conclusions • A common effort between the performance modelling and cost model working group and the HEPiX benchmarking working group • From simple performance studies to a working and already usable LHC benchmarking suite • Next steps • Add the remaining workloads • Finalize the score calculations • Start using it for real-world benchmarking