The PROOF Benchmark Suite: Measuring PROOF performance
Sangsu Ryu (KISTI, Korea), Gerardo Ganis (CERN)
Measuring PROOF performance
• PROOF aims at speeding up analysis by using N ROOT sessions in parallel
• Scalability vs. N is a natural metric for studying PROOF performance and for understanding the main bottlenecks under different conditions
• The new benchmark suite is a framework to perform scalability measurements on a PROOF cluster in a standard way
Who's supposed to use it?
• PROOF site administrators and private users
  • Check the installation
  • Find bottlenecks
  • Find optimal configuration parameters
• PROOF developers
  • Understand / improve PROOF
Design requirements
• Ease of use
  • The default case must be straightforward to run
  • Fine control is also supported
• Flexibility
  • It must be possible to run {user, experiment}-specific cases
Addresses both PROOF modes
• Data-driven
  • The unit of processing is a set of entries of a TTree fetched from distributed files
  • Typically I/O intensive
  • Can also be network, RAM, or CPU intensive
• Cycle-driven
  • The unit of processing is an independent task, e.g. the generation of MC events
  • Typically CPU-intensive
  • Can also be I/O, network, or RAM intensive
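In both modes the task is implemented as a TSelector (see "User-defined tasks" below). A minimal sketch of a cycle-driven selector, using the standard TSelector interface; the class name MyCPUSel and the workload are illustrative, not the suite's actual code:

#include "TSelector.h"
#include "TRandom3.h"

// Hypothetical cycle-driven selector: each "entry" is an independent cycle.
class MyCPUSel : public TSelector {
   TRandom3 fRndm;   // per-worker random generator providing the CPU load
public:
   virtual void   SlaveBegin(TTree *) { fRndm.SetSeed(0); }
   virtual Bool_t Process(Long64_t /*entry*/) {
      // Burn CPU with intensive random number generation, in the spirit
      // of the default CPU task described later in these slides.
      Double_t sum = 0.;
      for (Int_t i = 0; i < 100000; i++) sum += fRndm.Gaus(0., 1.);
      return kTRUE;
   }
   ClassDef(MyCPUSel, 0)
};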
The new proofbench module
• The suite is made of a set of client-side classes
  • New module proofbench under $ROOTSYS/proof
  • Set of default selectors
  • Set of default PAR packages
  • Steering class TProofBench
• TProofBench
  • Test initialization (open PROOF, set up a file for results)
  • Interface to run the tests
  • Tools to display the results
  • Interface to customize the tests
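A minimal steering session matching the initialization, run, and display steps above. This is a sketch: the output-file constructor argument and the TProofBench::DrawCPU display call are assumptions, to be checked against the class documentation for your ROOT version:

root [] TProofBench pb("<master>", "bench_results.root")  // open PROOF, set up results file (assumed signature)
root [] pb.RunCPU()                                       // run the default CPU test
root [] TProofBench::DrawCPU("bench_results.root")        // display saved results (assumed call)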
proofbench features (cont'd)
• Statistical treatment
  • 4 measurements (default) for each point
  • {Value, error} from {Average, RMS}
• Two types of scan
  • Worker scan: 1 … Nwrk (the usual one …)
  • Core scan: 1 worker/node … N workers/node, to study scalability inside a node
• Possibility to save the performance TTree
• All relevant parameters are configurable
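A minimal sketch of how each point's {value, error} pair follows from the repeated measurements; the function name and signature are illustrative, not the suite's internal code:

#include <cmath>

// Average and RMS of nMeas repeated rate measurements for one scan point.
void PointStats(const double *rate, int nMeas, double &value, double &error)
{
   double sum = 0., sum2 = 0.;
   for (int i = 0; i < nMeas; i++) { sum += rate[i]; sum2 += rate[i]*rate[i]; }
   value = sum / nMeas;                              // value = average
   error = std::sqrt(sum2 / nMeas - value * value);  // error = RMS
}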
Ease of use
root [] TProofBench pb("<master>")
root [] pb.RunCPU()
[Plots: total rate (cycles/s) vs. Nwrk, and rate per worker (cycles/s / Nwrk) vs. Nwrk, with linear fits]
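The linear model used for the fits in these plots can be written as

R(Nwrk) = a + b × Nwrk            (total rate, cycles/s)
R(Nwrk) / Nwrk = a / Nwrk + b     (rate per worker)

where, reading the fit convention as an assumption, b is the asymptotic per-worker rate and a a constant offset.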
Ease of use (2)
root [] TProofBench pb("<master>")
root [] pb.MakeDataSet()
root [] pb.RunDataSetx()
[Plots: processing rate in events/s and in MB/s vs. Nwrk, plus normalized plots]
Default tasks
• Cycle-driven
  • Intensive random number generation to test the CPU scalability
  • Merging of TH3D objects: study the impact of big outputs
• Data-driven
  • Based on $ROOTSYS/test/Event.h, .cxx
  • Study the impact of {file size, event size, …} on I/O device scalability
User-defined tasks
• Change the TSelector to be processed
• Change/add PAR files, if required
• Use an existing dataset

TProofBench::SetCPUSel(const char *selector)
TProofBench::SetDataSel(const char *selector)
TProofBench::SetCPUPar(const char *par)
TProofBench::SetDataPar(const char *par)
TProofBench::RunDataSet(const char *dataset)
TProofBench::RunDataSetx(const char *dataset)

Any user-specific case can be benchmarked (see the sketch below).
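A sketch of a customized data-driven run using only the setters listed above; MyEsdSel, MyEsdSel.par, and /alice/myesds are hypothetical names:

root [] TProofBench pb("<master>")
root [] pb.SetDataSel("MyEsdSel")        // user TSelector for the data-driven test
root [] pb.SetDataPar("MyEsdSel.par")    // PAR file shipping the selector code
root [] pb.RunDataSet("/alice/myesds")   // run on an existing dataset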
Examples
• To illustrate the tool we show some results obtained from runs on
  • the ALICE CERN Analysis Facility (CAF)
  • the ALICE KISTI Analysis Facility (KIAF)
  • PoD clusters on clouds (courtesy of A. Manafov, GSI)
• One example uses a non-default data task for ALICE ESDs (courtesy of ALICE)
• These also show the sort of issues that can be spotted
Example 1: CPU task on ALICE CAF
[Plot: CPU scan; node types lxbsq, lxfssi, lxfssl]
• 58 nodes, 8 cores/node, max 2 workers/node, 2 GB memory/core
• 3 different types of CPU
• The transitions between CPU types are visible on the plot
Example 2: CPU task on a cloud (courtesy of A. Manafov, GSI)
[Plots: CPU scans up to 500 and 971 workers]
• The breakdown of scalability between 200-300 workers is likely due to single-master (packetizer) scalability issues (under study)
Hardware configuration
• KIAF (kiaf.sdfarm.kr, homogeneous)
Example 3: Default data task on KIAF
Example 4: ALICE data task on KIAF
• Non-default data task
• Full read of the esdTree
• Courtesy of ALICE
Availability / documentation
• ROOT versions
  • Available from ROOT 5.29/02 onwards
  • It can be imported into previous versions (see the documentation)
• Webpage: http://root.cern.ch/drupal/content/new-benchmark-framework-tproofbench
Future plans
• Analysis of the results
  • Modeling of dependencies and fits to the relevant parameters
• Tools to analyze the performance tree(s)
  • Better problem digging using per-packet information
• More graphical options
Summary
• The new PROOF benchmark suite provides a standardized framework for performance tests
• It allows scalability to be measured in different conditions, to
  • cross-check an installation
  • spot unusual / unexpected behavior
  • identify places for improvement
  • …
• Distributed with ROOT 5.30