230 likes | 421 Views
Recent PROOF developments. G. Ganis PROOF workshop, 29 November 2007. files. scheduler. The PROOF approach in a nutshell. catalog. Storage. PROOF farm. query. PROOF job: data file list, myAna.C. final outputs (merged). MASTER. feedbacks (merged).
E N D
Recent PROOF developments G. Ganis PROOF workshop, 29 November 2007
files scheduler The PROOF approachin a nutshell catalog Storage PROOF farm query PROOF job: data file list, myAna.C final outputs (merged) MASTER feedbacks (merged) • farm perceived as extension of local PC • same syntax as in local session • more dynamic use of resources • real time feedback • automated splitting and merging G. Ganis, PROOF workshop 2007
Issues addressed by the developments • User interface • Processing of generic jobs • Data set, software handling • Performance and responsiveness • Load balancing within a query • Access to data • Monitoring tools • Processing information at the end of a query • Memory usage • Resource usage control • Enforce priorities • Improve responsiveness in multi-user environment • Testing, tutorials, installation G. Ganis, PROOF workshop 2007
Dataset manager J. Iwaszkiewicz + G. Bruckner + J.F. Grosse-Oetringhaus (more on Jan-Fiete’s talk) • Metadata about a set of files stored in sandbox on the master on dedicated subdirectory • <DatsetDir>/group/user/dataset or <SandBox>/dataset • Data-sets are identified by name • Data-sets can be processed by name • No need to create the chain locally (i.e. on the client) root[0] TProof *proof = TProof::Open(“master”); root[1] TFileCollection fc(“dum”,””,”file.list”); root[2] proof->RegisterDataSet(“MyDataSet”, &fc); root[3] proof->ShowDataSets(); Existing Datasets: MyDataSet root[] proof->Process(“MyDataSet”, “MySelector.C+”); G. Ganis, PROOF workshop 2007
output list Selector • Begin() • Create histos, … • Define output list • Terminate() • Final analysis • (fitting, …) Time Process() analysis 1…N Generic, non-data-driven analysis L. Tran-Thanh Implement algorithm in a TSelector New TProof::Process(const char *selector, Long64_t times) // Open the PROOF session root[0] TProof *p = TProof::Open(“master”) // Run 1000 times the analysis defined in the // MonteCarlo.C TSelector root[1] p->Process(“MonteCarlo.C+”, 1000) G. Ganis, PROOF workshop 2007
Generic, non-data-driven analysis • New packetizer TPacketizerUnit • Time-based packet sizes • Processing speed of each worker measured dynamically • Included in ROOT 5.17/04 G. Ganis, PROOF workshop 2007
Output file merging L. Tran-Thanh • Large output objects (e.g. trees) create memory problems • Solution: • save them in files on the workers • merge the files on the master using TFileMerger • New class TProofFile defines the file and provide tools to handle the merging • Unique file names are created internally to avoid crashes • Merging will happen on the Master at the end of the query • Final file is left in sandbox on the master or saved where the client wishes • Included in ROOT 5.17/04 G. Ganis, PROOF workshop 2007
Output file merging: example void PythiaMC::SlaveBegin(TTree *) { // Meta file object: to be added to the output list fProofFile = new TProofFile(); fOutput->Add(fProofFile); // Output filename (any format understood by TFile::Open) TNamed *outf = (TNamed *) fInput->FindObject(“PROOF_OUTPUTFILE”); if (outf) fProofFile->SetOutputFileName(outf->GetTitle()); // Open the file with a unique name fFile = fProofFile->OpenFile(“RECREATE”); // Create the tree and attach it to the file fTree = new TTree(…); fTree->SetDirectory(fFile); … } Bool_t PythiaMC::Process(Long64_t entry) { fTree->Fill(); } void PythiaMC::SlaveTerminate() { if (fFile) { fFile->cd(); // Write here big objects fTree->Write(); fFile->Close(); } } G. Ganis, PROOF workshop 2007
Software handling • Package handling • Separated behaviour client / cluster for enabling • Real-time feedback during build • API to modify include / library paths on the workers • Use packages globally available on the cluster • Load mechanism extended to single class / macro • Selectors / macros / classes binaries cached • Decreases initialization time if selector did not change • Version check for binaries based also on SVN revision • Support for multiple ROOT versions root[] TProof *proof = TProof::Open(“master”) root[] proof->Load(“MyClass.C”) G. Ganis, PROOF workshop 2007
Software handling • Next steps • Package versioning (e.g. ESD-v1.12.103-new) • Directory structure including also the ROOT version • Filter the selector code into a “client” and “cluster” parts • Clients should not be obliged to load tons of experiment libraries typically needed only for processing on the cluster ~$ pwd <SandBox>/packages/ESD/1.12.103-new/root_v5.17.05-r20920/ESD G. Ganis, PROOF workshop 2007
Load balancing: improved packetizer J. Iwaszkiewicz • Packetizer’s goal: optimize work distribution to process queries as fast as possible • Standard TPacketizer’s strategy • first process local files, than try to process remote data • End-of-query bottleneck Active workers Processing time G. Ganis, PROOF workshop 2007
New strategy: TPacketizerAdaptive • Predict processing time of local files for each worker • Keep assigning remote files from start of the query to workers expected to finish faster • Processing time improved by up to 50% Processing rate for all packets OLD Remote packets NEW Same scale G. Ganis, PROOF workshop 2007
Data access • Tree cache enabled (+ asynchronous reading) • Expect improvements in the case of many users and non-local files • Under study: • Exploit large number of cores and relatively large amount of memory of new machines • Separate thread for unzipping the data • Use xrootd as dynamic pre-loader G. Ganis, PROOF workshop 2007
Monitoring the resource usage • Per-query information • CPU time, wall time, bytes read, events, user, group • Posted by the master via TVirtualMonitorWriter • E.g. MonAlisa, MySQL • Used for monitoring or to correct priorities based on usage history (see M.Meoni’s talk) G. Ganis, PROOF workshop 2007
Memory consumption monitoring A. Kreshuk • Workers monitor their memory usage and save info in the log file • New button in the dialog box to display the evolution of memory usage per node in real time • Client get warned of high usage • The session may be eventually killed • Prototype being tested G. Ganis, PROOF workshop 2007
Motivation for scheduling? • Controlling resources and how they are used • Improving efficiency • assigning to a job those nodes that have data which needs to be analyzed. • Implementing different scheduling policies • e.g. fair share, group priorities & quotas • Efficient use even in case of congestion G. Ganis, PROOF workshop 2007
PROOF specific requirements • Interactive system • Jobs should be processed as soon as submitted. • However when max system throughput is reached some jobs has to postponed • I/O bound jobs use more resources at the start and less at the end (file distribution) • Try to process data at its location for performance • User defines a dataset not the #workers G. Ganis, PROOF workshop 2007
Enforcing experiment priority policies • Based on group priority information defined in dedicated files • Technology • “renice” low priority non-idle sessions • Priority = 20 – nice ( -20 <= nice <= 19) • Limit max priority to avoid over killing the system • May be centrally controlled • Master updates the priorities and broadcast them to the active workers • Feedback mechanism – e.g. via monitoring tool – allows to adjust the priorities (see M.Meoni’s talk) G. Ganis, PROOF workshop 2007
Central Scheduler • Assigning a set of workers for a job based on: • The data set location • User priority (Quota + historical usage) • The current load of the cluster • First implementation: • # of Workers ≈ relativePriority * nFreeCPUs • Assign least loaded workers first • Missing ingredients • Come&Go functionality for workers • Needed also by the Condor interface G. Ganis, PROOF workshop 2007
Dataset Lookup 2: dataset 3: file locations Client PROOF master 4: Job info Scheduler 1: Job {dataset, …} Load, history, policy, … 5: workers 6: workers Start workers Central scheduling • Schematic view G. Ganis, PROOF workshop 2007
Tutorials, Testing, installation • Tutorials • Frame for PROOF examples: • $ROOTSYS/tutorials/proof/runProof.C • Currently available: • « simple »: histogram filling with random entries • « h1-http »: H1 analysis reading data via HTTP • Testing • Frame for PROOF tests: • $ROOTSYS/test/stressProof.C • Installation • Interactive script to simplify the installation of a small cluster • $ROOTSYS/etc/proof/utils/proofinstall.sh G. Ganis, PROOF workshop 2007
PROOF and SVN • PROOF development branch • http://root.cern.ch/svn/root/branches/dev/proof • Synchronized daily with the main trunk • PROOF tags • http://root.cern.ch/svn/root/branches/dev/proof-tags • Specific « snapshots » of the dev branch • Binaries installed on AFS at /afs/cern.ch/sw/lcg/contrib/proof/root G. Ganis, PROOF workshop 2007
Questions? • Credits • G.G., J. Iwaszkiewizc, A. Kreshuk, F. Rademakers, L. Tran-Thanh (summer student ‘07) • G. Bruckner, J.F. Grosse-Oetringhaus, M.Meoni, A. Peters (ALICE) • F. Furano (CERN) • A. Hanushevsky (SLAC) G. Ganis, PROOF workshop 2007