PROOF and AnT in PHOBOS
Kristjan Gulbrandsen
March 25, 2004
Collaboration Meeting
What is PROOF?
• A system integrated into ROOT which allows for interactive analysis of large data sets using parallel processing and I/O
• Transparent – the difference between running a local session and running over multiple computers is minimal
• Adaptable – can react to network conditions, system performance and multiple architectures
• Scalable – no manifest limitations on size
PROOF Architecture
[Diagram: User, Internet, Master, four Slaves]
• Client connects to a master server local to the cluster
• Master server talks to slaves on nodes where (ideally) the data is located
• Slaves run in parallel
• Master server collects results, minimizing slow interaction with the client
TSelector Interface
class TSelector { Begin() SlaveBegin() Process() SlaveTerminate() Terminate() }
Client: Begin(), Terminate()
(n) Slaves: SlaveBegin(), Process() … Process() … Process() …, SlaveTerminate()
Slide annotations: "Create histograms" (setup, before the Process() calls); "code normally in for loops" (Process())
• If a tree exists, tree->MakeSelector() creates a skeleton class deriving from TSelector
• A copy of each object exists in each slave
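The call order on this slide can be sketched without ROOT. The block below is a minimal illustration, not ROOT's TSelector: `HistSelector`, `runSketch`, and the four-bin histogram are hypothetical names and choices made for this example, and the slaves are stepped through sequentially in one process rather than run in parallel. It shows each slave owning its own copy of the histogram and the per-slave copies being summed at the end, which is the merged result the client would work with in Terminate().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed binning for the illustration: 4 unit-width bins covering [0, 4).
constexpr std::size_t kBins = 4;

// Hypothetical selector with the slave-side hooks named on the slide.
// This is NOT ROOT's TSelector; it only mimics the call order.
struct HistSelector {
    std::vector<long> counts;  // each slave holds its own copy

    // Called once per slave before any entry: create the histogram.
    void SlaveBegin() { counts.assign(kBins, 0L); }

    // Called once per entry: the code that would normally sit inside a for loop.
    void Process(double x) {
        if (x >= 0.0 && x < static_cast<double>(kBins)) {
            ++counts[static_cast<std::size_t>(x)];
        }
    }

    // Called once per slave after its last entry; nothing to clean up here.
    void SlaveTerminate() {}
};

// Hypothetical driver standing in for the master: hands entries to the slaves
// round-robin (sequentially, for simplicity) and merges the per-slave histograms.
std::vector<long> runSketch(const std::vector<double>& entries, std::size_t nSlaves) {
    assert(nSlaves > 0);
    std::vector<HistSelector> slaves(nSlaves);  // one selector copy per slave
    for (auto& s : slaves) s.SlaveBegin();

    for (std::size_t i = 0; i < entries.size(); ++i) {
        slaves[i % nSlaves].Process(entries[i]);
    }

    std::vector<long> merged(kBins, 0L);
    for (auto& s : slaves) {
        s.SlaveTerminate();
        for (std::size_t b = 0; b < kBins; ++b) merged[b] += s.counts[b];
    }
    return merged;  // the merged object the client would see in Terminate()
}
```

Because the per-slave histograms are simply summed, the merged result does not depend on how many slaves the entries were split across, which is what lets the same selector code run locally or on a cluster.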
Using PROOF
• Call gROOT->Proof("proof://<cluster>") to begin a PROOF session
• A set of file names must be added to a TDSet, similar to adding files to a TChain
• Call TDSet->Process(<selector file>), where <selector file> contains the TSelector code
• Additional supporting files/libraries can be used by creating PAR files
PROOF Execution
[Diagram: Local PC (root, ana.C, stdout/obj) and Remote PROOF Cluster (node1 to node4, each with *.root files). Legend: proof = master server, proof = slave server. Other labels: TFile, TNetFile.]
#proof.conf
slave node1
slave node2
slave node3
slave node4
Session on the local PC (final build of the slide; earlier builds show root [0] .x ana.C):
$ root
root [0] tree->Process("ana.C")
root [1] gROOT->Proof("remote")
root [2] dset->Process("ana.C")
PROOF in PHOBOS
• PROOF is installed on the Pharm cluster
• The newest ROOT version (4.00/03) is needed and exists in /usr/local/root
• Proofserver is compiled with libnew (for now) to allow PhatII classes to be used without modification
• PhatII structure is ideal for transferring individual libraries among slave nodes
AnT Trees
• A tree format has been created to hold summary information for analyses
• Trees are designed to hold the basic summary information used for analyses and to allow pieces of data to be ignored (not read), decreasing I/O
• TRefs allow partial information to be read in while maintaining the ability to cross-reference information (e.g. tracks referring to their hits)
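The idea of reading only part of an event while keeping cross references can be illustrated in plain C++. This is a hedged sketch and not ROOT's TRef mechanism: `EventSketch`, `Track`, `Hit`, and the loader callback are hypothetical names, and array indices stand in for references. Tracks store indices into the hit array, and the hit array is only fetched the first time a reference is followed, so an analysis that never touches hits never pays for reading them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified records; the field names are illustrative only.
struct Hit {
    int layer;
    double dE;
};

struct Track {
    double mom;
    std::vector<std::size_t> hitRefs;  // indices into the event's hit array
};

// Hypothetical event: tracks are always available, hits are read on demand.
class EventSketch {
public:
    EventSketch(std::vector<Track> tracks,
                std::function<std::vector<Hit>()> hitLoader)
        : tracks_(std::move(tracks)), hitLoader_(std::move(hitLoader)) {}

    const std::vector<Track>& tracks() const { return tracks_; }

    bool hitsLoaded() const { return hitsLoaded_; }

    // Follow a track's references; the hit array is read on first use only.
    std::vector<Hit> hitsOf(const Track& t) {
        if (!hitsLoaded_) {
            hits_ = hitLoader_();  // stands in for reading the hit data from disk
            hitsLoaded_ = true;
        }
        std::vector<Hit> out;
        for (std::size_t idx : t.hitRefs) {
            assert(idx < hits_.size());
            out.push_back(hits_[idx]);
        }
        return out;
    }

private:
    std::vector<Track> tracks_;
    std::function<std::vector<Hit>()> hitLoader_;
    std::vector<Hit> hits_;
    bool hitsLoaded_ = false;
};
```

A loop that only uses `tracks()` never triggers the loader; the first call to `hitsOf()` does, and later calls reuse the hits already in memory.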
AnT Structure
EventInfo: Run, Seq, Ev_No, Date, Time, Polarity, Prim_vtx->
TriggerInfo: IsCol, L0, L1, EOct, ERing, TrgT_Extra[], TrgE_Extra[]
Vertex[]: Status, ID, Prob, Pos[3], Sigma[3]
Tracks[]: PID, Charge, MeandE, SigmadE, Prob, Chi^2, Xprod[3], Mom[3], HitArray[]->
Hits[]: Layer, SensorLabel, dE, Pos[3], Pad[2]
Paddle: TruncMeanP, TruncMeanN, SumP, SumN, TDiff
ZDC: SumP, SumN, TZDCP, TZDCN
TOF Info?
PCAL Info?
HitArrays are being developed
Current AnT Trees
• Prototype AnT trees currently exist on Pharm (10 runs, 56 Seqs) and can be used
• Analysts are needed to use the trees and to report which additions would make them useful for many analyses
Analysis using AnT/PROOF
• AnT/PROOF has been used to generate pt distributions from current data
• Using AnT/PROOF speeds up analysis from an hour to a minute
• Disabling hit read-in speeds up processing by more than a factor of 10
Summary
• PROOF is ready for use on Pharm
• Simple example macros exist explaining how to use PROOF
• AnT trees have been created for quick analysis of large data sets in conjunction with PROOF
• Users are needed to try both PROOF and AnT, to provide feedback on the data format, and to stress the PROOF architecture