The ALICE Computing. F. Carminati, May 4, 2006, Madrid, Spain
ALICE Collaboration
• ~1/2 the size of ATLAS and CMS, ~2x LHCb
• ~1000 people, 30 countries, ~80 institutes
• Total weight 10,000 t, overall diameter 16.00 m, overall length 25 m, magnetic field 0.4 T
Trigger and data flow:
• 8 kHz (160 GB/s) → level 0 (special hardware)
• 200 Hz (4 GB/s) → level 1 (embedded processors)
• 30 Hz (2.5 GB/s) → level 2 (PCs)
• 30 Hz (1.25 GB/s) → data recording & offline analysis
The history
• Developed since 1998 along a coherent line
• Developed in close collaboration with the ROOT team
• No separate physics and computing teams
  • Minimises communication problems
  • May lead to "double counting" of people
• Used for the simulations and reconstructions of all detector TDRs and of the Computing TDR
The framework (diagram)
• AliRoot, layered on ROOT, with AliEn + LCG as the distributed computing back-end
• Steering (STEER): AliSimulation, AliReconstruction, ESD (a minimal steering-macro sketch follows below)
• Event generators (EVGEN): PYTHIA6 (+PDF), HIJING, MEVSIM, ISAJET
• Transport engines behind the Virtual MC: G3, G4, FLUKA
• Detector modules: ITS, TPC, TRD, TOF, PHOS, EMCAL, MUON, PMD, RICH, ZDC, FMD, START, CRT, STRUCT
• Analysis: AliAnalysis, HBTAN, JETAN, HBTP, RALICE
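The two STEER entry points named in the diagram are what a user actually drives. A minimal sketch, assuming a working AliRoot + ROOT installation; option strings and defaults vary between AliRoot versions, so this is illustrative only:

```cpp
// sim_rec.C -- illustrative ROOT macro: drive the AliRoot steering classes.
void sim_rec()
{
   AliSimulation sim;       // event generation, transport and digitisation
   sim.Run(10);             // simulate 10 events with the configured engine

   AliReconstruction rec;   // local and global reconstruction, ESD writing
   rec.Run();               // reconstruct the events just produced
}
```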
The code
• 0.5 MLOC of C++
• 0.5 MLOC of "vintage" FORTRAN code
• Nightly builds
• Strict coding conventions
• Subset of C++ (no templates, STL or exceptions!)
  • "Simple" C++, fast compilation and linking (see R. Brun's talk)
• No configuration management tools (only cvs)
  • aliroot is a single package to install
• Maintained on several systems
  • DEC Tru64, Mac OS X, Linux RH/SLC/Fedora (i32, i64, AMD), Sun Solaris
• 30% developed at CERN and 70% outside
The tools • Coding convention checker • Reverse engineering • Smell detection • Branch instrumentation • Genetic testing (in preparation) • Aspect Oriented Programming (in preparation)
The Simulation (see A. Morsch's talk; diagram)
• User code talks only to the Virtual MC (VMC) interface
• Concrete transport engines behind the VMC: G3, G4, FLUKA
• Common services shared with the rest of the framework: generators, geometrical modeller, reconstruction, visualisation
(An illustrative configuration fragment follows below.)
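The engine is selected once in the configuration macro; everything else is engine-independent. A minimal sketch, assuming the standard VMC implementations (TGeant3TGeo, TFluka) are installed; constructor arguments and availability depend on the installation:

```cpp
// Config.C -- illustrative fragment of a VMC configuration macro.
void Config()
{
   // Instantiate exactly ONE concrete transport engine; detector and user
   // stepping code only ever see the abstract TVirtualMC interface (gMC).
   new TGeant3TGeo("C++ Interface to Geant3");   // GEANT3 transport
   // new TFluka("C++ Interface to FLUKA", 0);   // or FLUKA transport
   // new TGeant4(...);                          // or GEANT4 transport

   // Physics switches and cuts go through the common interface,
   // independently of the engine chosen above.
   gMC->SetProcess("DCAY", 1);     // decays on
   gMC->SetCut("CUTGAM", 1.e-3);   // photon transport cut, in GeV
}
```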
TGeo modeller
The reconstruction (see P. Hristov's talk)
• Incremental process
  • Forward propagation towards the vertex: TPC → ITS
  • Back propagation: ITS → TPC → TRD → TOF
  • Refit inward: TOF → TRD → TPC → ITS
• Continuous seeding: track segment finding in all detectors (TOF, TRD, TPC, ITS)
• Combinatorial tracking in the ITS
  • Weighted two-track χ² calculated when candidates conflict over a cluster (an illustrative sketch follows below)
  • Effective probability of cluster sharing
  • Probability not to cross a given layer, for secondary particles
(Diagram: two candidates in conflict, resolved into "best track 1" and "best track 2")
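The slide does not give the actual weighting formula, so the following is a purely generic sketch with hypothetical types, names and placeholder weights; it only shows the shape of such a conflict resolution, not the AliRoot tracker code:

```cpp
#include <cmath>

// Hypothetical example (NOT the AliRoot API): two ITS track candidates claim
// the same cluster; each carries its fit chi2 plus the two probabilities the
// slide mentions, and the smaller weighted chi2 keeps the cluster.
struct Candidate {
   double chi2;        // accumulated track-fit chi2
   double sharedProb;  // effective probability of genuine cluster sharing
   double missProb;    // probability (for secondaries) not to cross the layer
};

double WeightedChi2(const Candidate& c)
{
   // Placeholder weighting: penalise unlikely sharing / layer-crossing
   // hypotheses; the real ALICE weights are not reproduced here.
   return c.chi2 - std::log(c.sharedProb + 1e-9) - std::log(c.missProb + 1e-9);
}

bool FirstCandidateWins(const Candidate& a, const Candidate& b)
{
   return WeightedChi2(a) < WeightedChi2(b);
}
```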
Calibration
• From the User Requirements: source, volume, granularity, update frequency, access pattern, run-time environment and dependencies
• (Diagram) The Shuttle collects physics data files and calibration data through APIs from the online systems (DAQ, DCS, ECS, Trigger, HLT), runs the calibration procedures, and publishes calibration classes, calibration files and metadata to the AliEn + LCG file store
• AliRoot and the DCDB access the stored objects through their own APIs (an access sketch follows below)
• API = Application Program Interface
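On the AliRoot side, conditions objects of this kind are fetched through the CDB access classes. A minimal sketch, assuming the AliCDBManager interface; the storage URI, run number and object path are made-up examples:

```cpp
// Illustrative only: retrieve a calibration object through the AliRoot CDB
// manager. Path and storage below are placeholders, not a prescription.
void GetCalib()
{
   AliCDBManager* man = AliCDBManager::Instance();
   man->SetDefaultStorage("alien://folder=/alice/data/OCDB");  // or local://...
   man->SetRun(1234);                                          // example run number

   AliCDBEntry* entry = man->Get("TPC/Calib/Pedestals");       // example path
   if (entry) {
      TObject* calib = entry->GetObject();   // the stored calibration class
      calib->Print();
   }
}
```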
Alignment (diagram)
• Simulation: start from the ideal geometry, then apply misalignment
• Reconstruction: start from the ideal geometry, then apply the alignment procedure, fed by the survey files and by the raw data
Tag architecture (diagram)
• Reconstruction feeds an index builder, which produces a bitmap index of event tags
• A selection on the tags returns a list of event/GUID pairs (guid#{ev1…evn})
• The analysis job distributes these per-GUID event lists to the PROOF workers (proof#1 … proof#n)
Visualisation (see M. Tadel's talk)
ALICE Analysis Basic Concepts
• Analysis models
  • Prompt reco/analysis at T0 using the PROOF infrastructure
  • Batch analysis using the GRID infrastructure
  • Interactive analysis using the PROOF (+GRID) infrastructure
• User interface
  • ALICE users access any GRID infrastructure via the AliEn or ROOT/PROOF UIs
• AliEn
  • Native and "GRID on a GRID" (LCG/EGEE, ARC, OSG)
  • Integrate common components as much as possible: LFC, FTS, WMS, MonALISA, ...
• PROOF/ROOT
  • Single- and multi-tier, static and dynamic PROOF clusters
  • GRID API class: TGrid (virtual) → TAliEn (real)
(A minimal connection sketch follows below.)
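A minimal sketch of the two user entry points just listed, assuming a ROOT build with AliEn support; the PROOF cluster URL is a placeholder, not a real service name:

```cpp
// Illustrative only: the Grid is reached through the abstract TGrid interface
// ("alien://" selects the TAliEn implementation), PROOF through TProof::Open().
void connect()
{
   TGrid::Connect("alien://");        // on success, gGrid points to a TAliEn object

   TProof* p = TProof::Open("proof://user@caf.example.cern.ch");  // placeholder URL
   if (p) p->Print();                 // show the session configuration
}
```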
If you thought this was difficult ... NA49 experiment: a Pb-Pb event
… then what about this! ALICE Pb-Pb central event, Nch(−0.5 < η < 0.5) = 8000
ALICE Collaboration: ~1000 members (63% from CERN Member States), ~30 countries, ~80 institutes
CERN computing power
• "High throughput" computing based on reliable commercial components
• More than 1500 dual-CPU PCs
  • 5000 in 2007
• More than 3 PB of data on disks & tapes
  • >15 PB in 2007
• Far from enough!
EGEE production service (situation on 20 September 2005)
• >180 sites
• >15 000 CPUs
• ~14 000 jobs completed per day
• 20 VOs
• >800 registered users, representing thousands of scientists
http://gridportal.hep.ph.ic.ac.uk/rtm/
ALICE view on the current situation (diagram)
• Middleware evolution: AliEn → AliEn architecture + LCG code → EGEE, alongside EDG → LCG → EGEE (and ARC, OSG, …)
• Experiment-specific services (AliEn' for ALICE) remain on top throughout
Job submission and execution (diagram)
• The user submits a job to the ALICE central services; it enters the ALICE Job Catalogue and is handled by the Optimizer
• Each site runs a VO-Box hosting the ALICE services (Computing Agent, packman, xrootd); it sends workload requests to the central services and execs job agents on the LCG resources (RB, CE, worker nodes)
• The user job on the WN accesses files through xrootd and the local storage (SRM, SA, MSS); locations are resolved from the GUID via the ALICE File Catalogue and LFC down to the SURL
• The job output is registered back in the ALICE catalogues
Distributed analysis (diagram)
• A user job over many events starts from a File Catalogue query that defines the data set (ESDs, AODs); an example query follows below
• The Job Optimizer splits it into sub-jobs (sub-job 1 … sub-job n), grouped by the SE location of the files
• The Job Broker submits each sub-job to a CE with the closest SE, where the processing runs and produces output files 1 … n
• A file-merging job collects the job output
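The File Catalogue query at the top of this chain can be issued directly from an AliEn-enabled ROOT session. A minimal sketch; the catalogue path and file name pattern are made-up examples:

```cpp
// Illustrative only: query the AliEn File Catalogue for ESD files, the same
// kind of query the Job Optimizer performs before splitting by SE location.
void query()
{
   TGrid::Connect("alien://");
   TGridResult* res = gGrid->Query("/alice/sim/2006/", "AliESDs.root");  // example path/pattern
   if (!res) return;

   for (Int_t i = 0; i < res->GetEntries(); ++i)
      printf("%s\n", res->GetKey(i, "turl"));   // transport URL of each file
}
```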
Data Challenge • Last (!) exercise before data taking • Test of the system started with simulation • Up to 3600 jobs running in parallel • Next will be reconstruction and analysis
ALICE computing model
• For pp, similar to the other experiments
  • Quasi-online data distribution and first reconstruction at T0
  • Further reconstructions at the T1s
• For AA, a different model
  • Calibration, alignment, pilot reconstructions and partial data export during data taking
  • Data distribution and first reconstruction at T0 in the four months after the AA run (shutdown)
  • Further reconstructions at the T1s
• T0: first-pass reconstruction, storage of RAW data, calibration data and first-pass ESDs
• T1: subsequent reconstructions and scheduled analysis, storage of a collective copy of RAW and of one copy of data to be safely kept, disk replicas of ESDs and AODs
• T2: simulation and end-user analysis, disk replicas of ESDs and AODs
Core Computing and Software organisation (diagram)
• Offline Board (chair: Computing Coordinator), connecting the Detector Projects, Software Projects, Management Board, DAQ, HLT, Regional Tiers, the International Computing Board, the LCG bodies (SC2, PEB, GDB, POB) and the US/EU Grid coordinations
• Offline Coordination (Deputy PL): resource planning, relations with funding agencies, relations with the C-RRB
• Production Environment Coordination: production environment (simulation, reconstruction & analysis), distributed computing environment, database organisation
• Framework & Infrastructure Coordination: framework development (simulation, reconstruction & analysis), persistency technology, computing data challenges, industrial joint projects, technology tracking, documentation
• Simulation Coordination: detector simulation, physics simulation, physics validation, GEANT4 integration, FLUKA integration, radiation studies, geometrical modeller
• Reconstruction & Physics Software Coordination: tracking, detector reconstruction, global reconstruction, analysis tools, analysis algorithms, physics data challenges, calibration & alignment algorithms
Conclusions
• ALICE has followed a single evolution line for eight years
• Most of the initial choices have been validated by our experience
• Some parts of the framework still have to be populated by the sub-detectors
• Wish us good luck!