CMS Computing (Calcolo CMS)
M. Paganoni, CSN1, 18/9/07
Goals of Computing in 2007
• Support of global data taking during detector commissioning
  • commissioning of the end-to-end chain: P5 --> T0 --> T1s (tape)
  • data transfers and access through the complete DM system
  • 3-4 days every month starting in May
• Demonstrate physics analysis performance using the final software with high statistics (validation)
  • major MC production of up to 200M events started in March
  • analysis starts in June, finishes by September
• Ramp up the distributed computing at scale (CSA07)
  • challenge at 50% of the 2008 system scale
  • adding new functionalities: HLT farm (DAQ storage manager -> T0), T1-T1 and non-regional T1-T2 transfers
  • increase the user load for physics analysis
  • the crucial part starts on September 24th
Data processing: roles of the Tiers
• T0: prompt reconstruction, analysis/calibration object generation, ...
• Skimming
  • data selection at T1s based on physics filters (followed by data transfer to T2s for analysis)
  • exercised at large scale during CSA06 only (several filters running)
• Re-reconstruction
  • data reprocessing at T1s with access to new calibration constants
  • successfully demonstrated during CSA06 (0.1-1 k reco evts/T1, reading conditions data from the local FroNTier)
  • reprocessing-like DIGI-RECO MC production ramping up at T1s
• Simulation
  • submission of jobs (simulation code) at T2s
  • for the time being also at T1s (to catch all available SL4 resources)
• Analysis
  • submission of jobs (analysis code) at T2s
  • analysis users complemented by robotic submissions
  • for the time being also at T1s (due to MC data placed there)
Data processing: concepts and tools
• Computing Model: a data-placement based approach
  • jobs are sent where the data is, no data transfer in response to jobs
  • the data placement system calls for data replication prior to processing (if needed)
  • data is read directly from the storage system via posix-like I/O access protocols (no stage-in from local storage to the WNs)
• Computing Model: organized vs 'unpredictable' processing
  • scheduled data processing at T1s
  • organized MC production at T2s
  • 'unpredictable' user analysis at T2s
• Tools
  • ProdAgent: the unique CMS workflow management tool for MC production, data skimming and reprocessing (also reconstruction at Tier-0)
  • CRAB: the CMS tool for data analysis
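The data-placement approach above can be illustrated with a minimal sketch (this is not CMS code; the dataset name, site names and slot counts are invented for illustration): a job is routed to a site that already hosts a replica of its dataset, instead of moving data on demand.

```python
# Illustrative sketch of "jobs are sent where the data is":
# given a replica catalogue, pick a hosting site rather than transfer data.

def choose_site(dataset, replicas, site_free_slots):
    """Pick the replica-hosting site with the most free batch slots.

    replicas: dataset name -> list of sites holding a replica
    site_free_slots: site name -> number of free slots
    """
    hosts = replicas.get(dataset, [])
    if not hosts:
        # In the real model this would trigger data placement first.
        raise LookupError(f"no replica of {dataset}: request replication first")
    return max(hosts, key=lambda s: site_free_slots.get(s, 0))

# Hypothetical catalogue entries:
replicas = {"/TTbar/CSA07/RECO": ["T2_IT_Pisa", "T2_IT_Legnaro"]}
slots = {"T2_IT_Pisa": 120, "T2_IT_Legnaro": 40}
print(choose_site("/TTbar/CSA07/RECO", replicas, slots))  # -> T2_IT_Pisa
```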
CSA07 workflow
[Diagram: prompt reconstruction and calibration at the T0, data transfer to the T1s for skims and re-reconstruction, MC production upload from the T2s, skim download to the T2s]
Possible analysis workflows in CSA07
• First pass at Tier-0/CAF: RAW -> RECO, AOD
• RECO and AOD shipped to the Tier-1s
• Central analysis skims at the Tier-1s (AOD + analysis algorithms); skim output shipped to the Tier-2s
• Final analysis pre-selection at the Tier-2s: further selection, reduced output (fewer AOD collections)
• Final samples shipped to the Tier-3s: fast processing and FWLite
CSA07 goals
• CMS planned a Computing, Software and Analysis data challenge (Jun-Oct 07) to test the status of the system one year from data taking
• All the goals are scaled to 50% of the final ones
• 7 T1s and ~40 T2s participate
T0 -> T1 transfer Reliable tape archiving/retrieval still an issue
SAM for CMS
• SAM test outputs are published on the web (with per-site history and log files) to compile a site quality map
• They define a metric and are used to construct an availability matrix
• 37% of all sites, and 62% of Tier-0/1 sites, are over the 80% availability mark
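The availability metric can be sketched as follows (a toy illustration, not the SAM implementation; the site names and pass/fail records are invented):

```python
# Toy sketch of a SAM-style availability metric: each site has a list of
# test outcomes (True = passed), and sites are compared against a threshold.

def availability(results):
    """Fraction of SAM tests a site passed."""
    return sum(results) / len(results)

def fraction_over(site_results, threshold=0.80):
    """Fraction of sites whose availability meets the threshold."""
    ok = sum(1 for r in site_results.values() if availability(r) >= threshold)
    return ok / len(site_results)

sites = {
    "T1_IT_CNAF": [True] * 9 + [False],      # 90% available
    "T2_IT_Bari": [True] * 7 + [False] * 3,  # 70% available
}
print(fraction_over(sites))  # half of these toy sites clear the 80% mark
```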
Data processing: 1 year
• Currently >20k jobs/day
• Massive MC production
• Constant, significant presence of analysis jobs, complemented by JobRobot-driven submissions
• Middleware tests, CRAB Analysis Server tests, ...
[Plot: terminated jobs per 6 days over the past year, showing in sequence middleware testing, CSA06 job submission (dominated by the JobRobot), steady user analysis, more JobRobot activity, and massive MC production for CSA07]
JobRobot
• A simple wrapper around the job submission system
• Submits short analysis jobs via the Grid (EGEE/OSG), ~100 jobs/day/site to all CMS Tier-1/2s
• Web access to statistics and log files
• Same data, same code, same user at all Tiers, every day: the load is flat, and tuned by the robot managers
• A standard candle for user analysis: "why then do those jobs fail?"
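The idea can be sketched in a few lines (a minimal illustration, not the real JobRobot; the failure rate and submission function are invented stand-ins for Grid submission):

```python
# Minimal sketch of the JobRobot idea: submit the same short job to every
# site at a fixed daily rate and record per-site success rates.
import random

def submit_probe(site, rng):
    """Hypothetical stand-in for one Grid job; True = job succeeded."""
    return rng.random() > 0.1  # assume a 10% failure rate for illustration

def run_robot(sites, jobs_per_site=100, seed=0):
    """Return the measured success rate per site after one daily cycle."""
    rng = random.Random(seed)
    return {
        site: sum(submit_probe(site, rng) for _ in range(jobs_per_site)) / jobs_per_site
        for site in sites
    }

stats = run_robot(["T2_IT_Pisa", "T2_IT_Bari"])
print(stats)
```

Because the workload is identical everywhere, a site whose success rate drops below the others stands out immediately, which is what makes it a useful standard candle.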
LoadTest07
• Goal T2 -> *: ~2.5 TB/week; goal * -> T2: >12 TB/week
[Plots: weekly transfer volumes to and from CNAF, LNL, Roma, Bari and Pisa]

CSA07 MC event production
[Plots: produced events at Roma, LNL, Pisa and Bari]
Moving MC events around
• 65 Mevt/month (23 evt/s)
• 20 kjobs/day
• Occupancy ~50%
• Job efficiency ~80%
• Data moved >1 PB/month
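As a back-of-the-envelope check of these production figures (a sketch only; a full 30-day month gives ~25 evt/s, so the quoted ~23 evt/s presumably folds in some downtime):

```python
# Back-of-the-envelope check of the MC production rates quoted above.
events_per_month = 65e6
seconds_per_month = 30 * 86400

rate = events_per_month / seconds_per_month
print(f"{rate:.1f} evt/s")  # ~25 evt/s at full duty cycle; ~23 evt/s with downtime

jobs_per_day = 20_000
evts_per_job = events_per_month / 30 / jobs_per_day
print(f"{evts_per_job:.0f} events/job on average")
```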
CNAF usage
Problems related to:
• CASTOR
• lack of disk
• commissioning of the FTS channels
CNAF resources
New allocations: 65 TB, 1/3 on CASTOR (D0T1) and 2/3 on StoRM (D1T0)
Tier-2 status
• All the CMS T2s migrated to dCache in spring-summer 2007
• The migration to SLC4 at the T2s is proving much more complex and slower than expected (software distribution problems and limited central CMS manpower)
• After the July upgrade, urged by CMS, CASTOR (2.1.3) is more stable in its basic performance
  • opted for a single service class for the time being (for better competitiveness in imports from FNAL)
  • higher-order problems exposed (e.g. optimal tape usage in migration/recall)
  • it remains the main potential show-stopper for CMS activities
• All the FTS channels between CNAF and the INFN T2s have been commissioned
• >70% of the links from/to CNAF also commissioned in the DDT project
Roma
[Photos: the Knuerr racks, the thermal buffer (3 t of water), the water distribution system, the chiller]
• The works were delayed by more than 2 months
• Electrical and plumbing works finished
• Chiller, Knuerr racks and UPS installed and connected
• Final acceptance testing still pending
• For this reason Roma will run CSA07 at the current site and migrate immediately afterwards
• The new CPUs (8 quad-cores) will be used anyway, switching off machines of other experiments
Bari
• The INFN Section is preparing the new computing room
  • relocation and unification of all the computing resources of the Section
  • total area ~90 m2
  • the space inside the new room is adequate to host a possible Tier-2 (CMS+ALICE)
• Status of the refurbishment works (electrical and fire-prevention systems, water distribution for rack cooling):
  • the INFN Board of Directors of 20 July 2007 approved the inclusion of the works in INFN's Annual List of public works for 2007
  • the tender to select the designer has started (expected to conclude in September)
  • completion of the refurbishment is expected in the first months of 2008
• UPS, chiller and APC island (6 racks + 4 coolers) already purchased with 2006 funds (in addition to part of the infrastructure in use in the current computing room)
Monitoring
• Local monitoring with Ganglia
• EGEE official accounting site: http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.html
Links commissioned for CSA07
• Bari->CERN, Bari->CNAF, FNAL->Bari, FZK->Bari (status of Bari as of 10 September 2007)
• To commission a link one must transfer 1.7 TB per week = (4 x 300 GB/day) + 500 GB/day
• FNAL->Bari: 12 TB in 10 days
• Two links per direction commissioned since the start of CSA07
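The commissioning arithmetic above checks out directly:

```python
# Link-commissioning target: four days at 300 GB/day plus one 500 GB day.
weekly_gb = 4 * 300 + 500
print(weekly_gb)  # 1700 GB = 1.7 TB/week

# Sustained FNAL->Bari transfer quoted above: 12 TB in 10 days.
daily_tb = 12 / 10
print(daily_tb)  # 1.2 TB/day, well above the weekly commissioning target
```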
Software at CSA06/07
CSA06 (Sept/Oct 06):
• Total events processed >100M
• Performance (on 1 kSI2K CPUs): ~25 s/ev on ttbar, ~3 s/ev on minimum bias
• Memory tops at 500 MB/job after hours/thousands of events
• Crash rate < 10^-7/event
• First definition of the data tiers (FEVT, RECO, AOD)
• Re-reconstruction and skimming demonstrated
CSA07 (Sept/Oct 07):
• Also includes HLT and analysis-skim workflows working from raw data
• Extensive physics validation of the reconstruction code (the complete switch took 2 years)
• More realistic calibration flow
• Enhanced software: tracking optimized; electrons/photons optimal; b-tagging, tau tagging; many more jet algorithms; lots of vertexing algorithms; muons optimal; Particle Flow
Validation results
[Plots: muon pT resolution at pT = 10, 50, 100, 500 and 1000 GeV; tracking fully efficient for single muons at pT = 10 GeV across the tracker; electron classification and electron energy]
[Plots: tau HLT efficiencies (QCD and Z -> tau tau); b-tag efficiency (track counting); jet resolution, standard vs Particle Flow]
Visualization
IGUANA, a powerful visualization tool, is already used for commissioning, geometry checks and algorithm development
[Event display: a real cosmic muon seen during commissioning]
Performance
• HLT: 43 ms/event, OK with the CMS TDRs
• Offline reconstruction: minimum bias ~3 s/ev, high-pT QCD ~25 s/ev; TDR numbers give ~20 s/ev on today's machines (T = 25 kSI2K*s / P)
• Simulation: still a factor ~2 more than in the TDR (T = 90 kSI2K*s / P)
• Data sizes: currently event size (FEVT) = 1.5 MB; reduced events: RECO = 500 kB, AOD = 50 kB
  • a factor 2 more than anticipated, but a lot of room for improvement
  • RECO (500 kB/ev) breakdown: track RecHits 113 kB/ev, HCAL RecHits 42 kB/ev, ECAL RecHits 36 kB/ev, calo towers 26 kB/ev
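The kSI2K normalization used above can be made explicit (a sketch; the ~1.25 kSI2K figure for "today's machines" is my inference from the quoted 25 kSI2K*s cost giving ~20 s/ev):

```python
# T = cost in kSI2K*s; an event takes T / P seconds of wall time
# on a CPU of power P (in kSI2K).
def seconds_per_event(cost_ksi2k_s, cpu_power_ksi2k):
    return cost_ksi2k_s / cpu_power_ksi2k

# Reconstruction: 25 kSI2K*s -> 25 s/ev on a 1 kSI2K CPU,
# ~20 s/ev if "today's machines" are ~1.25 kSI2K (assumption).
print(seconds_per_event(25, 1.0))
print(seconds_per_event(25, 1.25))

# Simulation: 90 kSI2K*s, a factor ~2 above the TDR expectation.
print(seconds_per_event(90, 1.0))
```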
2008 requests
• The 2008 funding requests assume that the release of the sub judice funds (125 kEuro) is discussed in September
• We strongly ask that Pisa be approved as a T2, on the merits it has earned in the field
• The 2008 time profile requested by CMS is to have half of the resources by early April 2008 (WLCG agreement)
• The request for the T1 is to follow the "Forti plan", given that, between cosmics and LHC, real data taking will start in 2008
Detector commissioning
• TK: data processed at the TIF in 2007 (25% of the Tracker) = 20 TB --> 100 TB
• ECAL:
  • data taking with cosmic trigger (Global Running)
  • data taking in ECAL local mode with cosmic trigger
  • data taking in ECAL local mode with pedestal, test-pulse or laser trigger
• Non-zero-suppressed ECAL event = 1.8 MB
• Zero-suppressed CMS event = 1.5 MB (0.3 MB without the Tracker)
• Cosmic rate in the cavern = 100 Hz
• ECAL local trigger rate = 5 Hz
• 6 TB/day for a global run with cosmic trigger
• 500 GB/day for an ECAL local run
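The daily-volume arithmetic behind these figures can be sketched as rate x event size x live time; the live fractions below are my assumptions, chosen to show how the quoted totals follow from the quoted rates and sizes:

```python
# Daily data volume in TB from trigger rate (Hz), event size (MB)
# and the fraction of the day the run actually takes data.
def daily_volume_tb(rate_hz, event_mb, live_fraction=1.0):
    return rate_hz * event_mb * 86400 * live_fraction / 1e6

# Global cosmic run: 100 Hz x 1.5 MB is ~13 TB for a full day,
# so the quoted 6 TB/day implies roughly half a day of effective live time.
print(daily_volume_tb(100, 1.5))

# ECAL local run: 5 Hz x 1.8 MB is ~0.78 TB for a full day,
# ~0.5 TB/day at an assumed ~65% live time.
print(daily_volume_tb(5, 1.8, live_fraction=0.65))
```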
Bari requests
• SJ 2007
  • 5 WN (75 kSI2K) [17.5 kE]
  • 2 servers [7 kE]
• 2008
  • 12 WN (180 kSI2K) [45 kE]
  • 1 network switch [10 kE]
  • 90 TB non-SAN disk [135 kE]
Pisa requests
• SJ 2007
  • 20 TBN disk [30 kE]
  • storage head node [15 kE]
• 2008
  • 22 WN (330 kSI2K) [80 kE]
  • 125 TBN SAN disk [190 kE + 40 kE]
  • 1 core switch [25 kE]
Roma requests
• SJ 2007
  • 1 storage head node [15 kE]
  • 1 core switch [3.5 kE]
• 2008
  • 25 WN (375 kSI2K) [112 kE]
  • 100 TBN SAN disk + control server [150 kE + 25 kE]
  • 1 core switch at 10 Gb/s [25 kE]
  • remote control devices [10 kE]
Legnaro requests
• SJ 2007
  • 11 WN (165 kSI2K) [45 kE]
• 2008
  • 27 WN (400 kSI2K) [100 kE]
  • 120 TBN SAN disk [180 kE + 25 kE]
  • 8 service servers [28 kE]
  • 1 core switch [40 kE]
  • 2 rack switches [20 kE]
Summary of 2008 requests
Unit costs: 15 kSI2K/WN (dual-CPU quad-core); 0.25 E/SI2K; 1.5 E/GB; 5 kE/server (1 per 20 TB) for SAN
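Applying these unit costs to a site request reproduces the quoted figures up to rounding (a sketch; the per-site totals in the slides are evidently rounded):

```python
# Cost model from the unit costs above, applied to one site's 2008 request.
import math

def site_cost_keur(ksi2k, disk_tb, san=True):
    """Return (CPU, disk, server) costs in kE."""
    cpu = ksi2k * 1000 * 0.25 / 1000        # 0.25 E/SI2K, converted to kE
    disk = disk_tb * 1000 * 1.5 / 1000      # 1.5 E/GB, converted to kE
    servers = math.ceil(disk_tb / 20) * 5 if san else 0  # 5 kE/server, 1 per 20 TB
    return cpu, disk, servers

# Pisa 2008: 330 kSI2K and 125 TB of SAN disk.
print(site_cost_keur(330, 125))  # (82.5, 187.5, 35) kE vs quoted 80, 190, 40
```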