CDF Offline Status and Plans Ray Culbertson for the Offline Group
Ray Culbertson, co-leader
Aidan Robson, co-leader
Elena Gerchtein, Assoc. Head for Production
Stephan Lammel, Assoc. Head for Services
Ending / Starting
Production (Elena Gerchtein)
○ Calibration: Willis Sakumoto (L), Dan Goldin, calibrators
○ Production: Elena Gerchtein (L), operators
○ Ntupling: operators
○ Monte Carlo: Costas Vellidis (L), Oksana Tadevosyan, Liqun Li, MC Reps
Services (Stephan Lammel, (Mike Kirby))
○ Grid/DH: Joe Boyd (L, CAF), Eric Wicklund (L, DH), Mike Wang, (Robert Illingworth), (Marc Mengel), (REX), (Site Coordinators)
○ Databases: Eric Wicklund (L), (Barry Blumenfeld), (Dennis Box), (DBAs and admins)
○ Infrastructure: Stephan Lammel (L), (CD)
○ Code Management: Jim Bellinger (L), Lynn Garren, (Donatella Torretta)
○ Remote Sites: Aidan Robson (L), Ray Culbertson (L), Site Coordinators
9/28/2011
Calibrators
○ SVX Align: Dominik Horn
○ SVX: Timo Aaltonen
○ COT: Kevin Burkett
○ dE/dx: Keith Matera
○ Beamlines: Roberto Carosi
○ TOF: Jesus Manuel Vizan Garcia
○ PES: Halley Brown
○ PEM: Willis Sakumoto
○ CHA: Fabio Happacher
○ CEM: Larry Nodulman
○ Cal Timing: Adam Aurisano
○ CP2/CCR: Azeddine Kasmi
○ PASS: Tom Riddick
MC Reps
○ HDG: Shalhout Shalhout
○ TOP: Dave Mietlicki
○ EWK: Maria D'Errico
○ BOT: Hideki Miyaki
○ EXO: John Strologas
○ QCD: Erik Jens Brucken
Ntuple maintainers
○ Topntuple: Hyunsu Lee
○ BStntuple: Michael Morello, Angelo Di Canto
○ Stntuple: Ray Culbertson
Ending / Starting
Site Coordinators
○ Fermigrid: Steve Timm
○ LCG/CNAF: Donatella Lucchesi, Silvia Amerio, Matteo Bauce
○ PACAF: Masakazu Kurata, Yuji Takeuchi, Suen Hou, Tsan Hsieh
○ KISTI: Seo-Young Noh, Beob Kyun Kim, Heejun Yoon, Christophe Bonnaud
○ MIT: Max Goncharov
SAM Shifters: Ivan Vila, Giovanni Piacentino, Stefano Giagu, Barry Blumenfeld, Peter Bussey, Thomas Kuhr, Alberto Ruiz, Aidan Robson
Operators: Olga Terlyga, Oksana Tadevosyan, Zhenbin Wu, Jon Wilson, Aristotle Calamba
9/28/2011
Major Systems Overview
[Block diagram of the major systems: Raw Data into enstore (tape); dCache disk cache; diskpool; production servers; Monte Carlo servers; desktops; ILP Project; cdfgrid (5500 slots); off-site farms via Eurogrid / PACAF / NamGrid (KISTI / MIT / GP / CMS)]
Data Handling
► Enstore tape system: 16 LTO3 drives, 26 LTO4 drives
○ delivering typically 20TB per day, 50TB peak
► dCache, main disk cache, ~400TB, with tape backend
○ delivering typically 50TB per day, 150TB peak
► overall very smooth operations last year!
[Plot: data delivered per day over the last year, peaking near 50TB/day]
Data Handling
► dCache major upgrade: 300TB → ~800TB in May
○ smooth transition on this major project – great success!
○ appears to have relieved large backlogs – a significant difference!
[Plot: dCache usage over the last 14 months, with the capacity upgrade marked]
Data Handling
► Planning needs
○ no new tape drives added last year
○ loads expected to be manageable (thanks to dCache upgrade)
○ new tape storage robot arrived in July, plenty of space
○ purged ~5% in unused datasets
► Tape generation migration
○ from LTO-4 (0.8TB) to T10K (5TB)
○ testing recently signed off
○ 6 T10K drives in FY11, 4 more in FY12
○ start migrating raw data this year, the bulk of our 9PB in FY12–13 (see the back-of-the-envelope sketch below)
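The cartridge counts implied by the numbers above make the motivation for the migration concrete. The following is only a back-of-the-envelope sketch using the capacities quoted on the slide (0.8TB per LTO-4, 5TB per T10K, 9PB total); it is not an official migration estimate.

```python
# Back-of-the-envelope sketch (not an official estimate): cartridge counts
# implied by the migration numbers quoted on the slide.
LTO4_TB = 0.8      # per-cartridge capacity of LTO-4, from the slide
T10K_TB = 5.0      # per-cartridge capacity of T10K, from the slide
TOTAL_PB = 9.0     # total CDF data to migrate, from the slide

total_tb = TOTAL_PB * 1000.0
print(f"LTO-4 cartridges needed: ~{total_tb / LTO4_TB:,.0f}")   # ~11,250
print(f"T10K cartridges needed:  ~{total_tb / T10K_TB:,.0f}")   # ~1,800
print(f"Reduction factor:        ~{T10K_TB / LTO4_TB:.2f}x")    # ~6.25x fewer cartridges
```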
CdfGrid
► 5500 slots
► smooth operations!
► last maintenance replacements purchased in FY11; capacity starts decreasing in FY13
► loads: heavy but manageable Jan–Jul
[Plot: CdfGrid usage over the last year (scales marked 5K and 40K)]
NamGrid
► a portal to offsite farms running OSG, ~20% of CdfGrid
○ achieved regular access to GP and CMS farms!
○ last fall achieved solid integration of the KISTI site!
○ accessing SAM cache and CdfCode
○ regularly runs MC
► MIT site continues to be very reliable!
► moderate loads
[Plot: NamGrid usage over the last year (scale marked 1000)]
Eurogrid
► Italian colleagues reorganized the CNAF and LCGCAF farms using glideinWMS
[Diagram: glideinWMS VO Frontend, GlideIn Factory, CNAF head-node, LCG, Tier-1 farm at CNAF]
Eurogrid
► the glidein layer prevents LCGCAF from swallowing jobs – this makes a huge difference!! (a conceptual sketch follows below)
► data transfers are also faster due to general network improvements
► a huge success! Users are voting for it!
► European CDF resources which were languishing are now used!
[Plot: Eurogrid usage over the last 3 months (scale marked 1.5K)]
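The reason the glidein layer stops LCGCAF from swallowing jobs is its late-binding, pilot-based model: the factory sends pilots to the grid site, and a user job is handed only to a pilot that has already started and checked its worker node. The sketch below is a conceptual illustration of that idea in Python, not glideinWMS code; the job queue, node fields, and health checks are hypothetical.

```python
# Conceptual sketch of the late-binding ("glidein"/pilot) idea, NOT glideinWMS
# itself: a user job is bound to a worker node only after a pilot has started
# there and validated the environment, so a dead or broken node cannot
# silently swallow the job.
from queue import Queue, Empty

def node_is_healthy(node):
    """Stand-in for the pilot's sanity checks (e.g. CDF code area, scratch disk)."""
    return node.get("cdf_code_mounted", False) and node.get("scratch_gb", 0) > 10

def run_pilot(node, user_jobs: Queue):
    """A pilot lands on a grid worker node; only a healthy node pulls real work."""
    if not node_is_healthy(node):
        return None                    # pilot exits quietly; the user job stays queued
    try:
        job = user_jobs.get_nowait()   # late binding: job assigned only now
    except Empty:
        return None
    return f"ran {job} on {node['name']}"

jobs = Queue()
for i in range(3):
    jobs.put(f"mc_job_{i}")

nodes = [
    {"name": "lcg-node-bad",  "cdf_code_mounted": False, "scratch_gb": 0},
    {"name": "lcg-node-ok-1", "cdf_code_mounted": True,  "scratch_gb": 40},
    {"name": "lcg-node-ok-2", "cdf_code_mounted": True,  "scratch_gb": 55},
]
for n in nodes:
    print(run_pilot(n, jobs))   # the bad node returns None; no job is lost to it
```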
Diskpool
► 325TB of "persistent" dCache assigned to physics groups
► mostly smooth operations
► Alexei Varganov, our diskpool expert, has left for a new job
► physics groups have decided the diskpool is less critical now, and we can back up the data and live with the greater uncertainty
► the backup process is very tedious – thanks to the production group for their persistence – it will be done soon!
group  GB  status
top    63  85% backed up, progressing
hdg    33  done, almost signed off
ewk    31  down to 3 users
bnt    26  almost done
exo    17  done
qcd    16  investigating
A lot of work: ~2000 datasets!
Code Management
► last winter developed large-file (2 to 8GB) support (see the sketch after this slide)
○ contributes to improved tape access speed
○ done and deployed!
► new major project: develop legacy code releases
○ incorporate accumulated patches
○ modernize all support packages
○ finally migrate to ROOT version 5!
○ improve infrastructure
○ first test release is out now, hope to be done this fall!
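For context on what "large file support" means in ROOT-based code, the sketch below shows the generic PyROOT knob that governs when ROOT rolls an output file over to a new one. It assumes a ROOT installation and is only an illustration of the kind of limit such a patch raises, not the actual CDF code-management change.

```python
# Minimal PyROOT sketch, assuming ROOT is installed: illustrates the generic
# large-file knob that "2 to 8 GB" support involves in ROOT-based code.
# This is NOT the actual CDF patch, only an illustration.
from array import array
import ROOT

EIGHT_GB = 8 * 1024**3
# TTree::Fill() starts a new file once the tree passes MaxTreeSize; raising it
# lets a single output file grow to ~8 GB before ROOT rolls over.
ROOT.TTree.SetMaxTreeSize(EIGHT_GB)

f = ROOT.TFile("large_output.root", "RECREATE")
t = ROOT.TTree("events", "illustrative tree")
x = array("d", [0.0])
t.Branch("x", x, "x/D")
for i in range(1000):
    x[0] = float(i)
    t.Fill()
t.Write()
f.Close()
```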
Production Operations
► smooth operations over the year
► new data production
○ 3 billion events, 450TB
► ntupling, 3 flavors, data and MC
○ 6 billion events data/MC, 300TB
► Monte Carlo operations
○ generated 890M events, 150TB last year – continued strong demand!
► reprocessing…
Reprocessing
► about half of our data has non-optimal Si clustering, which reduces tagging efficiency by 5–10% (Periods 18–28 out of 38)
► rerun production and ntupling to recover the efficiency
○ we met the Higgs group request for Mar 11 delivery of their data!!
► now only a tiny bit left to do…
Two More Projects
Two more projects targeted for B physics…
► BStntuple
○ re-ntuple all the B physics data streams
○ add covariance matrices for tracks – allows arbitrary vertexing choices at the ntuple level (see the toy sketch below)
○ enables many new analyses, is flexible for the future, and will replace several custom ntuples
► Generic B MC
○ generate, simulate, produce, and ntuple bb Monte Carlo
○ has been wanted for a long time, now becoming more urgent
○ details and targeted dataset size are under design
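To see why storing per-track covariance matrices enables vertexing choices at the ntuple level, the toy sketch below combines two track position estimates by inverse-covariance weighting and computes a consistency chi-square. All names and numbers are hypothetical, and a real CDF vertex fit works with full helix parameters rather than bare positions.

```python
# Toy illustration (not CDF vertexing code): with per-track covariance matrices
# stored in the ntuple, a crude two-track "vertex" can be formed downstream by
# inverse-covariance weighting of each track's position estimate.
import numpy as np

def combine(x1, C1, x2, C2):
    """Weighted combination of two position estimates with covariances."""
    W1, W2 = np.linalg.inv(C1), np.linalg.inv(C2)
    C = np.linalg.inv(W1 + W2)                  # combined covariance
    x = C @ (W1 @ x1 + W2 @ x2)                 # combined position
    r = x1 - x2
    chi2 = r @ np.linalg.inv(C1 + C2) @ r       # consistency of the two tracks
    return x, C, chi2

# Hypothetical numbers purely for illustration (cm).
x1, C1 = np.array([0.10, 0.02, 1.5]), np.diag([0.01, 0.01, 0.09])
x2, C2 = np.array([0.12, 0.00, 1.3]), np.diag([0.02, 0.02, 0.04])
vtx, cov, chi2 = combine(x1, C1, x2, C2)
print("vertex estimate:", vtx, " chi2:", chi2)
```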
The Final Push
► planning to get the last data out to physics fast!!
[Timeline, Sep–Dec: P37 production, BStn reprocessing, P37 ntupling, P38 calibration, P38 production, P38 ntupling; special low-energy data (s-scan, s-scan reprocessing); diskpool upload]
In the Long Term
► FY12: cdfgrid, dCache, services continue as-is
► FY13: cdfgrid, dCache reduced by the size of production needs
► continue with full functionality, reduced capacity, for 5 years
○ farm and interactive CPU
○ access to all data
○ production and ntupling capability
○ full MC simulation, with all generators
► continuing past 5 years
○ concepts under discussion
○ how will LHC discoveries overlap Tevatron data?
○ will require funding and attracting experts
The Bottom Line
► CDF Offline had another very successful year!
○ smooth operations, manageable loads
○ clearing off the diskpool
○ preparing future releases
○ delivered reproduced data
► as we dive into the new era
○ finishing strong … and fast!
○ long term still requires work and resources
Thanks for your IFC contribution, it is crucial!