
Offline Discussion




Presentation Transcript


  1. Offline Discussion (M. Moulson, 22 October 2004) • Datarec status • Reprocessing plans • MC status • MC development plans • Linux • Operational issues • Priorities • AFS/disk space

  2. Datarec DBV-20 (Run > 31690) • DC geometry updated • Global shift: Δy = -550 μm, Δz = -1080 μm • Implemented in datarec for Run > 28000 • Thickness of DC wall not changed (-75 μm) • Modifications to DC timing calibrations • Independence from EmC timing calibrations • Modifications to event classification (EvCl) • New KSTAG algorithm (KS tagged by vertex in DC) • Bunch spacing by run number in T0_FIND step 1 for ksl • 2.715 ns for 2004 data (also for MC, some 2000 runs) • Boost values • Runs not reconstructed without BMOM v.3 in HepDB • px values from BMOM(3) now used in all EvCl routines
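
A minimal sketch of the "bunch spacing by run number" switch mentioned above for T0_FIND step 1, written as illustrative Python rather than KLOE Fortran. The run boundary and the handling of earlier runs are assumptions; only the 2.715 ns value for 2004 data (and MC) comes from the slide.

```python
# Illustrative sketch only: pick the bunch-crossing period by run number,
# as T0_FIND step 1 is described to do for the ksl stream.
FIRST_2004_RUN = 31690   # placeholder boundary; DBV-20 applies to Run > 31690

def bunch_spacing_ns(run_number: int) -> float:
    """Bunch spacing (ns) to assume when building T0 for a given run."""
    if run_number > FIRST_2004_RUN:
        return 2.715     # 2004 data; also used for MC and some 2000 runs
    # Earlier runs would take their value from the run conditions database;
    # that value is not given on the slide, so it is left unresolved here.
    raise LookupError(f"bunch spacing for run {run_number} not defined in this sketch")
```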

  3. Datarec operations • Runs 28479 (29 Apr) to 32380 (21 Oct, 00:00) • 413 pb-1 to disk with tag OK • 394 pb-1 with tag = 100 (no problems) • 388 pb-1 with full calibrations • 371 pb-1 reconstructed (96%) • 247 pb-1 DSTs (except K+K-) • fsun03-fsun10 decommissioned 11 Oct • Necessary for installation of new tape library • datarec submission moved from fsun03 to fibm35 • DST submission moved from fsun04 to fibm36 • 150 keV offset in √s discovered!

  4. 150 keV offset in √s • Discovered while investigating ~100 keV discrepancies between physmon and datarec • +150 keV adjustment to fit value of √s not implemented • in physmon • in datarec • when final BVLAB √s values written to HepDB • Plan of action: • New Bhabha histogram for physmon fit, taken from data • Sync datarec fit with physmon • Fix BVLAB fit before final 2004 values computed • Update 2001-2002 values in DB records • histogram_history and HepDB BMOM 2001-2002 currently from BVLAB scan, need to add 150 keV • Update of HepDB technically difficult, need a solution
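
A minimal worked illustration of the pending correction described above: the 2001-2002 √s values stored from the BVLAB scan would each need the +150 keV shift before use. The function name and the example value are hypothetical; only the 150 keV figure is from the slide.

```python
# Illustrative sketch only: apply the missing +150 keV shift to a stored
# 2001-2002 sqrt(s) value (in MeV) read from histogram_history or HepDB BMOM.
SQRT_S_OFFSET_MEV = 0.150

def corrected_sqrt_s_mev(stored_value_mev: float) -> float:
    """Shift a stored BVLAB sqrt(s) value by the +150 keV not yet in the DB."""
    return stored_value_mev + SQRT_S_OFFSET_MEV

# Hypothetical example: a stored 1019.45 MeV would be used as 1019.60 MeV.
print(corrected_sqrt_s_mev(1019.45))
```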

  5. Reprocessing plans • Issues of compatibility with MC • DC geometry, T0_FIND modifications by run number • DC timing modifications do not impact MC chain • Additions to event classification would require new MCDSTs only • In principle possible to use run number range to fix px values for backwards compatibility • Use batch queues? • Main advantage: Increased stability

  6. Further datarec modifications • Modification of inner DC wall thickness (-75 μm) • Implement by run number • Cut DC hits with drift times > 2.5 μs • Suggested by P. de Simone in May to reduce fraction of split tracks • Others?
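
A minimal sketch of the proposed drift-time cut, in illustrative Python rather than the datarec code: hits with drift times above 2.5 μs would simply be dropped before tracking. The data model (a drift_time_ns attribute) is an assumption made for the example.

```python
# Illustrative sketch only: drop DC hits with drift times above 2.5 us,
# the cut suggested to reduce the fraction of split tracks.
MAX_DRIFT_TIME_NS = 2500.0

def filter_dc_hits(hits):
    """Return only the hits inside the 2.5 us drift-time window."""
    return [hit for hit in hits if hit.drift_time_ns <= MAX_DRIFT_TIME_NS]
```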

  7. MC production status

  8. Generation of rare KSKL events • KS: πeν, πμν, γγ, π+π-π0, 3π0 • KL: π+π-, π0π0, γγ, π+π-γ (DE), π0πν • Peak cross section: 7.5 nb • Approx 2× sum of BRs for rare KL channels • In each event, either KS or KL decays to rare mode • Random selection • Scale factor of 20 applies to KL • For KS, scale factor is ~100
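
A minimal sketch of the selection logic described on this slide, in illustrative Python (not the KLOE generator): each event forces exactly one of the two kaons into a rare mode, chosen at random, and the quoted scale factors (20 for KL, ~100 for KS) relate the generated sample to the natural branching fractions. The uniform choice among channels below is a simplification; in practice the channels would be weighted by their branching ratios.

```python
# Illustrative sketch only, not the KLOE event generator.
import random

RARE_KS = ["pi e nu", "pi mu nu", "gamma gamma", "pi+ pi- pi0", "3 pi0"]
RARE_KL = ["pi+ pi-", "pi0 pi0", "gamma gamma", "pi+ pi- gamma (DE)"]

def generate_rare_kskl_event(rng: random.Random) -> dict:
    """Force either the KS or the KL (chosen at random) into a rare decay."""
    if rng.random() < 0.5:
        # KS takes the rare mode; generated sample oversamples nature by ~100
        return {"rare_kaon": "KS", "mode": rng.choice(RARE_KS), "scale": 100}
    # KL takes the rare mode; oversampled by a factor of 20
    return {"rare_kaon": "KL", "mode": rng.choice(RARE_KL), "scale": 20}

print(generate_rare_kskl_event(random.Random(0)))
```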

  9. MC development plans • Beam pipe geometry for 2004 data (Bloise) • LSB insertion code (Moulson) • Fix ρπ generator (Nguyen, Bini) • Improve MC-data consistency on tracking resolution (Spadaro, others) • MC has better core resolution and smaller tails than data in Emiss - pmiss distribution in ππ background for KS → πeν analysis • Improving agreement would greatly help for precision studies involving signal fits, spectra, etc. • Need to systematically look at other topologies/variables • Need more people involved

  10. Linux software for KLOE analysis • P. Valente had completed an earlier port based on free software • VAST F90-to-C preprocessor • Clunky to build and maintain • M. Matsyuk has completed a KLOE port based on the Intel Fortran compiler for Linux • Individual, non-commercial license is free • libkcp code compiles with zero difficulty • Reconsider issues related to maintenance of KLOE software for Linux

  11. Linux usage in KLOE analysis • Most users currently processing YBOS DSTs into Ntuples on farm machines and transferring Ntuples to PCs • AFS does not handle random-access data well • i.e., writing CWNs as analysis output • Multiple jobs on a single farm node stress AFS cache • Farm CPU (somewhat) limited • AFS disk space perennially at a premium • KLOE software needs minimal for most analysis jobs • YBOS to Ntuple: No DC reconstruction, etc. • Analysis jobs on user PCs accessing DSTs via KID and writing Ntuples locally should be quite fast • Continuing interest on part of remote users

  12. KLOE software on Linux: Issues • 1. Linux machines at LNF for hosting/compilation • 3 of 4 Linux machines in Computer Center are down, including klinux (mounts /kloe/soft, used by P. Valente for VAST build) • 2. KLOE code distribution • User PCs do not mount /kloe/soft • Move /kloe/soft to network-accessible storage? • Use CVS for distribution? • Elegant solution, but user must periodically update… • 3. Individual users must install Intel compiler • 4. KID • Has been built for Linux in the past • 5. Priority/manpower

  13. Operational issues • Offline expert training • 1-2 day training course for all experts • General update • PC backup system • Commercial tape backup system available to users to backup individual PCs

  14. Priorities and deadlines • In order of priority, for discussion: • Complete MC production: KSKL rare • Reprocessing • MC diagnostic work • Other MC development work for 2004 • Linux • Deadlines?

  15. Disk resources • 2001-2002: total DSTs 7.4 TB, total MCDSTs 7.0 TB • 2004: DST volume scales with L • 3.2 TB added to AFS cell • Not yet assigned to analysis groups • 2.0 TB available but not yet installed • Reserved for testing new network-accessible storage solutions

  16. Limitations of AFS • Initial problems with random-access files blocking AFS on farm machines resolved • Nevertheless, AFS has some intrinsic limitations: • Volume sizes at most 100 GB • Already pushed to the limit (the specified maximum is only 8 GB!) • Cache must be much larger than AFS-directed data volume for all jobs on farm machine • Problem characteristic of random-access files (CWNs) • Current cache sizes 3.5 GB on each farm machine • More than sufficient for a single job • Possible problems with 4 big jobs/machine • Enlarging cache sizes requires purchase of more local disk for farm machines

  17. Network storage: Future solutions • Possible alternatives to AFS • NFS v.4 • Kerberos authentication – use klog as with AFS • Size of data transfers smaller, expect fewer problems with random-access files • Storage Area Network (SAN) filesystem • Currently under consideration as a Grid solution • Works only with Fibre Channel (FC) interfaces • FC – SCSI/IP interface implemented in hardware/software • Availability expected in 2005 • Migration away from AFS probable within ~6 months • 2 TB allocated to tests of new network storage solutions • Current AFS system will remain the interim solution

  18. Current AFS allocations • [Chart of current AFS allocations by working group; values shown: 365, 200 and 400 GB]

  19. A fair proposal? • Each of the 3 physics WGs gets 1400 GB total • Total disk space (incl. already installed) divided equally • Physics WGs similar in size and diversity of analyses • WGs can make intelligent use of space • e.g.: Some degree of Ntuple sharing already present • Substantial increases for everyone anyway
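
A worked check of the arithmetic behind this proposal, under the assumption that the slide-18 allocations are in GB: adding the 3.2 TB of new AFS space to the roughly 965 GB already allocated and dividing equally among the three working groups gives about 1400 GB each.

```python
# Worked check (assumes the current allocations on slide 18 are 365, 200 and
# 400 GB, and that the 3.2 TB added to the AFS cell is included in the split).
current_gb = 365 + 200 + 400           # existing allocations, 965 GB in total
added_gb = 3200                        # 3.2 TB added to the AFS cell
per_wg_gb = (current_gb + added_gb) / 3
print(round(per_wg_gb))                # ~1388 GB, i.e. roughly 1400 GB per WG
```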

  20. Additional information

  21. Offline CPU/disk resources for 2003 • Available hardware: • 23 IBM B80 servers: 92 CPUs • 10 Sun E450 servers: 18 B80 CPU-equivalents • 6.5 TB NFS-mounted recall disk cache • Easy to reallocate between production and analysis • Allocation of resources in 2003: • 64 to 76 CPUs on IBM B80 servers for production • 800 GB of disk cache for I/O staging • Remainder of resources open to users for analysis

  22. Analysis environment for 2003 • Production of histograms/Ntuples on analysis farm: • 4 to 7 IBM B80 servers + 2 Sun E450 servers • DSTs latent on 5.7 TB recall disk cache • Output to 2.3 TB AFS cell accessed by user PCs • Analysis example: • 440M KSKL events, 1.4 TB DSTs • 6 days elapsed for 6 simultaneous batch processes • Output on order of 10-100 GB • Final-stage analysis on user PC/Linux systems

  23. CPU power requirements for 2004 • [Plot: number of B80 CPUs needed to follow acquisition vs. average L (10^30 cm^-2 s^-1) and input rate (kHz), with the 76-CPU offline farm indicated]

  24. CPU/disk upgrades for 2004 • Additional servers for offline farm: • 10 IBM p630 servers: 10×4 POWER4+ 1.45 GHz • Adds more than 80 B80 CPU equivalents to offline farm • Additional 20 TB disk space • To be added to DST cache and AFS cell • Ordered, expected to be on-line by January • More resources already allocated to users • 8 IBM B80 servers now available for analysis • Can maintain this allocation during 2004 data taking

  25. Installed tape storage capacity • IBM 3494 tape library: • 12 Magstar 3590 drives, 14 MB/s read/write • 60 GB/cartridge (upgraded from 40 GB this year) • 5200 cartridges (5400 slots) • Dual active accessors • Managed by Tivoli Storage Manager • Maximum capacity: 312 TB (5200 cartridges) • Currently in use: 185 TB
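
A quick check of the quoted maximum capacity, using only the numbers on this slide.

```python
# 5200 cartridges at 60 GB each give the quoted 312 TB maximum capacity;
# with 185 TB currently in use, roughly 127 TB of library space remains.
cartridges, gb_per_cartridge = 5200, 60
max_tb = cartridges * gb_per_cartridge / 1000
print(max_tb, max_tb - 185)            # 312.0 TB total, 127.0 TB free
```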

  26. Tape storage requirements for 2004 • [Charts: stored volume by type (raw, recon, DST, MC) in GB/pb-1: 118 for 2002, 57 estimated for 2004 incl. streaming mods; tape library usage (TB) today and after +780 pb-1, +1210 pb-1, +2000 pb-1]

  27. Tape storage for 2004 • Additional IBM 3494 tape library • 6 Magstar 3592 drives: 300 GB/cartridge, 40 MB/s • Initially 1000 cartridges (300 TB) • Slots for 3600 cartridges (1080 TB) • Remotely accessed via FC/SAN interface • Definitive solution for KLOE storage needs • Call for tender (bando di gara) submitted to the Gazzetta Ufficiale • Reasonably expect 6 months to delivery • Current space sufficient for a few months of new data

  28. Machine background filter for 2004 • Background filter (FILFO) last tuned on 1999-2000 data • 5% inefficiency for ππγ events, varies with background level • Mainly traceable to cut to eliminate degraded Bhabhas • Removal of this cut: Reduces inefficiency to 1% • Increases stream volume 5-10% • Increases CPU time 10-15% • New downscale policy for bias-study sample: • Fraction of events not subject to veto, written to streams • Need to produce bias-study sample for 2001-2002 data • To be implemented as reprocessing of a data subset with new downscale policy • Will allow additional studies on FILFO efficiency and cuts
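
A minimal sketch of the downscale policy described above, in illustrative Python: a prescaled fraction of events is exempted from the FILFO veto and written to the streams regardless, so the filter bias can be measured on that subsample. The downscale factor and function names are placeholders, not values from the slide.

```python
# Illustrative sketch only: bias-study downscale for the FILFO veto.
DOWNSCALE = 100   # placeholder prescale factor, not given on the slide

def keep_event(event_number: int, filfo_rejects: bool) -> bool:
    """Decide whether the event is written to the physics streams."""
    bias_study_event = (event_number % DOWNSCALE == 0)  # not subject to veto
    return bias_study_event or not filfo_rejects
```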

  29. Other offline modifications for 2004 • Modifications to physics streaming: • Bhabha stream: keep only subset of radiative events • Reduces Bhabha stream volume by a factor of 4 • Reduces overall stream volume by >40% • KSKL stream: clean up choice of tags to retain • Reduces KSKL stream volume by 35% • K+K- stream: new tag using dE/dx • Fully incorporate dE/dx code into reconstruction • Eliminate older tags, will reduce stream volume • Random trigger as source of MC background for 2004 • 20 Hz of random triggers synched with beam crossing allows background simulation for L up to 2×10^32 cm^-2 s^-1

  30. KLOE computing resources • DB2 server: IBM F50, 4×PPC604e 166 MHz • Online farm: 7 IBM H50, 4×PPC604e 332 MHz; 1.4 TB SSA disk • AFS cell: 2 IBM H70, 4×RS64-III 340 MHz; 1.7 TB SSA + 0.5 TB FC disk • Network: CISCO Catalyst 6000, 100 Mbps and 1 Gbps links, NFS and AFS access • Offline farm: 19 IBM B80, 4×POWER3 375 MHz; 8 Sun E450, 4×UltraSPARC-II 400 MHz • Analysis farm: 4 IBM B80, 4×POWER3 375 MHz; 2 Sun E450, 4×UltraSPARC-II 400 MHz • File servers: 2 IBM H80, 6×RS64-III 500 MHz (NFS) • Managed disk space: 0.8 TB SSA for offline staging; 6.5 TB latent disk cache (2.2 TB SSA + 3.5 TB FC) • Tape library: IBM 3494, 5400 slots × 60 GB (324 TB), 2 robots, TSM; 12 Magstar E1A drives, 14 MB/s each

  31. 2004 CPU estimate: details • Extrapolated from 2002 data with some MC input • 2002: • L = 36 μb-1/s • T3 = 1560 Hz • 345 Hz φ + Bhabha • 680 Hz unvetoed CR • 535 Hz bkg • 2004: • L = 100 μb-1/s (assumed) • T3 = 2175 Hz • 960 Hz φ + Bhabha • 680 Hz unvetoed CR • 535 Hz bkg (assumed constant) • From MC: • σ(φ) = 3.1 μb (assumed) • φ + Bhabha trigger: σ = 9.6 μb • φ + Bhabha FILFO: σ = 8.9 μb • CPU(φ + Bhabha) = 61 ms avg. • CPU time calculation: • 4.25 ms to process any event • + 13.6 ms for 60% of bkg evts • + 61 ms for 93% of φ + Bhabha evts • 2002: 19.6 ms/evt overall – OK • 2004: 31.3 ms/evt overall (10%)
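
A worked check of the quoted averages, using only the numbers on this slide: 4.25 ms for any event, plus 13.6 ms for 60% of background events, plus 61 ms for 93% of φ + Bhabha events, weighted by the trigger composition.

```python
# Reproduces the per-event CPU averages quoted on the slide.
def avg_cpu_ms(t3_hz: float, phi_bhabha_hz: float, bkg_hz: float) -> float:
    frac_phi = phi_bhabha_hz / t3_hz     # fraction of phi + Bhabha events
    frac_bkg = bkg_hz / t3_hz            # fraction of machine-background events
    return 4.25 + 0.60 * frac_bkg * 13.6 + 0.93 * frac_phi * 61.0

print(round(avg_cpu_ms(1560, 345, 535), 1))   # 2002: ~19.6 ms/evt
print(round(avg_cpu_ms(2175, 960, 535), 1))   # 2004: ~31.3 ms/evt
```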

  32. 2004 tape space estimate: details • 2001: 274 GB/pb-1 • 2002: 118 GB/pb-1 • Highly dependent on luminosity • 2004: Estimate a priori • Assume: 2175 Hz @ 2.6 kB/evt • Raw event size assumed same for all events (has varied very little with background over KLOE history) • Assume: L = 100 μb-1/s • 1 pb-1 = 10^4 s: • 25.0 GB for 9.6M physics evts • 31.7 GB for 12.2M bkg evts • (1215 Hz of bkg for 10^4 s) • 56.7 GB/pb-1 total • Includes effects of streaming changes (raw, recon) • Assumes 1.7M evt/pb-1 produced for φ → all (1:5) and φ → KSKL (1:1) MC
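
A worked check of the a-priori raw-volume estimate, using the slide's assumptions: 2.6 kB for every event, L = 100 μb-1/s so that 1 pb-1 corresponds to 10^4 s, 960 Hz of physics triggers (9.6 μb trigger cross section at 100 μb-1/s) and 1215 Hz of background (680 Hz unvetoed cosmic rays plus 535 Hz machine background).

```python
# Reproduces the GB/pb^-1 figures quoted on the slide (small differences come
# from the slide rounding the background count to 12.2M events).
SECONDS_PER_PB = 1.0e4              # 1 pb^-1 at L = 100 ub^-1/s
EVENT_SIZE_GB = 2.6e3 / 1.0e9       # 2.6 kB per raw event

physics_gb = 960 * SECONDS_PER_PB * EVENT_SIZE_GB    # 9.6M physics events
bkg_gb = 1215 * SECONDS_PER_PB * EVENT_SIZE_GB       # ~12.2M background events

print(f"physics:    {physics_gb:.1f} GB/pb^-1")      # ~25.0
print(f"background: {bkg_gb:.1f} GB/pb^-1")          # ~31.6 (slide: 31.7)
print(f"total:      {physics_gb + bkg_gb:.1f} GB/pb^-1")  # ~56.6 (slide: 56.7)
```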
