Run IIb DAQ / Online Status
Stu Fuess, Fermilab
Introduction • The plan to meet the DAQ and Online computing requirements for Run IIb: • Level 3 farm node increase • Brown, Univ. of Washington, Fermilab • Host system replacements / upgrades • Hardware: Fermilab; Software: various • Control system node upgrade • Fermilab • Will discuss status and M&S ETC
Level 3 • The issue: • Processing time per event rises with luminosity • See next slide • Overall L3 processing capacity sets the maximum L2 accept rate • The plans: • Increase the capacity of the L3 farm • This project • Continue to improve the L3 filter code to reduce L3 processing time • Several improvements coming in the next releases • Speed up the reconstruction algorithms, especially Muon and Tracking • Optimization of the Trigger Menu • Reduce the number of clustering/tracking tools used per event • This area will require continued diligence as the luminosity increases
Level 3 Filtering Time vs Luminosity • [Plot: L3 filtering time vs luminosity, converted to the equivalent time on a 1 GHz processor; points labeled by system type and conversion factor] • This is a function not only of luminosity but also of the trigger mix • Trigger list V13.40 • Note: ~10% faster with the current (V13.50) trigger list
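The normalization used in the plot above can be sketched as follows. This is a minimal illustration, assuming processing time simply scales inversely with clock speed (a simplification; real per-node conversion factors also depend on architecture). The function name and example numbers are illustrative, not from the slides.

```python
def equivalent_time_1ghz(measured_time_s: float, cpu_ghz: float) -> float:
    """Scale a time measured on a cpu_ghz processor to its 1 GHz equivalent.

    A job that runs on a faster clock finishes sooner, so the equivalent
    1 GHz time is the measured time multiplied by the clock speed in GHz.
    """
    return measured_time_s * cpu_ghz

# Example: an event filtered in 0.5 s on a 2.0 GHz node corresponds to
# 1.0 s on a 1 GHz processor.
print(equivalent_time_1ghz(0.5, 2.0))
```

This lets filtering times measured on the farm's mixed node types be compared on a single axis.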
Level 3 farm node purchase plans • Plan: single purchase, by summer 2005, of $210K* of nodes • 3 racks of 32 (40) = 96 (120) nodes plus infrastructure (if pricing allows) • Strategy: • This is an "off the shelf" purchase, but a major one • Similar to a CompDiv farms purchase • Used a Run IIa purchase to refine the procedure: 32-node addition operational on 6/23/04 • The plan:

  Dual nodes   "GHz"   Plan
  48           1.0     to be retained (initially)
  34           1.6     existing
  32           2.0     existing
  32           2.0     borrowed, to be returned
  96 (120)     2.2     to be added

  464 GHz-equivalent CPUs now; 758 (864) GHz-equivalent CPUs for the start of Run IIb

• Thanks to Computing Division for help! * Unburdened FY02 $
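The capacity totals on this slide can be cross-checked with a short calculation. This is a sketch under the assumption that each "dual node" contributes two CPUs and that capacity is simply nodes × 2 × clock speed; with that simple model the results land within about 1% of the slide's 464 and 758 (864) figures, the small difference presumably being per-node conversion factors or rounding.

```python
def capacity_ghz(nodes: int, ghz: float) -> float:
    """GHz-equivalent capacity, assuming 2 CPUs per dual node."""
    return nodes * 2 * ghz

# Current farm: retained + existing + existing + borrowed blocks
now = (capacity_ghz(48, 1.0) + capacity_ghz(34, 1.6)
       + capacity_ghz(32, 2.0) + capacity_ghz(32, 2.0))

# Run IIb: return the borrowed 32-node block, add 96 (or 120) 2.2 GHz nodes
run2b     = now - capacity_ghz(32, 2.0) + capacity_ghz(96, 2.2)
run2b_opt = now - capacity_ghz(32, 2.0) + capacity_ghz(120, 2.2)

print(round(now), round(run2b), round(run2b_opt))  # → 461 755 861
```

Compare with the quoted 464 now and 758 (864) for the start of Run IIb.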
Level 3: M&S Summary • This purchase will nearly double existing processor capacity • But is this enough? • We would like a contingency to add 64 more nodes (up to limit of network switch) • Need based on evaluation of performance at highest observed luminosities by Summer 2005 Will use this format at WBS Level 3, then summarize at end
Host systems • Need: • Replace the old 3-node Alpha cluster, which had the functions: • Event data logger, buffer disk, transfer to FCC • Oracle database • NFS file server • Detector monitoring, logging (alarms, e-log, web, …) • Plan: • Replace with Linux servers • Install a number (~4) of clusters which supply "services" • Shared Fibre Channel (FC) storage and failover software to provide flexibility and high availability • $247K* for processor and storage upgrades • Status: • In operation since the Fall shutdown, in a minimal configuration * Unburdened FY02 $
DØ Online Linux clusters • [Diagram: four clusters (DAQ Services, Database, File Server, Online Services) connected to clients through a network switch, and to a SAN through dual Fibre Channel switches; storage comprises legacy and new RAID and JBOD arrays] • Working in this configuration, but with a minimal population in each box
Control System • Need: • The current control system processors (~100 of them) • are no longer available for purchase • are limiting functionality in some areas • Plan: • Upgrade ~1/2 of the control system processors • Purchase a current-generation processor • Use replaced processors for Calorimeter readout crates, test stands, and spares • Eliminate all mv162 (68K processor) boards • Strategy: • $140K* to upgrade processors • Scheme for replacement on next slide * Unburdened FY02 M&S $
Control System Status • Implementation status: • mv5500 works in Muon system crates • Prototyped in CFT/SMT tracking readout crates • Targeting Summer '05 for transitions • Purchasing status: • 45 new processors planned; 55 adapter boards required • 19 processors in hand, 6 back-ordered, order for 20 more pending • All adapter board flavors ordered
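The procurement numbers above can be checked for consistency: the processors in hand, back-ordered, and in the pending order should account for all 45 planned units. A minimal sketch, with all figures taken from the slide:

```python
# Processor procurement status from the slide
in_hand, back_ordered, pending = 19, 6, 20
planned = 45

# 19 + 6 + 20 = 45: the pending order closes the gap exactly
assert in_hand + back_ordered + pending == planned
print("processors accounted for:", in_hand + back_ordered + pending)
```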
Conclusion • Three activities: • Level 3 • Host systems • Control system • Level 3 is an “addition of nodes” • Host system changes are most revolutionary • Successful transition during Fall 2004 shutdown • Now expanding beyond minimal configuration • Control system is a “replacement of nodes” • With current generation processor • Creating a spare pool for long-term operation On track, with “at the last moment” spending