WBS 3.2 – Data Acquisition. Paris Sphicas/CERN-MIT US CMS Level-2 DAQ manager DOE/NSF Review May 8, 2001. Outline. Overview of DAQ & High Level Trigger Status and Technical Progress Scope and Contingency Since Last Review Committee Concerns and Issues Plans Summary and Conclusions.
WBS 3.2 – Data Acquisition • Paris Sphicas/CERN-MIT • US CMS Level-2 DAQ manager • DOE/NSF Review • May 8, 2001
Outline • Overview of DAQ & High Level Trigger • Status and Technical Progress • Scope and Contingency Since Last Review • Committee Concerns and Issues • Plans • Summary and Conclusions
System Overview: DAQ • Original design: Lvl1 @ 100 kHz • Rescope in 1997: 75 kHz, but design all elements to be able to do 100 kHz • [Figure: data flow from detector to archive] • DETECTOR CHANNELS: 16 million channels; COLLISION RATE: 40 MHz; 3 Gigacell buffers • LEVEL-1 TRIGGER (charge, time, pattern; energy, tracks): 75 kHz • EVENT DATA: 1 MB; 1 Terabit/s; 200 GB buffers • READOUT: ~ 400 readout memories; 50,000 data channels • EVENT BUILDER: a large switching network (400+400 ports) with total throughput ~ 400 Gbit/s forms the interconnection between the sources (deep buffers) and the destinations (buffers before farm CPUs) • SWITCH NETWORK: 500 Gigabit/s • EVENT FILTER: ~ 400 CPU farms, 5 TeraIPS; a set of high-performance commercial processors organized into many farms convenient for on-line and off-line applications • FILTERED EVENT: 100 Hz • Computing Services: Gigabit/s SERVICE LAN; Petabyte ARCHIVE
DAQ architecture • Must reduce 1 GHz of input interactions to 100 Hz • Do it in steps/successive approximations: “Trigger Levels” • “Traditional”: 3 physical levels (Lvl-1, Lvl-2, Lvl-3) • CMS: 2 physical levels (Lvl-1, HLT) • [Figure: both chains run Detectors, Front end pipelines, Readout buffers, Switching network, Processor farms]
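The rate reduction quoted on this slide can be checked with a few lines of arithmetic. This is a minimal sketch using only the rates that appear on the slides (1 GHz of interactions, 75 kHz after Level-1, 100 Hz to storage); the function name `rejection_factor` is introduced here for illustration.

```python
# Rates quoted on the slides, in Hz.
INTERACTION_RATE = 1e9   # ~1 GHz of input interactions
LEVEL1_OUTPUT = 75e3     # Level-1 accept rate after the 1997 rescope
HLT_OUTPUT = 100.0       # rate of filtered events sent to storage

def rejection_factor(rate_in, rate_out):
    """Factor by which one trigger step reduces the event rate."""
    return rate_in / rate_out

level1 = rejection_factor(INTERACTION_RATE, LEVEL1_OUTPUT)
hlt = rejection_factor(LEVEL1_OUTPUT, HLT_OUTPUT)
total = rejection_factor(INTERACTION_RATE, HLT_OUTPUT)

print(f"Level-1 rejection: {level1:.0f}")
print(f"HLT rejection:     {hlt:.0f}")
print(f"Overall rejection: {total:.0e}")
```

The overall factor is the product of the per-level factors, which is why the reduction is done in successive steps.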
2 vs 3 physical levels • Two Physical Levels (Lvl-1, HLT) • Investment in: • Bandwidth • Commercial Processors • Three Physical Levels (Lvl-1, Lvl-2, Lvl-3) • Investment in: • Control Logic • Specialized processors (possibly) • [Figure labels: Model; Data; Bandwidth; Data Access; Bandwidth; Processing Units]
CMS DAQ: US contribution • CERN: Inputs + Switch • US: Event Manager + Builder Units • US: Outputs + EVM • Filter Units (FU) not included in “outputs” • Other responsibilities: • Detector Front-Ends: detector groups • Computing Services: infrastructure • [Figure blocks: Detector Front-end; Level 1 Trigger; Readout Systems; Event Manager; Builder Networks; Run Control; Builder and Filter Systems (BU, FU); Computing Services]
Developments last year (I) • Multistep Event Building no longer necessary • Initial decision to invest in networking and computing technologies proving correct • Today: two alternatives: Myrinet 2000 (2.5 Gb/s links) and/or two Gbit/s Ethernet links/RU • Tomorrow: + Infiniband (?)
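A rough sketch of why roughly 2 Gbit/s per Readout Unit is the relevant scale for these link choices. It assumes the round numbers from the System Overview slide (1 MB events, ~400 readout units) and an even spread of event data across units; the function name is hypothetical. The aggregate figures quoted on that slide (400 to 500 Gbit/s) are lower than this simple product, so treat the result as an estimate, not the design value. Protocol overhead is ignored.

```python
EVENT_SIZE_BYTES = 1e6   # ~1 MB per event (System Overview slide)
N_READOUT_UNITS = 400    # ~400 readout memories (System Overview slide)

def per_ru_gbit_per_s(level1_rate_hz):
    """Average output bandwidth one Readout Unit must sustain, in Gbit/s,
    assuming event data is spread evenly over all Readout Units."""
    total_bits_per_s = level1_rate_hz * EVENT_SIZE_BYTES * 8
    return total_bits_per_s / N_READOUT_UNITS / 1e9

for rate in (75e3, 100e3):
    need = per_ru_gbit_per_s(rate)
    print(f"{rate / 1e3:.0f} kHz -> {need:.1f} Gbit/s per RU; "
          f"within 2 x GbE raw rate: {need <= 2.0}; "
          f"within Myrinet 2000 raw rate: {need <= 2.5}")
```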
Developments last year (II) • Physics Reconstruction and Selection (PRS): new project in CMS; along with CCS and TriDAS (online): CPT • [Org chart: Joint Technical Board; TRIDAS (Online farm); Core Computing & Software; Physics Reconstruction & Selection; Martti Pimia, David Stickland, Paris Sphicas, Sergio Cittolin; Reconstruction Group, RPROM (Stephan Wynhoff); Simulation Group, SPROM (Albert De Roeck); Architecture Task Force, CAFE (Jim Branson); …]
Developments last year (III) • High Level Trigger: included in “PRS” • Defining “Level-2” as anything doable without tracking information, Level-2 is ~ complete • New LHC schedule → new date for DAQ TDR • First beams in early 06, first physics in Aug 06 • Submission date was always set to T0(LHC)-3.5 yrs. • With new schedule, submission goes to end (Nov 30) 2002 • Schedule & Milestones: • Unchanged, especially for the HLT/PRS part(s) • What gets delayed is decisions on technologies to use, etc., but not the results of the studies. • However, with another year of technology progress behind us, we can expect most of the data transfer issues to be resolved, so we can concentrate on (a) the algorithm itself and (b) the CPU needed
Progress Since Last Review • 16x16 Event Builder Demonstrator complete: • Based on Myrinet-2000: • Barrel-shifter works at close to 100% (raw) efficiency • Based on Gbit Ethernet: • Looks very promising, especially if 10 Gbit Ethernet arrives in time • Designs for 500x500 switch available • Simulation results very promising • Builder Unit prototype: • Two solutions being looked at: • Custom-made board (commercial components) • Recycling of units made for Readout into a PC • High Level Trigger: • “Level-2” equivalent algorithms in place • Now working on “Level-3” (~ includes tracker information)
Progress: Readout Unit Aim: complete chain test in 2001
Progress: switch • 16x16 EVB based on Myrinet and on Gbit Ethernet now complete • Barrel-shifting gives non-blocking behavior • [Figure: Readout Units RU0, RU1, RU2, RU3, ... connected through the switch to Builder Units BU0, BU1, BU2, BU3, ...; labels 2k, 4k]
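The barrel-shifting idea can be sketched as a fixed rotation: in time slot t, source i sends to destination (i + t) mod N, so no two sources ever target the same output in the same slot. The code below is a minimal illustration of that textbook scheme, not the demonstrator's actual traffic-shaping software; all function names are hypothetical.

```python
def barrel_shifter_schedule(n_ports):
    """Return one dict per time slot; slot t maps source i -> (i + t) % n_ports."""
    return [{src: (src + t) % n_ports for src in range(n_ports)}
            for t in range(n_ports)]

def is_non_blocking(schedule):
    """True if, in every slot, no two sources target the same destination."""
    return all(len(set(slot.values())) == len(slot) for slot in schedule)

def covers_all_pairs(schedule, n_ports):
    """True if every source reaches every destination once per full cycle."""
    return all({slot[src] for slot in schedule} == set(range(n_ports))
               for src in range(n_ports))

schedule = barrel_shifter_schedule(16)   # 16x16, the size of the demonstrator
print(is_non_blocking(schedule), covers_all_pairs(schedule, 16))
```

Because each slot is a permutation of the outputs, every link can stay busy in every slot, which is consistent with the near-100% raw efficiency reported for the Myrinet demonstrator.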
Progress: BU prototype • Current device • Done at UCSD • [Figure labels: Copper Gbit Ethernet NIC; PowerPC CPU; RAMlink interface]
Progress: HLT algorithms • PRS groups in place since 4/99; priority on HLT • Using new (OO) software reconstruction (ORCA) • “Level-2” equivalent code in place; now “Level-3” • The question: when should we add tracking information?
Progress: control software • [Figure: run-control hierarchy. PARTITION i, PARTITION j, PARTITION k, each with a GUI and a RUN MANAGER; Sub-Systems Managers: Trigger Mngr, EVB Mngr, EF Mngr, DCS Mngr, CS Mngr; Sub-Systems Resources: Trigger System, EVB System, EF System, DCS System, CS System] • Sub-Systems: • EVB = Event Builder • EF = Event Filters • DCS = Detector Control System • CS = Computing Service • LHC = LHC Main Control • CMS Sub-System
Progress: towards a composite switch (I) • Issue: design a 500x500 switch fabric out of smaller (e.g. 32x32, 64x64) basic switches • Clos Network (93) • Banyan Network (46) • Clos-128 switch • Using Myrinet 2000 (available today) • Bisection bandwidth 1 Tbps • 6 layer; 512 minimal routes for each source-destination pair • [Figure: multistage network from sources S1, S2, ..., Sr to destinations D1, D2, ..., Dr, built from nxm, rxr, mxn and nxn switch elements]
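The nxm, rxr and mxn labels in the figure match the classic three-stage Clos layout: r ingress switches of size n x m, m middle switches of size r x r, and r egress switches of size m x n. The sketch below sizes such a fabric and applies the standard conditions (m >= n for rearrangeably non-blocking, m >= 2n - 1 for strictly non-blocking). The parameters chosen (512 ports, n = m = 16) are illustrative assumptions, not the Clos-128 design on the slide.

```python
import math

def clos_design(n_ports, n, m):
    """Size a three-stage Clos fabric with n_ports inputs and n_ports outputs.

    n: external ports per edge switch; m: number of middle-stage switches.
    """
    r = math.ceil(n_ports / n)   # edge switches needed on each side
    return {
        "r": r,
        "ingress": (r, f"{n}x{m}"),    # (count, switch size)
        "middle": (m, f"{r}x{r}"),
        "egress": (r, f"{m}x{n}"),
        "total_switches": 2 * r + m,
        "rearrangeably_non_blocking": m >= n,
        "strictly_non_blocking": m >= 2 * n - 1,
    }

design = clos_design(512, n=16, m=16)   # hypothetical parameters
for key, value in design.items():
    print(f"{key}: {value}")
```

With these assumed parameters the middle stage needs 32x32 elements, the size range the slide mentions for basic switches. A fabric that is only rearrangeably non-blocking relies on scheduled traffic (such as barrel-shifting) to avoid contention.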
Progress: towards a composite switch (II) • [Figure: composite switch layout connecting groups of 20 RUs to groups of 20 BUs; labels: 40 Ports, 25 Ports, 20 Ports, 2 Ports 10G, 10G; remaining numeric labels not recoverable]
DAQ - BCWS and BCWP • Cumulative BCWP/BCWS = 95%; little schedule slippage • DAQ has completed BCWP/EAC = 18% of the project. • [Plot annotations: Change Control to account for demonstrator schedule; Change in accounting (AY$) (+ delayed actuals reported); FNAL Software Engr added; And work starts going faster]
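For readers unfamiliar with the earned-value terms, the two ratios on this slide are computed as below. The input values are hypothetical, in arbitrary units, chosen only so that the schedule ratio comes out at the slide's 95%; they are not the project's actual figures.

```python
def schedule_performance_index(bcwp, bcws):
    """BCWP/BCWS: work performed relative to work scheduled (1.0 = on schedule)."""
    return bcwp / bcws

def schedule_variance(bcwp, bcws):
    """BCWP - BCWS: negative means behind schedule."""
    return bcwp - bcws

def fraction_complete(bcwp, eac):
    """BCWP/EAC: share of the estimated total already earned."""
    return bcwp / eac

# Hypothetical inputs, arbitrary units (illustration only).
bcws, bcwp = 1000.0, 950.0
spi = schedule_performance_index(bcwp, bcws)
sv = schedule_variance(bcwp, bcws)
print(f"SPI = {spi:.0%}, schedule variance = {sv:+.0f}")
```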
DAQ - Contingency Use • DAQ decreased its cost in FY00$: drop of ~ 3%. • Most of the change due to M&S (prices dropping). • Effect increased in AY$ units. • Moved profile to later: 2003-04 procurements (now) scheduled for 2004-05. AY$ increase. • Recosting (mainly)
DAQ - Yearly BCWS • Old schedule: most of the cost (3.3M$ out of 4.4M$ total) in FY03 & FY04 • New schedule: same cost is now distributed in years FY03-FY04-FY05 • Not final… exact schedule for (crucial) 04-05 period to be defined at TDR • [Chart labels: 2406, 861, 404]
DAQ – ML 1-2 • All of them have been met • Two in past 12 months • Only change: DAQ TDR is anticipated for end 2002 • Will determine set of milestones for “production/building” stage @ TDR time
Last Review Concerns • Concerns from last time: • Add a physicist or software professional familiar with data acquisition to the data acquisition effort. This project has made good progress with the manpower it has from the CMS project and the support of the base high energy physics program, but additional manpower, as recommended last year, is still important. It would be best to hire an individual in the next year who could then participate in the development of the TDR for data acquisition and would remain committed to CMS through the turn-on of the data acquisition system in 2005. • Response: US CMS have made a high priority request to the base program for additional support at U.C. San Diego. This request was made at the meeting between US CMS and DOE/NSF on Sept. 11 and it was well received. It is therefore assumed that an additional postdoc will be available to work on the DAQ effort. Should that not come about, the recommendation will be revisited in order to find an alternative solution.
Plans for this year • HLT: complete Level-3 equivalent code • Goal is to get rate down to ~ few kHz (Lvl-3) • Create first trigger table for O(100)Hz output (Lvl-4) • DAQ: complete demonstrator to 32x32 • Complete comparison with simulation • Test out 2-Gbit scenarios • Vertical chain test • Integrate Readout Unit, Switch, Builder Unit + Event Manager in one testbed • Check hardware/software interoperability • TDR: aim for first draft at end of 2001
Summary & Conclusions • Technology is moving fast in the right direction • Single-step EVB is now the baseline design • EVB prototype program • Very good results from traffic shaping (16x16) • EVM and BU on track • High Level Trigger • Organizational changes: PRS project • Full Lvl-2 results in July 2000; now on Lvl-3 • Project Management • Schedule reasonable (95% on track) • Cost experience so far • TDR: new date: end 2002; aim for draft end 2001
DAQ - Estimate to Complete • DAQ Cost to complete: 4.389 M$ • Contingency: 2.371 M$ (54%) (adequate, given most of cost is in M&S)
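The contingency percentage follows directly from the two figures on this slide; a short check, with variable names introduced here for illustration:

```python
COST_TO_COMPLETE_MUSD = 4.389   # from the slide
CONTINGENCY_MUSD = 2.371        # from the slide

contingency_fraction = CONTINGENCY_MUSD / COST_TO_COMPLETE_MUSD
total_with_contingency = COST_TO_COMPLETE_MUSD + CONTINGENCY_MUSD

print(f"Contingency: {contingency_fraction:.0%} of cost to complete")
print(f"Cost to complete plus contingency: {total_with_contingency:.3f} M$")
```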
DAQ Resource Usage • Engineering and Technical resources are compared to the people called out in the annual SOW. This tracking ensures that the needed labor is deployed.