This proposal suggests upgrading the IBL ROD system by implementing the RCE/CIM concept and utilizing the ATCA platform. The upgrade includes features such as high-speed I/O capabilities, extensive software infrastructure, and a distributed TTC interface.
Update of IBL ROD proposal based on RCE/CIM concept and ATCA platform • Rainer Bartoldus, Andy Haas, Mike Huffer, Martin Kocian, Su Dong, Emanuel Strauss, Matthias Wittgen (SLAC) • Erik Devetak, Dmitri Tsybychev (Stony Brook)
IBL ROD Upgrade Scheme • Read Out Module • Initial mode: pure ROD behavior to output via S-link to ROS from each ROM. • Upgrade mode: combined ROD+ROS behavior directly output to Ethernet.
Essential Features of RCE on ATCA • Generic DAQ concept with RCE born out of analysis of previous HEP DAQ systems to establish basic building blocks serving the common needs of a broad range of applications. • Explore modern System-On-Chip technology with e.g. Virtex-4 FPGAs with versatile integrated resources. • High speed I/O capabilities for multi Gb/s transmissions to fully utilize FPGA processing power and reduce system footprint. • Implementation over ATCA based crate infrastructure to benefit from modern telecommunication technology. • A system consists of RCE processing boards and Cluster Interconnect Modules (CIM) to utilize ATCA point-to-point serial backplane connections for high bandwidth data movements and 10GE ethernet access. • Rear Transition Modules (RTM) to facilitate custom user I/O. • Extensive software infrastructure and utilities are an integral part of the design.
Reconfigurable Cluster Element (RCE) [Block diagram. Current implementation on Virtex-4 FPGA; next generation with Virtex 5 and RCE memory of 1-2 GBytes. Core: 450 MHz PPC-405 processor, reset & bootstrap options, memory subsystem with cross-bar, 512 MByte RLDRAM-II, configuration in 128 MByte flash. Resources: combinatoric logic, MGTs, DSP tiles (192 MAC units), DX ports. Plus extensive associated software infrastructure and utilities.]
RCE Software & Development • Cross-development… • GNU cross-development environment (C & C++) • remote (network) GDB debugger • network console • Operating system support… • Bootstrap loader • Open Source Real-Time kernel (RTEMS) • POSIX compliant interfaces • Standard IP network stack • Exception handling support • Object-Oriented emphasis: • Class libraries (C++) • Plugin support • Configuration Interface
A 48 channel ROM (Read Out Module) [Board diagram with a 2x6 label. Note on slide: needs update for S-link from CIM switch to P3.]
RCE Development Lab at CERN HSIO & pixel module at back of rack
RCE Development Status • An open collaboration of anyone interested in exploring the RCE platform for ATLAS upgrades. • Training workshop June/09 at CERN: http://indico.cern.ch/conferenceOtherViews.py?view=standard&confId=57836 contains documentation, online examples, mailing list instructions, RCE lab account signup etc. Everyone is welcome to explore! • Work is underway to port current pixel calibrations to RCEs with modern pixlib+TDAQ code, aiming at FE-I4 tests, test beam and IBL stave-0. • A compact RCE+HSIO test stand board is planned for Feb. • Full set of prototypes in coming months to demonstrate integration of IBL I/O, S-link, TTC interface. • A significantly upgraded generation-2 RCE with Xilinx Virtex 5 is envisioned this year (more memory and user firmware space).
IBL Upgrade Hardware Components (I) • ROM • Regular ROM assumes all functionalities of the present ROD, with room to host ROS functionalities. • Each ROM has 6 RCEs (FPGAs) hosting 12 cores to process 48x160 Mb/s input FEs (only ~1/5-10 of RCE I/O capacity). • RCE includes all resources for data formatting, DAQ data flow, calibration + memory in the present ROD. • Each ROM has one additional CIM FPGA hosting network switch, S-link data gathering/formatting, TTC interface. • RTM(ROM) • Assumes only the front-end communication roles of the present BOC, while only hosting simple drivers for S-links. • 48 channel compact optical components for TX/RX functionalities. • No need to deal with 8b/10b encoding as the RCE has embedded native utilities to encode/decode.
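The input figures quoted for one ROM can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the 48 front-end links are spread evenly over the 6 RCEs and 12 cores; the function and constant names are illustrative and do not come from the RCE software.

```python
# Input-bandwidth bookkeeping for one ROM, using the figures quoted on the slide.
FE_LINKS_PER_ROM = 48     # front-end input channels per ROM
FE_LINK_RATE_MBPS = 160   # Mb/s per front-end link
RCES_PER_ROM = 6
CORES_PER_ROM = 12


def rom_input_budget():
    """Return total, per-RCE and per-core input rates (Mb/s), assuming an even split."""
    total = FE_LINKS_PER_ROM * FE_LINK_RATE_MBPS
    return {
        "total_mbps": total,
        "per_rce_mbps": total // RCES_PER_ROM,
        "per_core_mbps": total // CORES_PER_ROM,
        "links_per_core": FE_LINKS_PER_ROM // CORES_PER_ROM,
    }


if __name__ == "__main__":
    for name, value in rom_input_budget().items():
        print(f"{name}: {value}")
```

Under these assumptions one ROM takes in 7680 Mb/s in total, i.e. 1280 Mb/s per RCE and 640 Mb/s (4 links) per core.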
IBL Upgrade Hardware Components (II) • CIM • Assumes the network interconnect management and external interface roles to cover present SBC and TIM functionalities. • RCE master + 2 Fulcrum FM224 ASICs for 10 GE network switching. • RTM(CIM) • Ethernet I/O connections. • Some functionalities of the present TIM and drivers for I/O with the pixel system TTC crate. • There is no longer a dedicated TIM module in the system. The TTC interface is distributed, with a TTCrx ASIC next to each RCE. • There is no longer an SBC in the system. The role of the CPU in the SBC for interfacing TDAQ to the VME board is taken up by the distributed RCE CPUs (therefore avoiding the limitations of a single ethernet port and VME backplane bandwidth at the SBC).
TTC Distribution in RCE/CIM crate [Diagram of a half crate] • QPLL will be absorbed into RCE. • Distributed interface with TTCrx ASIC paired with each RCE.
ROM RTM Interface • The general strategy is to bring plenty of signals from each RCE to the P3 connector in a uniform way, keeping the ROM completely generic while the RTM can be changed for different applications. • Connection count for IBL RTM: • 1 pair of wires for clock40 from each RCE (2 cores) • 4 wires (2 up + 2 down) per channel per core (2 cores/RCE, 4 channels/core) • 3+1 wires of I2C per core (+1 = spare) • additional 8 wires per core spare ? → 7 RCE x ( 2 + 4x4x2 + 4x2 + 8x2 ) = 406 • Current RCE board already has 500 pin P3 p
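The connection count above can be reproduced term by term. A minimal sketch follows; the function name and parameter names are hypothetical, and the default of 7 RCEs is simply the multiplier used in the slide's formula.

```python
def p3_wire_count(n_rce=7, cores_per_rce=2, channels_per_core=4):
    """Return (wires per RCE, total wires) for the IBL RTM connection count."""
    clock = 2                                        # 1 pair for clock40 per RCE
    data = 4 * channels_per_core * cores_per_rce     # 2 up + 2 down per channel
    i2c = (3 + 1) * cores_per_rce                    # 3 I2C wires + 1 spare per core
    spare = 8 * cores_per_rce                        # additional spare wires per core
    per_rce = clock + data + i2c + spare
    return per_rce, n_rce * per_rce


if __name__ == "__main__":
    per_rce, total = p3_wire_count()
    print(f"{per_rce} wires per RCE, {total} wires in total")
```

With the default arguments this gives 58 wires per RCE and 406 in total, which fits within the 500 pins quoted for the current P3 connector.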
Upgrade ROM Benefits • Allow more frequent/extensive/faster calibration • Calibration histogram data output path via 10GE ethernet will completely remove data shipping timing concerns. • 4x (12x) more memory per pixel than current VME ROD for IBL (current outer layer), and the memories are internal within RCE with much faster access. • PowerPC programming environment is much easier than DSPs for complex algorithms, while the 192 DSP tiles/RCE offer large processing power for repetitive simple processing. • Smaller footprint modern hardware for easier production, installation and maintenance. • A simpler variation of the ROM with present RCEs offers prototype and test stand boards to meet FE-I4 test and stave test needs, with the same software preserved into the full system. • Has built-in architecture evolution flexibility to explore upgrade schemes such as integrated ROD+ROS and potential services to trigger with the very high bandwidth.
Backward Compatibility & Commissioning • Despite the different look of the hardware, the user interface will be no different from the existing pixel detector, and the interface to the rest of pixel DAQ and TDAQ will also look like just another pixel crate (until we try to become ROD+ROS). • The new system can also be made to run on the present b-layer so that fiber splitting can be done early on with the real system as parasitic DAQ commissioning (as extensively used in BaBar/Tevatron). Old b-layer RODs can become (plenty of) spares for outer layers. • The most important issue is software compatibility: • Most existing calibration DSP code is adaptable to the RCE CPU, which is a much easier environment. • The SBC TDAQ interface and infrastructure code can also run on the RCE. VME is not the magic word to guarantee software backward compatibility, while flexible modern hardware may do better than naïve prejudice...
Application of RCE to Pixel Calibration • Pixel digital calibration demo by Martin Kocian, demonstrated at the RCE training workshop, Jun/15-16/2009 at CERN. [Setup diagram: existing pixel module, HSIO, 3 Gb/s link, CIM, 10-GE Ethernet. Screenshots: after a few mask stages; end of calibration.]
Calibration Software Progress • June setup was a bit of a hack on old pixlib, while we would like to have software written once for test stand -> full DAQ system. • Martin created a new framework adapting the current Pixel Action Server based on the TDAQ IS/IPC infrastructure. This now runs on RCE. Compatibility with full system DAQ/calibration! • June setup had only the digital calibration • Matthias ported the threshold scan DSP code to RCE and had it running after a couple of weeks. Needed to convert a few floating point ops to i64 for the fit. • June setup data formatting software was a bit slow • Got some magic code from JJ Russell that sped it up by a factor of ~200! Erik and Martin debugged and commissioned the fast formatter. • June setup configuration was a bit slow • Martin and Erik are commissioning the blockwrite RCE firmware update • June setup was using a command line control • Emanuel/Andy are integrating the ST control interface
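Replacing floating point operations with 64-bit integer arithmetic, as mentioned for the threshold scan fit, usually means carrying a fixed-point scale factor through the calculation. The sketch below is only an illustration of that idea: it uses a plain straight-line least-squares fit as a stand-in (the actual threshold scan fit is not shown in the slides), and the Q16 scale, function names and overflow check are all assumptions.

```python
FRAC_BITS = 16            # assumed fixed-point scale (Q16); not taken from the slides
INT64_MAX = 2**63 - 1


def _check_i64(value):
    """Raise if an intermediate result would not fit in a signed 64-bit integer."""
    if not -INT64_MAX - 1 <= value <= INT64_MAX:
        raise OverflowError("intermediate value exceeds the int64 range")
    return value


def fit_line_fixed_point(xs, ys):
    """Integer-only least-squares line fit.

    Returns (slope, intercept) as integers scaled by 2**FRAC_BITS.
    Note: Python's // rounds toward minus infinity, whereas C integer
    division truncates toward zero, so negative results can differ by 1 LSB.
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = _check_i64(n * sxx - sx * sx)
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    slope_num = _check_i64((n * sxy - sx * sy) << FRAC_BITS)
    intercept_num = _check_i64((sy * sxx - sx * sxy) << FRAC_BITS)
    return slope_num // denom, intercept_num // denom


if __name__ == "__main__":
    xs = list(range(10))
    ys = [2 * x + 1 for x in xs]
    slope_q, intercept_q = fit_line_fixed_point(xs, ys)
    print(slope_q / 2**FRAC_BITS, intercept_q / 2**FRAC_BITS)
```

The conversion back to floating point in the last line is only for display; on a target without a floating point unit the scaled integers would be used directly.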
Software Time Line • On track to demonstrate the full calibration chain for digital and threshold scans by IBL gen meeting in Feb., with a modern pixel TDAQ code infrastructure and hopefully ST control interface. • Full set of calibrations can be ready by late summer this year for FE-I4 testing. • Multi-channel readout with existing RCE+HSIO is already being explored and can be expected to be in operation sometime this year for test beam and cosmic telescope if priority is assigned.
Hardware Time Line • Existing RCE + HSIO • Can already run all calibrations for FE-I4/sensor tests without optical link. HSIO serves the “eBOC” function (and much more) compared to the old system. • With a simple new RTM can also run multi-channel test beam and cosmic telescope and even stave-0 (may be a bit slow). • +Prototype RTM (spring this year?): • IBL and current pixel optical link validation • S-link RCE plugin validation • +Gen-1 RCE prototype ROM and CIM with TTC (summer ?) • TTC interface validation • +Gen-2 RCE full ATLAS ROM/CIM prototypes (early 2011) • Full blown multi-channel DAQ/calibration tests. • Ready for stave-0.
RCE+HSIO combined board • RCE boards have a strong software base for flexible and fast development, but are rather bulky with the ATCA crate infrastructure and have excess resources not needed for a test stand. • HSIO has the large variety and multiplicity of I/O channels to serve a wide range of applications, but the vast FPGA resources are not easy to explore with coding only in firmware. • Dave Nelson is working on a combined test stand board merging RCE and HSIO: • A slimmed down single FPGA RCE and software support • A separate Virtex-5 FPGA plays the original HSIO role • Same variety of I/O channels as HSIO • Same simple stand alone bench operation as HSIO with just an external 48V, but can also just plug into an ATCA crate • Expecting to roll for production in Feb/Mar with a rather large demand (~15) from strip upgrade.
Summary • An RCE/ATCA based ROD for IBL can easily meet the IBL ROD requirements and offers extra margin for much improved performance. • The project is very much realizable on the IBL time frame owing to the well advanced R&D already carried out at SLAC for other projects, so that the manpower needs are not excessive. • The upgrade system has a small hardware footprint and moderate cost. • The application software effort can benefit from integrated core software utilities and now has a clear path forward for fast progress in calibration implementation. • This can be a very beneficial forward looking step for ATLAS pixel and DAQ in general to evolve smoothly into a modern architecture with extra capacity to allow potentially more innovative use of the pixel system (e.g. in trigger). Additional collaborating efforts are very much welcome!
RCE Hardware Resources • Multi-Gigabit Transceivers (MGTs) • up to 12 channels of: • SER/DES • input/output buffering • clock recovery • 8b/10b encoder/decoder • 64b/66b encoder/decoder • each channel can operate at up to 6.5 Gb/s • channels may be bound together for greater aggregate speed • Combinatoric logic • gates • flip-flops (block RAM) • I/O pins • DSP support • contains up to 192 Multiply-Accumulate (MAC) units
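Line coding reduces the payload a serial channel can carry: 8b/10b sends 10 line bits for every 8 data bits, and 64b/66b sends 66 for every 64. The following minimal sketch applies those ratios to the quoted channel count and maximum line rate; the function names are illustrative, and whether any given link actually runs at the maximum rate is not stated in the slides.

```python
# Fraction of the line rate left for payload under each line code.
CODING_EFFICIENCY = {
    "8b/10b": 8 / 10,
    "64b/66b": 64 / 66,
}


def payload_rate_gbps(line_rate_gbps, coding):
    """Payload rate of one channel after line-coding overhead."""
    return line_rate_gbps * CODING_EFFICIENCY[coding]


def aggregate_rate_gbps(line_rate_gbps, n_channels, coding):
    """Payload rate of several channels bound together."""
    return n_channels * payload_rate_gbps(line_rate_gbps, coding)


if __name__ == "__main__":
    for coding in CODING_EFFICIENCY:
        single = payload_rate_gbps(6.5, coding)
        bonded = aggregate_rate_gbps(6.5, 12, coding)
        print(f"{coding}: {single:.2f} Gb/s per channel, {bonded:.1f} Gb/s for 12 channels")
```

At the quoted 6.5 Gb/s maximum, a channel carries about 5.2 Gb/s of payload with 8b/10b and about 6.3 Gb/s with 64b/66b.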
The Cluster Interconnect (CI) [Block diagram: 10-GE L2 switches with port groups Q0, Q1, Q2, Q3, connected to an RCE over a management bus.] • Based on two Fulcrum FM224s • 24 port 10-GE switch • is an ASIC (packaged in a 1433-ball BGA) • XAUI interface (supports multiple speeds including 100-BaseT, 1-GE & 2.5 Gb/s) • less than 24 watts at full capacity • cut-through architecture (packet ingress/egress < 200 ns) • full Layer-2 functionality (VLAN, multiple spanning tree etc.) • configuration can be managed or unmanaged
Cluster Interconnect board + RTM (Block diagram) [Diagram labels: CI with port groups Q0-Q3 and MFD; P2 (base) and fabric connections; P3 to payload RTM; XFP cages for base and fabric; 1-GE and 10-GE external links.]
sLHC Upgrade Read-Out-Crate (ROC) [Crate diagram labels: two CIMs, each with two 10-GE switches, switch management, L1 fanout and a Rear Transition Module on P3; inputs from L1; (x12) 10 Gb/s links to L2 & event building; (x4) 10 Gb/s links to the ROMs and (x4) 10 Gb/s over the backplane; shelf management; links to monitoring & control.]