Krzysztof Korcyl Institue of Nuclear Physics Polish Academy of Sciences &

Modeling event building architecture for the triggerless data acquisition system for PANDA experiment at the HESR facility at FAIR/GSI Krzysztof Korcyl Institue of NuclearPhysicsPolishAcademy of Sciences & CracowUniversity of Technology

Agenda • PANDA experiment – detectors and requirements for DAQ system • thepush-onlyarchitecture • ComputeNodein ATCA standard • data flowinthearchitecture • shortintroduction on discreteeventmodelling • modelingresults • latency • queuesdynamics • load monitoring • summary

PANDA detectors Experimentat HESR (High Energy Storage Ring) in FAIR (Facility for Antiproton and Ion Research ) complexat GSI, Darmstadt. • Trackingdetectors: • Micro VertexDetector • Central Tracker • Gas ElectronMultiplierStations • ForwardTracker • Particlesidentification: • DIRC (Detection of InternallyReflectedCherenkov) • Time of Flight System • MuonDetection System • Ring ImagingCherenkovDetector • Calorimetry: • Electromagneticcalorimeter

PANDA DAQrequirements • interactionrate: up to 20 MHz (luminosity 2* 1032 cm-2s-1) • typicaleventsize: ~4 kB • expectedthroughput: 80 GB/s (100 GB/s) • richphysics program requires a high flexibilityineventselection • front end electronics workingincontinuoussamplingmode • lack of hardware triggersignal

Thepush-onlyarchitecture SODA Time Distribution System Passivepoin-to-multipointbidirectionalfibernetworkproviding time referencewithprecissionbetterthan 20 ps and performssynchronization of data taking

ATCA crate and backplane ATCA – AdvancedTelecommunicationsComputingArchitecture Backplane: one of possibleconfigurationisfullmesh (connectseachpair of moduleswithdedicatedpoint-to-pointbidirectionalllink

ComputeNode Each board is equipped with 5 Virtex4 FX60 FPGAs. High bandwidth connectivity is provided by 8 Gbitopticallinksconnected to RocketIOports (6.5 Gb/s). In additiontheboardisequippedwithfiveGbit Ethernet links

Inter-cratewiring 416 Module inslot N attheFEElevelconnects, with 2 linkstrunks, to modulesatslots N attheCPUlevel. 416 Theoddeventspacketsatthe FEE levelare first routed via thebackplane and thenoutbound to the CPU level via a propertrunk. Theeveneventspacketsoutbound to the CPU level first and thenusebackplaneto changethe slot.

Inter-crateroutinganimation 205 366

OnboardVirtexconnections Virtex0 – handlesallcommunications to/fromthebackplane Virtex1-4 - manage 2 input and 2 outputportsatthe front panel

Discreteeventmodeling • Model – computer program simulating system dynamicsintime (supportfromSystemClibrary) • Fixed time step • Discreteevents t events t State of the system remainsconstantbetweenevents Processing system ina state maylead to scheduling a neweventinthefuture

Parameterization of ports SendFifo – occupationcangrowifmultiplewritesareallowed OR the link speedissmallerthanthespeed of write OR therecentpacketsaresmallerthantheformerones. ReceiveFifo – occupationcangrowifthequeueheadcan not be transferreddue to destinationbeing busy withanother transfer OR therecentpacketsaresmallerthantheformerones. The transfer speedis a parameter – duringthesimulationsit was set to 6.5 Gb/s (RocketIO)

Models of source and sink of data • Data source – Data Concentrator: • generates data packetwith a sizeproportional to the sum of number of inter-interactionscalculatedfrom Poisson distributionwithaverage of 20 MHz • Burst: 2 µs of interactions + 400 ns of silence gap • Super-burst: 10 bursts • Simulatesthe 8b/10b conversion, tagspacketswithdestination CPU number and pushesintothearchitecture. • Data sink – eventbuilding CPU: • Simulateseventbuilding of 416 fragmentswiththe same tag • size of burst: ~300 kB • size of super-burst: ~3 MB

Eventbuildinglatency Constantlatencyin time indicatesability of thearchitecture to switch not only 100 GB/s but also 173 GB/s

LoaddistributionbetweenCPUs The CPU for nexteventiscalculatedwiththeformula: Nt+1 = mod (Nt + 79) , 416)

Monitoring queues’ evolution Averagedmaximalqueuelengthininputportsatthe FEE level. Theaveraging was overtheportswiththe same indexinall FEE modules.

Monitoring links’ load Load on fiberlinksconnectingoutputportsfromthe FEE levelwithinputportsatthe CPU level. Homogeniousloadindicatesproperroutingscheme – also for trunking.

Monitoring queues Average of maximallength of inputqueuesatthe CPU level. Theaverage was madewithportswiththe same index on allmodules.

Monitoring Virtexlink’soccupation Atthe CPU level, thepacketsheading for odd-numberedCPUs go via thebackplane to the slot withdestination CPU.

Summary • We proposetheeventbuildingarchitecture for thetriggerless DAQ system for the PANDA experiment. • ThearchitectureusesComputeNodemodulesin ATCA standard. • We builtsimplifiedmodels of thecomponentsusingSystemClibrary and run thefullscalesimulations to demonstraterequired performance and to analysedynamics of the system. • Thepush-onlymodeoffers 100 GB/s throughputwhichallows to performburst/super-burstbuilding and to run selectionalgorithms on fullyassembled data. • withtheinputlinksloadedup to 70% of theirnominalcapacitythearchitecturecan handle 173 GB/s

Krzysztof Korcyl Institue of Nuclear Physics Polish Academy of Sciences &