Explore the nature of data in high-energy physics experiments at Fermi National Accelerator Laboratory, focusing on detectors, raw data, and data life cycles. Learn about distributed computing, data storage, WAN architecture, and network issues.
Wide Area Networks in High Energy Physics • The nature of the problem • Addressing the needs • Fermi National Accelerator Laboratory • SC2006
The nature of HEP data • Accelerators • Produce beams of high energy particles which are utilized by… (pictured: CERN, Fermilab)
The nature of HEP data • Detectors • View the beam collisions at MHz rates, generate… (pictured: The D0 Experiment, The CDF Experiment, The CMS Experiment)
The nature of HEP data • Raw Data • Selected (by fast hardware signals) collision “events” • ~1M channels, digitized, zero-suppressed • Typical parameters • ~ MB per event • ~ 100 Hz recorded • ~ 100 MB/sec • Organized into files • Several GB in size • Statistically independent • Grouped by • Time • Beam / detector conditions • Physics process • With operational duty factors included, get: 1 – 5 PB/yr per detector
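The step from a recording rate to an annual volume can be sketched as a short calculation. This is only an illustration: the function name and the duty factors below are assumptions, not values from the slides. It uses the quoted ~100 MB/sec and decimal units (1 PB = 10^9 MB).

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s in a non-leap year


def annual_volume_pb(rate_mb_per_s: float, duty_factor: float) -> float:
    """Return recorded volume in PB/yr for a given rate and duty factor.

    duty_factor is the fraction of the year the detector is actually
    recording (hypothetical parameter, not taken from the slides).
    """
    if not 0.0 <= duty_factor <= 1.0:
        raise ValueError("duty_factor must be between 0 and 1")
    return rate_mb_per_s * SECONDS_PER_YEAR * duty_factor / 1e9


if __name__ == "__main__":
    # Illustrative duty factors only; the slides do not state one.
    for duty in (0.3, 0.5, 1.0):
        print(f"duty factor {duty:.1f}: {annual_volume_pb(100, duty):.2f} PB/yr")
```

At 100 MB/sec, uninterrupted recording gives about 3.2 PB/yr, so realistic duty factors land at the PB/yr scale quoted on the slide; the upper end of the 1 – 5 PB/yr range would need a recording rate above 100 MB/sec.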
Data life cycle • Reconstruction • 1 or more passes over entire data set • Summary & Selection • Multiple physics “streams” • Frequent access to data sets • Analysis • Multiple passes over reduced data sets • Results • Huge data sets, large processing requirements, too great for any single site
Distributed Computing • For example, CMS computing model (diagram: Tier-0, Tier-1, Tier-2)
Data Storage at Fermilab • Storage resources • 8 automated tape libraries • >4 PB of existing custodial data (300 TB in Oct06!) • >500 TB distributed disk cache
Data Storage at Fermilab • It’s active data! • Peaks of > 35 TB to & from tape/day • CMS read 250 TB from disk in one day • It has to get off the Fermilab site to other locations… (plot scale marks: 25 TB/day, 50 TB/day)
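To relate these daily volumes to network capacity, it can help to express them as average bit rates. The sketch below is an illustration with an assumed helper name; it uses decimal units (1 TB = 10^12 bytes) and spreads each volume evenly over 24 hours, which understates short-term peaks.

```python
SECONDS_PER_DAY = 24 * 3600


def tb_per_day_to_gbps(tb_per_day: float) -> float:
    """Average rate in Gb/s needed to move tb_per_day (decimal TB) in 24 h."""
    bits = tb_per_day * 1e12 * 8  # bytes -> bits
    return bits / SECONDS_PER_DAY / 1e9


if __name__ == "__main__":
    # Daily volumes that appear on the slide.
    for volume in (25, 35, 50, 250):
        print(f"{volume:>3} TB/day ~ {tb_per_day_to_gbps(volume):.1f} Gb/s average")
```

On this basis, 50 TB/day already corresponds to a sustained average of several Gb/s.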
WAN architecture • Separate production and high-impact paths • “Circuits” to remote sites • Multiple technical implementations • e.g. Policy routing • Increases complexity
WAN utilization at Fermilab • PB/mo = <3 Gb/s> • A large amount of the data transfer is to/from external sites (plot legend: High impact path, Production path)
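The relation between an average rate and a monthly volume can be checked with a short conversion. This is a sketch under stated assumptions: a 30-day month, decimal units (1 PB = 10^15 bytes), and a hypothetical helper name.

```python
def gbps_to_pb_per_month(gbps: float, days: int = 30) -> float:
    """Volume in PB moved in `days` days at a sustained average of `gbps`."""
    bytes_per_second = gbps * 1e9 / 8  # bits -> bytes
    return bytes_per_second * days * 24 * 3600 / 1e15


if __name__ == "__main__":
    print(f"3 Gb/s average ~ {gbps_to_pb_per_month(3):.2f} PB per 30-day month")
```

A sustained average of 3 Gb/s therefore moves a little under 1 PB in a 30-day month.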
WAN transfers, CERN Tier-0 to Tier-1s • Fermilab component is ~150 MB/s = 1.2 Gb/s
WAN transfers, FNAL Tier-1 to Tier-2s • Many Tier-2s, aggregate up to 300 MB/s = 2.4 Gb/s
Network Issues • Remote sites have different network providers
Network Issues • Transatlantic traffic
Network Issues • Path between sites is complex, e.g. IN2P3 to Fermi • Configuration, debugging, and monitoring are challenging
Network Issues • Redundancy and recovery • Example: Loss of high-impact transatlantic link
Summary • Experiments producing multi-PB/yr data sets • Data storage and processing distributed world-wide • Multi-Gb networks provisioned