CERN
• What is CERN? A particle physics laboratory
• Europe, old and new (plus collaborations with the USA, Canada, Japan, India, Pakistan, Russia, China…)
• Planning for ~5 PB per year and 2-5 GB/s in 2007: a data storage problem!
[Aerial photo: ski slopes, LHC ring, Geneva Airport]
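A quick back-of-envelope sketch (an illustration, not from the slides) of what ~5 PB/year implies as a sustained average; the 2-5 GB/s quoted above is the peak recording rate, so recording is far burstier than the average suggests.

```python
# Sustained rate implied by ~5 PB/year, averaged over a full year.
PB = 10**15
seconds_per_year = 365 * 24 * 3600

sustained_mb_s = 5 * PB / seconds_per_year / 10**6
print(f"sustained average: {sustained_mb_s:.0f} MB/s")  # ~159 MB/s vs 2-5 GB/s peaks
```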
CERN: current tape situation
• Main drives: 9940B (50), already very busy
  • Peak test rate is ~1 GB/s
• Secondary drives: 9840A (20), for 'small' files
  • Do we modernise to 9840C?
• Main robotics: Powderhorn
  • Secondary: L700e (exotics: LTO1, SDLT…)
• Efficiency is low, especially for reads
• Lots of drives are needed now, so how many in 2007?
Media
• Main media: 9940 200GC, 13,600 cartridges
• Secondary: 9940 60GC, 8,800 cartridges
  • Being converted to 200GC, ~60% done
• 9840A:
  • 2,100 with user data (move to 200GC, then reuse?)
  • 2,800 with Legato data (legacy; reuse?)
  • 2,900 with ADSM data (legacy; reuse?)
• Oddities: SDLT, LTO, legacy 3590, DLT…
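A rough tally of the nominal capacity of the inventory above; the 20 GB figure for 9840A cartridges is an assumption (their nominal capacity), not stated on the slide.

```python
# Nominal capacity of the cartridge pools listed above.
GB = 10**9
pools = {
    "9940 200GC": (13_600, 200 * GB),
    "9940 60GC":  (8_800,   60 * GB),
    "9840A":      (2_100 + 2_800 + 2_900, 20 * GB),  # 20 GB/cart is assumed
}

total_bytes = sum(count * capacity for count, capacity in pools.values())
print(f"total nominal capacity: {total_bytes / 10**15:.2f} PB")  # ~3.40 PB
```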
Current plans
• Avoid purchasing 9840 or 9940 media
  • Re-use existing media as far as possible: OK in 2004, but 2005?
• Consolidate backups; some are aging out, but it is a lot of equipment! (virtualisation?)
  • 1 Timberwolf, 6 DLT7000 for AFS
  • 2 Powderhorns, 14 IBM 3590E for ADSM
  • 2 Powderhorns, 6 STK 9840 for Legato
  • 1 Powderhorn, 4 STK 9940B for TSM
• Decide on LHC system components in 2005
  • Call for Tender
  • Drives: STK, IBM 3592, LTO-2, other?
  • Libraries: STK Powderhorn/8500, IBM, ADIC Scalar 10000, other?
Minimal drives for LHC
• Use at peak throughput is assumed; realistically we need 3× this, or even more than 3×
• Powderhorns can cope as regards drive numbers (40/silo)… but speed?
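A sketch of the drive-count arithmetic, assuming a ~1 GB/s aggregate target (the rate quoted on slide 1) and ~30 MB/s native per 9940B-class drive; both inputs come from elsewhere in the deck, the calculation itself is illustrative.

```python
import math

# Minimal drives = aggregate target rate / per-drive native rate.
target_mb_s = 1_000   # ~1 GB/s aggregate, per slide 1
drive_mb_s = 30       # ~30 MB/s per 9940B-class drive

minimal = math.ceil(target_mb_s / drive_mb_s)  # the "peak throughput" case
realistic = 3 * minimal                        # the "need 3 x this" factor
print(minimal, realistic)                      # 34 drives minimal, ~100 realistically
```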
Minimal drives?
• Writes can be reasonably efficient, often >50% of the possible maximum
  • Many GB (10?) written in one mount
  • Drive definitely streams
  • ~60 s unit reserve/pick/load/position
  • 350 s writing, say, for a 9940B
  • ~60 s rewind/unload/place
  • We write ANSI standard tape files, minimum 3 s per file today…
• Reading in CASTOR is poor and depends on the files picked
  • 1 file, 1 GB: ~25% of the possible maximum, depends heavily on robot speed
  • ~60 s pick/load/position
  • 35 s reading, say, for a 9940B
  • ~60 s rewind/unload/place
• Some improvement in the next CASTOR version… (marshalling requests)
• But we READ more than we WRITE, except for data recording
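The efficiency figures above follow directly from the quoted per-step timings; a minimal sketch, assuming a ~30 MB/s transfer rate and ~120 s of total mount overhead per cycle.

```python
# Fraction of a mount cycle spent transferring data:
# ~60 s pick/load/position + transfer + ~60 s rewind/unload/place.
def mount_efficiency(data_gb, rate_mb_s=30.0, overhead_s=120.0):
    transfer_s = data_gb * 1000 / rate_mb_s
    return transfer_s / (transfer_s + overhead_s)

print(f"write, 10 GB per mount: {mount_efficiency(10):.0%}")  # ~74%, i.e. >50%
print(f"read,   1 GB per mount: {mount_efficiency(1):.0%}")   # ~22%, i.e. ~25%
```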
Minimal cartridge slots for LHC
• 2010-: 100K SAIT (or 78K '3592b'?) means 18 (14?) Powderhorns, so => a new building?
• 2010-: 100K SAIT is fine with 8500s in the existing zones, but that is not supported
• A '3592b' does not exist today. SAIT does exist: 500 GB, ~30 MB/s…
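The library counts above come from simple slot arithmetic; a sketch assuming ~5,500 usable cartridge slots per Powderhorn silo (an assumption, the slide does not give the slot count).

```python
# Powderhorns needed for the candidate cartridge counts above.
SLOTS_PER_POWDERHORN = 5_500  # assumed usable slots per silo

for label, cartridges in [("100K SAIT", 100_000), ("78K '3592b'", 78_000)]:
    silos = round(cartridges / SLOTS_PER_POWDERHORN)
    print(f"{label}: ~{silos} Powderhorns")  # ~18 and ~14, as quoted
```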
Costs for LHC, 2010
• Libraries: 20 8500 ~ 10 M$ (?), 33%
• Media: 100K SAIT ~ 10 M$ ('usually' $100/cartridge), 33%
• Drives: ~300 SAIT ~ 10 M$ ('usually' $30K/drive), 33%
  • Why so many? Because reading at CERN is inefficient but frequent…
• However, drives/media in the 8500 are not a 'monopoly' problem
• Today? Considering only the major use; drives are the important part…
  • Libraries: 6 Powderhorn ~ 1.5 M$, 28%
  • Media: 25K 9940 ~ 2.5 M$, 44%
  • Drives: 50 9940B ~ 1.5 M$, 28%
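A sanity check of the 'today' split using the slide's own figures; the percentages come out as 27/45/27 (28/44/28 above, modulo rounding).

```python
# Cost split today, from the figures quoted above (in M$).
costs_musd = {
    "libraries (6 Powderhorn)":  1.5,
    "media (25K 9940 @ ~$100)":  2.5,
    "drives (50 9940B @ ~$30K)": 1.5,
}

total = sum(costs_musd.values())  # 5.5 M$
for item, cost in costs_musd.items():
    print(f"{item}: {cost} M$ ({cost / total:.0%})")  # 27% / 45% / 27%
```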
Major operational interests
• Benefits of the 8500 are very clear
  • 99.9%-available machinery, easy upgrading…
  • Speed is very helpful for disorganised reading, which is common at CERN
  • The drive/media mix is very helpful (but might not be used…)
• Benefits of SAIT-like capacity are very clear
  • Higher capacity, no new building needed
  • Data recording looks 'easy' at ~40 drives for 1 GB/s
• A Linux driver from STK?
  • Hard to write your own and maintain it; hard to adapt it to a 'new drive' quickly
  • Might they eventually do this?
• Better (WWW-accessible) library/drive/media monitoring and logging features
  • Predictions of imminent failure, and timely requests for intervention
  • Access to MIR data for media monitoring, problem prediction, or otherwise?
  • Customers ask for it, and it would save STK time and money…