Linux Servers with JASMine
K. Edwards, A. Kowalski, S. Philpott
HEPiX, May 21, 2003
JASMine
• JASMine: JLab's Mass Storage System (comparable to CASTOR, Enstore, …)
• Distributed Servers
  • Data Movers (tape and disk)
    • Two tape drives per Data Mover
    • 600+ GB of staging disk space (3 9940B tapes)
    • Need fast disk access to keep up with the 9940B tape drives and gigabit Ethernet
  • Cache Servers (disk)
    • 1-2 TB file servers
    • JASMine manages the files; copies from Data Movers via JASMine's jcp protocol
    • User access via NFS (read-only)
Latest Data Mover
• Operating System
  • Red Hat 7.3, kernel 2.4.20-xfs
  • XFS file system
• Hardware
  • Dual 2.2 GHz Xeon CPUs
  • SuperMicro P4DPE motherboard
  • 2 GB RAM
  • 2 LSI Logic MegaRAID 320-2 RAID controllers
  • 14 Seagate 73 GB disk drives (hot swap)
  • QLogic 2342 dual-port fiber card ($$)
  • 2 9940B tape drives
  • Intel PRO/1000XT Server Ethernet card
  • 3U chassis with N+1 power supplies
• $14,200.00 US (without the 2 9940B tape drives)
Disk Performance Tests
• Used standard tests (disktest, Bonnie++, IOzone)
  • 4 GB file size used
  • Wanted to try the Fermi test (lacked the time)
• Parameters tested
  • Write-through vs. write-back cache policy
  • Optimum disk read/write block sizes (see the sketch after this slide)
  • RAID-5 vs. RAID-50 performance
    • RAID-5: one array done in hardware (1 RAID card)
    • RAID-50: 2 RAID-5 arrays done in hardware (1 per RAID card), striped together with a software RAID-0 array
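To make the block-size parameter concrete, here is a minimal Java write probe; it is a sketch, not one of the tools named above. The 4 GB file size mirrors the tests, while the staging path /stage/probe.dat and the particular list of block sizes are assumptions.

```java
// Hypothetical block-size probe; a rough stand-in for what disktest,
// Bonnie++, and IOzone measure far more rigorously.
import java.io.FileOutputStream;
import java.io.IOException;

public class BlockSizeProbe {
    public static void main(String[] args) throws IOException {
        long fileSize = 4L * 1024 * 1024 * 1024;            // 4 GB, as in the tests
        int[] blockSizes = {8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024};
        for (int bs : blockSizes) {
            byte[] block = new byte[bs];
            long start = System.currentTimeMillis();
            try (FileOutputStream out = new FileOutputStream("/stage/probe.dat")) {
                for (long written = 0; written < fileSize; written += bs) {
                    out.write(block);                       // sequential writes at one block size
                }
                out.getFD().sync();                         // force the data out to disk
            }
            double secs = (System.currentTimeMillis() - start) / 1000.0;
            System.out.printf("%6d-byte blocks: %.1f MB/sec%n",
                              bs, (fileSize / (1024.0 * 1024.0)) / secs);
        }
    }
}
```

Sweeping block sizes this way is how one arrives at a figure like the 32 KB optimum reported in the results.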
Issues/Problems Discovered
• LSI Logic MegaRAID 320-2 RAID controllers
  • Vendor support only if you use standard Red Hat kernels, which do not have XFS support
  • LSI Logic's RAID monitoring software causes SCSI bus resets
    • Occurs every 20 seconds (not changeable)
    • Throughput drops to 4-5 MB/sec while it resets the bus and flushes the cache
  • Workaround: turn off RAID monitoring
    • Without it there is no real way to monitor the status of the disks and RAID hardware
    • Disk failures go unnoticed
  • Looking into Adaptec 2200S RAID cards
Disk Test Results
• Use the write-back cache policy on the RAID card
• 32 KB block sizes are optimal
• RAID-50 was fastest (no real surprise)
• Idle system (1 reader or 1 writer)
  • 210 MB/sec read throughput
  • 140 MB/sec write throughput
• Busy system (8 readers and 8 writers)
  • 40 MB/sec aggregate read throughput
  • 110 MB/sec aggregate write throughput
Tape Performance Testing
• Used the JASMine test program (Java)
  • Double-buffered: threads simultaneously read from and write to the buffer (sketched below)
  • Calculates/verifies the file checksum
  • Moves files between disk and tape
• Used real raw data from the experiments
  • 2 GB files
  • Hall A and Hall C data in CODA format (does not compress)
  • CLAS data in BOS format (does compress)
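Below is a minimal sketch of a double-buffered, checksumming copy in the spirit of the test program. It is not the actual JASMine code; the two-slot buffer queue, the use of Adler-32 for the checksum, and the paths (/stage/raw.dat, /dev/nst0) are all assumptions.

```java
// Sketch of a double-buffered disk-to-tape copy with a running checksum.
// Illustrative only; the real JASMine test program surely differs.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.zip.Adler32;

public class DoubleBufferedCopy {
    private static final int BUF_SIZE = 1024 * 1024;          // 1 MB buffers (assumed)
    private static final byte[] EOF = new byte[0];            // end-of-stream marker

    public static void main(String[] args) throws Exception {
        // Two-slot queue: the reader fills one buffer while the writer drains the other.
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(2);
        Adler32 checksum = new Adler32();

        Thread reader = new Thread(() -> {
            try (FileInputStream in = new FileInputStream("/stage/raw.dat")) {
                byte[] buf = new byte[BUF_SIZE];
                int n;
                while ((n = in.read(buf)) > 0) {
                    byte[] chunk = new byte[n];
                    System.arraycopy(buf, 0, chunk, 0, n);
                    checksum.update(chunk, 0, n);             // checksum as we read
                    queue.put(chunk);
                }
                queue.put(EOF);
            } catch (Exception e) { throw new RuntimeException(e); }
        });

        Thread writer = new Thread(() -> {
            try (FileOutputStream out = new FileOutputStream("/dev/nst0")) {
                byte[] chunk;
                while ((chunk = queue.take()) != EOF) {
                    out.write(chunk);                         // runs concurrently with the reader
                }
            } catch (Exception e) { throw new RuntimeException(e); }
        });

        reader.start();
        writer.start();
        reader.join();
        writer.join();
        System.out.printf("checksum: %08x%n", checksum.getValue());
    }
}
```

The two-slot queue is what makes this double-buffered: disk reads and tape writes overlap instead of alternating, and the checksum work rides along on the reader thread, which is where the extra CPU cost noted in the results comes from.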
Tape Test Results
• No issues or problems
  • QLogic 2342 dual-port fiber card works well with Linux
  • Some extra CPU is required for the checksums; Hyper-Threading really helps performance here
• 9940B results as expected
  • Direction (read vs. write) does not matter
  • 30 MB/sec if the file is not compressible
  • Up to 45 MB/sec if the file is compressible, depending on how compressible it is
• Two simultaneous copies
  • 30 MB/sec each if the file is not compressible (no change)
  • Expected 37.5 MB/sec each for compressible files read from tape; observed 30 MB/sec each
Latest Cache Server
• Operating System
  • Red Hat 7.3, kernel 2.4.18-xfs
  • XFS file system
• Hardware
  • Dual 2.0 GHz Xeon CPUs
  • SuperMicro P4DPE motherboard
  • 2 GB RAM
  • 2 3ware 7850 IDE/ATA RAID controllers (RAID-5)
  • 16 hot-swap disk drives
    • Maxtor 160 GB ATA133
    • Western Digital 180 GB ATA100
  • Intel PRO/1000XT Server Ethernet card
  • 4U chassis with N+1 power supplies
• $9,000.00 US
Issues/Problems Discovered
• Western Digital 180 GB/200 GB ATA100 drives
  • Drives go offline/idle (a WD feature)
  • The 3ware card thinks the drive has died
• Solutions
  • Get disk firmware version 63.13F70 from Western Digital
  • Or use Maxtor 160 GB ATA133 drives
Experience with IDE/ATA Drives in General
• High failure rates during the first two months of use
  • 1-3 failures per week
  • A longer burn-in period is needed
• Failure rates decrease after two months of use
  • 1 failure every 6-8 weeks
  • Marginal drives gone?
• They still fail more often than SCSI disks
  • Then again, we lost 2 SCSI disks today
• Number of disks by type used in servers
  • 191 SCSI
  • 320 ATA