RCF Status
• Events related to Mass Storage System (HPSS) and Tape Libraries
  • Broken “hand” on a robot causing problems w/ access to a fraction of the cartridges
  • PHENIX HPSS Data Mover problem
    • Started on Monday w/ loss of a controller + 1 LUN
      • All remaining LUNs failed over to the second controller
    • Continued on Thursday when the failed controller went offline again
      • Multiple disk failures, LUNs degraded but not lost
    • On Friday, though the failed controller was offline, it interfered with production
      • All LUNs on the system went offline
      • PHENIX lost ~700 files on the mover node but (according to Martin P.) all of them were still on the Buffer at the CH
    • Temporarily fixed the problem on Friday by replacing the failed PHENIX disk with ATLAS disk (including the controllers)
      • Load (at that time and expected over the weekend) on the ATLAS system was only light
    • Replacement Controller arrived on Monday
    • System is being tested to verify the problem was in the failed controller
Raw Data Volume collected & archived since 11/26/07 (plot axis runs from 11/26 to 01/14)
• PHENIX Raw Data: 470 TB
• STAR Raw Data: 125 TB
Results from Data Taking
• STAR: 500 Megabits/second (STAR / RCF Network Links, 2 * 1 Gbps max.)
• PHENIX: 3 Gigabits/second (PHENIX / RCF Network Link, 10 Gbps max.)
• Data Migration to Tape: 1500 GB/hour
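
The rates and volumes quoted above use mixed units (Megabits/second, Gigabits/second, GB/hour, TB), so a rough conversion to common units may help when comparing them. The sketch below is illustrative only and is not part of the original slides. It assumes decimal prefixes (1 GB = 10^9 bytes) and assumes that the 01/14 axis label refers to 14 January 2008. The function names are hypothetical.

```python
from datetime import date


def gb_per_hour_to_gbps(gb_per_hour: float) -> float:
    """Convert gigabytes per hour to gigabits per second (decimal prefixes assumed)."""
    return gb_per_hour * 8 / 3600


def average_tb_per_day(total_tb: float, days: int) -> float:
    """Average daily volume over a period of the given length in days."""
    return total_tb / days


# Assumption: the plot spans 26 Nov 2007 to 14 Jan 2008 (the year of 01/14 is inferred).
days = (date(2008, 1, 14) - date(2007, 11, 26)).days

tape_gbps = gb_per_hour_to_gbps(1500)        # tape migration rate from the slide
phenix_tb_day = average_tb_per_day(470, days)  # PHENIX raw data volume from the slide
star_tb_day = average_tb_per_day(125, days)    # STAR raw data volume from the slide

print(f"Period length: {days} days")
print(f"Tape migration: {tape_gbps:.2f} Gbps")
print(f"PHENIX average: {phenix_tb_day:.1f} TB/day")
print(f"STAR average: {star_tb_day:.1f} TB/day")
```

Under these assumptions, 1500 GB/hour corresponds to roughly 3.3 Gbps, which is the same order as the 3 Gigabits/second figure quoted for the network. The daily averages are means over the whole period and say nothing about peak rates.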