
Repack at CERN: Usage and Outlook. Tim Bell, Gordon Lee


Presentation Transcript


  1. Repack at CERN: Usage and Outlook. Tim Bell, Gordon Lee

  2. Agenda
     • Use Cases
     • Bulk Repack Outlook

  3. Use #1: Data Recovery
     • When a media error is reported by a tape, arrange the repack to recover as many files as possible onto fresh media.
     • Volumes of around 5 per week
     • Run by the tape operator with a small script to perform tape drive reservation and logging of the history
     • Reliability improved substantially with 2.1.6 and 2.1.7. More fixes in the pipeline for dual copy tapes along with enhancements for output tape pool management

  4. Use #2: Defragmentation
     • Recovery of space from tapes when users have deleted files from Castor. Files cannot be deleted from tape, so all remaining files are copied off and the tape is reclaimed
     • Looking to automate the detection and copying of badly fragmented tapes
     • Currently run manually, based on comparing the number of segments on the tape with the count of files in the name server
     • Around 20-50 tapes per week expected
     • Works well
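The manual check described on the Defragmentation slide (segments on tape versus files still known to the name server) could be automated roughly as follows. This is a minimal sketch, not the CERN script: the `TapeStats` record, the field names, and the 0.5 threshold are all hypothetical, and how the two counts would be obtained from Castor is left out entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TapeStats:
    """Hypothetical per-tape record; field names are illustrative only."""
    vid: str                  # tape volume identifier
    segments_on_tape: int     # segments physically written on the tape
    files_in_nameserver: int  # files the name server still maps to this tape


def live_fraction(tape: TapeStats) -> float:
    """Fraction of the tape's segments that still belong to live files."""
    if tape.segments_on_tape == 0:
        return 1.0  # empty tape: nothing to defragment
    return tape.files_in_nameserver / tape.segments_on_tape


def repack_candidates(tapes: List[TapeStats],
                      max_live_fraction: float = 0.5) -> List[TapeStats]:
    """Return tapes at or below the (assumed) threshold, most fragmented first."""
    picked = [t for t in tapes if live_fraction(t) <= max_live_fraction]
    return sorted(picked, key=live_fraction)


if __name__ == "__main__":
    sample = [
        TapeStats("T00001", 1000, 950),
        TapeStats("T00002", 1000, 200),
        TapeStats("T00004", 400, 200),
    ]
    for tape in repack_candidates(sample):
        print(tape.vid, round(live_fraction(tape), 2))
```

Counting segments and files is only a proxy for reclaimable space: a tape holding a few large live files and many small deleted ones would look worse by count than by bytes, so a production version would probably weight by size as well.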

  5. Use #3: Bulk Media Change
     • Higher density tapes are becoming available, which requires copying old tapes to new ones
     • Limited time and resources, since robot slots need to be freed up without using too many drives
     • Requires data rates of 600+ MBytes/s for CERN’s data volumes (20PB) to complete within a year
       • Between 16 drives (at 80MBytes/s) and 50 drives (at 25MBytes/s)
       • Equivalent to a large LHC experiment
     • Legacy of small files to preserve
       • Average file size is 142MB
       • 9 seconds overhead per file on T10K drives
       • 100 million files to repack
       • Up to 154,000 files per tape
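The figures on the Bulk Media Change slide can be checked with a short back-of-the-envelope calculation. The sketch below makes two assumptions that the slide does not state: that sizes are decimal (1 PB = 10^9 MB), and that the drive counts include both a reading and a writing drive for each stream, which is what makes the quoted 16 and 50 come out about right. The function names are illustrative.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600


def required_rate_mb_s(volume_pb: float, years: float = 1.0) -> float:
    """Sustained rate in MB/s needed to copy volume_pb petabytes in `years`.

    Assumes decimal units: 1 PB = 1e9 MB.
    """
    return volume_pb * 1e9 / (years * SECONDS_PER_YEAR)


def drives_needed(rate_mb_s: float, per_drive_mb_s: float,
                  read_and_write: bool = True) -> int:
    """Number of drives for the aggregate rate.

    Assumption: each stream occupies one source and one destination drive,
    so the count is doubled when read_and_write is True.
    """
    one_side = math.ceil(rate_mb_s / per_drive_mb_s)
    return 2 * one_side if read_and_write else one_side


def per_file_overhead_days(n_files: float, overhead_s: float = 9.0) -> float:
    """Drive time, in days, spent purely on the fixed per-file overhead."""
    return n_files * overhead_s / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = required_rate_mb_s(20)
    print(f"required rate: {rate:.0f} MB/s")
    print("drives at 80 MB/s:", drives_needed(rate, 80))
    print("drives at 25 MB/s:", drives_needed(rate, 25))
    print(f"overhead, 154,000-file tape: {per_file_overhead_days(154_000):.1f} days")
    print(f"overhead, 100 million files: {per_file_overhead_days(100e6) / 365:.1f} drive-years")
```

Under these assumptions, 20 PB in a year works out to about 634 MB/s, needing 16 drives at 80 MB/s and 52 at 25 MB/s, close to the slide's range of 16 to 50. The per-file overhead alone comes to roughly 16 days for a 154,000-file tape and about 28.5 drive-years across 100 million files, which is why the small-file legacy dominates the problem.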

  6. The scale of the problem

  7. Approach to get to 80MB/s
     • Scale repack2 to meet the performance targets
     • Tuning of read/write policies to improve disk server I/O rates
     • Recall order sorting for large files per tape
     • Investigate more direct copies such as tape-to-tape
     • Reduce stager overheads
     • Avoid tape server ethernet bottlenecks
     • Review performance for tape marks
     • Unlabelled tapes? Embedded labels?

  8. Operational issues
     • Development underway for
       • Mapping between the repack service class and the tape pool where the tape data should be sent
       • Enhanced error reporting on failures
       • Dual copy files
     • Tickets opened for
       • Repacking disabled tapes
       • Skipping over bad files without unmounting
     • Error analysis often requires a developer
       • Root causes of failed or blocked repacks are not easy to find for tape operations

  9. Summary
     • Castor 2.1.7 repack provides a solution for data recovery and defragmentation and is used in production at CERN
     • Bulk repacking performance is a major concern, which requires a solution before year-end 2008 when new drives are expected to be ready

  10. Backup Slides

  11. Tape-to-Tape repack?
      [Diagram: CERN repack, Stager, Disk Server]
      • Tape-to-tape copy rather than copying through the stager avoids network bottleneck
      • Initial tests indicate that the tape writing overheads are larger for our typical files

  12. Tests to scale repack2
      • 3 disk servers
        • 3 tape drives in
        • 3 tape drives out
        • File size of 2GB+
      • 6 disk servers
        • 3 tape drives in
        • 3 tape drives out
        • File size of 500MB+

  13. Tapes from the Dark Side
      • 9940s with 20000+ files
      • Use IBM NVC tapes for small file handling
      • Can take weeks to repack due to label overhead
