
Repack 2

Repack 2. Felix Ehm, CERN / IT. Content: What is Repack? The old Repack. Design and features of the new Repack 2. Current scenario at CERN. Tests and performance.


Presentation Transcript


  1. Repack 2 Felix Ehm CERN / IT

  2. Content
     • What is Repack?
     • The old Repack
     • Design and Features of the new Repack 2
     • Current Scenario at CERN
     • Tests and Performance

  3. What is Repack
     “Repack is a synonym for a copy process which moves all of a tape’s data onto temporary disk storage and then rewrites it to another tape.”
     • The reasons
       • Prevent data loss when a tape reaches its mechanical lifetime
         • Mounts
         • Read errors reported by RTCPD
       • Stack data on high-density tapes (reduce number of tapes)
         • e.g. 2 x STK9940B (200 GB) -> 1 IBM3592J1A (500 GB)
       • Move data to more durable media
       • Release tapes for reuse
       • Optimise tape space usage
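The stacking example on this slide is plain capacity arithmetic. A minimal sketch, assuming nominal capacities and completely full source tapes; the function name is illustrative and not part of Repack:

```python
import math

def target_tapes_needed(source_count, source_gb, target_gb):
    """Minimum number of target tapes able to hold the data of full source tapes."""
    total_gb = source_count * source_gb
    return math.ceil(total_gb / target_gb)

# Two full 200 GB 9940B tapes (400 GB) fit on a single 500 GB 3592 tape.
print(target_tapes_needed(2, 200, 500))
```

In practice source tapes are rarely full and may hold deleted files, so the real ratio can be better than this upper bound suggests.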

  4. What is Repack
     • Optimization of tape space usage
       • ‘Resurrection’ of space, which is not marked as valid in the NameServer
         • e.g. user deletes file(s)
     [Figure: NameServer table (deleted / valid data); Tape 1 with segments 1 2 3 4 5, some marked as invalid data; Repack; Tape 2 from BOT to EOT; labels “2”, “4” and “free space”]
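The space “resurrection” described on this slide can be sketched as a filter over the segments of the source tape. This is an illustration only: the `Segment` type, the segment sizes, and the choice of segments 2 and 4 as the deleted ones are assumptions made for the example, not details of the Repack implementation.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    file_id: int
    size_mb: int
    valid: bool  # False once the file is no longer valid in the name server

def repack(source_tape):
    """Return the segments to write to the target tape and the space reclaimed (MB)."""
    kept = [s for s in source_tape if s.valid]
    reclaimed = sum(s.size_mb for s in source_tape if not s.valid)
    return kept, reclaimed

# Hypothetical Tape 1: five segments of 100 MB, two of them deleted by the user.
tape1 = [
    Segment(1, 100, True),
    Segment(2, 100, False),
    Segment(3, 100, True),
    Segment(4, 100, False),
    Segment(5, 100, True),
]
tape2, freed_mb = repack(tape1)
print([s.file_id for s in tape2], freed_mb)
```

Only the valid segments are carried over, so the space held by deleted files becomes free again once the source tape is released.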

  5. The old Repack
     • Mismatch with the new CASTOR 2 architecture
       • Only available for CASTOR 1
     • Files were staged and written in sets of segments (depending on stage_util_max_stcp_per_request)
       • Up to several mounts for migration per repacked tape
       • Tapes were explicitly mounted for writing files
     • No defragmentation of files (around 22,000 !)
     • CASTOR 1 migration policy
       • > 1 mount for one file
     • No maintenance
     • Stateful; the process takes up to 6-8 hours for 200 GB (9940B tape)

  6. Design and Features of the new Repack 2
     • Design I
       • Client-Server Architecture
       • Using existing functionality from Castor II
         • Common recall / migration procedure
         • Easy to maintain (you know CASTOR2, you know Repack)
         • Parallel writing / reading from tape
         • Uses Stager API for interface
       • Stateless components
         • Process state is kept in DB (Oracle/MySQL)
       • Components are multi-threaded
       • Multi-Stager ability
         • Independent from Stager machine
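The idea of stateless components with the process state kept in a database can be illustrated as follows. This is a hedged sketch, not the Repack schema: the table `repack_request`, the function `advance`, and the ordering of the statuses are assumptions, and `sqlite3` stands in for the Oracle/MySQL back ends named on the slide.

```python
import sqlite3

# Hypothetical status ordering; the names appear in the deck's monitoring
# listing, but the transition order is an assumption.
NEXT_STATUS = {"START": "STAGING", "STAGING": "MIGRATING"}

def advance(conn, vid):
    """Move one tape's request to its next status using only what is in the DB."""
    (status,) = conn.execute(
        "SELECT status FROM repack_request WHERE vid = ?", (vid,)
    ).fetchone()
    new_status = NEXT_STATUS.get(status, status)  # last known status stays put
    conn.execute(
        "UPDATE repack_request SET status = ? WHERE vid = ?", (new_status, vid)
    )
    conn.commit()
    return new_status

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repack_request (vid TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO repack_request VALUES ('P01538', 'START')")
print(advance(conn, "P01538"))
print(advance(conn, "P01538"))
```

Because the worker holds no state of its own, any thread or restarted process can pick up a request where the previous one left off.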

  7. Design and Features of the new Repack 2
     • Design II
     [Architecture diagram. Components: RepackClient, RepackServer, VMGR, Stager, Recaller, Migrator, Tape Server, Diskserver, NameServer. Labelled interactions: Send Request; validates tape; repack files; triggers; Get files from tape; Data transfer; Update location of file]

  8. Design and Features of the new Repack 2
     • Features I
       • Adapted to CASTOR 2
         • No direct intervention in mount/umount of tapes
         • Recall / migration of files is up to Stager
       • Defragmentation of files
       • Usage of DLF for logging
       • Repack has no limit on the number of tapes it repacks
       • Monitored repack process

     vid     cuuid                                  total  staging  migrating  status
     ------  -------------------------------------  -----  -------  ---------  ---------
     L30069  44e08287-0000-1000-b044-8861613b0000    1395     1395          0  STAGING
     L30214  44e08291-0000-1000-9bce-dbb2d4100000       0       30       1568  MIGRATING
     P01538  44fc5286-0000-1000-88f8-cde07a2800000   2120        0          0  START
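For scripted monitoring, the rows of the listing on this slide can be parsed into records. A minimal sketch, assuming the whitespace-separated six-column layout shown on the slide; the `RepackRow` type and `parse_row` helper are illustrative and not part of the Repack client:

```python
from typing import NamedTuple

class RepackRow(NamedTuple):
    vid: str
    cuuid: str
    total: int
    staging: int
    migrating: int
    status: str

def parse_row(line):
    """Split one listing row into typed fields (column order as on the slide)."""
    vid, cuuid, total, staging, migrating, status = line.split()
    return RepackRow(vid, cuuid, int(total), int(staging), int(migrating), status)

listing = """\
L30069 44e08287-0000-1000-b044-8861613b0000 1395 1395 0 STAGING
L30214 44e08291-0000-1000-9bce-dbb2d4100000 0 30 1568 MIGRATING
P01538 44fc5286-0000-1000-88f8-cde07a2800000 2120 0 0 START
"""
rows = [parse_row(line) for line in listing.splitlines()]
for r in rows:
    # Files currently in flight for this tape, either staging or migrating.
    print(r.vid, r.status, r.staging + r.migrating)
```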

  9. Design and Features of the new Repack 2
     • Features II
       • Repack targets are ServiceClasses
         • Usage of existing assigned recall/migration policies
         • Files from source tape are repacked to tapes in the assigned TapePool of the specified ServiceClass
     Example command: repack -V 1:2:3 -o Atlas -S AtlasStager
     [Figure labels: AtlasRepack, Stager, Repack, AtlasServiceClass, tapes 1 2 3, atlasPoolBTapePool, Data transfer]

  10. Design and Features of the new Repack 2
     • Features II
       • Multi Stager ability
         • Can be used for load balancing of repack processes
       • Backup of old location of file is stored in Repack DB
         • Possibility of recovering old tapes
       • Runs unattended
         • “Fire and forget”
     [Figure labels: RepackServer, C2Test, C2Public, C2Alice]

  11. Current Scenario at CERN
     • Around 22,000 tapes to repack
     • STK 9940B tapes have to be replaced by IBM and StorageTek solutions (IBM 3592, STK T10000)
     • Estimated time to repack tapes: 160 days for 5 PB of data
     • 20 x 9940 drives dedicated to the process
     • Shared instance for Repack with users (C2Public)
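The figures on this slide imply a sustained data rate, which can be worked out directly. A rough sketch, assuming decimal units (1 PB = 10^15 bytes), uninterrupted operation over the full 160 days, and an even spread of the load across the 20 dedicated drives:

```python
PB = 10**15
SECONDS_PER_DAY = 86_400

def required_rate_mb_s(total_bytes, days, drives=1):
    """Average rate in MB/s needed to move total_bytes within the given days."""
    return total_bytes / (days * SECONDS_PER_DAY) / drives / 10**6

aggregate = required_rate_mb_s(5 * PB, 160)
per_drive = required_rate_mb_s(5 * PB, 160, drives=20)
print(f"{aggregate:.0f} MB/s aggregate, {per_drive:.1f} MB/s per drive")
# prints "362 MB/s aggregate, 18.1 MB/s per drive"
```

These are averages only; mount times, retries and shared use of the instance mean the rate needed while data is actually flowing is higher.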

  12. Test and Performance
     • Successful test with different kinds of scenarios
       • Different tape types
       • Configuring target ServiceClasses
       • One tape / many tapes in target TapePool
     • Deployment next week
     • Performance depends on Diskservers
       • Bottleneck for data throughput
         • Diskserver deals with too many copy processes: timeout
         • Connection is dropped from the tapeserver
         • Another mount is needed

  13. Questions ? Felix.Ehm@cern.ch
