The GSI Mass Storage for Experiment Data

DVEE-Palaver, GSI Darmstadt, Feb. 15, 2005. Horst Göringer, GSI Darmstadt, H.Goeringer@gsi.de. Overview: different views; current status; last enhancements (write cache, on-line connection to DAQ); future plans; conclusions.


Presentation Transcript

  1. The GSI Mass Storage for Experiment Data. DVEE-Palaver, GSI Darmstadt, Feb. 15, 2005. Horst Göringer, GSI Darmstadt, H.Goeringer@gsi.de

  2. Overview • different views • current status • last enhancements: - write cache - on-line connection to DAQ • future plans • conclusions GSI DVEE Palaver 15.2.2005

  3. GSI Mass Storage System: gstore (Gsi mass STORagE system)

  4. gstore: storage view

  5. gstore: hardware view. 3 automatic tape libraries (ATL): (1) IBM 3494 (AIX): 8 tape drives IBM 3590 (14 MByte/s); ca. 2300 volumes (47 TByte, 13 TByte backup); 1 data mover (adsmsv1); access via adsmcli, RFIO read; read cache: 1.1 TByte (StagePool, RetrievePool)

  6. gstore: hardware view. (2) StorageTek L700 (Windows 2000): 8 tape drives LTO2 ULTRIUM (35 MByte/s); ca. 170 volumes (32 TByte); 8 data movers (gsidmxx), connected via SAN; access via tsmcli, RFIO; read cache: 2.5 TByte (StagePool, RetrievePool); write cache: ArchivePool 0.28 TByte, DAQPool 0.28 TByte

  7. gstore: hardware view. (3) StorageTek L700 (Windows 2000): 4 tape drives LTO1 ULTRIUM (15 MByte/s); ca. 80 volumes (10 TByte): backup copy of 'irrecoverable' archives ...raw; mainly for backup of user data (~30 TByte)

  8. gstore: software view. 2 major components: • TSM (Tivoli Storage Manager): commercial; handles tape drives and robots; data base • GSI software (~80,000 lines of code): C, sockets, threads - interface to user (tsmcli / adsmcli, RFIO) - interface to TSM (TSM API client) - cache administration

  9. gstore user view: tsmcli. tsmcli subcommands (arguments in order): archive file* archive path; retrieve file* archive path; query file* archive path*; stage file* archive path; delete file archive path; ws_query file* archive path; pool_query pool*. *: any combination of wildcard characters (*, ?) allowed. Soon: file may contain a list of files (with wildcard chars).
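
The wildcard rule on this slide (any combination of `*` and `?`) can be illustrated with Python's standard `fnmatch` module. This is only a sketch of shell-style matching, not of tsmcli's own matcher, whose exact semantics the slides do not describe; the file names and the helper `match_files` are hypothetical. Note that `fnmatch` also understands `[seq]` character classes, which the slide does not mention.

```python
from fnmatch import fnmatchcase


def match_files(pattern, names):
    """Return the names matching a shell-style pattern, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical archive listing, for illustration only.
archive_listing = ["run0012.lmd", "run0013.lmd", "run0123.lmd", "calib01.dat"]

# '?' matches exactly one character, '*' matches any run of characters.
print(match_files("run001?.lmd", archive_listing))  # ['run0012.lmd', 'run0013.lmd']
print(match_files("run*.lmd", archive_listing))     # all three run files
```

A single wildcard request of this kind is what lets one command address many files at once, which is the behaviour the 'good' user on the batch farm slide relies on.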

  10. gstore user view: RFIO. rfio_[f]open, rfio_[f]read, rfio_[f]write, rfio_[f]close, rfio_[f]stat, rfio_lseek. GSI extensions (for on-line DAQ connection): rfio_[f]endfile, rfio_[f]newfile

  11. gstore server view: query

  12. gstore server view: archive to cache

  13. gstore server view: archive from cache

  14. gstore server view: retrieve from tape

  15. server view: retrieve from write cache

  16. gstore: overall server view

  17. server view: gstore design concepts • strict separation of control and data flow • no bottleneck for data • scalable in capacity (tape and disk) and I/O bandwidth • hardware independent (as long as supported by TSM) • platform independent • unique name space

  18. server view: cache administration • multithreaded servers for read and write cache • each with its own metadata DB • main tasks: - lock/unlock files - select data movers and file systems - collect current information on disk space (soon: data mover and disk load -> load balancing) - trigger asynchronous archiving - disk cleaning • several disk pools with different attributes: StagePool, RetrievePool, ArchivePool, DAQPool, ...

  19. usage profile: batch farm. batch farm: ~120 dual-processor nodes => highly parallel mass storage access (read and write) • read requests: 'good' user: stages all files before use (wildcard chars); 'bad' user: reads lots of single files from tape; 'bad' system: stage disk/DM crashes during analysis • write requests: via write cache; distribute as uniformly as possible

  20. usage profile: experiment DAQ • several continuous data streams from DAQ • keep same DM during lifetime of data stream • only via RFIO • GSI extensions necessary: rfio_[f]endfile, rfio_[f]newfile • disks are emptied faster than they are filled: network -> disk: ~10 MByte/s, disk -> tape: ~30 MByte/s => time to stage for on-line analysis • enough disk buffer necessary in case of problems (robot, TSM, ...)
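
The buffering argument on this slide can be made concrete with a little arithmetic. The sketch below combines the rates quoted here (~10 MByte/s into disk, ~30 MByte/s to tape) with the DAQPool size from the hardware view (0.28 TByte). It assumes decimal units, constant rates, and that the whole pool is available to the streams considered; the function name `fill_time_hours` is hypothetical.

```python
MBYTE = 10**6   # assumption: decimal units
TBYTE = 10**12


def fill_time_hours(buffer_bytes, inflow, outflow=0.0):
    """Hours until a disk buffer is full at constant rates (bytes/s).

    Returns infinity when the buffer drains at least as fast as it fills.
    """
    net = inflow - outflow
    if net <= 0:
        return float("inf")
    return buffer_bytes / net / 3600.0


daq_pool = 0.28 * TBYTE   # DAQPool size from the hardware view
stream = 10 * MBYTE       # network -> disk, one data stream
tape = 30 * MBYTE         # disk -> tape

# Normal operation: tape drains faster than one stream fills.
print(fill_time_hours(daq_pool, stream, tape))           # inf
# Robot or TSM outage (no draining): one stream, then three streams.
print(round(fill_time_hours(daq_pool, stream), 1))       # 7.8
print(round(fill_time_hours(daq_pool, 3 * stream), 1))   # 2.6
```

Under these assumptions a single stream could ride out an outage of several hours, while a few parallel streams shorten that margin proportionally, which is why the slide asks for enough disk buffer.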

  21. current plans: new hardware. more and safer disks: • write cache: all RAID, 4 TByte (ArchivePool, DAQPool) • read cache: +7.5 TByte new RAID (StagePool, RetrievePool, new pools, e.g. with longer file lifetime) • 5 new data movers: new fail-safe entry server • hosts query server, cache administration servers -> query performance! • take-over in case of host failure • metadata DBs mirrored on 2nd host

  22. current plans: merge tsmcli / adsmcli. new command gstore: • replaces tsmcli and adsmcli • unique name space (already available) • users need not care in which robot the data reside • new archive: policy computing center

  23. brief excursion: future of IBM 3494? • still heavily used • rather full • hardware highly reliable • should be decided this year!

  24. usage IBM 3494 (AIX)

  25. brief excursion: future of IBM 3494? 2 extreme options (and more in between): • no further investment: use as long as possible; in a few years, move data to another robot • upgrade tape drives and connect to SAN: 3590 (~30 GB, 14 MB/s) -> 3592 (300 GB, 40 MB/s); new media => 700 TByte capacity; access with available data movers via SAN; new fail-safe TSM server (Linux?)
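
The 700 TByte figure can be checked against the numbers given elsewhere in the talk. The sketch below assumes that the library keeps its ca. 2300 volumes (from the hardware view), that cartridge capacities are the nominal values on this slide, and that units are decimal; the helper `capacity_tbyte` is hypothetical.

```python
def capacity_tbyte(volumes, gbyte_per_volume):
    """Nominal library capacity in TByte (decimal units assumed)."""
    return volumes * gbyte_per_volume / 1000


volumes = 2300            # ca. 2300 volumes in the IBM 3494
used_tbyte = 47 + 13      # data plus backup, from the hardware view

current = capacity_tbyte(volumes, 30)     # 3590 media, ~30 GB each
upgraded = capacity_tbyte(volumes, 300)   # 3592 media, 300 GB each

print(current)                          # 69.0
print(upgraded)                         # 690.0
print(round(used_tbyte / current, 2))   # 0.87
```

With these assumptions the upgrade gives roughly 690 TByte, consistent with the rounded 700 TByte on the slide, and the present fill level of about 87 % matches the "rather full" remark.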

  26. current plans: load balancing • acquire current information on the number of read/write processes for each disk, data mover, pool • new write request: select resource with lowest load • new read request: avoid 'hot spots' -> create additional instances of stage file • new option '-randomize' for stage/retrieve: distribute equally to different data movers / disks; split into n (parallel) jobs
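
The two selection rules on this slide can be sketched as follows. This is a minimal illustration, not the gstore implementation: the slides do not say how load is measured or how '-randomize' partitions a request, so the load metric (number of active processes), the round-robin assignment, and all names (`Resource`, `assign_write`, `randomize_split`, `dm1` ...) are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Resource:
    """A disk, data mover or pool with its number of active processes."""
    name: str
    active: int = 0


def assign_write(resources):
    """Pick the resource with the lowest load and account for the new job."""
    target = min(resources, key=lambda r: r.active)
    target.active += 1
    return target


def randomize_split(files, movers, n_jobs, seed=None):
    """Shuffle files, cut them into n_jobs nearly equal parts and
    assign the parts to the data movers in round-robin order."""
    rng = random.Random(seed)
    shuffled = list(files)
    rng.shuffle(shuffled)
    jobs = [shuffled[i::n_jobs] for i in range(n_jobs)]
    return [(movers[i % len(movers)], job) for i, job in enumerate(jobs)]


movers = [Resource("dm1", 2), Resource("dm2", 0), Resource("dm3", 1)]
print(assign_write(movers).name)  # dm2

plan = randomize_split([f"file{i}" for i in range(10)],
                       ["dm1", "dm2", "dm3"], n_jobs=4, seed=1)
for mover, job in plan:
    print(mover, len(job))
```

Shuffling before splitting is one simple way to keep files that sit close together on tape or disk from all landing in the same job; a real scheduler would probably also weigh tape locality, which this sketch ignores.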

  27. current plans: new org. of DMs • Linux platform: more familiar environment (shell scripts, Unix commands, ...); case-sensitive file names; current mainstream OS for experiment data processing • '2nd level' data movers: no SAN connection; disks filled via ('1st level') DMs with SAN connection; for stage pools with guaranteed lifetime of files

  28. current plans: new org. of DMs • integration of selected group file servers as '2nd level' data movers: disk space (logically) reserved for owners; pool policy according to owners. many advantages: no NFS => much faster I/O; files physically distributed over several servers; load balancing of gstore; disk cleaning. disadvantages: only for exp. data; access via gstore interface

  29. current plans: user interface • a large number of user requests: - longer file names - option to rename files - more specific return codes - ... • program code consolidation • improved error recovery after HW failures • support for successor of AliEn • GRID support: - gstore as Storage Element (SE) - Storage Resource Manager (SRM) -> new functionalities, e.g. reserve resources

  30. Conclusions • GSI concept for mass storage successfully verified • hardware and platform independent • scalable in capacity and bandwidth to keep up with - requirements of future batch farm(s) - data rates of future experiments • gstore able to manage very different usage profiles • but still a lot of work ... to fully realize all discussed plans
