
High Performance Storage System (HPSS), Jason Hick, Mass Storage Group, jhick@lbl.gov, HEPiX

This presentation covers how HPSS works, its current features, and its future directions. HPSS is a hierarchical storage management system that stripes and migrates data across disk and tape. The talk also covers the storage challenges of extreme-scale computing and the features planned for HPSS version 8.1 and later.


Presentation Transcript


  1. High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl.gov HEPiX October 26-30, 2009

  2. Agenda
     • How HPSS Works
     • Current Features
     • Future Directions (to Extreme Scale)

  3. HPSS as a Hierarchical Storage Management System
     • Top of pyramid is the Class of Service (COS)
     • Pyramid is a single hierarchy; we have many of these
     • Each level is a storage class; each storage class can be striped (disk & tape) and produce multiple copies (tape only)
     • Migration copies files to lower levels
     • Files can exist at all levels within a hierarchy
     • Continually replacing all hardware within a level for technology refresh
     [Figure: storage pyramid with axes Latency and Capacity; levels from top to bottom: Fast Disk, High Capacity Disk, Local Disk or Tape, Remote Disk or Tape]
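The hierarchy described on this slide can be pictured as a small data model. The following is a minimal sketch, not HPSS code: the class names, fields, and the `large-files` COS label are hypothetical, and migration is modelled only as "copy one level down, keep the upper copy", as the bullets state.

```python
from dataclasses import dataclass, field


@dataclass
class StorageClass:
    """One level of the pyramid (all names and fields are illustrative)."""
    name: str
    media: str               # "disk" or "tape"
    stripe_width: int = 1    # disk and tape levels can both be striped
    copies: int = 1          # multiple copies are a tape-only option
    files: set = field(default_factory=set)

    def __post_init__(self):
        if self.copies > 1 and self.media != "tape":
            raise ValueError("multiple copies are only allowed on tape levels")


@dataclass
class Hierarchy:
    """An ordered list of storage classes, fastest level first."""
    levels: list

    def write(self, filename):
        # New files land on the top (fastest) level.
        self.levels[0].files.add(filename)

    def migrate(self, filename, from_level):
        # Migration copies the file one level down; the upper copy stays.
        if filename not in self.levels[from_level].files:
            raise KeyError(filename)
        self.levels[from_level + 1].files.add(filename)

    def locations(self, filename):
        return [lvl.name for lvl in self.levels if filename in lvl.files]


# A Class of Service selects one hierarchy; a site has many of these.
cos = {
    "large-files": Hierarchy([
        StorageClass("Fast Disk", "disk", stripe_width=4),
        StorageClass("Local Tape", "tape", stripe_width=2, copies=2),
    ])
}

hierarchy = cos["large-files"]
hierarchy.write("run42.dat")
hierarchy.migrate("run42.dat", from_level=0)
print(hierarchy.locations("run42.dat"))
```

After the migration the file is present on both levels, which mirrors the point that files can exist at all levels within a hierarchy.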

  4. An HPSS Transfer
     1. Client issues READ to Core Server
     2. Core Server accesses metadata on disk
     3. Core Server commands Mover to stage file from tape to disk
     4. Mover stages file from tape to disk
     5. Core Server sends lock and ticket back to client
     6. Mover reads data and sends to client over LAN
     [Figure: components shown are Client Clusters, HPSS Movers, HPSS Core Server, Metadata, Data Disks, LAN Switch, Tape]
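The six steps of the read path can be traced with a toy simulation. This is a sketch under stated assumptions: `CoreServer`, `Mover`, and `client_read` are hypothetical names rather than HPSS APIs, storage is modelled as in-memory dictionaries, and skipping the stage when the file is already on disk is an assumption the slide does not state.

```python
class Mover:
    """Hypothetical mover holding a tape store and a disk cache."""

    def __init__(self, tape, disk):
        self.tape = tape
        self.disk = disk

    def stage(self, name):
        # Step 4: copy the file from tape to disk.
        self.disk[name] = self.tape[name]

    def send(self, name):
        # Step 6: read from disk and hand the data to the client.
        return self.disk[name]


class CoreServer:
    """Hypothetical core server that owns metadata and directs the mover."""

    def __init__(self, metadata, mover):
        self.metadata = metadata   # filename -> "tape" or "disk"
        self.mover = mover

    def read(self, name, log):
        log.append("1. Client issues READ to Core Server")
        location = self.metadata[name]
        log.append("2. Core Server accesses metadata on disk")
        if location == "tape":     # assumption: staging is skipped otherwise
            log.append("3. Core Server commands Mover to stage file")
            self.mover.stage(name)
            log.append("4. Mover stages file from tape to disk")
            self.metadata[name] = "disk"
        log.append("5. Core Server sends lock and ticket back to client")
        return {"file": name, "lock": True, "mover": self.mover}


def client_read(core, name):
    log = []
    ticket = core.read(name, log)
    data = ticket["mover"].send(ticket["file"])
    log.append("6. Mover reads data and sends to client over LAN")
    return data, log


mover = Mover(tape={"a.dat": b"payload"}, disk={})
core = CoreServer(metadata={"a.dat": "tape"}, mover=mover)
data, log = client_read(core, "a.dat")
print("\n".join(log))
```

The point of the split is visible in the code: the core server only handles metadata and control, while the data itself flows from the mover to the client.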

  5. HPSS Current Features (v7)
     • Single client transfer optimizations
         • Globus gridFTP service
         • Striping a single file across disk or tape drives
         • Aggregation capable clients (HTAR, PSI)
     • Manage tens of PBs effectively
         • Dual copy on tape, delayed or real-time
         • Technology insertion
         • Recover data from another copy
         • Aggregation on migration to tape
     • Data Management Possibilities
         • User-defined attributes on files
     • File System Interfaces
         • GPFS/HPSS Integration – IBM
         • Lustre/HPSS Integration – CEA/Sun-CFS
         • Virtual File System interface

  6. HPSS Feature – gridFTP Transfers
     • Data Transfer Working Group
         • Data transfer nodes at ORNL-LCF, ANL-LCF, and LBNL-NERSC with ESnet
         • Optimize WAN transfers between global file systems and archives at the sites
     • Dedicated WAN nodes are helping users
         • Several 20 TB days between HPSS and DTN global file system
         • Several large data set/project movements between sites
     • Have plans for
         • SRM: BeStMan to aid in scheduling and persistent transfers between sites
         • Increasing network (ESnet) and transfer nodes as usage increases
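To put the "20 TB days" figure in perspective, it helps to convert it to a sustained rate. The short calculation below is a worked example only; it assumes decimal units (1 TB = 10^12 bytes) and a transfer spread evenly over 24 hours, neither of which the slide specifies.

```python
def sustained_rate(terabytes_per_day):
    """Return (MB/s, Gbit/s) for a daily volume, using decimal units."""
    bytes_per_second = terabytes_per_day * 10**12 / 86_400  # seconds per day
    return bytes_per_second / 10**6, bytes_per_second * 8 / 10**9


mb_per_s, gbit_per_s = sustained_rate(20)
print(f"{mb_per_s:.0f} MB/s, {gbit_per_s:.2f} Gbit/s")
```

Under those assumptions a 20 TB day corresponds to roughly 231 MB/s, or about 1.85 Gbit/s, held for the whole day.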

  7. HPSS Feature – Striping transfers across disk/tape
     • Client network BW is the bottleneck
     [Figure: components shown are I/O Node, Client Clusters, HPSS Movers, HPSS Core Server, Metadata, Data Disks, LAN Switch, Tape]
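Striping a single file across several devices can be illustrated with a round-robin split. This is a generic sketch of the technique, not the HPSS on-media layout: the function names, the fixed stripe unit, and the in-memory lists standing in for disk or tape drives are all assumptions.

```python
def stripe(data: bytes, width: int, unit: int):
    """Deal `unit`-byte blocks of `data` round-robin across `width` devices."""
    devices = [[] for _ in range(width)]
    for offset in range(0, len(data), unit):
        block_index = offset // unit
        devices[block_index % width].append(data[offset:offset + unit])
    return devices


def unstripe(devices):
    """Reassemble the original byte stream from the striped blocks."""
    blocks = []
    longest = max(len(dev) for dev in devices)
    for row in range(longest):
        for dev in devices:
            if row < len(dev):      # the last row may be only partly filled
                blocks.append(dev[row])
    return b"".join(blocks)


payload = bytes(range(10))
striped = stripe(payload, width=3, unit=2)
for number, dev in enumerate(striped):
    print(number, dev)
assert unstripe(striped) == payload
```

Each device can be read or written in parallel, which is why the single client's network link, not the storage side, ends up as the limit in this configuration.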

  8. HPSS Feature – Multi-node transfers & striping in HPSS
     • Match client BW to HPSS mover BW
     [Figure: components shown are I/O Node, Client Clusters, HPSS Movers, HPSS Core Server, Metadata, Data Disks, LAN Switch, Tape]
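The idea of matching client bandwidth to mover bandwidth comes down to a simple bottleneck calculation. The sketch below assumes ideal linear scaling and ignores LAN and switch limits; the function names and the per-node and per-mover rates are made-up illustration values, not measurements from the talk.

```python
import math


def transfer_rate(client_nodes, client_bw, movers, mover_bw):
    """Aggregate rate is capped by the slower side (rates in MB/s)."""
    return min(client_nodes * client_bw, movers * mover_bw)


def nodes_needed(client_bw, movers, mover_bw):
    """Client nodes required so the client side no longer limits the transfer."""
    return math.ceil(movers * mover_bw / client_bw)


# Hypothetical figures: 100 MB/s per client node, 4 movers at 250 MB/s each.
single_node = transfer_rate(1, 100, 4, 250)
matched_nodes = nodes_needed(100, 4, 250)
matched = transfer_rate(matched_nodes, 100, 4, 250)
print(single_node, matched_nodes, matched)
```

With one client node the transfer is limited to that node's link; spreading the transfer over enough nodes moves the limit to the movers.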

  9. HPSS Feature – Virtual File System
     • HPSS accessed using standard UNIX/POSIX semantics
     • Run standard applications on HPSS such as IBM DB2, IBM TSM, NFSv4, and Samba
     • VFS available for Linux
     [Figure: layers on the Linux client: Unix/POSIX Application, POSIX File System Interface, HPSS VFS Extensions & Daemons, Data Buffer, HPSS Client API; Control, Data, and Optional SAN Data Path links to HPSS Data Movers and HPSS Core Server in the HPSS Cluster (AIX or Linux)]
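Because the VFS exposes POSIX semantics, an application needs no HPSS-specific calls. The sketch below uses only standard file operations; the helper name `archive_copy` is hypothetical, and a temporary directory stands in for the VFS mount point, whose real path depends on the site configuration.

```python
import os
import shutil
import tempfile


def archive_copy(src_path, mount_root):
    """Copy a file under `mount_root` using ordinary POSIX-style file calls."""
    dst_path = os.path.join(mount_root, os.path.basename(src_path))
    shutil.copyfile(src_path, dst_path)
    return os.stat(dst_path).st_size


# Stand-in directories: on a real system mount_root would be the VFS mount.
with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as mount_root:
    src = os.path.join(work, "results.dat")
    with open(src, "wb") as handle:
        handle.write(b"x" * 1024)
    copied_size = archive_copy(src, mount_root)
    listing = os.listdir(mount_root)

print(copied_size, listing)
```

The same code would run unchanged against a local file system, which is the property that lets unmodified applications use the archive.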

  10. HPSS Feature – User-defined Attributes
     • Goals:
         • Provide an extensible set of APIs that will insert/update/delete/select UDAs from database
         • Provide robust search capability
     • Storage based on DB2 pureXML
     • Possible uses:
         • Checksum type w/value
         • Application specific
         • Expiration/action date
         • File version
         • Lustre path
         • Tar file TOCs
     • Planned uses:
         • HSI: cksum, expiration date, trashcan, annotation, some application specific
         • HTAR: creator code and expiration date
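The insert/update/delete/select operations on XML-stored attributes can be mimicked with the Python standard library. This is only an illustration of the idea: it uses `xml.etree.ElementTree` in memory rather than DB2 pureXML, and the function names, element names, and file paths are hypothetical, not the HPSS UDA API.

```python
import xml.etree.ElementTree as ET

catalog = ET.Element("catalog")   # in-memory stand-in for the database


def _find(path):
    for record in catalog.findall("file"):
        if record.get("path") == path:
            return record
    raise KeyError(path)


def insert_uda(path, **attrs):
    """Create a record for `path` with one child element per attribute."""
    record = ET.SubElement(catalog, "file", {"path": path})
    for name, value in attrs.items():
        ET.SubElement(record, name).text = str(value)


def update_uda(path, name, value):
    record = _find(path)
    node = record.find(name)
    if node is None:              # add the attribute if it is not set yet
        node = ET.SubElement(record, name)
    node.text = str(value)


def delete_uda(path, name):
    record = _find(path)
    node = record.find(name)
    if node is not None:
        record.remove(node)


def select_uda(name, value):
    """Return the paths of all files whose attribute `name` equals `value`."""
    return [record.get("path") for record in catalog.findall("file")
            if record.findtext(name) == str(value)]


insert_uda("/proj/run1.tar", cksum_type="md5", expiration="2012-01-01")
insert_uda("/proj/run2.tar", cksum_type="md5", expiration="2010-06-30")
update_uda("/proj/run2.tar", "expiration", "2012-01-01")
delete_uda("/proj/run1.tar", "cksum_type")

expiring = select_uda("expiration", "2012-01-01")
with_md5 = select_uda("cksum_type", "md5")
print(expiring, with_md5)
```

Keeping each attribute as its own element is what makes the set extensible: a new attribute name needs no schema change, only a new child element.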

  11. Extreme Scale (2018-2020)
     • Series of workshops conducted by users, applications, and organizations starting in 2007
     • Proposed new program within DOE to realize computing at exascale levels
     • Challenges:
         • Power
             • 20 MW - ?
         • Cost (size of the system, # of racks)
             • 3.6 - 300 PB of memory
         • Storage
             • Exabytes of data, millions of concurrent accesses, PB-scale dataset movement between sites
     • HPSS held an ES workshop and determined the following challenges:
         • Scalability
         • Data Management
         • System Management
         • Hardware

  12. HPSS v8.1
     • Multiple Metadata Servers
         • Optimizes multiple client transfers
         • Enables managing exabytes of data effectively
     • On-line Upgrades
         • Ability to upgrade HPSS software while the system remains available to users

  13. HPSS post 8.1
     • Advanced Data Management
         • Collaboration with data management community (SRMs, Content Managers…)
     • Integration with 3rd party tape monitoring applications
         • Crossroads, HiStor, Sun solutions?
     • Metadata footprint reduction
     • New client caching for faster pathname operations

  14. Thank you, Questions?
