HPSS The High Performance Storage System Developed by IBM, LANL, LLNL, ORNL, SNL, NASA Langley, NASA Lewis, Cornell, MHPCC, SDSC, UW with funding from DoE, NASA & NSF Presented by Christopher Ho, CSci 599
Motivation • In the last 10 years, processor speeds have increased 50-fold • Disk transfer rates have increased < 4x • RAID now successful, inexpensive • Tape speeds have increased < 4x • tape striping not widespread • Performance gap is widening! • Bigger & bigger files (10s, 100s of GB, soon TB) • => Launch scalable storage initiative
IEEE Mass Storage Reference Model • Defines layers of abstraction & transparency • device, location independence • Separation of policy and mechanism • Logical separation of control and data flow • Defines common terminology • “compliance” does not imply interoperability • Scalable, Hierarchical Storage Management • see http://www.ssswg.org/sssdocs.html
Introduction: Hierarchical Storage • Storage pyramid, top to bottom: Memory → Disk → Optical disk → Magnetic tape • Going down the pyramid: decreasing cost & speed, increasing capacity
HPSS Objectives • Scalable • transfer rate, file size, name space, geography • Modular • software subsystems replaceable, network/tape technologies updateable, API access • Portable • multiple vendor platforms, no kernel modifications, multiple storage technologies, standards-based, leverage commercial products
HPSS Objectives (cont) • Reliable • distributed software and hardware components • atomic transactions • mirror metadata • failed/restarted servers can reconnect • storage units can be varied on/offline
Access into HPSS • FTP • protocol already supports third-party transfers • new: partial file transfer (offset & size) • Parallel FTP • pget, pput, psetblocksize, psetstripewidth • NFS version 2 • most like a traditional file system, slower than FTP • PIOFS • parallel distributed FS on the IBM SP2 MPP • futures: AFS/DCE DFS, DMIG-API
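The partial-transfer idea maps onto the standard FTP REST command, which Python's ftplib already exposes; a minimal sketch follows, assuming a hypothetical host and path. HPSS's extended FTP also takes a size parameter, which stock ftplib does not, so here the callback simply stops keeping bytes past `size`:

```python
# Partial file read over FTP: start at `offset`, keep `size` bytes.
# Host and path are hypothetical; rest=offset issues a REST command.
from ftplib import FTP

def partial_get(host, path, offset, size):
    chunks, received = [], 0
    ftp = FTP(host)
    ftp.login()  # anonymous login

    def collect(block):
        nonlocal received
        if received < size:
            take = block[:size - received]
            chunks.append(take)
            received += len(take)
        # bytes past `size` are discarded; a real client would abort here

    ftp.retrbinary(f"RETR {path}", collect, rest=offset)
    ftp.quit()
    return b"".join(chunks)
```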
HPSS architecture • [Diagram: processing nodes and I/O nodes on an MPP interconnect; the HPSS server and Storage System Management on a control network; network-attached disk and tape on a HiPPI/FC/ATM data network; client access via NFS, FTP, and DMIG-API]
Software infrastructure • Encina transaction processing manager • two-phase commit, nested transactions • guarantees consistency of metadata, server state • OSF Distributed Computing Environment • RPC calls for control messages • Thread library • Security (registry & privilege service) • Kerberos authentication • 64-bit arithmetic functions • file sizes up to 2^64 bytes • on 32-bit platforms, big/little-endian architectures
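The 64-bit package exists because 32-bit platforms of the era had no native 64-bit integer type; the essential trick is carrying between two 32-bit halves. A purely illustrative sketch (names are assumptions, not the actual HPSS API):

```python
# A 64-bit file offset held as two 32-bit halves, with the carry
# propagated by hand -- the operation the HPSS helpers must provide.
MASK32 = 0xFFFFFFFF

def u64_add(hi, lo, n):
    """Add n to the 64-bit value (hi << 32) | lo."""
    lo += n
    hi = (hi + (lo >> 32)) & MASK32
    return hi, lo & MASK32
```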
Software components • Name server • maps POSIX filenames to an internal file, directory or link • Migration/Purge policy manager • when/where to migrate to the next level in the hierarchy • after migration, when to purge the copy at this level • purge initiated when usage exceeds an administrator-configured high-water mark • each file evaluated by size, time since last read • migration, purge can also be manually initiated (policy sketched below)
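A hedged sketch of that purge policy, assuming illustrative field names and a simple size-times-idle-time weighting (the real policy engine is administrator-configurable):

```python
import time

def select_purge_victims(files, used, capacity,
                         high_water=0.9, low_water=0.8):
    """Pick migrated files to purge once usage crosses the high-water
    mark, until usage would fall back below the low-water mark."""
    if used / capacity < high_water:
        return []
    now = time.time()
    # Rank migrated files: big and long-unread files go first
    candidates = sorted(
        (f for f in files if f["migrated"]),
        key=lambda f: f["size"] * (now - f["last_read"]),
        reverse=True,
    )
    victims, target = [], low_water * capacity
    for f in candidates:
        if used <= target:
            break
        victims.append(f)
        used -= f["size"]
    return victims
```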
Software components (cont) • Bitfile server • provides abstraction of bitfiles to client • provides scatter/gather capability • supports access by file offset, length • supports random and parallel reads/writes • works with file segment abstraction (see Storage server)
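One way to picture the Bitfile server's offset/length access: a read request must be decomposed into the underlying storage segments. A minimal sketch, assuming each segment record carries (file_offset, length, storage_addr):

```python
def segments_for_read(segments, offset, length):
    """Yield (storage address, byte count) pairs covering a read of
    `length` bytes starting at file offset `offset`."""
    end = offset + length
    for seg in segments:
        seg_end = seg["file_offset"] + seg["length"]
        if seg_end > offset and seg["file_offset"] < end:  # overlaps read
            start = max(offset, seg["file_offset"])
            stop = min(end, seg_end)
            yield seg["storage_addr"] + (start - seg["file_offset"]), stop - start
```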
Software components (cont) • Storage server • map segments onto virtual volumes, virtual volumes onto physical volumes • virtual volumes allow tape striping • Mover • transfers data from a source to a sink • tape, disk, network, memory • device control: seek, load/unload, write tape mark, etc.
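Tape striping amounts to a round-robin mapping of logical blocks across the physical volumes behind a virtual volume. An illustrative sketch (block size and stripe width are assumptions, not HPSS constants):

```python
def stripe_map(offset, block_size, stripe_width):
    """Map a logical byte offset on a striped virtual volume to
    (physical volume index, byte offset within that volume)."""
    block = offset // block_size
    volume = block % stripe_width          # which physical volume
    local_block = block // stripe_width    # block index on that volume
    return volume, local_block * block_size + offset % block_size
```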
Software components (cont) • Physical Volume Library • map physical volume to cartridge, cartridge to PVR • Physical Volume Repository • control cartridge mount/dismount functions • modules for Ampex D2, STK 4480/90 & SD-3, IBM 3480 & 3590 robotic libraries • Repack server • deletions leave gaps on sequential media • read live data, rewrite on a new sequential volume, free up the previous volume
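The repack loop is simple in outline: copy only the live segments to a fresh volume, update the metadata, and release the old volume. A sketch against a hypothetical volume/catalog interface (none of these methods are the real HPSS API):

```python
def repack(old_volume, new_volume, catalog):
    """Reclaim a fragmented sequential volume by rewriting live data."""
    for segment in catalog.live_segments(old_volume):  # skip deleted gaps
        data = old_volume.read(segment)
        new_location = new_volume.append(data)
        catalog.relocate(segment, new_location)  # metadata update (atomic)
    old_volume.mark_free()  # whole volume can now be reused
```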
Software components (cont) • Storage system management • GUI to monitor/control HPSS • stop/start software servers • monitor events and alarms, manual mounts • vary devices on/offline
Parallel transfer protocol - goals • Provide parallel data exchange between heterogeneous systems and devices • Support different combinations of parallel and sequential source/sink • Support gather/scatter and random access • combinations of stripe widths, both regular and irregular data block sizes • Scalable I/O bandwidth • Transport independent (TCP/IP, HiPPI, FCS, ATM)
Gather/scatter lists • [Diagram: a logical window of data blocks on sources S1-S3 is mapped through gather/scatter lists to destinations D1-D3]
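The core mechanism is a pair of (offset, length) extent lists, one per side, so strided data at the source can land contiguously at the sink (or vice versa). An in-memory illustration:

```python
def scatter_gather_copy(src, dst, src_list, dst_list):
    """src_list/dst_list are (offset, length) pairs describing the same
    total byte count on each side; src is bytes, dst is a bytearray."""
    # Gather the source extents into one contiguous stream
    data = b"".join(src[off:off + length] for off, length in src_list)
    # Scatter that stream into the destination extents
    pos = 0
    for off, length in dst_list:
        dst[off:off + length] = data[pos:pos + length]
        pos += length
```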
Parallel transport architecture • [Diagram: the client holds control connections to source movers S1…Sn and destination movers D1…Dn; data flows in parallel between sources and destinations]
Parallel FTP transfer (pget) • [Diagram: numbered control flow among the Parallel FTP client, Parallel FTPd, Name server, Bitfile server, Storage server, and Movers, ending with parallel data flow from the Movers to the client movers]
Summary • High performance • up to 1 GB/s aggregate transfer rates • Scalable storage • parallel architecture • terabyte-sized files • petabytes in archive • Robust • transaction processing manager • Portable • IBM, Sun implementations available
Conclusion • Feasibility has been demonstrated for large, scalable storage • Software exists, is shipping, and is in daily use at the national labs • Distributed architecture and parallel capabilities mesh well with grid computing