Storage Systems in HPC John A. Chandy Department of Electrical and Computer Engineering University of Connecticut
Research Summary • Storage Systems • Active Storage • Parallel File Systems • Reliable Data Storage • Active Storage Networks
Storage Systems • Parallel Computing • Building parallel file systems to support HPC • Computation at the storage node • Data organization methods to improve performance • Reliable Data Storage • Customizable and extensible storage for reliability • Backup strategies using personal storage devices • Data security, trust, and reliability in the cloud
Parallel File Systems • Network Attached Storage • Put the storage on the network with a computer (server) acting as the go-between
Parallel File Systems • Separate the metadata from the storage
Parallel File Systems • How do you improve metadata performance? • Distribute metadata services on data nodes • Use active storage and object services
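One way to picture distributing metadata services across data nodes is to hash each file path to pick the node that owns its metadata. The sketch below is illustrative only: `metadata_node_for` and the choice of SHA-256 are assumptions, not the scheme used in the system presented here.

```python
import hashlib


def metadata_node_for(path: str, num_nodes: int) -> int:
    """Pick the data node that holds the metadata for `path` (hypothetical scheme)."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    # A stable hash keeps the mapping identical on every client.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


if __name__ == "__main__":
    for p in ("/data/run1/out.h5", "/data/run2/out.h5", "/home/user/notes.txt"):
        print(p, "-> metadata on node", metadata_node_for(p, 4))
```

Plain modulo hashing remaps most paths when the node count changes, so a real deployment would likely need a more stable placement scheme.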
Active Storage • Allows us to run applications on storage nodes • Can dramatically reduce data traffic • Eliminates large network latencies • Takes advantage of fast RAID arrays and SSDs • Drives are otherwise bottlenecked by slow networks • Runs applications in parallel across multiple nodes • Makes use of unused processor time
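The traffic reduction can be illustrated with a toy model that counts the bytes leaving a storage node. Everything here (`StorageNode`, `read`, `execute`, the byte counter) is a hypothetical stand-in for illustration, not an API from the talk.

```python
class StorageNode:
    """Toy storage node that tracks how many bytes it sends over the network."""

    def __init__(self):
        self.objects = {}
        self.bytes_sent = 0

    def write(self, oid, data: bytes) -> None:
        self.objects[oid] = data

    def read(self, oid) -> bytes:
        # Conventional path: the whole object crosses the network.
        data = self.objects[oid]
        self.bytes_sent += len(data)
        return data

    def execute(self, oid, func):
        # Active storage path: run func next to the data, ship only the result.
        result = func(self.objects[oid])
        self.bytes_sent += len(repr(result).encode("utf-8"))
        return result


if __name__ == "__main__":
    node = StorageNode()
    node.write("log", b"ab" * 500_000)

    node.read("log").count(b"a")              # client-side count
    print("bytes moved, client-side:", node.bytes_sent)

    node.bytes_sent = 0
    node.execute("log", lambda d: d.count(b"a"))  # node-side count
    print("bytes moved, node-side:", node.bytes_sent)
```

The saving depends on the function: a filter or aggregate returns far less than it reads, while a function whose output is as large as its input gains nothing in traffic.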
Programming Model • Based on object storage • RPC based • Executable objects • RPC calls have full access to all object functions – read, write, create, set attribute, etc. • Functions can be synchronous or async • Supports multiple languages (C, Java, Python)
Programming Model • Builds on work by Acharya and Riedel, which is stream based • Our model is Remote Procedure Call (RPC) based • Uses executable objects • Added a command to begin execution • Allows full access to all OSD functions • Functions can be run sync or async • Async is needed because of the 30-second iSCSI timeout • Working to allow queries for async functions • Allows parallel execution using async • Supports multiple languages (C, Java, Python)
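The synchronous and asynchronous call styles, and parallel execution through async calls, can be sketched with Python's standard `concurrent.futures`. `ActiveNode`, `execute_sync`, `execute_async`, and `parallel_sum` are hypothetical names for illustration; the sketch models neither the OSD command set nor the iSCSI transport.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class ActiveNode:
    """Toy storage node that can run a function next to its objects."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self._pool = ThreadPoolExecutor(max_workers=2)

    def execute_sync(self, oid, func):
        # Caller blocks until the function finishes.
        return func(self.objects[oid])

    def execute_async(self, oid, func) -> Future:
        # Returns immediately; the caller collects the result later.
        return self._pool.submit(func, self.objects[oid])

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def parallel_sum(nodes, oid):
    """Start the same function on every node, then combine the partial results."""
    futures = [node.execute_async(oid, sum) for node in nodes]
    return sum(f.result() for f in futures)


if __name__ == "__main__":
    nodes = [ActiveNode({"part": [1, 2, 3]}), ActiveNode({"part": [4, 5]})]
    print(nodes[0].execute_sync("part", sum))
    print(parallel_sum(nodes, "part"))
    for n in nodes:
        n.close()
```

A long-running function suits the async path, since a blocking call could outlast a transport timeout such as the 30-second iSCSI limit.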
Security • Multiprocess implementation • Limits AS functions from directly accessing objects • Limits access to the object services library • Enforces use of object security mechanisms • chroot sandboxing • C/Java engines run in a chroot directory • Allows limited system libraries – e.g. libc
Security • Multiprocess Implementation • Limits AS functions from directly accessing objects • Limits access to the OSD services library • Forces the use of RPC • Enforces the use of OSD security mechanisms • Chroot Sandboxing • Applied to engines • Confines engines to a single directory • Allows limiting of libraries • AS versions of libraries possible
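Chroot sandboxing of an engine process can be sketched with Python's `os` module on a Unix-like system. This is an illustration under stated assumptions, not the talk's implementation: `enter_sandbox` and its parameters are hypothetical, the call needs root privileges, and the sandbox directory must already contain whatever limited libraries the engine is allowed to load.

```python
import os


def enter_sandbox(root_dir: str, uid: int, gid: int) -> None:
    """Confine the current process to root_dir, then drop root privileges."""
    os.chroot(root_dir)  # "/" now refers to root_dir for this process
    os.chdir("/")        # do not keep a working directory outside the jail
    os.setgroups([])     # drop supplementary groups while still root
    os.setgid(gid)       # group first: changing it needs privileges that setuid gives up
    os.setuid(uid)       # finally give up root, which the usual chroot escapes rely on
```

A launcher would typically fork, call `enter_sandbox` in the child, and only then load the active storage function.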
High Performance Computing • Active storage network • Computing in the network • SIMD-like processing of data in motion • Adaptive computing network elements • Application optimizations for database queries, scientific applications, data mining, sort, etc.
Active Storage Networks • Data Sort
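One possible reading of sorting in an active storage network: each storage node emits a sorted stream, and a network element merges the streams while they are in flight, so the client receives sorted data without sorting it itself. This reading is an assumption, not a detail from the slides. The sketch uses the standard `heapq.merge`; `storage_node_stream` and `network_merge` are hypothetical names, and the network hardware is not modeled.

```python
import heapq
from typing import Iterable, Iterator, List


def storage_node_stream(chunk: List[int]) -> Iterator[int]:
    """A storage node sorts its local chunk and streams it out."""
    yield from sorted(chunk)


def network_merge(streams: Iterable[Iterator[int]]) -> Iterator[int]:
    """A network element merges already-sorted streams as records pass through."""
    # heapq.merge is lazy: it holds one pending record per input stream.
    return heapq.merge(*streams)


if __name__ == "__main__":
    chunks = [[9, 1, 5], [4, 8], [7, 2, 6, 3]]
    merged = network_merge(storage_node_stream(c) for c in chunks)
    print(list(merged))
```

Because the merge is lazy, the network element only needs to buffer one record per incoming stream, which fits the idea of operating on data in motion.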
BECAT Collaboration • Large Data Problems • Parallel File Systems Implementation