SDM CENTER
Active Storage Processing in Parallel File Systems
Jarek Nieplocha, Evan Felix, Juan Piernas-Canovas
Active Storage in Parallel Filesystems
• Active Storage exploits the old concept of moving computing to the data source
• Avoids data movement across the network in a parallel machine by allowing applications to use compute resources on the I/O nodes of the cluster for data processing
[Figure: two panels, "Traditional Approach" and "Active Storage", each showing Y=foo(X) on compute nodes (P) connected through the network to I/O nodes (FS)]
Example
• BLAS DSCAL on disk: Y = α·Y
• Experiment
  • Traditional: The input file is read from the file system, and the output file is written to the same file system. The input file has 120,586,240 doubles.
  • Active Storage: Each server receives the factor α, reads the array of doubles from its local disk, and stores the resulting array on the same disk. Each server processes 120,586,240/N doubles, where N is the number of servers.
• Speedup attributed to avoiding data movement between client and servers
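The Active Storage variant of this experiment can be sketched in a few lines. This is a minimal illustration, not the code used in the experiment: the function names `partition` and `dscal_local`, the chunk size, and the assumption that each server holds its share as a flat file of native doubles are all hypothetical.

```python
import array

def partition(total, n_servers):
    """Split `total` elements into n_servers contiguous (start, count) ranges."""
    base, extra = divmod(total, n_servers)
    ranges, start = [], 0
    for rank in range(n_servers):
        count = base + (1 if rank < extra else 0)
        ranges.append((start, count))
        start += count
    return ranges

def dscal_local(path, alpha, chunk_elems=1 << 16):
    """Scale every double in a server-local file in place: Y = alpha * Y.

    Assumes (hypothetically) that the file is a flat array of native doubles.
    Only the scalar alpha has to reach the server; the data never leaves it.
    """
    item = array.array("d").itemsize
    with open(path, "r+b") as f:
        while True:
            pos = f.tell()
            raw = f.read(chunk_elems * item)
            if not raw:
                break
            block = array.array("d")
            block.frombytes(raw)
            scaled = array.array("d", (alpha * y for y in block))
            f.seek(pos)               # overwrite the chunk that was just read
            f.write(scaled.tobytes())
```

In the traditional approach the same loop would run on a client, so every chunk would cross the network twice (once to be read, once to be written back).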
Related Work
• Active Disk/Storage concept was introduced a decade ago to use processing resources ‘near’ the disk
  • On the disk controller
  • On processors connected to disks
  • Reduces network bandwidth/latency limitations
• References
  • DiskOS Stream Based model (ASPLOS ’98: Acharya, Uysal, Saltz)
  • Active Storage For Large-Scale Data Mining and Multimedia (VLDB ’98: Riedel, Gibson, Faloutsos)
• Research proved the Active Disk idea interesting, but
  • Difficult to take advantage of in practice
  • Processors in disk controllers not designed for the purpose
  • Vendors have not been providing SDKs
Lustre Architecture
[Figure: Lustre architecture diagram showing clients (O(10000)) with a NAL, MDS nodes, and OSTs, with scale markers O(10) and O(1000); connections labeled "Directory Metadata & concurrency", "File IO & Locking", and "Recovery, File Status, File Creation"]
Lustre Client
• Application I/O requests
• LLITE module implements the Linux VFS layer
• LOV stripes objects and targets I/O to the correct Object Storage Client (OSC)
• OSC packages up the request for transmission over the NAL
[Figure: client stack: User Space Application, LLITE, LOV, OSC, NAL]
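The striping step performed by the LOV can be illustrated with a small sketch. It assumes a simple round-robin (RAID-0 style) layout with a fixed stripe size; `locate`, `split_request`, `stripe_size`, and `stripe_count` are hypothetical names chosen for illustration and are not taken from the Lustre source.

```python
def locate(offset, stripe_size, stripe_count):
    """Map a file byte offset to (target index, byte offset inside that target's object)."""
    stripe_no = offset // stripe_size
    target = stripe_no % stripe_count
    obj_offset = (stripe_no // stripe_count) * stripe_size + offset % stripe_size
    return target, obj_offset

def split_request(offset, length, stripe_size, stripe_count):
    """Break one contiguous request into per-target (target, obj_offset, nbytes) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        target, obj_offset = locate(offset, stripe_size, stripe_count)
        # A piece never crosses a stripe boundary.
        nbytes = min(stripe_size - offset % stripe_size, end - offset)
        pieces.append((target, obj_offset, nbytes))
        offset += nbytes
    return pieces
```

Under this layout, a request that straddles a stripe boundary becomes two pieces addressed to two different targets, which is why a server-side computation only ever sees its own stripes of a file.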
Lustre Object Storage Server
• Requests arrive from the Portals NAL
• Object Storage Target directs each request to the appropriate lower-level OBD
• OBDfilter presents ext3 as an Object Based Disk
[Figure: server stack: NAL, OST, OBDfilter, ext3]
Current Implementation of Active Storage
• Extra module passes data through until told to pipe data elsewhere
• Data is sent to a user-space process through a Unix character device file
• Processed data is written back to disk
• Pattern: 1W->2W
[Figure: server stack of NAL, OST, ASOBD, OBDfilter, ext3, together with ASDEV and the User Space Processing Component]
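One reading of the 1W->2W pattern is that each incoming write produces two writes on the server: the original data and a processed copy. The sketch below illustrates that reading only. It is an assumption, not the ASOBD/ASDEV code: in-memory streams stand in for the character device and the on-disk objects, and `one_write_to_two` and `transform` are hypothetical names.

```python
import io

def one_write_to_two(chunks, raw_out, processed_out, transform):
    """For each incoming write, store the raw data and a processed copy of it."""
    for chunk in chunks:
        raw_out.write(chunk)                    # first write: data as received
        processed_out.write(transform(chunk))   # second write: user-space result

# Usage: byte-wise upper-casing plays the role of the processing component.
raw, processed = io.BytesIO(), io.BytesIO()
one_write_to_two([b"ab", b"cd"], raw, processed, bytes.upper)
```

The copy made when each chunk is handed to `transform` corresponds to the user/kernel memory copies that the deck lists as a cost of the current design.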
SC’2004 StorCloud Most Innovative Use Award
• 320 TB Lustre
• 984 400GB disks
• 40 Lustre OSSs running Active Storage
  • 4 Logical Disks (160 OSTs)
  • 2 Xeon Processors
• 1 MDS
• 1 Client creating files
Sustained 4GB/s Active Storage write processing
[Figure: client system connected over a Gigabit network to a Lustre MDS and to Lustre OSS 0 through Lustre OSS 39, each with Lustre OSTs]
Real Time Visualization
[Figure: plot with axes labeled "Time" and "OST #"]
Status and Future Work
• Status
  • Proof of concept 1W->2W code works now
  • Difficult to administer and use – 2 people
  • Memory copies between user and kernel space
• Future
  • Implement other processing patterns
  • Optimize performance by eliminating memory copies
  • Implement Active Storage for PVFS
  • Support different striping in files
  • HDF, NetCDF, Database Stored Procedure Calls …