Integrating HDF5 with SRB: Object-level Access to Remote Files
Peter Cao, NCSA
Mike Wan, SDSC
Sponsored by NLADR, NSF PACI Project in Support of NCSA-SDSC Collaboration
SRB Workshop, San Diego
Outline
• Introduction to HDF5
• The HDF-SRB model
• SRB Support in HDFView
Overview of HDF5: Answering big questions …
• Matter & universe
• Life & nature
• Weather & climate
[Figure: Total Column Ozone (Dobson), August 24, 2001 and August 24, 2002]
Overview of HDF5: Involves big data …
Overview of HDF5: On big computers …
Overview of HDF5: HDF solution …
• Common models, extensions
• File format for all kinds of data
• Efficiency: storage & I/O
• Software & tools: open source & multiple platforms
• Standard APIs: conventions & easy use
Overview of HDF5: Example HDF5
Overview of HDF5: HDF Software
• Tools & Applications
• HDF I/O Library
• HDF File
Overview of HDF5: Object model
• Primary Objects
    • Groups
    • Datasets
• Additional ways to organize data
    • Attributes
    • Sharable objects
    • Storage and access properties
Overview of HDF5: Groups
• A mechanism for collections of related objects
• Every file starts with a root group
• Similar to UNIX directories
• Can have attributes
[Figure: group tree with labels "/", tom, harry, dick, temp]
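The directory analogy can be made concrete with a minimal sketch in plain Python. This is not the HDF5 API: the `Group` class and the `resolve` helper are hypothetical names, and the placement of "temp" under "tom" is an assumption made only to show a nested path.

```python
class Group:
    """Hypothetical stand-in for an HDF5 group: a named collection with attributes."""

    def __init__(self, name):
        self.name = name
        self.members = {}   # child name -> Group (datasets omitted for brevity)
        self.attrs = {}     # groups can carry attributes

    def create_group(self, name):
        child = Group(name)
        self.members[name] = child
        return child


def resolve(root, path):
    """Walk a UNIX-style absolute path starting at the root group."""
    node = root
    for part in path.strip("/").split("/"):
        if part:  # an empty part means the path was just "/"
            node = node.members[part]
    return node


root = Group("/")  # every file starts with a root group
for name in ("tom", "harry", "dick"):
    root.create_group(name)

# Assumed layout: "temp" is nested under "tom" purely for illustration.
root.members["tom"].create_group("temp")
root.members["tom"].attrs["owner"] = "example"
```

With this layout, `resolve(root, "/tom/temp")` returns the nested group, much as a shell would resolve a directory path.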
Overview of HDF5: Datasets
• Metadata
    • Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
    • Datatype: IEEE 32-bit float
    • Attributes: time = 32.4, pressure = 987, temp = 56
    • Storage info: chunked, compressed
• Data
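As a rough illustration of how the metadata describes the data array, the sketch below records the slide's values in a plain Python structure and derives the element count and uncompressed size. `DatasetInfo` is a hypothetical name, not part of any HDF5 library.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class DatasetInfo:
    """Hypothetical record of the dataset metadata shown on the slide."""
    dims: tuple                 # dataspace dimensions
    element_size: int           # bytes per element (4 for a 32-bit float)
    attributes: dict = field(default_factory=dict)
    storage: tuple = ()

    @property
    def rank(self):
        return len(self.dims)

    @property
    def num_elements(self):
        return prod(self.dims)

    @property
    def raw_size_bytes(self):
        # Size of the uncompressed data array only, excluding metadata.
        return self.num_elements * self.element_size


info = DatasetInfo(
    dims=(4, 5, 7),
    element_size=4,
    attributes={"time": 32.4, "pressure": 987, "temp": 56},
    storage=("chunked", "compressed"),
)
print(info.rank, info.num_elements, info.raw_size_bytes)  # 3 140 560
```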
Overview of HDF5: Data subsetting
(a) Hyperslab from a 2D array to the corner of a smaller 2D array
(b) Regular series of blocks from a 2D array to a contiguous sequence at a certain offset in a 1D array
(c) A sequence of points from a 2D array to a sequence of points in a 3D array
(d) Union of hyperslabs in file to union of hyperslabs in memory
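Cases (a) and (b) can be sketched in plain Python on nested lists. The parameter names start, stride, count, and block follow the vocabulary HDF5 uses for hyperslab selection, but the functions below are hypothetical helpers written for illustration, not calls into the HDF5 library, and the array sizes are arbitrary.

```python
def axis_indices(start, stride, count, block=1):
    """Indices along one axis: `count` blocks of `block` elements, `stride` apart."""
    return [start + i * stride + j for i in range(count) for j in range(block)]


def select_hyperslab(array, start, stride, count, block=(1, 1)):
    """Return the selected elements of a 2D list as a new 2D list."""
    rows = axis_indices(start[0], stride[0], count[0], block[0])
    cols = axis_indices(start[1], stride[1], count[1], block[1])
    return [[array[r][c] for c in cols] for r in rows]


def copy_to_corner(dest, patch):
    """Write `patch` into the top-left corner of `dest` and return `dest`."""
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            dest[r][c] = value
    return dest


# 6 x 8 source array whose value encodes its position: 10 * row + col.
source = [[10 * r + c for c in range(8)] for r in range(6)]

# Case (a): a contiguous 2 x 3 hyperslab starting at row 1, column 2,
# copied to the corner of a smaller 4 x 4 array.
patch = select_hyperslab(source, start=(1, 2), stride=(1, 1), count=(2, 3))
small = copy_to_corner([[0] * 4 for _ in range(4)], patch)

# Case (b): a regular series of 2 x 2 blocks, written as a contiguous
# sequence at offset 3 of a 1D buffer.
blocks = select_hyperslab(source, start=(0, 0), stride=(3, 4), count=(2, 2), block=(2, 2))
flat = [v for row in blocks for v in row]
buffer = [0] * 20
offset = 3
buffer[offset:offset + len(flat)] = flat
```

Cases (c) and (d) follow the same idea with point lists and unions of selections in place of a single regular pattern.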
Project Description: Motivation
High-performance distributed data system
Project Description: Goals
• Use SRB as middleware to transfer data between the server and client
• Use object-level access for interactive and efficient access to part of the file
Working prototype of a client/server system for object-level access to HDF5 files stored in the SRB
Remote Data Access on SRB: Methods
• Normal ways to access SRB:
    • Get the whole file: large files (100TB SCEC)
    • Use POSIX low-level calls: low performance
• New way:
    • Implement proxy operations to access objects or parts of objects in one request
Normal SRB File Access: Architecture
[Diagram labels: client; HDF5 file; File (whole file or a sequence of bytes); SRB Server; MCAT]
Object-level File Access: Architecture
• Client: Client Application; HDF5 Object (File, Group, Dataset, Subset, Attribute); HDF5-SRB Module (pack/unpack messages)
• Server: SRB Server; MCAT; HDF5-SRB Module (pack/unpack messages); HDF5 Object (File, Group, Dataset, Subset, Attribute); HDF5 Library; HDF5 file
Examples of File Access
"I need to see the eye of Hurricane Bob!"
[Figure: HDF5 file]
Examples of File Access: Whole file transfer
• Client request: "Get the file"
• Transfer large image – slow!
Examples of File Access: SRB POSIX API
• Client/server exchange: "Open file" / "file's open", "find image" / "image found", "open image" / "image open"
• Many small messages – slow and complex!
Examples of File Access: Object level
• Client request: "Get me the eye of Hurricane Bob"
• 1 request, small transfer – fast!
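The contrast between the three access styles can be illustrated with a simple cost model in which every round trip pays a fixed latency and the payload pays for bandwidth. The model and every number below are hypothetical, chosen only to show the shape of the trade-off; they are not measurements from the project.

```python
def transfer_time(round_trips, payload_bytes, latency_s, bandwidth_bps):
    """Cost model: each round trip pays the latency once; the payload pays for bandwidth."""
    return round_trips * latency_s + payload_bytes / bandwidth_bps


# Hypothetical values for illustration only.
LATENCY_S = 0.05              # 50 ms per round trip
BANDWIDTH_BPS = 10_000_000    # 10 MB/s
FILE_BYTES = 2_000_000_000    # the whole file: 2 GB
SUBSET_BYTES = 1_000_000      # the region of interest: 1 MB
POSIX_ROUND_TRIPS = 200       # many small open/find/read exchanges

# Whole file transfer: one request, but the full file crosses the network.
whole_file = transfer_time(1, FILE_BYTES, LATENCY_S, BANDWIDTH_BPS)

# POSIX-style access: small payload, but many round trips.
posix_calls = transfer_time(POSIX_ROUND_TRIPS, SUBSET_BYTES, LATENCY_S, BANDWIDTH_BPS)

# Object-level access: one request and only the subset is returned.
object_level = transfer_time(1, SUBSET_BYTES, LATENCY_S, BANDWIDTH_BPS)
```

Under these assumed values the object-level request avoids both costs: it moves only the subset and pays the latency once.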
HDF5-SRB Model: New objects/APIs
• A new set of data objects
    • H5File, H5Group, H5Dataset, H5Datatype, etc.
    • Encapsulate client requests and server results
• Enhanced SRB APIs
    • Pack/unpack routines (exchange data between byte stream and structure) to handle complicated structs: strings, pointers, pointers to arrays, arrays of pointers, etc.
    • New srbGenProxyFunct (general proxy function) handles other types of requests besides HDF5
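The idea behind the pack/unpack routines, converting a structure with variable-length members to a byte stream and back, can be sketched with Python's standard `struct` module. The `DatasetRequest` layout and the `pack_msg`/`unpack_msg` helpers are hypothetical; the actual SRB routines and the real H5Dataset structure are not shown in the slides.

```python
import struct
from dataclasses import dataclass

HEADER = "!iII"  # op code, string length, array length (network byte order)


@dataclass
class DatasetRequest:
    """Hypothetical request structure with a string and an integer array."""
    op: int        # operation code
    path: str      # variable-length string
    start: list    # variable-length array of integers


def pack_msg(req):
    """Structure -> byte stream: fixed header, then the string, then the array."""
    path_bytes = req.path.encode("utf-8")
    header = struct.pack(HEADER, req.op, len(path_bytes), len(req.start))
    body = struct.pack(f"!{len(req.start)}q", *req.start)
    return header + path_bytes + body


def unpack_msg(data):
    """Byte stream -> structure: read the lengths first, then the members."""
    op, path_len, n = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    path = data[offset:offset + path_len].decode("utf-8")
    offset += path_len
    start = list(struct.unpack_from(f"!{n}q", data, offset))
    return DatasetRequest(op, path, start)


request = DatasetRequest(op=2, path="/images/bob", start=[100, 200])
wire = pack_msg(request)
restored = unpack_msg(wire)
```

Writing lengths ahead of the variable-length members is one common way to flatten strings and arrays; pointer-rich C structures need the same kind of explicit length bookkeeping.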
HDF5-SRB Model: Data Flow
Client API: srbObjRequest(void *obj, int objID)
Server API: srbObjProcess(void *obj, int objID)
1. packMsg() (client)
2. unpackMsg() (server)
3. H5Obj::op()
4. Access file (HDF5 Library, HDF5 file)
5. H5Object
6. packMsg() (server)
7. unpackMsg() (client)
Messages travel between client and SRB Server through srbGenProxyFunct.
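The seven steps can be traced in a small in-process simulation. Everything here is a hypothetical stand-in: JSON replaces the real message packing, a dictionary replaces the HDF5 file, and `client_request` and `server_process` only mirror the roles of srbObjRequest and srbObjProcess rather than their actual implementations.

```python
import json

H5OBJ_DATASET = 1  # hypothetical object ID

# Stand-in for the HDF5 file on the server (hypothetical path and contents).
FAKE_FILE = {"/images/bob": list(range(100))}


def pack_msg(obj):
    return json.dumps(obj).encode("utf-8")


def unpack_msg(data):
    return json.loads(data.decode("utf-8"))


def read_dataset(obj):
    """Steps 3-5: run the operation, access the file, fill in the result object."""
    data = FAKE_FILE[obj["path"]]
    obj["data"] = data[obj["start"]:obj["start"] + obj["count"]]
    return obj


# The object ID selects which handler processes the request.
HANDLERS = {H5OBJ_DATASET: read_dataset}


def server_process(message, obj_id):
    """Server side, mirroring the role of srbObjProcess."""
    obj = unpack_msg(message)           # step 2
    result = HANDLERS[obj_id](obj)      # steps 3-5
    return pack_msg(result)             # step 6


def client_request(obj, obj_id, transport=server_process):
    """Client side, mirroring the role of srbObjRequest."""
    message = pack_msg(obj)             # step 1
    reply = transport(message, obj_id)  # one round trip through the proxy call
    return unpack_msg(reply)            # step 7


result = client_request({"path": "/images/bob", "start": 40, "count": 5}, H5OBJ_DATASET)
print(result["data"])  # [40, 41, 42, 43, 44]
```

The client never touches the file directly; it only packs a request and unpacks the reply, which is why no HDF5 library is needed on that side.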
Running Server/Client
• An SRB server that supports HDF5
    • HDF5 library and other external libraries (SZIP, ZLIB)
    • SRB version 3.4 or later from http://www.sdsc.edu/srb/
    • Follow the instructions on how to run the SRB server in the User's Guide (UG) packaged with the SRB source release, or online at http://hdf.ncsa.uiuc.edu/hdf-srb-html/HDF-SRB-UG.html
• Any client application that implements HDF5-SRB Objects
    • No HDF5 library is required on the client
    • Example client application: HDFView 2.3 or above
Short Demo: HDFView
• Supports Windows and Linux
Questions / Comments?