N-bit and ScaleOffset filters

N-bit and ScaleOffset filters MuQun Yang National Center for Supercomputing Applications University of Illinois at Urbana-Champaign Urbana, IL 61801 contact: ymuqun@ncsa.uiuc.edu

N-Bit Filter Outline • Definition • An usage example • Limitations

Enable N-Bit filter • Create a dataset creation property list • Set chunking (and specify chunk dimensions) • Set up use of the N-Bit filter • Create dataset specifying this property list • Close property list

N-Bit filter: usage example /*Define dataset datatype (N-Bit), and set precision, offset */ datatype = H5Tcopy(H5T_NATIVE_INT); precision = 17; H5Tset_precision(datatype,precision); offset = 4; if(H5Tset_offset(datatype,offset); /* Set the dataset creation property list for N-Bit */ chunk_size[0] = CH_NX; chunk_size[1] = CH_NY; properties = H5Pcreate (H5P_DATASET_CREATE); H5Pset_chunk (properties, 2, chunk_size); H5Pset_nbit (properties); /* Create a new dataset with N-Bit datatype */ dataset = H5Dcreate (file, DATASET_NAME, datatype, dataspace, properties);

N-bit filter restrictions • Only compresses N-Bit datatype or field derived from integer or floating-point • Assumes padding bits of zero • No handling if fill value datatype differs from dataset datatype

ScaleOffset Filter Outline • Definition • How does ScaleOffset filter work • Why uses ScaleOffset filter • Usage examples • Performance with EOS data • Limitations

ScaleOffset filter • ScaleOffset compression performs a scale and/or offset operation on each data value and truncates the resulting value to a minimum number of bits before storing it. The datatype is either integer or floating-point. • offset in Scale-Offset compression means the minimum value of a set of data values • If a fill value is defined for the dataset, the filter will ignore it when finding the minimum value

How ScaleOffset works An example for Integer Type • Maximum is 7065; Minimum is 2970 The "span" = Max-Min+1 = 4096 • Case 1: No fill value is defined. Minimum number of bits per element to store = ceiling(log2(span)) = 12 • Case2: Fill value is defined in this array. Minimum number of bits per element to store = ceiling(log2(span+1)) = 13

How ScaleOffset works An example for Integer Type (cont.) • Compression: 1. Subtract minimum value from each element 2. Pack all data with minimum number of bits • Decompression: 1. Unpack all data 2. Add minimum value to each element • Outcome: 1. Save about 60% disk space for this case

How ScaleOffset works An example for Floating-point Type • D-scaling factor: the number of decimal precision to keep for the filter output • Floating-point data:{104.561, 99.459, 100.545, 105.644} • D-scaling factor:2 • Preparation for Compression 1. Calculate the minimum value = 99.459 2. Subtract the minimum value Modified data: {5.102, 0, 1.086, 6.185} 3. Scale the data by multiplying 10 ^ D-scaling factor = 100 Modified data: {510.2, 0, 108.6, 618.5} 4. Round the data to integer Modified data: {510 , 0, 109, 619}

How ScaleOffset works An example for Floating-point Type (cont.) • Compression and Decompression: using ScaleOffset for integer • Restoration after decompression 1. Divide each value by 10^ D-scaling factor 2. Add the offset 99.459 3. The floating point data {104.559, 99.459, 100.549, 105.649} • Outcome: 1. Lossy compression 2. Compression ratio will depend on D-scaling factor

Scale-Offset filter compresses floating-point data • GRiB data packing method • The Scale-Offset compression of floating-point data is lossy • Two scaling methods: D-scaling E-scaling Currently only D-scaling is implemented

Why ScaleOffset Filter? • Internal HDF5 filter • Easy to understand • Integer: lossless (by default) • Floating-point: GRIB lossy compression • Easy to control floating compression • D-scaling factor • Easy to estimate the compression ratio

H5Pset_scaleoffset API H5Pset_scaleoffset (hid_t plist_id, H5Z_SO_scale_type_t scale_type, int scale_factor) • plist_id IN: Dataset creation property list identifier • scale_type IN: Flag indicating compression method • H5Z_SO_FLOAT_DSCALE (0) Floating-point type • H5Z_SO_INT (2) Integer type • scale_factor IN: Flag indicating compression method • If scale_type is H5Z_SO_FLOAT_DSCALE, decimal scale factor • If scale_type is H5Z_SO_INT, scale_factor denotes minimum-bits, should be a positive integer or H5Z_SO_INT_MINBITS_DEFAULT

Integer example /* Set the fill value of dataset */ fill_val = 10000; H5Pset_fill_value(properties, H5T_NATIVE_INT, &fill_val); /* Set parameters for Scale-Offset compression*/ H5Pset_scaleoffset (properties, H5Z_SO_INT H5Z_SO_INT_MINBITS_DEFAULT); /* Create a new dataset */ dataset = H5Dcreate (file, DATASET_NAME, H5T_NATIVE_INT, dataspace, properties);

Floating-point example fill_val = 10000.0; /* Set the fill value of dataset */ H5Pset_fill_value(properties, H5T_NATIVE_FLOAT, &fill_val); /* Set parameters for Scale-Offset compression; use D-scaling method, set decimal scale factor to 3 */ H5Pset_scaleoffset (properties,H5Z_SO_FLOAT_DSCALE, 3); /* Create a new dataset */ dataset = H5Dcreate (file, DATASET_NAME, H5T_NATIVE_FLOAT, dataspace, properties);

Limitations • Compressed floating-point data range is limited by the size of corresponding unsigned integer type. • Long double is not supported.

Thank you This work was funded by the NASA Earth Science Technology Office under NASA award AIST-02-0071 and UCAR Subaward No S03-38820. Other support was provided based upon the Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA grant NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of NASA.

N-bit and ScaleOffset filters

N-bit and ScaleOffset filters

Presentation Transcript

Fluorescence and filters

Filters

Filters

Film and filters

Filters and Edges

Bit by bit

Bit and Byte

Chapter 11 Filters and Tuned Amplifiers Passive LC Filters Inductorless Filters Active-RC Filters

Processes and Filters

Illumination and Filters

N-Bit and Scale-offset filters

Manufacturers and Filters – Is it all still a bit fuzzy?

Filters and EQ

Chapter 11 Filters and Tuned Amplifiers Passive LC Filters Inductorless Filters Active-RC Filters

Filters

Filters

Filters and Utilities

Filters and Results

Filters

FILTERS

Syringe Filters and Membrane Filters