IEEE SciVis ’13: Uncertainty & Multivariate Analysis
Efficient Local Statistical Analysis via Integral Histograms with Discrete Wavelet Transform
Teng-Yok Lee & Han-Wei Shen
Local distributions (region histograms) are widely used…
• Example 1: Transfer function design
• Example 2: Local vector field analysis
• Example 3: Time-varying data overview
… but computing region histograms for regions of arbitrary size is not always efficient
Integral Histograms: A Solution in Image Processing
• I(x, y): the integral histogram at (x, y), i.e. the histogram of the region bounded by (0, 0) and (x, y)
• Histogram of the region bounded by (x0, y0) and (x1, y1): I(x1, y1) - I(x1, y0) - I(x0, y1) + I(x0, y0)
[Figure: a region with corners (x0, y0), (x1, y0), (x0, y1), (x1, y1); the integral histograms at (x1, y1) and (x0, y0) are added, those at (x1, y0) and (x0, y1) are subtracted]
F. Porikli, "Integral histogram: a fast way to extract histograms in Cartesian spaces," CVPR ’05.
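As a minimal sketch of the four-lookup region query above, the following pure-Python code builds a 2D integral histogram and recovers a region histogram from it. The function names, the half-open region convention, and the extra zero row and column are assumptions made for this illustration; they are not taken from the slides or from Porikli's paper.

```python
def integral_histogram(bin_ids, n_bins):
    """ih[y][x] is the histogram of all pixels with row < y and column < x."""
    h, w = len(bin_ids), len(bin_ids[0])
    ih = [[[0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            cell = ih[y + 1][x + 1]
            for b in range(n_bins):
                # Inclusion-exclusion over the three already-computed neighbours.
                cell[b] = ih[y][x + 1][b] + ih[y + 1][x][b] - ih[y][x][b]
            cell[bin_ids[y][x]] += 1
    return ih


def region_histogram(ih, y0, x0, y1, x1):
    """Histogram of rows y0..y1-1 and columns x0..x1-1, from 4 lookups."""
    corners = zip(ih[y1][x1], ih[y0][x1], ih[y1][x0], ih[y0][x0])
    return [a - b - c + d for a, b, c, d in corners]


# Hypothetical 3x4 image whose pixels are already quantized to bin indices.
bin_ids = [
    [0, 1, 2, 1],
    [2, 2, 0, 1],
    [1, 0, 0, 2],
]
ih = integral_histogram(bin_ids, n_bins=3)
hist = region_histogram(ih, y0=1, x0=1, y1=3, x1=4)
print(hist)  # [3, 1, 2]
```

The query cost is independent of the region size, but `ih` stores one full histogram per grid point.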
Integral Histograms: Properties
• Widely used in image processing
• Easy to implement
• Performance: a region histogram can be computed by combining 4 integral histograms
• Storage is the challenge
  • Each grid point is associated with one histogram, so the data size is multiplied by the number of histogram bins
  • Especially problematic for large images, videos, and 3D volumes
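To make the storage overhead concrete, here is a back-of-the-envelope estimate. The 512³ volume, the 32 bins, and the 4-byte counters are hypothetical numbers chosen for illustration, not figures from the slides.

```python
def integral_histogram_bytes(grid_shape, n_bins, bytes_per_count=4):
    """Size of uncompressed integral histograms: one n_bins histogram per grid point."""
    n_points = 1
    for extent in grid_shape:
        n_points *= extent
    return n_points * n_bins * bytes_per_count


size = integral_histogram_bytes((512, 512, 512), n_bins=32)
print(size / 2**30, "GiB")  # 16.0 GiB
```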
Our Solution: WaveletSAT
• WaveletSAT: efficient integral histogram compression with the Discrete Wavelet Transform (DWT)
• Contributions
  • Efficient region histogram queries with limited storage overhead
  • A single-shot algorithm: no need to build the integral histograms first and then apply the DWT
  • Efficient compression: limited memory footprint and easy to parallelize
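From Integral Histograms to Bin SATs
• Bin SAT: the SAT (summed-area table) formed by the values of histogram bin b across all integral histograms
• A 1D example: the value of bin b in the integral histogram at x is the sum, from the left end up to x, of the binary function that is 1 where the input value falls in bin b and 0 elsewhere
[Figure: the integral histograms of a 1D input, with the values of one histogram bin forming a bin SAT]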
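The 1D example can be sketched in a few lines: each bin SAT is the running sum of that bin's binary indicator function, and the integral histogram at x is read across the bin SATs. The input values and the function name `bin_sats` are made up for this illustration.

```python
from itertools import accumulate


def bin_sats(bin_ids, n_bins):
    """One running sum per bin: sats[b][x] counts points at positions <= x that fall in bin b."""
    return [
        list(accumulate(1 if v == b else 0 for v in bin_ids))
        for b in range(n_bins)
    ]


# Hypothetical 1D input, already quantized to bin indices 0..2.
bin_ids = [0, 2, 2, 1, 2, 0, 1, 2]
sats = bin_sats(bin_ids, n_bins=3)
print(sats[2])  # [0, 1, 2, 2, 3, 3, 3, 4]

# The integral histogram at x = 4 takes one value from each bin SAT.
integral_hist_at_4 = [sat[4] for sat in sats]
print(integral_hist_at_4)  # [1, 1, 3]
```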
Bin SAT: Monotonically Increasing and Smooth
• The image Mandrill: lots of high-frequency details
• The bin SATs of its integral histograms with 32 bins: the bin SATs are smooth
• Bin SATs can be transformed to sparse coefficients via FFT/DCT/DWT
• But …
  • Need to process all bin SATs
  • Require all data points
Bin SATs & Step Functions
• A bin SAT = a sum of step functions
  • Only some of the grid points contribute to a given bin SAT
  • Example: the bin SAT for bin 2 = the sum of the step functions s_x over the points x of the input function whose value is in bin 2
• For DWT, DCT, or FFT
  • No need to wait for all points
  • The transform of a bin SAT = the sum of the transforms of these step functions
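The decomposition and the linearity argument can be checked numerically. The sketch below assumes a Haar wavelet with averaging normalization (the slides do not name a specific wavelet on this slide) and uses a made-up 1D input; `haar_dwt` and `step` are illustrative helper names.

```python
def haar_dwt(signal):
    """Full Haar DWT of a length-2^L signal: [mean, fine details..., coarse details]."""
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details += [(a - b) / 2 for a, b in pairs]   # local differences
        approx = [(a + b) / 2 for a, b in pairs]     # local averages
    return [approx[0]] + details


def step(p, n):
    """Step function s_p: 0 before position p, 1 from p onwards."""
    return [1 if i >= p else 0 for i in range(n)]


bin_ids = [0, 2, 2, 1, 2, 0, 1, 2]           # hypothetical quantized input
n = len(bin_ids)
points_in_bin_2 = [x for x, v in enumerate(bin_ids) if v == 2]
steps = [step(p, n) for p in points_in_bin_2]

# The bin SAT of bin 2 is the pointwise sum of the step functions.
bin_sat = [sum(column) for column in zip(*steps)]
print(bin_sat)  # [0, 1, 2, 2, 3, 3, 3, 4]

# Linearity: transforming the sum equals summing the transforms.
direct = haar_dwt(bin_sat)
summed = [sum(column) for column in zip(*(haar_dwt(s) for s in steps))]
print(direct == summed)
```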
Efficient Wavelet Transform for Step Functions
• With DWT, each step function can be transformed efficiently
• DWT: computes local differences with wavelet functions at different scales
  • Wavelet function: a windowed function that computes a local difference (the wavelet coefficient)
• Only a wavelet that covers the edge has a non-zero coefficient
  • Before the edge: wavelet coefficient = 0
  • Covering the edge: wavelet coefficient ≠ 0
  • After the edge: wavelet coefficient = 0
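A small experiment illustrates the sparsity claim. It assumes a Haar wavelet with averaging normalization and a hypothetical step at position 5 in a 16-point signal; with this setup, each scale has at most one non-zero detail coefficient, located at the window that covers the edge.

```python
def haar_details(signal):
    """Haar detail coefficients per scale; details[j-1][k] covers samples k*2^j .. (k+1)*2^j - 1."""
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return details


n, p = 16, 5
step = [1 if i >= p else 0 for i in range(n)]
nonzero = [
    (scale, k, d)
    for scale, level in enumerate(haar_details(step), start=1)
    for k, d in enumerate(level)
    if d != 0
]
print(nonzero)
# [(1, 2, -0.5), (2, 1, -0.25), (3, 0, -0.375), (4, 0, -0.3125)]
```

At every scale the non-zero coefficient sits at index `p >> scale`, which is the window containing the edge. If the edge coincides with a window boundary (for example `p = 8`), the coefficient at that scale is zero as well.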
WaveletSAT: Algorithm
• The DWT has O(log N) scales, so given a step function
  • Each scale has only 1 non-zero wavelet coefficient
  • The step function is transformed to O(log N) non-zero coefficients
• WaveletSAT algorithm
  • Input: a 1D array of N points
  • Output: wavelet coefficients for all bin SATs
  • For each point
    • Find the corresponding bin B
    • Update the O(log N) non-zero wavelet coefficients for bin B
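The loop above can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: it assumes a Haar wavelet with averaging normalization, an input length that is a power of two, and a sparse dictionary per bin keyed by (scale, index), with (0, 0) holding the overall mean. The closed-form update and the names `wavelet_sat_encode` and `dense_haar` are assumptions of this sketch; `dense_haar` is only a reference used to check the result.

```python
from collections import defaultdict
from itertools import accumulate


def wavelet_sat_encode(bin_ids, n_bins):
    """Accumulate Haar coefficients of every bin SAT without building the SATs."""
    n = len(bin_ids)
    levels = n.bit_length() - 1
    assert 1 << levels == n, "this sketch assumes N is a power of two"
    coefs = [defaultdict(float) for _ in range(n_bins)]
    for p, b in enumerate(bin_ids):            # the point at p adds the step s_p to bin b
        coefs[b][(0, 0)] += (n - p) / n        # mean of the step function
        for j in range(1, levels + 1):
            size = 1 << j                      # window width at scale j
            r = p % size                       # offset of the edge inside its window
            if r:                              # r == 0: edge on a window boundary
                coefs[b][(j, p >> j)] -= min(r, size - r) / size
    return coefs


def dense_haar(signal):
    """Reference: full Haar DWT as {(scale, index): coefficient}, zeros omitted."""
    approx, out, scale = list(signal), {}, 0
    while len(approx) > 1:
        scale += 1
        pairs = list(zip(approx[0::2], approx[1::2]))
        for k, (a, b) in enumerate(pairs):
            if a != b:
                out[(scale, k)] = (a - b) / 2
        approx = [(a + b) / 2 for a, b in pairs]
    if approx[0] != 0:
        out[(0, 0)] = approx[0]
    return out


bin_ids = [0, 2, 2, 1, 2, 0, 1, 2]            # hypothetical quantized input
coefs = wavelet_sat_encode(bin_ids, n_bins=3)
for b in range(3):
    bin_sat = list(accumulate(1 if v == b else 0 for v in bin_ids))
    print(b, dict(coefs[b]) == dense_haar(bin_sat))
```

Each point touches at most log2(N) + 1 entries of a single bin, which is where the O(N log N) total cost comes from.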
WaveletSAT: Benefits
• Each point contributes to only a limited number of bins, so the time complexity for N points is O(N log N)
  • The complexity is independent of the number of bins
• Each step function can be transformed separately and out of order
  • Easy to parallelize
  • No need to pre-compute the integral histograms
• Recap of the algorithm: for each point
  • Find the corresponding bin B
  • Update the O(log N) non-zero wavelet coefficients for bin B
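Because every point only adds its own contribution to the coefficient tables, the points can be processed in any order, or in separate chunks whose tables are summed afterwards. The sketch below illustrates this with a shuffled input split into two chunks. It assumes a Haar wavelet with averaging normalization and a power-of-two length; `encode_points` and `merge` are hypothetical helpers, and the two chunks are encoded sequentially here rather than on real parallel workers.

```python
import random


def encode_points(points, n, n_bins):
    """Encode (position, bin) pairs, in any order, into sparse Haar coefficients per bin."""
    levels = n.bit_length() - 1
    coefs = [dict() for _ in range(n_bins)]
    for p, b in points:
        updates = [((0, 0), (n - p) / n)]
        for j in range(1, levels + 1):
            size = 1 << j
            r = p % size
            if r:
                updates.append(((j, p >> j), -min(r, size - r) / size))
        for key, delta in updates:
            coefs[b][key] = coefs[b].get(key, 0.0) + delta
    return coefs


def merge(tables_a, tables_b):
    """Sum two sets of per-bin coefficient tables."""
    merged = []
    for ca, cb in zip(tables_a, tables_b):
        out = dict(ca)
        for key, value in cb.items():
            out[key] = out.get(key, 0.0) + value
        merged.append(out)
    return merged


bin_ids = [0, 2, 2, 1, 2, 0, 1, 2, 1, 1, 0, 2, 0, 0, 1, 2]   # hypothetical input
n, n_bins = len(bin_ids), 3
points = list(enumerate(bin_ids))
in_order = encode_points(points, n, n_bins)

shuffled = points[:]
random.Random(0).shuffle(shuffled)
chunk_a, chunk_b = shuffled[: n // 2], shuffled[n // 2 :]
merged = merge(encode_points(chunk_a, n, n_bins), encode_points(chunk_b, n, n_bins))
print(merged == in_order)
```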
Query of Integral Histograms
• Reconstruction of bin SATs via the inverse DWT
  • Inverse DWT: a linear combination of the wavelet functions weighted by their wavelet coefficients
  • Bin SAT = (wavelet function 0) × w0 + (wavelet function 1) × w1 + … + (wavelet function 7) × w7
• When querying a single integral histogram at x, only its bin values at x are needed
  • The wavelet functions that do not cover x can be discarded
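A point query can therefore touch one coefficient per scale instead of running a full inverse transform. The sketch below assumes a Haar wavelet with averaging normalization, for which the wavelet covering x contributes +coefficient in its left half and -coefficient in its right half. For brevity the coefficients are produced here by a dense transform of each bin SAT; `haar_coefs` and `query_bin_sat` are illustrative names.

```python
from itertools import accumulate


def haar_coefs(signal):
    """Full Haar DWT of a length-2^L signal as {(scale, index): coef}; (0, 0) is the mean."""
    approx, coefs, scale = list(signal), {}, 0
    while len(approx) > 1:
        scale += 1
        pairs = list(zip(approx[0::2], approx[1::2]))
        for k, (a, b) in enumerate(pairs):
            coefs[(scale, k)] = (a - b) / 2
        approx = [(a + b) / 2 for a, b in pairs]
    coefs[(0, 0)] = approx[0]
    return coefs, scale


def query_bin_sat(coefs, levels, x):
    """Value of the bin SAT at x, using only the wavelets whose window covers x."""
    value = coefs[(0, 0)]
    for j in range(1, levels + 1):
        d = coefs[(j, x >> j)]
        in_left_half = (x >> (j - 1)) & 1 == 0
        value += d if in_left_half else -d
    return value


bin_ids = [0, 2, 2, 1, 2, 0, 1, 2]           # hypothetical quantized input
sats = [list(accumulate(1 if v == b else 0 for v in bin_ids)) for b in range(3)]
encoded = [haar_coefs(sat) for sat in sats]

x = 4
integral_hist = [query_bin_sat(c, levels, x) for c, levels in encoded]
print(integral_hist)  # [1.0, 1.0, 3.0]
```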
Optimization for Region Histogram Query
• Performance issue: the reconstruction is needed for all bins
• Recall: a region histogram is a combination of multiple integral histograms
• These integral histograms can share the same wavelet functions and coefficients
[Figure: wavelet functions 0 to 7, each multiplied by its coefficients w0 … w7 for all bins]
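One possible reading of this optimization, shown here in 1D, is to combine the wavelet weights of the region's corners first and then apply the combined weights to every bin's coefficients once. Wavelets that cover both corners with the same sign cancel and are skipped for all bins. This is a sketch under assumptions (Haar wavelet with averaging normalization, a region given as positions x0+1 .. x1, hypothetical helper names), not necessarily the scheme used by the authors.

```python
from itertools import accumulate


def haar_coefs(signal):
    """Full Haar DWT of a length-2^L signal as {(scale, index): coef}; (0, 0) is the mean."""
    approx, coefs, scale = list(signal), {}, 0
    while len(approx) > 1:
        scale += 1
        pairs = list(zip(approx[0::2], approx[1::2]))
        for k, (a, b) in enumerate(pairs):
            coefs[(scale, k)] = (a - b) / 2
        approx = [(a + b) / 2 for a, b in pairs]
    coefs[(0, 0)] = approx[0]
    return coefs, scale


def basis_weights(x, levels):
    """Weights of the wavelets covering x in the inverse Haar transform."""
    weights = {(0, 0): 1.0}
    for j in range(1, levels + 1):
        weights[(j, x >> j)] = 1.0 if (x >> (j - 1)) & 1 == 0 else -1.0
    return weights


def region_histogram(all_coefs, levels, x0, x1):
    """Histogram of positions x0+1 .. x1, i.e. I(x1) - I(x0), for all bins at once."""
    net = basis_weights(x1, levels)
    for key, w in basis_weights(x0, levels).items():
        net[key] = net.get(key, 0.0) - w
    net = {key: w for key, w in net.items() if w != 0.0}   # shared by every bin
    return [sum(w * coefs[key] for key, w in net.items()) for coefs in all_coefs]


bin_ids = [0, 2, 2, 1, 2, 0, 1, 2]           # hypothetical quantized input
sats = [list(accumulate(1 if v == b else 0 for v in bin_ids)) for b in range(3)]
levels = 3
all_coefs = [haar_coefs(sat)[0] for sat in sats]

hist = region_histogram(all_coefs, levels, x0=1, x1=5)
print(hist)  # [1.0, 1.0, 2.0]
```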
WaveletSAT for High Dimensional Data
• Now a bin SAT is the sum of multiple D-dimensional step functions
• For each step function, sequentially apply the DWT along all dimensions
• A D-dimensional step function has O((log N)^D) non-zero wavelet coefficients
[Figure: a 2D array, after the DWT along the rows, and after the DWT along the columns]
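The 2D case can be checked with a separable transform: a full 1D DWT along every row, then along every column. The sketch assumes a Haar wavelet with averaging normalization, an 8×8 grid, and a hypothetical step with its corner at row 3, column 5. Since a 2D step is the product of two 1D steps, its transform is the outer product of their 1D transforms, so the count of non-zero coefficients is bounded by (log2 N + 1)².

```python
def haar_1d(signal):
    """Full 1D Haar DWT as a flat list: [mean, coarse details ..., fine details]."""
    approx, out = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        out = [(a - b) / 2 for a, b in pairs] + out
        approx = [(a + b) / 2 for a, b in pairs]
    return approx + out


def haar_2d(grid):
    """Separable transform: DWT along every row, then along every column."""
    rows = [haar_1d(row) for row in grid]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


n, py, px = 8, 3, 5
step_y = [1 if y >= py else 0 for y in range(n)]
step_x = [1 if x >= px else 0 for x in range(n)]
step_2d = [[sy * sx for sx in step_x] for sy in step_y]

coefs = haar_2d(step_2d)
nonzero = sum(1 for row in coefs for c in row if c != 0)
print(nonzero, "<=", (3 + 1) ** 2)

# The 2D transform equals the outer product of the two 1D transforms.
ty, tx = haar_1d(step_y), haar_1d(step_x)
print(coefs == [[a * b for b in tx] for a in ty])
```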
Result: Encoding Time & Compression Rates
• Comparison: ZIP compression of bin SATs
• Dataset: a 2D slice of the dataset Ocean
• Encoding time (lower is better)
  • WaveletSAT is more efficient as #bins increases
  • GPUs bring a 4 to 6 times speedup
  • [Plot legend: blue curves: integral histograms with different ZIP levels; red curves: WaveletSAT (◊: CPU; □: GPU)]
• Compression rate (CR, higher is better)
  • WaveletSAT achieves a higher CR as #bins increases
  • The CR of WaveletSAT can be further boosted by ZIP
  • [Plot legend: blue curves: integral histograms with different ZIP levels; red curves: WaveletSAT (□: w/o ZIP; ◊: w/ ZIP)]
Result: Query Time
• Dataset: a 2D slice of the dataset MJO
• If the integral histograms can be fully loaded into memory
  • Querying them is faster than WaveletSAT, but gets slower as #bins increases
  • Not feasible for larger datasets
• The optimization for region histogram queries reduces the performance gap
[Plot legend: blue curves: integral histograms in core; red curves: WaveletSAT; black curves: WaveletSAT with the optimization for region histograms]
Summary
• WaveletSAT: an efficient algorithm for region histogram queries
  • Efficient in terms of encoding, storage, and reconstruction
• Future work
  • Use WaveletSAT to decide the scale of salient features
  • Parallelize WaveletSAT for distributed environments
Acknowledgements
• Dataset sources
  • Ocean: M. Maltrud at Los Alamos National Laboratory
  • MJO: S. Hagos & R. L. Leung at Pacific Northwest National Laboratory
  • 3D volumes: the repository maintained by C. Scheidegger et al.
  • Mandrill: ?
• Source code: https://code.google.com/p/wavelet-sat
• This work was supported in part by NSF grant IIS-1017635, NSF grant IIS-1065025, US Department of Energy OESC0005036, Battelle Contract No. 137365, and Department of Energy SciDAC grant DE-FC02-06ER25779, program manager Lucy Nowell.
Questions?