
Data Compression for Hardware-accelerated Volume Rendering

This presentation discusses efficient data compression methods for large-scale, multi-dimensional datasets in volume rendering, focusing on vector quantization and hierarchical encoding. It addresses compression, representation, and rendering challenges, with a GPU-based approach for interactive visualization. The talk covers future directions in video compression and promising technologies. Various algorithms and optimizations like PCA-Split and LBG-Algorithm are explained for high-quality decoding and rendering, with examples showcasing significant speed-ups and fidelity improvements.

jwarwick

Presentation Transcript


  1. Jens Schneider Rüdiger Westermann Technical University Munich Data Compression for Hardware-accelerated Volume Rendering

  2. Motivation Need to deal with data of increasing size: • Large-scale • Multi-dimensional • Multi-parameter Increasing problems: • Compression • Representation • Rendering We will address all three problems!

  3. Talk Outline The Approach – Vector Quantization Contributions Quality and speed • Hierarchical encoding • PCA-Split • Progressive encoding of time-resolved data Multi-dimensional data • Vectors of arbitrary length Rendering from compressed data • GPU-based decoding and rendering • Per-fragment evaluation • Interactive framerates

  4. Talk Outline The Application – Volume Rendering • Large-scale volumetric data sets • Time-varying sequences 1.4 GB / 20 fps 16 MB / 14 fps 0.78 MB / 11 fps 70 MB / 24 fps

  5. Talk Outline The Future – Video Compression ? • Video compression techniques very exciting! • Merge video decoding pipeline and 3D API Promising Technologies • MPEG-II Streams • XvMC API • OpenGL Superbuffers • Commodity graphics hardware video functionality Chip vendors just beginning to realize this!

  6. Vector Quantization Input vector X_n → Encoder (input mapping): i_n = E(X_n) → Decoder (output mapping): X'_n = C(i_n), using a Codebook C with codewords
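The encoder/decoder pair on this slide can be sketched in a few lines. This is only an illustration under assumptions: the function names `encode` and `decode`, the squared Euclidean distance, and the toy codebook are not from the talk.

```python
def encode(vectors, codebook):
    """Input mapping E: index of the nearest codeword for each vector."""
    def dist2(a, b):
        # Squared Euclidean distance (assumed error measure).
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: dist2(v, codebook[k]))
            for v in vectors]


def decode(indices, codebook):
    """Output mapping C: a plain table lookup."""
    return [codebook[i] for i in indices]


# Toy data: three 2D vectors, a codebook with two codewords.
codebook = [(0.0, 0.0), (10.0, 10.0)]
data = [(1.0, 2.0), (9.0, 8.0), (0.5, 0.0)]
indices = encode(data, codebook)
restored = decode(indices, codebook)
```

Only `indices` and `codebook` need to be stored; decoding is a lookup, which is what makes it cheap enough for a fragment shader.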

  7. Vector Quantization LBG-Algorithm • Linde, Buzo and Gray 1980 • Iterative refinement of a previous Codebook • Sensitive to quality of first Codebook • Usually computationally expensive Speed-Up possible (and necessary) • Partial searches • Fast searches • Better initial Codebook (e.g. PCA-Splits) LBG-Algorithm can be fast!
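A minimal sketch of the iterative refinement, assuming the usual two-phase LBG step (assign every vector to its nearest codeword, then move each codeword to the centroid of its cell). The names `lbg_step` and `lbg`, the brute-force full search, and the stopping rule are illustrative choices; none of the speed-ups listed on the slide are included.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lbg_step(vectors, codebook):
    """One refinement step: nearest-codeword assignment, then centroid update."""
    cells = [[] for _ in codebook]
    for v in vectors:
        nearest = min(range(len(codebook)), key=lambda k: dist2(v, codebook[k]))
        cells[nearest].append(v)
    refined = []
    for old, cell in zip(codebook, cells):
        if cell:
            refined.append(tuple(sum(c) / len(cell) for c in zip(*cell)))
        else:
            refined.append(old)  # keep the codeword of an empty cell unchanged
    return refined


def lbg(vectors, codebook, max_steps=20):
    """Repeat refinement until the codebook no longer changes."""
    for _ in range(max_steps):
        refined = lbg_step(vectors, codebook)
        if refined == codebook:
            break
        codebook = refined
    return codebook


data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
result = lbg(data, [(1.0, 1.0), (2.0, 1.0)])
```

The result depends on the starting codebook, which is why the slide stresses the quality of the first codebook.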

  8. Vector Quantization The PCA-Split • Lensch et al. 2001 – BRDF Compression • Covariance analysis to find optimal splitting plane • Cut a cluster of input vectors in two by this plane • Plane is given by the centroid of the current set and the eigenvector belonging to the largest eigenvalue of the Auto-Covariance Matrix (= normal)
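One split can be sketched as follows. The slide does not say how the dominant eigenvector is computed; power iteration is used here purely as a dependency-free stand-in, and the name `pca_split` and its parameters are hypothetical.

```python
import math


def pca_split(vectors, iterations=50):
    """Split a cluster by the plane through its centroid whose normal is the
    dominant eigenvector of the covariance matrix."""
    n, dim = len(vectors), len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    centered = [[v[d] - centroid[d] for d in range(dim)] for v in vectors]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]

    # Power iteration (assumed method) for the dominant eigenvector.
    normal = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * normal[j] for j in range(dim)) for i in range(dim)]
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            break  # degenerate cluster: all vectors identical
        normal = [x / length for x in w]

    # Classify each vector by the side of the plane it lies on.
    left, right = [], []
    for v, c in zip(vectors, centered):
        side = sum(a * b for a, b in zip(c, normal))
        (left if side < 0.0 else right).append(v)
    return left, right


cluster = [(0.0, 0.0), (1.0, 0.1), (9.0, 0.0), (10.0, 0.1)]
left, right = pca_split(cluster)
```

Repeatedly splitting the cluster with the largest residual error until the desired codebook size is reached gives an initial codebook for LBG; that selection rule is an assumption here, as the slide does not state it.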

  9. Vector Quantization LBG as PCA post-processing • Increases fidelity • Leads to stable Voronoi-Regions • Only a few steps are necessary • Great speed-up compared to LBG only! A series of LBG steps, codebook from last slide

  10. Example Full-color confocal microscopy scan, 512² x 32 x RGB Original, 32MB 4D vectors, 2MB 32D vectors, 1MB

  11. Hierarchical Vector Quantization Laplace Decomposition

  12. Hierarchical Vector Quantization 4³ dim. VQ 2³ dim. VQ Direct Copy
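A one-dimensional analogue of the three levels may help. It assumes the decomposition works as a block mean (direct copy), a coarse difference level, and a fine difference level; blocks of 4 samples stand in for 4³ voxels, and pairs stand in for 2³ sub-blocks. The function names are illustrative, and the vector quantization of the two detail levels is left out.

```python
def decompose(block):
    """Split 4 samples into a mean, 2 coarse details and 4 fine details."""
    mean = sum(block) / 4
    pair_means = [(block[0] + block[1]) / 2, (block[2] + block[3]) / 2]
    coarse = [m - mean for m in pair_means]                       # 2 values
    fine = [block[i] - pair_means[i // 2] for i in range(4)]      # 4 values
    return mean, coarse, fine


def reconstruct(mean, coarse, fine):
    """Invert the decomposition by summing all three levels."""
    return [mean + coarse[i // 2] + fine[i] for i in range(4)]


mean, coarse, fine = decompose([1.0, 3.0, 6.0, 10.0])
restored = reconstruct(mean, coarse, fine)
```

In the scheme on the slide, the coarse and fine detail vectors would each be replaced by a codebook index, while the mean is copied directly.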

  13. Hierarchical Vector Quantization Output: • One RGB Index-Volume • Two Codebooks RGB Index-Volume → 3D Texture Codebooks → 2D Textures

  14. Example Visible Human (Male), RGB slice 2048x1216 Compression took 10.0 seconds, PSNR = 34.72dB Original (7.1MB) Compressed (285KB)
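For readers unfamiliar with the fidelity figure, PSNR is commonly computed from the mean squared error as sketched below. The slides do not state which peak value or channel handling was used; a peak of 255 (8-bit data) is an assumption.

```python
import math


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally long sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# An error of 25.5 on every sample is one tenth of the peak, i.e. 20 dB.
value = psnr([100.0, 100.0], [125.5, 74.5])
```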

  15. Timings Reference System: P4 2.8GHz, 1GB memory • VHP Slice, 2048x1216 RGB: 10.0 sec • Engine 256² x 128 CT-Scan: 19.0 sec • Skull 256³ CT-Scan: 50.6 sec • Vortex Sequence, 128³ x 100: 13 (5) min • Shockwave Sequence, 256³ x 89: 29 (13) min

  16. Rendering GPU-based decoding (figure: decoding process in flatland) • Indices stored in 3D RGB-texture (3/64th original size) • Decode index per block → dependent fetch • Decode address per block → 4³ address texture

  17. Rendering Render 3D index and address texture • Nearest neighbor interpolation for both • GL_REPEAT for address texture Per-fragment decoding • Decode detail components and dependent fetch • Add the details to average component (Red channel) • Lookup result in 1D RGB transfer function Problem: Complex fragment shader slows down rendering
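The per-fragment steps can be emulated on the CPU in one dimension. This is a sketch under assumptions, not the shader from the talk: blocks hold 4 samples instead of 4³ voxels, `x % 4` plays the role of the repeating address texture, and the names and the layout of the index entries (mean, coarse index, fine index) are hypothetical.

```python
def decode_sample(x, index_volume, coarse_codebook, fine_codebook, transfer):
    """Decode one sample position and map it through the transfer function."""
    # Nearest-neighbor fetch from the index "texture": one entry per block.
    mean, i_coarse, i_fine = index_volume[x // 4]
    # Position inside the block; wraps around like a GL_REPEAT address texture.
    local = x % 4
    # Dependent fetches into the two codebooks, details added to the average.
    value = mean + coarse_codebook[i_coarse][local // 2] + fine_codebook[i_fine][local]
    # Clamp and look up the result in the 1D transfer function.
    value = max(0, min(len(transfer) - 1, round(value)))
    return transfer[value]


index_volume = [(5, 0, 0), (0, 1, 1)]
coarse_codebook = [[-3, 3], [0, 0]]
fine_codebook = [[-1, 1, -2, 2], [0, 0, 0, 0]]
transfer = [(v, v, v) for v in range(256)]  # identity grey ramp for testing

samples = [decode_sample(x, index_volume, coarse_codebook, fine_codebook, transfer)[0]
           for x in range(8)]
```

Each fragment needs one index fetch, two codebook fetches and one transfer function fetch, which illustrates why the slide calls the fragment shader complex.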

  18. Rendering Solution: Deferred Fragment Processing Avoid decoding in empty regions. "Empty" means: a) α-Transfer function maps 0 → 0. • Check on CPU • Switch between two possible rendering modes b) Average value is 0 (Red channel) • Check in a first, simple fragment program • Fragment's depth value is set accordingly • Second pass: discard (early Z-Test) or render fragment • Full decoding only performed in second pass
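The two-pass idea can be mimicked on the CPU as follows. The sketch assumes that a block counts as empty when its stored average is 0 and the opacity table maps 0 to 0; `render_blocks`, `alpha_table` and `full_decode` are hypothetical names, and the list `needs_decode` stands in for the depth values written by the first pass.

```python
def render_blocks(index_volume, alpha_table, full_decode):
    """Run the expensive decoder only where the cheap first pass says it is needed."""
    # CPU-side check: skipping is only valid if value 0 is fully transparent.
    skip_empty = alpha_table[0] == 0.0
    # Pass 1: cheap test on the average component only.
    needs_decode = [not (skip_empty and entry[0] == 0) for entry in index_volume]
    # Pass 2: full decoding for surviving entries, nothing for the rest.
    return [full_decode(entry) if flag else None
            for entry, flag in zip(index_volume, needs_decode)]


calls = []


def expensive_decode(entry):
    calls.append(entry)  # record how often the full decoder runs
    return entry[0]


volume = [(0, 0, 0), (5, 1, 2), (0, 0, 0), (9, 3, 1)]
result = render_blocks(volume, [0.0, 0.5, 1.0], expensive_decode)
```

On the GPU the rejection happens through the early Z-test rather than a Python list, so the saving comes from fragments that never execute the long shader.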

  19. 256² x 128 Engine CT Scan 19.0 seconds, PSNR = 36.17dB (P4 2.8GHz) Compressed (402KB) – 12 fps Original (8MB) – 19 fps

  20. 256³ Skull CT Scan 50.6 seconds, PSNR = 35.35dB (P4 2.8GHz) Original (16MB) – 14 fps Compressed (780KB) – 11 fps

  21. Time-resolved Sequences Exploit temporal coherences during compression: • Group of Frames (GOF) First frame in a GOF: • PCA-Split followed by LBG-Refinement Other frames: • LBG-refinement of last Index-Volume and Codebook Result: • Great speed-up (factor 2 to 3) • Very large GOFs possible (64+ frames) • Virtually same fidelity as frame-by-frame
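The group-of-frames scheme amounts to a small control loop. In this sketch `build_codebook` (standing in for the PCA-Split) and `refine` (standing in for LBG refinement) are hypothetical callbacks; only the codebook is carried from frame to frame, whereas the slide also reuses the previous index volume.

```python
def encode_sequence(frames, gof_size, build_codebook, refine):
    """Return one codebook per frame, restarting at every group-of-frames boundary."""
    codebooks = []
    codebook = None
    for t, frame in enumerate(frames):
        if t % gof_size == 0:
            # First frame of a GOF: fresh initial codebook, then refinement.
            codebook = refine(frame, build_codebook(frame))
        else:
            # Other frames: refine the previous frame's codebook.
            codebook = refine(frame, codebook)
        codebooks.append(codebook)
    return codebooks


# Stub callbacks that only record how the loop calls them.
built = []


def stub_build(frame):
    built.append(frame)
    return [frame]


def stub_refine(frame, codebook):
    return codebook + [frame]


books = encode_sequence([1, 2, 3, 4, 5], 2, stub_build, stub_refine)
```

Skipping the initial codebook construction on all but the first frame of a group is a plausible source of the reported speed-up, since refinement from a nearby codebook needs only a few steps.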

  22. 128³ x 100 Vortex-Simulation 5 minutes, PSNR = 34.43dB (P4 2.8 GHz) Original (200MB) - 28 fps Compressed (11MB) - 16 fps

  23. 256³ x 89 Shockwave-Sequence 13 minutes, PSNR = 51.36dB (P4 2.8 GHz) Original (1.4GB) - 20 fps Compressed (70MB) - 24 fps

  24. Conclusions • Compression ratios of approx. 20:1 • Interactive rendering possible • Easy random access to each frame • Wide variety of data sets handled • Currently only nearest neighbor interpolation • Mainly limited by performance / instruction count. • Tri-linear interpolation can be done on newer GPUs!
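The "approx. 20:1" figure can be checked against the sizes quoted on the example slides. The sketch below assumes decimal units (1 GB = 1000 MB, 1 MB = 1000 KB), which the slides do not specify, so the results are approximate.

```python
def ratio(original_mb, compressed_mb):
    """Compression ratio as original size divided by compressed size."""
    return original_mb / compressed_mb


# An RGB index (3 bytes) replaces a 4x4x4 block of 64 one-byte samples.
index_fraction = 3 / 64

# (original, compressed) sizes in MB, taken from the example slides.
data_sets = {
    "Shockwave": (1400.0, 70.0),
    "Vortex": (200.0, 11.0),
    "Skull": (16.0, 0.78),
    "Engine": (8.0, 0.402),
    "VHP slice": (7.1, 0.285),
}
ratios = {name: ratio(orig, comp) for name, (orig, comp) in data_sets.items()}
```

Under these assumptions all five ratios fall between roughly 18:1 and 25:1. The index texture alone would give 64/3, about 21:1; the codebooks account for the remaining storage.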

  25. Online Demo Shockwave sequence Vortex sequence

  26. MPEG Stream CPU De-Quantisation Motion Compensation Inverse DCT Video Chip Colorspace Conversion The Future ? Typical MPEG Decoding Pipeline Predictor / Corrector method Further compression opportunities

  27. MPEG Stream De-Quantisation Motion Compensation Inverse DCT Bind as Texture P- / Super-Buffer Blit Colorspace Conversion Fragment Processing The Future ? Merge with OpenGL API XvMC

  28. XvMC Extension to X-Server Already supported on: • GeForce 4 MX / GeForce FX (full) • Other GeForces (no iDCT) Driver-Code • No OpenSource • Other vendors working on implementation Specification: Mark Vojkovich, XFree Project Good Performance !

  29. Other Possibilities Super-Buffer / "Über-Buffer" • OpenGL extension • Basically allows malloc() on video RAM • Beta implementation available Might be used to merge video and OpenGL pipes! • More OS Independence • More hardware Independence • Easier to implement • Only on newer GPUs Some research still necessary!

  30. Questions ? Thank You!
