Numerical-Precision-Optimized Volume Rendering

Squeeze: Numerical-Precision-Optimized Volume Rendering. Ingmar Bitter, Neophytos Neophytou, Klaus Mueller, Arie Kaufman.

Presentation Transcript


  1. Squeeze: Numerical-Precision-Optimized Volume Rendering • Ingmar Bitter, Neophytos Neophytou, Klaus Mueller, Arie Kaufman

  7. Outline • Numerical precision - a rendering resource • Fixed-point arithmetic • Reverse order precision analysis • Compositing, shading, gradients, classification, sampling/splatting, sample/splat location • Results • Conclusions

  10. Numerical Precision: A Resource • Double-precision computation for all – ideal? • slower than all other alternatives • not possible on graphics cards (at least for now) • expensive in custom chip implementations • and most importantly: not needed to create the best possible images • Reasons: • displays are predominantly 8-bit (per channel) • values stay within limited range intervals throughout

  11. Current Status • Stable volume rendering pipeline, on both CPU and GPU [LL94, Lev88, MJC02, Wes90, EKE01, RSEB00] • Interpolation before classification, even for splatting [MMC99] • Caching optimized for volume rendering [Kni00, LCCK02, PSL98] • Precision-limited rendering systems: ATI, NVidia, VolumePro [PHK99], VizardII [MKW02], UltraVis [Kni00] • Completely fixed: the bit precision of the final output image display • 8 bits per RGB color channel on CRTs and LCDs • 8 bits max in the DVI standard • SGI's 12-bit color displays are nearly extinct • Radiologists' requirements are not mass market, but the same analysis applies

  12. OpenGL Arithmetic: $1^2 = 1$? • Representation: integers in [0, 255], with a = b = 255 • Computation: (a × b) >> 8 = 254 → wrong (1 multiply, 1 shift) • Alternatively [Bli95]: tmp = a × b + 128; result = (tmp + (tmp >> 8)) >> 8 = 255 → correct (1 multiply, 2 adds, 2 shifts)
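
The two 8-bit products on this slide can be checked directly. The following is a minimal sketch; the function names are illustrative and not from the talk.

```python
def mul8_naive(a: int, b: int) -> int:
    """Multiply two [0, 255] values with one multiply and one shift."""
    return (a * b) >> 8


def mul8_blinn(a: int, b: int) -> int:
    """Multiply two [0, 255] values with the correction attributed to [Bli95]."""
    tmp = a * b + 128
    return (tmp + (tmp >> 8)) >> 8


print(mul8_naive(255, 255))  # 254: "1 x 1" falls short of 1
print(mul8_blinn(255, 255))  # 255: "1 x 1" stays 1
```

The corrected form costs two extra adds and one extra shift, which is the overhead the next slide avoids by changing the number representation instead.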

  13. OpenGL Arithmetic: $1^2 = 1$? • Representation: fixed-point I.Fb • I.Fb = I integer bits, F fraction bits • 8 bits → 1.7b fixed-point number; then a = b = 1.0 in 1.7b = 128 • Computation: (a × b) >> 7 = 128 → correct (1 multiply, 1 shift) • one fewer bit of resolution, but OK (we will see)
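
A small sketch of the I.Fb convention, assuming truncation after the multiply as the slide's single shift suggests. The helper names `to_fixed` and `fx_mul` are hypothetical.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Convert a real value to an integer with `frac_bits` fraction bits."""
    return round(x * (1 << frac_bits))


def fx_mul(a: int, b: int, frac_bits: int) -> int:
    """Multiply two fixed-point values that share the same fraction width."""
    return (a * b) >> frac_bits


one = to_fixed(1.0, 7)          # 1.0 in 1.7b
print(one, fx_mul(one, one, 7))  # 128 128
```

Because 1.0 is exactly representable as a power of two, one multiply and one shift are enough for the product of two ones to remain one.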

  15. Reverse Order Precision Analysis • Unified ray casting and splatting pipelines • Composite creates the final image • Precision requirements propagate backwards • [Pipeline diagram for ray casting and splatting; stages: Sample Location / Splat Location, Sample / Splat, Classify, Gradient, Shade, Composite]

  17. Compositing - Math • Pre-(alpha)-multiplied colors: $\tilde{C} = \alpha C = (\alpha R, \alpha G, \alpha B)$ • Alpha correction (r samples per unit): $T_{corrected} = (1 - \alpha)^r$ • With back-to-front compositing: • $\tilde{C}_{CompositingBuffer} \leftarrow \tilde{C}_{CompositingBuffer} \times T_{corrected} + \tilde{C}_{front}$ • $T_{CompositingBuffer} \leftarrow T_{CompositingBuffer} \times T_{corrected}$; $\alpha_{CompositingBuffer} = 1 - T_{corrected}$ • the multiplication is performed N times per pixel → a correct solution needs N × F × r bits of precision • [Diagram: $T_{corrected}$ and $\tilde{C}_{front}$ feed the T/C compositing buffer]
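
As a floating-point reference for the update rule on this slide, here is a minimal sketch. It assumes each sample supplies a pre-multiplied color that already matches the corrected opacity, which the slide does not spell out; the function name and the `(color, alpha)` tuple layout are illustrative.

```python
def composite_back_to_front(samples, r=1.0):
    """Composite (premultiplied_color, alpha) samples ordered back to front.

    Applies T_corrected = (1 - alpha)**r per sample, as written on the slide,
    and returns the buffer color and buffer transparency.
    """
    c_buf, t_buf = 0.0, 1.0
    for c_front, alpha in samples:
        t_corr = (1.0 - alpha) ** r
        c_buf = c_buf * t_corr + c_front   # C_buf = C_buf * T_corrected + C_front
        t_buf = t_buf * t_corr             # T_buf = T_buf * T_corrected
    return c_buf, t_buf


print(composite_back_to_front([(0.2, 0.2), (1.0, 1.0)]))
```

An opaque front sample replaces everything behind it, which is a quick sanity check on the ordering.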

  18. Compositing – Precision Theory • 8-bit destination resolution • therefore all partial results can be rounded • drop all bits not contributing to the 8 most significant bits (MSB) • Adding $N = 2^p$ samples • allows 8+p bits to influence the 8 MSB • Conversion from $\alpha_{CompositingBuffer} C$ to $C$ for display (division) • allows 8+p more bits to influence the 8 MSB • Conversion from $\alpha_{corrected} C$ to $C$ for display • allows r times as many bits to influence the 8 MSB • Sufficient resolution is: $r \times 2 \times (8+p)$ bits for C, $r \times (8+p)$ bits for α • 32/16 bits for $C_{CompositingBuffer}$ / $\alpha_{CompositingBuffer}$ for $256^3$ volumes and no super-sampling • 608 bits for $512^2 \times 2048$ volumes and 16 samples per voxel
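
The bit counts quoted on this slide follow from the two formulas. The sketch below assumes the 608-bit case is read as p = 11 (2048 slices along the ray) and r = 16, since that reading reproduces the slide's number; the function name is hypothetical.

```python
def compositing_bits(num_samples: int, r: int = 1):
    """Return (color_bits, alpha_bits) for N = 2**p samples and correction factor r."""
    p = (num_samples - 1).bit_length()  # smallest p with 2**p >= num_samples
    color_bits = r * 2 * (8 + p)
    alpha_bits = r * (8 + p)
    return color_bits, alpha_bits


print(compositing_bits(256))        # 256^3 volume, no super-sampling
print(compositing_bits(2048, 16))   # 512^2 x 2048 volume, 16 samples per voxel
```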

  23. Compositing – Precision Practice • No alpha correction (r = 1): 2 × (8+p) bits • Iso-surface rendering using "old fashioned" OpenGL: • store not αC but C in the frame buffer: (8+p) bits • bright colors: 5+p bits • at most 8 non-zero samples per ray (p = 3): 5+3 = 8 bits → a standard 24-bit RGBA frame buffer is adequate • Fog visualization • what matters is the ability to see objects through volumetric fog (a substance with low opacity) • visual experiments show that 15 fractional bits are sufficient

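  29. Compositing – Conclusion • [Figure: least-significant-bit fog at 8, 10, 12, 14, 15, and 16 bits of precision; $512^3$ dataset, r = 2] • Preferred bit-aware back-to-front compositing equations (with $\tilde{C} = \alpha C$ the pre-multiplied color): • $\tilde{C}^{1.15b} \leftarrow \tilde{C}^{1.15b} \times T^{1.15b}_{sample} + \tilde{C}^{1.15b}_{sample}$ • $T^{1.15b} \leftarrow T^{1.15b} \times T^{1.15b}_{sample}$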
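
The 1.15b equations can be written with plain integers. This is a sketch under stated assumptions: round-to-nearest after each multiply is a choice made here, not something the slide specifies, and the names `ONE`, `fx_mul`, and `composite_fixed` are illustrative.

```python
FRAC_BITS = 15
ONE = 1 << FRAC_BITS  # 1.0 in 1.15b


def fx_mul(a: int, b: int) -> int:
    """1.15b multiply with round-to-nearest (rounding mode is an assumption)."""
    return (a * b + (ONE >> 1)) >> FRAC_BITS


def composite_fixed(samples):
    """Composite (premultiplied_color, transparency) pairs, back to front, in 1.15b."""
    c_buf, t_buf = 0, ONE
    for c_sample, t_sample in samples:
        c_buf = fx_mul(c_buf, t_sample) + c_sample
        t_buf = fx_mul(t_buf, t_sample)
    return c_buf, t_buf


# An opaque white sample behind a half-transparent sample of color 0.25.
print(composite_fixed([(ONE, 0), (ONE // 4, ONE // 2)]))
```

Keeping T rather than α in the buffer means each step needs only multiplies and one add, with no subtraction from one.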

  30. Shading - Math • $C_{Phong} = k_{ambient}\, O_{objectColor}\, I_{lightIntensity} + k_{diffuse}\, O \sum_i \{ I_i (N \cdot L_i) \} + k_{specular} \sum_i \{ I_i (R \cdot L_i)^r \}$ • $k \in [0,1]$, $k_{ambient} + k_{diffuse} + k_{specular} = 1$ • $O_{objectColor}$ (8 bit) and $I_{lightIntensity} \in [0,1]$ • $N \cdot L_i$ and $R \cdot L_i \in [-1,1]$, but $\in [0,1]$ after clamping • $C_{Phong} \in [0,1]$ (possibly clamping $\sum_i$)

  31. Shading - Analysis • $C_{Phong}$ needs to be as precise as 1.15b • Use 16.16b for all multiplications of the form [0,1) × [0,1] • sufficient precision and no overflow

  33. Shading – New Computation • Replace specular exponentiation with recursive multiplies • repeatedly multiply the number with itself • works for all exponents $r = 2^n$ • when $r = 2^6$ (16-bit precision), the max error is < 0.005% • better results than Knittel's parabola approximation • [Plot comparing Knittel's parabola, pow, and $r = 2^n$]
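
Repeated squaring for exponents of the form 2^n can be sketched as follows. The 16 fraction bits match the slide; the round-to-nearest step and the name `pow2n_fixed` are assumptions of this sketch, and it makes no attempt to reproduce the error figure quoted above.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # 1.0 with 16 fraction bits


def pow2n_fixed(x_fx: int, n: int) -> int:
    """Raise a fixed-point value in [0, ONE] to the power 2**n by squaring n times."""
    for _ in range(n):
        x_fx = (x_fx * x_fx + (ONE >> 1)) >> FRAC_BITS
    return x_fx


x = round(0.98 * ONE)
print(pow2n_fixed(x, 6) / ONE, 0.98 ** 64)  # fixed-point vs. floating-point result
```

For r = 64 this takes six multiplies per light source instead of a general `pow` call.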

  34. Shading - Conclusion • Preferred bit-aware Phong shading equation: $C^{16.16b} = k^{16.16b}_{ambient}\, O^{0.8b}_{objectColor}\, I^{16.16b}_{light} + k^{16.16b}_{diffuse}\, O^{0.8b} \sum_i \{ I^{16.16b}_i (N^{16.16b} \cdot L^{16.16b}_i) \} + k^{16.16b}_{specular} \sum_i \{ I^{16.16b}_i (R^{16.16b} \cdot L^{16.16b}_i)^{2^n} \}$

  35. Gradients - Math • $G_x = 0.5\, \mathit{sample}(x+1,y,z) - 0.5\, \mathit{sample}(x-1,y,z)$ • $G_y = 0.5\, \mathit{sample}(x,y+1,z) - 0.5\, \mathit{sample}(x,y-1,z)$ • $G_z = 0.5\, \mathit{sample}(x,y,z+1) - 0.5\, \mathit{sample}(x,y,z-1)$
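
The central differences above translate directly into code. This sketch takes the sampling function as a callable, which is a convenience chosen here rather than anything from the talk.

```python
def central_gradient(sample, x, y, z):
    """Central-difference gradient of a scalar field given as sample(x, y, z)."""
    gx = 0.5 * sample(x + 1, y, z) - 0.5 * sample(x - 1, y, z)
    gy = 0.5 * sample(x, y + 1, z) - 0.5 * sample(x, y - 1, z)
    gz = 0.5 * sample(x, y, z + 1) - 0.5 * sample(x, y, z - 1)
    return gx, gy, gz


def ramp(x, y, z):
    """Linear test field whose gradient is (2, 3, 5) everywhere."""
    return 2 * x + 3 * y + 5 * z


print(central_gradient(ramp, 4, 4, 4))
```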

  36. Gradients - Analysis • $G = G^{1.Fb}$ • Discrete nearest gradient vector neighbors • $\sin\varphi = 1/2^F$, $\sin\varphi \approx \varphi$ → $\varphi \approx 1/2^F$ • Maximum error for specular intensity occurs at large r • for r = 64, $1^{64} \neq 1$; instead $1^{64} = (1 - 1/2^F)^{64}$ • error of 22%, 6.1%, 1.6%, 0.4% for F of 8, 10, 12, 14
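
The error percentages on this slide can be recomputed from the expression $(1 - 1/2^F)^{64}$. A minimal sketch, with a hypothetical function name:

```python
def specular_error(frac_bits: int, r: int = 64) -> float:
    """Relative shortfall of (1 - 2**-F)**r from the ideal value 1."""
    return 1.0 - (1.0 - 2.0 ** -frac_bits) ** r


for f in (8, 10, 12, 14):
    print(f, f"{100 * specular_error(f):.2f}%")
```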

  37. Gradients - Analysis • $512^3$-sized spheres with Phong highlights • 4-, 6-, 8-, 10-, 12-, and 14-bit gradients • Diffuse artifacts for 4 and 6 bits • Specular artifacts up to 10 bits • [Figure: panels labeled 4, 6, 8, 10, 12, 14, followed by panels labeled 10, 12, 14]

  38. Gradients - Conclusion • Thus, 12 bits of dynamic range are needed • Now consider normalization: • reduces I.Fb to 1.Fb • up to I bits will be added to the fractional part • Volume samples often have 12 bits • $G_{x,y,z}$ with 12.12b as the minimum representation • $G_{x,y,z}$ with 16.16b as the preferred representation • leaves room for interpolation bits in normalization

  39. Classification – Prelims and Recaps • Using T instead of α is more efficient in the compositing operation • The largest visual precision/quantization error occurs at high transparencies (low opacities) • need more bits for T than for C, just to be sure • Want the transfer function lookup table to be cache-friendly • power-of-2 RGBA-tuple alignment • Would like to use pre-integrated classification for color and opacity transfer functions [EKE01, MGS02]

  44. Classification - Math • Desired lookup table entries: $R^{1.8b} G^{1.8b} B^{1.8b} T^{1.16b}$ (5.5 bytes) • Common lookup table entries: $R^{0.8b} G^{0.8b} B^{0.8b} \alpha^{0.8b}$ (4 bytes) • Better lookup table entries: $R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$ (spreads low α) • Computed after lookup, $T = 1 - (\sqrt{\alpha})^2$: $R^{0.8b} G^{0.8b} B^{0.8b} T^{1.16b}$ (squaring doubles precision)
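
The square-root encoding can be sketched with integers. This assumes 0.8b means an 8-bit numerator over 256, so a stored 255 decodes to slightly less than full opacity; how the talk handles α = 1 is not stated. The function names are hypothetical.

```python
import math


def encode_sqrt_alpha(alpha: float) -> int:
    """Store sqrt(alpha) as an 8-bit fraction (0.8b), clamped to 255."""
    return min(255, round(math.sqrt(alpha) * 256))


def decode_transparency(sqrt_alpha_fx: int) -> int:
    """Return T = 1 - sqrt(alpha)^2 in 1.16b; squaring a 0.8b value gives 0.16b."""
    return (1 << 16) - sqrt_alpha_fx * sqrt_alpha_fx


entry = encode_sqrt_alpha(0.25)
print(entry, decode_transparency(entry) / (1 << 16))  # 128 0.75
```

The smallest non-zero stored value decodes to an opacity of 1/65536 rather than 1/256, which is the sense in which the encoding spreads low α.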

  45. Classification - Conclusion • Preferred bit-aware lookup table entries: $R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$ • [Figure: foot dataset with least-significant thin-tissue fog; panels labeled $\alpha^{0.8b}$, $\sqrt{\alpha}^{0.8b}$, $\alpha^{0.16b}$]

  46. Sample Interpolation - Math • $\mathit{sample} = \mathit{voxel}_0 \times (1-w) + \mathit{voxel}_1 \times w$ • $\mathit{sample} = w \times (\mathit{voxel}_1 - \mathit{voxel}_0) + \mathit{voxel}_0$ • Requirements: • $G_{x,y,z}$, derived from samples, needs 12 bits of dynamic range • samples need 12-bit values for transfer function lookup • cover both low and high dynamic range neighborhoods • Therefore, $\mathit{sample}^{12.12b}$ is a minimum requirement • integer part comes from voxels, $\mathit{voxel}^{12.0b}$ • fractional part comes from interpolation, $w^{1.12b}$

  47. Sample Interpolation - Conclusion • Preferred bit-aware sample interpolation: $\mathit{sample}^{12.12b} = w^{1.12b} \times (\mathit{voxel}_1^{12.0b} - \mathit{voxel}_0^{12.0b}) + \mathit{voxel}_0^{12.0b}$ • Splats start on voxels and need no interpolation: $\mathit{splat}^{12.0b} = \mathit{voxel}^{12.0b}$
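
The single-multiply interpolation form maps onto integers as shown below. This is a sketch: the voxel is shifted left by 12 so that both terms share the 12.12b format, a detail the slide leaves implicit, and the function name is illustrative.

```python
W_FRAC_BITS = 12  # fraction bits of the weight w in 1.12b


def interpolate_sample(voxel0: int, voxel1: int, w_fx: int) -> int:
    """Interpolate two 12.0b voxels with a 1.12b weight; returns a 12.12b sample."""
    return w_fx * (voxel1 - voxel0) + (voxel0 << W_FRAC_BITS)


half = 1 << (W_FRAC_BITS - 1)  # w = 0.5
sample = interpolate_sample(100, 200, half)
print(sample >> W_FRAC_BITS, sample & ((1 << W_FRAC_BITS) - 1))  # integer and fraction parts
```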

  48. Sample Location - Math • k-th sample location = $\mathit{startPos} + \sum_k V_{inc}$ • Perspective rays need to differ enough to allow 1024 rays across 60 degrees, or about 0.05° between rays • $\sin\varphi = (k \cdot 1/2^F) / k$, $\sin\varphi \approx \varphi$ → $\varphi \approx 1/2^F$ (in radians) • F = 6, 12, 16 → φ ≈ 0.9°, 0.014°, 0.0009° • Also need to address 2048 slices (integer positions) → 11 bits • Thus, need 11.12b overall
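
The small-angle estimate $\varphi \approx 1/2^F$ is easy to evaluate in degrees and to compare with the ray spacing a given field of view requires. A minimal sketch with hypothetical function names:

```python
import math


def angular_resolution_deg(frac_bits: int) -> float:
    """Smallest direction change, in degrees, resolvable with F fraction bits."""
    return math.degrees(2.0 ** -frac_bits)


def required_spacing_deg(num_rays: int, fov_deg: float = 60.0) -> float:
    """Angular spacing between adjacent rays spread evenly over the field of view."""
    return fov_deg / num_rays


for f in (6, 12, 16):
    print(f, angular_resolution_deg(f))
print(required_spacing_deg(1024))
```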

  49. Sample Location - Conclusion • Preferred bit-aware sample location: • perspective projection: $\mathit{sampleLocation}^{11.12b} = \mathit{startPos}^{11.12b} + \sum V_{inc}^{1.12b}$ • parallel projection: $\mathit{sampleLocation}^{11.6b}$ (0.9° OK)
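
Stepping along a ray then reduces to integer additions. The sketch below tracks one coordinate only and assumes positions and increments share 12 fraction bits; the generator name is illustrative.

```python
LOC_FRAC_BITS = 12  # fraction bits of the 11.12b sample location


def sample_locations(start_fx: int, vinc_fx: int, count: int):
    """Yield `count` successive sample locations along one axis in 11.12b."""
    pos = start_fx
    for _ in range(count):
        yield pos
        pos += vinc_fx


half_step = 1 << (LOC_FRAC_BITS - 1)  # increment of 0.5 voxels
for loc in sample_locations(0, half_step, 4):
    print(loc >> LOC_FRAC_BITS, loc & ((1 << LOC_FRAC_BITS) - 1))  # slice index, fraction
```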

  50. Splat Scan Conversion - Math • Splats project onto the image grid → reverse rays • Allow as many as 2048 splat rays across 60 degrees, or about 0.025° between rays • Hence, twice the ray casting precision • one extra fractional bit, F = 13 • Also address 2048 slices (11 bits) • Thus, need 11.13b overall
