310 likes | 465 Views
Accelerating Marching Cubes with Graphics Hardware. Gunnar Johansson, Linköping University Hamish Carr, University College Dublin. Presentation outline. Goal Background Previous work Our approach Results Conclusions, Future work. Goal.
E N D
Accelerating Marching Cubes with Graphics Hardware Gunnar Johansson, Linköping UniversityHamish Carr, University College Dublin
Presentation outline • Goal • Background • Previous work • Our approach • Results • Conclusions, Future work
Goal • Isosurface visualization for studying 3D scalar functions • Marching Cubes is standard algorithm • This work presents GPU acceleration in combination with CPU-based algorithmic acceleration (interval/Kd-trees)
Isosurface visualization • Goal: study a volumetric scalar function, f(x) • Isosurface is a set of points with equal isovalue (h){ x : f(x) = h } Illustration by Stefan Roettger, University of Erlangen
Marching Cubes • Each corner of a cube is classified asabove (black), or below(white), a given isovalue • Vertices of surface is linearly interpolated along the edges • Normals are computed usingcentral differences and interpolated along the edges
Example application • Studying medicaldatasets • MRI, CT scans
Previous workAlgorithmic acceleration • Original marching cubes visits all cells in the dataset • O(N) in time complexity, N = number of cells • However, an isosurface is expected to intersect only a fraction of the cells • Efficient search structures can be used to store maximum and minimum value of each cell • Kd-tree O(√N + k), k = size of isosurface • Interval tree O(log N + k)
Previous workGPU acceleration • Restricted to tetrahedral cells • Marching tetrahedra • Pascucci, 04 • Klein et al, 04 • Reck et al, 04 • Cannot create/delete vertices on GPU • “Worst-case” strategy • Always fed 4 vertices (a quad) to the GPU
Previous workGPU acceleration • CPU tasks • Selects cell and sends data to GPU • GPU tasks • Classifies cell • Interpolates surface vertices • Compute normals (per face) • Bottleneck? • Data transfer CPU – GPU
Previous workGPU acceleration • Parallel to our work • Goetz et al, “Real-time marching cubes on the vertex shader”, 05 • Classifies cell on both CPU and GPU • Do not apply interval/Kd-trees • Only computes face normals
Previous work • Traditional pipeline(with accelerating search structures)
Previous work • GPU accelerated pipeline(by Pascucci / Reck et al / Klein et al)
Our approach • Marching cubes on GPU: Basic challenges • Cannot create vertices on GPU • Too costly to send all possible triangulations (“worst-case” strategy)
Our approach • “Caching cell topology” • Store each case triangulation on the GPU using display lists • Classify cell on CPU and invoke corresponding display list • Minimize CPU – GPU bandwidth bottleneck by storing dataset on GPU
Our approach • “Caching cell topology”
7 6 3 2 4 5 0 1 Our approach • Display list stores corner indices • Use indices for texture lookup • Use values from textureto interpolate verticesand normals (0,3) (0,4) (0,1)
Our approach • Case classification still a CPU bottleneck?
Our approach • Accelerate case classification by“pre-computing cell topology” • Pre-compute possiblecases for each cell • Store all intervalswith correspondingcase in intervalor Kd-tree
Our approach • “Case interval/Kd-tree” • Shifts case classification to pre-computation • Storage requirements increase for noisy dataset (as much as 7 times)
Results • First approach • Store dataset packed in 2D 1-channelfloat texture • Central differencing on GPU for normals • Results disappointing • 1.2-1.6 speedup for Marching Cubes without accelerating structures • Even decrease in speedup when using accelerating structures: GPU bottleneck
Results • Vertex texture support is currently poor • Only 2D 1/4–channel floats • High latency • Central differencing • 12 texture lookups per vertex normal • 14 lookups in total for each vertex
Results • Second approach • Pre-compute normals and store dataset and normals packed in 2D 4-channel float texture • Only need 2 lookups for each vertex • Results improved • Speedup of 3-4 times compared with CPU counterpart • 128x128x128 “Hydrogen atom” dataset • Interval tree + CPU: 27 fps • Interval tree + GPU: 112 fps (4 times speedup)
Conclusions • Accelerating isosurface extractionusing GPU • Cache all possible cell triangulations (cases) • Use CPU for classification • Use GPU for interpolation • Optimize CPU classification by pre-computing all possible cases (case interval/Kd-tree)
Conclusions • Applicable to any interpolant (in this work described using Marching Cubes) • Current hardware impose restrictions • Float textures, high latency for vertex texture lookup
Future work • Move computation to fragment processor • More powerful than vertex processor • Better, more efficient texture support • Ability to download (to CPU) the extracted surface • Optimize memory usage (texture/system) • Apply to higher-order interpolants
Thank you Acknowledgements:Thanks to UCD for funding