GPGPU: Distance Fields Avneesh Sud and Dinesh Manocha Feb 12, 2007
So Far • Overview • Intro to GPGPU using OpenGL • Current Architecture (Cell, G80) • Programming (CUDA, Compilers) • Applications (Vision)
Interesting Reading on Parallel Computing • The Landscape of Parallel Computing Research: A View from Berkeley
This Lecture • Distance Fields and Voronoi Diagrams • Hands on demo • Advanced: Optimization • Discussion: Why fast on a GPU?
This Lecture • Distance Fields and Voronoi Diagrams • Hands on application demo • Parallel algorithm • Example code: 2D • Visual Debugging (imdebug) • Example code: 3D • Advanced: Optimization • Discussion: Why fast on a GPU?
Outline • Distance Fields and Voronoi Diagrams • Hands on application demo • Advanced: Optimization • Discussion: Why fast on a GPU?
Distance Field • Given a set of geometric primitives (sites), a distance field is a scalar field that gives, at each point, the distance to the closest site • Figure: sites and the corresponding 2D distance field
Generalized Voronoi Diagram • Given a collection of sites, a generalized Voronoi diagram is a subdivision of space into cells such that all points in a cell are closer to one site than to any other site • Figure labels: Voronoi site, Voronoi cell; sites and their Voronoi diagram
Voronoi Diagram and Distance Fields • The region where a site's distance function contributes to the final distance field is that site's Voronoi region • Figure: distance field and Voronoi diagram
Distance Functions: 2D • A scalar function $f(\mathbf{x})$ representing the minimum distance from a point $\mathbf{x}$ to a site • Graph: $z = f(x, y)$ • Example (point site at the origin): $f(x, y) = \sqrt{x^2 + y^2}$
Distance Functions: 3D • The distance function from a site to a plane is a quadric • Point site: circular paraboloid • Line site: elliptic cone • Plane site: plane
Why Should We Compute Them? Collision Detection & Proximity Queries Robot Motion Planning Surface Reconstruction Non-Photorealistic Rendering Surface Simplification Mesh Generation Shape Analysis
Why Difficult? • Exact Computation • Compute analytic boundaries • Figure: analytic boundary
Why Difficult? • Exact Computation • Compute analytic boundaries • Boundaries composed of high-degree curves and surfaces and their intersections • Complex and difficult to implement • Robustness and accuracy problems
Approximate Computation • Approximate algorithms • Discretize sites • Discretize space • GPU
Outline • Distance Fields and Voronoi Diagrams • Hands on application demo • Parallel algorithm • Example code: 2D • Visual Debugging (imdebug) • Demo: 3D • Advanced: Optimization • Discussion: Why fast on a GPU?
Brute-force Algorithm • Record the ID of the closest site to each sample point • Figures: coarse point-sampling result; finer point-sampling result
Slight Variation… • Given sites and a uniform sampling • For each site, compute distances to all sample points • Composite through a minimum operator • Record IDs of closest sites
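The per-site variation can be sketched on the CPU as follows. This is a minimal illustration, not the lecture's code: the function name `composite_min`, the grid size, the pixel-centre sampling and the two example point sites are all assumptions made for the example.

```python
import math

def composite_min(sites, width, height):
    """One pass per site over a uniform grid, composited with a minimum operator."""
    dist = [[math.inf] * width for _ in range(height)]
    ids = [[-1] * width for _ in range(height)]
    for site_id, (sx, sy) in enumerate(sites):
        for y in range(height):
            for x in range(width):
                # Sample at the pixel centre (an assumption of this sketch).
                d = math.hypot(x + 0.5 - sx, y + 0.5 - sy)
                if d < dist[y][x]:          # minimum operator
                    dist[y][x] = d
                    ids[y][x] = site_id     # record ID of the closest site so far
    return dist, ids

# Hypothetical example: two point sites on an 8 x 4 grid.
sites = [(1.0, 1.0), (6.0, 2.0)]
dist, ids = composite_min(sites, 8, 4)
for row in ids:
    print("".join(str(i) for i in row))
```

The `ids` grid is a discrete Voronoi diagram and `dist` is the matching distance field. The outer loop runs over sites rather than samples, which is the ordering the GPU version relies on.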
GPU Algorithm… • Given sites and a frame buffer • For each site, compute distances to all pixels • Composite through the depth test • Read back IDs of closest sites
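To make the mapping onto graphics hardware concrete, the sketch below emulates the frame buffer on the CPU: a depth buffer holds the normalised distance, a colour buffer holds the site ID, and a less-than comparison stands in for the depth test. The names `fragment_program` and `render`, the normalisation by the grid diagonal, and the example sites are assumptions of this sketch, not the lecture's OpenGL/Cg source.

```python
import math

def fragment_program(px, py, site):
    """Stand-in for the fragment program: distance from a pixel centre to a point site."""
    return math.hypot(px - site[0], py - site[1])

def render(sites, width, height):
    # Sites are assumed to lie inside the grid, so the diagonal bounds every distance.
    max_dist = math.hypot(width, height)
    depth = [[1.0] * width for _ in range(height)]   # depth buffer cleared to the far value
    color = [[-1] * width for _ in range(height)]    # colour buffer stores site IDs
    for site_id, site in enumerate(sites):           # one "draw call" per site
        for y in range(height):
            for x in range(width):
                z = fragment_program(x + 0.5, y + 0.5, site) / max_dist
                if z < depth[y][x]:                  # emulates a less-than depth test
                    depth[y][x] = z
                    color[y][x] = site_id
    return color, depth

# Hypothetical example: two point sites on an 8 x 4 frame buffer.
color, depth = render([(1.0, 1.0), (6.0, 2.0)], 8, 4)
```

On a GPU the two inner loops are what rasterisation does in parallel for every pixel; only the loop over sites remains in the application.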
GPU Algorithm: 2D • Demo: Point site • Inputs: point coord (uniform parameter), pixel coord
GPU Algorithm: 2D Source • Initialization: set up GL state (depth, render target); set up fragment program • Fragment program • Computation: for each point site, set program parameters and execute fragment program • Display: display results
GPU Algorithm: 2D Source • Show source … • Compile cg source and show assembly
GPU Algorithm: Debugging • Visual debugging with imdebug (by Bill Baxter) • http://www.billbaxter.com/projects/imdebug/index.html • Steps • Modify fragment program • Readback and display buffer contents
GPU Algorithm: Debugging • Example
GPU Algorithm: 2D • Demo: Line site • Inputs: end-point coords (uniform parameters), pixel coord • Careful: the distance equation is for an infinite line
GPU Algorithm: 2D • Line segment: the line equation is valid in the region closer to the interior of the segment • What about the remaining region?
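One common way to handle the remaining region is to clamp the projection parameter, so that points beyond an end of the segment measure their distance to the nearest end-point. The sketch below contrasts that with the infinite-line formula. The function names and test points are illustrative assumptions, and this is not necessarily how the lecture's demo handles end-points.

```python
import math

def dist_to_line(p, a, b):
    """Perpendicular distance from p to the infinite line through a and b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    cross = abx * (p[1] - a[1]) - aby * (p[0] - a[0])
    return abs(cross) / math.hypot(abx, aby)

def dist_to_segment(p, a, b):
    """Distance from p to the segment ab, clamping the projection to [0, 1]."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / (abx * abx + aby * aby)
    t = max(0.0, min(1.0, t))            # outside [0, 1] the closest point is an end-point
    cx, cy = a[0] + t * abx, a[1] + t * aby
    return math.hypot(p[0] - cx, p[1] - cy)

a, b = (0.0, 0.0), (4.0, 0.0)
inside = (2.0, 3.0)     # projects onto the interior of the segment
outside = (7.0, 4.0)    # projects beyond end-point b
print(dist_to_line(inside, a, b), dist_to_segment(inside, a, b))
print(dist_to_line(outside, a, b), dist_to_segment(outside, a, b))
```

For the first point the two functions agree. For the second, the infinite-line formula underestimates the distance, which is the pitfall the slide warns about.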
GPU Algorithm: 3D • Graphics hardware computes one 2D slice • Sweep along the 3rd dimension (Z-axis), computing one slice at a time • Figure: 3D Voronoi diagram
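The slice-by-slice sweep can be sketched as a loop over Z in which each iteration solves an ordinary 2D problem, with the slice height entering the distance as a constant. The names `voronoi_slice` and `sweep`, the voxel-centre sampling and the example sites are assumptions of this sketch.

```python
def voronoi_slice(sites, width, height, z):
    """Closest 3D point site for every pixel of the 2D slice at height z."""
    ids = [[-1] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best = None
            for site_id, (sx, sy, sz) in enumerate(sites):
                # Squared distance is enough for picking the closest site.
                d2 = (x + 0.5 - sx) ** 2 + (y + 0.5 - sy) ** 2 + (z - sz) ** 2
                if best is None or d2 < best:
                    best = d2
                    ids[y][x] = site_id
    return ids

def sweep(sites, width, height, depth):
    """Sweep along the Z-axis, computing one 2D slice at a time."""
    return [voronoi_slice(sites, width, height, k + 0.5) for k in range(depth)]

# Hypothetical example: two point sites stacked along Z in a 2 x 2 x 4 volume.
volume = sweep([(1.0, 1.0, 0.5), (1.0, 1.0, 3.5)], 2, 2, 4)
```

Each call to `voronoi_slice` corresponds to one rendering pass on the GPU, and the list of slices forms the 3D Voronoi diagram.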
Outline • Distance Fields and Voronoi Diagrams • Hands on application demo • Advanced: Optimization • Discussion: Why fast on a GPU?
GPU Optimizations • Where to optimize? • Make fragment program run faster • GPU / Application dependent optimizations • Reduce memory bandwidth • Reduce number of invocations of fragment program • Geometric culling
GPU Optimizations: Recommended Reading • Practical Performance Analysis and Tuning • GPU Programming Guide • GPU Gems 2 • GPU Computation Strategies and Tips (Ian Buck) • GPU Program Optimization (Cliff Woolley)
GPU Optimizations • Where to optimize? • Make fragment program run faster • GPU / Application dependent optimizations • Reduce memory bandwidth • Reduce number of invocations of fragment program • Geometric culling
Optimization: Fragment Program • Reduce the number of instructions! • Do we need dist(x, p) or dist²(x, p), the squared distance? • Advantage of dist²: dist() requires an additional reciprocal sqrt • Show code + demo
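The reason the squared distance is enough for picking the closest site is that the square root is monotonic, so it never changes which site wins the comparison. A small check, with hypothetical sites and sample points chosen for this sketch:

```python
import math

def closest_by_dist(x, sites):
    """Index of the closest site using the true Euclidean distance."""
    return min(range(len(sites)),
               key=lambda i: math.sqrt((x[0] - sites[i][0]) ** 2 + (x[1] - sites[i][1]) ** 2))

def closest_by_dist2(x, sites):
    """Index of the closest site using the squared distance (no square root)."""
    return min(range(len(sites)),
               key=lambda i: (x[0] - sites[i][0]) ** 2 + (x[1] - sites[i][1]) ** 2)

sites = [(1.0, 1.0), (6.0, 2.0), (3.0, 7.0)]
samples = [(x + 0.5, y + 0.5) for y in range(8) for x in range(8)]
same = all(closest_by_dist(s, sites) == closest_by_dist2(s, sites) for s in samples)
print(same)
```

If the distance values themselves are needed afterwards, the square root can be applied once to the final buffer rather than once per site per pixel.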
Optimization: Fragment Program • Do we need to evaluate (x – p) in fragment program?
Optimization: Fragment Program • Do we need to evaluate (x – p) in fragment program? • Rasterization/G80 lectures: GPUs have VERY FAST dedicated hardware for linear interpolation (lerp) • Lerp color, textures, normals across triangle vertices
GPU: Linear Interpolation • Color example
Optimization: Fragment Program • Evaluate (x – p) at the polygon vertices and use the dedicated hardware to lerp it at each pixel! • What about line / triangle sites? • Their distance vectors can be linearly interpolated too! • More details later
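Because (x – p) is linear in x, interpolating it from the corners of a screen-aligned quad reproduces the per-pixel value exactly. The sketch below checks this on the CPU. The helper names, the 8 x 4 quad and the site position are assumptions, and `lerp` here stands in for the interpolation the rasteriser performs in hardware.

```python
def lerp(a, b, t):
    return a + (b - a) * t

def corner_offsets(p, width, height):
    """(x - p) evaluated only at the four corners of the quad [0, width] x [0, height]."""
    corners = [(0.0, 0.0), (width, 0.0), (0.0, height), (width, height)]
    return [(cx - p[0], cy - p[1]) for cx, cy in corners]

def interpolated_offset(offsets, u, v):
    """Bilinear interpolation of the corner offsets at parameters (u, v) in [0, 1]."""
    o00, o10, o01, o11 = offsets
    bottom = (lerp(o00[0], o10[0], u), lerp(o00[1], o10[1], u))
    top = (lerp(o01[0], o11[0], u), lerp(o01[1], o11[1], u))
    return (lerp(bottom[0], top[0], v), lerp(bottom[1], top[1], v))

width, height, p = 8.0, 4.0, (2.25, 1.5)
offsets = corner_offsets(p, width, height)
max_error = 0.0
for y in range(4):
    for x in range(8):
        px, py = x + 0.5, y + 0.5
        ox, oy = interpolated_offset(offsets, px / width, py / height)
        max_error = max(max_error, abs(ox - (px - p[0])), abs(oy - (py - p[1])))
print(max_error)
```

With the subtraction moved to the vertices, the per-pixel work reduces to a dot product of the interpolated vector with itself.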
GPU Optimizations • Where to optimize? • Make fragment program run faster • GPU / Application dependent optimizations • Reduce memory bandwidth • Reduce number of invocations of fragment program • Geometric culling
Optimization: Memory Bandwidth • Reduce number of texture lookups, framebuffer writes • Pack data into fewer channels • Is bandwidth limited?
Optimization: Memory Bandwidth • Reduce number of texture lookups, framebuffer writes • Pack data into fewer channels • How?
Optimization: Memory Bandwidth • Pack data into fewer channels • Use an fp32 render target • 32 bits ≈ 4 billion site IDs • Only one channel (red) is needed to write the site ID, instead of four channels (RGBA)
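A practical caveat when packing an ID into one fp32 channel: if the ID is written as a floating-point value (an assumption here, the slides do not say how the ID is encoded), only integers up to 2^24 survive the round trip exactly, because single precision has a 24-bit significand. The sketch below uses Python's `struct` module to mimic storing a value in a 32-bit float; `through_fp32` is a hypothetical helper name.

```python
import struct

def through_fp32(value):
    """Round-trip a number through IEEE 754 single precision, as an fp32 channel would."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

exact_limit = 2 ** 24
print(through_fp32(exact_limit) == exact_limit)          # representable exactly
print(through_fp32(exact_limit + 1) == exact_limit + 1)  # not representable, gets rounded
```

Reaching the full 32-bit range would need the ID's raw bits to be stored rather than its numeric value, which depends on what the hardware and API allow.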
GPU Optimizations • Where to optimize? • Make fragment program run faster • GPU / Application dependent optimizations • Reduce memory bandwidth • Reduce number of invocations of fragment program • Geometric culling
Linear Factorization • Distance vector field: gives the vector from a point in 3D to the closest point on a site • Figure: line site and its distance vectors
Linear Factorization • Distance functions are non-linear (quadric) • Distance Vectors can be factored into linear terms • Linearly interpolated along each axis
Linear Factorization: 2D • Distance vectors are linearly interpolated • Figure: line segment with labels e and f
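A small numerical check of the idea, under the assumption that e and f are the end-points of the line site and that the points considered project onto the line's interior: the distance vector to the line is an affine function of position, so interpolating distance vectors is exact, while interpolating the scalar distance is not. The function name and sample points are illustrative.

```python
import math

def distance_vector(x, e, f):
    """Vector from x to the closest point on the infinite line through e and f."""
    dx, dy = f[0] - e[0], f[1] - e[1]
    t = ((x[0] - e[0]) * dx + (x[1] - e[1]) * dy) / (dx * dx + dy * dy)
    return (e[0] + t * dx - x[0], e[1] + t * dy - x[1])

e, f = (0.0, 0.0), (4.0, 0.0)
x0, x1 = (1.0, -2.0), (3.0, 4.0)
mid = ((x0[0] + x1[0]) / 2, (x0[1] + x1[1]) / 2)

v0, v1 = distance_vector(x0, e, f), distance_vector(x1, e, f)
v_mid = distance_vector(mid, e, f)
v_lerp = ((v0[0] + v1[0]) / 2, (v0[1] + v1[1]) / 2)   # lerp of the vectors at t = 0.5

d_mid = math.hypot(*v_mid)                             # true distance at the midpoint
d_lerp = (math.hypot(*v0) + math.hypot(*v1)) / 2       # lerp of the scalar distances
print(v_mid, v_lerp)
print(d_mid, d_lerp)
```

The interpolated vector matches the directly computed one, and its length gives the correct distance. Averaging the two scalar distances gives a different, wrong value.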
Linear Factorization: 3D • Distance vectors are bi-linearly interpolated • Figure labels: e, f, a, b, p