1 / 36

ECS 298 Photorealistic Image Synthesis Rendering Acceleration

ECS 298 Photorealistic Image Synthesis Rendering Acceleration. Brian Budge Center for Image Processing and Integrated Computing Computer Science Department University of California, Davis. Overview. Why do we need acceleration? Box/Bounding Box intersection

fran
Download Presentation

ECS 298 Photorealistic Image Synthesis Rendering Acceleration

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ECS 298 Photorealistic Image Synthesis Rendering Acceleration Brian Budge Center for Image Processing and Integrated Computing Computer Science Department University of California, Davis

  2. Overview • Why do we need acceleration? • Box/Bounding Box intersection • Sub-linear acceleration via bounding volume hierarchies • Sub-linear acceleration via space partitioning • Mailboxing • Common code optimizations Hardware Flow Visualization

  3. Acceleration: Why do we need it? • The methods we use take a long time • It can take several hours or even days to compute a realistic image • Brute force is pretty stupid Hardware Flow Visualization

  4. Meet the bounding box, your new best friend • We need a bounding volume • Two come to mind immediately – sphere and box • Why these? Because the intersection tests are fast. Only need a boolean test – do we hit the bounding volume? • Why choose one over the other? • Need to try to bound objects closely • Obviously, spheres bound spheres best and boxes bound boxes best • Worst case: polygon. Boxes much better for polygons • Oh, I should mention, we’re talking about axis aligned boxes Hardware Flow Visualization

  5. Bounding box intersection test • The best current bbox test is one based off of Brian Smitt’s work, by Amy Williams • Makes use of IEEE floating point rules to eliminate redundant conditionals • Basically it’s three slab intersections, one in X, one in Y, and one in Z • If the parametric value t is valid for all three slab intersections, there is an intersection • Especially if you end up using a BVH, you should get her code (it’s online – I’ll provide the URL) Hardware Flow Visualization

  6. Bounding Volume Hierarchies (BVH) • This is probably the easiest acceleration structure to write • Sometimes it even works okay when it’s broken  • Starts being faster than naïve testing after just a handful of objects • Like I said before, sphere’s are okay, but I’d recommend axis aligned bounding boxes • Let’s look at a 2D example Hardware Flow Visualization

  7. Hardware Flow Visualization

  8. Hardware Flow Visualization

  9. Hardware Flow Visualization

  10. Hardware Flow Visualization

  11. Hardware Flow Visualization

  12. Hardware Flow Visualization

  13. Notes… • We can see problems here already • Overlap is a potential killer • Can get better or worse builds of the same data • Alternate x,y,z • Each alteration choose best x,y,z • Try some magical function to optimize the whole thing • Also, can try different levels of termination Hardware Flow Visualization

  14. Spatial subdivision • Lots of data structures for spatial subdivision • Uniform Grid (and Hierarchical Grid) • Octree • BSP Tree • Kd Tree Hardware Flow Visualization

  15. Uniform Grid • Uniform Grids are the best data structure for: • Uniformly spread objects • Small self-contained objects • Example of a good scene: Millions of little spheres randomly distributed in a cube • Example of a bad scene: A room with 4 walls, a floor and a ceiling Hardware Flow Visualization

  16. Building a uniform grid • Easiest way is to make bounding boxes for each object, and place the object in all grid voxels that overlap with the bounding box • Decide the grid layout by • Approximating the number of objects in each direction • Shirley has a reasonable heuristic in his book (also has some code) • Want to try to have few object in each voxel, but also want few empty voxels Hardware Flow Visualization

  17. Traversing a uniform grid • Algorithm called 3D-DDA • Can end up being painful if there are lots of objects in each voxel • Seidel suggests using a hierarchical grid 2 levels deep to eliminate a bunch of this Hardware Flow Visualization

  18. Octrees • I could probably have many of you explain this • Octrees are better than uniform grids for many scenes • Not so much of a dependency on uniformity • Traversal isn’t as fast, but there could be a lot less to traverse (making it faster) Hardware Flow Visualization

  19. Building an Octree • Start with a bounding box of all your objects • If there are enough objects inside, split space into 8 uniformly sized voxels • Recursively descend and decide if each of your new voxels should be split into 8 • Could be split into 8 non-uniform voxels, but this is complicated, and traversal would be much harder • Again, you’d probably use bounding boxes for each object to decide if the object is within a voxel Hardware Flow Visualization

  20. Traversing an Octree • Much more time consuming than uniform grid • Lots of ways to do it • What is most efficient? Not sure. Quite honestly, I haven’t implemented this • See ray tracing news for a HUGE overview of this topic Hardware Flow Visualization

  21. BSP Trees • BSP trees are binary space partitioning trees • They do exactly what they say: split space into two pieces • General BSP trees can have arbitrarily aligned polygons splitting up space • This allows for the best possible linear split of space (could do better if we allowed splitting space by trimmed NURBS or something) Hardware Flow Visualization

  22. BSP Trees (continued) • There are problems with generalized BSP trees that cause most people not to want to use them • It is probably an NP-complete problem to split up space in the best possible case • That’s okay – for other data structures, same thing • But the problem is that even finding good heuristics for good partitioning is difficult • (Note that BSP trees work great if all you have is triangles – good algorithms for splitting triangles for best-case partitioning exist) Hardware Flow Visualization

  23. Hardware Flow Visualization

  24. Kd-trees • Special case of BSP tree – axis aligned • Kd-trees seem to be one of the best data structures for ray tracing • Traversal is extremely fast – testing intersection against the plane is trivial as it is solving a parametric equation for one variable • More truly logarithmic than bounding volume hierarchy, since there is early out Hardware Flow Visualization

  25. Building Kd-trees • This part is the pain in the butt • Getting a good partitioning can be hard • Note that you don’t want your partition to go through many objects, since those need to be duplicated on both sides of the split • Many heuristics out there for building one of these. I’m not sure which is best, but maybe we should have a look at Havran’s work (best efficiency scheme project) Hardware Flow Visualization

  26. Which one is best? • They all have their benefits • I believe that in the general case where things will tend to be clustered, the trees win out • Seems that if you can get a good partitioning, kd-trees are the best because of their easy fast traversal (and did I mention low memory overhead?) • Different scenes will be good with different schemes Hardware Flow Visualization

  27. Which one is best? (continued) • We could implement them all, and try them on all scenes ;-) Yeah right • Implement a few and know when they work well • The cool part is that you can treat an acceleration scheme the same as an object  • This makes it easy to nest acceleration schemes, which is very very good Hardware Flow Visualization

  28. Mailboxing • So when you’re traversing spatial partitioning data structures, you might possibly intersect the same object over and over (if it lies in more than one voxel) • Mailboxing keeps track of this. When you encounter an object, check to see if you’ve already hit it at some point! • Doing this cleverly can reduce running time pretty dramatically in real scenes where overlap is common Hardware Flow Visualization

  29. Mailboxing (continued) • I’m not a huge fan of mailboxing though • Reason being that it is hard to parallelize • Well, not hard really – too big. You’re keeping around to much data for each ray. Just too heavyweight • Possible research topic: Some kind of caching scheme. It would be a lot of toying around with cache sizes and replacement schemes Hardware Flow Visualization

  30. Other optimizations • One that I use is to keep track of the interval where the ray is valid • If you hit an object at t=20, there’s no point searching past that point! • Likewise, you can avoid self intersection by having an interval (epsilon,infinity) • Can use a special memory allocator (preferably one that is already written and tested ) Hardware Flow Visualization

  31. Code optimization • Remember Knuth: Premature optimization is evil! • However, in bottleneck situations, it can be extremely important • Remember that 90% of the time is spent in 10% of the code • Optimize only what you need to for the greatest benefit! Hardware Flow Visualization

  32. Code optimization (continued) • Remember that conditionals are not the same as arithmetic operations • They are expensive if branch prediction is incorrect! • So try to avoid them (can’t really do this that much) • Also place the one that is most likely in the if condition, since that is what Intel chooses without other information Hardware Flow Visualization

  33. Code optimization (continued) • Early exit is usually a good idea • Divides, sqrt, and trig functions are expensive • If you’re dividing by the same value multiple times, do it once and multiply instead (could be hundreds of times faster) • I use that one everywhere • If possible delay expensive computations like sqrt until after a conditional • Example is when doing the quadratic equation Hardware Flow Visualization

  34. References • Ray tracing news, specifically • http://www.acm.org/tog/resources/RTNews/html/rtnv12n2.html#art4 • Hans-Peter Seidel’s notes • http://www.mpi-sb.mpg.de/units/ag4/teaching/uebung/lecture20.pdf • Pete Shirley’s book • Pete Shirley and Russel Morley, Realistic Ray Tracing, 2003, AK Peters • Amy Williams’ paper and code • http://www.cs.utah.edu/~awilliam/box/ • Havran’s Best Efficiency Project • http://sgi.felk.cvut.cz/BES/ • Matt Pharr and Greg Humphries’ book – good info on kd-trees and mailboxing • I have this in printed form that can be borrowed Hardware Flow Visualization

  35. Additional Info • I need you guys to sign up for presentations! They start Tuesday • It’ll be tricky to get everyone signed up for separate lectures • Fill out the sheet after class • Class web page is at http://graphics.cs.ucdavis.edu/~bcbudge/ecs298_2004 • Should be more filled in by tomorrow Hardware Flow Visualization

  36. Scene file format • After class, I’d like to get together and decide on a scene file format • We will need something user editable, and yet not ridiculous for real scenes • Mine currently is something like this: • Camera{ • Lens{ • Pinhole (or thin lens, or fisheye) • Aperture 8mm • Etc… • } • } • Color red {0.8,0.5,0.6,0.4,0.3,0.2,0.1,0.1,0.1,0.1} • Object{ • Type Trimesh • Location “my_mesh.tmf” • Material my_mat • } Hardware Flow Visualization

More Related