Space Partitioning Computer Graphics
What is Space Partitioning? • A division of 3D space into distinct regions • Specifically, a tree data structure is used to divide up the world in a hierarchical manner • The tree leaf nodes usually contain pieces (triangles) of the world • Each leaf holds onto the polygons that are in its sub-piece of the world
Why Perform Space Partitioning? • Speed • For example, an accurate model of a Boeing 777 contains over 500M triangles • You do not want to send 500M triangles down the rendering pipeline • We have seen various mapping techniques to make the model look good with fewer triangles • But we can also cut down on the number of triangles by only sending triangles down the pipeline that are in the field of view of the camera • By dividing the world into pieces we can cull/clip on a large scale rather than per triangle as the rendering pipeline does
Types of Space Partitioning • The two main types we will explore are: • Octrees • Binary Space Partition (BSP) trees • However, in the process we will also cover: • Frustum culling • Bounding Volumes (BVs) • Potentially Visible Sets (PVS) • Scene Graphs
Frustum Culling • In the classic rendering pipeline front-facing triangles are clipped against the Frustum • If all 3 vertices lie outside the same Frustum plane, the triangle is culled (not sent further down the pipeline) • The problem with this technique is that it is triangle-based and doesn’t take spatial coherence into account • All the triangles for a given object lie in about the same position in space
Frustum Culling • Why is spatial coherence important? • If an object is out of the Frustum then so are all its triangles • If we could perform the Frustum check per object then we could cull all the object’s triangles at the same time with a single test • Note that this is a software test, but only 1 is made rather than 10s/100s/1000s of hardware tests
Frustum Culling • How do we perform the test in software? • First we need to find the planes of the Frustum • Recall that there are 6 planes (near, far, left, right, top, bottom)
Frustum Culling • If we can find the plane equation for each then we can test to see which side of the plane a point lies on • Recall the plane equation is: Ax + By + Cz + D = 0 • Where (A, B, C) is the normal of the plane • If we have a point (x, y, z) and we plug it into the left-hand side of this equation, we get: • Positive on the normal side of the plane (in front) • Zero on the plane • Negative on the non-normal side of the plane (behind) • A point must be in front of all planes for it to be in the Frustum
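The point test above can be sketched in a few lines. This is a minimal illustration, not code from the slides: the names `plane_value`, `point_in_frustum`, and `BOX_PLANES` are hypothetical, and a simple box stands in for a real Frustum so the planes are easy to read.

```python
def plane_value(plane, point):
    """Evaluate Ax + By + Cz + D for a plane (A, B, C, D) at a point (x, y, z)."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z + d


def point_in_frustum(planes, point):
    """Keep the point only if it is in front of every plane."""
    return all(plane_value(plane, point) > 0.0 for plane in planes)


# Hypothetical stand-in frustum: the box -1 < x, y, z < 1 with inward normals.
BOX_PLANES = [
    (1.0, 0.0, 0.0, 1.0), (-1.0, 0.0, 0.0, 1.0),
    (0.0, 1.0, 0.0, 1.0), (0.0, -1.0, 0.0, 1.0),
    (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, -1.0, 1.0),
]
```

For example, `point_in_frustum(BOX_PLANES, (0.0, 0.0, 0.0))` accepts the origin, while a point at x = 2 is rejected by the plane whose normal points in the negative x direction.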
Frustum Culling • So, how do you find the Frustum planes? • After setting up the Camera, get the modelview matrix and the projection matrix • glGetFloatv( GL_MODELVIEW_MATRIX, mod ); • glGetFloatv( GL_PROJECTION_MATRIX, proj ); • Note that you need both because the modelview sets up the camera position and orientation whereas the projection sets up the shape of the Frustum • Then multiply the two matrices together • clip = (mod) x (proj), reading the two 16-float arrays row by row (in OpenGL's column-major convention this is the product projection x modelview)
Frustum Culling • The A, B, C, and D values can then be extracted from the “clip” matrix • As an example the right plane is: • A = clip[3] - clip[0] • B = clip[7] - clip[4] • C = clip[11] - clip[8] • D = clip[15] - clip[12] • These plane values are then normalized so +/- distances from the plane are correct • Each component (A-D) is divided by sqrt(A^2 + B^2 + C^2)
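A sketch of the extraction step, assuming the clip matrix is a flat list of 16 floats laid out as glGetFloatv returns it. Only the right plane is given in the slides; the other five planes here follow the commonly used pattern (fourth row plus or minus one of the first three rows) and should be read as an assumption, as should the function names.

```python
import math


def multiply_row_major(a, b):
    """Multiply two 4x4 matrices stored as flat 16-element lists, read row by row."""
    return [
        sum(a[4 * row + k] * b[4 * k + col] for k in range(4))
        for row in range(4)
        for col in range(4)
    ]


def normalize_plane(plane):
    """Divide A, B, C, D by the length of the normal (A, B, C)."""
    a, b, c, d = plane
    length = math.sqrt(a * a + b * b + c * c)
    return (a / length, b / length, c / length, d / length)


def extract_planes(clip):
    """Return the six normalized frustum planes as a dict of (A, B, C, D) tuples."""
    w = [clip[3], clip[7], clip[11], clip[15]]
    names = [("right", "left"), ("top", "bottom"), ("far", "near")]
    planes = {}
    for offset, (minus_name, plus_name) in enumerate(names):
        r = [clip[offset], clip[4 + offset], clip[8 + offset], clip[12 + offset]]
        # e.g. right plane: A = clip[3] - clip[0], B = clip[7] - clip[4], ...
        planes[minus_name] = normalize_plane(tuple(wi - ri for wi, ri in zip(w, r)))
        planes[plus_name] = normalize_plane(tuple(wi + ri for wi, ri in zip(w, r)))
    return planes
```

With an identity clip matrix the Frustum is the cube from -1 to 1 on each axis, so the right plane comes out as (-1, 0, 0, 1), that is, x <= 1.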
Frustum Culling • This sounds like a lot of work • Not coding, but execution time “work” • But keep a couple of things in mind: • The Frustum plane equations only need to be recalculated whenever the camera moves • We are going to test an entire object rather than a single point • In particular, we will test sphere and cube objects
Frustum Culling • Why just spheres and cubes? • They make nice Bounding Volumes (BVs) • A Bounding Volume is a simplified shape that encloses an entire complex object • If the Frustum test against the BV fails then the entire complex object in the BV is not in the Frustum • The advantage is that tests against simple shapes are computationally much easier to perform than tests against complex shapes
Frustum Culling • Sphere testing • A sphere is defined by a center point and a radius • Part of the sphere is still in front of the plane (and thus, it can’t be culled) even if the center point is up to radius units on the back side of the plane • Recall that, once the plane is normalized, Ax + By + Cz + D is the signed distance from the point (x, y, z) to the plane • Thus, we keep the sphere if the sphere center plugged into this equation produces values > -radius for all 6 plane equations • The point test is the same, but with > 0
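The sphere test translates almost directly into code. In this sketch the planes are assumed to be normalized (A, B, C, D) tuples with inward-facing normals; `sphere_in_frustum` and the box-shaped `BOX_PLANES` stand-in are hypothetical names used only for illustration.

```python
def sphere_in_frustum(planes, center, radius):
    """Keep the sphere unless its center is at least `radius` behind some plane."""
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d <= -radius:
            return False  # entirely behind this plane, so it can be culled
    return True


# Hypothetical stand-in frustum: the box from -1 to 1 on each axis.
BOX_PLANES = [
    (1.0, 0.0, 0.0, 1.0), (-1.0, 0.0, 0.0, 1.0),
    (0.0, 1.0, 0.0, 1.0), (0.0, -1.0, 0.0, 1.0),
    (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, -1.0, 1.0),
]
```

A sphere of radius 1 centered at x = 1.5 is kept because part of it still reaches into the box, while the same sphere centered at x = 3 is culled.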
Frustum Culling • Cube test • A cube consists of 8 vertices • Usually passed in as a center location and a size • We could test to see if any of the 8 cube vertices are within the Frustum (i.e. test each vertex against all 6 planes) • However, this leads to problems when the cube is bigger than the Frustum – all 8 corners are outside the Frustum, but the cube contents should be drawn • False negatives – cube culled when it shouldn’t be
Frustum Culling • Instead we reverse the order and test all 8 points against a single plane • If all are to the back side of the plane then the cube is not visible and we can cull it • If any are in front, then we move on to the next plane • If it passes all 6 planes then we keep it • However, this can lead to false positives (see figure) • We let it through and simply let the card cull it on a triangle-by-triangle basis
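A minimal sketch of the plane-by-plane cube test. The cube is described by a center and a half edge length (`half_size`, an assumption; the slides only say "a center location and a size"), and the box-shaped plane list is a hypothetical stand-in for a real Frustum.

```python
from itertools import product


def cube_corners(center, half_size):
    """Return the 8 corners of an axis-aligned cube."""
    cx, cy, cz = center
    return [
        (cx + sx * half_size, cy + sy * half_size, cz + sz * half_size)
        for sx, sy, sz in product((-1.0, 1.0), repeat=3)
    ]


def cube_in_frustum(planes, center, half_size):
    """Cull only if all 8 corners are behind one and the same plane."""
    corners = cube_corners(center, half_size)
    for a, b, c, d in planes:
        if all(a * x + b * y + c * z + d < 0.0 for x, y, z in corners):
            return False
    return True  # may be a false positive; the hardware sorts that out


# Hypothetical stand-in frustum: the box from -1 to 1 on each axis.
BOX_PLANES = [
    (1.0, 0.0, 0.0, 1.0), (-1.0, 0.0, 0.0, 1.0),
    (0.0, 1.0, 0.0, 1.0), (0.0, -1.0, 0.0, 1.0),
    (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, -1.0, 1.0),
]
```

A cube of half size 10 around the origin has every corner outside the box, yet it is kept, which is exactly the case the per-vertex test gets wrong.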
Frustum Culling • So what sort of speed up do you get? • It depends on several factors: • How big the Frustum is (especially the far plane) • How many objects are in the scene and how they are distributed • How complex each particular object is • Check out the gametutorials Frustum culling tutorial
Octrees • Octrees are a space partitioning data structure • We saw Octrees before in the context of modeling • In modeling the leaf nodes were either filled or empty • In space partitioning the leaf nodes contain a list of triangles that are in that sub-section of space • The process: • A single cube is placed around the entire world • If the cube contains too many triangles, then the cube is sub-divided into 8 smaller axis-aligned cubes • Recurse until we hit the stopping condition
Octrees • When do you stop sub-dividing? • When the cube contains fewer than a specified number of triangles • When the sub-division depth has reached a maximum • When the total number of cubes in the octree has reached a maximum number
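The subdivision process and the first two stopping conditions can be sketched as follows. This is an illustration under stated assumptions, not the course's implementation: triangles are tuples of three (x, y, z) vertices, a triangle is stored in every child whose cube overlaps the triangle's bounding box, empty octants are skipped, and the limit on the total number of cubes is left out. All names are hypothetical.

```python
def triangle_overlaps_cube(triangle, center, half_size):
    """Conservative test: does the triangle's bounding box overlap the cube?"""
    for axis in range(3):
        low = min(vertex[axis] for vertex in triangle)
        high = max(vertex[axis] for vertex in triangle)
        if high < center[axis] - half_size or low > center[axis] + half_size:
            return False
    return True


class OctreeNode:
    def __init__(self, center, half_size, triangles,
                 max_triangles=2, max_depth=4, depth=0):
        self.center = center
        self.half_size = half_size
        self.triangles = []
        self.children = []
        # Stopping conditions: few enough triangles, or maximum depth reached.
        if len(triangles) <= max_triangles or depth >= max_depth:
            self.triangles = list(triangles)
            return
        quarter = half_size / 2.0
        for sx in (-1, 1):
            for sy in (-1, 1):
                for sz in (-1, 1):
                    child_center = (center[0] + sx * quarter,
                                    center[1] + sy * quarter,
                                    center[2] + sz * quarter)
                    inside = [t for t in triangles
                              if triangle_overlaps_cube(t, child_center, quarter)]
                    if inside:  # skip empty octants to keep the sketch small
                        self.children.append(OctreeNode(
                            child_center, quarter, inside,
                            max_triangles, max_depth, depth + 1))
```

The depth limit matters with this storage policy: triangles that straddle the center of a cube end up in several children, so the triangle count alone may never fall below the threshold.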
Octrees • What happens to triangles that cross a cube boundary? • Place the object in the smallest box that fully contains it • Not efficient in that a small object can be placed in a large bounding box if it lies directly on the boundary • Implies that not all triangles are stored in the leaf nodes • Split the triangle • Increases the number of primitives in the scene • Place the triangle in both sub-divisions • As the triangles from each sub-division are sent to the renderer we need to mark them so we don’t draw them twice • Makes Octree editing, such as deletion, more difficult
Octrees • So how do Octrees help us? • We can combine Octrees with Frustum culling • The Octree cubes are Bounding Volumes for the triangles in their region of space • We only need to send the triangles contained in the cubes that pass the Frustum test to the rendering pipeline
Octrees • This leads to a process called Hierarchical Frustum Culling • Starting with the root node of the Octree, test the Octree cube against the Frustum • If not in frustum, then prune children • If in frustum, then • Send any contained triangles to renderer • Recurse on each of the children • Demo Octree2 and Octree3 from gametutorials
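A sketch of Hierarchical Frustum Culling over a small hand-built tree. The `Node` class, the `cube_outside` helper, and the use of string labels in place of real triangles are all assumptions made to keep the example short.

```python
from itertools import product


class Node:
    """Hypothetical octree node: a cube, its own triangles, and child nodes."""
    def __init__(self, center, half_size, triangles=(), children=()):
        self.center = center
        self.half_size = half_size
        self.triangles = list(triangles)
        self.children = list(children)


def cube_outside(planes, center, half_size):
    """True if all 8 corners of the cube are behind one of the planes."""
    corners = [
        (center[0] + sx * half_size, center[1] + sy * half_size,
         center[2] + sz * half_size)
        for sx, sy, sz in product((-1.0, 1.0), repeat=3)
    ]
    return any(
        all(a * x + b * y + c * z + d < 0.0 for x, y, z in corners)
        for a, b, c, d in planes
    )


def collect_visible(node, planes, out=None):
    """Gather triangles from every node whose cube passes the frustum test."""
    if out is None:
        out = []
    if cube_outside(planes, node.center, node.half_size):
        return out  # prune: none of the children are visited
    out.extend(node.triangles)
    for child in node.children:
        collect_visible(child, planes, out)
    return out
```

Pruning is what makes this pay off: a single failed test on a large cube skips every triangle stored anywhere beneath it.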
BSP Trees • The idea is much the same as with Octrees • However, the space is repeatedly split into 2 parts (hence the name “binary space partition”) by a given splitting plane • Octrees always use cubes • There are two variants • Axis-Aligned • Polygon-Aligned
Axis-Aligned BSP Trees • Starting with a bounding box around the entire world, split it into 2 parts with an xy, xz, or yz splitting plane • Divide all the triangles between the two half-spaces • Triangles that intersect the splitting plane can be dealt with the same way we did with Octrees • Recursively continue the subdivision process until a stopping criterion is reached • Note that each successive subdivision only splits its own half-space
Axis-Aligned BSP Trees • The interesting part is how to decide what axis and where along that axis to split • Suggestions on how? • Hint: how does all this relate to what you learned in cs255 about merge sort vs. quick sort?
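One possible answer, offered only as a sketch: split along the axis with the largest spread and place the plane at the median, so each half receives about the same number of triangles regardless of how they are distributed in space. The function name and the use of triangle centroids are assumptions, not something the slides prescribe.

```python
def choose_split(centroids):
    """Return (axis, position): the widest axis and the median coordinate on it."""
    extents = []
    for axis in range(3):
        values = [c[axis] for c in centroids]
        extents.append(max(values) - min(values))
    axis = extents.index(max(extents))

    values = sorted(c[axis] for c in centroids)
    mid = len(values) // 2
    if len(values) % 2:
        position = values[mid]
    else:
        position = (values[mid - 1] + values[mid]) / 2.0
    return axis, position
```

With centroids at x = 0, 1, 2 and 100, the spatial midpoint (x = 50) would leave three triangles on one side and one on the other, whereas the median (x = 1.5) splits them two and two.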
Polygon-Aligned BSP Trees • In this version, a polygon is chosen as the divider polygon • This polygon is part of a plane and it is this plane that is used as the splitting plane
Polygon-Aligned BSP Trees • Just as with Axis-Aligned BSP trees the most important choice is the selection of the polygon divider • There are two main issues to consider: • How balanced is the tree • How many extra triangles did we have to make because of triangle splitting • It is a debatable question as to which is better: • A balanced tree with lots of extra triangles • An unbalanced tree with fewer extra triangles • However most people pick the unbalanced tree with fewer splits of triangles
Polygon-Aligned BSP Trees • Generating an optimal BSP tree requires trying every possible splitting combination • The problem is that this is an O(n!) algorithm • Recall that 20! = 2,432,902,008,176,640,000 • The next best option is to pick the best splitter at each level • Divide-and-Conquer with a Greedy selection of the splitter at each level • This is an O(n^2) algorithm • This is the option that most games use • However, it is still pre-computed and saved in a file
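The greedy choice at one level can be sketched as below. To keep it short the example works in 2D, with wall segments in place of polygons and lines in place of planes; that simplification, the scoring formula, and the `split_weight` value are all assumptions. Every candidate is classified against every other segment, which is where the quadratic cost comes from.

```python
EPSILON = 1e-9


def line_through(segment):
    """Return (a, b, c) with a*x + b*y + c = 0 along the segment."""
    (x1, y1), (x2, y2) = segment
    a, b = y2 - y1, x1 - x2
    return (a, b, -(a * x1 + b * y1))


def classify(line, segment):
    """Classify a segment as front, back, spanning, or coplanar."""
    a, b, c = line
    s0 = a * segment[0][0] + b * segment[0][1] + c
    s1 = a * segment[1][0] + b * segment[1][1] + c
    if (s0 > EPSILON and s1 < -EPSILON) or (s0 < -EPSILON and s1 > EPSILON):
        return "spanning"
    if s0 > EPSILON or s1 > EPSILON:
        return "front"
    if s0 < -EPSILON or s1 < -EPSILON:
        return "back"
    return "coplanar"


def choose_splitter(segments, split_weight=8):
    """Pick the segment with the lowest score; splits cost more than imbalance."""
    best, best_score = None, None
    for candidate in segments:
        line = line_through(candidate)
        counts = {"front": 0, "back": 0, "spanning": 0, "coplanar": 0}
        for other in segments:
            if other is not candidate:
                counts[classify(line, other)] += 1
        score = (split_weight * counts["spanning"]
                 + abs(counts["front"] - counts["back"]))
        if best_score is None or score < best_score:
            best, best_score = candidate, score
    return best
```

Raising `split_weight` favors the unbalanced tree with fewer splits; lowering it favors balance.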
BSP Trees • Hierarchical Frustum Culling can be used fairly easily with Axis-Aligned BSP trees • Very similar to how it was used for Octrees since each region is rectangular • However, for Polygon-Aligned BSP trees testing is more complex • The region (leaf or interior) in question is convex, so one could test all the corners of the region the same way the 8 corners are tested for rectangular regions • Or a bounding box of the region can be tested • Quicker test, but more false positives (passed the frustum test when not actually in the frustum)
BSP Trees • However, Polygon-Aligned BSP trees have their advantage in visibility • Allow us to easily perform a visibility ordering • Allow us to align splitting planes along walls which may not be axis aligned
BSP Tree Visibility • In complex indoor scenes, not only do we have lots of geometry that is not in the camera’s frustum but we also have lots of geometry that is hidden by walls • That is, the frustum might extend through many walls • Frustum culling would simply select all nodes it intersected and send them to the renderer where a Z-buffer technique is used to handle the occlusion in the normal way • With Polygon-Aligned BSP Trees we can determine visibility ordering w.r.t. a given camera position • Works in linear time • Works for any given camera position
BSP Tree Visibility • In order to determine the visibility ordering • First, build the BSP tree for the scene • Usually done off-line • For a static scene • Dynamic scene objects are usually handled separately • Second, insert a viewpoint into the scene • Third, perform an in-order traversal of the tree to determine the polygon order • Recall that an in-order traversal means process child, process node, process other child
BSP Tree Visibility • What order should we process the children in? • To obtain a back-to-front ordering • Process the far-side child first • Relative to the location of the camera • To obtain a front-to-back ordering • Process the near-side child first
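The traversal can be sketched as follows. `BSPNode` and `traverse` are hypothetical names, planes are (A, B, C, D) tuples, and string labels stand in for the polygons stored at each node.

```python
class BSPNode:
    """Hypothetical polygon-aligned BSP node."""
    def __init__(self, plane, polygons, front=None, back=None):
        self.plane = plane          # (A, B, C, D) of the splitting plane
        self.polygons = polygons    # polygons lying in the splitting plane
        self.front = front          # subtree on the normal side
        self.back = back            # subtree on the other side


def traverse(node, eye, back_to_front=True, out=None):
    """In-order traversal that orders polygons relative to the eye position."""
    if out is None:
        out = []
    if node is None:
        return out
    a, b, c, d = node.plane
    eye_in_front = a * eye[0] + b * eye[1] + c * eye[2] + d >= 0.0
    near, far = (node.front, node.back) if eye_in_front else (node.back, node.front)
    first, second = (far, near) if back_to_front else (near, far)
    traverse(first, eye, back_to_front, out)
    out.extend(node.polygons)
    traverse(second, eye, back_to_front, out)
    return out
```

Each node is visited once, which is why the ordering takes linear time, and only the eye position is consulted, not the viewing direction.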
BSP Tree Visibility • Back-to-front: C1, B, D, A, C2, E • Front-to-back: E, C2, A, D, B, C1
BSP Tree Visibility • Back-to-front ordering leads to a version of the “painter's algorithm” • This doesn’t suffer from the intersecting problem seen previously in the painter's algorithm • Any polygons that would have intersected have now been split into multiple pieces • However, the painter's algorithm is still really slow • Pixels will be drawn only to be overdrawn by closer triangles • Often 10x overdraw (Michael Abrash of Quake fame) • Overdraw amount varies so framerate is not stable • Overdraw problems increase with scene complexity
BSP Tree Visibility • Front-to-back has no overdraw problems • Each pixel is drawn exactly once • But we need to find a way to not overwrite a pixel once it has been drawn • Could use the normal Z-buffer technique • However, this is overkill since we will never need to replace a value in the Z-buffer with a newer one • Could use a Stencil buffer technique • Stencil filters pixels that have already been drawn • As new pixels are drawn, the stencil buffer is updated • This technique is usually faster than back-to-front
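The write-once idea behind the stencil approach can be simulated in software. This sketch is not OpenGL code: it assumes a single 1D scanline, spans given as (start, end, color) in front-to-back order, and a stencil array that simply records whether a pixel has been drawn.

```python
def draw_front_to_back(spans, width):
    """Rasterize spans in front-to-back order, writing each pixel at most once."""
    color = [None] * width
    stencil = [0] * width
    writes = 0
    for start, end, value in spans:
        for x in range(max(start, 0), min(end, width)):
            if stencil[x] == 0:      # stencil test: pass only if not yet drawn
                color[x] = value
                stencil[x] = 1       # stencil update: mark the pixel as drawn
                writes += 1
    return color, writes
```

For an 8-pixel scanline with a near span over pixels 0 to 3 and a far span over pixels 2 to 7, exactly 8 writes occur, one per covered pixel, and the overlapping pixels keep the near color.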
Stencil Buffer • A Stencil Buffer is another buffer like the color (frame) buffer and the Z-buffer • On most hardware this is an 8-bit buffer • The Stencil Test is a test that occurs immediately before the Z-buffer test • If the Stencil Test fails, the fragment is discarded rather than passed on to the Z-buffer • Fragment won’t show up in the color buffer either
Stencil Buffer • The Stencil Test is a test of the value currently stored in the Stencil Buffer • The Stencil Function controls the type of Stencil Test to perform:
Stencil Buffer • There are 3 possible outcomes of the combined Stencil and Z-buffer tests: • Stencil Test fails • Stencil Test passes, Depth Test fails • Stencil Test passes, Depth Test passes • The Stencil Operation lets you specify what happens to the Stencil Buffer values in each of these cases
Stencil Buffer • The possible Stencil Operations are: • GL_KEEP old value kept • GL_ZERO value replaced by zero • GL_REPLACE value replaced by given reference value (allows setting to a specific value) • GL_INCR value incremented by 1 • GL_DECR value decremented by 1 • GL_INVERT value bitwise inverted
Stencil Buffer • The Stencil Test is only performed if • The Stencil buffer is enabled • The underlying hardware/drivers support it • The stencil buffer is usually used as a mask • Example: creating a non-rectangular windshield for a first-person driving game • Store 1s in the stencil buffer pixels where you wish the windshield to be and then set the stencil test to render if not equal to 0
BSP Tree Visibility • Note that the visibility ordering is not directly related to either: • Viewing direction • Both A and B appear in the visibility ordering even though camera is facing the other direction • Distance of the polygons from the viewpoint • In back-to-front ordering, B is before A in both figures even though A is sometimes farther than B from the camera
BSP Tree Visibility • The previous slides have assumed that the BSP tree will continue dividing until every polygon is used as a splitting plane • However, in indoor scenes we often want to use BSP trees aligned with the walls to provide visibility on a room level and then let the normal Z-buffer algorithm handle objects within the room • That is, a hybrid visibility scheme
Portals • World is divided up into Cells and Portals • Cells are convex regions • Could be BSP tree leaves • Portals are doorways/windows between the Cells
Portals • Basic algorithm: • Start in the cell that contains the camera • Equating Cells to BSP tree leaves helps here • Perform a frustum check against all objects in the current cell to determine what is visible • If any portals are visible (in frustum), add the connecting cell to the list of cells to process • Continue until all visible cells have been rendered
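A sketch of the basic portal algorithm over a small cell graph. The data layout (a dict of cells, each with object and portal lists), the injected `in_frustum` predicate, and the `seen` set that prevents revisiting a cell are assumptions added for illustration.

```python
def render_cells(cells, start_cell, in_frustum):
    """Return the names of visible objects, starting from the camera's cell."""
    visible_objects = []
    to_process = [start_cell]
    seen = {start_cell}          # assumption: never process a cell twice
    while to_process:
        cell = cells[to_process.pop()]
        # Frustum check against all objects in the current cell.
        for name, position in cell["objects"]:
            if in_frustum(position):
                visible_objects.append(name)
        # Visible portals add the cell they lead to.
        for position, target in cell["portals"]:
            if target not in seen and in_frustum(position):
                seen.add(target)
                to_process.append(target)
    return visible_objects


# Hypothetical three-room world; positions are (x, y, z) points.
CELLS = {
    "hall": {"objects": [("lamp", (1, 0, 0))],
             "portals": [((2, 0, 0), "kitchen"), ((-2, 0, 0), "closet")]},
    "kitchen": {"objects": [("table", (3, 0, 0)), ("sink", (3, 0, -9))],
                "portals": [((2, 0, 0), "hall")]},
    "closet": {"objects": [("coat", (-3, 0, 0))],
               "portals": [((-2, 0, 0), "hall")]},
}
```

With a predicate that accepts points with x > 0 and z > -5, a camera in the hall sees the lamp and the table; the closet is never visited because its portal is outside the frustum.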
Portals • An optimization of the previous algorithm is frustum reduction • When a portal is visible, it means the cell it leads to is visible, but only partially, because the camera can only see this cell through the portal • Thus, the frustum can be reduced in the new cell • Cuts down the number of objects in the cell that are sent to the renderer • And cuts down on the number of “visible” portals seen which reduces the amount of recursion
Portals • The reduced frustum is created by planes from the camera through the portal edges • For portals with vertical walls, it turns into a 2D issue of casting rays through points
PVS • Techniques that were tried and rejected for large-scale culling in Quake • Pure Z-buffering • Painter's algorithm • Beam tree • Subdividing raycasting • Edge or Span sorting • Portals • Direct visibility extraction from BSP • Instead Quake went with PVS
PVS • Potentially Visible Sets (PVS) involve pre-computing visibility information from the BSP tree • That is, first divide up the world using a BSP tree, storing all the polygon faces in the leaf nodes • Next, perform off-line visibility checks to determine which leaf nodes are visible from which other leaf nodes, creating a 2D visibility matrix • Visible from anywhere in the source leaf node • The 2D visibility matrix information, called a Potentially Visible Set (PVS), is then used to cull large parts of the BSP tree away when rendering
PVS • The PVS is used by first descending the BSP tree with the camera coordinates to determine in which leaf node the camera resides • Then the PVS is consulted to determine other leaf nodes that may also need to be rendered • This can then be combined with frustum culling to test the bounding boxes of these leaf nodes with the frustum • And finally, the polygons from the surviving leaf nodes are sent to the renderer
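The run-time side of PVS can be sketched as a tree descent followed by a matrix lookup. The classes, the boolean matrix layout, and the function names are hypothetical, and the final frustum test on leaf bounding boxes is left out to keep the example short.

```python
class Leaf:
    """Hypothetical BSP leaf holding its index into the PVS and its polygons."""
    def __init__(self, index, polygons):
        self.index = index
        self.polygons = polygons


class Split:
    """Hypothetical interior BSP node with a splitting plane (A, B, C, D)."""
    def __init__(self, plane, front, back):
        self.plane = plane
        self.front = front
        self.back = back


def find_leaf(node, point):
    """Descend the tree with the camera position until a leaf is reached."""
    while isinstance(node, Split):
        a, b, c, d = node.plane
        in_front = a * point[0] + b * point[1] + c * point[2] + d >= 0.0
        node = node.front if in_front else node.back
    return node


def potentially_visible_polygons(root, leaves, pvs, camera):
    """Collect polygons from every leaf the PVS marks visible from the camera's leaf."""
    here = find_leaf(root, camera)
    result = []
    for index, visible in enumerate(pvs[here.index]):
        if visible:
            result.extend(leaves[index].polygons)
    return result
```

Row i of the matrix lists which leaves may be seen from anywhere inside leaf i, so the per-frame cost is one descent plus one row scan.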