3D Rendering & Algorithms __ Sean Reichel & Chester Gregg a.k.a. “The boring stuff happening behind the video games you really want to play right now.”
What is 3D Rendering? __ Definition: • Wiki: “… the 3D computer graphics process of automatically converting 3D wire frame models into 2D images with 3D photorealistic effects or non-realistic rendering on a computer.” • Me: “… the process of drawing models defined in three dimensions with the use of a depth buffer.” • The Important thing that requires two eyes (usually) • Ooo, this looks pretty
What is 3D Rendering? __ The mesh as viewed from the OpenGL standpoint: • A list of vertices containing a 3D position and data associated with that position contained within a governing structure: • Structures: Points, Lines, Triangles, Triangle Strips/Fans, etc. • Vertex Data: Position, Color, Texture Coordinates, Normals, etc. • Ex. A basic color-interpolated triangle: • Vertices: • Vertex(Position(-1,0,0), Color(0,0,1)), • Vertex(Position(1,0,0), Color(0,1,0)), • Vertex(Position(0,1.3,0), Color(1,0,0)) • Structure: • TRIANGLE • (Every 3 vertices defines a triangle)
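The triangle example above could be written down as plain data, roughly as follows. This is only a sketch: the `Vertex` class, `interleave`, and `triangle_count` are hypothetical names invented for illustration, not part of OpenGL, and the "x, y, z, r, g, b" buffer layout is an assumption about how the data might be packed before upload.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Vertex:
    """One mesh vertex: a 3D position plus the data attached to it (here, a color)."""
    position: Tuple[float, float, float]
    color: Tuple[float, float, float]


# The color-interpolated triangle from the slide.
TRIANGLE_VERTICES = [
    Vertex((-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    Vertex((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    Vertex((0.0, 1.3, 0.0), (1.0, 0.0, 0.0)),
]


def interleave(vertices: List[Vertex]) -> List[float]:
    """Flatten vertices into one float list: x, y, z, r, g, b per vertex (assumed layout)."""
    flat: List[float] = []
    for v in vertices:
        flat.extend(v.position)
        flat.extend(v.color)
    return flat


def triangle_count(vertices: List[Vertex]) -> int:
    """With the TRIANGLE structure, every 3 vertices define one triangle."""
    return len(vertices) // 3
```

Here the governing structure is implied by `triangle_count`: the same vertex list read as Points or Lines would produce different primitives from identical data.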
The Modern 3D Rendering Pipeline • 1. Vertex Shader: • Performs operations on individual vertices of the input mesh. • Usually transforms vertices from one coordinate system to another (model translation, rotation, and projection). • Handles most (if not all) of the model positioning on screen. • 2. Tessellation Shader: • A complicated topic: takes in “patches” of vertices and outputs (usually more) vertices. • Usually generates ‘smoother’ geometry from a rough input.
The Modern 3D Rendering Pipeline • 3. Geometry Shader: • Takes in primitives of type ‘N’ and outputs primitives of type ‘M’. • A highly adaptable, but very slow and inefficient, part of the pipeline. • Used to generate data that requires all the points of a given primitive. • Common example: adding fur to surface primitives. • 4. Rasterization: • A non-programmable stage that interpolates vertex data across the primitives they compose. • 5. Fragment Shader: • Performs operations on all the fragments (i.e. ‘meta-pixels’) generated by rasterization. • Used to apply textures and lighting effects on a per-fragment basis.
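The interpolation done by the rasterization stage can be sketched in software with barycentric weights. This is an illustrative CPU version only, assuming a 2D triangle and ignoring perspective correction; the function names and the reuse of the earlier color triangle are choices made for the example, not how any particular GPU implements the stage.

```python
def edge(a, b, p):
    """Signed area term: positive when p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def barycentric(a, b, c, p):
    """Weights (w_a, w_b, w_c) that sum to 1 and describe p relative to triangle abc."""
    area = edge(a, b, c)
    return (edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area)


# Screen-space positions and colors of the color-interpolated triangle.
A, B, C = (-1.0, 0.0), (1.0, 0.0), (0.0, 1.3)
COLOR_A, COLOR_B, COLOR_C = (0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)


def interpolate_color(p):
    """Return the interpolated color at p, or None when p is outside the triangle."""
    w_a, w_b, w_c = barycentric(A, B, C, p)
    if min(w_a, w_b, w_c) < 0.0:
        return None  # no fragment is generated here
    return tuple(
        w_a * ca + w_b * cb + w_c * cc
        for ca, cb, cc in zip(COLOR_A, COLOR_B, COLOR_C)
    )
```

Each point that returns a color corresponds to one fragment handed to the fragment shader; points that return `None` never reach it.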
Simple Rendering: __ Rendering Stages: • Vertex Shader • Tessellation • Geometry Shader • Rasterization (interpolation) • Fragment Shader
Algorithms & Optimization __ Bottleneck Categories of the 3D Rendering Pipeline: • 1. Application Bottlenecks: External Systems • The CPU is not fast enough to run the application. • The system bus can’t transfer data fast enough. • 2. Transform Bottlenecks: Vertex Calculations • GPU can’t keep up with all the vertex calculations. • GPU can’t keep up with Tessellation/Geometry Shaders. • 3. Fill Bottlenecks: Fragment Calculations • GPU can’t keep up with all the fragment calculations. • GPU can’t access Textures or Framebuffer fast enough.
Algorithms & Optimization __ 1.A. Fixing Application Bottlenecks: CPU: • Your application is performing other operations unrelated to rendering: • The solutions to this problem are outside this presentation’s scope. • Example: Detonating a large amount of TNT in Minecraft. (8 hours of lag for 2 seconds of video)
Algorithms & Optimization __ 1.B. Fixing Application Bottlenecks: Bandwidth: • Data is being sent to the GPU faster than the bus can transfer it. • Solution One: Lossy Compression: • A given 3D model can be simplified while retaining a minimum quality. (Very common in video games to simplify ALL models to a given minimum quality.) • Using fewer vertices/primitives, or reducing the size/quality of a texture, decreases the total amount of data. • Ex. Determining how many triangles should make up a sphere.
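The sphere question above can be made concrete with a little arithmetic. The sketch below assumes a latitude/longitude ("UV") sphere, an invented 32-byte vertex layout, and a simplified quality measure (the gap between a circle and the polygon approximating its cross-section); none of these choices come from the slides.

```python
import math

BYTES_PER_VERTEX = 32  # assumed layout: position + normal + texcoord as 8 floats


def uv_sphere_triangles(slices, stacks):
    """Triangles in a latitude/longitude sphere: two pole fans plus quad bands."""
    return 2 * slices * (stacks - 1)


def upload_bytes(slices, stacks):
    """Bytes sent when every triangle carries its own 3 vertices (TRIANGLE structure)."""
    return uv_sphere_triangles(slices, stacks) * 3 * BYTES_PER_VERTEX


def max_chord_error(segments, radius=1.0):
    """Largest gap between a circle and an inscribed regular polygon."""
    return radius * (1.0 - math.cos(math.pi / segments))


def min_segments(tolerance, radius=1.0):
    """Smallest segment count whose cross-section error stays within tolerance."""
    segments = 3
    while max_chord_error(segments, radius) > tolerance:
        segments += 1
    return segments
```

Under these assumptions a 16x8 sphere is 224 triangles and a 64x32 sphere is 3968, roughly 18 times the data, so picking the coarsest mesh that meets the quality tolerance directly reduces bus traffic.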
Algorithms & Optimization __ 1.B. Fixing Application Bottlenecks: Bandwidth: • Solution Two: Lossless Compression Techniques: • Triangle Strips & Higher-Order Model Structures in Meshes: • A model defined with the TRIANGLE structure requires three vertices per primitive drawn. • In a mesh, triangles often share vertices with other triangles. • A TRIANGLE_STRIP model requires fewer redundant vertices by using the two previous vertices in the list to create a new triangle with every new vertex. • The limiting efficiency: one vertex per primitive (a strip of n triangles needs n + 2 vertices). • The limits can be further improved with Element Arrays: • Keeps every UNIQUE vertex in one array, and uses another array to index the data. • Very efficient when each vertex contains large amounts of data.
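The vertex savings described above can be sketched on the CPU as follows. The helper names are hypothetical; the odd/even vertex swap in `strip_to_triangles` follows the usual OpenGL convention for keeping a consistent winding order, and vertices are assumed to be hashable values such as tuples.

```python
def vertices_sent_triangles(triangle_count):
    """TRIANGLE structure: three vertices for every primitive."""
    return 3 * triangle_count


def vertices_sent_strip(triangle_count):
    """TRIANGLE_STRIP: two vertices to start, then one per additional triangle."""
    return triangle_count + 2


def strip_to_triangles(strip):
    """Expand a strip's vertex list into the individual triangles it describes."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        # Swap the first two vertices of odd triangles to keep the winding consistent.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


def build_element_array(vertices):
    """Split a vertex list into (unique vertices, index list)."""
    unique, lookup, indices = [], {}, []
    for v in vertices:
        if v not in lookup:
            lookup[v] = len(unique)
            unique.append(v)
        indices.append(lookup[v])
    return unique, indices
```

For a quad drawn as two triangles, `build_element_array` keeps 4 unique vertices and 6 small integer indices instead of 6 full vertices, which is where the saving comes from when each vertex carries a lot of data.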
Algorithms & Optimization __ 1.B. Fixing Application Bottlenecks: Bandwidth: • Solution Three: Tessellation: • Instead of sending large amounts of data to the GPU, send a small amount of data and let the GPU generate additional vertices. • Works well to smooth surfaces and generate models defined by simple equations with a minimal set of data. • Ex. The surface of a sphere is represented by a single equation such as $x^2 + y^2 + z^2 = r^2$. Through Tessellation, the sphere on the left can be passed to the GPU and be tessellated into the sphere on the right.
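The idea of sending a rough shape and generating the rest can be imitated on the CPU. The sketch below is not a tessellation shader; it assumes a unit sphere, starts from an octahedron (an arbitrary choice of coarse input), and splits each triangle into four while pushing the new vertices out onto the sphere.

```python
import math


def normalize(p):
    """Project a point onto the unit sphere."""
    length = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
    return (p[0] / length, p[1] / length, p[2] / length)


def midpoint_on_sphere(a, b):
    """Midpoint of edge ab, pushed out to the sphere's surface."""
    return normalize(((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2))


def subdivide(triangles):
    """Replace every triangle with four smaller ones."""
    out = []
    for a, b, c in triangles:
        ab = midpoint_on_sphere(a, b)
        bc = midpoint_on_sphere(b, c)
        ca = midpoint_on_sphere(c, a)
        out += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return out


def octahedron():
    """The coarse input: 8 triangles, wound counter-clockwise seen from outside."""
    faces = []
    for sx in (1.0, -1.0):
        for sy in (1.0, -1.0):
            for sz in (1.0, -1.0):
                x, y, z = (sx, 0.0, 0.0), (0.0, sy, 0.0), (0.0, 0.0, sz)
                faces.append((x, y, z) if sx * sy * sz > 0 else (x, z, y))
    return faces


def tessellate_sphere(levels):
    """Apply `levels` rounds of subdivision to the octahedron."""
    triangles = octahedron()
    for _ in range(levels):
        triangles = subdivide(triangles)
    return triangles
```

Only the 8 input triangles would need to cross the bus; three levels of subdivision already yield 512 triangles, all generated on the receiving side.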
Algorithms & Optimization __ 2. Fixing Transform Bottlenecks: Vertex Limit: • The GPU may be performing more vertex calculations than it can handle in the given time. • Usually caused by excessive per-vertex calculations or generation of large numbers of vertices with the Tessellation and Geometry Shaders. • Solution One: Lossy Compression: • Simpler models have fewer vertices, reducing the vertex load. • Solution Two: Element Array Model Structures: • Element Arrays keep a list of all UNIQUE vertices, and index them with another array of unsigned integers. • Element Arrays process each UNIQUE vertex only ONCE, instead of each time the vertex is passed by the mesh.
Algorithms & Optimization __ 2. Fixing Transform Bottlenecks: Vertex Limit: • Solution Three: Occlusion Query: • Not every object sent to the GPU will actually appear in the final scene; some are hidden behind other objects relative to the viewer. • An Occlusion Query Test consists of testing whether an object’s bounding box will pass the depth test (i.e. appear in the scene). • When it fails, the GPU can choose to ignore potentially complex geometry because it won’t appear in the scene.
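A software imitation of the bounding-box test might look like the sketch below. It assumes a depth buffer where smaller values are closer to the viewer and a bounding box already projected to a screen-space rectangle `(x0, y0, x1, y1, nearest_depth)`; all function names are hypothetical, and a real occlusion query is issued through the graphics API rather than looped over on the CPU.

```python
def make_depth_buffer(width, height, far=1.0):
    """A depth buffer cleared to the far plane."""
    return [[far] * width for _ in range(height)]


def draw_rect(depth_buffer, x0, y0, x1, y1, depth):
    """Write a screen-aligned rectangle, keeping the nearest depth per pixel."""
    for y in range(y0, y1):
        for x in range(x0, x1):
            if depth < depth_buffer[y][x]:
                depth_buffer[y][x] = depth


def samples_passed(depth_buffer, x0, y0, x1, y1, nearest_depth):
    """Count pixels where the bounding box's nearest face would pass the depth test."""
    return sum(
        1
        for y in range(y0, y1)
        for x in range(x0, x1)
        if nearest_depth < depth_buffer[y][x]
    )


def should_draw(depth_buffer, box):
    """Skip the object's real geometry when no part of its bounding box is visible."""
    return samples_passed(depth_buffer, *box) > 0
```

The test is conservative: a visible bounding box does not guarantee the object inside is visible, but a fully hidden box guarantees the object can be skipped.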
Algorithms & Optimization __ 3.A Fixing Fill Bottlenecks: Fragment Limit: • The GPU is being required to perform more fragment operations than it can handle. • Solution One: Move Calculations to Vertex Shader: • Rather than calculating the values for every fragment, it is usually faster to calculate the values at each vertex and interpolate the result across the primitives. • The quality will decrease, but it is again a question of minimum quality tolerance.
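The trade-off above can be illustrated along a single span between two vertices. In this sketch `shade` is a made-up stand-in for an expensive lighting calculation, and the span is one-dimensional for brevity; `fragments` is assumed to be at least 2.

```python
import math


def shade(x):
    """Hypothetical stand-in for an expensive per-point lighting calculation."""
    return math.sin(x) ** 2


def per_fragment(x0, x1, fragments):
    """Fragment-shader style: call shade() once for every fragment on the span."""
    return [shade(x0 + (x1 - x0) * i / (fragments - 1)) for i in range(fragments)]


def per_vertex(x0, x1, fragments):
    """Vertex-shader style: call shade() at the two vertices, then interpolate."""
    s0, s1 = shade(x0), shade(x1)
    return [s0 + (s1 - s0) * i / (fragments - 1) for i in range(fragments)]


def max_error(x0, x1, fragments):
    """Worst difference between the exact and the interpolated result."""
    exact = per_fragment(x0, x1, fragments)
    approx = per_vertex(x0, x1, fragments)
    return max(abs(e - a) for e, a in zip(exact, approx))
```

The per-vertex version calls `shade` twice no matter how many fragments the span covers, while `max_error` gives a number to compare against the minimum quality tolerance.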
Algorithms & Optimization __ 3.A Fixing Fill Bottlenecks: Fragment Limit: • Solution Two: Occlusion Testing: • The occlusion test usually stops at rasterization, so the fragment shader is never called for hidden geometry. • This makes it an essentially ‘free’ fix for this bottleneck.
Development of the GPU __ • The Sad Truth: • 3D Rendering is a computationally expensive process. • Rendering a rectangle that fills your computer screen in HD (1280x720) requires roughly a million assignment operations (1280 × 720 = 921,600 pixels). (i.e. This is your desktop background) • Current demand for high quality video games requires that MILLIONS of primitives be drawn. • Our CPU does not have the throughput required to handle this.
Development of the GPU __ • 3D Rendering has two important properties: • Most of the calculations do not need to be precise. • All the calculations within a given stage of the pipeline are independent. • Which leads to the following conclusions: • A simpler processor can perform the calculations. • The problem is intrinsically parallel. • This leads to the GPU as we know it: • A collection of many simple processors designed with the sole task of performing 3D rendering calculations in parallel with each other.
Current Trends of GPU Development • a.k.a. “The development of BRUTE FORCE” • Hardware over Software: incorporate faster, hardware-managed elements into the render pipeline (Tessellation has recently been given its own Tessellator hardware on the GPU). • However, the biggest trend is to simply pack as much power into the GPU as possible: • Example: Console GPU GFLOPS: PlayStation and Xbox • *Apparently, the Xbox 360 could render the Batmobile from the next Batman Arkham Knight game for Xbox One • But that is ALL the Xbox 360 would be able to do.