Status – Week 260 Victor Moya
Summary • shSim. • GPU design. • Future Work. • Rumors and News. • Imagine.
shSim • Currently working: • Command Processor: reads a text-based trace file (programs, parameters, vertices, commands to the rasterizer). • Shader: simulates an N-thread multithreaded, VS1-capable ‘vertex’ shader with variable latency support. • Rasterizer: OpenGL ‘emulator’; accepts resolution and clip plane changes, receives ‘shaded’ vertices from the shader (only 2 QuadFloats: vertex position + color), and displays the triangles in a GL window.
shSim • Tests: • Single shader with 2/4 threads (plus another 2/4 input buffers). • Fixed latency of 3 cycles. Shader to Rasterizer latency of 4. CommandProcessor to Rasterizer latency of 6. • Simple coordinate change traces (shader.input, shader.input.2). • Ripple vertex shader example from the DX8 & DX9 SDK (ripple.input): • Around 300 triangles (1100 vertices). • Color is calculated from vertex position.
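The fixed latencies used in these tests could be modelled with a simple delay queue. The following is a minimal sketch, not shSim's actual code: the names `FixedLatencyLink` and `run_pipeline` are hypothetical, the latencies are assumed to be in cycles, and one vertex is assumed to be issued per cycle.

```python
from collections import deque


class FixedLatencyLink:
    """Hypothetical helper: an item becomes visible `latency` cycles after it is sent."""

    def __init__(self, latency):
        self.latency = latency
        self._in_flight = deque()  # (ready_cycle, item), kept in send order

    def send(self, cycle, item):
        self._in_flight.append((cycle + self.latency, item))

    def receive(self, cycle):
        """Return every item whose latency has elapsed by `cycle`."""
        ready = []
        while self._in_flight and self._in_flight[0][0] <= cycle:
            ready.append(self._in_flight.popleft()[1])
        return ready


def run_pipeline(vertices, shader_latency=3, link_latency=4):
    """Push vertices through shader -> rasterizer and record arrival cycles.

    Assumption: one vertex enters the shader per cycle.
    """
    shader = FixedLatencyLink(shader_latency)
    to_rasterizer = FixedLatencyLink(link_latency)
    pending = deque(vertices)
    arrivals = []
    cycle = 0
    while len(arrivals) < len(vertices):
        if pending:
            shader.send(cycle, pending.popleft())
        for vertex in shader.receive(cycle):
            to_rasterizer.send(cycle, vertex)
        for vertex in to_rasterizer.receive(cycle):
            arrivals.append((cycle, vertex))
        cycle += 1
    return arrivals
```

Under these assumptions the first vertex reaches the rasterizer at cycle 3 + 4 = 7, and later vertices follow one cycle apart.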
shSim • Ripple.vsh.
shSim • Screenshots from frames rendered by shSim:
GPU Architecture • Based on current GPUs: • NV30 • R300 • Based on other graphics processors: • PS3 • Imagine
GPU Architecture • Based on an API: • DX8 • DX9 • DX10 • OpenGL 1.4 and extensions. • OpenGL 2.0 • Based on an architecture model: • Vector • Scalar • Multithreaded
GPU Specification • Shader Model: • Language: • DX9: • VS2.0/PS2.0. • VS3.0/PS3.0. • OpenGL: • NV_vertex_program2/NV_fragment_program. • ARB_vertex_program/ARB_fragment_program. • Our own language.
GPU Specification • Shader Architecture: • Architectural model: • Scalar. • SIMD. • Multithreaded. • Vector. • Out-of-order.
GPU Specification • Configuration: • Integer Unit: • Number. • Precision. • SIMD or scalar? • Floating Point Unit: • Number. • Precision. • SIMD or scalar?
GPU Specification • Memory Unit: • Number. • Texture modes. • Filtering modes. • Register Banks: • Number. • Ports. • Size. • Scalar or SIMD?
Future Work • Shader: • Add branch/call/ret instructions. • Add texture instructions (Pixel Shader). • Command Processor: • Define a trace specification: binary, gzipped? • Define an interface with OpenGL (Mesa?) or DX8/DX9 (driver?). • Primitive Assembly: • Implement vertex cache and primitive assembly (only triangles?). • Implement culling and clipping?
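One possible shape for the vertex cache and primitive assembly step, triangles only, is sketched below. Everything here is an assumption made for illustration: the names `assemble_triangles` and `signed_area`, the FIFO replacement policy, the cache size, and the convention that counter-clockwise triangles are front-facing.

```python
from collections import OrderedDict


def signed_area(a, b, c):
    """Twice the signed area of a 2D triangle; positive when a, b, c are counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def assemble_triangles(indices, shade, cache_size=16):
    """Group an index list into triangles, reusing shaded vertices from a FIFO cache.

    `shade(index)` stands in for the vertex shader and returns an (x, y) position.
    Returns the surviving triangles and the number of shader invocations.
    """
    cache = OrderedDict()  # index -> shaded vertex, oldest entry first
    shaded_count = 0
    triangles = []
    current = []
    for index in indices:
        if index not in cache:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the oldest entry (FIFO)
            cache[index] = shade(index)
            shaded_count += 1
        current.append(cache[index])
        if len(current) == 3:
            # Back-face culling: drop clockwise and zero-area triangles.
            if signed_area(*current) > 0:
                triangles.append(tuple(current))
            current = []
    return triangles, shaded_count
```

With an indexed quad drawn as two triangles, the cache lets four shader invocations cover six indices, which is the saving a vertex cache is meant to provide.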
Future Work • Deferred rendering? • Transformed geometry must be stored in video memory. • Geometry must be sorted: • Tiles. • Front to back. • Rasterization: • Triangle Setup and Fragment Generation. • Any suitable method: Olano & Greer, DDA? • MSAA support?
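The sorting step for deferred rendering (into tiles, then front to back) might look like the sketch below. It is only an illustration under stated assumptions: triangles are tuples of (x, y, z) screen-space vertices, smaller z means closer to the viewer, binning uses the bounding box rather than exact coverage, and the function name `bin_triangles` and the tile size are hypothetical.

```python
import math


def bin_triangles(triangles, width, height, tile_size=16):
    """Map each tile (tx, ty) to the triangles whose bounding box overlaps it.

    Within each tile the triangles are ordered front to back by nearest vertex.
    """
    bins = {}
    for tri in triangles:
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        # Bounding box in pixels, clamped to the screen.
        x0 = max(math.floor(min(xs)), 0)
        x1 = min(math.floor(max(xs)), width - 1)
        y0 = max(math.floor(min(ys)), 0)
        y1 = min(math.floor(max(ys)), height - 1)
        if x0 > x1 or y0 > y1:
            continue  # entirely off-screen
        for ty in range(y0 // tile_size, y1 // tile_size + 1):
            for tx in range(x0 // tile_size, x1 // tile_size + 1):
                bins.setdefault((tx, ty), []).append(tri)
    for tile in bins.values():
        tile.sort(key=lambda tri: min(v[2] for v in tri))
    return bins
```

Bounding-box binning is conservative: a triangle may be listed in a tile it does not actually cover, which costs some wasted work but never drops geometry.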
Future Work • Early Z and Hierarchical Z? • Pixel Shader: • Unified shader architecture (shared with the vertex shaders)? • Queue/buffering mechanism? Pixels need a lot of buffering (memory/texture latency is very large). • Implement a TMU simulator (filter algorithms, memory access, texture compression, cache).
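As a rough illustration of the hierarchical Z idea, the sketch below keeps one coarse value per tile (the farthest depth stored in it) so that a whole block of fragments can be rejected before any per-pixel work. The class name `HierarchicalZ`, the two-level layout, the tile size, and the less-than depth function are all assumptions, not a description of any existing GPU.

```python
class HierarchicalZ:
    """Two-level depth buffer: per-pixel depths plus the maximum depth of each tile."""

    def __init__(self, width, height, tile_size=8, far=1.0):
        self.width, self.height, self.tile_size = width, height, tile_size
        self.depth = [[far] * width for _ in range(height)]
        tiles_x = (width + tile_size - 1) // tile_size
        tiles_y = (height + tile_size - 1) // tile_size
        self.tile_max = [[far] * tiles_x for _ in range(tiles_y)]

    def tile_may_pass(self, tx, ty, min_z):
        """Coarse test: False means no fragment with depth >= min_z can pass in this tile."""
        return min_z < self.tile_max[ty][tx]

    def write(self, x, y, z):
        """Per-pixel less-than depth test; refreshes the tile maximum on success."""
        if z >= self.depth[y][x]:
            return False
        self.depth[y][x] = z
        ts = self.tile_size
        tx, ty = x // ts, y // ts
        # Recompute the farthest depth still stored in this tile.
        self.tile_max[ty][tx] = max(
            self.depth[yy][xx]
            for yy in range(ty * ts, min((ty + 1) * ts, self.height))
            for xx in range(tx * ts, min((tx + 1) * ts, self.width))
        )
        return True
```

Recomputing the tile maximum on every write is the simplest correct choice for a sketch; a hardware design would presumably update it lazily or per block.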
Future Work • Fixed fragment operations: • Implement using the shader? • Fog: remove? • Pixel Ownership: remove? • Scissor Test: implement (needed if clipping is not implemented). • Alpha test: same as Z Test. • Z Test and Stencil Test: must be implemented, but could be added to a generic shader unit? • Blending: add to shader? • Dithering: remove. • Logical Op: remove or add to shader. • MSAA Operations: ?
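To make the list of fixed fragment operations concrete, here is a minimal sketch of a few of them applied in the usual OpenGL order: scissor test, alpha test, Z test, then blending. The function name `process_fragment`, the buffer layout, the greater-than alpha function, the less-than depth function, and the source-alpha blend are illustrative assumptions; the stencil test and the operations marked for removal are left out.

```python
def process_fragment(frag, framebuffer, depthbuffer, scissor, alpha_ref=0.0):
    """Run one fragment through scissor, alpha, depth and blend stages.

    frag is (x, y, z, (r, g, b, a)); scissor is (x0, y0, x1, y1) with x1/y1 exclusive.
    Returns True if the fragment was written.
    """
    x, y, z, (r, g, b, a) = frag

    # Scissor test: discard fragments outside the rectangle.
    x0, y0, x1, y1 = scissor
    if not (x0 <= x < x1 and y0 <= y < y1):
        return False

    # Alpha test: pass only if alpha is greater than the reference value.
    if a <= alpha_ref:
        return False

    # Z test: pass only if the fragment is closer than the stored depth.
    if z >= depthbuffer[y][x]:
        return False
    depthbuffer[y][x] = z

    # Blending: source-alpha over the existing color.
    dr, dg, db = framebuffer[y][x]
    framebuffer[y][x] = (
        r * a + dr * (1 - a),
        g * a + dg * (1 - a),
        b * a + db * (1 - a),
    )
    return True
```

Each stage is a small comparison or arithmetic step on per-fragment data, which is why moving these stages into a generic shader unit is a plausible option.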
Future Work • Framebuffer: • Z compression. • Color compression. • SSAA or MSAA support?
News and Rumors • NV30 architecture: • 4x2 pixel pipes? • 8x zixel pipes (Z Test & Stencil only). • ATI ready to release R350 and RV350 in a couple of weeks. • R350: Updated R300 core with additional features (?) and increased clock frequency (375 – 400 MHz). • RV350: value chip based on the R300 core. Maybe 8x1 core, 128-bit bus. Clock frequency 300 – 400 MHz. 75 million transistors.
Imagine • ‘Computer Graphics on a Stream Architecture’, John Douglas Owens, PhD dissertation. • Not read yet either.