1 / 29

Status – Week 260

Status – Week 260. Victor Moya. Summary. shSim. GPU design. Future Work. Rumors and News. Imagine. shSim. Currently working: Command Processor: reads a text based trace file (programs, parameters, vertexs, commands to rasterizer).

gaurav
Download Presentation

Status – Week 260

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Status – Week 260 Victor Moya

  2. Summary • shSim. • GPU design. • Future Work. • Rumors and News. • Imagine.

  3. shSim • Currently working: • Command Processor: reads a text based trace file (programs, parameters, vertexs, commands to rasterizer). • Shader: simulates a N multithreaded, variable latency support, VS1 capable ‘vertex’ shader. • Rasterizer: OpenGL ‘emulator’, accepts resolution and clip planes changes, recieves ‘shaded’ vertexs from the shader (only 2 QuadFloats, vertex positon + color), displays the triangles in a GL window.

  4. shSim • Tests: • 2/4 multithread (with another 2/4 input buffers) single shader. • Fixed 3 latency cycles. Shader to Rasterizer latency of 4. CommandProcessor to Rasterizer latency of 6. • Simple coordinate change traces (shader.input, shader.input.2). • Ripple vertex shader example from DX8 & DX9 SDK (ripple.input): • Around 300 triangles (1100 vertexs). • Color is calculated from vertex position.

  5. shSim • Ripple.vsh.

  6. shSim • Screenshots from frames rendered by shSim:

  7. GPU Architecture • Based in current GPUs: • NV30 • R300 • Based in other graphic processors: • PS3 • Imagine

  8. GPU Architecture • Based in an API: • DX8 • DX9 • DX10 • OpenGL 1.4 and extensions. • OpenGL 2.0 • Based in an architecture model: • Vector • Scalar • Multithreaded

  9. GPU Specification • Shader Model: • Language: • DX9: • VS2.0/PS2.0. • VS3.0/PS3.0. • OpenGL: • NV_vertex_program_2/NV_fragment_program. • ARB_vertex_program/ARB_fragment_program. • Our own language.

  10. GPU Specification • Shader Architecture: • Architectural model: • Scalar. • SIMD. • Multithreaded. • Vector. • Out-of-order.

  11. GPU Specification • Configuration: • Integer Unit: • Number. • Precission. • SIMD or scalar? • Float Point Unit: • Number. • Precission. • SIMD or scalar?

  12. GPU Specification • Memory Unit: • Number. • Texture modes. • Filtering modes. • Register Banks: • Number. • Ports. • Size. • Scalar or SIMD?

  13. XBOX (NV2A) Vertex Shader

  14. Future Work • Shader: • Add branch/call/ret instructions. • Add texture instructions (Pixel Shader). • Command Processor: • Define a trace specification: binary, gzipped? • Define an interface with OpenGL (Mesa?) or DX8/DX9 (driver?). • Primitive Assembly: • Implement vertex cache and primitive assembly (only triangles?). • Implement culling and clipping?

  15. Future Work • Deferred rendering? • Transformed geometry must be stored in video memory. • Geometry must be sorted: • Tiles. • Front to back. • Rasterization: • Triangle Setup and Fragment Generation. • Any suited method: Olano & Greer, DDA?. • MSAA support?

  16. Future Work • Early Z and Hierarchical Z? Pixel Shader: • Implement unified with vertex shaders? • Queue/buffering mechanism? (memory/texture latency very large). • Pixel Shader: • Unified shader architecture? • Pixels need a lot of buffering (memory/texture operations). • Implement a TMU simulator (filter algorithms, memory access, texture compression, cache).

  17. Future Work • Fixed fragment operations: • Implement using the shader? • Fog: remove? • Pixel Ownership: remove? • Scissor Test: implement (needed if clipping is not implemented). • Alpha test: same as Z Test. • Z Test and Stencil Test: must be implemented, but could be added to a generic shader unit? • Blending: add to shader? • Dithering: remove. • Logical Op: remove or add to shader. • MSAA Operations: ?

  18. Future Work • Framebuffer: • Z compression. • Color compression. • SSAA or MSAA support?

  19. News and Rumors • NV30 architecture: • 4x2 pixel pipes? • 8x zixel pipes (Z Test & Stencil only). • ATI ready to release R350 and RV350 in a couple of weeks. • R350: Updated R300 core with additional features (?) and increased clock frequency (375 – 400 MHz). • RV350: value chip based in R300 core. Maybe 8x1 core, 128 bits bus. Clock frequency 300 – 400 MHz. 75 Million transistors.

  20. Imagine • ‘Computer Graphics on a Stream Architecture’, John Douglas Owens, PhD dissertation. • Not read yet either.

More Related