
RADEON™ 9700 Architecture and 3D Performance

Gordon Elder. What is the RADEON™ 9700? Programmability (SMARTSHADER™ 2.0): first full floating point graphics pipeline, enables compilation of high level shading languages. Performance: high bandwidth, parallelism.


Presentation Transcript


  1. RADEON™ 9700 Architecture and 3D Performance
     Gordon Elder

  2. RADEON™ 9700
     What is the RADEON™ 9700?
     • Programmability (SMARTSHADER™ 2.0)
       • First Full Floating Point Graphics Pipeline
       • Enables Compilation of High Level Shading Languages
     • Performance
       • High Bandwidth
       • Parallelism
       • Efficiency
     • Image Quality (SMOOTHVISION™ 2.0)
       • Multisample Antialiasing
       • Anisotropic Texture Filtering

  3. Image Generation with Image Mapping: 1st Generation Programmability
     Idea: Texture Mapping, Blinn and Newell 1976
     Implementation: SGI VGXT 1990
     Hardwired Vertex Processing
     Hardwired Fragment Processing with a Single Texture
     Result: Environment Mapping and other effects
     Blinn, J. F. and Newell, M. E. Texture and reflection in computer generated images. Communications of the ACM Vol. 19, No. 10 (October 1976), 542-547

  4. Image Generation with Texture Composition: 2nd Generation Programmability
     Idea: Shade trees, R. Cook 1984
     Implementation: RADEON™ 8500 2001
     Limited Vertex Programmability
     Limited Fragment Processing
     • Multiple Textures
     • Fixed Point Data
     • Short Programs
     Result: Current generation of effects.
     Robert L. Cook. Shade Trees. Computer Graphics Vol. 18, No. 3 (July 1984), 223-231

  5. Image Generation with General Purpose Floating Point Math & Texturing: 3rd Generation Programmability
     Idea: RenderMan®, Pixar 1987
     Implementation: ATI RADEON™ 9700 2002
     Advanced Vertex Programmability
     Advanced Fragment Programmability
     • Floating Point Data
     • Rich Instruction Set
     • Large Instruction Store
     Result: Enabling Cinematic Rendering
     Compiling RenderMan®, Maya, etc.
     William T. Reeves, David H. Salesin, Robert L. Cook. Rendering Antialiased Shadows with Depth Maps. Computer Graphics Vol. 21, No. 4 (July 1987), 283-291

  6. SMARTSHADER™ 2.0
     • Next-generation programmable shader technology
     • Enabling cinema-quality effects in real time
     • First complete DirectX® 9.0 feature support
       • 2.0 Vertex and Pixel Shaders
       • Floating Point Pixel Pipelines
       • 128-bit Floating Point Texture and Frame Buffer Formats
       • Two-Sided Stencil Shadow Acceleration
       • High Precision 32-bpp (10:10:10:2) Display Mode
       • Higher Order Surface Enhancements
     • Full feature set also available for OpenGL®
       • OpenGL® Shading Language Support
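To make the 10:10:10:2 display mode concrete, here is a minimal sketch of packing three 10-bit color channels and a 2-bit alpha into one 32-bit word. The slide does not state the bit order; the layout below (alpha in the top two bits, then red, green, blue) follows the Direct3D `D3DFMT_A2R10G10B10` format and is an assumption, as are the helper names.

```python
# Sketch of a 32-bpp 10:10:10:2 pixel. Bit layout is an ASSUMPTION
# (alpha 31-30, red 29-20, green 19-10, blue 9-0); the slide only
# gives the channel widths.

def pack_10_10_10_2(r: int, g: int, b: int, a: int) -> int:
    """Pack 10-bit r, g, b (0..1023) and 2-bit a (0..3) into 32 bits."""
    if not all(0 <= c <= 1023 for c in (r, g, b)):
        raise ValueError("color channels must fit in 10 bits")
    if not 0 <= a <= 3:
        raise ValueError("alpha must fit in 2 bits")
    return (a << 30) | (r << 20) | (g << 10) | b


def unpack_10_10_10_2(word: int) -> tuple[int, int, int, int]:
    """Inverse of pack_10_10_10_2; returns (r, g, b, a)."""
    return (
        (word >> 20) & 0x3FF,
        (word >> 10) & 0x3FF,
        word & 0x3FF,
        (word >> 30) & 0x3,
    )


if __name__ == "__main__":
    word = pack_10_10_10_2(1023, 512, 0, 3)
    print(hex(word), unpack_10_10_10_2(word))
```

Each color channel gets 1024 levels instead of the 256 available in an 8:8:8:8 mode, at the cost of a coarser alpha channel.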

  7. Vertex Shaders (SMARTSHADER™ 2.0)
     • Flow Control
       • Loops, jumps and subroutines
       • Allow re-use of certain parts of the shader code
       • Avoids repetition and saves instructions
     • More Instructions, More Complex Effects
       • Up to 65,280 instructions per pass
       • Vertex shaders can be much more complex than they were in DX8

  8. Pixel Shaders (SMARTSHADER™ 2.0)
     • More Complex Shaders by an Order of Magnitude
       • Up to 160 instructions per pass
       • 32 address ops, 64 color ops, 64 alpha ops
       • Compared with 12 instructions total in DX8.0
     • Multi-pass rendering support
       • High precision 128-bit floating point data formats for storing intermediate results between passes
       • Shaders can now effectively be thousands of instructions long; performance is the only limitation
     • 24-bit per component floating point precision for all pixel shader operations, necessary for cinema-quality effects
     • Allows shaders written in any present or future language to run on hardware with SMARTSHADER™ 2.0
       • Even high level languages like RenderMan® can now be compiled to run on RADEON™ 9700 in real time
     • Pixel shaders can also implement complex image processing algorithms

  9. RADEON™ 9700 Performance
     Key design elements for best performance: High Bandwidth, Parallelism, & Efficiency
     High Bandwidth
     • AGP 8x provides 2 GB/sec transfers to or from the CPU or system memory.
     • 310 MHz 256-bit DDR Memory Interface provides 20 GB/sec access to the Frame Buffer.
     • Internal 256-bit data busses for Color, Texture and Z
     Parallelism
     • 4 Vertex Engines running at 325 MHz provide 325 Mtriangles/sec (4 clocks per vertex per engine)
     • 8 Pixels/Clock Rasterization Architecture running at 325 MHz provides a peak fill rate of 2.6 Gpix/sec
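The peak figures on this slide follow from simple arithmetic on the clock rates and widths it quotes. The sketch below reproduces them; the function names are illustrative, and the triangle rate assumes one new vertex per triangle (as in a long strip), which the slide implies but does not state.

```python
# Back-of-the-envelope check of the peak figures quoted on the slide.
# Inputs come from the slide; helper names are illustrative only.

def memory_bandwidth_gb_s(clock_mhz: float, bus_bits: int,
                          transfers_per_clock: int = 2) -> float:
    """Peak bandwidth in GB/sec; DDR transfers data twice per clock."""
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


def vertex_rate_m_per_s(engines: int, clock_mhz: float,
                        clocks_per_vertex: int) -> float:
    """Peak vertex throughput in millions of vertices per second."""
    return engines * clock_mhz / clocks_per_vertex


def fill_rate_gpix_s(pixels_per_clock: int, clock_mhz: float) -> float:
    """Peak fill rate in gigapixels per second."""
    return pixels_per_clock * clock_mhz / 1000


if __name__ == "__main__":
    # 310 MHz x 2 transfers x 32 bytes: just under the quoted 20 GB/sec
    print(memory_bandwidth_gb_s(310, 256))
    # 4 engines x 325 MHz / 4 clocks per vertex
    print(vertex_rate_m_per_s(4, 325, 4))
    # 8 pixels per clock x 325 MHz
    print(fill_rate_gpix_s(8, 325))
```

These are theoretical peaks; sustained rates depend on how well the efficiency features described next keep the memory interface from becoming the bottleneck.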

  10. RADEON™ 9700 Performance (cont.)
      Efficiency
      Graphics systems tend to be memory bandwidth limited. The RADEON™ 9700 is no exception, so it is important to use the bandwidth efficiently.
      • Hierarchical and Early Z checking allows pixels to be rejected before the pixel shader. This is very important when shader programs are long.
      • Color, Texture and Z caches reduce memory bandwidth utilization by exploiting spatial and temporal locality.
      • Lossless Color and Z data compression reduces memory bandwidth utilization.
      • Compressed Textures can be utilized to reduce memory bandwidth utilization.
      • Fast Color and Z clears eliminate the need to access memory for clears.
      HyperZ III
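The benefit of early Z rejection can be illustrated with a toy software model. This is only a sketch of the general idea (test depth before running the shader), not a description of the HyperZ III hardware; the `count_shaded` helper and the fragment tuples are hypothetical.

```python
# Toy model of early Z: count how many fragments reach the (expensive)
# pixel shader with and without a depth test placed in front of it.
# Smaller z means closer to the viewer; the buffer starts at the far plane.

def count_shaded(fragments, width, height, early_z=True):
    """fragments: iterable of (x, y, z). Returns number of shader runs."""
    depth = [[1.0] * width for _ in range(height)]
    shaded = 0
    for x, y, z in fragments:
        if early_z and z >= depth[y][x]:
            continue  # hidden fragment rejected before the shader runs
        shaded += 1  # stand-in for executing a long pixel shader
        if z < depth[y][x]:
            depth[y][x] = z  # late test: shader already ran either way
    return shaded


if __name__ == "__main__":
    # Three fragments land on the same pixel; the second is hidden.
    frags = [(0, 0, 0.2), (0, 0, 0.5), (0, 0, 0.1)]
    print(count_shaded(frags, 1, 1, early_z=True))
    print(count_shaded(frags, 1, 1, early_z=False))
```

In this model the saving grows with scene overdraw and with shader length, which matches the slide's point that early rejection matters most when shader programs are long.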

  11. RADEON™ 9700 Performance (cont.)
      One more interesting thing…
      Scalability
      • The RADEON™ 9700 Architecture is capable of scaling up to 256 simultaneous units

  12. Image Quality (SMOOTHVISION™ 2.0)
      Performance matters too
      Pixel antialiasing and anisotropic texture filtering improve image quality only if they are enabled.
      Just going to higher resolutions isn’t the answer for improved image quality.
      • Artifacts due to poor texture sampling remain.
      • Dynamic antialiasing artifacts are still very visible.
      Sufficient performance for high resolution display, high quality texture filtering, and antialiasing is needed. The RADEON™ 9700 was architected to do all three simultaneously.

  13. Anti-Aliasing (SMOOTHVISION™ 2.0)
      [Figure: two edge gradient plots, Standard and Gamma Corrected, each with axes Input and Output]
      • Non-Grid Programmable Multi-Sampling
        • 2, 4, or 6 samples per pixel
        • Sample positions provide the maximum quality per sample
        • Lossless Z and Color compression minimizes bandwidth cost of higher sample counts.
      • Per Sample Gamma Correction
        • Takes gamma into account when blending samples
        • Creates smoother edge transitions
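Taking gamma into account when blending samples can be sketched as follows: convert each stored sample to linear light, average, then convert back. The slide does not give the transfer function the hardware uses, so the pure power law with exponent 2.2 below is an assumption, and the function names are illustrative.

```python
# Resolving multisample values with and without gamma correction.
# Sample values are display-encoded intensities in the range 0..1.
# ASSUMPTION: a simple power-law display gamma of 2.2.

def resolve_naive(samples):
    """Average the encoded values directly (ignores display gamma)."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma=2.2):
    """Decode to linear light, average, then re-encode for display."""
    linear = [s ** gamma for s in samples]
    mean_linear = sum(linear) / len(linear)
    return mean_linear ** (1.0 / gamma)


if __name__ == "__main__":
    # Edge pixel half covered by a white polygon over black, 4 samples.
    edge = [0.0, 0.0, 1.0, 1.0]
    print(resolve_naive(edge))
    print(resolve_gamma_corrected(edge))
```

For the half-covered pixel the naive average stores 0.5, which a gamma 2.2 display shows as much darker than half brightness; the corrected resolve stores a higher value so the displayed intensity is closer to true 50% coverage, which is what smooths the edge gradient.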

  14. Anisotropic Filtering (SMOOTHVISION™ 2.0)
      • Improved Adaptive Algorithm
      • Up to 16 Trilinear Samples (128-tap)
      • Calculates optimal number of samples for each polygon
      • Delivers full image quality benefit while conserving memory bandwidth
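The "128-tap" figure is consistent with 16 trilinear samples of 8 texels each (a 2x2 bilinear footprint on two mip levels). The sketch below shows that arithmetic along with one common way to pick a sample count adaptively, from the ratio of the footprint's long axis to its short axis. The slide does not describe the actual adaptive algorithm, so `adaptive_sample_count` is a hypothetical textbook-style stand-in.

```python
import math

# 2x2 bilinear taps on each of the two mip levels blended by trilinear.
TEXELS_PER_TRILINEAR_SAMPLE = 8


def tap_count(trilinear_samples: int) -> int:
    """Total texels read for a given number of trilinear samples."""
    return trilinear_samples * TEXELS_PER_TRILINEAR_SAMPLE


def adaptive_sample_count(major_axis: float, minor_axis: float,
                          max_samples: int = 16) -> int:
    """ASSUMED heuristic: one sample per unit of anisotropy, capped.

    major_axis and minor_axis are the lengths (in texels) of the pixel
    footprint in texture space.
    """
    if major_axis <= 0 or minor_axis <= 0:
        raise ValueError("footprint axes must be positive")
    ratio = max(major_axis, minor_axis) / min(major_axis, minor_axis)
    return max(1, min(max_samples, math.ceil(ratio)))


if __name__ == "__main__":
    print(tap_count(16))                     # matches the 128-tap figure
    print(adaptive_sample_count(1.0, 1.0))   # surface facing the viewer
    print(adaptive_sample_count(8.0, 1.0))   # surface at a grazing angle
```

Spending extra samples only where the footprint is stretched is how an adaptive scheme can keep image quality while conserving memory bandwidth on surfaces that face the viewer.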

  15. RADEON™ 9700 Demos

  16. Conclusion
