
The Framebuffer

The Framebuffer. February 6, 2003. “A Configurable Pixel Cache for Fast Image Generation”, Goris et al. Problem: processor speeds have increased to the point that the frame buffer is now the bottleneck. Solution: cache memories. Overview: three considerations in frame buffer design:


Presentation Transcript


  1. The Framebuffer February 6, 2003

  2. “A Configurable Pixel Cache for Fast Image Generation”, Goris et al. • Problem • Processor speeds have increased to the point that the frame buffer is now the bottleneck. • Solution • Cache memories

  3. Overview • Three considerations in frame buffer design: • Input mechanism to the frame buffer • RAM used to store the image • Output mechanism used by the frame buffer to refresh the display • Traditional approach: • Sequential memory locations lie along a scan line • Sequential pixels are provided at a high rate to refresh the display • Refresh requires a significant percentage of RAM bandwidth

  4. Pixel-cache Approach • Pixel-cache holds a tile of frame buffer pixels • High speed scan converter calculates intensity one pixel at a time • Scan converter writes pixels serially into a pixel cache • Once the bounds of the tile are exceeded, the contents are transferred in parallel to the frame buffer

  5. Assumptions • Tiles are non-overlapping and aligned on boundaries that are integer multiples of their height and width • Why? Quick mapping between scan converter addresses and tile/bit addresses • F_scan(x, y) = MSBs: tile(x, y) + LSBs: bit(x, y) • Tile address is used to access a tile of pixels in the frame buffer • Bit address is used to access individual pixels within the cache
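The address split on this slide can be sketched in a few lines. This is a minimal illustration, assuming power-of-two tile dimensions; the function name `split_address` and its parameters are hypothetical and do not come from the paper.

```python
def split_address(x, y, tile_w=4, tile_h=4):
    """Split a scan converter address (x, y) into a tile address and a bit address.

    Assumes tile_w and tile_h are powers of two, so the split is a shift and a mask.
    """
    shift_x = tile_w.bit_length() - 1  # log2(tile_w)
    shift_y = tile_h.bit_length() - 1  # log2(tile_h)
    tile = (x >> shift_x, y >> shift_y)         # most significant bits
    bit = (x & (tile_w - 1), y & (tile_h - 1))  # least significant bits
    return tile, bit


# Pixel (13, 6) with 4x4 tiles, then pixel (37, 5) with 16x1 tiles.
square = split_address(13, 6)
linear = split_address(37, 5, tile_w=16, tile_h=1)
```

Because tiles are aligned on multiples of their own size, no division or lookup is needed, which is presumably why the alignment assumption makes the mapping quick.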

  6. Implementation • Pixel cache • Single 48-pin Integrated Circuit (IC) • 1.6 micron CMOS • 3700 gates • 16.6 MHz • Frame buffer • 256K-bit VRAM • 4x4 or 16x1 tiles

  7. Main Components • Data Cache: stores pixel intensity data from the scan converter • data port is bi-directional for fast reads and writes • Source Register: holds current tile being written to frame buffer • frame buffer writes overlapped with data cache input • Replacement Rule Register/Logic: used for boolean operations on the frame buffer • data from scan converter can be used with old frame buffer data being written back to the frame buffer

  8. Components – cont. • Destination Register: holds existing contents of frame buffer • used by replacement rule logic to perform boolean logic with incoming pixel data and committed frame buffer data • operation can be overlapped with data cache writes and source register transfers • Pattern Register: holds frame buffer data to be blended with scan converter data • similar to destination register • useful for generating repeating patterns • Z Cache & Z Pipeline Register: used to buffer depth information • equivalent to data cache and source register but for z data
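As a rough sketch of what the replacement rule logic does, the snippet below combines incoming tile data (source) with existing frame buffer data (destination) one pixel at a time. The rule names and the `apply_rule` helper are illustrative assumptions, not the chip's actual rule set.

```python
# Hypothetical replacement rules; names are illustrative, not taken from the paper.
RULES = {
    "replace": lambda src, dst: src,
    "and": lambda src, dst: src & dst,
    "or": lambda src, dst: src | dst,
    "xor": lambda src, dst: src ^ dst,
}


def apply_rule(rule, source_tile, destination_tile):
    """Combine each incoming pixel with the existing frame buffer pixel."""
    op = RULES[rule]
    return [op(s, d) for s, d in zip(source_tile, destination_tile)]


merged = apply_rule("xor", [0b1100, 0xFF], [0b1010, 0x0F])
```

In hardware the whole tile would be combined in parallel; the list comprehension here only models the per-pixel result.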

  9. Tile Organization • Larger tiles give higher performance • more pixel updates per frame buffer memory cycle • but increase the size and cost of the pixel cache • Number of pixels updated is function of organization and operation performed • randomly oriented vectors – square tiles • horizontal vectors – linear horizontal tile structure • See Figures 4 and 5 • No Silver Bullet
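To see why no single tile shape wins, the following sketch counts how many tile transfers a serially written run of pixels would trigger for different tile shapes. It is a simplified model (one transfer each time the tile changes); the helper name `tile_transfers` is an assumption for illustration.

```python
def tile_transfers(pixels, tile_w, tile_h):
    """Count how often the current tile changes while pixels are written serially."""
    transfers = 0
    current = None
    for x, y in pixels:
        tile = (x // tile_w, y // tile_h)
        if tile != current:
            transfers += 1
            current = tile
    return transfers


horizontal = [(x, 0) for x in range(32)]  # 32-pixel horizontal vector
diagonal = [(i, i) for i in range(32)]    # 32-pixel diagonal vector

results = {
    "horizontal 4x4": tile_transfers(horizontal, 4, 4),
    "horizontal 16x1": tile_transfers(horizontal, 16, 1),
    "diagonal 4x4": tile_transfers(diagonal, 4, 4),
    "diagonal 16x1": tile_transfers(diagonal, 16, 1),
}
```

In this toy model the 16x1 tile is best for the horizontal vector and worst for the diagonal one, which matches the slide's point that the best organization depends on the operation performed.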

  10. Z-Buffer • Requires storing the z value for each displayed pixel • both pixel intensity and z value are updated or left alone • Idea: read several z values in parallel and overlap compare and update with writing of previously updated values to memory
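The per-tile z comparison can be sketched as follows. This assumes that a smaller z value means closer to the viewer, which the slides do not state; the function name is hypothetical.

```python
def z_update_tile(new_colors, new_z, old_colors, old_z):
    """Update a tile: each pixel's color and z are both replaced or both left alone.

    Assumption: smaller z is nearer to the viewer.
    """
    out_colors, out_z = [], []
    for c, z, oc, oz in zip(new_colors, new_z, old_colors, old_z):
        if z < oz:
            out_colors.append(c)
            out_z.append(z)
        else:
            out_colors.append(oc)
            out_z.append(oz)
    return out_colors, out_z


colors, depths = z_update_tile(
    new_colors=[10, 20, 30, 40], new_z=[5, 9, 2, 7],
    old_colors=[1, 2, 3, 4], old_z=[6, 8, 3, 7],
)
```

The hardware idea on the slide is that these comparisons run on several z values at once while the previous tile is still being written back.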

  11. Requirements • Take advantage of unused video ram • 1280x1024 w/ 256k VRAM = 728x1024 free • Reconfigurable frame buffer • 8 to 32 planes in multiples of 8 planes • Allow z-buffering on all configurations • Off-screen frame buffer memory should not limit z resolution

  12. Z-Buffer Cache • See the paper

  13. Performance • Dependent upon • rate at which pixels are stored in the cache • cache hit rate • time to write a tile to the frame buffer • Line-drawing • random 30-pixel vectors with 4x4 tiles • 9 Mpixels/sec or 300,000 vectors/sec with cache • 1/3 the performance without cache • Polygons • random 30x30 pixel squares with 16x1 tiles • 15 Mpixels/sec or 16,000 polygons/sec with cache • 1/5 the performance without cache
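The quoted rates are consistent with simple division of pixel throughput by primitive size. The short check below assumes each vector covers 30 pixels and each square 30 x 30 = 900 pixels; the helper name is illustrative.

```python
def primitives_per_second(pixels_per_second, pixels_per_primitive):
    """Convert a pixel fill rate into a primitive rate."""
    return pixels_per_second / pixels_per_primitive


vectors = primitives_per_second(9_000_000, 30)          # 30-pixel vectors
polygons = primitives_per_second(15_000_000, 30 * 30)   # 30x30 pixel squares

# Rates without the cache, using the slide's 1/3 and 1/5 factors.
uncached_line_rate = 9_000_000 / 3
uncached_polygon_rate = 15_000_000 / 5
```

The vector figure comes out at exactly 300,000 per second, and the polygon figure at roughly 16,667 per second, which the slide gives as 16,000.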

  14. “Aliasing and Anti-Aliasing”, Möller and Haines • Anti-aliasing is the process of removing visual artifacts, or more specifically “jaggies” • Need sampling and filtering • rendering an image is a sampling task • texels need to be resampled for texture mapping to give good results • a sequence of images for animation needs to be sampled at uniform time intervals

  15. Sampling • Want to represent information (signals) digitally to reduce the amount of information • note: too little information can cause aliasing • To reconstruct the original signal, the sampling frequency needs to be more than 2x the maximum frequency of the sampled signal (Nyquist) • This requires the signal to be bandlimited • In 3D graphics, signals rendered with point samples (e.g. edges of polygons) are not bandlimited BUT textures are!
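A small numeric illustration of the Nyquist condition: sampling a 7 Hz cosine at 8 samples per second (well below the required rate of more than 14) yields exactly the same samples as a 1 Hz cosine. The frequencies and the helper name are arbitrary choices for this sketch.

```python
import math


def sample_cosine(freq_hz, rate_hz, count):
    """Return `count` samples of cos(2*pi*freq*t) taken at `rate_hz` samples per second."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


low = sample_cosine(1.0, 8.0, 16)   # sampled above the Nyquist rate
high = sample_cosine(7.0, 8.0, 16)  # undersampled: aliases to 1 Hz

max_difference = max(abs(a - b) for a, b in zip(low, high))
```

Since the two sample sequences are indistinguishable, no reconstruction filter can recover the 7 Hz signal from them.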

  16. Reconstruction • How do we recreate the original signal? • Box filter (nearest neighbor) • worst filter to use (noncontinuous) • but simple • Tent filter (aka triangle filter) • better than box filter (continuous) • Ideal lowpass filter (sinc filter) • perfect reconstruction • but impractical: the filter has infinite width and takes negative values
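The three filters can be compared with a small sketch that reconstructs a value between unit-spaced samples. The kernel definitions are the standard textbook forms; the `reconstruct` helper and the sample values are assumptions made for illustration.

```python
import math


def box(x):
    """Nearest-neighbor kernel: 1 within half a sample of the center."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def tent(x):
    """Triangle kernel: linear falloff to zero at distance 1."""
    return max(0.0, 1.0 - abs(x))


def sinc(x):
    """Ideal lowpass kernel: infinite extent, with negative lobes."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, x, kernel):
    """Evaluate the reconstructed signal at position x (samples sit at 0, 1, 2, ...)."""
    return sum(s * kernel(x - i) for i, s in enumerate(samples))


samples = [0.0, 2.0, 4.0]
nearest = reconstruct(samples, 1.25, box)   # snaps to the sample at position 1
linear = reconstruct(samples, 1.25, tent)   # interpolates between positions 1 and 2
```

Evaluating `sinc(1.5)` gives a negative weight, which illustrates the negative lobes mentioned above.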

  17. Resampling • Used to magnify or minify a sampled signal • Magnification is the simpler case • have continuous signal (remember the sinc filter) • just resample at desired intervals • Minification • frequency of the original signal is too high to avoid aliasing • need to refilter the signal (see the paper)

  18. Screen-Based Anti-Aliasing • Screen based anti-aliasing has no knowledge of objects being rendered • General strategy • use a sampling pattern for the screen and then weight and sum the samples to produce a pixel color

  19. Supersampling - FSAA • Supersampling: Anti-aliasing algorithm that takes more than one sample per pixel • Full-Scene Anti-aliasing (FSAA) • renders the scene at a higher resolution • averages the neighboring samples to create an image • common in consumer hardware • costly but simple
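The averaging step of FSAA amounts to a box-filtered downsample of the high-resolution render. A minimal sketch, assuming a grayscale image stored as a list of rows and an integer supersampling factor (the function name is hypothetical):

```python
def downsample(image, factor):
    """Average each factor x factor block of a high-resolution image into one pixel."""
    height = len(image) // factor
    width = len(image[0]) // factor
    out = []
    for j in range(height):
        row = []
        for i in range(width):
            block = [
                image[j * factor + dj][i * factor + di]
                for dj in range(factor)
                for di in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# A 4x4 render reduced to 2x2, i.e. four samples per final pixel.
hi_res = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
]
final = downsample(hi_res, 2)
```

The cost noted on the slide is visible here: every final pixel requires `factor * factor` rendered samples.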

  20. Supersampling – Accumulation Buffer • Accumulation Buffer • buffer that has the same resolution as the desired image but more bits of color • the view is moved half a pixel in the screen x- or y-direction as needed; the images are summed in the accumulation buffer and averaged after rendering • part of OpenGL • costly for real-time rendering
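The accumulation approach can be sketched in software as several full-resolution passes with shifted sample positions. The `render` function below is a stand-in scene (a half-plane edge) invented for this example, and the base sample position of a quarter pixel is likewise an assumption; this does not use the OpenGL accumulation buffer API.

```python
def render(width, height, dx, dy):
    """Hypothetical point-sampled renderer: 1.0 where the sample lies above y = 0.5 * x."""
    return [
        [1.0 if (y + 0.25 + dy) > 0.5 * (x + 0.25 + dx) else 0.0 for x in range(width)]
        for y in range(height)
    ]


def accumulate(width, height, offsets):
    """Sum one render per offset, then average after all passes are done."""
    accum = [[0.0] * width for _ in range(height)]
    for dx, dy in offsets:
        image = render(width, height, dx, dy)
        for y in range(height):
            for x in range(width):
                accum[y][x] += image[y][x]
    return [[value / len(offsets) for value in row] for row in accum]


# Four passes, each shifted by half a pixel in x and/or y.
half_pixel_offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
smoothed = accumulate(4, 4, half_pixel_offsets)
```

Each pass is a complete render of the scene, which is why the technique is costly for real-time use.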

  21. Supersampling – T-buffer • T-buffer • variant of accumulation buffer • 2n image and z-buffers used for rendering • some logic to determine what buffer gets what data and how the buffers are combined (averaged) • data can be sent to all buffers simultaneously • screen offsets (x- and y-) can be set per buffer • Benefits for anti-aliasing? • work can be done in parallel • no programming needed to support anti-aliasing (single pass)

  22. Multisampling – “A-buffer” • A-buffer • computes a polygon's approximate coverage of each grid cell • takes more than one sample per pixel in a single pass • shares computations among the samples for a grid cell • Commonly used in software to generate high quality renderings but not in real-time • Focused on edge anti-aliasing and transparency

  23. Limitations • Limitations • size of the coverage mask • even at 8x8 aliasing is still visible • box filter often used for simplicity • worst filter

  24. Gaussian Filters • Benefit: allows samples to affect more than one pixel • Approximation of the sinc function with limits (removes the infinite width and the negative values) • Generically referred to as Gaussian filters because they are based on the Gaussian bell curve equation

  25. Quincunx (NVIDIA) • Real-time anti-aliasing scheme with samples that affect more than one pixel • “Quincunx” means an arrangement of five objects, four of them in a square and the fifth in the center • Pattern approximates a 2D tent filter • Uses a weighted average for the samples • center sample – 1/2 weight • corner samples – 1/8 weight • Superior to FSAA but can introduce error
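The weighting on this slide reduces to a five-tap weighted average. A minimal sketch using only the weights stated above (the function name and argument layout are illustrative):

```python
def quincunx(center, corners):
    """Weighted average of one center sample (1/2) and four corner samples (1/8 each)."""
    if len(corners) != 4:
        raise ValueError("quincunx expects exactly four corner samples")
    return 0.5 * center + 0.125 * sum(corners)


edge_pixel = quincunx(1.0, [1.0, 1.0, 0.0, 0.0])  # two corners fall outside the shape
```

The weights sum to 1, so a uniformly colored region is left unchanged.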

  26. So far so good… • Not every object on the screen can be perfectly sampled • Ex. arbitrarily small objects • Regular sampling patterns will always exhibit some form of aliasing • Solution? Distribute samples randomly over a pixel and use a different sampling pattern per pixel

  27. Stochastic Sampling • Why does it work? Randomization tends to replace repetitive aliasing with noise, which our visual system is more tolerant of • Jittering – most common stochastic sampling method • assume N samples per pixel • divide the pixel area into N regions of equal area • place a sample randomly in each region • final pixel color is computed by an average of the samples • Notable: 3Dlabs’ SuperScene antialiasing hardware scheme uses jittering
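The jittering steps listed above can be sketched as follows, assuming N is a perfect square so the pixel divides into an n x n grid of cells. The function names and the `scene` callback are hypothetical.

```python
import random


def jittered_samples(n_side, rng):
    """Place one random sample in each cell of an n_side x n_side grid over a unit pixel."""
    cell = 1.0 / n_side
    return [
        ((i + rng.random()) * cell, (j + rng.random()) * cell)
        for j in range(n_side)
        for i in range(n_side)
    ]


def shade_pixel(px, py, scene, n_side, rng):
    """Average the scene over the jittered sample positions inside pixel (px, py)."""
    points = jittered_samples(n_side, rng)
    return sum(scene(px + sx, py + sy) for sx, sy in points) / len(points)


rng = random.Random(0)
points = jittered_samples(4, rng)  # N = 16 samples per pixel
# Scene covering the left half of the pixel.
coverage = shade_pixel(0, 0, lambda x, y: 1.0 if x < 0.5 else 0.0, 4, rng)
```

Passing a fresh random state per pixel gives each pixel its own sampling pattern, which is what turns regular aliasing into noise.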

  28. Other Sampling • Interleaved sampling • ATI SMOOTHVISION • AT&T Pixel Machines • SGI VGX • Poisson disk sampling • pattern in which nonuniformly distributed points are separated by a minimum distance • Molnar’s scheme • adaptive refinement • useful in interactive applications • sampling rate kept low while the scene changes and is increased as the scene becomes static
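One simple way to generate a Poisson disk pattern is "dart throwing": propose random points and reject any that fall too close to an accepted one. This is only one possible construction, chosen here for brevity; the function name and parameters are assumptions.

```python
import math
import random


def poisson_disk(n_points, min_dist, rng, max_tries=10000):
    """Dart throwing in the unit square: keep a candidate only if it is far from all others."""
    points = []
    tries = 0
    while len(points) < n_points and tries < max_tries:
        tries += 1
        cand = (rng.random(), rng.random())
        if all(math.hypot(cand[0] - p[0], cand[1] - p[1]) >= min_dist for p in points):
            points.append(cand)
    return points


pattern = poisson_disk(8, 0.2, random.Random(1))
```

Dart throwing gets slow as the pattern fills up, so `max_tries` bounds the work and the function may return fewer points than requested.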
