The Power of Streaming Table Look-Up

Fred Brooks, University of North Carolina at Chapel Hill. http://www.cs.unc.edu/~brooks, brooks@cs.unc.edu. Thanks to ONR Virte, NIH NIBIB, DoE, NSF.


Presentation Transcript


  1. The Power of Streaming Table Look-Up
     Fred Brooks, University of North Carolina at Chapel Hill
     http://www.cs.unc.edu/~brooks
     brooks@cs.unc.edu
     Thanks to ONR Virte, NIH NIBIB, DoE, NSF

  2. von Neumann Computers
     • Designed to do many operations on each datum
     • Hence data stays (mostly) still, while instructions flow past
     • Substantial set of different operations, but each has fixed function

  3. Data-Streaming Computers
     • Designed to do same operation on many data; serially vs SIMD ||
     • So operation stays still (set up), and the data flows past
     • Want very powerful vector operations, so as to flow the data few times
     • A whole different way of programming
     • APL exemplifies how to think

  4. Problem: Conditionals
     • Stencil (logical vector) calculation: < > = ≠
     • Input/output masking by stencil
     • Some operators effect conditions
     • Absolute value, clamping
     • Max, min, match, merge (two streams)
     • Table Look-Up
     • Table Look-Up with Table Change
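The stencil idea on this slide can be sketched in a few lines of Python. This is only an illustration of the style, not code from the talk: the helper names `stencil`, `merge_by_stencil`, and `clamp` are hypothetical. Each function applies one operation to a whole stream, and the "if" is carried by a logical vector rather than by a branch per datum.

```python
def stencil(data, predicate):
    """Compute a logical vector (the stencil) by testing every element."""
    return [bool(predicate(x)) for x in data]


def merge_by_stencil(mask, if_true, if_false):
    """Merge two streams: take from if_true where the stencil is set."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def clamp(data, lo, hi):
    """Clamping is an operator that effects a condition without a branch."""
    return [min(max(x, lo), hi) for x in data]


data = [-3, 7, 0, 12, -1]

# Absolute value expressed as stencil + merge instead of a per-element "if".
negative = stencil(data, lambda x: x < 0)
absolute = merge_by_stencil(negative, [-x for x in data], data)

clamped = clamp(data, 0, 10)

print(absolute)  # [3, 7, 0, 12, 1]
print(clamped)   # [0, 7, 0, 10, 0]
```

The operation is set up once and the data flows past it, which is the programming style the previous slide attributes to APL.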

  5. GPUs
     • Are data-streaming computers
     • Have some fixed operations, e.g., vertex transformation
     • The powerful, custom-tailorable ops are done by streaming table lookup and by streaming to-memory operations, such as Z-buffering
     • I’m eager to hear Bill Dally on this

  6. So a Quick Look at the Past
     • Get some ideas for programming
     • Get some ideas for generalizing the GPUs
     • Avoid some mistakes of previous designs

  7. CONVERT in IBM 709 (1957)
     • Three ops in a standard op set
     • Amdahl, 709’s architect, invented them
     • General table lookups on 6-bit bytes, but designed for specific applications:
     • Translate character codes, 1-for-1
     • Radix conversion: add result to ACC
     • Decimal (BCD) addition
     • Suppress left zeros, etc. for printing
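As a rough illustration of radix conversion by table look-up with the result added to an accumulator, here is a minimal Python sketch. It is not the 709's actual instruction semantics or word layout: the one-table-per-digit-position arrangement and the names `build_position_tables` and `convert_by_addition` are assumptions made for the example.

```python
def build_position_tables(width, radix=10):
    """One table per digit position; entry d holds d * radix**weight.

    Assumed layout for illustration only, not the 709's table format.
    """
    return [
        [d * radix ** (width - 1 - pos) for d in range(radix)]
        for pos in range(width)
    ]


def convert_by_addition(digits, tables):
    """Stream the digits past the tables, adding each looked-up value to acc."""
    if len(digits) != len(tables):
        raise ValueError("need exactly one table per digit position")
    acc = 0
    for table, digit in zip(tables, digits):
        acc += table[digit]  # the only per-digit work is a look-up and an add
    return acc


tables = build_position_tables(4)
print(convert_by_addition([1, 9, 5, 7], tables))  # 1957
```

No multiplication happens while the digits stream: all the arithmetic has been moved into the tables ahead of time.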

  8. CONVERT AND REPLACE step
     • Six-bit byte argument is added to table base address
     • Returns a 36-bit function:
     • 6 bits replace the argument in stream
     • 15 bits are added mod 2^15 to table address for next lookup!
     • A finite-state machine!
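The step described on this slide can be modelled in Python as follows. This is a sketch under stated assumptions, not a 709 emulator: table entries are kept as `(replacement, delta)` tuples rather than packed 36-bit words, and the byte codes `BLANK` and `SEP` and the table addresses are hypothetical. The example uses two tables to suppress left zeros, one of the applications listed on the previous slide.

```python
ADDR_MASK = (1 << 15) - 1  # table addresses wrap mod 2**15

BLANK = 48  # hypothetical 6-bit code for a blank
SEP = 11    # hypothetical 6-bit code for a field separator

LEADING = 0   # base address of the table used while zeros are still leading
AFTER = 64    # base address of the table used after the first nonzero digit


def build_memory():
    """Fill two look-up tables; each entry is (replacement byte, address delta)."""
    memory = {}
    for d in range(10):
        if d == 0:
            memory[LEADING + d] = (BLANK, 0)                # blank it, stay
        else:
            memory[LEADING + d] = (d, AFTER - LEADING)      # keep it, switch table
        memory[AFTER + d] = (d, 0)                          # keep it, stay
    memory[LEADING + SEP] = (SEP, 0)
    # Going back to the first table needs a negative delta, taken mod 2**15.
    memory[AFTER + SEP] = (SEP, (LEADING - AFTER) & ADDR_MASK)
    return memory


def convert_and_replace(stream, memory, base):
    """Replace each byte by its table entry; the entry also picks the next table."""
    out = []
    for byte in stream:
        replacement, delta = memory[base + (byte & 0o77)]
        out.append(replacement)
        base = (base + delta) & ADDR_MASK  # the table base is the machine's state
    return out


memory = build_memory()
print(convert_and_replace([0, 5, SEP, 0, 0, 3], memory, LEADING))
# [48, 5, 11, 48, 48, 3]
```

The table base address acts as the state of the finite-state machine, and every transition is data in the table rather than a branch in the program.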

  9. The IBM 7950 (“Harvest”)
     • 1961; Jim Pomerene was chief architect
     • A “plug-in processor” for Stretch
     • Like GPUs, bigger than host
     • A pure data streaming machine
     • Delivered to NSA, ran decades
     • For byte-by-byte cryptanalysis

  10. Harvest Programming Model
     From Blaauw & Brooks, Computer Architecture, 1997

  11. What Can Such a Machine Do?
     • Flow bytes through multiple permutation tables
     • Sort, merge, collate like crazy
     • Count, incidence in memory
     • Detect low-probability sequences, by Bayes
     • Determine the language of a text, by Bayes
     • Convert Roman numerals to Arabic
     • Buchholz, ed., Planning a Computer System, 1962
     • Randomly create valid hymn tunes, by Markov
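Of the tasks listed, Roman-to-Arabic conversion is easy to sketch as table look-up with table change. The Python below is an illustration only, not the Harvest program or the method given in Buchholz: the table layout and the name `roman_to_arabic` are assumptions. The previous symbol selects which table is consulted for the current one, and the looked-up value is added to a running total.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
START = ""  # name of the table used before any symbol has been seen


def build_tables():
    """One table per previous symbol; each entry is the amount to add."""
    tables = {START: dict(VALUES)}
    for prev, prev_value in VALUES.items():
        table = {}
        for sym, value in VALUES.items():
            if prev_value >= value:
                table[sym] = value
            else:
                # prev was already added once but should have been subtracted
                table[sym] = value - 2 * prev_value
        tables[prev] = table
    return tables


TABLES = build_tables()


def roman_to_arabic(numeral):
    """Stream the symbols once; each look-up also selects the next table."""
    total = 0
    state = START
    for sym in numeral:
        total += TABLES[state][sym]
        state = sym  # table change: the current symbol names the next table
    return total


print(roman_to_arabic("MCMXCIV"))  # 1994
```

This sketch does not reject malformed numerals; doing so would need more states, but the streaming loop would keep the same shape.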

  12. Lessons and Ideas for GPUs
     • Real problems not uniform as model ones
     • Going back to host is a killer
     • Today, texture memory size cramps
     • Just wait
     • Adding TLU result to table address gives a very powerful capability
     • Two streams interacting >> one
     • Instance stream in the NV 40?
