Performing Advanced Bit Manipulations Efficiently in General-Purpose Processors

Yedidya Hilewitz and Ruby B. Lee, Princeton Architecture Lab for Multimedia and Security, Department of Electrical Engineering, Princeton University. 18th IEEE Symposium on Computer Arithmetic (ARITH-18).

Presentation Transcript


  1. Performing Advanced Bit Manipulations Efficiently in General-Purpose Processors
     Yedidya Hilewitz and Ruby B. Lee
     Princeton Architecture Lab for Multimedia and Security
     Department of Electrical Engineering, Princeton University
     18th IEEE Symposium on Computer Arithmetic (ARITH-18), Montpellier, France, June 25-27, 2007

  2. Background and Motivation
     • Advanced bit manipulations are not well supported by commodity microprocessors
     • These operations are performed using “programming tricks” (cf. Hacker’s Delight)
     • Bit manipulations play a role in applications of increasing importance
     • We propose a new shifter architecture that replaces the existing shifter with a unit that directly supports bit manipulation operations
     Yedidya Hilewitz and Ruby B. Lee Performing Advanced Bit Manipulations Efficiently

  3. Outline
     • Background and motivation
     • Advanced bit manipulation operations
     • Delineation and example usage
     • New shift-permute functional unit
     • Summary and conclusions

  4. Advanced Bit Manipulation Instructions
     • Bit Permutation
       • Butterfly (bfly) and Inverse Butterfly (ibfly)
     • Bit Gather and Bit Scatter
       • Parallel Extract and Parallel Deposit

  5. Any of the n! permutations of n bits can be done with one pass of bfly and ibfly instructions
     • bfly + ibfly = general permutation circuit
     • lg(n) stages of n 2:1 MUXes split into n/2 pairs that pass through or swap inputs
     • Figures: 8-bit Butterfly; 8-bit Inverse Butterfly
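As a rough illustration of the stage structure described on this slide, the following Python sketch models a butterfly and an inverse butterfly network in software. The function names, the list-based representation, and the ordering of control bits within a stage are assumptions made for this example, not details taken from the slides.

```python
def _apply_stage(values, dist, stage_controls):
    """Swap or pass each pair (i, i + dist); control 1 = swap, 0 = pass."""
    pair = 0
    for i in range(len(values)):
        if i & dist == 0:  # i is the lower index of a pair
            if stage_controls[pair]:
                values[i], values[i + dist] = values[i + dist], values[i]
            pair += 1


def butterfly(values, controls):
    """lg(n) stages pairing elements n/2, n/4, ..., 1 apart."""
    out = list(values)
    dist = len(out) // 2
    for stage_controls in controls:
        _apply_stage(out, dist, stage_controls)
        dist //= 2
    return out


def inverse_butterfly(values, controls):
    """The same stages in the opposite order: 1, 2, ..., n/2 apart."""
    out = list(values)
    dist = 1
    for stage_controls in controls:
        _apply_stage(out, dist, stage_controls)
        dist *= 2
    return out


# Three stages of four pairs each, every pair set to swap (assumed layout).
all_swap = [[1] * 4 for _ in range(3)]
print(butterfly(list("abcdefgh"), all_swap))
```

With every pair set to swap, the three stages move element i to position i XOR 7, so the 8-element input comes out in reverse order. With every control bit at 0 the network is the identity.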

  6. Bit Gather (Parallel Extract) and Bit Scatter (Parallel Deposit)
     • Parallel Extract
       • pex r1 = r2, r3
       • extracts the bits of r2 flagged by 1’s in r3, then compresses and right-justifies them in the result register
       • Parallel extract maps to the ibfly datapath
     • Parallel Deposit
       • pdep r1 = r2, r3
       • deposits the right-justified bits of r2 into the result register at the positions flagged by 1’s in r3
       • Parallel deposit maps to the bfly datapath
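The semantics of the two instructions can be sketched bit by bit in Python. This is a reference model for checking results, not the datapath mapping the slide refers to. The `width` parameter and the choice to leave unflagged result bits of `pdep` at zero are assumptions of this sketch.

```python
def pex(src, mask, width=64):
    """Parallel extract: gather the bits of src flagged by 1s in mask
    and pack them, in order, at the low end of the result."""
    result = 0
    out_pos = 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> i) & 1) << out_pos
            out_pos += 1
    return result


def pdep(src, mask, width=64):
    """Parallel deposit: scatter the low-order bits of src to the
    positions flagged by 1s in mask (other bits assumed to be 0)."""
    result = 0
    in_pos = 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> in_pos) & 1) << i
            in_pos += 1
    return result


print(bin(pex(0b10110100, 0b11110000, 8)))   # 0b1011
print(bin(pdep(0b1011, 0b11110000, 8)))      # 0b10110000
```

Under these assumptions the two operations are inverses on the flagged positions: `pdep(pex(x, m), m)` equals `x & m`.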

  7. Example Usage: Bioinformatics - DNA Sequence Reversal
     • DNA bases A, C, G and T are represented by two-bit codes
     • Reversing a DNA sequence is equivalent to reversing the order of the bit pairs
     • This is a bfly or ibfly permutation
     • 1 ibfly instruction is equivalent to 11-23 ALU and shifter instructions
       • 2×(and, and, shift, shift, or) + byte reverse instruction, at minimum
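The 11-operation software sequence named on this slide can be sketched as follows. The 32-bit word size, the encoding A=0, C=1, G=2, T=3, the packing order, and the helper names `pack` and `reverse_bases` are all assumptions for illustration. The byte reversal is modeled with Python's `int.to_bytes` and `int.from_bytes` in place of a single byte-reverse instruction.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed two-bit encoding


def pack(seq):
    """Pack 16 bases into a 32-bit word, base i in bits 2i and 2i+1."""
    word = 0
    for i, base in enumerate(seq):
        word |= CODES[base] << (2 * i)
    return word


def reverse_bases(x):
    """Reverse the order of the sixteen 2-bit codes in a 32-bit word."""
    # Swap adjacent bit pairs inside every nibble (and, and, shift, shift, or).
    x = ((x & 0xCCCCCCCC) >> 2) | ((x & 0x33333333) << 2)
    # Swap the two nibbles inside every byte (and, and, shift, shift, or).
    x = ((x & 0xF0F0F0F0) >> 4) | ((x & 0x0F0F0F0F) << 4)
    # Reverse the four bytes; stands in for one byte-reverse instruction.
    return int.from_bytes(x.to_bytes(4, "little"), "big")


seq = "ACGTACGTAACCGGTT"
print(reverse_bases(pack(seq)) == pack(seq[::-1]))  # True
```

Each step flips one or two bits of the 4-bit pair index, so pair i ends up at position 15 - i. A single ibfly instruction would apply the same permutation in one pass.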

  8. Advanced Bit Manipulation Functional Unit
     • We propose adding a new functional unit to directly perform advanced bit manipulations
     • To minimize the cost, we intend for this new functional unit to replace the shifter unit
     • Shifter currently performs basic bit manipulation operations
     • Our new functional unit represents an evolution of shifter designs

  9. Basic Bit Manipulation Operations
     • shift r1 = r2, s
     • extract r1 = r2, pos, len
     • mix r1 = r2, r3
     • rotate r1 = r2, s
     • deposit r1 = r2, pos, len

  10. Parallel Extract and Parallel Deposit
     • Parallel Extract
     • Parallel Deposit

  11. Evolution of Shifter Designs
     • Barrel Shifter
     • Log Shifter
     • Our proposed design

  12. New Shifter Design
     • An inverse butterfly (or butterfly) circuit enhanced with an extra multiplexer stage is the basis of the new shifter design
     • We will show that either the butterfly or the inverse butterfly circuit can perform rotations on its own
     • Rotation is the basic operation underlying shift, extract, deposit and mix
     • Model the other basic bit manipulation operations as rotate plus one of:
       • zeroing
       • sign bit propagation
       • merging
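To make the "rotate plus zeroing, sign bit propagation, or merging" model concrete, here is a minimal Python sketch on 32-bit words. The word size, the function names, and the exact operand conventions for `extract` and `deposit` (including the separate `dst` operand that `deposit` merges into) are assumptions; the slides give only the instruction forms.

```python
W = 32
MASK = (1 << W) - 1


def rotr(x, s):
    s %= W
    return ((x >> s) | (x << (W - s))) & MASK


def rotl(x, s):
    return rotr(x, W - (s % W))


def shr(x, s):
    """Logical right shift = rotate right + zero the top s bits."""
    return rotr(x, s) & (MASK >> s)


def shl(x, s):
    """Left shift = rotate left + zero the bottom s bits."""
    return rotl(x, s) & (MASK << s) & MASK


def sar(x, s):
    """Arithmetic right shift = rotate right + sign bit propagation."""
    keep = MASK >> s
    fill = MASK & ~keep if (x >> (W - 1)) & 1 else 0
    return (rotr(x, s) & keep) | fill


def extract(x, pos, length):
    """Unsigned extract = rotate the field to bit 0 + zero the rest."""
    return rotr(x, pos) & ((1 << length) - 1)


def deposit(dst, src, pos, length):
    """Deposit = rotate the field into place + merge with dst."""
    field = ((1 << length) - 1) << pos
    return (rotl(src, pos) & field) | (dst & ~field & MASK)


print(hex(extract(0xABCD1234, 8, 8)))          # 0x12
print(hex(deposit(0xFFFFFFFF, 0x5, 8, 4)))     # 0xfffff5ff
```

In each function the rotation does the data movement and a mask decides which result bits come from the rotated word, from zeros, from the sign bit, or from a second operand. That selection is the role the slide assigns to the extra multiplexer stage.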

  13. New Shift-Permute Functional Unit Implementation

  14. Configuring Inverse Butterfly for Rotations
     • Hard problem: generating the control bits for rotations on the inverse butterfly circuit
     • We derive an expression for the control bits as a recursive function of the shift amount, s, and the stage number, j

  15. Example: Right Rotation by 5 on 8-bit Inverse Butterfly Circuit
     • After each stage, the input is right rotated by 5 modulo the subcircuit width within each subcircuit

  16. Example: Right Rotation by 5 on 8-bit Inverse Butterfly Circuit
     • After stage 1, input is right rotated by 5 (mod 2) = 1 within each 2-bit subcircuit

  17. Example: Right Rotation by 5 on 8-bit Inverse Butterfly Circuit
     • After stage 2, input is right rotated by 5 (mod 4) = 1 within each 4-bit subcircuit
     • Bits that wrapped at output of previous stage are swapped

  18. Example: Right Rotation by 5 on 8-bit Inverse Butterfly Circuit
     • After stage 2, input is right rotated by 5 (mod 4) = 1 within each 4-bit subcircuit
     • Bits that wrapped at output of previous stage are swapped

  19. Example: Right Rotation by 5 on 8-bit Inverse Butterfly Circuit
     • After stage 3, input is right rotated by 5
     • Bits that wrapped at output of previous stage are passed through

  20. Rotations in general on n-bit Inverse Butterfly Circuit
     • shift amount, s < n/2 → swap bits that wrapped
     • shift amount, s ≥ n/2 → pass through bits that wrapped
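The rule on this slide, applied stage by stage to subcircuits of width 2, 4, ..., n with the shift amount reduced modulo the subcircuit width, can be checked with a small software model. The sketch below is a direct formula written for this example, not the recursive expression or the generator circuit from the talk. The function names, the bit-list representation (index 0 = least significant bit), and the control-bit ordering are assumptions.

```python
def rotation_controls(n, s):
    """Control bits (1 = swap, 0 = pass) that make an n-bit inverse
    butterfly rotate right by s; one list of n/2 bits per stage."""
    controls = []
    half = 1                     # stage j pairs bits 2**(j-1) apart
    while half < n:
        t = s % half             # rotation already done inside each half
        upper = (s // half) & 1  # 1 when s mod (2*half) >= half
        # Position i of a half holds a wrapped bit when i + t >= half.
        # Swap wrapped bits when upper == 0, the other bits when upper == 1.
        block = [int((i + t >= half) != bool(upper)) for i in range(half)]
        controls.append(block * (n // (2 * half)))
        half *= 2
    return controls


def inverse_butterfly(bits, controls):
    """Apply stages pairing positions 1, 2, 4, ... apart."""
    out = list(bits)
    half = 1
    for stage in controls:
        pair = 0
        for i in range(len(out)):
            if i & half == 0:    # i is the lower index of a pair
                if stage[pair]:
                    out[i], out[i + half] = out[i + half], out[i]
                pair += 1
        half *= 2
    return out


print(rotation_controls(8, 5))
print(inverse_butterfly(list(range(8)), rotation_controls(8, 5)))
```

For n = 8 and s = 5 the model yields the control bits [1, 1, 1, 1], [0, 1, 0, 1] and [1, 1, 1, 0], and the labeled input 0..7 comes out as [5, 6, 7, 0, 1, 2, 3, 4]. This agrees with the worked example: wrapped bits are swapped in stage 2 (5 mod 4 = 1 < 2) and passed through in stage 3 (5 ≥ 4).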

  21. Circuit Implementation of Rotation Control Bit Generator

  22. Comparison to Barrel and Log Shifters

  23. Summary and Conclusions
     • We proposed evolving the shifter to a new design using butterfly and inverse butterfly datapaths
     • New shifter subsumes basic shifter, multimedia shift-permute unit and advanced bit manipulation unit
     • We have shown how to perform basic shifter operations on these datapaths
       • Rotation control bit generator
       • Extra multiplexer stage for masking and merging
     • Use of the new shifter design in future microprocessor implementations allows for increased capabilities at only marginal cost
