1 / 28

Reconfigurable Computing (EN2911X, Fall07) Lab 2 presentations

Reconfigurable Computing (EN2911X, Fall07) Lab 2 presentations. Prof. Sherief Reda Division of Engineering, Brown University http://ic.engin.brown.edu. 4.2 seconds 14 seconds 33 seconds 300 seconds 305 seconds 320 seconds. Runtimes by different teams. Cesare Ferri Rotor Le.

ezhno
Download Presentation

Reconfigurable Computing (EN2911X, Fall07) Lab 2 presentations

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Reconfigurable Computing (EN2911X, Fall07) Lab 2 presentations Prof. Sherief Reda Division of Engineering, Brown University http://ic.engin.brown.edu

  2. 4.2 seconds 14 seconds 33 seconds 300 seconds 305 seconds 320 seconds Runtimes by different teams

  3. Cesare Ferri Rotor Le Palindrome Checker

  4. Part I : Verilog Module DECOMPOSE THE NUMBER IN DIGITS Room for loop unrolling here.. always @(posedge CLOCK_50) begin not_palindrome = 1'd0;len = 0; tmp = number; //reset for (i = 0; i<9 ; i = i + 4'd1) begin if (tmp > 0) begin modulo = tmp % 4'd10; tmp = tmp / 10; vector[len % 9] = modulo; len = len + 1; end end th = (len >> 1) ; for (j=0; j<th; j = j + 4'd1) begin tmp2 = (len-1) - j; tmp3 = vector[j];tmp4 = vector[tmp2]; if ( tmp3 != tmp4 ) not_palindrome = 1'b1; end result = ~(not_palidrome); end

  5. Part I : Verilog Module COMPARE THE DIGITS STORED INTO THE VECTOR loop unrolling, again.. always @(posedge CLOCK_50) begin not_palindrome = 1'd0;len = 0; tmp = number; //reset for (i = 0; i<9 ; i = i + 4'd1) begin if (tmp > 0) begin modulo = tmp % 4'd10; tmp = tmp / 10; vector[len % 9] = modulo; len = len + 1; end end th = (len >> 1) ; for (j=0; j<th; j = j + 4'd1) begin tmp2 = (len-1) - j; tmp3 = vector[j];tmp4 = vector[tmp2]; if ( tmp3 != tmp4 ) not_palindrome = 1'b1; end result = ~(not_palidrome); end

  6. Optimized Verilog Code Do loop unrolling to compare digits: if (digits[0] == digits[3] && digits[1] == digits[2]) not_palindrome = 1'd1;//reset

  7. Unsolved things • Our running time now depends on the way that we extract digits from the number • Some ideas to improve? • Using shift register • Using non-blocking instructions

  8. Palindrome Homework Summary ENGN2911X Aaron Mandle Bryant Mairs

  9. Setup Two-cycle fixed length custom instruction Operates on 20 numbers at a time Returns total palindromes in that 20-number block

  10. Process Combinatorial conversion from binary to BCD Check number of digits Compare digits based on length Total up number of valid palindromes

  11. Binary to BCD Conversion • Built using blocks of conditional add-3 modules and shifts • Add-3 modules: • 4-bit input • Adds 3 if input was 5 or greater • Based on adding 6 numbers > 9

  12. module checkPalindrome(data, result); input [31:0] data; output [31:0] result; wire [3:0] digits [10:0]; wire [3:0] digCount; bin2bcd({digits[9], digits[8], digits[7], digits[6], digits[5], digits[4], digits[3], digits[2], digits[1], digits[0]}, data); assign digCount = digits[9] != 0?10: digits[8] != 0?9: digits[7] != 0?8: digits[6] != 0?7: digits[5] != 0?6: digits[4] != 0?5: digits[3] != 0?4: digits[2] != 0?3: digits[1] != 0?2: 1; assign result = digCount == 1 || digCount == 2 && (digits[0] == digits[1]) || digCount == 3 && (digits[0] == digits[2]) || digCount == 4 && (digits[0] == digits[3] && digits[1] == digits[2]) || digCount == 5 && (digits[0] == digits[4] && digits[1] == digits[3]) || digCount == 6 && (digits[0] == digits[5] && digits[1] == digits[4] && digits[2] == digits[3]) || digCount == 7 && (digits[0] == digits[6] && digits[1] == digits[5] && digits[2] == digits[4]) || digCount == 8 && (digits[0] == digits[7] && digits[1] == digits[6] && digits[2] == digits[5] && digits[3] == digits[4]) || digCount == 9 && (digits[0] == digits[8] && digits[1] == digits[7] && digits[2] == digits[6] && digits[3] == digits[5]); endmodule

  13. Yossi

  14. Finding the length of the decimal representation (# digits) by: typedef unsigned long UINT; inline UINT GetMSDFIndx(UINT n) { return (n >= 100000000 ? 8 : (n >= 10000000 ? 7 : (n >= 1000000 ? 6 : (n >= 100000 ? 5 : (n >= 10000 ? 4 : (n >= 1000 ? 3 : (n >= 100 ? 2 : (n >= 10 ? 1 : 0)))))))); } For all solutions

  15. Times: On laptop (Intel 2333 MHz): 8 secs. On NIOS (100 MHz): 3500 secs. Inherently sequential Early false detection: quit the computation if we find two digits that do not match.  Brings down expected # divide operations to less than 2.2 Software Only Solutions

  16. Observations: 1. Detect whether the MSD is a given number without division MSD test: d is the MSD of number n of length L if and only if d*10L-1 ≤ n < (d+1)* 10L-1E.g 4*103 <= 4765 < 5*103 2. “Cut out” the MSD: 4665 – 4*103 = 665 and continue. Algorithm: find one LSD after another, compare with MSDs, quit early if not a palindrome. Runs in 8 seconds on laptop Software Only Solutions

  17. On NIOS, division is really expensive Division free algorithm:Don’t test the MSD, find it with binary search Software Only Solutions

  18. On NIOS, division is really expensive Algorithm: Start from left Find half of the digits Compute the palindrome whose left half matches these digits Compare to the tested number Loose the early false detection, but still better than division. Runs in 3500 secs on NIOS 100 MHz. Software Only Solutions

  19. A general trick to divide by a constant without using division.Based on trick I read in “Hackers Delight” of how to divide by 3. Demonstrate on divide by 10: Given: number n < 230 Needed: floor(n/10) Algorithm:Multiply n by (231+2)/10 = 0xCCCCCCD, and then shift right 31 positions. Using the Hardware

  20. Algorithm:Multiply n<230 by (231+2)/10 = 0xCCCCCCD, and then shift right 31 positions. Proof: The above algorithm outputs: floor(n/10) <= n/10 + 2*n/(10*231) < floor(n/10) + 1 < 1/10 Division Free divide by 10 floor[ n * ((231+2)/10) * 1/231 ] = floor [ n/10 + 2*n/(10*231) ] = floor(n/10) n < 230 implies: 2*n < 231 floor(n/10) <= n/10 <= floor(n/10) + 9/10

  21. Similarly, to divide n by a constant C, we need to findP and R such that: 2P + R = 0 mod C. R*n < 2P And then multiply n by (2P + R)/C, and shiftright P positions. Found the constants to all powers of 10 needed.Algorithm worst register to register delay: 25 ns. Run Time: 33 secs. Divide by Constant

  22. EN2911X Lab 2: Palindromes Brian Reggiannini and Chris Erway

  23. Checking a palindrome All combinational logic! Step 1: Convert 30-bit integer to 37-bit binary-coded decimal (BCD) format Step 2: Detect the length of decimal number Step 3: Compare pairs of digits with XOR

  24. Binary to BCD converter

  25. Binary to BCD converter

  26. Integration with Nios II Worst-case propagation delay: 43ns, 5 cycles Don’t want to wait! Use 32-bit PIO interface Array of 25 palindrome-checking units Write out 32-bit start value… Read back # of total palindromes found (from next 25) While Nios is waiting: increment loop counter

  27. Nios Software

  28. Results Original C program: 49.59s/billion Unoptimized Nios C program: 7842s/100million Final result: 4.2s/billion (420000036 cycles @ 100MHz) Total logic elements: 23,039 / 33,216 (69%)

More Related