ECE 448: Lab 6 Using PicoBlaze Fast Sorting
Agenda for today
Part 1: Introduction to Lab 6
Part 2: Instruction Set of PicoBlaze-6
Part 3: Hands-on Session: FIDEx IDE
Part 4: Lab 6 Exercise 1
Part 5: Lab 5 Demos
Part 1 Introduction to Lab 6 ECE 448 – FPGA and ASIC Design with VHDL
Sources
• P. Chu, FPGA Prototyping by VHDL Examples
  • Chapter 14, PicoBlaze Overview
  • Chapter 15, PicoBlaze Assembly Code Development
  • Chapter 16, PicoBlaze I/O Interface
  • Chapter 17, PicoBlaze Interrupt Interface
• K. Chapman, PicoBlaze for Spartan-6, Virtex-6, 7-series, Zynq and UltraScale Devices (KCPSM6)
[Block diagram: PICOBLAZE with INSTRUCTION RAM (address, instruction), DATA RAM (A[8..0], DI[7..0], DO[7..0], RAM_wen), INPUT_INTERFACE (BUTTON, SWITCH, PRNG_STATUS), PRNG (rinit, R_wen, RA[7..0], RD[7..0]), CYCLE COUNTER and OUTPUT_INTERFACE (SSD3, SSD2, SSD1, SSD0 driving four 7-segment displays, LED, CCOUNT, PRNG_CTRL, MEM_BANK). PicoBlaze signals: port_id, in_port, out_port, read_strobe, write_strobe, interrupt, interrupt_ack. ADDR_DECODER outputs: SSD3_en, SSD2_en, SSD1_en, SSD0_en, CCOUNT_en, LED_en, MEM_BANK_en, PRNG_CTRL_en, RAM_wen. MEM_BANK provides A[8]; Switch S7 goes to the Cycle Counter display logic.]
Data RAM addresses (hex): 000 001 002 . . . 0FE 0FF and 100 101 102 . . . 1FE 1FF (255 x 8 DATA RAM)

MEM_BANK (bits 7..0): A8
A8 – current memory bank number = the most significant bit of the address

BUTTON (bits 7..0): A, BS, BR, BL, BU, BD
A – button active (bit cleared by reading register BUTTON or by interrupt_ack)
BS – Select, BR – Right, BL – Left, BU – Up, BD – Down

PRNG_STATUS (bits 7..0): D
D – done: bit cleared by writing to register PRNG_CTRL, set after PRNG generates 255 8-bit numbers

PRNG_CTRL (bits 7..0): I
I – initialize: after 1 is written to this bit, PRNG generates 255 8-bit numbers, and the corresponding address (index) of each number

I/O registers: MEM_BANK, BUTTON, SSD3, SSD2, SSD1, SSD0, LED, PRNG_STATUS, PRNG_CTRL, SWITCH, CCOUNT
SWITCH (bits 7..0): S7 S6 S5 S4 S3 S2 S1 S0
S7-S0 – bits corresponding to the state of each switch

CCOUNT (bits 7..0): S, R, D
R – reset the 64-bit Cycle Counter, and start counting clock cycles
S – stop the Cycle Counter
D – display the Cycle Counter (Switch S7 chooses between displaying Least Significant and Most Significant Word)

LED (bits 7..0): L7 L6 L5 L4 L3 L2 L1 L0
L7-L0 – bits corresponding to the status of each LED
Task 1 – Browsing Mode (default mode)
255x8 RAM
Address: 00 01 02 03 04 05 .... FA FB FC FD FE
Data:    00 01 02 03 04 05 .... FA FB FC FD FE
SSD3-SSD2 (two 7-segment displays, hexadecimal notation): Current Address
SSD1-SSD0 (two 7-segment displays, hexadecimal notation): Value at Current Address
Button Up = Increment Address
Button Down = Decrement Address
Task 2 – Initialize
255x8 RAM
Address: 00 01 02 03 04 05 .... FA FB FC FD FE
Data:    25 87 94 26 B5 C6 .... 7A 5B 34 43 89
Button Left = Initialize with Pseudorandom Values
Then, return to the browsing mode
8-bit LCG (Linear Congruential Generator) with the period of $2^8 - 1$
$R_{n+1} = (a \cdot R_n + c) \bmod m$
where R is the sequence of pseudorandom values, a is the multiplier, c is the increment and m is the modulus. $R_0$ is the initial seed value.
The LCG generates one output per clock cycle.
Task 3 – Sorting
Sorting signed numbers in the descending order
255x8 RAM
Address: 00 01 02 03 04 05 .... FA FB FC FD FE
Data:    7F 67 53 44 38 2D .... B1 AA 91 80
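The required ordering can be checked against a small software model. The following Python sketch is only a reference model, not the PicoBlaze implementation the lab asks for; the helper names `to_signed` and `sort_signed_descending` are made up for this illustration. It treats each byte as an 8-bit two's-complement number, so 7F (+127) comes first and 80 (-128) comes last.

```python
def to_signed(byte):
    """Interpret an 8-bit value (0..255) as a two's-complement number."""
    return byte - 256 if byte & 0x80 else byte


def sort_signed_descending(data):
    """Return the bytes ordered from the largest signed value to the smallest."""
    return sorted(data, key=to_signed, reverse=True)


if __name__ == "__main__":
    sample = [0x80, 0x2D, 0xB1, 0x7F, 0xAA, 0x44]
    print(" ".join(f"{b:02X}" for b in sort_signed_descending(sample)))
    # prints: 7F 44 2D B1 AA 80
```

Sorting by `byte ^ 0x80` as an unsigned key gives the same order, which is one possible way to reuse an unsigned comparison for signed data.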
Task 4 – Cycle Count Display Mode
During Sorting display: "----" on the Seven Segment Displays.
After Sorting display: Number of clock cycles used (in the hexadecimal notation)
#Cycles[15..0] – 16 least significant bits
#Cycles[31..16] – 16 most significant bits
Switch between these two values using switch S7
S7=0 : 16 least significant bits
S7=1 : 16 most significant bits
Pressing any button (other than Select) after sorting brings the display back to the browsing mode.
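The word selection by switch S7 can be sketched as follows. This is a Python model under the assumption that the four displays show one 16-bit word as four hexadecimal digits; the function name `cycle_display` is hypothetical.

```python
def cycle_display(cycles, s7):
    """Return the four hex digits shown after sorting.

    cycles: clock-cycle count (only bits 31..0 are displayed)
    s7:     state of switch S7 (0 = least significant word, 1 = most significant word)
    """
    word = (cycles >> 16) & 0xFFFF if s7 else cycles & 0xFFFF
    return f"{word:04X}"


if __name__ == "__main__":
    print(cycle_display(0x0001A2B3, 0))  # prints: A2B3
    print(cycle_display(0x0001A2B3, 1))  # prints: 0001
```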
Task 5 (Bonus) – Interrupts
• Modify your circuit in such a way that it generates an interrupt each time any button is pressed
• Modify your assembly language program accordingly, by replacing polling with an interrupt service routine
• Consider using Register Bank switching in your interrupt service routine (if appropriate)
Contest for the Fastest Implementation of Sorting
• Bonus points will be awarded to students who perform sorting (correctly) using the smallest number of clock cycles.
• Possible optimizations:
  • Faster sorting algorithms in software
  • Efficient assembly language implementation
  • Faster sorting algorithms in hardware
  • Efficient hardware implementation
Part 2 Instruction Set of PicoBlaze-6
PicoBlaze-3 Programming Model
PicoBlaze-6 Programming Model
[Figure: programming model diagram showing register Bank A and Bank B, and addresses FFC, FFD, FFE, FFF]
Syntax and Terminology
Syntax        Example       Definition
sX            s7            Value at register 7
KK            ab            Value ab (in hex)
PORT(KK)      PORT(2)       Input value from port 2
PORT((sX))    PORT((sa))    Input value from port specified by register a
RAM(KK)       RAM(4)        Value from RAM location 4
Addressing modes
Immediate mode
  SUB s7, 07        s7 – 07 => s7
  ADDCY s2, 08      s2 + 08 + C => s2
Direct mode
  ADD sa, sf        sa + sf => sa
  INPUT s5, 2a      PORT(2a) => s5
Indirect mode
  STORE s3, (sa)    s3 => RAM((sa))
  INPUT s9, (s2)    PORT((s2)) => s9
Arithmetic Instructions (1)
Addition (addressing modes: IMM, DIR; flags: C, Z)
  ADD sX, sY        sX + sY => sX
  ADD sX, KK        sX + KK => sX
  ADDCY sX, sY      sX + sY + CARRY => sX
  ADDCY sX, KK      sX + KK + CARRY => sX
Arithmetic Instructions (2)
Subtraction (addressing modes: IMM, DIR; flags: C, Z)
  SUB sX, sY        sX – sY => sX
  SUB sX, KK        sX – KK => sX
  SUBCY sX, sY      sX – sY – CARRY => sX
  SUBCY sX, KK      sX – KK – CARRY => sX
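ADDCY exists so that values wider than 8 bits can be added one byte at a time. The Python sketch below models a 16-bit addition as an ADD on the low bytes followed by an ADDCY on the high bytes; `add8` and `add16` are illustrative names, not PicoBlaze or FIDEx functions. SUB followed by SUBCY plays the same role for subtraction, with CARRY acting as a borrow.

```python
def add8(x, y, carry_in=0):
    """Model of ADD (carry_in=0) or ADDCY (carry_in=C) on 8-bit registers.

    Returns (8-bit result, carry flag).
    """
    total = x + y + carry_in
    return total & 0xFF, total >> 8


def add16(a, b):
    """Add two 16-bit values using one ADD and one ADDCY."""
    lo, carry = add8(a & 0xFF, b & 0xFF)        # ADD   low bytes
    hi, carry = add8(a >> 8, b >> 8, carry)     # ADDCY high bytes
    return (hi << 8) | lo, carry


if __name__ == "__main__":
    result, carry = add16(0x12FF, 0x0001)
    print(f"{result:04X} C={carry}")  # prints: 1300 C=0
```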
Test and Compare Instructions
TEST (addressing modes: IMM, DIR; flags: C, Z; C = odd parity of the result)
  TEST sX, sY       sX and sY => none
  TEST sX, KK       sX and KK => none
COMPARE (addressing modes: IMM, DIR; flags: C, Z)
  COMPARE sX, sY    sX – sY => none
  COMPARE sX, KK    sX – KK => none
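Because TEST and COMPARE discard their result, only the flags matter. The following Python model sketches how Z and C end up after each instruction. The function names are invented for this example. The COMPARE carry rule (C set when sX is smaller than the operand, treating both as unsigned) is stated here as an assumption drawn from the KCPSM6 documentation, since the table above only lists the subtraction.

```python
def flags_after_test(sx, operand):
    """Model of TEST: bitwise AND, result discarded. Returns (Z, C)."""
    result = sx & operand
    z = int(result == 0)                 # Z = 1 when no tested bit is set
    c = bin(result).count("1") & 1       # C = 1 when the result has odd parity
    return z, c


def flags_after_compare(sx, operand):
    """Model of COMPARE: unsigned subtraction, result discarded. Returns (Z, C)."""
    z = int(sx == operand)               # Z = 1 when the operands are equal
    c = int(sx < operand)                # C = 1 when a borrow occurs (sx < operand)
    return z, c


if __name__ == "__main__":
    print(flags_after_test(0b1010, 0b0110))   # prints: (0, 1)
    print(flags_after_compare(0x03, 0x07))    # prints: (0, 1)
```

In a sorting loop, JUMP C after COMPARE then branches when the first operand is the smaller one.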
Data Movement Instructions (1)
LOAD (addressing modes: IMM, DIR; flags C, Z: not affected)
  LOAD sX, sY       sY => sX
  LOAD sX, KK       KK => sX
Data Movement Instructions (2)
STORE (addressing modes: DIR, IND; flags C, Z: not affected)
  STORE sX, KK      sX => RAM(KK)
  STORE sX, (sY)    sX => RAM((sY))
FETCH (addressing modes: DIR, IND; flags C, Z: not affected)
  FETCH sX, KK      RAM(KK) => sX
  FETCH sX, (sY)    RAM((sY)) => sX
Example 1: Clear Data RAM
;=========================================================
; routine: clr_data_mem
; function: clear data ram
; temp registers: s0 (data), s2 (loop index)
;=========================================================
clr_data_mem:
    load s2, 40              ; initialize loop index to 64
    load s0, 00
clr_mem_loop:
    store s0, (s2)
    sub s2, 01               ; dec loop index
    jump nz, clr_mem_loop    ; repeat until s2=0
    return
Data Movement Instructions (3)
INPUT (addressing modes: DIR, IND; flags C, Z: not affected)
  INPUT sX, KK      sX <= PORT(KK)
  INPUT sX, (sY)    sX <= PORT((sY))
OUTPUT (addressing modes: DIR, IND; flags C, Z: not affected)
  OUTPUT sX, KK     PORT(KK) <= sX
  OUTPUT sX, (sY)   PORT((sY)) <= sX
Edit instructions - Shifts *All shift instructions affect Zero and Carry flags
Edit instructions - Rotations *All rotate instructions affect Zero and Carry flags
Program Flow Control Instructions (1)
JUMP AAA            PC <= AAA
JUMP C, AAA         if C=1 then PC <= AAA else PC <= PC + 1
JUMP NC, AAA        if C=0 then PC <= AAA else PC <= PC + 1
JUMP Z, AAA         if Z=1 then PC <= AAA else PC <= PC + 1
JUMP NZ, AAA        if Z=0 then PC <= AAA else PC <= PC + 1
Program Flow Control Instructions (2)
CALL AAA              TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA
CALL C | Z, AAA       if C | Z = 1 then TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA else PC <= PC + 1
CALL NC | NZ, AAA     if C | Z = 0 then TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA else PC <= PC + 1
Program Flow Control Instructions (3)
RETURN                PC <= STACK[TOS] + 1; TOS <= TOS - 1
RETURN C | Z          if C | Z = 1 then PC <= STACK[TOS] + 1; TOS <= TOS - 1 else PC <= PC + 1
RETURN NC | NZ        if C | Z = 0 then PC <= STACK[TOS] + 1; TOS <= TOS - 1 else PC <= PC + 1
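The CALL and RETURN transfers above can be traced with a small model. This Python sketch covers only the unconditional forms and uses a plain list as the stack; the class name `ProgramFlow` is hypothetical, and the hardware stack depth limit is deliberately not modeled.

```python
class ProgramFlow:
    """Minimal model of the PC and call stack for unconditional CALL/RETURN."""

    def __init__(self, pc=0):
        self.pc = pc
        self.stack = []

    def call(self, target):
        """CALL AAA: push the address of the CALL itself, then jump."""
        self.stack.append(self.pc)      # TOS <= TOS+1; STACK[TOS] <= PC
        self.pc = target                # PC <= AAA

    def ret(self):
        """RETURN: resume at the instruction after the matching CALL."""
        self.pc = self.stack.pop() + 1  # PC <= STACK[TOS] + 1; TOS <= TOS - 1


if __name__ == "__main__":
    flow = ProgramFlow(pc=0x010)
    flow.call(0x100)     # main program calls a routine at 100
    flow.call(0x200)     # which calls a nested routine at 200
    flow.ret()
    print(f"{flow.pc:03X}")  # prints: 101
    flow.ret()
    print(f"{flow.pc:03X}")  # prints: 011
```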
Part 3 Hands-on Session: FIDEx IDE
KCPSM6 Assembler (book, Xilinx download) KCPSM6.EXE
Differences between Mnemonics of Instructions (book) FIDEx IDE
Differences between Mnemonics of Instructions
Numeric Formats (FIDEx IDE)
Hexadecimal: 0x3A
Binary: B00111010 or B00111010
Octal: \072
Decimal: 58
Assembler Directives (FIDEx IDE)
#EQU yourConstant, 0x3A    ; defines your constant
#EQU yourRegName, s0       ; renames a PicoBlaze register
#ORG ADDR, n               ; sets the memory address of the following instruction to n
#DEFINE case0              ; chooses among multiple program variants
#IFDEF case0
.....
#ELSEIF case1 | case2
.....
#ENDIF
Example & Demo of Tools
Part 4 Lab 6 Exercise 1
Linear Congruential Generator (LCG)
• Develop an assembly language implementation of a Linear Congruential Generator (LCG) producing a sequence of 8-bit pseudo-random numbers.
• Use FIDEx IDE to debug and simulate your program.
• Recurrence relation
  • $R_{n+1} = (a \cdot R_n + c) \bmod m$, where
  • $m = 2^8$
  • a = 0x11; c = 0x9D; $R_0$ = 0xD7
• Additionally, assume that * represents an unsigned multiplication
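A reference model is handy for checking the values the assembly program produces in the simulator. The Python sketch below implements the recurrence with the parameters listed above; the generator name `lcg` is made up, and it is an assumption of this sketch that the first value yielded is R1 rather than the seed R0.

```python
from itertools import islice


def lcg(a=0x11, c=0x9D, seed=0xD7, m=2**8):
    """Yield R1, R2, ... of R(n+1) = (a * R(n) + c) mod m, starting from R0 = seed."""
    r = seed
    while True:
        r = (a * r + c) % m
        yield r


if __name__ == "__main__":
    first = list(islice(lcg(), 3))
    print(" ".join(f"{r:02X}" for r in first))  # prints: E4 C1 6E
```

Comparing a few of these values against the register contents after each pass of the loop is a quick way to confirm the multiply and add steps.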
Notation
a   Multiplicand       $a_{k-1} a_{k-2} \dots a_1 a_0$
x   Multiplier         $x_{k-1} x_{k-2} \dots x_1 x_0$
p   Product (a * x)    $p_{2k-1} p_{2k-2} \dots p_2 p_1 p_0$
Multiplication of two 4-bit unsigned binary numbers
[Figure: the product formed as the sum of Partial Product 0, Partial Product 1, Partial Product 2 and Partial Product 3]
Unsigned Multiplication – Basic Equations
$x = \sum_{i=0}^{k-1} x_i \cdot 2^i$
$p = a \cdot x = \sum_{i=0}^{k-1} a \cdot x_i \cdot 2^i = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_{k-1} a 2^{k-1}$
Iterative Algorithm for Unsigned Multiplication
Shift/Add Algorithm
$p = a \cdot x = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_{k-1} a 2^{k-1}$
$\;\;= (\dots((0 + x_0 a 2^k)/2 + x_1 a 2^k)/2 + \dots + x_{k-1} a 2^k)/2$ (division by 2 applied k times)
$p^{(0)} = 0$
$p^{(j+1)} = (p^{(j)} + x_j a 2^k) / 2$ for $j = 0 \dots k-1$
$p = p^{(k)}$
Iterative Algorithm for Unsigned Multiplication (k = 8)
Shift/Add Algorithm
$p = a \cdot x = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_7 a 2^7$
$\;\;= (\dots((0 + x_0 a 2^8)/2 + x_1 a 2^8)/2 + \dots + x_7 a 2^8)/2$ (division by 2 applied 8 times)
$p^{(0)} = 0$
$p^{(j+1)} = (p^{(j)} + x_j a 2^8) / 2$ for $j = 0 \dots 7$
$p = p^{(8)}$
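The recurrence can be verified numerically before it is mapped onto registers. The following Python sketch evaluates p(j+1) = (p(j) + x_j a 2^k) / 2 directly with integers; the function name `shift_add_multiply` is an illustrative choice. The division is always exact, because every term in the partial sum still carries at least one factor of 2 at that step.

```python
def shift_add_multiply(a, x, k=8):
    """Multiply two k-bit unsigned numbers with the shift/add recurrence."""
    p = 0                                   # p(0) = 0
    for j in range(k):
        x_j = (x >> j) & 1                  # bit j of the multiplier
        p = (p + x_j * a * 2**k) // 2       # p(j+1) = (p(j) + x_j * a * 2^k) / 2
    return p                                # p = p(k)


if __name__ == "__main__":
    print(hex(shift_add_multiply(0x11, 0xD7)))  # prints: 0xe47
```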
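Unsigned Multiplication Computations
[Figure: the 16-bit partial product p(j) held as two 8-bit halves pH and pL; x_j a 2^8 is added to pH, producing carry C, and the value C:pH:pL is shifted right by 1 to give p(j+1)]
PicoBlaze Registers: a = s3, x = s4, pH = s5, pL = s6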
Unsigned Multiplication Subroutine (1)
;=========================================================
; routine: mult_soft
; function: 8-bit unsigned multiplier using
;           shift-and-add algorithm
; input register:
;    s3: multiplicand
;    s4: multiplier
; output register:
;    s5: upper byte of product
;    s6: lower byte of product
; temporary register:
;    s2: index j
;=========================================================
Unsigned Multiplication Subroutine (2)
mult_soft:
    load s5, 00              ; clear pH
    load s2, 08              ; initialize loop index
mult_loop:
    sr0 s4                   ; shift lsb of x to carry
    jump nc, shift_prod      ; x_j is 0
    add s5, s3               ; x_j is 1, pH=pH+a
shift_prod:
    sra s5                   ; shift upper byte pH right,
                             ; carry to MSB, LSB to carry
    sra s6                   ; shift lower byte pL right,
                             ; lsb of pH to MSB of pL
    sub s2, 01               ; dec loop index
    jump nz, mult_loop       ; repeat until s2=0
    return
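To see why the carry flag links the three instructions in the loop, the routine can be mirrored step by step. The Python sketch below is a behavioral model, not output of any PicoBlaze tool; the variables follow the register names of the subroutine. Starting s6 at 0 is an assumption of the model: the assembly leaves s6 uninitialized, which is harmless because all eight of its original bits are shifted out.

```python
def mult_soft(s3, s4):
    """Register-level model of mult_soft. Returns (s5, s6) = (pH, pL)."""
    s5 = 0                                  # load s5, 00
    s6 = 0                                  # assumption: modeled as 0 (see lead-in)
    for _ in range(8):                      # s2 counts 8 iterations
        carry = s4 & 1                      # sr0 s4: lsb of x goes to carry
        s4 >>= 1
        if carry:                           # jump nc, shift_prod skips the add
            total = s5 + s3                 # add s5, s3
            s5 = total & 0xFF
            carry = total >> 8              # carry out of the addition
        next_carry = s5 & 1                 # sra s5: carry into MSB, LSB to carry
        s5 = (carry << 7) | (s5 >> 1)
        carry = next_carry
        s6 = (carry << 7) | (s6 >> 1)       # sra s6: lsb of pH into MSB of pL
    return s5, s6


if __name__ == "__main__":
    high, low = mult_soft(0x11, 0xD7)
    print(f"{high:02X} {low:02X}")          # prints: 0E 47
```

For the LCG exercise only the low byte matters: 47 + 9D taken modulo 2^8 gives E4, the first generated value for a = 0x11, c = 0x9D and R0 = 0xD7.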
Part 5 Lab 5 Demos