E-Voting Machine - Design Presentation
Secure Electronic Voting Terminal
Group M1
• Bohyun Jessica Kim
• Jonathan Chiang
• Chi Ho Yoon
• Donald Cober
Mon. Sept 29
• System Hardware Component Diagram
• Gate-level Data path
• Updated Transistor Estimates
• Floorplan
Status Update
• Behavioral Verilog Entire System
• Gate-level Hardware Block Diagram
• Updated Transistor Count Calculations
• Initial Floorplan
• Structural Verilog Entire System
• Refined Floorplan
[Figure: system hardware component diagram. Blocks shown: Card Reader, Fingerprint Scanner, User Input, Display, Data Bus, Machine Init FSM, User ID FSM, Selection FSM, Confirmation FSM, Encryption Key SRAM, User ID SRAM, Choice SRAM, Write-in SRAM, Message ROM, Selection Counter, TX_Check, and COMMS (Key Register, 8-bit Add/Sub, 8-bit FullAdders at T: 128, 2:1 8-bit MUXes, XORs, 8-bit REGs at T: 88, Shift Register In, Shift Register Out).]
SUPER MUX!
SuperMux:
• Our data flow consists of shuffling 8 bits of data (data[7:0]) from a source to a destination
• These sources and destinations are SRAMs, User Input, Comms, etc.
• Many are bidirectional
• Since only one piece of data will be sent at a time, it makes sense to use a bus configuration for data movement rather than a set of giant muxes
• We can gate which sources/destinations (drop points) are connected to the bus with one level of pass logic
• This way the data will only ever go through two layers of pass logic:
  • one to get onto the bus
  • one to get off of the bus
• We will still call this the SuperMux for legacy purposes
• Layout will be fun
[Figure: data[7:0] bus with a row of drop points attached]
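The drop-point idea above can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the `Bus` class, its method names, and the `card_reader` / `key_sram` labels are hypothetical and are not taken from the slides.

```python
class Bus:
    """Toy model of a shared 8-bit bus with gated drop points (hypothetical names)."""

    def __init__(self):
        self.sources = {}       # name -> callable that returns one byte
        self.destinations = {}  # name -> callable that accepts one byte

    def add_source(self, name, read):
        self.sources[name] = read

    def add_destination(self, name, write):
        self.destinations[name] = write

    def transfer(self, src, dest):
        """Enable exactly one source and one destination for a single byte."""
        value = self.sources[src]() & 0xFF  # first layer of pass logic: onto the bus
        self.destinations[dest](value)      # second layer of pass logic: off the bus
        return value


# Example: move four bytes from a card reader into a key SRAM.
card_bytes = iter([0xDE, 0xAD, 0xBE, 0xEF])
key_sram = []

bus = Bus()
bus.add_source("card_reader", lambda: next(card_bytes))
bus.add_destination("key_sram", key_sram.append)

for _ in range(4):
    bus.transfer("card_reader", "key_sram")
```

Because `transfer` names a single source and a single destination, every byte crosses exactly two gates, which mirrors the two layers of pass logic described on the slide.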
Tiny Encryption Algorithm Project Specs
Original Implementation:
• 64-bit blocks: two 32-bit inputs
• 128-bit key: four 32-bit keys (K[0], K[1], K[2], K[3])
• Feistel structure: symmetric structure used in block ciphers
• "Magic" constant: Delta = 0x9E3779B9 = 2^32 / 1.6180339887 (golden ratio)
• 64 Feistel rounds = 32 cycles
E-Voting Machine Implementation:
• 16-bit blocks: two 8-bit inputs
• 32-bit key: four 8-bit keys
• 32 Feistel rounds = 16 cycles
Decision:
• Scale the golden ratio (1.6) up by a factor of 10 to 16, scale 2^16 by 10 to 655360, and divide 655360 / 16 to get Delta. This avoids floating point in the key scheduler.
• New Delta = 0xA000; truncate the least significant hex digit to 0xA00 so that the sum fits in 16 bits when decrypting, since 0xA00 * 8 cycles = 0x5000
Hardware:
• Shifters (4-bit and 5-bit shifts)
• 16-bit multipliers
• 16-bit adder / subtractor
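The Delta arithmetic on this slide can be checked with a few lines of integer math. This is a sketch, not the group's key scheduler; the variable names are made up here, and reading the truncated constant as 0xA00 is an assumption based on the slide's "A00 * 8 cycles = 0x5000".

```python
# Original TEA constant: floor(2^32 / golden ratio).
GOLDEN_RATIO = (1 + 5 ** 0.5) / 2
original_delta = int(2 ** 32 / GOLDEN_RATIO)

# Scaled-down constant from the slide: (2^16 * 10) / 16, integer math only.
scaled_delta = (2 ** 16 * 10) // 16

# Assumption: dropping the least significant hex digit gives 0xA00.
truncated_delta = scaled_delta >> 4

# Decryption starts from delta * cycles; with 8 cycles this must fit in 16 bits.
decrypt_start_sum = truncated_delta * 8
overflow_without_truncation = scaled_delta * 8 > 0xFFFF

print(hex(original_delta), hex(scaled_delta), hex(truncated_delta), hex(decrypt_start_sum))
```

The last two values show why the truncation matters: 0xA000 * 8 does not fit in 16 bits, while 0xA00 * 8 does.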
COMMS BLOCK Hardware Implementation 1

State  inA[7:0]  inB[7:0]  sel_out  sel_shift[1:0]  sel_sum  v_out[7:0]
(1)    delta     sum[7:0]  0        00              0        v_out0 = sum[7:0]
(2)    v1        sum[7:0]  0        01              1        v_out1 = (C+D)
(3)    v1 << 4   k0        1        10              0        v_out2 = (A+B) ^ (C+D)
(4)    v1 >> 5   k1        1        11              0        v_out3 = (A+B) ^ (C+D) ^ (E+F)
(5)    v0        out3      0        1               1        v_outx = V0 + ((A+B) ^ (C+D) ^ (E+F))

Here (A+B) = (v1<<4)+k0, (C+D) = v1+sum and (E+F) = (v1>>5)+k1.
States (6)-(9) are the same as above, except that they use k2 and k3 and swap v1 and v0.
The implementation goes through 9 states (clock cycles) per iteration to update the output function v_outx.

Reuses:
• (1x) 8-bit full adder/subtractor (ripple carry) [16*8 = 128]
• (2x) 2:1 8-bit MUX for output pass-through [4*8*2 = 64]
• (8x) 2-input XORs [6*8 = 48]
• (1x) 8-bit REG [11*8 = 88]
• (1x) 4:1 8-bit MUX for shifting selection [12*8 = 96]
In addition, the logic will iterate 8 times, controlled by an FSM that uses:
• (2x) 3:1 8-bit MUX for state input selection [8*8*2 = 128]
• (2x) 1-bit counter adder for updating the cycle [16*2 = 32]
• (2x) 1-bit REG for storing the updated cycle [11*2 = 22]
Total: 606

Advantages:
• Saves transistors and area for the Comms Block
Disadvantages:
• Very heavy pass logic from the MUX layers and XORs
• High clock frequency required, since the same components are reused to calculate v_outx in stages. This translates to higher power consumption, since we are trying to do more with less hardware.
Tradeoff:
• Every 8-bit MUX uses 4*8 = 32 transistors, compared to 16*8 = 128 transistors for an 8-bit full adder. However, MUXes add heavy pass logic, so this is an area vs. power tradeoff.

Logical Shifter Code:
inA[7:0]   sel_shift[1:0]
delta      00
v1         01
v1 << 4    10
v1 >> 5    11

[Figure: Implementation 1 datapath. Components shown: two 3:1 8-bit MUXes feeding inA[7:0] and inB[7:0]; 4:1 8-bit MUX (T: 64) selected by sel_shift[1:0]; 2:1 8-bit MUX (T: 32) selected by sel_sum, with an 8'h00 input; 8-bit FullAdder/Sub (T: 128); 2:1 8-bit MUX (T: 32) selected by sel_out; XOR (T: 48); 1-bit FullAdder with 1-bit REG; 8-bit REG (T: 88); clk; output v_outx.]

sum += delta;
v0 += ((v1<<4)+k0) ^ (v1+sum) ^ ((v1>>5)+k1);
v1 += ((v0<<4)+k2) ^ (v0+sum) ^ ((v0>>5)+k3);
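As a software reference for what the states compute, the three update statements can be modelled at 8-bit width and paired with the inverse operations. This is a minimal sketch under assumptions: the function names, the example key, and the 8-bit `DELTA` value are hypothetical, and the decrypt routine is the standard TEA inverse rather than anything shown on the slides.

```python
MASK = 0xFF  # every value is 8 bits wide in the scaled-down design


def mix(v, sum_, ka, kb):
    """((v<<4)+ka) ^ (v+sum) ^ ((v>>5)+kb), truncated to 8 bits."""
    return (((v << 4) + ka) ^ (v + sum_) ^ ((v >> 5) + kb)) & MASK


def encrypt(v0, v1, key, delta, cycles=8):
    k0, k1, k2, k3 = key
    sum_ = 0
    for _ in range(cycles):
        sum_ = (sum_ + delta) & MASK
        v0 = (v0 + mix(v1, sum_, k0, k1)) & MASK
        v1 = (v1 + mix(v0, sum_, k2, k3)) & MASK
    return v0, v1


def decrypt(v0, v1, key, delta, cycles=8):
    k0, k1, k2, k3 = key
    sum_ = (delta * cycles) & MASK  # start from the final encryption sum
    for _ in range(cycles):
        v1 = (v1 - mix(v0, sum_, k2, k3)) & MASK
        v0 = (v0 - mix(v1, sum_, k0, k1)) & MASK
        sum_ = (sum_ - delta) & MASK
    return v0, v1


# Hypothetical values chosen only to exercise the round trip.
KEY = (0x12, 0x34, 0x56, 0x78)
DELTA = 0x9E

cipher = encrypt(0x41, 0x42, KEY, DELTA)
plain = decrypt(cipher[0], cipher[1], KEY, DELTA)
```

A model like this can serve as a golden reference when checking the v_outx values produced by either hardware implementation in simulation.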
COMMS BLOCK Hardware Implementation 2

sel_out  output
0        pass sum, V1
1        pass new sum, V0

Implementation 2 does concurrent calculations for all 3 parts of the function and completes a full iteration of calculations in 2 clock cycles.

Uses:
• (1x) 8-bit full adder/subtractor (ripple carry) [16*8 = 128]
• (3x) 8-bit full adder (ripple carry) [16*8*3 = 384]
• (4x) 2:1 8-bit MUX for output pass-through [4*8*4 = 128]
• (16x) 2-input XORs [6*16 = 96]
• (2x) 8-bit REG [11*8*2 = 176]
• (1x) 1-bit counter adder for updating the cycle [16]
• (1x) 1-bit REG for storing the updated cycle [11]
Total: 939
In addition, the logic will not need a complex FSM; it just needs to do 8 iterations.

Advantages:
• Low pass logic, speed performance, low power; MUX logic transistor count essentially halved
Disadvantages:
• Higher transistor count and larger area
Tradeoff:
• Larger area, but the lower pass logic from fewer MUXes and the absence of a complex FSM simplify the design, increase speed and minimize power.

[Figure: Implementation 2 datapath. Components shown: sum and delta into an 8-bit Add/Sub (T: 128) with a 2:1 8-bit MUX; inputs V1, K0, K1, V0; shifted operands {V1[3:0], 4'b0} and {5'b0, V1[7:5]}; three 2:1 8-bit MUXes (T: 32 each) selected by sel_out; 8-bit FullAdders (T: 128 each); XORs; 1-bit FullAdder with 1-bit REG; 8-bit REGs (T: 88 each); clk; output v_outx.]

sum += delta;
v0 += ((v1<<4)+k0) ^ (v1+sum) ^ ((v1>>5)+k1);
v1 += ((v0<<4)+k2) ^ (v0+sum) ^ ((v0>>5)+k3);
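The two transistor totals can be re-added from the per-component figures on the slides. The sketch below only tallies the numbers already listed; the list and function names are made up for illustration.

```python
# (component, transistor count) pairs copied from the two COMMS block slides.
IMPLEMENTATION_1 = [
    ("8-bit adder/subtractor", 16 * 8),
    ("2x 2:1 8-bit MUX", 4 * 8 * 2),
    ("8x 2-input XOR", 6 * 8),
    ("8-bit REG", 11 * 8),
    ("4:1 8-bit MUX", 12 * 8),
    ("2x 3:1 8-bit MUX", 8 * 8 * 2),
    ("2x 1-bit counter adder", 16 * 2),
    ("2x 1-bit REG", 11 * 2),
]

IMPLEMENTATION_2 = [
    ("8-bit adder/subtractor", 16 * 8),
    ("3x 8-bit full adder", 16 * 8 * 3),
    ("4x 2:1 8-bit MUX", 4 * 8 * 4),
    ("16x 2-input XOR", 6 * 16),
    ("2x 8-bit REG", 11 * 8 * 2),
    ("1-bit counter adder", 16),
    ("1-bit REG", 11),
]


def total(components):
    """Sum the transistor counts of a component list."""
    return sum(count for _, count in components)


total_1 = total(IMPLEMENTATION_1)
total_2 = total(IMPLEMENTATION_2)
extra_transistors = total_2 - total_1

print(total_1, total_2, extra_transistors)
```

Set against the cycle counts on the slides (9 clock cycles per iteration for Implementation 1, 2 for Implementation 2), the difference is the area paid for the shorter schedule.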
E-Voting TEA Gate Level Hardware: FullAdder
Common full adder: Mirror Adder
-Uses 28 transistors (including 4 transistors in inverters)
-NMOS and PMOS chains are completely symmetrical
Logic:
S = a ⊕ b ⊕ Carryin
Carryout = (a ⊕ b) • Carryin + (a • b)
E-Voting TEA Gate Level Hardware: FullAdder
What we decided to use in this project: 1-bit full adder
-Uses pass-transistor logic for computing XNOR
-Sum bit equals A^B^Cin, where A and B are the 2 inputs and Cin is the carry-in input; muxing at the bottom selects the Cout bit to carry out
-Will use this adder 8 times to compute all 8 bits of data
-Uses inverters to strengthen the signal at the end of each XNOR
-Uses only 16 transistors yet gives a strong signal
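A bit-level model shows how eight of these 1-bit adders chain into the 8-bit ripple-carry adder. This is a sketch under an assumption: the carry is modelled as a 2:1 mux that passes Cin when A and B differ and A otherwise, which is one common way to read "muxing at the bottom" but is not confirmed by the slide. Function names are hypothetical.

```python
def full_adder(a, b, cin):
    """1-bit full adder: XOR-based sum, mux-selected carry (assumed structure)."""
    a_xor_b = a ^ b
    sum_bit = a_xor_b ^ cin
    cout = cin if a_xor_b else a  # 2:1 mux controlled by A xor B
    return sum_bit, cout


def ripple_add8(x, y, cin=0):
    """Chain eight 1-bit full adders, least significant bit first."""
    result = 0
    carry = cin
    for bit in range(8):
        sum_bit, carry = full_adder((x >> bit) & 1, (y >> bit) & 1, carry)
        result |= sum_bit << bit
    return result, carry


value, carry_out = ripple_add8(0xF0, 0x1F)
```

The mux form of the carry agrees with Carryout = (a ⊕ b) • Carryin + (a • b) for all eight input combinations, so the model matches the full adder equations regardless of which circuit style produces them.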
E-Voting TEA Gate Level Hardware: MUX, XOR
XOR
-To avoid using two t-gates
-Uses 6 transistors (XNOR + inv)
T-gate Mux
-4 transistors
-Very tiny, hence difficult to lay out
E-Voting TEA Gate Level Hardware: REG
TSPC Register
-True single-phase clock flip-flop
-Advantages: single clock distribution, small area for clock lines, high speed and no clock skew
-We will use 8T instead of 9T
SRAM Gate Level Hardware: SRAM Cell
-6T SRAM cell
-Smaller transistor size
-Lower energy dissipation
-Efficient layout
SRAM Gate Level Hardware: Address Decoder
-Combination of inverters and NAND gates
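A decoder built only from inverters and NAND gates can be sketched for the 2-bit address used by the 4-byte SRAMs. This is an illustrative gate-level model, not the group's schematic; the function names and the choice of active-high word lines are assumptions.

```python
def inv(a):
    """Inverter on a single bit."""
    return 1 - a


def nand(a, b):
    """2-input NAND on single bits."""
    return 1 - (a & b)


def decode_2to4(a1, a0):
    """Return four active-high word lines for a 2-bit address."""
    n1, n0 = inv(a1), inv(a0)
    # Each NAND output goes low for exactly one address.
    active_low = [nand(n1, n0), nand(n1, a0), nand(a1, n0), nand(a1, a0)]
    # A final inverter per line turns the active-low select into a word line.
    return [inv(line) for line in active_low]


word_lines = decode_2to4(1, 0)  # address 2
```

The 3-bit and 6-bit addresses of the larger SRAMs would follow the same pattern with wider NAND gates or a predecode stage.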
SRAM Gate Level Hardware: SRAM
-Input/Output tri-state buffers?
-Need for a sense amplifier?
[Figure: Machine Initialization FSM interface. Connected blocks: Card Reader, Data Bus, Encryption Key SRAM (4 byte), Message ROM, COMMS. Signals: 1-bit Card Detected Signal, 2-bit Address, 1-bit Message, 1-bit Data Ready, 4-bit Data bus control, 1-bit Activate next; 8-bit Data paths between the blocks and the bus.]
[Figure: User ID FSM interface. Connected blocks: User Input, Data Bus, Card Reader, Fingerprint Scanner, User ID SRAM (8 byte), COMMS, Message ROM, Display. Signals: 1-bit Activate this, 1-bit Reactivate this, 1-bit Card Detected Signal, 3-bit Address, 1-bit Finger Scanned Signal, 1-bit No Signal, 1-bit Yes Signal, 2-bit Message, 1-bit Data Ready, 7-bit Data bus control, 1-bit Activate next; 8-bit Data paths between the blocks and the bus.]
[Figure: Selection FSM interface. Connected blocks: Data Bus, User Input, Selection Counter, Choice SRAM (4 byte), COMMS, Message ROM, Display. Signals: 1-bit Activate this, 1-bit Reactivate this, 2-bit Address, 1-bit Previous Page Signal, 1-bit Next Page Signal, 3-bit Count, 2-bit Message, 1-bit Data Ready, 6-bit Data bus control, 1-bit Activate next; 8-bit Data paths between the blocks and the bus.]
[Figure: Confirmation FSM interface. Connected blocks: TX_Check, User Input, Data Bus, User ID SRAM (8 byte), Choice SRAM (4 byte), Write-in SRAM (64 byte), COMMS, Message ROM, Display. Signals: 1-bit Activate this, 1-bit Reset lines, 6-bit Address, 2-bit Address, 3-bit Address, 1-bit No Signal, 1-bit Yes Signal, 1-bit TX_good, 2-bit Message, 1-bit Data Ready, 8-bit Data bus control, 1-bit Reactivate Selection, 1-bit Reactivate User ID; 8-bit Data paths between the blocks and the bus.]
SUPER MUX!
The statement that we only transfer one byte of data at a time is technically false.
For example:
• When the Message ROM is sending a message to the COMMS, the COMMS are using data from the Encryption Key SRAM to encode the message
[Figure: Encryption Key SRAM (4 byte), Message ROM and COMMS attached to the Data Bus]
We can circumvent this by hardwiring the Encryption Key SRAM data to the COMMS key input in addition to attaching it to the bus. This only works because the Key SRAM will never be active on the data bus while the COMMS are accessing it.
SUPER MUX!
Other hardwired connections:
• TX Check and Choice SRAM
  • The transmission check confirms that the data sent to the main computer and held in its current session matches the choices stored in our SRAM
  • During the Confirmation FSM, the SRAM data is sent to the main computer and the main computer echoes it back. The echo is streamed into the TX Check (as well as the display), and the TX Check compares it, as it is streaming, to the Choice SRAM
• User Input and Write-In SRAM
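The streaming comparison performed by the TX Check can be sketched as a byte-by-byte check against the Choice SRAM contents. This is an illustration only: the class name, method name, and example byte values are hypothetical, and the `tx_good` flag is modelled after the 1-bit TX_good signal on the Confirmation FSM diagram.

```python
class TxCheck:
    """Compare echoed bytes against the Choice SRAM as they stream in."""

    def __init__(self, choice_sram):
        self.choice_sram = list(choice_sram)
        self.address = 0     # next Choice SRAM location to compare
        self.tx_good = True  # stays True only while every byte matches

    def on_echo_byte(self, byte):
        """Consume one echoed byte and update tx_good."""
        if byte != self.choice_sram[self.address]:
            self.tx_good = False
        self.address += 1
        return self.tx_good


choices = [0x01, 0x03, 0x00, 0x02]  # hypothetical 4-byte Choice SRAM contents

clean = TxCheck(choices)
for echoed in [0x01, 0x03, 0x00, 0x02]:
    clean.on_echo_byte(echoed)

corrupted = TxCheck(choices)
for echoed in [0x01, 0x07, 0x00, 0x02]:
    corrupted.on_echo_byte(echoed)
```

Because the comparison needs the Choice SRAM value at the same moment the echo occupies the bus, the SRAM output has to reach the TX Check over its own wires, which is the reason for the hardwired connection.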
module machine_init_fsm(clk, cardDetectSig, commDetectSig, actNext,
                        mux_src, mux_dest, message, address);
  // The state names (`s1..`s6) and the `CARD_SRC, `COMMS_SRC, `MESSAGE_SRC,
  // `KEY_SRAM_DEST, `COMMS_DEST and `KEY_REQUEST macros are assumed to be
  // defined in a shared header. Port widths follow the high-Z literals below.
  input            clk, cardDetectSig, commDetectSig;
  output reg       actNext;
  output reg [8:0] mux_src;
  output reg [7:0] mux_dest;
  output reg [2:0] message;
  output reg [1:0] address;

  reg [2:0] state, next_state;   // 6 states need 3 bits
  reg [1:0] next_address;

  //Initialize
  initial begin
    actNext = 0;
    state = `s1;                 // start in the first state of the case below
    next_state = `s1;
  end

  //Main FSM
  always @* begin
    // Defaults: hold state and address unless a branch overrides them
    next_state = state;
    next_address = address;
    if (!actNext) begin
      case (state)
        `s1: begin
          mux_src = 0;
          mux_dest = 0;
          //Wait for card data
          if (cardDetectSig) begin
            //Send card data to the Key SRAM
            next_address = 0;
            next_state = `s2;
          end
        end
        `s2: begin
          mux_src = `CARD_SRC;
          mux_dest = `KEY_SRAM_DEST;
          //read in 4 bytes from card reader
          if (address == 3) begin
            next_state = `s3;
          end
          next_address = address + 1;
        end
        `s3: begin
          //Send a key request to the comms
          message = `KEY_REQUEST;
          mux_src = `MESSAGE_SRC;
          mux_dest = `COMMS_DEST;
          next_state = `s4;
        end
        `s4: begin
          mux_src = 0;
          mux_dest = 0;
          next_address = 0;
          //Wait for data to arrive
          if (commDetectSig == 0) begin
            next_state = `s4;
          end else begin
            next_state = `s5;
          end
        end
        `s5: begin
          mux_src = `COMMS_SRC;
          mux_dest = `KEY_SRAM_DEST;
          //read in 4 bytes from the comms
          if (address == 3) begin
            next_state = `s6;
          end
          next_address = address + 1;
        end
        `s6: begin
          //proceed: release the shared signals and activate the next FSM
          mux_src = 9'bzzzzzzzzz;
          mux_dest = 8'bzzzzzzzz;
          message = 3'bzzz;
          next_address = 2'bzz;  // address is released on the next clock edge
          actNext = 1;
        end
      endcase
    end else begin
      mux_src = 9'bzzzzzzzzz;
      mux_dest = 8'bzzzzzzzz;
      message = 3'bzzz;
      next_address = 2'bzz;
    end
  end

  //State Register: address is driven only here to avoid multiple drivers
  always @(posedge clk) begin
    state <= next_state;
    address <= next_address;
  end
endmodule

Converting Behavioral Verilog to Transistor Counts
• Machine Init FSM
• Create registers:
  • 6 states => 3 D flip-flops
  • + 2-bit SRAM address
• State Change Logic:
  • Most changes are sequentially incrementing
  • Flip-flops are configured as counters
• Further Logic:
  • Remaining logic consists of output signals generated mostly by state
  • Random logic can be approximated based on the number and configuration of outputs
[Figure: five D flip-flops, each with D, clock, Q and ~Q]
• 5 distinct 1-bit outputs
• Each 1-bit output is derived from a 3-bit input (state)
• Approx. two 2-input gates for each output: ~10 transistors for each distinct output
• 50 transistors total for random logic
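The estimation rule for the Machine Init FSM can be written down as a small helper. This is a sketch of the rule of thumb as stated on the slide, not the group's spreadsheet; the function and parameter names are hypothetical, and the per-flip-flop transistor cost is deliberately left out because the slides do not state which figure was used in this table.

```python
def fsm_estimate(num_states, address_bits, distinct_outputs, transistors_per_output=10):
    """Estimate register count and random-logic transistors for one FSM."""
    # Minimum number of bits that can encode num_states distinct states.
    state_bits = (num_states - 1).bit_length()
    return {
        "flip_flops": state_bits + address_bits,
        "random_logic_transistors": distinct_outputs * transistors_per_output,
    }


# Machine Init FSM: 6 states, 2-bit SRAM address, 5 distinct 1-bit outputs.
machine_init = fsm_estimate(num_states=6, address_bits=2, distinct_outputs=5)
print(machine_init)
```

The same helper could be applied to the other FSMs by plugging in their state counts, address widths and output counts from the behavioral Verilog.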
Converting Behavioral Verilog to Transistor Counts (cont) Total: 1425
Converting Behavioral Verilog to Transistor Counts (cont) Total: 7254
[Figure: block placement diagram. Blocks shown: Machine Init FSM, Encryption Key SRAM, User ID SRAM, User ID FSM, COMMS, Choice SRAM, Selection FSM, Comm Register, Shift In, Confirmation FSM, User Input, MUX, Shift Out, Write-In SRAM.]
Questions? Thank you!