FPGA Implementation of Whirlpool and FSB Hash Algorithms 6.375 Final Presentation Jeff Simpson, Jingwen Ouyang, Kyle Fritz
Outline • Overview • Test Harness • Hash Algorithms • Whirlpool • FSB • Closing Remarks
What is a Hash? • A hash is a fingerprint of sorts – a small key which can be used to identify a larger data set. • Hashes have many uses • Identifying that a data set is correct. • Performing database indexing • Cryptographic functions
SHA-3 Competition • The National Institute of Standards and Technology (NIST) is holding a competition to select the successor to the SHA-2 hashing algorithm. • Over 50 algorithms have been submitted for consideration. • NIST will make the final decision, but the community is performing analysis and making recommendations.
Project Goals • Implementation of hash algorithms on the Altera DE2-70 FPGA • Whirlpool hash • FSB hash (SHA-3 candidate, uses Whirlpool) • The process and results of implementing the SHA-3 candidate algorithm will serve as an analysis of the algorithm.
Test Harness • Provide a layer of abstraction • Simplify memory access • Provide FPGA interface • Provide simple and fast end-to-end testing
Hash Abstraction • Put Length • Put Word • Get Hash • Get Table Lookup • Put Table Lookup Response • The hash does not need to know anything about memory organization, addressing, or the interface. • The test harness does not need to know anything about the hash function.
Memory • 0400000:040105F – NIOS (4KB) • 0410000:0417FFF – Input Message (32KB) • 0440000:0447FFF – Hash Memory (32KB) • 1000000:17FFFFF – Lookup Tables (8MB, Flash)
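The address ranges on the Memory slide can be captured in a small lookup helper. The following is a minimal Python sketch, not the project's harness code: the names `MEMORY_MAP`, `region_for`, and `region_size` are hypothetical, and the ranges are copied from the slide as inclusive start:end pairs.

```python
# Inclusive address ranges, copied from the Memory slide.
MEMORY_MAP = [
    ("NIOS", 0x0400000, 0x040105F),
    ("Input Message", 0x0410000, 0x0417FFF),
    ("Hash Memory", 0x0440000, 0x0447FFF),
    ("Lookup Tables", 0x1000000, 0x17FFFFF),
]


def region_for(address):
    """Return (region name, offset inside region), or None if unmapped."""
    for name, start, end in MEMORY_MAP:
        if start <= address <= end:
            return name, address - start
    return None


def region_size(name):
    """Return the size in bytes of the named region."""
    for region, start, end in MEMORY_MAP:
        if region == name:
            return end - start + 1
    raise KeyError(name)
```

For example, `region_for(0x0410010)` resolves to offset `0x10` inside the input message region, and an address between regions resolves to `None`.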
On FPGA • Intel HEX file is generated from test-case data for loading FPGA • Altera flash image is generated from lookup table • NIOS signals for the hash to start, then reads the result from memory when the hash has completed.
In Simulation • A Verilog VMH file is generated from the test-case data and the lookup table. • The hash is commanded to start automatically. • The result is displayed (saved to the output log file).
Message Input VMH Format • @0002 // Message size in bits (64) • @0004 // Data address • @0005 // Result address • @400000 // Lookup table data (simulation only)
Testing • A suite of test-cases is used for automated testing • Reference hashes are automatically generated and compared to the simulation results. • FPGA results can be automatically compared in the same fashion. • A NIOS-based message generator is used to test message input > 32KB
Typical Hash Structure • Preprocessing • Compression • Finalization [Animated diagram: the message bytes 49 1d af 3c pass through preprocessing as the blocks 491daf and 3c8020; the compression function F combines each block with the running state (00000000, then 46a931ff, then a903bd55); finalization outputs 03bd55.]
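The three stages named on the slide can be sketched as a toy Python hash. This is an illustration only, under stated assumptions: the padding rule (append `0x80`, zero-fill, then a one-byte bit length) is chosen because it reproduces the blocks 491daf and 3c8020 from the diagram, but whether the slide used that rule is a guess; `toy_compress` is a hypothetical, non-cryptographic mixing function, so its outputs do not match the state values on the slide.

```python
def preprocess(message, block_size=3):
    """Pad the message and split it into fixed-size blocks."""
    padded = bytearray(message)
    padded.append(0x80)                          # a single 1 bit, then zeroes
    while (len(padded) + 1) % block_size != 0:
        padded.append(0x00)
    padded.append((len(message) * 8) % 256)      # 1-byte length in bits (assumption)
    return [bytes(padded[i:i + block_size])
            for i in range(0, len(padded), block_size)]


def toy_compress(state, block):
    """Hypothetical stand-in for F: fold one block into a 32-bit state."""
    return ((state * 0x01000193) ^ int.from_bytes(block, "big")) & 0xFFFFFFFF


def finalize(state, digest_bytes=3):
    """Truncate the final state to the digest size."""
    mask = (1 << (8 * digest_bytes)) - 1
    return (state & mask).to_bytes(digest_bytes, "big")


def toy_hash(message):
    state = 0                                    # initial state 00000000
    for block in preprocess(message):
        state = toy_compress(state, block)
    return finalize(state)
```

The structure is the point here: every block passes through the same compression step, and only the last state reaches finalization.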
Whirlpool Introduction • A stand-alone hash function based on a substantially modified Advanced Encryption Standard (AES) • Given a message less than 2^256 bits in length, it returns a 512-bit message digest. • Whirlpool is not a SHA-3 candidate • Will never be patented, free for public use • No Bluespec implementations exist
Whirlpool Preprocessing • Input: An input message of any size • Padded input: • A = {message, 1, 0, 0, …, 0} (512N + 256 bits) • B = message length (256 bits) • Padded input = {A, B} (512N + 512 bits) • Output: The padded input split into message blocks of 512 bits each [Diagram: message bits | 1 | zeroes | length]
Whirlpool Preprocessor • Input words are shifted into the message block one bit at a time until any of the following events: • Message block is full: It is sent and a new one is started. • Input word is finished: The next one is loaded. • Message is complete: The block is padded with a 1, zeroes, and the message length (in bits) before being sent. • Because these events happen independently, the preprocessor does not depend on message size, message block size, or input word size. • It requires very little logic but is rather slow, as it needs at least 1 cycle per bit. [Diagram: input words shifting into the message block]
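The padding rule on this slide can be written out for byte-aligned messages. The sketch below is a minimal Python version, assuming whole-byte input and a big-endian length field; the function names are hypothetical and this is not the Bluespec module from the project.

```python
def whirlpool_pad(message):
    """Pad a byte-aligned message to a multiple of 512 bits.

    A = message, a single 1 bit, then zeroes up to 512N + 256 bits.
    B = message length in bits as a 256-bit big-endian integer.
    """
    bit_length = len(message) * 8
    padded = bytearray(message)
    padded.append(0x80)                 # the 1 bit followed by seven 0 bits
    while len(padded) % 64 != 32:       # stop at 512N + 256 bits
        padded.append(0x00)
    padded += bit_length.to_bytes(32, "big")
    return bytes(padded)


def split_blocks(padded):
    """Split the padded input into 512-bit (64-byte) message blocks."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]
```

A 31-byte message still fits in one block, while a 32-byte message spills into a second block because the appended 1 bit no longer fits before the length field.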
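The bit-serial behaviour described above can be modelled in software. The following Python generator is a rough sketch, not the hardware design: `bit_serial_blocks` and its parameters are hypothetical names, blocks are represented as integers, and bits are assumed to leave each input word most-significant first. One loop iteration corresponds to one shifted bit.

```python
def bit_serial_blocks(words, word_bits, message_bits,
                      block_bits=512, length_bits=256):
    """Yield padded message blocks, consuming the input one bit at a time."""

    def message_stream():
        remaining = message_bits
        for word in words:                        # event: input word finished
            for i in reversed(range(word_bits)):
                if remaining == 0:                # event: message complete
                    return
                yield (word >> i) & 1
                remaining -= 1

    def padded_stream():
        yield from message_stream()
        yield 1                                   # the single padding 1 bit
        used = (message_bits + 1) % block_bits
        for _ in range((block_bits - length_bits - used) % block_bits):
            yield 0
        for i in reversed(range(length_bits)):    # message length in bits
            yield (message_bits >> i) & 1

    block, count = 0, 0
    for bit in padded_stream():
        block = (block << 1) | bit
        count += 1
        if count == block_bits:                   # event: message block full
            yield block
            block, count = 0, 0
```

Because word size, block size, and length-field size are independent parameters, the same loop handles, for instance, a 4-bit word feeding a 16-bit block: `list(bit_serial_blocks([0b1011], 4, 4, block_bits=16, length_bits=4))` gives `[0xB804]`.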
Whirlpool Compression • Inputs: • Current hash from previous iteration (8 bit x 64 vector) • Small message blocks (512 bit) • Output: • Intermediate Hash (8 bit x 64 vector) W
Whirlpool Compression • Block diagram: init → processBuffer → finalize • init: • takes in message blocks and resets internal states • processBuffer: • computes internal state from an internal block cipher • finalize: • newHash = currentHash ^ input message ^ state • newHash is sent out as the result when there are no more input message blocks
Whirlpool Compression • Internal block cipher in processBuffer: • The original version used a randomly generated S-box, which lacks internal structure and is hard to implement efficiently in hardware • The current version uses a structured S-box, whose regular patterns suit hardware implementation
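The finalize step on this slide is a feed-forward XOR around a block cipher keyed by the current hash, a pattern commonly known as the Miyaguchi-Preneel construction. The Python sketch below shows only that structure; `toy_cipher` is a hypothetical placeholder and is not Whirlpool's internal cipher.

```python
MASK_512 = (1 << 512) - 1


def toy_cipher(key, plaintext):
    """Hypothetical stand-in for the internal block cipher (NOT secure)."""
    return (((plaintext ^ key) * 3) + 1) & MASK_512


def compress(current_hash, block, cipher=toy_cipher):
    """newHash = currentHash ^ input message ^ state."""
    state = cipher(current_hash, block)      # processBuffer
    return current_hash ^ block ^ state      # finalize


def hash_blocks(blocks, initial_hash=0):
    """Chain the compression step over all message blocks."""
    current = initial_hash
    for block in blocks:
        current = compress(current, block)
    return current
```

Swapping in a real cipher would not change `compress` or `hash_blocks`, which is the separation the init / processBuffer / finalize split suggests.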
Whirlpool Implementation • Do one branch at a time • Reuse hardware • Save logic • Take longer time • 10 rounds of iteration • Big for-loop takes a lot of logic, and increases critical path • Use counter to break into multiple cycles
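Breaking the 10-round loop into one round per cycle can be pictured as a small state machine driven by a counter. This Python class is a software analogy under assumed names (`RoundEngine`, `start`, `step`, `done`), with each call to `step` standing in for one clock cycle; it is not a translation of the Bluespec rules.

```python
class RoundEngine:
    """Apply round_fn once per step() call instead of in one big loop."""

    def __init__(self, round_fn, rounds=10):
        self.round_fn = round_fn
        self.rounds = rounds
        self.counter = rounds        # idle until start() is called
        self.state = None

    def start(self, state):
        self.state = state
        self.counter = 0

    @property
    def done(self):
        return self.counter >= self.rounds

    def step(self):
        """One 'cycle': run a single round on the shared hardware."""
        if not self.done:
            self.state = self.round_fn(self.state, self.counter)
            self.counter += 1
```

The single `round_fn` instance plays the role of the reused round hardware: less logic and a shorter critical path, at the cost of ten steps per block instead of one.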
Whirlpool Implementation • Use registers with ready bits instead of FIFOs • Put the S-box lookup table in SRAM • One table lookup per cycle • Concatenate vectors to avoid a multi-layered MUX [Diagram: C = {C3, C2, C1, C0}, with C3 → C[15:12], C2 → C[11:8], C1 → C[7:4], C0 → C[3:0]; for example, C2[1] becomes C[9]]
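The concatenation trick replaces a two-level selection (pick a vector, then pick a bit) with a single flat index. A small Python sketch of the index arithmetic follows, using the 4-bit vectors from the diagram; the helper names are hypothetical.

```python
NIBBLE_BITS = 4


def concatenate(parts):
    """Build C = {c3, c2, c1, c0}; parts are listed most-significant first."""
    value = 0
    for part in parts:
        value = (value << NIBBLE_BITS) | (part & 0xF)
    return value


def flat_index(vector, bit):
    """Map (vector k, bit j) to the position of that bit inside C."""
    return NIBBLE_BITS * vector + bit


def get_bit(value, index):
    return (value >> index) & 1
```

With this mapping, `flat_index(2, 1)` is 9, matching the C2[1] to C[9] example in the diagram.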
Whirlpool Finalization • Functionality: • Unwrap the intermediate hash from its vector form to a bit string as final output • (8 bit x 64 vector => 512 bit string) • No separate finalization module • Done at the end of the compression module
Whirlpool Result • Successfully simulated and verified with the Bluespec compiler • Successfully put onto the FPGA and verified • Noticeable trade-offs between speed and area • We chose area over speed
Fast Syndrome-Based hash function • FSB is a family of hash functions submitted to the SHA-3 competition. • Maintains a large internal state. • Requires a large lookup table. • Simple design, simple operations. • Proof of reduction to known hard problems. • Authors are French.
FSB Preprocessing • Message blocks of 1240 bits. • Filled first with bits from the message. • After the last message input, a single 1 bit is appended. • Padded with zeroes. • Last 64 bits contain the message length in bits. [Diagram: message bits | 1 | zeroes | length]
FSB Compression [Block diagram. Labels: 1984-bit state (8-bit slice), 1240-bit message block (5-bit slice), "Simple Math with Constants" producing a 21-bit value, division (/) and modulo (%) units, memory holding 1024 π vectors of 1987 bits each, and a shifter (>>) operating on 1987 bits.]
FSB Compression • Implementation follows specification closely. • Single cycle division and modulo component. • Multiple cycle shifter. • Memory interface for loading pi vectors.
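The components listed above (index math, division and modulo, shifter, and vector memory) can be tied together in a toy model. The Python sketch below is heavily hedged: how the state and message slices combine into an index, the choice of the state width as divisor, the cyclic shift over the full vector width, and the truncation to the low bits are all assumptions made for illustration, and the function names are hypothetical. Consult the FSB specification for the actual definitions.

```python
def rotate_right(value, amount, width):
    """Cyclic right shift of a width-bit value."""
    amount %= width
    mask = (1 << width) - 1
    return ((value >> amount) | (value << (width - amount))) & mask


def fsb_like_compress(state_slices, message_slices, vectors, *,
                      message_slice_bits, columns_per_slice,
                      state_bits, vector_bits):
    """XOR together one shifted vector per (state slice, message slice) pair."""
    acc = 0
    for i, (s, m) in enumerate(zip(state_slices, message_slices)):
        # "Simple math with constants": build a column index (assumed layout).
        index = i * columns_per_slice + ((s << message_slice_bits) | m)
        # Division picks the stored vector; modulo gives the shift amount.
        which, shift = divmod(index, state_bits)
        shifted = rotate_right(vectors[which], shift, vector_bits)
        # Truncate to the state width (keeping the low bits is an assumption).
        acc ^= shifted & ((1 << state_bits) - 1)
    return acc
```

With the numbers from the slides, the parameters would be `message_slice_bits=5`, `state_bits=1984`, `vector_bits=1987`, and 1024 stored vectors; the test values used here are tiny stand-ins.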
FSB Finalization • Breaks up the 1984 bits into a stream of 32-bit input words for Whirlpool. [Diagram: 1984 bits → Whirlpool → 512 bits]
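Splitting the 1984-bit state into 32-bit words gives 1984 / 32 = 62 words. A minimal Python sketch of that split follows; the function name is hypothetical and most-significant-word-first ordering is an assumption.

```python
def split_into_words(value, total_bits=1984, word_bits=32):
    """Split an integer state into fixed-width words, most significant first."""
    count = total_bits // word_bits
    mask = (1 << word_bits) - 1
    return [(value >> (total_bits - word_bits * (i + 1))) & mask
            for i in range(count)]
```

The resulting word stream is what would be fed, one word at a time, into the Whirlpool input interface to produce the final 512-bit digest.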
Closing Remarks • FSB is not ideal for hardware. • Large lookup table. • Large internal state. • Simple operations on large values. • Generalized code can be reused for other hash functions.