CUDA SHA
Andrew Fitzgerald, Aric Schorr
Agenda
• Tree Hashing Overview
• MD6
  • Algorithm
  • CUDA Code
• Skein
  • Algorithm
  • CUDA Code
• Paper
• Milestones Achieved
• Future Work
Tree Hashing
• Leaves of the tree are bit-string digest values
• Main uses:
  • Original: Lamport signatures
  • Current: torrent hash verification, file integrity verification in ZFS
• Our use: allow parallel computation of a message digest
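To make the parallelism concrete, here is a minimal C sketch of a binary hash tree. It is not the MD6 or Skein tree mode: `toy_hash` is FNV-1a, a non-cryptographic stand-in chosen only for brevity, and the names `toy_hash`, `combine` and `tree_hash` are hypothetical. The point is that every iteration of the leaf loop, and of each row loop, is independent of its neighbours.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a: a NON-cryptographic stand-in for a real hash such as MD6 or Skein. */
static uint64_t toy_hash(const uint8_t *p, size_t n) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Parent digest = hash of the two child digests concatenated. */
static uint64_t combine(uint64_t left, uint64_t right) {
    uint8_t buf[16];
    memcpy(buf, &left, 8);
    memcpy(buf + 8, &right, 8);
    return toy_hash(buf, sizeof buf);
}

/* Hash `len` bytes in chunks of `chunk` bytes, then reduce pairwise to one root. */
uint64_t tree_hash(const uint8_t *msg, size_t len, size_t chunk) {
    assert(chunk > 0);
    size_t n = (len == 0) ? 1 : (len + chunk - 1) / chunk;
    uint64_t *row = malloc(n * sizeof *row);
    assert(row != NULL);

    /* Leaf row: iterations are independent, so each could be one GPU thread. */
    for (size_t i = 0; i < n; i++) {
        size_t off = i * chunk;
        size_t sz = (len - off < chunk) ? len - off : chunk;
        row[i] = toy_hash(msg + off, sz);
    }

    /* Inner rows: pair up children; an unpaired last node is promoted as-is.
       Done in place here because the loop is sequential; a parallel version
       would write each new row to a second buffer. */
    while (n > 1) {
        size_t parents = (n + 1) / 2;
        for (size_t i = 0; i < parents; i++) {
            row[i] = (2 * i + 1 < n) ? combine(row[2 * i], row[2 * i + 1])
                                     : row[2 * i];
        }
        n = parents;
    }

    uint64_t root = row[0];
    free(row);
    return root;
}
```

Calling `tree_hash(msg, len, 128)` would mimic 1024-bit leaves; how an odd node is handled (promoted here) is a design choice of this sketch, not something the slides specify.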
MD6 Modes: PAR
• Black dot as leaf – one chunk of data (1024 bits)
• Gray dot – less than one chunk of data, padded with zeros
• White dot – one chunk of padding zeros
• Black dot as node – compression function
Images taken from MD6 report submitted to NIST
MD6 Compression Input
• Compression function f has five defined inputs
• Q, K, U, V are the “auxiliary inputs” (25 words)
  • Q – constant equal to the fractional part of √6 (15 words)
  • K – key (salt, tag, secret key, etc.) (8 words)
  • U – unique node ID (1 word)
  • V – control word (1 word)
• B is the data payload (64 words, 4 chunks)
Images taken from MD6 report submitted to NIST
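The word counts above add up to the 89-word compression input (15 + 8 + 1 + 1 + 64). A minimal sketch of packing that input follows; the function name `md6_pack_input` is hypothetical, the Q, K, U, V, B order is assumed from the order of the list above, and words are taken to be 64 bits because a 1024-bit chunk is 16 words.

```c
#include <stdint.h>
#include <string.h>

/* Word counts from the slide; the total is the 89-word compression input. */
enum {
    MD6_Q_WORDS = 15,
    MD6_K_WORDS = 8,
    MD6_U_WORDS = 1,
    MD6_V_WORDS = 1,
    MD6_B_WORDS = 64,
    MD6_N_WORDS = MD6_Q_WORDS + MD6_K_WORDS + MD6_U_WORDS + MD6_V_WORDS + MD6_B_WORDS
};

/* Hypothetical helper: lay out N = Q || K || U || V || B (order assumed). */
void md6_pack_input(uint64_t N[MD6_N_WORDS],
                    const uint64_t Q[MD6_Q_WORDS],
                    const uint64_t K[MD6_K_WORDS],
                    uint64_t U, uint64_t V,
                    const uint64_t B[MD6_B_WORDS]) {
    size_t pos = 0;
    memcpy(N + pos, Q, MD6_Q_WORDS * sizeof(uint64_t)); pos += MD6_Q_WORDS;
    memcpy(N + pos, K, MD6_K_WORDS * sizeof(uint64_t)); pos += MD6_K_WORDS;
    N[pos++] = U;   /* unique node ID */
    N[pos++] = V;   /* control word */
    memcpy(N + pos, B, MD6_B_WORDS * sizeof(uint64_t));
}
```

With this layout the payload starts at word 25, so a CUDA kernel could keep the 25 auxiliary words fixed per node and stream only the 64 payload words.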
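MD6 Compression Function
• Input: N[0…n-1] (n = 89 words)
• Output: C[0…c-1] (c = 16 words)
• Internal structure: A[0…t+n-1] (t = r*c words, r = number of rounds)
• For i = n to t+n-1, where t = r*c (c = 16), with tap positions $t_0 \dots t_4$ and per-step shift amounts $r_{i-n}$, $l_{i-n}$:
  • $x = S_{i-n} \oplus A_{i-n} \oplus A_{i-t_0}$
  • $x = x \oplus (A_{i-t_1} \wedge A_{i-t_2}) \oplus (A_{i-t_3} \wedge A_{i-t_4})$
  • $x = x \oplus (x \gg r_{i-n})$
  • $A_i = x \oplus (x \ll l_{i-n})$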
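MD6 Compression Function
• Same loop, using an 89-word sliding window instead of the full array
• For i = n to t+n-1, where t = r*c (c = 16):
  • $x = S_{i-n} \oplus A_{89} \oplus A_{t_0}$
  • $x = x \oplus (A_{t_1} \wedge A_{t_2}) \oplus (A_{t_3} \wedge A_{t_4})$
  • $x = x \oplus (x \gg r_{i-n})$
  • $A_y = A_{y-1}$ for $1 < y \le 89$ (shift the window by one word), then $A_1 = x \oplus (x \ll l_{i-n})$
Images taken from MD6 report submitted to NIST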
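The two formulations of the loop can be checked against each other. The sketch below implements both in C and is only a structural illustration: the tap positions, shift tables, round count and `round_const` are placeholder values assumed for this example and should be replaced with the constants from the MD6 report. Indexing the shift tables by step mod 16 is likewise an assumption.

```c
#include <stdint.h>
#include <string.h>

enum { N_WORDS = 89, C_WORDS = 16, ROUNDS = 4, T_STEPS = ROUNDS * C_WORDS };

/* Placeholder parameters (assumed for illustration, see the MD6 report). */
static const int TAP[5]  = {17, 18, 21, 31, 67};
static const int RSH[16] = {10, 5, 13, 10, 11, 12, 2, 7, 14, 15, 7, 13, 11, 7, 6, 12};
static const int LSH[16] = {11, 24, 9, 16, 15, 9, 27, 15, 6, 2, 29, 8, 15, 5, 31, 9};

/* Placeholder for S_{i-n}: one value per round of 16 steps. */
static uint64_t round_const(int step) {
    return 0x0123456789abcdefULL + (uint64_t)(step / C_WORDS);
}

/* Full-array form: A holds all t+n words; output is the last c words. */
void compress_full(const uint64_t in[N_WORDS], uint64_t out[C_WORDS]) {
    uint64_t A[N_WORDS + T_STEPS];
    memcpy(A, in, N_WORDS * sizeof(uint64_t));
    for (int i = N_WORDS; i < N_WORDS + T_STEPS; i++) {
        int s = i - N_WORDS;
        uint64_t x = round_const(s) ^ A[i - N_WORDS] ^ A[i - TAP[0]];
        x ^= (A[i - TAP[1]] & A[i - TAP[2]]) ^ (A[i - TAP[3]] & A[i - TAP[4]]);
        x ^= x >> RSH[s % 16];
        A[i] = x ^ (x << LSH[s % 16]);
    }
    memcpy(out, A + T_STEPS + N_WORDS - C_WORDS, C_WORDS * sizeof(uint64_t));
}

/* Sliding-window form: W[k] always holds the word computed k steps ago. */
void compress_window(const uint64_t in[N_WORDS], uint64_t out[C_WORDS]) {
    uint64_t W[N_WORDS + 1];                  /* W[1..89]; W[0] is unused */
    for (int k = 1; k <= N_WORDS; k++) W[k] = in[N_WORDS - k];
    for (int s = 0; s < T_STEPS; s++) {
        uint64_t x = round_const(s) ^ W[N_WORDS] ^ W[TAP[0]];
        x ^= (W[TAP[1]] & W[TAP[2]]) ^ (W[TAP[3]] & W[TAP[4]]);
        x ^= x >> RSH[s % 16];
        memmove(&W[2], &W[1], (N_WORDS - 1) * sizeof(uint64_t)); /* A_y = A_{y-1} */
        W[1] = x ^ (x << LSH[s % 16]);
    }
    for (int j = 0; j < C_WORDS; j++) out[j] = W[C_WORDS - j];   /* oldest first */
}
```

The window form needs only 89 words of state per compression, which is the more attractive shape when many compressions run side by side in limited GPU memory; the per-step shift of the whole window could be replaced by a circular index.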
Skein – Important Points
• Threefish and UBI
• Type field within the tweak
• Keyed versus unkeyed hashing
• The Configuration String
[Figure: chained blocks with message inputs M0, M1, M2 and tweaks T0, T1, T2]
Skein – Tree Hashing
[Figure: tree of UBI nodes, labeled with Yf = 1 and Nl]
SHA-3 API (NIST Specification)
• HashReturn Init(hashState *state, int hashbitlen);
  • Selects the context size and initializes the context
• HashReturn Update(hashState *state, const BitSequence *data, DataLength databitlen);
  • Processes the data to be hashed
• HashReturn Final(hashState *state, BitSequence *hashval);
  • Finalizes the hash computation and outputs the result (hashbitlen bits)
• HashReturn Hash(int hashbitlen, const BitSequence *data, DataLength databitlen, BitSequence *hashval);
  • Performs all hashing in one function call
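The relationship between the four calls can be shown with a toy implementation. The signatures follow the slide; everything else is assumed for illustration: the typedefs are modeled on the NIST API profile, the contents of `hashState` are invented, and the body is FNV-1a (non-cryptographic, 64-bit output, whole bytes only) rather than any SHA-3 candidate.

```c
#include <stdint.h>

typedef unsigned char BitSequence;
typedef unsigned long long DataLength;
typedef enum { SUCCESS = 0, FAIL = 1, BAD_HASHBITLEN = 2 } HashReturn;

/* Toy context: the real layout is defined by each candidate algorithm. */
typedef struct {
    int hashbitlen;
    uint64_t h;
} hashState;

HashReturn Init(hashState *state, int hashbitlen) {
    if (hashbitlen != 64) return BAD_HASHBITLEN;   /* toy supports 64 bits only */
    state->hashbitlen = hashbitlen;
    state->h = 0xcbf29ce484222325ULL;              /* FNV-1a offset basis */
    return SUCCESS;
}

HashReturn Update(hashState *state, const BitSequence *data, DataLength databitlen) {
    if (databitlen % 8 != 0) return FAIL;          /* toy handles whole bytes only */
    for (DataLength i = 0; i < databitlen / 8; i++) {
        state->h ^= data[i];
        state->h *= 0x100000001b3ULL;              /* FNV-1a prime */
    }
    return SUCCESS;
}

HashReturn Final(hashState *state, BitSequence *hashval) {
    for (int i = 0; i < 8; i++)                    /* big-endian output bytes */
        hashval[i] = (BitSequence)(state->h >> (56 - 8 * i));
    return SUCCESS;
}

/* One-shot call: exactly Init, Update, Final in sequence. */
HashReturn Hash(int hashbitlen, const BitSequence *data,
                DataLength databitlen, BitSequence *hashval) {
    hashState st;
    HashReturn rc = Init(&st, hashbitlen);
    if (rc != SUCCESS) return rc;
    rc = Update(&st, data, databitlen);
    if (rc != SUCCESS) return rc;
    return Final(&st, hashval);
}
```

Note that lengths are given in bits, so hashing the 8-byte string "CUDA SHA" means passing `databitlen = 64`. Splitting the data across several `Update()` calls gives the same digest as one `Hash()` call.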
Skein API
• Init, Update and Final methods, similar to those of the SHA-3 API
• MAC and tree hashing specific additions
• int Skein_256_InitExt(Skein_256_Ctxt_t *ctx, size_t hashBitLen, u64b_t treeInfo, const u08b_t *key, size_t keyBytes);
  • Same parameters as the Init() calls, plus treeInfo/key/keyBytes
• int Skein_256_Final_Pad(Skein_256_Ctxt_t *ctx, u08b_t *hashVal);
  • Pads and processes the final block, but without the OUTPUT type
• int Skein_256_Output(Skein_256_Ctxt_t *ctx, u08b_t *hashVal);
  • Performs just the output stage
• MANY other internal flags and functions for programmer convenience
Skein – Tree Specifics
• Parallel operations in a tree row
  • Update() and Final_Pad()
  • Both functions need updated parameters (tweak values) to incorporate their positions within the tree
• Once tree hashing is complete, Skein_output() is called to perform the final output stage at the root node
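How much parallel work each row offers depends on the leaf size and the fan-out. The sketch below only counts nodes per row; it does not compute tweaks or call the Skein code. The function `tree_rows` and its parameters are hypothetical, and it ignores any cap on tree height.

```c
#include <assert.h>
#include <stddef.h>

/* Fill rows[] with the node count of each tree row, leaf row first.
   Returns the number of rows written. If max_rows is too small the
   result is truncated and the last entry is greater than 1. */
size_t tree_rows(size_t msg_bytes, size_t leaf_bytes, size_t fanout,
                 size_t rows[], size_t max_rows) {
    assert(leaf_bytes > 0 && fanout >= 2 && max_rows > 0);

    /* Leaf row: one node per leaf-sized piece of the message. */
    size_t n = (msg_bytes == 0) ? 1 : (msg_bytes + leaf_bytes - 1) / leaf_bytes;
    size_t count = 0;
    rows[count++] = n;

    /* Each higher row groups `fanout` children under one parent. */
    while (n > 1 && count < max_rows) {
        n = (n + fanout - 1) / fanout;
        rows[count++] = n;
    }
    return count;
}
```

For a 1000-byte message with 64-byte leaves and a fan-out of 2, this gives rows of 16, 8, 4, 2 and 1 nodes, so the available parallelism halves with every row and the root is a single serial step. Each node's row and index within the row are the position information its tweak would have to encode.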
Skein CUDA Code • Demo from website
Paper Status
• Introduction
  • Cryptographic Hash Functions: Overview of hashing functions
  • CUDA: Overview of GPUs and CUDA
• Design
  • Generic Hash Trees: Description of original hash trees
  • MD6 Overview: MD6 algorithm and tree structure
  • Skein Overview: Skein algorithm and tree structure
  • Architecture: Not Complete
• Implementation
  • MD6: MD6 CUDA implementation details
  • Skein: Skein CUDA implementation details
• Performance Evaluation
  • Methodology: Description of important variables
  • Testing Environment: Description of OAK specific details
  • Testing Results: Not Complete
• Applications: A touch of reality (the importance)
• Conclusion: Not Complete
Milestones Achieved
• Learned two new specifications (310+ pages)
  • Understand in-depth details of the algorithms
• Learned CUDA (no experience prior to this quarter)
• Majority of paper is complete
  • Testing, conclusion, and analysis are still needed
• Core MD6 code is complete
  • Architecture-related and memory optimizations are identified and need to be implemented
• Skein testing wrapper is complete
  • Core functionality needs to be written (some of this is the same as the MD6 core)
Future Work
• Aric
  • Take over Skein code for thesis
  • Finish full CUDA review of Skein by May
  • MD6 code interesting, BUT …
• Andy
  • Apply CUDA techniques to thesis algorithm (TBD)
  • Complete README.txt
• Both
  • Information in paper can be used in thesis document