380 likes | 561 Views
2 Gbit/s Hardware Realizations of RIJNDAEL and SERPENT: A Comparative Analysis. Adrian K. Lutz 1 , Jürg Treichler 2 , Frank K. Gürkaynak 3 , Hubert Kaeslin 4 , Gérard Basler 2 , Andres Erni 1 , Stefan Reichmuth 1 , Pieter Rommens 2 , Stephan Oetiker 3 , Wolfgang Fichtner 3.
E N D
2 Gbit/s Hardware Realizations of RIJNDAEL and SERPENT:A Comparative Analysis Adrian K. Lutz1, Jürg Treichler2, Frank K. Gürkaynak3,Hubert Kaeslin4, Gérard Basler2, Andres Erni1,Stefan Reichmuth1, Pieter Rommens2,Stephan Oetiker3, Wolfgang Fichtner3 Integrated Systems Laboratory, ETH Zurich 1 Rijndael Design Team 2 Serpent Design Team 3 Integrated Systems Laboratory 4 Microelectronics Design Center
Context Architectural Transforms Variations of Architectural Parameters Rijndael Architecture Serpent Architecture Comparison Conclusions Contents
Context • Term project at the Integrated Systems Laboratory of ETH Zurich • 2 teams of 3 EE 4th year students • 14 weeks to tape-out • 0.6 μm 3 LM CMOS technology • Maximum die size 50 mm2 • Pin compatibility • Focus on ASIC hardware implementation
Impact of Architectural Transforms Legend Area A Actual Effect Replication Ideal Effect Towards Throughput Pipelining Iterative Decom- position Time Sharing Towards Hardware Efficiency Towards Hardware Economy Time per Data Item T
The Rijndael Chip AES 128bit implementation Andres Erni Adrian K. Lutz Stefan Reichmuth
Algorithm Summary • 128bit user key • 11 128bit round keys • 10 rounds • 8x8-S-box instantiate 1, 2, 5, or 10 rounds
Rijndael Architecture - Overview Parallel Round 1 128 128 32 128 Round Key Register Data Reg 32 128 AddKey Controller Data Reg Key Generator 32 128 Key Reg Parallel Round 2 128 128
Rijndael Architecture - Overview Parallel Round 1 128 128 32 128 Round Key Register Data Reg 32 128 AddKey Controller Data Reg Key Generator 32 128 Key Reg Parallel Round 2 128 128 Largest potential for optimizations in rounds
MixColumns SubBytes ShiftRows Pipeline Reg Pipeline Reg AddKey AddKey InvShiftRows InvSubBytes InvMixCol Rijndael Architecture - Round Basic architecture: separate encryption and decryption paths Encryption Path Decryption Path Problem: Large area requirement
MixColumns SubBytes ShiftRows Pipeline Reg AddKey InvShiftRows InvSubBytes InvMixCol Rijndael Architecture - Round Shared AddKey Function Encryption Path Decryption Path Problem: Large area requirement
MixColumns SubBytes ShiftRows Pipeline Reg AddKey InvShiftRows InvSubBytes InvMixCol Rijndael Architecture - Round SubBytes and InvSubBytes occupy 85 % of the area Encryption Path Decryption Path Problem: Large area requirement
Mult Inverse Aff Trans Mult Inverse Inv Aff Trans Rijndael Architecture - Round Rijndael S-box consists of two operations Encryption Path SubBytes InvSub Bytes Decryption Path Multiplicative inverse can be shared
Aff Trans Mult Inverse Inv Aff Trans Rijndael Architecture - Round Sharing Mult Inverse saves 30 % to 50 % of area Encryption Path SubBytes Decryption Path InvSubBytes
MixColumns Aff Trans ShiftRows AddKey Pipeline Reg InvMixCol Inv Aff Trans InvShiftRows Rijndael Architecture - Round Round architecture with shared Mult Inverse Encryption Path SubBytes 1.5 ns 4 ns 6 ns Mult Inverse 5.5 ns Decryption Path InvSubBytes Problem: Location of intraround pipeline register
Rijndael Architecture - Round Partition of the rounds not suited for intraround pipelining Encryption Decryption: Partition 1 AddRoundKey SubBytes ShiftRows MixColumns AddRoundKey SubBytes ShiftRows MixColumns AddRoundKey SubBytes ShiftRows AddRoundKey AddRoundKey InvSubBytes InvShiftRows InvMixColumns AddRoundKey InvSubBytes InvShiftRows InvMixColumns AddRoundKey InvSubBytes InvShiftRows AddRoundKey 10 1 9 2 9 1 10
Rijndael Architecture - Round Shifting round limits makes intraround pipelining possible. Encryption Decryption: Partition 1 AddRoundKey SubBytes ShiftRows MixColumns AddRoundKey SubBytes ShiftRows MixColumns AddRoundKey SubBytes ShiftRows AddRoundKey AddRoundKey InvSubBytes InvShiftRows InvMixColumns AddRoundKey InvSubBytes InvShiftRows InvMixColumns AddRoundKey InvSubBytes InvShiftRows AddRoundKey 10 1 9 2 9 1 10
Rijndael Architecture - Round Partition of the rounds suited for intraround pipelining Encryption Decryption: Partition 2 AddRoundKey SubBytes ShiftRows MixColumns AddRoundKey SubBytes ShiftRows MixColumns AddRoundKey SubBytes ShiftRows AddRoundKey AddRoundKey InvSubBytes InvShiftRows InvMixColumns AddRoundKey InvSubBytes InvShiftRows InvMixColumns AddRoundKey InvSubBytes InvShiftRows AddRoundKey 10 1 9 2 9 1 10
MixColumns Aff Trans ShiftRows Mult Inverse Pipeline Reg Inv Aff Trans InvShiftRows Rijndael Architecture - Round InvMixColumns can be shifted to a better location Encryption Path SubBytes AddKey InvMixCol Decryption Path InvSubBytes AddKey has to be duplicated
MixColumns Aff Trans ShiftRows Mult Inverse Pipeline Reg Inv Aff Trans InvShiftRows Rijndael Architecture - Round Architecture suited for intraround pipelining 6.5 ns Encryption Path SubBytes 4 ns AddKey 1.5 ns MUX AddKey InvMixCol Decryption Path InvSubBytes 5.5 ns
MixColumns Aff Trans ShiftRows Mult Inverse Pipeline Reg Inv Aff Trans InvShiftRows Rijndael Architecture - Round Introduction of intraround pipelining Encryption Path Encryption Path SubBytes AddKey Pipeline Reg AddKey InvMixCol Decryption Path InvSubBytes Final architecture implemented in silicon
The Serpent Chip Gérard Basler Pieter Rommens Jürg Treichler
Algorithm Summary • 256bit user key • 33 128bit round keys • 32 rounds • 8 types of 4x4-S-boxes instantiate preferably 8, 16, or 32 rounds also possible: 1, 2, or 4 rounds
Architecture Development – Step 1 Round 1 Round 25 Round 2 Round 26 Round 3 Round 27 Round key storage Round 4 Round 28 Round 5 Round 29 Round 6 Round 30 Round 7 Round 31 Round 8 Round 32 Problem: area requirement Key Generation
Architecture Development – Step 2 Round 1 Round 2 Round 3 Round key storage Round 4 Round 5 Round 6 Round 7 Round 8 Problem 1: insufficient throughput Key Generation
Architecture Development – Step 3 Round 1 Round 2 Round key storage Round 3 Round 4 Problem 2: area exceedsfixed allowance Round 5 Round 6 Round 7 Key Generation Round 8
Architecture Development – Step 4 Round 1,5 Round key storage Round 2,6 Round 3,7 Round 4,8 Key Generation
128 128 Serpent Architecture - Round Encryption Path Lin Trans Key Mix S-Box-Array Key Reg Lin Trans-1 S-Box-1-Array Key Mix Decryption Path
128 128 Serpent Architecture - Round Encryption Path S-Box-Array Lin Trans Key Mix S-Box-Array Key Reg S-Box-1-Array Key Mix Lin Trans-1 S-Box-1-Array Decryption Path
Architecture Development – Step 4 Round 1,5 Round key storage Round 2,6 Round 3,7 Problem: longest path Round 4,8 Key Generation
Architecture Development – Step 5 Temp Key Reg Round 1,5 Temp Key Reg Round 2,6 Key Generation On The Fly Temp Key Reg Round 3,7 Key Reg 33rd Key Temp Key Reg Round 4,8
Encryption 128 128 32 32 Round 1 Decryption 128 128 128 Key Generator 32 Round 2 Round 4 Controller Round 3 Serpent Architecture – Final Data In Reg Stack Data Out Key In Reg Ctrl Sig
Comparison: Figures of Merit Comparison based on 128bit keys
Conclusions (I) • Both chips were • designed from scratch • synthesized • fabricated • measured1 • Throughputs and areas almost identical: • in ECB and CTR modes: 2 Gbit/s • in feedback modes: 500 Mbit/s • Contributions towards optimum datapath design 1 Serpent chip: fully functional; Rijndael chip: limited functionality due to error in mask fabrication
Conclusions (II) Considering that the two algorithms are rather different in nature, their respective performances in hardware come remarkably close. • Rijndael • Reuse of the same S-box for all rounds • Serpent • Accommodates multiple key lengths with no impact on hardware architecture • Number of rounds (32) is hardware-friendly