An automated pipeline balancing in the SRC Reconfigurable Computer and its application to the RC5 cipher breaking

Hatim Diab1, Miaoqing Huang1, Kris Gaj2, Tarek El-Ghazawi1, Nikitas Alexandridis1
1 The George Washington University
2 George Mason University

Objectives
• Implement a pipelined RC5 key breaker on a single chip,
• Demonstrate automatic balancing of a pipeline by a compiler (SRC),
• Show the cost of the added pipeline.
1011/MAPLD'04

Requirements
• Given:
  • A matching pair of plaintext message (M) and ciphertext (C)
• Find the corresponding secret key:
  • Test the possible secret keys exhaustively,
  • Keys are 128 bits long, ranging from all 0's to all 1's.
• Requirements:
  • The processing element (PE) is fed a new secret key (Ki) each cycle,
  • Compare C with the output Ci corresponding to Ki.
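The exhaustive search described above can be sketched in software as follows. This is only an illustration of the search loop: `search_keys`, its parameters, and the stand-in cipher `toy_encrypt` are hypothetical names introduced here, and the toy cipher is not RC5.

```python
from typing import Callable, Optional

MASK64 = (1 << 64) - 1

def search_keys(encrypt: Callable[[int, int], int],
                plaintext: int, ciphertext: int,
                first_key: int, num_keys: int) -> Optional[int]:
    """Try num_keys consecutive keys starting at first_key.

    Returns the first key Ki whose output Ci equals the known ciphertext C,
    or None if no key in the range matches.
    """
    for key in range(first_key, first_key + num_keys):
        if encrypt(key, plaintext) == ciphertext:
            return key
    return None

def toy_encrypt(key: int, plaintext: int) -> int:
    """Hypothetical stand-in cipher (NOT RC5), used only to exercise the loop."""
    return (plaintext + key * 0x9E3779B97F4A7C15) & MASK64

# Usage: recover a small key from one known (M, C) pair.
M = 0xDEADBEEF
C = toy_encrypt(1234, M)
found = search_keys(toy_encrypt, M, C, first_key=0, num_keys=5000)
```

In the hardware design the loop body corresponds to one pipeline slot: a new key enters every cycle and the comparison happens when its result leaves the pipeline.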

RC5 Algorithm
• Mixing in the secret key.
  // S[0..25] is the array to be mixed for RC5 encryption
  // L[0..3] is the array converted from the secret key K[0..15]
  // The output is the array S[0..25], which will be used to encrypt the plaintext.
  i=j=0
  A=B=0
  do 3*max(26,4) times
    A=S[i]=(S[i]+A+B)<<<3;
    B=L[j]=(L[j]+A+B)<<<(A+B);
    i=(i+1) mod (26);
    j=(j+1) mod (4);
• Encryption.
  LE=A+S[0]; // A is the upper part of the plaintext
  RE=B+S[1]; // B is the lower part of the plaintext
  for i=1 to 12 do
    LE=((LE⊕RE)<<<RE)+S[2*i];
    RE=((RE⊕LE)<<<LE)+S[2*i+1];
  The processed LE is the upper part of the ciphertext,
  The processed RE is the lower part of the ciphertext.
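The pseudocode above can be made runnable as a minimal software sketch of RC5-32/12/16. The slide does not show how S and L are initialized; the sketch assumes the standard RC5 initialization (constants P32 and Q32, little-endian key bytes) from the published RC5 specification. Function names are chosen here for illustration.

```python
MASK = 0xFFFFFFFF                    # 32-bit words
P32, Q32 = 0xB7E15163, 0x9E3779B9    # constants from the RC5 specification (assumed init)
ROUNDS = 12
S_WORDS = 2 * (ROUNDS + 1)           # 26 entries in S
KEY_WORDS = 4                        # 16 key bytes packed into 4 words

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n (only the low 5 bits of n are used)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK

def expand_key(key: bytes) -> list:
    """Mix the 16-byte secret key K into the array S[0..25]."""
    assert len(key) == 16
    L = [int.from_bytes(key[4 * k:4 * k + 4], "little") for k in range(KEY_WORDS)]
    S = [(P32 + k * Q32) & MASK for k in range(S_WORDS)]
    A = B = i = j = 0
    for _ in range(3 * max(S_WORDS, KEY_WORDS)):   # 78 iterations
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % S_WORDS
        j = (j + 1) % KEY_WORDS
    return S

def encrypt_block(S: list, A: int, B: int) -> tuple:
    """Encrypt one 64-bit block given as two 32-bit words (A, B)."""
    LE = (A + S[0]) & MASK
    RE = (B + S[1]) & MASK
    for i in range(1, ROUNDS + 1):
        LE = (rotl(LE ^ RE, RE) + S[2 * i]) & MASK
        RE = (rotl(RE ^ LE, LE) + S[2 * i + 1]) & MASK
    return LE, RE

# Usage: all-zero key and all-zero plaintext.
S = expand_key(bytes(16))
LE, RE = encrypt_block(S, 0, 0)
```

Each pass through the key-mixing loop corresponds to one key generation macro, and each pass through the encryption loop to one encryption macro, which is why the hardware design unrolls 78 and 12 of them.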

Key-Breaking Flowchart

Condition & Implementation
• RC5-32/12/16
  • Ciphertext: 32*2 bits = 64 bits
  • 12 rounds
  • Key: 16 * 8 bits = 128 bits
• Implement RC5 encryption using
  • 12 rounds of encryption macros, with 6-clock latency
  • 78 iterations of key generation macros, with 3-clock latency
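As a rough worked example, the macro counts and latencies above give an estimate of the pipeline depth. The calculation assumes that the quoted latencies are per macro and that the macros are simply chained with no additional stages (for example, for input, output, or the final comparison); the slides do not confirm either point.

```python
KEY_MACROS, KEY_MACRO_LATENCY = 78, 3   # key generation macros (from the slide)
ENC_MACROS, ENC_MACRO_LATENCY = 12, 6   # encryption macros (from the slide)

def chain_latency(count: int, latency_per_macro: int) -> int:
    """Latency in clocks of `count` macros connected back to back (assumed model)."""
    return count * latency_per_macro

key_clocks = chain_latency(KEY_MACROS, KEY_MACRO_LATENCY)   # 78 * 3 = 234
enc_clocks = chain_latency(ENC_MACROS, ENC_MACRO_LATENCY)   # 12 * 6 = 72
total_clocks = key_clocks + enc_clocks                      # 306
```

Under these assumptions a key would take about 306 clocks to traverse the pipeline, while throughput stays at one key per clock once the pipeline is full.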

Design & Bottleneck
• Pipelined design
  • Process one key every clock cycle in a pipelined fashion
• Data dependencies
  • One of the features of RC5 is the extensive use of data-dependent rotations,
  • S value needed every 26th step,
  • L value needed every 4th step,
• Manual HDL-based realization of the pipeline proved to be time-consuming and error-prone.
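The S and L dependencies can be illustrated with a small sketch that reports, for each key-mixing step, which array entries it touches and which earlier step last wrote them. The clock estimate in `carry_clocks` is a simplified model introduced here (3 clocks per step, value produced at the end of one step and consumed at the start of a later one); it is not taken from the actual design.

```python
S_WORDS, KEY_WORDS = 26, 4
STEPS, STEP_LATENCY = 78, 3

def dependency(step: int) -> tuple:
    """For a 0-based key-mixing step, return (i, j, prev_S_step, prev_L_step).

    prev_S_step / prev_L_step are the steps that last wrote S[i] / L[j],
    or None if the step reads the initial value.
    """
    i, j = step % S_WORDS, step % KEY_WORDS
    prev_s = step - S_WORDS if step >= S_WORDS else None
    prev_l = step - KEY_WORDS if step >= KEY_WORDS else None
    return i, j, prev_s, prev_l

def carry_clocks(distance_in_steps: int, latency: int = STEP_LATENCY) -> int:
    """Clocks a value must be held between producer and consumer (assumed model)."""
    return (distance_in_steps - 1) * latency

# Usage: step 30 reads S[4] (written by step 4) and L[2] (written by step 26).
example = dependency(30)
s_hold = carry_clocks(S_WORDS)    # S values span 26 steps
l_hold = carry_clocks(KEY_WORDS)  # L values span 4 steps
```

Tracking this many long-lived values by hand is the kind of bookkeeping that makes a manual HDL pipeline error-prone.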

Data Dependencies in Each Iteration

Solution
• Implement on one FPGA chip concurrently
  • 78 key initialization macros
  • 12 encryption macros
• Connect the macros in a linear pipeline.
• The SRC compiler balances the pipeline by inserting delay channels so that all macros run synchronously.

Delay Channels Added by SRC Compiler
[Figure legend: wire; Delay 1 = 1 reg; Delay 2 = 2 reg; Delay 5 = 5 reg]
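The idea of balancing a pipeline with delay channels can be sketched as a small scheduling routine. This is a generic illustration, not the SRC compiler's actual algorithm: each macro starts when its slowest input is ready, and every faster input gets a delay equal to the difference. The graph, the node names, and the assumption that all source nodes start at clock 0 are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def balance(latency: dict, inputs: dict) -> tuple:
    """Compute start clocks and per-edge delay registers.

    latency: node -> latency in clocks
    inputs:  node -> list of predecessor nodes feeding it
    Returns (start, delays) where delays[(src, dst)] is the number of
    registers to insert on that connection.
    """
    start, delays = {}, {}
    for node in TopologicalSorter(inputs).static_order():
        preds = inputs.get(node, [])
        finish = {p: start[p] + latency[p] for p in preds}
        start[node] = max(finish.values(), default=0)
        for p in preds:
            delays[(p, node)] = start[node] - finish[p]
    return start, delays

# Usage: a key input feeding three chained 3-clock macros (hypothetical graph).
latency = {"key": 0, "step0": 3, "step1": 3, "step2": 3}
inputs = {
    "step0": ["key"],
    "step1": ["step0", "key"],
    "step2": ["step1", "key", "step0"],
}
start, delays = balance(latency, inputs)
```

In this example the direct connections between consecutive macros need no delay, while the key must be delayed by 3 and 6 clocks to reach `step1` and `step2` in step with the data it belongs to.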

Detailed flow

Compilation Result
• Device utilization summary:
  Number of External IOBs           594 out of 1104     53%
  Number of LOCed External IOBs     594 out of 594     100%
  Number of Slices                33790 out of 33792    99%
  Number of BUFGMUXs                  1 out of 16        6%
• Maximum Clock Frequency

Effectiveness of the Benchmark

Conclusion
• The objective was realized, i.e., one 128-bit variable is pushed into the processing chain every clock,
• A speed-up of 1000x over SW and 300x over serial HW implementations was achieved,
• Because the parameters of the RC5 algorithm are flexible, different map routines can be designed to fit distinct area and throughput requirements,
• The automated pipeline balancing of the SRC compiler substantially decreased the development time of complex pipelined designs.
