Souradyuti Paul and Bart Preneel K.U. Leuven, ESAT/COSIC

A New Weakness in the RC4 Keystream Generator and an Approach to Improve the Security of the Cipher Souradyuti Paul and Bart Preneel K.U. Leuven, ESAT/COSIC FSE 2004 New Delhi, India February 6, 2004

Overview of the Presentation • Description of RC4 • Main Contributions • Anomaly in the first two bytes of RC4 • Estimating the bias in the first two bytes of RC4 • RC4A: A New Stream Cipher • Design Principle of RC4A • Conclusions

Description of RC4 • based on an exchange shuffle paradigm • the algorithm runs in two phases: the key-scheduling algorithm, then the pseudo-random generation algorithm • the pseudorandom bytes are bit-wise XORed with the plaintext bytes

RC4 (1987) • designed by Ron Rivest (MIT) • leaked out in 1994 • Key Scheduling Algorithm: S[0..255] is a secret table derived from the user key K (usually 40 to 256 bits, i.e. l bytes indexed cyclically)
    for i = 0 to 255
        S[i] := i
    j := 0
    for i = 0 to 255
        j := (j + S[i] + K[i mod l]) mod 256
        swap S[i] and S[j]
    i := 0, j := 0
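
The key scheduling steps above can be sketched in Python as follows. This is a minimal illustration, assuming the key is a byte string that is reused cyclically when it is shorter than the table; the function name `ksa` and the parameter `n` are illustrative and do not come from the slides.

```python
def ksa(key: bytes, n: int = 256) -> list[int]:
    """RC4 key scheduling: derive a permutation of 0..n-1 from the key."""
    if not key:
        raise ValueError("key must be non-empty")
    s = list(range(n))  # start from the identity permutation
    j = 0
    for i in range(n):
        # the key is indexed cyclically (assumption: K[i mod l])
        j = (j + s[i] + key[i % len(key)]) % n
        s[i], s[j] = s[j], s[i]
    return s


state = ksa(b"Key")
# Only swaps are applied, so the table stays a permutation of 0..255.
print(sorted(state) == list(range(256)))
```

Because the table is only ever modified by swaps, every key yields a permutation; the key merely selects which one.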

RC4 (1987) • Pseudo-random Generation Algorithm: generates the keystream, which is XORed with the plaintext. For each output byte:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap S[i] and S[j]
    t := (S[i] + S[j]) mod 256
    output S[t]
[Figure: example state table S with the pointers i, j and the output index t marked.]
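
A minimal Python sketch of the generation phase, for illustration only. The names `ksa`, `prga` and `rc4_xor` are assumptions of this sketch, and the key scheduling is repeated (with a cyclically indexed key) so that the block runs on its own.

```python
from typing import Iterator


def ksa(key: bytes) -> list[int]:
    """Key scheduling, repeated here so the example is self-contained."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def prga(s: list[int]) -> Iterator[int]:
    """Yield RC4 keystream bytes; mutates the state table s in place."""
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def rc4_xor(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream."""
    return bytes(b ^ k for b, k in zip(data, prga(ksa(key))))


ciphertext = rc4_xor(b"Key", b"Plaintext")
print(ciphertext.hex())
```

With the widely circulated test key `Key` and plaintext `Plaintext`, the printed ciphertext should be `bbf316e8d940af0ad3`. Applying `rc4_xor` again with the same key recovers the plaintext, since encryption and decryption are the same XOR.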

Main Contributions • A ‘new’ statistical bias in the distribution of the first two output bytes. • Existence of the Bias after dropping the first N bytes. • A possible method to improve the security and performance of the cipher.

The First Two Outputs are Unequal When S_0[1] = 2 • Assume that after the key scheduling algorithm P[S_0[1] = 2] = 1/N. [Figure: the initial state S_0 over indices 0 to N-1, with the pointers i and j marked.]

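The First Two Outputs are Unequal When S_0[1] = 2 (Contd.) • Let S_r denote the state after round r, and write X = S_0[2] and Z = S_0[4]. • Round 1: i = 1, j = S_0[1] = 2; the swap gives S_1[1] = X and S_1[2] = 2. Output: S_1[X+2]. • Round 2: i = 2, j = 2 + S_1[2] = 4; the swap gives S_2[2] = Z and S_2[4] = 2. Output: S_2[Z+2]. • S_1[X+2] ≠ S_2[Z+2] [Figures: the states S_1 and S_2 over indices 0 to N-1, with the pointers i and j marked.]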
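
The claim can be checked exhaustively for a reduced table size. The sketch below is an illustration, not part of the presentation: it assumes a scaled-down RC4 with N = 8, enumerates every initial permutation whose entry at index 1 equals 2, and counts how many of them produce two equal first outputs. The helper names are hypothetical.

```python
from itertools import permutations


def first_two_outputs(s0, n):
    """Run two RC4 rounds from state s0 (i = j = 0) and return both outputs."""
    s = list(s0)
    i = j = 0
    out = []
    for _ in range(2):
        i = (i + 1) % n
        j = (j + s[i]) % n
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % n])
    return out[0], out[1]


def count_equal_when_s1_is_2(n):
    """Count initial states with entry 2 at index 1 whose first two outputs coincide."""
    equal = 0
    for perm in permutations(range(n)):
        if perm[1] != 2:
            continue
        z1, z2 = first_two_outputs(perm, n)
        equal += (z1 == z2)
    return equal


print(count_equal_when_s1_is_2(8))
```

The count printed is 0: no initial state with the value 2 at index 1 yields equal first and second outputs.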

Strong Distinguisher • A distinguisher is an algorithm that distinguishes a given stream of bits from a perfectly random stream of bits. • A strong distinguisher is a distinguisher that detects a bias at particular locations across several randomly chosen streams of bits.
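
As a rough illustration of such a test, the sketch below measures how often two adjacent bytes at a fixed position are equal across many streams and compares the frequency with a threshold. The function names and the midpoint decision rule are assumptions of this sketch; the rule takes 1/N as the frequency for random bytes and (1/N)(1 - 1/N) as the frequency the presentation derives for RC4.

```python
from typing import Iterable, Sequence


def equal_pair_frequency(streams: Iterable[Sequence[int]], pos: int = 0) -> float:
    """Fraction of streams whose bytes at positions pos and pos + 1 are equal."""
    total = 0
    equal = 0
    for stream in streams:
        total += 1
        if stream[pos] == stream[pos + 1]:
            equal += 1
    if total == 0:
        raise ValueError("at least one stream is required")
    return equal / total


def guess_is_rc4(freq: float, n: int = 256) -> bool:
    """Hypothetical decision rule: RC4 if freq is below the midpoint
    between 1/n (random) and (1/n)(1 - 1/n) (biased)."""
    midpoint = (1 / n) * (1 - 1 / (2 * n))
    return freq < midpoint


# Toy input only; a real experiment needs many keystreams from random keys.
sample = [bytes([1, 1]), bytes([1, 2]), bytes([3, 4]), bytes([5, 5])]
freq = equal_pair_frequency(sample)
print(freq, guess_is_rc4(freq))
```

The two frequencies differ only by 1/N^2, so a meaningful decision requires a very large number of streams; the four-element sample above only exercises the code.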

Quantifying the Bias • We assume that the first two output bytes are equal with probability 1/N when S_0[1] ≠ 2. • Therefore, the probability that the first two output bytes are equal is (1/N)(1 - 1/N). • The sample size needed to ‘noticeably’ distinguish the RC4 keystream from a random stream of bits is O(N^3) bytes. • Experiments show that 2^24 pairs of bytes suffice to show the bias for N = 256.
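
The probability on this slide follows from conditioning on whether the entry at index 1 of the initial state equals 2. In the worked step below, z_1 and z_2 are names introduced here for the first two output bytes; the two conditional probabilities are the ones stated in the presentation (0 when the entry equals 2, and 1/N by assumption otherwise).

```latex
\begin{align*}
P[z_1 = z_2]
  &= P[S_0[1] = 2]\cdot P[z_1 = z_2 \mid S_0[1] = 2]
   + P[S_0[1] \neq 2]\cdot P[z_1 = z_2 \mid S_0[1] \neq 2] \\
  &= \frac{1}{N}\cdot 0 + \Bigl(1 - \frac{1}{N}\Bigr)\cdot\frac{1}{N}
   = \frac{1}{N}\Bigl(1 - \frac{1}{N}\Bigr).
\end{align*}
```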

Distinguishing Attacks on RC4

The Bias after Dropping the Initial N Bytes • We assume that P[j = 0] = 1/N after the initial N rounds. • Therefore, after dropping the initial N bytes, the probability that the first two remaining output bytes are equal is (1/N)(1 - 1/N^2). • In this case, O(N^5) bytes are required to ‘reliably’ distinguish RC4 outputs from random outputs. • Experimentally, 2^32 pairs of bytes suffice to detect the bias for N = 256.
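
The orders of magnitude N^3 and N^5 can be reproduced with a common rule of thumb, used here as an assumption that is not stated on the slides: if an event has probability p under one distribution and p(1 + q) under the other, on the order of 1/(p q^2) samples are needed to tell them apart. The function name is illustrative and constant factors are omitted.

```python
from fractions import Fraction


def samples_needed(p: Fraction, q: Fraction) -> Fraction:
    """Order-of-magnitude sample count 1 / (p * q^2), constant factor omitted."""
    return 1 / (p * q * q)


N = 256
p = Fraction(1, N)                                   # P[equal pair] for random bytes
first_two = samples_needed(p, Fraction(1, N))        # relative bias 1/N
after_drop = samples_needed(p, Fraction(1, N * N))   # relative bias 1/N^2

print(first_two == N**3, after_drop == N**5)
```

For N = 256 the two estimates are 2^24 and 2^40. The first matches the experimental figure for the initial two bytes, while the experiments after dropping N bytes needed fewer samples (2^32 pairs) than the second estimate.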

Distinguishers after N bytes

Recommendation • Experimentally, our distinguisher works better than the estimate predicts, partly because of the huge difference between the size of the permutation space and that of the key space. This difference necessarily implies that the initial permutation is not uniformly distributed. • Based on this observation, we recommend discarding at least the first 2N output bytes of RC4 in all future applications.

RC4A: A Modification of RC4 • RC4A runs in two phases: the Key Scheduling Algorithm followed by the Pseudo-random Generation Algorithm. • We modify only the Pseudo-random Generation Algorithm of RC4 in order to achieve better security. • The Key Scheduling Algorithm of RC4 is assumed to be ‘perfect’ and is reused in RC4A.

RC4A: Main Motivation • most of the known attacks on RC4 exploit the correlation between the outputs and the random input variables • the main objective is to make the outputs depend on more random variables • a second objective is to reduce the number of instructions per output byte • exchange shuffle model

RC4A: Description • Take a key K1 and generate another key K2 using a pseudorandom bit generator (e.g. RC4). • Generate two random permutations of N elements, namely S1 and S2, by applying K1 and K2, respectively, to the identity permutation. • To generate S1 and S2 we may use the Key Scheduling Algorithm of RC4.

RC4A: Description of the Pseudorandom Generation Algorithm of RC4A • Input: (S1, S2)
    1. i := 0, j1 := 0, j2 := 0;
    2. i := (i + 1) mod N;
    3. j1 := (j1 + S1[i]) mod N;
    4. Swap S1[i] and S1[j1];
    5. I := (S1[i] + S1[j1]) mod N;
    6. Output := S2[I];

RC4A: Description of the Pseudorandom Generation Algorithm of RC4A (contd.)
    7. j2 := (j2 + S2[i]) mod N;
    8. Swap S2[i] and S2[j2];
    9. I := (S2[i] + S2[j2]) mod N;
    10. Output := S1[I];
    11. Repeat from Step 2.
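
Steps 1 to 11 can be sketched in Python as below. This is an illustrative sketch rather than a reference implementation: the function names are assumptions, the two keys are hypothetical, and the derivation of K2 from K1 with a pseudorandom bit generator is left out, so both keys are simply supplied by hand.

```python
from itertools import islice
from typing import Iterator


def ksa(key: bytes, n: int = 256) -> list[int]:
    """RC4 key scheduling, reused unchanged to build each table."""
    s = list(range(n))
    j = 0
    for i in range(n):
        j = (j + s[i] + key[i % len(key)]) % n
        s[i], s[j] = s[j], s[i]
    return s


def rc4a_prga(s1: list[int], s2: list[int]) -> Iterator[int]:
    """Yield RC4A output bytes; mutates both tables in place."""
    n = len(s1)
    i = j1 = j2 = 0                          # step 1
    while True:
        i = (i + 1) % n                      # step 2
        j1 = (j1 + s1[i]) % n                # step 3
        s1[i], s1[j1] = s1[j1], s1[i]        # step 4
        yield s2[(s1[i] + s1[j1]) % n]       # steps 5-6: index from S1, byte from S2
        j2 = (j2 + s2[i]) % n                # step 7
        s2[i], s2[j2] = s2[j2], s2[i]        # step 8
        yield s1[(s2[i] + s2[j2]) % n]       # steps 9-10: index from S2, byte from S1
        # step 11: the loop returns to step 2


# Hypothetical keys for illustration only; K2 is not derived from K1 here.
k1, k2 = b"example key 1", b"example key 2"
s1, s2 = ksa(k1), ksa(k2)
stream = list(islice(rc4a_prga(s1, s2), 16))
print(bytes(stream).hex())
```

Each pass through the loop increments i once and emits two bytes, and each byte is read from the table that did not supply its index.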

Security: RC4A Vs RC4 • The number of internal states of RC4A is approximately N^3 · (N!)^2, compared to N^2 · N! for RC4. • At every round of RC4A, one output byte depends on at least three variables, compared to only two variables for RC4. • The upper bound on the probability of guessing the maximum number of elements of the permutation from known outputs is 1/N^2, compared to 1/N for RC4, under reasonable assumptions.
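
To give a feel for the two state counts, the short sketch below expresses them in bits for N = 256. It only evaluates the two formulas from the slide; the helper name is illustrative.

```python
import math


def log2_factorial(n: int) -> float:
    """log2(n!) computed through the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)


N = 256
rc4_bits = 2 * math.log2(N) + log2_factorial(N)        # log2(N^2 * N!)
rc4a_bits = 3 * math.log2(N) + 2 * log2_factorial(N)   # log2(N^3 * (N!)^2)

print(round(rc4_bits), round(rc4a_bits))
```

For N = 256 this gives about 1700 bits of state for RC4 and about 3392 bits for RC4A, so the state space is roughly squared.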

Security: RC4A Vs RC4 (Contd.) • The computational cost of deriving the secret internal state of RC4A is much higher (C^2 compared to C for RC4, under reasonable assumptions). • The number of fortuitous states in RC4A is smaller than in RC4. • The ‘Second Byte’ attack on RC4 by Mantin and Shamir is also weakened in RC4A (N^3 bytes are required).

Prospect of a fast stream cipher • RC4A uses fewer instructions: the i pointer is incremented once to generate two successive output bytes. • Existence of parallel steps.

Remarks on RC4A • It seems plausible that RC4A can be improved further. • The main idea was to decorrelate an index pointer from the value it points to. • The attack by Golic remains difficult to prevent. • Generating outputs of more than 8 bits is possible future work.

Conclusions • We detected a new bias that does not disappear after N rounds. • A new stream cipher, RC4A, is obtained by a simple modification of RC4.
