Longer Keys may Facilitate Side Channel Attacks
Colin D. Walter
www.comodogroup.com (Bradford, UK)
colin.walter@comodogroup.com
C●O●M●O●D●O RESEARCH LAB
Overview
• Side Channel Attacks as motivation for looking at RSA key lengths.
• Extracting Data by Power and Timing Attacks.
• Reconstructing Secret Keys.
• Comparing different key lengths for:
  • a timing attack
  • a power attack
• Conclusion
Colin D. Walter, Comodo Research Lab, Bradford. Next Generation Digital Security Solutions
Timing & Power Analysis Attacks
• Conditional statements in executing code can cause minute variations in the time taken for decryption and signing. This may leak information about the secret key.
• Changing inputs to H/W gates causes minute data-dependent current variations in a smart card. This leaks secret data when performing RSA decryption or signing.
• For example, in the standard implementation, the average time for a modular multiplication differs from that of a modular squaring. Power variations make this visible. Use of the binary exponentiation algorithm then reveals the secret key.
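The last bullet can be illustrated with a minimal Python sketch. It assumes the attacker has already labelled each long-integer operation as a squaring (S) or a multiplication (M); the function names and the small modulus are hypothetical and chosen only for the example. With left-to-right binary exponentiation, the S/M sequence then spells out the exponent bit by bit.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation that also logs each operation."""
    result = 1
    ops = []  # 'S' for a squaring, 'M' for a multiplication
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        ops.append("S")
        if bit == "1":
            result = (result * base) % modulus
            ops.append("M")
    return result, ops


def recover_exponent(ops):
    """Rebuild the exponent from the observed S/M sequence."""
    bits = []
    for op in ops:
        if op == "S":
            bits.append("0")   # every squaring starts a new exponent bit
        else:
            bits[-1] = "1"     # a following multiplication marks that bit as 1
    return int("".join(bits), 2)


secret = 0b1011001
value, observed = square_and_multiply(7, secret, 1000003)
print("".join(observed), bin(recover_exponent(observed)))
```

The sketch says nothing about how the S/M labels are obtained; that is the subject of the timing and power attacks in the rest of the talk.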
History
• NSA Tempest programme
• P. Kocher (Crypto 96): Timing attack on implementations of Diffie-Hellman, RSA, DSS, and other systems
• Dhem, Quisquater, et al. (CARDIS 98): A practical implementation of the Timing Attack
• P. Kocher, J. Jaffe & B. Jun (Crypto 99): Introduction to Differential Power Analysis ….
• Messerges, Dabbish & Sloan (CHES 99): Power Analysis Attacks of Modular Exponentiation in Smartcards
Recent Attacks
• C. D. Walter & S. Thompson (CT-RSA 2001): Distinguishing Exponent Digits by Observing Modular Subtractions
  • a timing attack which averaged over a number of exponentiations with the same exponent
• C. D. Walter (CHES 2001): Sliding Windows succumbs to Big Mac Attack
  • a DPA attack which averaged using the trace from a single exponentiation
Question
• Counter-measures can be employed, but there is no guarantee that better monitoring equipment and better statistical techniques will not still reveal the key. So:
• How much protection is there in selecting a longer key length for RSA?
• The body of the talk looks at the last two attacks to see how much more difficult they are for longer keys.
• Surprisingly, it appears that longer RSA & ECC keys are weaker under the power and timing attacks.
Security Model
• Smartcard running RSA;
• Unknown secret exponent D;
• Known algorithms & H/W characteristics;
• Single H/W multiplier;
• Non-invasive, passive attack;
• Attacker unable to read or influence I/O directly;
• He can observe timing variations in long integer multiplications;
• He can measure multiplier power usage;
• He can check correctness of D.
The Timing Attack on RSA
Context:
• Need to compute $AB \bmod M$.
• Output from the main loop of the Montgomery modular multiplier: $P < 2M$.
• Expected output: $P < M$ (or $P < 2^n$).
• So a conditional subtraction is performed in S/W.
• This affects timing, and so we assume it can be observed.
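The context above can be sketched in Python as follows. This is a whole-word version of Montgomery multiplication, not the digit-serial loop a smart card would run, and it assumes the subtraction is triggered by P ≥ M (the slide also mentions the variant that tests against 2^n). The function name and the returned flag are illustrative.

```python
def montgomery_multiply(a, b, m, r_bits):
    """Return (a*b*R^-1 mod m, subtracted) for odd m < R = 2**r_bits."""
    r = 1 << r_bits
    m_neg_inv = (-pow(m, -1, r)) % r      # -m^-1 mod R (needs Python 3.8+)
    t = a * b
    q = (t * m_neg_inv) % r               # chosen so that t + q*m is divisible by R
    p = (t + q * m) >> r_bits             # loop output, 0 <= p < 2m
    subtracted = p >= m                   # the data-dependent branch
    if subtracted:
        p -= m
    return p, subtracted


m = (1 << 61) - 1                         # hypothetical odd modulus
product, flag = montgomery_multiply(123456789012345678, 987654321098765432 % m, m, 64)
print(product, flag)
```

The `subtracted` flag stands for the timing difference that the attacker is assumed to observe.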
Distribution of Products
• The loop output in Montgomery modular multiplication is uniformly distributed over the interval $[ABR^{-1}, ABR^{-1} + M)$. So the probability of the conditional subtraction can be computed from the distributions of $A$ and $B$.
• This shows that the probabilities $\pi_{mu}$ and $\pi_{sq}$ are different for multiplications and squares. So they can be distinguished if enough samples are available.
• This makes the usual binary “square and multiply” algorithm vulnerable to attack.
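A small Monte Carlo sketch can make the difference between the two probabilities visible. It assumes inputs drawn uniformly from [0, M), a subtraction whenever the loop output is at least M, and a hypothetical 64-bit modulus close to R; none of these values come from the talk. Under that uniform model the frequencies should come out near M/(4R) for multiplications and M/(3R) for squarings, because the mean of AB is M²/4 while the mean of A² is M²/3.

```python
import random

R_BITS = 64
R = 1 << R_BITS
M = R - 59                       # hypothetical odd modulus close to R
M_NEG_INV = (-pow(M, -1, R)) % R


def needs_subtraction(a, b):
    """True if the Montgomery loop output for a*b is >= M."""
    t = a * b
    q = (t * M_NEG_INV) % R
    return ((t + q * M) >> R_BITS) >= M


def subtraction_frequencies(samples=20000, seed=1):
    """Estimate the subtraction frequency for multiplications and squarings."""
    rng = random.Random(seed)
    mul = sum(needs_subtraction(rng.randrange(M), rng.randrange(M))
              for _ in range(samples))
    sq = 0
    for _ in range(samples):
        a = rng.randrange(M)
        sq += needs_subtraction(a, a)
    return mul / samples, sq / samples


pi_mu, pi_sq = subtraction_frequencies()
print(f"multiplications: {pi_mu:.3f}, squarings: {pi_sq:.3f}")
```

In a real exponentiation the inputs are not independent uniform values, so the figures here indicate the size of the effect rather than what a given card would show.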
Separating Multiplications & Squares
• Let $Q = (q_{ij})$ be the matrix for which $q_{ij}$ = 1 or 0 according to whether or not there is a conditional subtraction in the $i$th modular multiplication of the $j$th exponentiation.
• It is possible to compute the averages, variances, etc. of the Hamming distances between the rows.
• Rows for multiplications have separations clustered round one average, rows for squares cluster round another, and distances from multiplications to squares cluster round a third. This enables the rows to be partitioned into two sets, M and S.
• The probability of one row being close to the wrong set is small, but computable (and decreases as the sample size increases).
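The three clusters of distances can be seen in a simplified simulation. The sketch below assumes the entries of Q are independent Bernoulli variables with hypothetical probabilities 0.25 (multiplications) and 1/3 (squares); a real attack would see correlated entries, so this illustrates only why the averages differ, not the partitioning procedure itself.

```python
import random
from itertools import combinations


def simulate_rows(count, prob, samples, rng):
    """Rows of Q: an entry is 1 if a conditional subtraction occurred."""
    return [[1 if rng.random() < prob else 0 for _ in range(samples)]
            for _ in range(count)]


def hamming(row_a, row_b):
    """Number of positions in which two rows differ."""
    return sum(x != y for x, y in zip(row_a, row_b))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rng = random.Random(2)
N = 2000                                        # exponentiations (columns)
mult_rows = simulate_rows(20, 0.25, N, rng)     # assumed pi_mu
sq_rows = simulate_rows(20, 1 / 3, N, rng)      # assumed pi_sq

mm = mean(hamming(a, b) for a, b in combinations(mult_rows, 2))
ss = mean(hamming(a, b) for a, b in combinations(sq_rows, 2))
ms = mean(hamming(a, b) for a in mult_rows for b in sq_rows)
print(f"mult-mult {mm:.0f}, mult-square {ms:.0f}, square-square {ss:.0f}")
```

With independent entries the expected distances are N·2p(1−p) within a set and N·(p(1−q) + q(1−p)) across sets, which is why three distinct averages appear.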
Doubling the Key Length
• Now double the key length $n$ but keep all other parameters the same. Will the number of errors increase (a stronger key) or decrease (a weaker key)?
• There are twice as many multiplicative operations, so the sets S and M of squares and multiplies are twice as big.
• The average distances between one row and the (provisional) sets S and M are unchanged, but the variances are halved. This makes an individual classification error less likely.
• If the probability of an error in two multiplicative operations of the $2n$-bit key is less than that of an error in one multiplicative operation of the $n$-bit key, longer keys are weaker.
Doubling the Key Length
• Let $Z$ be a normal $N(0,1)$ random variable representing the (scaled) distance of a row of $Q$ to the set S or the set M, and let $\delta$ be the distance at which the row is more likely to belong to the other set.
• Then $\delta_{2n} = \sqrt{2}\,\delta_n$ because $\delta$ is inversely proportional to the S.D.
• The probability of classifying an operation correctly for key length $n$ is $1 - p(Z > \delta_n)^2$.
• The probability of classifying two operations correctly for key length $2n$ is $\left(1 - p(Z > \sqrt{2}\,\delta_n)^2\right)^2$.
• From tables, the first is smaller if $\delta_n > 0.616$.
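The threshold can be checked numerically rather than from tables. The sketch below evaluates the two probabilities with the standard normal tail computed from `math.erfc` and finds the crossover by bisection; the function names and the bracketing interval [0.1, 2.0] are choices made for this example. The result should land close to the 0.616 quoted above.

```python
import math


def upper_tail(x):
    """p(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def one_op_correct(delta):
    """Key length n: probability of classifying one operation correctly."""
    return 1 - upper_tail(delta) ** 2


def two_ops_correct(delta):
    """Key length 2n: probability of classifying two operations correctly."""
    return (1 - upper_tail(math.sqrt(2) * delta) ** 2) ** 2


def crossover(lo=0.1, hi=2.0, iterations=60):
    """Bisection for the delta at which the two probabilities are equal."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if two_ops_correct(mid) < one_op_correct(mid):
            lo = mid      # below the threshold: the longer key is still harder
        else:
            hi = mid
    return (lo + hi) / 2


print(f"crossover near delta = {crossover():.3f}")
```

Above the crossover, two operations of the 2n-bit key are classified correctly with higher probability than one operation of the n-bit key.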
Result
• So longer keys are weaker if $\delta > 0.616$.
• But $\delta$ is proportional to $\sqrt{N}$, where $N$ is the sample size. So the condition holds, and longer keys are weaker, if enough exponentiations are available with the same key.
• Several hundred samples are enough under good conditions. (The actual number depends on the accuracy of data collection, the ratio of the modulus to the Montgomery constant, etc., and decreases as key length increases.)
The DPA Attack on RSA
• Assume that the exponent is blinded and there is no timing variation, so the secret key must be recovered from a single use.
• As a result of gate switching, a $k$-bit digit multiplication $a \times b$ has a data-dependent contribution to power consumption that is roughly linear in the Hamming weights of $a$ and $b$.
• Variation resulting from the previous state can be averaged away for long integers $A = \sum_{i=0}^{s-1} a_i r^i$: for each $a_i$, the traces for $a_i \times b_j$ are averaged as $j$ varies. These are concatenated to give a trace of length $s$, characteristic of $A$.
Distances between Traces
[Figure: power traces tr0 and tr1 plotted against digit index i, from 0 to s]
The scaled Euclidean distance between traces for $A_0$ and $A_1$ is
$d_{0,1} = \left( s^{-1} \sum_{i=0}^{s-1} \left( tr_0(i) - tr_1(i) \right)^2 \right)^{1/2}$
Average Separation
• Let $Q = (q_{ij})$ be the matrix for which $q_{ij} \in \mathbb{R}$ is the averaged trace weight associated with the $j$th multiplicand digit in the $i$th modular multiplication. Use the Euclidean distance between rows, divided by the number of digits $s$.
• For modular multiplications with different multiplicands using a $k$-bit multiplier, the average distance apart is $\left(k(s+1)/2 + 2\sigma^2\right)^{1/2}$, where $\sigma^2$ is the variance of measurement noise.
• For multiplications with a common multiplicand, this distance is only $\left(k/2 + 2\sigma^2\right)^{1/2}$.
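A toy simulation shows why traces that share a multiplicand sit closer together. The leakage model below (power equal to HW(a_i) + HW(b_j) plus Gaussian noise), the root-mean-square form of the scaled distance, and the values of k, s and σ are all assumptions made for this sketch. It does not attempt to reproduce the constants in the formulas above.

```python
import math
import random


def hamming_weight(x):
    return bin(x).count("1")


def averaged_trace(a_digits, b_digits, sigma, rng):
    """One point per digit a_i: mean simulated power over all a_i x b_j."""
    trace = []
    for a in a_digits:
        total = 0.0
        for b in b_digits:
            # assumed leakage model: HW(a_i) + HW(b_j) plus Gaussian noise
            total += hamming_weight(a) + hamming_weight(b) + rng.gauss(0, sigma)
        trace.append(total / len(b_digits))
    return trace


def scaled_distance(tr0, tr1):
    """Root-mean-square difference between two traces of equal length."""
    s = len(tr0)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(tr0, tr1)) / s)


def random_digits(s, k, rng):
    return [rng.getrandbits(k) for _ in range(s)]


rng = random.Random(3)
k, s, sigma = 32, 32, 2.0          # illustrative values: 1024-bit operands
A = random_digits(s, k, rng)       # shared multiplicand
A_other = random_digits(s, k, rng)
tr_same_1 = averaged_trace(A, random_digits(s, k, rng), sigma, rng)
tr_same_2 = averaged_trace(A, random_digits(s, k, rng), sigma, rng)
tr_other = averaged_trace(A_other, random_digits(s, k, rng), sigma, rng)

print("common multiplicand:   ", round(scaled_distance(tr_same_1, tr_same_2), 2))
print("different multiplicand:", round(scaled_distance(tr_same_1, tr_other), 2))
```

Averaging over j suppresses both the noise and the contribution of the other operand, so the digits of the shared multiplicand dominate the trace.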
Results
• Multiplications can be identified because they are close together: they share a common multiplicand (the initial plaintext input).
• Squares can be identified because they are not close: they have different multiplicands.
• For m-ary exponentiation, different exponent digits can be recognised: the set of multiplications for the same digit share a common multiplicand and so are close together.
• So the secret key can be recovered.
Longer Keys?
• Again, consider doubling the key length to see what happens.
• A longer key means more $k$-bit digits, so a better average in traces and longer concatenated traces, and hence a higher probability of classifying multiplications correctly.
• As before, sets M and S are twice the size, and so variances of interest are halved.
• Since successive digit multiplications are not independent, simulations give a more accurate view than can be achieved by theory.
Simulation
Example: Distance statistics for gate switching in 8-ary exponentiation with a 32-bit multiplier.

Key length (bits)   128   256   512   1024   2048
Av to nearest       234   201   177    176    171
SD to nearest       137   129   106    110    100
Av to others        324   434   843   1453   2153
SD to others         78   102   140    131    118

(Smaller key length choices to help illustrate trends.)
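One way to read the trend in these figures is to express the gap between the two averages in units of the standard deviations. The ratio used below is an illustrative measure chosen for this sketch, not the calculation used in the talk; the data are the values from the simulation example.

```python
key_lengths = [128, 256, 512, 1024, 2048]
av_nearest = [234, 201, 177, 176, 171]
sd_nearest = [137, 129, 106, 110, 100]
av_others = [324, 434, 843, 1453, 2153]
sd_others = [78, 102, 140, 131, 118]


def separation(i):
    """Gap between the two averages in units of the combined S.D. (illustrative)."""
    return (av_others[i] - av_nearest[i]) / (sd_nearest[i] + sd_others[i])


ratios = [separation(i) for i in range(len(key_lengths))]
for n, r in zip(key_lengths, ratios):
    print(f"{n:5d}-bit key: separation {r:.2f}")
```

The ratio grows steadily with key length, which is the effect the following slides quantify as a probability.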
Longer Keys?
• For equal multiplicands, the average distance decreases as key length increases, with S.D. about 3/5ths of this.
• For distinct multiplicands, the average distance increases almost in line with key length, but S.D. is close to constant.
• Consequently, it becomes much easier to distinguish squares from multiplies, and to tell which multiplicand is used (i.e. what exponent digit occurs), as key length increases.
• Specifically, from tables we can calculate the probability of correct exponent digit determination:
  $p_{128} = 0.4836$, $p_{256} = 0.7114$, $p_{512} = 0.9932$, $p_{1024} = 0.9999...$
Result
• Two exponent digits are correctly determined for key length $2n$ with much higher probability than one digit for length $n$.
• Thus, increasing key length is definitely unwise if such implementation attacks are possible!
• The full power of the theory was not used: distances were between two traces, not between one trace and a provisional set which represents the same exponent digit. So better results hold in practice.
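The first bullet can be checked directly against the quoted probabilities. The sketch assumes the two digit determinations are independent, so that the probability of getting both right is the square of the single-digit probability. The 1024-bit value is given only as 0.9999..., so 0.9999 is used here as a lower bound.

```python
# Probabilities of correct exponent digit determination quoted in the talk.
# The 1024-bit figure is given only as 0.9999..., so 0.9999 is a lower bound.
p = {128: 0.4836, 256: 0.7114, 512: 0.9932, 1024: 0.9999}


def two_digits_beat_one(n):
    """True if two digits at length 2n are more likely correct than one at n."""
    return p[2 * n] ** 2 > p[n]


for n in (128, 256, 512):
    print(n, round(p[2 * n] ** 2, 4), ">", p[n], two_digits_beat_one(n))
```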
Final Conclusion
• Counter-intuitively, it appears that these attacks become easier when key length is increased.
• The timing attack may become more difficult initially, but is easier eventually. However, counter-measures are easy.
• With the DPA averaging above, it appears possible to use a single exponentiation to obtain the secret key D, especially if key length is increased.
• Then the counter-measure of blinding D as D + rφ(M) with random r is no defence.