Is there Safety in Numbers against Side Channel Leakage? Colin D. Walter UMIST, Manchester, UK www.co.umist.ac.uk
History
• NSA Tempest programme
• P. Kocher (Crypto 96): Timing attack on implementations of Diffie-Hellman, RSA, DSS, and other systems
• Dhem, …, Quisquater, et al. (CARDIS 1998): A practical implementation of the Timing Attack
• P. Kocher, J. Jaffe & B. Jun (Crypto 99): Introduction to Differential Power Analysis …
• Messerges, Dabbish & Sloan (CHES 99): Power Analysis Attacks of Modular Exponentiation in Smartcards
C.D. Walter, UMIST
Recent Attacks
• C. D. Walter & S. Thompson (CT-RSA 2001): Distinguishing Exponent Digits by Observing Modular Subtractions
  • a timing attack which averages over a number of exponentiations with the same exponent
• C. D. Walter (CHES 2001): Sliding Windows Succumbs to Big Mac Attack
  • a DPA attack which averages using the trace from a single exponentiation
Security Model
• Smartcard running RSA;
• Unknown modulus M, unknown exponent D;
• Known algorithms;
• Single hardware multiplier;
• Non-invasive, passive attack;
• Attacker unable to read or influence I/O;
• Attacker can observe timing variations in long integer multiplication;
• Attacker can measure multiplier power usage.
The Timing Attack on RSA
Context:
• Computing $A \times B \bmod M$
• Output from Montgomery modular multiplication: $S < 2M$
• Require output $S < M$ or $S < 2^n$
• So a conditional subtraction is performed in software
• This affects timing, and we assume it can be observed.
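A minimal Python sketch of a Montgomery product with the final conditional subtraction may make the timing-relevant branch concrete. It works on whole integers rather than digit by digit as a smartcard would, and the name `montgomery_multiply` and the returned flag are illustrative assumptions, not taken from the slides.

```python
def montgomery_multiply(a, b, m, n):
    """Return (a*b*2^-n mod m, subtracted) for odd m < 2^n and 0 <= a, b < m."""
    r = 1 << n
    m_neg_inv = (-pow(m, -1, r)) % r   # -m^{-1} mod r (needs Python 3.8+)
    t = a * b
    q = (t * m_neg_inv) % r            # chosen so that t + q*m is divisible by r
    s = (t + q * m) >> n               # exact division by r; 0 <= s < 2m
    subtracted = s >= m                # the data-dependent branch an attacker times
    if subtracted:
        s -= m
    return s, subtracted


if __name__ == "__main__":
    print(montgomery_multiply(200, 230, 239, 8))
```

The flag is what the timing attack is assumed to observe: one bit per modular multiplication.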
Partial Product S
• Last step of Montgomery modular multiplication: $S \leftarrow (S + aB + qM)/r$
  • $a$ = top digit of $A$, dependent on the size of $A$
  • $q$, $S$ effectively randomly distributed
• For random $A$ and fixed $B$, the average $S$ is a linear function of $B$, independent of $A$
• Larger $B$ means more frequent final subtractions
Distribution of S
• For a multiply, $S$ behaves like the random variable $2^{-n}\alpha\beta + \gamma$, where $\alpha$, $\beta$ have the distributions of $A$, $B$ and $\gamma$ is uniform.
• For a square, $S$ behaves like $2^{-n}\alpha^2 + \gamma$.
• Integrating over values of $\alpha$ and $\beta$, the probability of $S$ being greater than $2^n$ is: … for multiply, … for square
Squares vs Multiplies
… for multiply, … for square.
• So the probabilities of a conditional subtraction of $M$ are different.
• With sufficient observations we can distinguish squares from multiplies.
• (Care: non-uniform distribution on $[0..2^n]$.)
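The difference in subtraction frequency can be checked empirically. The sketch below is a simulation under assumptions of my own choosing (a 64-bit odd modulus just below $2^{64}$, uniformly random operands, whole-integer Montgomery reduction); the helper names are hypothetical.

```python
import random


def needs_subtraction(a, b, m, n, m_neg_inv):
    """True if the Montgomery product of a and b requires the final subtraction."""
    r_mask = (1 << n) - 1
    t = a * b
    q = (t * m_neg_inv) & r_mask
    return ((t + q * m) >> n) >= m


def subtraction_frequencies(m, n, trials=20000, seed=1):
    """Return (frequency for squares, frequency for multiplies)."""
    rng = random.Random(seed)
    m_neg_inv = (-pow(m, -1, 1 << n)) % (1 << n)
    squares = multiplies = 0
    for _ in range(trials):
        a = rng.randrange(m)
        b = rng.randrange(m)
        squares += needs_subtraction(a, a, m, n, m_neg_inv)
        multiplies += needs_subtraction(a, b, m, n, m_neg_inv)
    return squares / trials, multiplies / trials


if __name__ == "__main__":
    modulus = (1 << 64) - 59          # arbitrary odd modulus, assumed for the demo
    print(subtraction_frequencies(modulus, 64))
```

In this model squares trigger the subtraction noticeably more often than multiplies, which is the gap the attack exploits.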
The Attack
• Obtain frequencies for each operation by performing many exponentiations;
• Separate squares from multiplications;
• In square-and-multiply exponentiation, this yields the bits of the secret key D.
• Careless implementation of modular multiplication is dangerous.
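Once each operation is labelled as a square or a multiply, reading off the exponent is mechanical. The sketch below assumes left-to-right binary exponentiation and uses hypothetical helper names; it shows the mapping in both directions.

```python
def operations_for(d):
    """Operation sequence of left-to-right square-and-multiply for exponent d >= 1."""
    ops = []
    for bit in bin(d)[3:]:       # skip '0b' and the leading 1 bit
        ops.append("S")          # every remaining bit costs a square
        if bit == "1":
            ops.append("M")      # a 1 bit costs an extra multiply
    return "".join(ops)


def exponent_from(ops):
    """Rebuild the exponent from an observed sequence of 'S' and 'M'."""
    d = 1
    for op in ops:
        if op == "S":
            d <<= 1
        else:
            d |= 1
    return d


if __name__ == "__main__":
    trace = operations_for(0b101101)
    print(trace, bin(exponent_from(trace)))
```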
m-ary Exponentiation
• If square-and-multiply leaks, use m-ary exponentiation. Is it safer?
• Example: 4-ary exponentiation to compute $A^D \bmod M$
• Each multiply is by one of $A$, $A^2$ or $A^3$
• Can these be distinguished?
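For reference, a small sketch of left-to-right $2^k$-ary exponentiation that also records which precomputed power each multiply uses. The function name and the operation labels (`S`, `M1`, `M2`, `M3`) are assumptions for illustration; plain modular arithmetic stands in for Montgomery arithmetic.

```python
def mary_pow(a, d, mod, k=2):
    """Return (a**d % mod, ops) using 2**k-ary exponentiation (k=2 gives 4-ary)."""
    m = 1 << k
    table = [1 % mod]
    for _ in range(m - 1):
        table.append(table[-1] * a % mod)      # table[i] = a**i mod `mod`
    digits = []
    while d:
        digits.append(d & (m - 1))             # base-m digits, least significant first
        d >>= k
    digits.reverse()
    if not digits:
        return 1 % mod, []
    result, ops = table[digits[0]], []
    for digit in digits[1:]:
        for _ in range(k):
            result = result * result % mod
            ops.append("S")
        if digit:                              # zero digits need no multiply
            result = result * table[digit] % mod
            ops.append(f"M{digit}")
    return result, ops


if __name__ == "__main__":
    print(mary_pow(7, 0b110110, 1009))
```

An attacker who can tell `M1`, `M2` and `M3` apart, as well as squares from multiplies, recovers every base-4 digit of the exponent.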
Differentiating Multipliers
• Pre-computations of $A$, $A^2$ and $A^3$ provide observation subsets with completely different distributions, hence different frequencies.
• Form 8 subsets for which the conditional subtraction is / is not made for these.
• Use the vector of 8 frequencies to identify the multiplier and hence the exponent digit.
Sub in Initial Squaring (figure)
No Sub in Initial Squaring (figure)
Result
• In m-ary exponentiation we may be able to discover the bits of the secret key D.
• Careless implementation of modular multiplication is dangerous also for m-ary exponentiation.
• Counter-measures: avoid conditional subtractions, or replace D by D+rφ(M) for a fresh, random 32-bit r.
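The exponent-blinding counter-measure relies on $A^{D + r\varphi(M)} \equiv A^{D} \pmod M$. A toy check in Python, using the textbook-sized primes 61 and 53 (an assumption for the demo, far too small for real use) and a hypothetical helper name:

```python
import secrets


def blinded_exponent(d, phi, bits=32):
    """Return d + r*phi for a fresh random r of the given bit length."""
    r = secrets.randbits(bits)
    return d + r * phi


if __name__ == "__main__":
    p, q = 61, 53                      # toy primes, illustration only
    modulus, phi = p * q, (p - 1) * (q - 1)
    d = 2753                           # toy private exponent
    a = 65
    d_blind = blinded_exponent(d, phi)
    # Same result, but a different exponent (and operation sequence) on every call.
    print(pow(a, d, modulus), pow(a, d_blind, modulus))
```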
Longer Keys?
• Frequencies of multipliers and squares are unaffected by key length.
• Exponent digits are equally identifiable.
• If $p$ = probability of correctly assigning an exponent digit, and $t$ = number of exponent digits, then $p$ is independent of key length and $p^t$ = probability of correctly deducing the key D.
• $p^t$ decreases as $t$ grows. So a longer key is safer against this attack.
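A quick numerical illustration of the $p^t$ argument. The per-digit success probability 0.999 and the choice of 2 bits per digit (4-ary exponentiation) are hypothetical values picked only to show the trend.

```python
def key_success_probability(p, key_bits, bits_per_digit=2):
    """Probability of getting all t exponent digits right, assuming independence."""
    t = key_bits // bits_per_digit
    return p ** t


if __name__ == "__main__":
    for bits in (512, 1024, 2048):
        print(bits, key_success_probability(0.999, bits))
```

With $p$ fixed, each doubling of the key length squares the overall success probability, so it can only shrink.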
The DPA Attack on RSA
Summary: Differential Power Analysis (DPA) is used here to determine the secret key D from a single exponentiation.
Assumption: The implementation uses a single, small multiplier whose power consumption is data dependent and measurable.
Multipliers
• Switching a gate in the hardware requires more power than not doing so;
• On average, a multiply-accumulate operation $a \times b + c$ has data dependent contributions roughly linear in the Hamming weights of $a$, $b$ and $c$;
• Variation occurs because of the state left by the previous multiply-accumulate operation.
Combining Traces I
• The long integer product $A \times B$ in an exponentiation contains a large number of small digit multiply-accumulates: $a_i \times b_j + c_k$
• Identify the power subtraces of each $a_i \times b_j + c_k$ from the power trace of $A \times B$;
• Average the power traces for fixed $i$ as $j$ varies: this gives a trace $tr_i$ which depends on $a_i$ but only on the average of the digits of $B$.
Combining Traces (figure: the power subtraces for $a_0 b_0$, $a_0 b_1$, $a_0 b_2$ and $a_0 b_3$, overlaid one at a time)
Combining Traces
Average the traces: $a_0(b_0+b_1+b_2+b_3)/4$
Combining Traces
• $\bar{b}$ is effectively an average random digit;
• So the trace is characteristic of $a_0$ only, not $B$.
(figure: trace $tr_0$ for $a_0 \bar{b}$)
Combining Traces II
• The dependence of $tr_i$ on $B$ is minimal if $B$ has enough digits;
• Concatenate the average traces $tr_i$ for each $a_i$ to obtain a trace $tr_A$ which reflects properties of $A$ much more strongly than those of $B$;
• The smaller the multiplier or the larger the number of digits (or both), the more characteristic $tr_A$ will be.
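The averaging and concatenation steps can be sketched as a toy simulation. Everything specific here is an assumption: a 16-bit multiplier, a power model that is exactly the sum of the Hamming weights of $a$, $b$ and $c$, a random digit standing in for the accumulator input $c$, one scalar per multiply-accumulate instead of a sampled subtrace, and hypothetical function names.

```python
import math
import random

DIGIT_BITS = 16  # assumed multiplier width


def hamming_weight(x):
    return bin(x).count("1")


def averaged_trace(a_digits, b_digits, rng):
    """tr_A: for each digit a_i, the modelled power averaged over all digits b_j."""
    trace = []
    for a in a_digits:
        total = 0
        for b in b_digits:
            c = rng.getrandbits(DIGIT_BITS)   # stand-in for the accumulator input
            total += hamming_weight(a) + hamming_weight(b) + hamming_weight(c)
        trace.append(total / len(b_digits))
    return trace


def random_digits(count, rng):
    return [rng.getrandbits(DIGIT_BITS) for _ in range(count)]


def same_vs_different(seed, digits=64):
    """Distances (same A with different B, different A with same B)."""
    rng = random.Random(seed)
    a, a_other = random_digits(digits, rng), random_digits(digits, rng)
    b1, b2 = random_digits(digits, rng), random_digits(digits, rng)
    tr_a_b1 = averaged_trace(a, b1, rng)
    tr_a_b2 = averaged_trace(a, b2, rng)
    tr_other = averaged_trace(a_other, b1, rng)
    return math.dist(tr_a_b1, tr_a_b2), math.dist(tr_a_b1, tr_other)


if __name__ == "__main__":
    print(same_vs_different(seed=0))
```

In this model, changing $B$ moves the concatenated trace far less than changing $A$, which is the property the attack relies on.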
Combining Traces (figure: the averaged traces $tr_0$, $tr_1$, $tr_2$, $tr_3$ placed side by side, one at a time)
Combining Traces
• This is the analogue of the frequency vector.
• Question: Is the trace $tr_A$ sufficiently characteristic to determine repeated use of a multiplier $A$ in an exponentiation routine?
Distinguish Digits?
• Averaging over the digits of $B$ has reduced the noise level;
• In m-ary exponentiation we only need to distinguish:
  • squares from multiplies
  • the multipliers $A^{(1)}, A^{(2)}, A^{(3)}, \ldots, A^{(m-1)}$
• For small enough m and a large enough number of digits they can be distinguished in a simulation of clean data.
Distances between Traces
(figure: traces $tr_0$ and $tr_1$, power plotted against $i$ from 0 to $n$)
$d(0,1) = \left( \sum_{i=0}^{n} \bigl(tr_0(i) - tr_1(i)\bigr)^2 \right)^{1/2}$
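The distance is the ordinary Euclidean distance between two traces sampled at the same points. A direct Python transcription, with a hypothetical function name:

```python
import math


def trace_distance(tr0, tr1):
    """Euclidean distance between two equally long traces."""
    if len(tr0) != len(tr1):
        raise ValueError("traces must have the same number of samples")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(tr0, tr1)))


if __name__ == "__main__":
    print(trace_distance([0.0, 0.0], [3.0, 4.0]))
```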
Simulation
(figure: traces $tr_0$ and $tr_1$, gate switch count plotted against $i$ from 0 to $n$)
$d(0,1) = \left( \sum_{i=0}^{n} \bigl(tr_0(i) - tr_1(i)\bigr)^2 \right)^{1/2}$
Simulation Results
16-bit multiplier, 4-ary exponentiation, 512-bit modulus.
d(i,j) = distance between traces for the ith and jth multiplications of the exponentiation.
Av d for same multipliers        2428 gates
SD for same multipliers          1183
Av d for different multipliers  23475 gates
SD for different multipliers      481
Simulation Results
• Equal exponent digits can be identified: their traces are close;
• Unequal exponent digit traces are not close;
• Squares can be distinguished from multiplications: their traces are not close to any other traces;
• There are very few errors for typical cases.
Exponent Digit Values
• As in the timing case, the pre-computations $A^{(i+1)} \leftarrow A \times A^{(i)} \bmod M$ provide traces for known multipliers. So:
• We can determine which multiplicative operations are squares;
• We can determine the exponent digit for each multiplication;
• We can determine the secret exponent D.
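One plausible way to turn this into a decision rule is nearest-reference matching: compare each observed trace with the traces from the pre-computations and pick the closest. The sketch below is an assumption about how that could be coded, not the author's procedure; the function name, the optional threshold for declaring a square, and the toy traces are all hypothetical.

```python
import math


def identify_multiplier(trace, reference_traces, square_threshold=None):
    """Return the label of the closest reference trace, or 'square' if none is close."""
    label = min(reference_traces,
                key=lambda name: math.dist(trace, reference_traces[name]))
    if square_threshold is not None:
        if math.dist(trace, reference_traces[label]) > square_threshold:
            return "square"   # not close to any known multiplier
    return label


if __name__ == "__main__":
    references = {1: [0, 0, 0], 2: [10, 0, 0], 3: [0, 10, 0]}   # toy traces
    print(identify_multiplier([9, 1, 0], references))
    print(identify_multiplier([50, 50, 50], references, square_threshold=5))
```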
Longer Keys?
• Attack time is polynomial in key length $t$;
• A longer key means better averages in the traces and longer concatenated traces, so a higher probability $p_t$ of correct digits.
• No greater safety against this attack from longer keys if $p_t^{\,t}$ goes up with $t$.
Longer Keys – Simulation
Example: 8-ary exponentiation, 32-bit multiplier.
Double the key length: is $p_{2t}^{2} > p_t$?
Key length t     256    384    512    768   1024
Av to nearest   1529   2366   3750   4501   6246
SD to nearest    885   1403   2386   2535   3612
Av to others    5890  11753  17896  32594  53070
SD to others    1108   2412   2279   4646   4581
Longer Keys?
• Av distance between equal multipliers is linear in key length;
• Av SD between equal multipliers is linear in key length;
• Av distance between different multipliers is not linear in key length: it goes up by a factor of 3 when key length doubles;
• Av SD between different multipliers is linear in key length.
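The growth claims can be checked against the simulation table for the key lengths that appear together with their double. The numbers below are copied from the "Av to nearest" and "Av to others" rows; the helper name is hypothetical.

```python
KEY_LENGTHS = [256, 384, 512, 768, 1024]
AV_NEAREST = dict(zip(KEY_LENGTHS, [1529, 2366, 3750, 4501, 6246]))
AV_OTHERS = dict(zip(KEY_LENGTHS, [5890, 11753, 17896, 32594, 53070]))


def doubling_ratios(series):
    """Ratio value(2t) / value(t) for every key length t whose double is tabulated."""
    return {t: series[2 * t] / series[t] for t in series if 2 * t in series}


if __name__ == "__main__":
    print("equal multipliers:    ", doubling_ratios(AV_NEAREST))
    print("different multipliers:", doubling_ratios(AV_OTHERS))
```

The ratios for different multipliers cluster near 3, while those for equal multipliers stay around 2, so the gap between the two groups widens as the key grows.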
Longer Keys?
• So, to be closer to a wrong digit, traces have to be more than:
  • 2.2 SDs above average for 256-bit keys
  • 3.0 SDs above average for 512-bit keys
  • 5.7 SDs above average for 1024-bit keys
• Assuming an approximately normal distribution, the probabilities $p_t$ are then, respectively: 0.9861, 0.99865, 0.9999999943
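These figures can be approximately reproduced from the simulation table. The sketch assumes that the separation is measured as (Av to others minus Av to nearest) divided by the sum of the two SDs, which matches the quoted values to one decimal place but is a guess at how they were obtained; $p_t$ is then the one-sided normal probability.

```python
import math

# (Av to nearest, SD to nearest, Av to others, SD to others) from the table
ROWS = {
    256: (1529, 885, 5890, 1108),
    512: (3750, 2386, 17896, 2279),
    1024: (6246, 3612, 53070, 4581),
}


def separation_in_sds(av_near, sd_near, av_other, sd_other):
    """Gap between the two averages in units of the combined SD (assumed measure)."""
    return (av_other - av_near) / (sd_near + sd_other)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for bits, row in ROWS.items():
        z = round(separation_in_sds(*row), 1)
        print(bits, z, normal_cdf(z))
    # Doubling the key length: are two digits more reliable than one was?
    print(normal_cdf(3.0) ** 2 > normal_cdf(2.2))
```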
Longer Keys? – No Way!
• So, for the simulation, we can deduce two digits more accurately than one when the key length is doubled.
• So the secret key is easier to deduce when its length is increased.
• The implementation becomes more insecure as key length increases.
Warning
• With the DPA averaging above, it may be possible to use a single exponentiation to obtain the secret key, especially if the key length is increased;
• Using D+rφ(M) with random r may be no defence.
Final Conclusion
• Re-think the power of side-channel attacks on the implementation:
• they may become easier when the key length is increased.