
Recovering Secret Keys from Weak Side Channel Traces of Differing Lengths

This presentation describes a method for recovering secret keys from weak side channel traces of differing lengths through recoding techniques. It covers the background, a typical randomised exponentiation algorithm, a metric for measuring the fitness of recoding guesses, the decision tree for choosing key bits, results, and conclusions.



  1. Recovering Secret Keys from Weak Side Channel Traces of Differing Lengths – Colin D. Walter, Comodo CA, Bradford, UK, Colin.Walter@comodo.com

  2. Outline
• Background & History
• A typical Randomised Exponentiation Algorithm
• A metric to measure the fitness of recoding guesses
• The decision tree for choosing bits
• Results
• Conclusion

  3. Background
Several standard software counter-measures to side channel leakage from exponentiation:
• Blind the exponent by adding a random multiple of the group order.
• Pick an algorithm where the pattern of operations is independent of the secret exponent, e.g.
  – Square-and-always-multiply
  – Montgomery Powering Ladder
• Pick an algorithm where the pattern is randomised:
  – Liardet-Smart
  – Ha-Moon
  – Oswald-Aigner
  – Mist

  4. Disadvantages
None of these is a panacea:
• The fixed-pattern algorithms tend to be less efficient, and averaging may still reveal the exponent.
• Processing the ~512 known bits of a 1024-bit RSA key may leak enough to reveal 32-bit blinding (Fouque et al., CHES 2006).
• Perfect knowledge of the pattern of squares and multiplications in a randomised exponentiation algorithm usually reveals the key if it is re-used.

  5. Problems for an Attacker
• There is always a lot of noise in measurements.
• Averaging to determine correct bits is essential.
• For randomised exponentiation algorithms, square & multiply operations cannot be aligned directly with key bits.
• Incorrect bit deductions will always occur.
• The locations of likely errors must be identified for a computationally feasible algorithm.

  6. Example: Liardet-Smart
Recode the binary representation of the key D from right to left:
• Add in the borrow of 0 or +1 to give a new D.
• Choose base 2 & digit 0 if D is even.
• Randomly choose base m = 2^i (i ≤ R) if D is odd.
• The digit d is the residue of D mod m with least absolute value.
• The borrow is 1 for negative digits, otherwise 0.
Exponentiation M^D in ECC:
• Pre-compute a table of the required odd powers M^d.
• Read the recoded digits left to right.
• Perform i squares and a multiplication for m = 2^i and d ≠ 0.
Traces may have different lengths: the ith operation is associated with different bits in different traces.
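The right-to-left recoding above can be sketched in Python. This is a minimal illustration, not the presentation's own code: the function names are mine, and instead of picking one random recoding it recursively enumerates every recoding an attacker must consider.

```python
from typing import Iterator, List, Tuple

def ls_recodings(D: int, R: int = 3) -> Iterator[List[Tuple[int, int]]]:
    """Enumerate all Liardet-Smart recodings of D (right to left).

    Yields lists of (digit, base) pairs, least significant digit first.
    R bounds the random base choice m = 2^i with i <= R.
    """
    if D == 0:
        yield []
        return
    if D % 2 == 0:
        # D even: base 2, digit 0.
        for rest in ls_recodings(D // 2, R):
            yield [(0, 2)] + rest
    else:
        # D odd: any base m = 2^i; digit = least-absolute-value residue mod m.
        for i in range(1, R + 1):
            m = 1 << i
            d = D % m
            if d > m // 2:
                d -= m  # negative digit; the quotient below absorbs the borrow
            for rest in ls_recodings((D - d) // m, R):
                yield [(d, m)] + rest

def op_string(recoding: List[Tuple[int, int]]) -> str:
    """Left-to-right operation pattern: i doubles 'D' per base 2^i, an add 'A' if digit != 0."""
    ops = ""
    for d, m in reversed(recoding):  # most significant digit first
        ops += "D" * (m.bit_length() - 1)
        if d != 0:
            ops += "A"
    return ops
```

For example, ls_recodings(13) includes the recoding 1₂ 1₂ 0₂ 1₂ with operation string "DADADDA"; note that for a full exponent (rather than the low bits of a longer key) the terminal digit also varies, so the enumeration yields more recodings than the handful shown on slide 7.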

  7. Liardet-Smart 2
Here are some recodings of …13₁₀ (binary …1101) with the operators aligned (D = double, A = add):
  1₂  1₂  0₂  1₂  →  D A D A D D A
  1₂ –1₄  0₂  1₂  →  D A D D A D D A
  3₈  0₂  1₂      →  D D D A D D A
  1₂  1₂  1₄      →  D A D A D D A
  1₂ –1₄  1₄      →  D A D D A D D A
  3₈  1₄          →  D D D A D D A
  1₂  0₂ –3₈      →  D A D D D D A
The average operator yields almost no information: data from the top bit is spread over three or four columns.

  8. Liardet-Smart 3
Here are the same recodings with the doubles aligned:
  1₂  1₂  0₂  1₂  →  DA DA D DA
  1₂ –1₄  0₂  1₂  →  DA DDA D DA
  3₈  0₂  1₂      →  DDDA D DA
  1₂  1₂  1₄      →  DA DA DDA
  1₂ –1₄  1₄      →  DA DDA DDA
  3₈  1₄          →  DDDA DDA
  1₂  0₂ –3₈      →  DA D DDDA
• There is a lot of information-theoretic content: the column contents reveal the key bits.
• This reveals the key if the leakage is strong.

  9. History
Karlof & Wagner (CHES 2003):
• Uses a Hidden Markov Model.
• Applies Viterbi's algorithm to find the best-fit key.
• Assumes traces can be parsed precisely into symbols D and DA.
• So it cannot deal with weak leakage.
Green, Noad & Smart (CHES 2005):
• Repairs the parsing assumption for each trace, selecting the most likely string over symbols D and A.
• Treats traces serially, one by one.
• Convergence is unlikely with weak leakage – it can't get started.

  10. A Way Forward
Aim:
• Treat all information about a given key bit at the same time in order to average out the noise.
Problem:
• Viterbi's algorithm becomes computationally infeasible if the Markov model treats the traces in parallel.
So:
• Re-structure / simplify the algorithm.
• Don't process the whole trace to get the best fit; only process the next few bits and choose the best fit.
• Don't evaluate all key prefix choices; only select the best few to consider.

  11. The Metric
[Figure: a trace's probabilities of matching "A" or "D" compared against the double/add pattern of a recoding, with per-operation distances d₁, d₂, d₃, …, and μ = d₁ + d₂ + d₃ + d₄ + …]
• Define the distance between a trace t and a recoding r by
  μ(t,r) = Σᵢ (1 – pr(tᵢ = rᵢ))
  where pr(tᵢ = rᵢ) is the probability that the ith operation in t is the same as the ith operation for r (cf. Hamming weight).

  12. The Metric
[Figure: a tree of 0/1 key-bit choices, with μ minimised over the recodings of each choice.]
• Define the distance between a trace t and a key choice D by
  μ(t,D) = min { μ(t,r) | r is a recoding of D }
  This selects the best-match recoding of D.

  13. The Metric
• Define the distance between a trace t and a recoding r by
  μ(t,r) = Σᵢ (1 – pr(tᵢ = rᵢ))
  where pr(tᵢ = rᵢ) is the probability that the ith operation in t is the same as the ith operation for r. This should be small for a correct interpretation of the trace.
• Define the distance between a trace t and a key choice D by
  μ(t,D) = min { μ(t,r) | r is a recoding of D }
  This selects the best-match recoding of D.
• Define the distance between a trace set T and a key D by
  μ(T,D) = Σ_{t∈T} μ(t,D)
  The best-fit key minimises this distance.
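The three definitions translate almost directly into code. A minimal Python sketch, assuming the leakage has already been converted into a probability distribution over {'D', 'A'} for each trace position; the function names and data layout are illustrative, not from the presentation.

```python
def mu_trace_recoding(probs, ops):
    """mu(t, r) = sum_i (1 - pr(t_i = r_i)): distance between a trace t,
    given as one {'D': p, 'A': 1-p} dict per operation, and a recoding's
    operation string such as "DADADDA"."""
    assert len(probs) == len(ops)
    return sum(1.0 - p[op] for p, op in zip(probs, ops))

def mu_trace_key(probs, key_recoding_ops):
    """mu(t, D): minimum of mu(t, r) over the recodings r of key D,
    considering only recodings whose length matches the trace."""
    return min(mu_trace_recoding(probs, ops)
               for ops in key_recoding_ops if len(ops) == len(probs))

def mu_traceset_key(trace_probs, key_recoding_ops):
    """mu(T, D): sum of mu(t, D) over all traces t in T.
    The best-fit key is the one minimising this total."""
    return sum(mu_trace_key(p, key_recoding_ops) for p in trace_probs)
```

For instance, with a 70% chance of identifying each operation correctly, a length-7 trace that exactly matches "DADADDA" has μ(t,r) = 7 × 0.3 = 2.1 against that recoding, and larger distances against the other recodings of the same key.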

  14. The Markov Model
For one trace and weak leakage, the standard Markov model and Viterbi algorithm yield an almost useless guess at the key. But the Markov model for T traces and a recoding automaton with state set S is too big:
• There are at least |S|^T states for each processed key bit/digit.
• So at least n·|S|^T states for a key of length n.
• This is computationally infeasible even for |S| = 2 and T > 100.
We need to be able to choose large T when the leakage is weak.

  15. Re-organisation
Instead of the Markov model structure:
• Build a binary tree representing all possible choices for the n key bits.
• Descend the tree and, for each node, determine the best recoding for each trace, independently of length.
• This defines a value of the metric μ for each node of the tree.
• Eventually... pick the best leaf, selecting recodings whose length equals the trace length.

  16. Simplifications
To keep the complexity down:
• Prune the nodes at a given depth in the binary tree: keep only the B nodes with the best values of the metric.
• Prune the recoding choices within each node: keep only the R best values for each trace.
• Instead of looking at the leaves to decide the best branch, look at only λ more nodes beyond the last decided bit.
• NB: The cost of a recoding decision can spread over several more input symbols, so λ should not be too small.
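The B-node pruning amounts to a beam search over key prefixes. A sketch under stated assumptions: the per-node recoding pruning (R) and the λ-lookahead are omitted for brevity, and mu_of_prefix is an assumed black box standing in for the trace-matching metric of the previous slides.

```python
import heapq

def beam_search_key(n_bits, mu_of_prefix, B=8):
    """Pruned descent of the binary key-bit tree.

    mu_of_prefix(prefix) -> metric value for a tuple of key bits
    (lower = better fit to the trace set). At each depth, every
    surviving prefix is extended by one bit and only the B prefixes
    with the best metric values are kept.
    """
    beam = [()]  # start from the empty key prefix
    for _ in range(n_bits):
        candidates = [p + (b,) for p in beam for b in (0, 1)]
        beam = heapq.nsmallest(B, candidates, key=mu_of_prefix)
    return min(beam, key=mu_of_prefix)
```

With the beam width B this keeps the work per key bit constant (2B metric evaluations) instead of doubling at every level as the full tree would.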

  17. Benefits
• Traces become aligned correctly (or almost correctly) with key bits/digits by selecting the best-fit recoding.
• Summing the metric values for the best recodings of each trace provides the averaging that reduces noise and enables the best key bit to be selected.
• Locations of incorrect bits can be determined by looking at the difference in the metric between the 0- and 1-branches of a node in the tree. A small difference means lack of certainty about the decision.
• Key bit positions can be ordered according to this probability of correctness.
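The last two points above can be sketched in a few lines. This is an illustrative helper of my own naming, assuming the 0- and 1-branch metric values have been recorded for each bit position during the tree descent:

```python
def order_bits_by_uncertainty(branch_metrics):
    """branch_metrics[i] = (mu_0, mu_1): metric values of the 0- and
    1-branches at key-bit position i. A small |mu_0 - mu_1| means both
    choices fit the traces almost equally well, so that decision is the
    least certain. Returns positions ordered least to most certain."""
    return sorted(range(len(branch_metrics)),
                  key=lambda i: abs(branch_metrics[i][0] - branch_metrics[i][1]))
```

An attacker would then concentrate error correction on the positions at the front of this ordering.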

  18. Results
Does it all work?
• For square-and-multiply it reduces to the optimal averaging which Kocher used – this works!
• For Oswald-Aigner & Ha-Moon it works with much weaker leakage than prior algorithms.
• For Liardet-Smart, the random variation in base has always made this recoding more difficult to handle – here it just requires a few more traces or stronger leakage.

  19. Some Figures
• Take the Ha-Moon recoding, with digit set {–1, 0, +1}.
• Assume a 70% chance of deciding correctly between a square and a multiplication from the side channel trace, but no ability to distinguish the multiplications for –1 and +1.
• Take a typical 192-bit ECC key and only 10 traces.
• In 88% of cases there are no errors in the 168 bits we are most certain of.
• There are fewer than 9 errors (on average) in the remaining 24 bits.
• It is computationally feasible to correct these errors (under 10^6 cases to check).

  20. Conclusion
• Traces from randomised exponentiation algorithms can be aligned effectively to pool the leakage associated with individual key bits.
• This is possible even with very weak leakage.
• Locations of possible bit errors are identified with ease.
• It is computationally feasible to correct all the errors in a high proportion of cases.
• Designers of cryptographic chips should assume that the information content of side channel traces can be combined in a computationally effective way to reveal the secret key if it is re-used sufficiently often.
