Differential Power Analysis and Other Side Channel Attacks on Embedded Systems Steven Butt, Jason Gordon, David Omoto, Ryan Speelman EE 202A Fall 2003
Overview • Introduction • Secret key encryption • Smart cards • Side Channel Attacks • Business Implications • Side Channel Attacks • Power • Timing • EM • Fault • Countermeasures • General • Specific Examples • Conclusion
Secret Key Encryption • Traditional key-based cryptography is based around the notion of: • public knowledge - output of an encryption system • private knowledge - secret key used during encryption • Output is a function of the input and the secret key • Ex.) Y = X mod K Where: Output is Y, Input is X, Key is K • Y is said to be encrypted because it does not resemble the input data X and cannot be immediately interpreted without decryption • Only trusted users who know the secret key will be able to run this algorithm in reverse to obtain the input message from the output message • If the secret key is compromised then anyone can read the input data Pictures taken from paper “Introduction to Differential Power Analysis and Related Attacks” by Cryptography Research.
Smart Cards • Microcontroller and circuitry built into a plastic card • Can hold information about the user such as bank / credit card records, medical history, etc.
Side Channel Attacks • Cryptographic algorithms are designed to be mathematically strong. • Most would take many years of trial-and-error analysis to break • What if we could somehow gain more information about the system? • Side-channel attacks: The trick is to not attack the algorithm, but the implementation. • Various bits of information can leak out of an electronic system implementation • Electromagnetic radiation • Chip power profile • Timing of instructions and cache hit rates • If a malicious user were to gain physical access to the device then they could obtain a great deal of information about the inner workings of the algorithm. • For example, the power used in a multiplication operation will differ greatly based upon the operands • This added information is enough to break the system in a short amount of time given the proper methodology. • The attacker must know the algorithm used beforehand in order to apply statistical methods to determine secret keys. Pictures taken from paper “Introduction to Differential Power Analysis and Related Attacks” by Cryptography Research.
Side Channel Attacks Continued… • Analogous to a doctor giving you an exam • The doctor cannot jump inside your body to see what is wrong with you • Instead: a doctor takes vital sign measurements and uses educated guesses to diagnose your problem and assign proper treatment • A side channel attack uses the same “exterior examination” methodology to break encryption
Implications of Side Channel Attacks - This is not a drill! • Side-channel attacks have been proven to work on a variety of cryptographic systems at very low cost. • Cryptography Research has used power analysis techniques to break encryption on all types of smart cards using DES. • Example: Power analysis • A power analyzer can be as simple as a resistor in series with the processor's power or ground line • Two different types of power analysis attack (more on this later) • SPA: simple attack, a few hundred dollars to set up, a few seconds to break, relatively easy to defend against • DPA: more complex attack, a few thousand dollars to set up, several hours to break, very difficult to defend against • The economic impact of broken smart card encryption is enormous • Smart cards are used in credit card readers, pay-TV access controllers, public transport ticket dispensers, mobile phones, etc. • A malicious user could duplicate one of your personal smart cards and commit identity theft
Required Equipment for SPA/DPA from “How do Side Channel Attacks Affect the Software Development Process” by Giesecke & Devrient
Types of Side Channel Attacks • Power Attacks • Timing Attacks • Electromagnetic Attacks • Fault Attacks
Attack Characteristics • Invasive vs. Non-Invasive • Invasive: Depackaging the device to gain direct access to the internal components • Non-Invasive: Exploit externally available information • Semi-Invasive • Most dangerous type of attack because they “can be carried out using very cheap and simple equipment” • Active vs. Passive • Active: Tampering with the proper functionality of the device • Passive: Observing the device's behavior without disturbing its operation from “Side Channel Attacks” by Jean-Jacques Quisquater
Simple Power Analysis (SPA) • An attacker directly observes a system’s power consumption • Large features such as DES rounds and RSA operations may be identified, since the operations performed by the microprocessor vary significantly during different parts of these operations • Can be used to break RSA implementations by revealing differences between multiplication and squaring operations • Can be used to break DES implementations because of the visible differences between permutations and shifts.
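The RSA point above can be illustrated with a small Python sketch. It assumes an idealized power trace in which every squaring and every multiplication is cleanly distinguishable, so the trace is reduced to a string of 'S' and 'M' symbols; the function names and the toy modulus 3233 are illustrative choices, not taken from the slides.

```python
def modexp_with_trace(y, x, n):
    """Left-to-right square-and-multiply that logs each operation.

    The log stands in for what an attacker would read off a power trace,
    assuming squarings ('S') and multiplications ('M') look different.
    """
    result = 1
    ops = []
    for bit in bin(x)[2:]:              # most significant bit first
        result = (result * result) % n
        ops.append("S")
        if bit == "1":
            result = (result * y) % n
            ops.append("M")
    return result, "".join(ops)


def recover_exponent(ops):
    """Rebuild the exponent: every 'S' opens a bit, a following 'M' makes it 1."""
    bits = []
    for op in ops:
        if op == "S":
            bits.append("0")
        else:                           # 'M' upgrades the bit the last 'S' opened
            bits[-1] = "1"
    return int("".join(bits), 2)


secret = 0b101101
value, ops = modexp_with_trace(7, secret, 3233)
```

Here `recover_exponent(ops)` returns the secret exponent from the operation sequence alone, which is why a single clean trace can be enough for SPA against an unprotected implementation.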
SPA continued • Figure above shows SPA monitoring from a single DES operation performed by a smart card • The upper trace shows the entire DES encryption operation • The lower trace shows a detailed view of the 2nd and 3rd round of the operation Figure taken from “Introduction to Differential Power Analysis and Related Attacks” by Kocher, Jaffe, and Jun
Differential Power Analysis (DPA) • Much more powerful than SPA • Makes use of statistical analysis and error correction techniques to obtain information about the secret keys • Two phases to a DPA attack: • Data collection • Data analysis • The basic idea is to record the power consumption of the last few rounds of 1000 DES operations, with each recorded trace consisting of 100000 data points.
DPA continued • The data can be represented as a 2D array S[0…999][0…99999], where the first index is the operation number and the second index is the sample point within that trace • The attacker is also assumed to have the encrypted ciphertexts C[0…999] • The attacker next chooses a key-dependent selection function D of the form D(Ki, C), where Ki is some key information • The goal here is to recover the 6 key bits that feed one DES S-box, so Ki is a 6-bit guess with 64 possible values • D(Ki, C) is evaluated by injecting Ki into the DES algorithm at an intermediate step; it equals 0 or 1 depending on whether a selected output bit of that step matches the corresponding bit of a parameter (L) computed directly from C • Finally, a differential average trace T[0…63][0…99999] is calculated (the formula was an image on the original slide): for each guess Ki and sample point j, T[Ki][j] is the average of S[k][j] over the operations k with D(Ki, C[k]) = 1 minus the average over the operations with D(Ki, C[k]) = 0
DPA continued … • The trace computes what the power profile should be given the guessed Ki and then compares it to the actual measured power profile. • If the guessed Ki is incorrect, there will be no correlation between computed power profile and measured power profile. • The trace is able to accomplish this because the correct value of Ki was actually stored in registers, manipulated in logic units, etc – yielding detectable power consumption differences.
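The difference-of-means idea can be tried out on simulated data. The sketch below is a toy under loud assumptions: a 4-bit S-box (the one from the PRESENT cipher) stands in for a DES S-box, the key guess is 4 bits instead of 6, and the "power trace" is Gaussian noise with one intermediate bit added at a single sample point. The names S, C, D (here `selection`) and T follow the slides; everything else is hypothetical.

```python
import random

# 4-bit S-box of the PRESENT cipher, used only as a small stand-in for a DES S-box.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def selection(key_guess, c):
    """D(Ki, C): predicted top bit of the S-box output under this key guess."""
    return (SBOX[c ^ key_guess] >> 3) & 1


def simulate_traces(secret_key, n_traces, n_samples, leak_index, rng):
    """Fake measurements: Gaussian noise everywhere, plus the real
    intermediate bit added at one sample point (the assumed leak)."""
    C, S = [], []
    for _ in range(n_traces):
        c = rng.randrange(16)
        trace = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        trace[leak_index] += selection(secret_key, c)
        C.append(c)
        S.append(trace)
    return C, S


def differential_trace(key_guess, C, S):
    """T[Ki][j]: mean of traces with D = 1 minus mean of traces with D = 0."""
    n_samples = len(S[0])
    sums = [[0.0] * n_samples, [0.0] * n_samples]
    counts = [0, 0]
    for c, trace in zip(C, S):
        d = selection(key_guess, c)
        counts[d] += 1
        for j, sample in enumerate(trace):
            sums[d][j] += sample
    return [sums[1][j] / max(counts[1], 1) - sums[0][j] / max(counts[0], 1)
            for j in range(n_samples)]


def recover_key(C, S):
    """Pick the guess whose differential trace has the tallest spike."""
    peaks = {k: max(abs(v) for v in differential_trace(k, C, S))
             for k in range(16)}
    return max(peaks, key=peaks.get)


rng = random.Random(2003)
SECRET_KEY = 0xB
C, S = simulate_traces(SECRET_KEY, n_traces=2000, n_samples=20,
                       leak_index=7, rng=rng)
recovered = recover_key(C, S)
```

For the correct guess the differential trace shows a spike at the leaking sample point, while wrong guesses produce smaller or no spikes. Real measurements are far noisier and the leak is spread over many sample points, which is why the slides speak of 1000 operations and 100000 points per trace.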
Timing Attacks • Timing attacks are based on measuring the time a device takes to perform an operation • Ex.) RSA computes R = (y^x) mod n, where n is known, y can be found by eavesdropping, and x is the secret key the attacker is looking for. • An algorithm to compute this is shown below (bit 0 is the most significant of the w bits of x):
Let s0 = 1.
For k = 0 up to w-1:
    If (bit k of x) is 1 then
        Let Rk = (sk * y) mod n.    (slow)
    Else
        Let Rk = sk.    (fast)
    Let sk+1 = (Rk)^2 mod n.
EndFor.
Return (Rw-1).
From: http://www.cryptography.com/resources/whitepapers/TimingAttacks.pdf
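A direct Python transcription of the loop above, as a minimal sketch. It assumes bit 0 is the most significant of the w key bits, which is the ordering that makes the loop produce y^x mod n; the function name `modexp` is an illustrative choice.

```python
def modexp(y, x, n, w):
    """Compute (y ** x) mod n following the slide's loop.

    Bit 0 is taken to be the most significant of the w key bits.
    """
    s = 1                                   # s0 = 1
    r = 1 % n                               # value returned if w == 0
    for k in range(w):
        bit = (x >> (w - 1 - k)) & 1
        if bit:
            r = (s * y) % n                 # slow path: modular multiplication
        else:
            r = s                           # fast path: plain assignment
        s = (r * r) % n                     # s(k+1) = Rk^2 mod n
    return r
```

For example, `modexp(4, 13, 497, w=4)` gives the same value as Python's built-in `pow(4, 13, 497)`.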
Timing Attacks • The real issue is the if/else statement. There are two possible paths that the algorithm can take. One path performs a multiplication and a modular reduction; the other is a plain assignment. • Therefore some information about which path was taken can be guessed by monitoring the time taken. • Thus if an attacker knows the first (b-1) bits of x, they can accurately guess bit b. This can be done for all bits 0 through w-1 without very sophisticated equipment. • Error correction algorithms can be used as well to reduce the effect of guessing incorrectly. • Other cryptosystems (such as DSS and RSA with CRT) can have their keys broken by a similar method of timing analysis. • An assumption is made that the attacker knows the method of encryption but not the key. In practice this is very likely to be the case.
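To make the timing dependence concrete, here is a toy cost model in Python. The three cost constants are invented for illustration, and the model only shows that total running time reveals how many key bits took the slow path; recovering the bits one by one, as described above, additionally relies on the multiplication time varying with the operands and on statistics over many values of y.

```python
MULT_COST = 5     # assumed time units for a modular multiplication (slow path)
ASSIGN_COST = 1   # assumed time units for a plain assignment (fast path)
SQUARE_COST = 5   # assumed time units for the squaring done every iteration


def timed_modexp(y, x, n, w):
    """Return ((y ** x) mod n, simulated running time) for a w-bit key x."""
    s, r, elapsed = 1, 1, 0
    for k in range(w):
        if (x >> (w - 1 - k)) & 1:
            r = (s * y) % n
            elapsed += MULT_COST
        else:
            r = s
            elapsed += ASSIGN_COST
        s = (r * r) % n
        elapsed += SQUARE_COST
    return r, elapsed


def ones_in_key(elapsed, w):
    """Invert the cost model: how many key bits took the slow path?"""
    baseline = w * (ASSIGN_COST + SQUARE_COST)
    return (elapsed - baseline) // (MULT_COST - ASSIGN_COST)


_, t = timed_modexp(7, 0b10110010, 3233, w=8)
```

Under this model `ones_in_key(t, 8)` recovers the number of 1 bits in the key from the elapsed time alone, without ever looking at the key.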
Other attacks • HO-DPA • Instead of analyzing information across a single event between samples, high-order DPA can be used to correlate information between multiple cryptographic suboperations • EM • Uses antennas and probes to extract electromagnetic leakage information. • FAULT • Extract keys and investigate ciphers by observing naturally-occurring faults or by purposely generating faults in a system
Preventing Side Channel Attacks • Difficult because many countermeasures are side-channel specific (power, timing, etc.) as well as attack specific (SPA, DPA, etc.) • Implementation of one countermeasure may introduce possibilities of other side channel attacks • Combinations of countermeasures are often used • Difficult to foresee future side channel attacks • Performance and Area Penalties
General Countermeasures • Data-Independent Calculations • Time of all operations is independent of input data or key data • Prevents all timing attacks since there is no variation in computation time; however, other attacks may be present • Blinding/Masking • Hide system details to prevent attackers from knowing the input or internal state • Avoid Conditional Branching and Secret Intermediates • All lines of code execute regardless of input and key bits • Minimizes the extent to which time and power properties are revealed • License Modified Algorithms • Design and implement cryptosystems with the assumption that information will leak • RSA, DES, DSA, Diffie-Hellman, El Gamal, Elliptic Curve systems, etc. • Make devices tamper-resistant • Shielding • Detect supply voltage and clock speeds • Very costly from “Introduction to Side Channel Attacks” by Discretix Technologies Ltd.
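One well-known instance of blinding is RSA blinding, sketched below in Python. This is an illustration chosen here, not necessarily the scheme the cited source has in mind. The key (N = 3233, E = 17, D = 2753) is a textbook-sized example that is far too small for real use, and `pow(r, -1, N)` needs Python 3.8 or later.

```python
import math
import random

# Textbook-sized RSA key, far too small for real use.
N, E, D = 3233, 17, 2753      # N = 61 * 53


def blinded_private_op(c, rng):
    """Compute c^D mod N without ever exponentiating c itself."""
    while True:
        r = rng.randrange(2, N)
        if math.gcd(r, N) == 1:               # r must be invertible mod N
            break
    blinded = (c * pow(r, E, N)) % N          # input the attacker cannot predict
    blinded_result = pow(blinded, D, N)       # equals (c^D * r) mod N
    return (blinded_result * pow(r, -1, N)) % N   # strip the blinding factor


rng = random.Random(1)
message = 65
ciphertext = pow(message, E, N)
recovered = blinded_private_op(ciphertext, rng)
```

Because a fresh random r is drawn for every call, the values flowing through the private-key exponentiation no longer depend on the attacker-visible input alone, which is what prevents the attacker from simulating the internal computations.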
Countermeasures – Power Analysis • Power Consumption Balancing • Adding dummy registers and gates so that power consumption for every operation is equal • More on this later! • Reduction of Signal Size (decrease Signal-to-Noise Ratio) • Smaller signal size means less information leakage • Ex. Constant execution path code, choosing operations that leak less information in their power consumption, balancing state transitions, physically shielding the device • Introduce Noise • Increases the number of samples required for an attack • Randomize execution timing and order • Ex.) varying the internal clock • Modification of Algorithm Design • Ex. Nonlinear key update procedures to ensure power traces cannot be correlated to certain transactions from “Introduction to Side Channel Attacks” by Discretix Technologies Ltd.
Countermeasures – Timing Attacks • Adding Delays • Make all operations take the same amount of time • Must take into account other possible side channel attacks • Ex. Using a timer to delay results can still result in measurable differences in system responsiveness and/or power consumption • Perform operations whose outputs are discarded to mask these side channel attacks • Performance Penalty: system clock set by longest operation • Add random delays to hide time variation • Limited countermeasure: compensated by collecting more samples • Blinding - Hide Internal State • Prevents attacker from simulating internal computations • Time Equalization of Operations • Ex. Perform both multiplication and exponentiation operations even if only one is required • Prevents timing attacks against exponentiation operations that are performed as part of asymmetric encryption operations from “Introduction to Side Channel Attacks” by Discretix Technologies Ltd.
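Time equalization can be sketched as a "multiply always" variant of the exponentiation loop: the multiplication is performed for every key bit and its result is simply discarded when the bit is 0. The Python below shows only the structure. Python integers are not constant-time, so this is not a hardened implementation, and the function name and the returned multiplication counter are illustrative.

```python
def modexp_always(y, x, n, w):
    """(y ** x) mod n with the same operation sequence for every w-bit key."""
    s, r = 1, 1 % n
    mults = 0
    for k in range(w):
        bit = (x >> (w - 1 - k)) & 1
        product = (s * y) % n        # always computed, even when bit == 0
        mults += 1
        # Arithmetic select instead of a branch: bit picks product or s.
        r = bit * product + (1 - bit) * s
        s = (r * r) % n
    return r, mults
```

The number of multiplications is now w for every key, at the cost of doing work whose output is thrown away, which matches the performance penalty noted above.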
Countermeasures – Fault Attacks • Run the encryption two times • Output results only if results of the executions are the same • Only detects non-permanent faults • Increases computation time or required hardware components • Add hardware to perform checking • Very costly from “Introduction to Side Channel Attacks” by Discretix Technologies Ltd.
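The "run it twice and compare" countermeasure can be sketched in a few lines of Python. The cipher here is a hypothetical byte-wise XOR stand-in and `GlitchOnce` is an invented helper that simulates a transient fault; neither comes from the cited source.

```python
class FaultDetected(Exception):
    """Raised when two runs of the same encryption disagree."""


def encrypt_checked(encrypt, plaintext, key):
    """Run encrypt twice and release the output only if both runs match."""
    first = encrypt(plaintext, key)
    second = encrypt(plaintext, key)
    if first != second:
        raise FaultDetected("results differ; output withheld")
    return first


def toy_encrypt(plaintext, key):
    """Hypothetical stand-in cipher (byte-wise XOR), for illustration only."""
    return bytes(p ^ k for p, k in zip(plaintext, key))


class GlitchOnce:
    """Simulates a transient fault: flips one bit in the first call only."""

    def __init__(self):
        self.calls = 0

    def __call__(self, plaintext, key):
        self.calls += 1
        out = bytearray(toy_encrypt(plaintext, key))
        if self.calls == 1:
            out[0] ^= 0x01
        return bytes(out)
```

Calling `encrypt_checked(GlitchOnce(), ...)` raises `FaultDetected`, so the faulty ciphertext never leaves the device. A permanent fault that corrupts both runs identically would pass the comparison, which is the limitation noted above.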
SABL - A Logic Solution To DPA • Sense Amplifier Based Logic • Change the hardware design to use the same amount of power no matter what the inputs are. • Not only do 0-1 and 1-0 transitions use the same amount of power, but so do non-transitions like 0-0 and 1-1. • Implementation is similar to dynamic logic with pre-charge and evaluate cycles, but with a couple of extra transistors to ensure that all internal nodes have their capacitance discharged for every input combination and every input change. • SABL Logic from UCLA’s EE Dept. • From: http://www.ee.ucla.edu/%7Etiri/files/esscirc2002.pdf
SABL (continued) • Very similar to dual-rail domino logic, with a few changes • All internal nodes are pre-charged to ‘1’, and during the operation every internal node discharges once. • One and only one of the outputs stays at 1; the other discharges. From: http://www.ee.ucla.edu/%7Etiri/files/esscirc2002.pdf
SABL (continued) • Main change is the extra NMOS transistor M1, which has its gate tied to Vdd • The M1 transistor's job is to make both sides of the dual rail discharge by the end of the clock cycle • It does not affect the operation of the circuit because there is a slight delay through it. • Another small change is connecting the A-bar gate to the drain of the B gate. This ensures that even the nodes on the drains of the B and B-bar gates, which have small capacitance, still get discharged. • This style of logic makes sure every node, even the ones with the smallest capacitance and thus the smallest power swings, gets charged and discharged like the others. • As a result, no operation draws different power from any other, leaving differential power analysis nothing to analyze
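The effect of pre-charged dual-rail logic on the power profile can be illustrated with a very coarse Python model. It counts only output-rail charging events per clock cycle and ignores internal node capacitances, which are exactly what the extra SABL transistors address, so it is a sketch of the principle rather than of the SABL circuit itself. Both function names are invented for the example.

```python
def static_cmos_events(outputs):
    """Per-cycle 0->1 output transitions, the events that draw charge from
    the supply in this simplified model of a single-rail static gate."""
    previous = [0] + outputs[:-1]
    return [int(prev == 0 and cur == 1) for prev, cur in zip(previous, outputs)]


def dual_rail_events(outputs):
    """Per-cycle rails that must be recharged in a pre-charged dual-rail gate.

    Both rails (out, out_bar) are pre-charged to 1; evaluation discharges
    exactly one of them, so exactly one rail is recharged every cycle.
    """
    events = []
    for value in outputs:
        out, out_bar = value, 1 - value             # rail levels after evaluation
        events.append((1 - out) + (1 - out_bar))    # discharged rails to recharge
    return events


data = [0, 1, 1, 0, 1, 0, 0, 1]
single_rail = static_cmos_events(data)
dual_rail = dual_rail_events(data)
```

In this model `single_rail` varies with the data while `dual_rail` is the same in every cycle, so the dual-rail count carries no information about the values being processed.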
Boolean and Arithmetic Masking • Basic principle of masking: change the intermediate values inside an algorithm so that they do not resemble an easily breakable subset of the secret key • Two types of masking, where r is a random number • Boolean: x’ = x xor r • Used for Boolean operations (or, and, shift) • Arithmetic: x’ = (x – r) mod 2^k • Used for arithmetic operations (multiply, add) • Depending on the algorithm, one or both types must be used • If both types are used, there must be conversion from Boolean masking to arithmetic masking and vice versa to maintain proper operation • Current algorithms for this conversion are not secure against DPA, but hopefully future versions will be.
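The two masking formulas can be exercised directly in Python. This is a minimal sketch assuming an 8-bit word (k = 8). It uses XOR and addition as the example operations because they carry straight through the respective mask, and it does not attempt the Boolean-to-arithmetic conversion, which is the hard part mentioned above. The helper names are illustrative.

```python
import secrets

K = 8                    # word size in bits (assumed for the example)
MOD = 1 << K


def boolean_mask(x, r):
    return x ^ r                     # x' = x xor r


def arithmetic_mask(x, r):
    return (x - r) % MOD             # x' = (x - r) mod 2^k


def boolean_unmask(xm, r):
    return xm ^ r


def arithmetic_unmask(xm, r):
    return (xm + r) % MOD


x, y = 0xA7, 0x3C
rx, ry = secrets.randbelow(MOD), secrets.randbelow(MOD)

# XOR on Boolean-masked values: x and y are never combined in the clear.
xor_masked = boolean_mask(x, rx) ^ boolean_mask(y, ry)
xor_result = boolean_unmask(xor_masked, rx ^ ry)

# Addition on arithmetic-masked values.
add_masked = (arithmetic_mask(x, rx) + arithmetic_mask(y, ry)) % MOD
add_result = arithmetic_unmask(add_masked, (rx + ry) % MOD)
```

In both cases the computation runs on randomized values and the combined mask is removed only at the end, so the intermediate values differ from run to run even when x and y stay the same.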
Conclusions • Higher development costs (10%-15%) • Longer development time • Changes in the development process and in the organization • Higher product costs • Larger memory footprint • Lower performance and more power consumption • On-going development of attacks and their countermeasures • Higher security against attacks!
References • F. Weikmann. How do Side Channel Attacks Affect the Software Development Process?, 2003. • F. Koeune. Physical attacks of cryptosystems. UCL Crypto Group. • N. Smart. Physical Side-Channel Attacks on Cryptographic Systems. • R. Junee. Smart Cards and Side-Channel Cryptanalysis. • E. Hess, N. Janssen, B. Meyer, and T. Schütze. Information Leakage Attacks Against Smart Card Implementations of Cryptographic Algorithms and Countermeasures. Siemens and Infineon Technologies. • H. Bar-El. Introduction to Side Channel Attacks. Discretix Technologies Ltd. • K. Tiri, M. Akmal, and I. Verbauwhede. A Dynamic and Differential CMOS Logic with Signal Independent Power Consumption to Withstand Differential Power Analysis on Smart Cards. • P. Kocher, J. Jaffe, and B. Jun. Introduction to Differential Power Analysis and Related Attacks. Cryptography Research, Inc., 1998. • J. Quisquater. Side Channel Attacks. Math RiZK, 2002.