1. Analysis and Mitigation of Process Variation Impacts on Power-Attack Tolerance
This work builds on my co-author Prof. Burleson's research in deep-submicron CMOS variation modeling and mitigation, funded over the last 5 years by SRC and Intel. This particular project is funded by an NSF Cyber-trust grant that explores various aspects of hardware security, including RFID, smart cards and other lightweight embedded systems.
2. Embedded Security
Embedded system
Design metrics: power, performance, area
Secure embedded system
Trusted Platform Module (TPM): secure key storage
Embedded cryptosystem: perform encryption/decryption
Security-related metrics: tamper-resistance, anti-counterfeiting, data integrity and confidentiality
Let me first introduce the concept of embedded security. As recognized by the theme and papers of this session, embedded systems design requires trading off multiple objectives: power, area and performance. These are the major design metrics that a system designer must take into account. When security is required as an additional design metric, the situation becomes considerably more complex.
3. Embedded Security
Embedded system
Design metrics: power, performance, area
Secure embedded system
Trusted Platform Module (TPM): secure key storage
Embedded cryptosystem: perform encryption/decryption
Security-related metrics: tamper-resistance, anti-counterfeiting, data integrity and confidentiality
For example, an embedded system used in security applications may face threats from an adversary. Consider two typical security applications. Most of today's laptops have a trusted platform module that stores secret keys. However, we do not know how safely the key is stored against an adversary with advanced skills for extracting it. Another example is the embedded cryptosystem, which performs some cryptographic function to protect plaintexts. Again, we do not know how robustly such a function secures the data against eavesdropping and malicious attacks. In general, security is a difficult metric to quantify, since it can take the form of tamper-resistance, anti-counterfeiting, data integrity and confidentiality, to name just a few.
4. Side-Channel Attacks
Embedded security CANNOT be guaranteed by the strength of cryptographic algorithms!
Side-Channel Attacks: physical implementation of hardware can leak secret information to the adversary
Embedded system security is further complicated by the intrinsic vulnerabilities introduced by side-channel attacks.
Even if the cryptographic algorithms and key lengths suggest a given level of security, side-channel attacks on the physical implementation of the circuit expose additional vulnerabilities. Side-channel information can come from the electromagnetic radiation, power consumption and execution time of the embedded system, and even from active fault injection into it.
Attacks based on side-channel information are feasible on almost all embedded systems. To defend against the attacks, many different countermeasures have been proposed for securing these side-channels; however, they have significant impacts on the power, area and performance of the resulting embedded systems.
Several recent DAC papers have reviewed the topic of side-channels and how embedded system designers should cope with the additional constraints for side-channel countermeasures.
5. Power Analysis Attack
Every computation process is inevitably accompanied by an amount of power consumption.
Corr (Power, Logic values) > 0
Can digital information leak through the power traces?
Power Analysis Attack: measure and analyze the data-dependent power traces to extract the secret digital information
Simple Power Analysis (SPA)
a one-to-one mapping
Differential Power Analysis (DPA)
Correlation Power Analysis (CPA)
One typical side-channel attack is the power analysis attack, which exploits the dependence of power consumption on the logic values.
Mathematically, the correlation coefficient between the power consumption and the logic values being processed is greater than 0. This is because, when an embedded system performs cryptographic computations on logic values, its power consumption, an analog signal, shows characteristic patterns. The question is whether secret digital information can leak through the power traces when these patterns are analyzed. Power analysis attacks answer YES. A simple power analysis, as shown in the figure on the right, reads the logic values directly from the power trace through a one-to-one mapping. Of course, simple power analysis is not practical on modern integrated circuits with complex power patterns. However, more advanced techniques, such as differential power analysis and correlation power analysis, can still parse the power traces with statistical methods.
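As a minimal illustration of Corr(Power, Logic values) > 0, the sketch below simulates a hypothetical device whose power sample equals the Hamming weight of the processed byte plus Gaussian noise, and then computes the Pearson correlation between the two. The Hamming-weight leakage model, the noise level and all function names are assumptions made for illustration; they are not taken from the talk.

```python
import math
import random

def hamming_weight(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_correlation(n_samples=2000, noise_sigma=2.0, seed=1):
    """Correlate Hamming weights with noisy, data-dependent power samples."""
    rng = random.Random(seed)
    data = [rng.randrange(256) for _ in range(n_samples)]
    weights = [hamming_weight(d) for d in data]
    # Assumed leakage model: power = Hamming weight + Gaussian noise.
    power = [w + rng.gauss(0.0, noise_sigma) for w in weights]
    return pearson(weights, power)
```

Even with noise larger than the signal, the correlation stays clearly positive, which is the dependence that DPA and CPA exploit.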
6. Procedure of DPA attack
Here we present a brief procedure for the Differential Power Analysis attack, or DPA attack. Suppose we want to learn the secret key stored in an embedded cryptosystem. We first feed a number of known input plaintexts to the cryptosystem and measure the power traces. Then we create a power profile that maps the plaintexts to the power traces. Next, we generate the differential power curves, the DPCs, to analyze the correlation between the power and the logic values. We first make a guess of the secret key. Then we group the power traces according to a selection function determined by the key guess. The traces in each group are summed, and the group sums are subtracted to get one DPC.
In the same way, we generate the DPCs for all key guesses and align them on a single plot. If a peak can be found on one of the DPCs, the secret key has been extracted and the number of power traces used is reported. If not, more plaintexts are required to obtain more power traces. In general, the DPA procedure statistically accumulates the data-dependent power and removes the data-independent power. It ultimately finds the secret key whose predicted logic values show the maximum correlation with the power traces.
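The procedure above can be sketched on a toy target. This is only an illustration under stated assumptions, not the DES setup used in the talk: the device is simulated, the target is the 4-bit PRESENT S-box with a 4-bit key, the leakage is the Hamming weight of the S-box output at one assumed time sample, and the two groups are compared by their mean traces rather than raw sums so that unequal group sizes do not bias the curve. Because the leakage polarity is assumed known, the sketch picks the highest positive peak. A 4-bit S-box produces large "ghost peaks" for wrong guesses, so it needs more traces than a larger S-box would.

```python
import random

# PRESENT cipher 4-bit S-box, used here as a small stand-in target.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
LEAK_SAMPLE = 7   # assumed time index where the S-box output leaks
TRACE_LEN = 16    # samples per simulated trace

def hamming_weight(value):
    return bin(value).count("1")

def simulate_traces(key, n_traces, noise_sigma, rng):
    """Return (plaintexts, traces) for a toy device leaking HW(SBOX[p ^ key])."""
    plaintexts, traces = [], []
    for _ in range(n_traces):
        p = rng.randrange(16)
        trace = [rng.gauss(0.0, noise_sigma) for _ in range(TRACE_LEN)]
        trace[LEAK_SAMPLE] += hamming_weight(SBOX[p ^ key])
        plaintexts.append(p)
        traces.append(trace)
    return plaintexts, traces

def selection_bit(plaintext, key_guess):
    """Selection function: top bit of the S-box output predicted by the guess."""
    return (SBOX[plaintext ^ key_guess] >> 3) & 1

def differential_power_curve(plaintexts, traces, key_guess):
    """Mean trace of the bit=1 group minus mean trace of the bit=0 group."""
    sums = [[0.0] * TRACE_LEN, [0.0] * TRACE_LEN]
    counts = [0, 0]
    for p, trace in zip(plaintexts, traces):
        group = selection_bit(p, key_guess)
        counts[group] += 1
        for t, sample in enumerate(trace):
            sums[group][t] += sample
    if 0 in counts:  # not enough traces to form both groups
        return [0.0] * TRACE_LEN
    return [sums[1][t] / counts[1] - sums[0][t] / counts[0]
            for t in range(TRACE_LEN)]

def dpa_attack(plaintexts, traces):
    """Return the key guess whose DPC shows the highest positive peak."""
    peaks = {guess: max(differential_power_curve(plaintexts, traces, guess))
             for guess in range(16)}
    return max(peaks, key=peaks.get)
```

A typical use is `pts, trs = simulate_traces(0xB, 4000, 0.5, random.Random(0))` followed by `dpa_attack(pts, trs)`; if the returned guess is wrong, the attacker collects more traces and repeats.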
7. How to tolerate power attacks?
Goal: Corr (Power, Logic values) → 0
Signal-to-noise ratio (SNR) reduction:
Decrease the data-dependent power (power balancing circuits, e.g. dual-rail logic and differential pair routing)
Increase the data-independent power (noise insertion)
Drawbacks: power balancing techniques are impractical due to variations; noise can be statistically removed.
Time de-synchronization
Drawbacks: difficult to implement (the logic functions should not be disturbed) and not always effective.
To make an embedded system tolerate DPA attacks, we must minimize the correlation between power and logic values. Two methods can achieve this goal. The first is signal-to-noise ratio reduction: the correlation can be reduced either by decreasing the data-dependent power or by increasing the data-independent power. Various dual-rail and differential logic styles reduce the data-dependent power to resist DPA attacks. However, as shown later in this talk, process variations have a very detrimental impact on such DPA-resistant logic styles by upsetting the carefully balanced circuits.
On the other hand, data-independent power can be increased by noise insertion to hinder power attacks, but the noise can still be removed by statistical analysis of a sufficient number of power traces.
The second method is time de-synchronization, such as using asynchronous logic or introducing non-deterministic delays or jitter into the embedded system. Since DPA needs a time reference to map the measured power traces to the logic values, de-synchronizing the system clock disturbs the analysis. The drawback of this method is that the de-synchronization circuitry is costly to implement and not always effective.
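The remark that inserted noise can be statistically removed follows from a standard averaging argument. As a sketch, assume each measured sample is a data-dependent signal s plus independent zero-mean noise with standard deviation σ_n (these symbols are introduced here for illustration and do not appear in the slides):

```latex
\[
P_i = s + n_i, \qquad
\bar{P} = \frac{1}{N}\sum_{i=1}^{N} P_i, \qquad
\operatorname{Var}\!\left(\bar{P}\right) = \frac{\sigma_n^{2}}{N}
\]
\[
\sigma_n \;\to\; k\,\sigma_n
\quad\Longrightarrow\quad
N \;\to\; k^{2} N \quad \text{for the same residual noise.}
\]
```

Under this assumption, multiplying the noise amplitude by k only forces the attacker to collect about k² times as many traces; it does not remove the leak.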
8. Power-Attack Tolerance: A new security metric
Parse the power traces
Define the gate-level power-attack tolerance (PAT)
SNR of Pdyn and Pleak: standard deviation of the power over all possible logic values, divided by its mean
PAT is the inverse of SNR: PAT = 1 / SNR
To quantify the ability of an embedded system to tolerate power analysis attacks, we need a new security metric. To deal with power attacks on deep-submicron systems, the metric should cover both the dynamic power and the leakage power in the power traces, since both are data-dependent. Thus we define a gate-level metric, the power-attack tolerance, or PAT for short. This metric is applicable to all logic styles. For a generic N-input logic gate, there are 2^(2N) possible logic transitions for the dynamic power and 2^N possible logic values for the leakage power. We use the ratio of standard deviation to mean to represent the sensitivity of the power to all possible logic values; a more sensitive gate is more vulnerable to power analysis attacks. We call this sensitivity the signal-to-noise ratio, or SNR, that is, the ratio of data-dependent power to data-independent power. The PAT is the inverse of the SNR.
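The definition above can be sketched in a few lines. This is a minimal reading of the spoken description (SNR as standard deviation over mean, PAT as its inverse), not the exact formula from the slide, which did not survive extraction. The population standard deviation is used because all logic cases are enumerated, and the leakage numbers in the example are made up.

```python
import math
from itertools import product
from statistics import mean, pstdev

def power_attack_tolerance(powers):
    """PAT = 1 / SNR = mean / standard deviation over all logic cases."""
    spread = pstdev(powers)
    if spread == 0:
        return math.inf  # perfectly balanced gate: power reveals nothing
    return mean(powers) / spread

def leakage_cases(n_inputs):
    """All 2^N input values of an N-input gate (leakage power cases)."""
    return list(product((0, 1), repeat=n_inputs))

def transition_cases(n_inputs):
    """All 2^(2N) input transitions of an N-input gate (dynamic power cases)."""
    states = leakage_cases(n_inputs)
    return list(product(states, states))

# Hypothetical leakage power (arbitrary units) of a 2-input gate per input value.
example_leakage = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.5, (1, 1): 4.0}
example_lpat = power_attack_tolerance(
    [example_leakage[case] for case in leakage_cases(2)])
```

A larger PAT means the power varies less with the data relative to its mean, so the gate gives an attacker less to work with.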
9. Process Variations in Deep Submicron
Deterministic design metrics become probabilistic with process variations
Parameters: transistor geometry, interconnect geometry, oxide thickness, doping profile, etc.
Assume Gaussian distribution -> analyze the Probability Density Function of design metrics (e.g., power and performance)
Intra-die vs. inter-die process variations
Process variation impacts on power
Leakage power is extremely sensitive
Dynamic power is also affected by the transistor size variations
After defining the PAT, let us turn to process variations. This work focuses entirely on the power side-channel, since it is significantly affected by process variations. Process variations can arise from transistor geometry, interconnect geometry, doping profile and so on. They are generally modeled as Gaussian distributions, so that deterministic design metrics become probabilistic, each with a probability density function. In deep-submicron technologies, process variations affect the power consumption, especially the leakage power, according to many published papers. In the context of embedded security, we are particularly interested in their impact on both the dynamic power-attack tolerance and the leakage power-attack tolerance, the DPAT and LPAT. We believe the results make the study of PAT more realistic and comprehensive.
10. Experimental Setup
Goal: find the distribution function of PAT under realistic process variations.
Device model: 45nm Predictive Technology Model (www.eas.asu.edu/~ptm/)
Intra-die process variations: ITRS 2006 reports for 45nm
Threshold voltage Vth: 42% variation (3σ)
Effective channel length Leff: 12% variation (3σ)
Methodology
Monte Carlo simulation in SPICE on standard-cell CMOS gate
8000 iterations to achieve accurate curve fitting
We set up an experiment to find the distribution function of PAT under process variations. All experiments are simulations with realistic models. The device model is the 45nm Predictive Technology Model. The process variation model is based on the International Technology Roadmap for Semiconductors (ITRS) 2006 reports. We study two important intra-die variations in this work: the threshold voltage, with a 3-sigma variation of 42% from the mean, and the effective channel length of the transistor, with a 3-sigma variation of 12% from the mean.
To introduce the process variations, we use Monte Carlo simulation in SPICE. We calculate each PAT by simulating the power traces of all possible logic values on a single gate. We then repeat this over a large number of Monte Carlo iterations to obtain an accurate PAT distribution.
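The sampling side of this setup can be sketched as follows. The sketch only shows how a "3-sigma variation of X% from the mean" maps onto a Gaussian draw; the `evaluate` callable is a hypothetical stand-in for the SPICE run that would return a PAT value, the nominal value is normalized to 1.0, and a single value is drawn per iteration, whereas a real intra-die model would draw one per transistor.

```python
import random

def sigma_from_three_sigma(nominal, three_sigma_fraction):
    """Convert a 3-sigma variation, given as a fraction of the mean, to one sigma."""
    return nominal * three_sigma_fraction / 3.0

def monte_carlo(evaluate, nominal, three_sigma_fraction,
                iterations=8000, seed=0):
    """Draw one Gaussian parameter value per iteration and evaluate it.

    `evaluate` is a hypothetical placeholder for the circuit simulation.
    """
    rng = random.Random(seed)
    sigma = sigma_from_three_sigma(nominal, three_sigma_fraction)
    return [evaluate(rng.gauss(nominal, sigma)) for _ in range(iterations)]

# Example: sample a normalized threshold voltage with 42% variation at 3 sigma.
vth_samples = monte_carlo(lambda vth: vth, nominal=1.0,
                          three_sigma_fraction=0.42)
```

With the identity function as `evaluate`, `vth_samples` is simply the list of sampled parameter values, which is useful for checking the variation model before plugging in a simulator.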
11. Probabilistic DPAT and LPAT with Process Variation
Results of PAT distribution function
Use Weibull distribution function
Asymmetric distribution
Parameterized by alpha and beta to mimic other distribution functions
Used in reliability and failure analysis
The resulting PAT distribution functions of a standard CMOS 2-input NAND gate are shown here. The left one is for the dynamic power-attack tolerance, DPAT, and the right one is for the leakage power-attack tolerance, LPAT. In both figures, the black curve represents the threshold voltage variation and the red curve represents the channel length variation. The distribution curves are clearly asymmetric. To model such curves, we use the Weibull distribution function, which can mimic many other distribution functions by adjusting two parameters, alpha and beta. The mathematical expression of the Weibull function is shown on the right, where the gamma function is a commonly used function in science and engineering.
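Since the slide's formula is not reproduced here, the following sketch uses the standard two-parameter Weibull form. Which of alpha and beta the talk uses for shape and which for scale is not recoverable from the text, so the code uses the neutral names `shape` and `scale`. The mean and standard deviation are where the gamma function enters.

```python
import math

def weibull_pdf(x, shape, scale):
    """Two-parameter Weibull density; treated as 0 for x <= 0 for simplicity."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def weibull_mean(shape, scale):
    """Mean = scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def weibull_std(shape, scale):
    """Std = scale * sqrt(Gamma(1 + 2/shape) - Gamma(1 + 1/shape)^2)."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return scale * math.sqrt(g2 - g1 * g1)
```

With `shape = 1` the density reduces to an exponential distribution, which is one example of how the two parameters let the Weibull form imitate other distributions.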
12. Result summary
Degradation probability: the percentage of PAT less than the nominal value
The distributions have different skews relative to the nominal
More than 50% of both DPAT and LPAT are degraded due to Vth / Leff variations
LPAT has worse degradation probability
Here we summarize the simulation results, represented by the mean and standard deviation of the Weibull function. Comparing DPAT and LPAT, we see that their mean values are almost equal, while the standard deviation of LPAT is much larger than that of DPAT. The reason is that process variations affect the sensitivity of leakage power to the logic values more than that of dynamic power.
To better understand the results, we use the degradation probability, the percentage of PAT values smaller than the nominal value, to assess the impact of process variations on PAT. We find that more than 50% of both DPAT and LPAT values are degraded, which indicates a negative impact. Again, LPAT has a worse degradation probability than DPAT.
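The degradation probability can be computed either directly from Monte Carlo samples or from a fitted Weibull curve. The sketch below shows both. The function names are illustrative, and the Weibull parameters are again called `shape` and `scale` because the slide's alpha/beta convention is not known.

```python
import math

def degradation_probability(pat_samples, nominal_pat):
    """Fraction of sampled PAT values that fall below the nominal PAT."""
    below = sum(1 for value in pat_samples if value < nominal_pat)
    return below / len(pat_samples)

def weibull_cdf(x, shape, scale):
    """P(X < x) for a two-parameter Weibull distribution."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def fitted_degradation_probability(nominal_pat, shape, scale):
    """Degradation probability read off a fitted Weibull curve."""
    return weibull_cdf(nominal_pat, shape, scale)
```

A value above 0.5 means the distribution is skewed toward PAT values worse than nominal, which is the situation reported on this slide.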
13. Process Variation Impact on PAT for DPA-resistant logic styles
Sense Amplifier-Based Logic (SABL)
Tolerate power attacks by power balancing circuit with 3-4x design overhead.
Ideally infinite PAT; but in reality, 10x larger than the PAT of equivalent CMOS gates.
Process variation impacts
Results: 59-71% degradation probability
LPAT degrades even worse, as low as CMOS gates
After studying the process variation impacts on standard CMOS logic, we turn to the DPA-resistant logic styles. We take sense amplifier-based logic (SABL) as a showcase, since it is commonly accepted as a typical DPA-resistant logic style. SABL employs a dynamic differential circuit, as shown in the schematic on the right. The power consumption of an SABL gate is ideally independent of the input logic values, at 3-4 times the design overhead. The actual PAT of an SABL gate is 10 times larger than that of an equivalent CMOS gate.
Introducing the same process variations to the SABL gate, we simulate the PAT distribution function. Surprisingly, the degradation probability is as large as 59-71%. We compare the PAT distributions of SABL and CMOS in the figures below. The solid line is for SABL and the dashed line is for standard CMOS. We find that SABL has a much larger standard deviation due to process variations. The LPAT of SABL is seriously degraded, to a region as low as that of standard CMOS. In all, process variations have a negative impact on the PAT of DPA-resistant logic styles.
14. Case study of DPA
How does the process variation impact a Differential Power Analysis attack on a cryptosystem?
Simulation-based DPA
Targeting secret information: the 6-bit subkey of the 5th Substitution-box (S-box) during the first round of DES
Hspice: simulate the power traces of a DES cryptosystem
Perl: manage the power traces and perform the DPA procedure to generate differential power curves (DPCs)
Metric for DPA attack: measurement to disclosure (MTD)
Number of power traces needed to break a cryptosystem
Depends on the cryptosystem implementations
Simulation-based DPA attack gives lower MTD than real attacks, due to the noise-free environment
Having recognized the negative impact of process variations on the gate-level PAT, we go on to study their impact on a differential power analysis attack. We implement a DES (Data Encryption Standard) cryptosystem with both standard CMOS and SABL gates, and simulate the power traces for random input plaintexts. A simulation-based DPA attack, written in Perl, statistically analyzes the power traces and generates differential power curves.
The commonly used metric for a DPA attack is the measurement to disclosure, or MTD. As mentioned in the DPA procedure, it is the number of power traces required to extract the secret key from an arbitrary module of the cryptosystem. The MTD from our simulation-based DPA attack is smaller than that of real attacks, because no on-chip noise is modeled in our simulations.
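One simple way to measure MTD is to rerun the attack on a growing number of traces until the correct key comes out. The sketch below assumes a hypothetical `attack(n)` callable that runs DPA on the first n traces and returns its best key guess; the step size and the stand-in attack are illustrative only. A stricter variant would also require the guess to stay correct for all larger n.

```python
def measurements_to_disclosure(attack, true_key, max_traces, step=10):
    """Smallest trace count (a multiple of `step`) at which the attack succeeds.

    Returns None if the key is not disclosed within `max_traces`.
    """
    for n_traces in range(step, max_traces + 1, step):
        if attack(n_traces) == true_key:
            return n_traces
    return None

def make_threshold_attack(true_key, threshold):
    """Hypothetical stand-in attack: succeeds once it sees enough traces."""
    def attack(n_traces):
        return true_key if n_traces >= threshold else None
    return attack
```

For example, `measurements_to_disclosure(make_threshold_attack(0x2A, 95), 0x2A, 500)` reports the first multiple of 10 at or above the stand-in threshold.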
15. Process Variation Impacts on DPA
Validate the MTD distribution:
Simulate enough power traces to find out the correct key
Nominal MTD: standard CMOS = 120 power traces, SABL = 2300 power traces
Given 42% Vth variation, perform Monte Carlo simulation to find the MTD
Result: MTD of SABL shows worse degradation probability
To validate the MTD distribution, we first simulate enough power traces to find the MTD without process variations. The nominal MTD of CMOS is 120, and that of SABL is 2300. Then we apply a 42% threshold voltage variation to the cryptosystem and perform Monte Carlo simulation to find the distribution of MTD. Because simulating many power traces is very time-consuming, we have to reduce the number of Monte Carlo iterations; the statistical accuracy remains acceptable as long as the results show the general trend of the MTD distribution. As seen in the bar charts, CMOS has a degradation probability of 45%, while that of SABL is as large as 57%. This indicates that the DPA resistance of the SABL cryptosystem is more likely to be degraded by process variations than that of the standard CMOS one.
16. Mitigation: Transistor Sizing Optimization
Goals:
Compensate for PAT uncertainty
Increase the mean PAT
Fine-grain transistor sizing
Set global sizing constraint
Find best-case PAT
System-level simulation on MTD
Resizing for MTD optimization
Design optimized; otherwise, run another iteration with reduced sizing constraints
Optimization of SABL DES
Degradation probability reduced from 57% to 18% (achieved by 4 iterations)
0.9% power / 1.5% area overhead
To mitigate the PAT degradation caused by process variations, we developed a transistor sizing optimization method. The basic idea is that a larger transistor tolerates process variations better, and therefore tolerates power attacks better. However, the transistors in a logic gate cannot be made so large that they violate other design metrics. Therefore, we first set a global sizing constraint and find the best-case gate-level PAT for each type of gate. We then use the modified SABL gate library to build the cryptosystem and find the MTD. The resulting MTD distribution may not be optimal, because some upsized transistors consume more data-dependent power, so we resize some critical transistors to get a better MTD. If the result is still not optimal, we go all the way back to the beginning: we reduce the global sizing constraint, modify the gate library, and run the MTD simulation again. In the end, it took 4 optimization iterations to reach the best MTD distribution, reducing the degradation probability from 57% to 18%. The power and area overheads of the optimized design stay acceptable, at 0.9% and 1.5% respectively.
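The iteration described in these notes can be sketched roughly as follows. This is only an illustrative outline, not the tool flow used in this work: the callbacks `build_library`, `resize_critical` and `simulate_mtd`, the `shrink` factor, and the definition of degradation probability as the fraction of simulated MTD samples that fall below the nominal MTD are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def degradation_probability(mtd_samples: Sequence[float], nominal_mtd: float) -> float:
    """Assumed definition: share of MTD samples that are worse than nominal."""
    if not mtd_samples:
        raise ValueError("need at least one MTD sample")
    return sum(m < nominal_mtd for m in mtd_samples) / len(mtd_samples)


def optimize_sizing(
    build_library: Callable[[float], dict],       # hypothetical: gate library for a sizing limit
    resize_critical: Callable[[dict], dict],      # hypothetical: resize critical transistors
    simulate_mtd: Callable[[dict], Sequence[float]],  # hypothetical: MTD samples under variation
    nominal_mtd: float,
    constraint: float,
    target: float,
    shrink: float = 0.9,          # assumed factor for reducing the global constraint
    max_iterations: int = 10,
) -> Tuple[Tuple[float, float], List[Tuple[float, float]]]:
    """Return the best (constraint, probability) pair and the full history."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    history: List[Tuple[float, float]] = []
    for _ in range(max_iterations):
        # Build the modified library under the current global sizing constraint.
        library = build_library(constraint)
        p = degradation_probability(simulate_mtd(library), nominal_mtd)
        if p > target:
            # Not good enough yet: resize critical transistors and re-simulate.
            library = resize_critical(library)
            p = degradation_probability(simulate_mtd(library), nominal_mtd)
        history.append((constraint, p))
        if p <= target:
            break  # design optimized
        # Otherwise run another iteration with a reduced sizing constraint.
        constraint *= shrink
    best = min(history, key=lambda entry: entry[1])
    return best, history
```

In a real flow, `simulate_mtd` would wrap circuit-level simulation of the cryptosystem under process variation; here it is left as a callback so that the loop structure stays visible.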
17. Conclusions Process variations deteriorate the Power-Attack Tolerance (up to 61% for DPAT and 71% for LPAT), and hence facilitate DPA attacks.
The advantage of DPA-resistant logic gates (e.g. SABL) is compromised by up to 57%.
Selective transistor upsizing in the gate library can mitigate the process variation impacts with minor design overhead.
Future work
Design methodology of secure embedded systems:
Evaluate the PAT at different abstraction levels
Determine the dominant factors
Modify gate library to trade off security vs. overhead
Selectively use modified library on critical circuits
Extension to other side-channels
Based on all the above results, we conclude that process variations can negatively impact the power-attack tolerance of a deep submicron embedded cryptosystem, and thereby facilitate DPA attacks. We demonstrate that even the advantage of DPA-resistant logic gates can be compromised by process variations, by up to 57%. To mitigate this, we proposed a simple but effective approach of selective transistor upsizing, which achieves the best optimization result with minor design overhead. Building on this work, we hope to evaluate the PAT at different abstraction levels by modeling various factors, not only process variations. We also hope to understand the trade-off in gate library design between security and overhead. One suitable solution may be to use the modified library selectively, only on critical circuits with high security demands, to minimize overhead. Although this work focuses only on power side-channels, we expect to extend our methods to other side-channels, which will require new security metrics as well.
Finally, we hope that this work will help the design methodology of secure embedded systems in the near future.
18. THANK YOU!