Hardware Core for Off-chip Memory Security Management in Embedded Systems
Russell Tessier (1), Jeremie Crenne (2), Romain Vaslin (2), Guy Gogniat (2), Jean-Philippe Diguet (2), and Deepak Unnikrishnan (1)
(1) University of Massachusetts, Amherst
(2) European University of Brittany

Introduction
• A need for security
• Embedded systems & attacks
• Threat model
• State of the art
• Contributions

A need for security
• Increase of personal mobile devices (cell phone, MP3 player, GPS)
• Digital convergence:
  • Several mobile devices in one
• Security concerns:
  • Intellectual property protection
  • Personal information
NEED FOR PROTECTION AND PRIVACY

Embedded Systems & potential attacks
• Example of embedded system architecture:
  [Figure: board with two chips, external memory, and power supply. The chips contain GPPs with local memory and an OS, a DSP, coprocessors, hardware accelerators (AES, RSA, turbo code), shared memory, key RAM, monitors, a communication network, and external communication interfaces]
• Threats:
  • Virus/Worms
  • Reverse engineering
  • Fault injection
  • Memory modification
  • Bus modification
  • Side channel
  • Bus probing

The challenge of memory protection & Threat Model
[Figure: three panels (two labeled T=150 and T=550) showing a processor core with instruction and data caches in the SECURE ZONE issuing "Read @3" to external memory in the UNTRUSTED ZONE]
[Figure: SECURE ZONE (processor core, instruction cache, data cache) connected through the data bus and bus control to external memory (OS code, OS data, data) in the UNTRUSTED ZONE]
• External bus access leads to:
  • Code extraction/modification
  • Private data extraction/modification
• Threat model:
  • A secure zone
  • Any possible modification and observation on the address and data buses
• Targeted attacks:
  • Spoofing
  • Relocation
  • Replay

State of the art
[Figure: TRUSTED ZONE (processor, instruction cache, data cache, Hardware Security Core) connected by address, data, and control lines to the DDR IP controller and DDR SRAM memory in the UNTRUSTED ZONE]
• Existing solutions relying on the same threat model:
  • AEGIS (MIT): One-time-pad / Cached hash tree (OS controlled)
  • XOM (Stanford): One-time-pad / MD5 (OS controlled)
  • PE-ICE (LIRMM): AES / Tag comparison
  • TEC-Tree (Princeton/LIRMM): PE-ICE / hash tree
• Issues:
  • High memory overhead (>50%)
  • Software execution performance loss (>50%)
  • Area overhead (several AES cores, MD5 or SHA-1 cores)

Contributions
[Figure: FPGA containing the SW processor core and the security core, with encrypted code & data in RAM memory, encrypted applications in Flash memory, and download over Ethernet]
• Solution fitting with embedded systems resources:
  • Logic size
  • Memory footprint (including security data)
  • Power consumption
  • Performance
• Flexible solution for the software designer:
  • Flexible architecture
  • Flexible security policy
• End to end solution:
  • Secure system boot up
  • Application update
  • Security update

1 – Confidentiality & integrity scheme
2 – Hardware security management
3 – Security cost
4 – End to end solution

• How to guarantee confidentiality & integrity?
• Hardware security management
• Evaluation of the security cost
• End to end solution
• Conclusion & perspectives

1 – Confidentiality & integrity scheme
• Common security tools
• AES-CTR mode
• Fast integrity checking with AES-GCM
• Confidentiality & integrity in action
• Comparison with previous work

Common security tools
[Figure: write request of a cache line. In the TRUSTED ZONE, the clear cache line from the processor caches and its address pass through the AES core (ciphering, AES key) and the hash core before the ciphered cache line goes to external memory in the UNTRUSTED ZONE]
[Figure: read request of a cache line. The ciphered cache line from external memory passes through the AES core (deciphering) and the hash core, whose output is compared (= ?) before the clear cache line reaches the caches]
• AES based:
  • Adds latency (~10 cycles/AES computation)
  • Critical data path latency (70 cycles for a read)
  • Processor based architecture
• Hash algorithm based:
  • Adds latency (60, 80 cycles/hash computation)

AES-CTR: An efficient confidentiality scheme
• AES in Counter mode of operation (AES-CTR)
• AES input composed of:
  • Time stamp/counter (replay)
  • Data address (relocation)
  • Initialization vector
• Deciphering latency gain
[Figure: ciphering and deciphering. The AES core (AES key, 128 bits) turns the AES input (@, TS, IV) into a keystream, which is XORed with the plaintext to give the ciphertext, or with the ciphertext to give the plaintext]
[Figure: timing comparison. (a) Data fetching, then AES deciphering, then sending data to core. (b) Keystream generation (AES) in parallel with data fetching, then xor and sending data to core, with the resulting latency gain]
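The counter-mode idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `keystream_block` uses HMAC-SHA256 from the standard library as a stand-in for the AES core so that the sketch runs without third-party packages, and it simply concatenates the three fields, because the slide does not say how they are packed into the 128-bit AES input. The function names, key, IV, and address values are all hypothetical.

```python
import hashlib
import hmac

BLOCK_BYTES = 16  # one 128-bit block, as produced by the AES core


def keystream_block(key: bytes, iv: bytes, address: int, timestamp: int) -> bytes:
    """Derive a keystream block from IV, data address, and time stamp.

    Assumption: HMAC-SHA256 replaces AES here; it is NOT the cipher used
    by the hardware core, only a keyed function with the same role.
    """
    seed = iv + address.to_bytes(4, "big") + timestamp.to_bytes(4, "big")
    return hmac.new(key, seed, hashlib.sha256).digest()[:BLOCK_BYTES]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (ciphering and deciphering are the same)."""
    return bytes(x ^ y for x, y in zip(a, b))


key = bytes(range(16))          # hypothetical 128-bit key
iv = bytes(12)                  # hypothetical 96-bit initialization vector
plaintext = b"0123456789abcdef"

# The keystream depends only on (key, IV, address, TS), not on the data,
# so it can be computed while the ciphertext is still being fetched.
ks = keystream_block(key, iv, address=0x8000020, timestamp=7)
ciphertext = xor_bytes(plaintext, ks)
recovered = xor_bytes(ciphertext, ks)
```

Because the keystream does not depend on the fetched data, it can be generated during the memory access, which is where the deciphering latency gain in panel (b) comes from. Changing the address or the time stamp changes the keystream, which is what ties a ciphered line to its location and version.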

AES-GCM: A counter based mode with a low latency integrity checking
• AES-GCM:
  • Is NIST standardized
  • Relies on 128-bit AES and can be parallelized and pipelined
  • Provides fast integrity checking
• Integrity operations rely on Galois Field operations:
  • Multiplication in GF(2^128) can be done in 1 cycle, but has to be carefully designed to avoid huge logic overhead
  • A 128-bit data integrity check can be done in 3 additional cycles
[Figure: timing comparison. (a) Data fetching, then AES deciphering, then sending data to core. (b) Keystream generation (AES) in parallel with data fetching, then xor, integrity check (ic), and sending data to core, with the resulting latency gain]
[Figure: AES-GCM datapath. Encryption & decryption circuitry: two 128-bit AES cores take IV_96 || @_32 || TS_32 and IV_96 || @_32 || (TS+1)_32 as inputs; their 128-bit outputs are XORed with Plaintext 1 and Plaintext 2 to give Ciphertext 1 and Ciphertext 2. Authentication circuitry: Ciphertext 1, Ciphertext 2, and the block 0^64 || Len(C)_64 pass through a chain of three MultH stages to produce the 128-bit Tag]
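The MultH chain in the authentication circuitry can be illustrated in software. The sketch below implements the GF(2^128) multiplication as specified for GCM (bit-serial, so 128 loop iterations instead of the single hardware cycle mentioned on the slide) and chains it over two ciphertext blocks and the length block, following the diagram. It is a sketch only: the hash subkey `h` and the block values are made-up test inputs, and standard GCM additionally masks the chain output with an encrypted counter block, which is left out here.

```python
R = 0xE1 << 120  # GCM reduction constant for x^128 + x^7 + x^2 + x + 1


def gf128_mul(x: int, y: int) -> int:
    """Multiply two 128-bit field elements using GCM's bit ordering
    (the most significant bit of the integer is the coefficient of x^0)."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:   # scan x from its leftmost bit
            z ^= v
        # multiply v by x in the field: shift right, reduce if a bit falls off
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def mult_h_chain(h: int, c1: int, c2: int) -> int:
    """Three MultH stages over Ciphertext 1, Ciphertext 2, and 0^64 || Len(C)_64."""
    len_block = 256                # Len(C) = 2 x 128 bits, upper 64 bits are zero
    y = gf128_mul(c1, h)
    y = gf128_mul(y ^ c2, h)
    return gf128_mul(y ^ len_block, h)


def truncate_tag(tag: int, kept_bits: int = 32) -> int:
    """Keep only the most significant bits of the 128-bit tag."""
    return tag >> (128 - kept_bits)


h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E   # hypothetical hash subkey
c1 = 0x0388DACE60B6A392F328C2B971B2FE78  # hypothetical ciphertext blocks
c2 = 0x00112233445566778899AABBCCDDEEFF
tag32 = truncate_tag(mult_h_chain(h, c1, c2))
```

Each stage is one field multiplication by the same value H, so a hardware multiplier that completes in one cycle makes the whole check cost only a few cycles after the last ciphertext block arrives, consistent with the "3 additional cycles" stated on the slide.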

Typical use with a processor architecture – Write request
[Figure: Trusted Area with the processor, instruction cache, data cache, and Hardware Security Core (timestamp memory, timestamp generator, AES key (UKey), IV, address @, AES-GCM block with AES input/output, IC tag memory, core control with bypass). The input cache line is XORed with the keystream to give the output cache line sent to external memory in the Untrusted Area]
• Operations scheduling:
  [Figure: timeline with cache data fetching, keystream generation (AES), xor, and sending ciphered data; steps labeled ICG, TS+, TSm, ICm]

Typical use with a processor architecture – Read request
[Figure: Trusted Area with the processor, instruction cache, data cache, and Hardware Security Core (timestamp memory, timestamp generator, AES key (UKey), IV, address @, AES-GCM block with AES input/output, IC tag memory, tag comparison (= ?), core control with bypass and valid signals). The input cache line from external memory in the Untrusted Area is XORed with the keystream to give the output cache line]
• Operations scheduling:
  [Figure: timeline with ciphered data fetching, keystream generation (AES), xor, and sending data to cache; steps labeled ICG, TSm, ICm]
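The write and read flows above can be modeled behaviorally to show how on-chip time stamps and tags catch the three targeted attacks (spoofing, relocation, replay). The following Python model is a sketch under assumptions, not the hardware design: HMAC-SHA256 stands in for the AES-GCM core, the class and method names are hypothetical, and cache lines are limited to 32 bytes because of the stand-in primitive.

```python
import hashlib
import hmac


class IntegrityError(Exception):
    """Raised when a fetched cache line fails its tag comparison."""


class SecureMemoryModel:
    TAG_BYTES = 4  # only the 32 most significant tag bits are stored

    def __init__(self, key: bytes, iv: bytes):
        self.key, self.iv = key, iv
        self.external = {}    # untrusted area: address -> ciphered line
        self.timestamps = {}  # trusted area: address -> time stamp (TS)
        self.tags = {}        # trusted area: address -> IC tag

    def _prf(self, label: bytes, address: int, ts: int, data: bytes = b"") -> bytes:
        # Assumption: HMAC-SHA256 replaces the AES-GCM core in this model.
        seed = label + self.iv + address.to_bytes(4, "big") + ts.to_bytes(4, "big") + data
        return hmac.new(self.key, seed, hashlib.sha256).digest()

    def write(self, address: int, line: bytes) -> None:
        if len(line) > 32:
            raise ValueError("model supports lines of at most 32 bytes")
        ts = self.timestamps.get(address, 0) + 1   # fresh TS on every write
        keystream = self._prf(b"ks", address, ts)[:len(line)]
        ciphered = bytes(p ^ k for p, k in zip(line, keystream))
        self.external[address] = ciphered
        self.timestamps[address] = ts
        self.tags[address] = self._prf(b"tag", address, ts, ciphered)[:self.TAG_BYTES]

    def read(self, address: int) -> bytes:
        ts = self.timestamps[address]
        ciphered = self.external[address]
        expected = self._prf(b"tag", address, ts, ciphered)[:self.TAG_BYTES]
        if not hmac.compare_digest(expected, self.tags[address]):
            raise IntegrityError(hex(address))
        keystream = self._prf(b"ks", address, ts)[:len(ciphered)]
        return bytes(c ^ k for c, k in zip(ciphered, keystream))
```

In this model, overwriting `external[address]` with arbitrary bytes (spoofing), with a line copied from another address (relocation), or with an older ciphered line for the same address (replay) makes `read` raise `IntegrityError`, because the stored tag is bound to the address, the current time stamp, and the ciphered data.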

Comparison with state of the art
• Memory footprint for 256 kB of data & 256 kB of code:
  [Figure: stacked bars splitting each total into off-chip data, on-chip data, off-chip code, and on-chip code]
  • AES-GCM / 160 kB
  • PE-ICE / 280 kB
  • XOM / 288 kB
  • TEC-Tree / 390 kB
  • AEGIS / 468 kB
• Approaches overview:
  [Figure: bar chart in % for the DES, Object tracking, Dhrystone, and ADPCM benchmarks, comparing AES-GCM [1], XOM (MD5), and PE-ICE]
[1] AES-GCM produces a 128-bit IC tag for a 128-bit word. We keep only the 32 MSBs to avoid the memory penalty. The security level can be increased to 1/2^128.
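Footnote [1] implies a simple storage trade-off, which the short calculation below makes explicit. It covers only IC tag storage as described in the footnote (one tag per 128-bit word); time stamp storage and the per-scheme totals in the chart are not modeled, and the function name is hypothetical.

```python
WORD_BITS = 128  # one IC tag is produced per 128-bit word


def tag_storage_kb(protected_kb: float, kept_tag_bits: int = 32) -> float:
    """Tag storage needed for a protected region, in the same unit as the input."""
    return protected_kb * kept_tag_bits / WORD_BITS


truncated = tag_storage_kb(256)                 # 32 MSBs kept per tag
full = tag_storage_kb(256, kept_tag_bits=128)   # whole 128-bit tag kept
```

Keeping 32 of 128 tag bits means tag storage is one quarter of the protected region (64 kB for 256 kB), whereas storing full tags would need as much memory as the region itself.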

2 – Hardware security management
• Architecture & security flexibility
• Security memory mapping
• SMM construction example
• Integration of SMM
• Architecture detailed view

A need for architecture and security flexibility
• Cost of security is high (area, performance, memory)
• Requires resource usage within the FPGA:
  • Memory (between 30 & 50% overhead)
  • Software execution performance (between 15 & 30% overhead)
• New solutions to save resources:
  • Hardware?
  • Software?
  • Hardware & software?
• Offer more control on security policy to the designer

Security memory mapping
• Security management based on memory mapping of the code & data
• Adapted for applications running with an Operating System
• Advantages:
  • Reduction of security memory overhead
  • Reduction of software execution losses
  • Reduction of power consumption due to security
[Figure: memory map compared under uniform protection and per-segment protection. Code region: Task 1 code, Task 2 code, ..., Task n code, OS code. Data region: R/W data, OS data, Task 1 stack, Task 2 stack, ..., Task n stack. Legend: non protected, confidentiality only, confidentiality / integrity]

SMM construction
• Segment 0:
  • Base @: 0x8000020
  • Size: 1028 bytes
  • Confidentiality & integrity
  • Code
• Segment 1:
  • Base @: 0x8000424
  • Size: 680 bytes
  • Confidentiality only
  • Code
• Segment 2:
  • Base @: 0x80006ac
  • Size: 2048 bytes
  • Confidentiality & integrity
  • Code

0x8000020 <alt_exception>:
  8000020: addi sp,sp,-76
  8000024: stw ra,0(sp)
  ...
0x80001d0 <task1>:
  80001d0: call 800eff8 <OSFlagPend>
  80001d4: call <alt_timestamp_start>
  80001d8: cmpge r2,r2,zero
  ...
0x80002e8 <task2>:
  80002e8: addi sp,sp,-20
  80002ec: stw ra,16(sp)
  80002f0: stw fp,12(sp)
  ...
0x8000424 <task3>:
  8000424: call 800eff8 <OSFlagPend>
  8000428: movhi r4,2049
  800042c: addi r4,r4,17116
  ...
0x80006ac <task4>:
  80006ac: stb r2,9(fp)
  80006b0: ldbu r2,9(fp)
  80006b4: cmpgeui r2,r2,119
  ...

Secure architecture with SMM
• Security Memory Mapping:
  • Not dedicated to a given security mode (AES-GCM)
  • Fully done in hardware, no OS modification
[Figure: Trusted Area with the processor, instruction cache, data cache, and Hardware Security Core. The address (@) indexes the SMM, whose entries (Segment 1 ... Segment n) each hold base @, size, security level, and code/data. The resulting security level and code/data flags drive the ciphering/hashing core and its control on the path to external memory in the Untrusted Area]
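A software model of the SMM lookup may help make the entry format (base @, size, security level, code/data) concrete. The sketch below uses the three segments from the SMM construction example; the class names, the first-match rule, and the choice to treat unmapped addresses as not protected are assumptions made for illustration, and the real lookup is done in hardware.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    NONE = "not protected"
    CO = "confidentiality only"
    CI = "confidentiality & integrity"


@dataclass(frozen=True)
class Segment:
    base: int       # base address of the segment
    size: int       # size in bytes
    level: Level    # security level applied to the segment
    is_code: bool   # True for code, False for data


# Segments 0 to 2 as listed in the SMM construction example.
SMM = [
    Segment(0x8000020, 1028, Level.CI, True),
    Segment(0x8000424, 680, Level.CO, True),
    Segment(0x80006AC, 2048, Level.CI, True),
]


def lookup(address: int, smm=SMM):
    """Return (segment ID, security level) for an address.

    Assumptions: the first matching segment wins, and an address outside
    every segment is treated as not protected.
    """
    for seg_id, seg in enumerate(smm):
        if seg.base <= address < seg.base + seg.size:
            return seg_id, seg.level
    return None, Level.NONE


print(lookup(0x80001D0))  # address of <task1>, inside segment 0
```

With the sizes as listed, segment 1 (base 0x8000424, 680 bytes) ends at 0x80006cc, 32 bytes past the base of segment 2, so the first-match rule decides which policy applies in that small range.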

Hardware security core with SMM
[Figure: Trusted Area with the processor, instruction cache, data cache, and Hardware Security Core. The Security Memory Map receives the address (@) and outputs a segment ID to the core control; the core also contains the timestamp memory, timestamp generator, AES key (UKey), AES-GCM block with AES input/output, IC tag memory, tag comparison (= ?), and bypass and valid signals. Input and output cache lines are XORed with the keystream on the path to external memory in the Untrusted Area]

3 – Security cost
• Experimental approach
• Applications security policy
• Experimental results
• A trade-off for benefits

Experimental approach
• Architecture overview:
  • Microblaze 7.00
  • High resolution timer
  • Flash bridge
  • DDR SDRAM bridge
  • JTAG
  [Figure: TRUSTED ZONE with the Microblaze processor, instruction cache, and data cache acting as masters on PLB bus arbiter 1, and the Hardware Security Core (slave on arbiter 1, master on PLB bus arbiter 2) carrying address, data, and control to the DDR IP controller (slave) and DDR SRAM memory in the UNTRUSTED ZONE]
• 4 applications running with MicroC/OS-II:
  • Image processing (morphological image processing)
  • Video On Demand (RS, AES, MPEG-2)
  • Communication (RSd, AES, RSc)
  • Multi hash (MD5, SHA-1, SHA-2)

Applications security policy
• Image processing:
  • Only algorithm core code & data protected (CI)
• Video-On-Demand:
  • MPEG decoder code must not be stolen (CO)
  • Image must not be stolen (CO)
  • AES sensitive data must be protected (CI)
• Communication:
  • Processed data must not be stolen (CO)
  • Code must not be attacked (CI)
• Hash:
  • Code must not be stolen (CO)
  • Processed data can be stolen
[Figure: bar charts of % protected vs. % not protected for the data and code of each application. Annotations: "Programmable protection != Uniform protection"; "OS protected"; "Programmable protection ≈ Uniform protection"]

Logic area overhead
• Uniform protection:
  • CI or CO for the whole memory
• Programmable protection:
  • Policy decided by the software designer
• Base Microblaze architecture: ~3335 LUTs [2]

               Uniform protection      Programmable protection
Application    µB + HSC    HSC         µB + HSC    HSC
Image          6820        3485        6962        3627
VOD            6954        3619        6934        3599
Comm.          6805        3470        6845        3510
Hash           5911        2576        5878        2543

Overhead annotations: ~ +104 %, ~ +107 %, ~ +77 %, ~ +76 %
[2] All results target a Spartan-6 device on the SP605 board (XC6SLX45T). The base configuration uses a Microblaze with 2 KB D/I caches and operates at 86 MHz.

Software performance losses
• Software performance losses compared with the non protected approach
• Performance loss is security policy dependent

             No protection    Uniform protection     Programmable protection
             (ms)             (ms)                   (ms)
Image 2k     131.3            156.9    -19.5%        146.9    -11.9%
VOD 2k       11940.3          13751.2  -15.2%        13453.5  -12.7%
Comm 2k      60.2             66.7     -10.8%        65.4     -8.6%
Hash 2k      7.5              8.7      -15.9%        8.6      -14.4%
Average                                -15.35 %               -11.9 %
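The figures in the table can be cross-checked with a short script. It recomputes the relative slowdown from the run times and averages the reported losses; the dictionary layout and function names are only illustrative. For the Hash application the run times are rounded to 0.1 ms, so the recomputed values differ slightly from the reported ones.

```python
# Run times in ms: (no protection, uniform protection, programmable protection)
runtimes_ms = {
    "Image": (131.3, 156.9, 146.9),
    "VOD": (11940.3, 13751.2, 13453.5),
    "Comm": (60.2, 66.7, 65.4),
    "Hash": (7.5, 8.7, 8.6),
}

# Losses in % as reported on the slide: (uniform, programmable)
reported_loss_pct = {
    "Image": (19.5, 11.9),
    "VOD": (15.2, 12.7),
    "Comm": (10.8, 8.6),
    "Hash": (15.9, 14.4),
}


def slowdown_pct(base_ms: float, protected_ms: float) -> float:
    """Extra execution time relative to the unprotected run, in percent."""
    return (protected_ms - base_ms) / base_ms * 100.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


recomputed = {
    app: (slowdown_pct(base, up), slowdown_pct(base, pp))
    for app, (base, up, pp) in runtimes_ms.items()
}
avg_uniform = mean(up for up, _ in reported_loss_pct.values())
avg_programmable = mean(pp for _, pp in reported_loss_pct.values())
gain_points = avg_uniform - avg_programmable
```

Averaging the reported losses reproduces the 15.35 % and 11.9 % shown in the last row, a difference of about 3.45 percentage points in favor of programmable protection.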

Security memory footprint
• Memory overhead depends entirely on the designer's choice of security policy
[Figure: stacked bar chart of security memory footprint in kbytes for Image, VOD, Comm, and Hash, each under uniform protection (UP) and programmable protection (PP); series: IC tag data, TS data, IC tag code; the largest bar is labeled 199.6 kbytes]

A trade-off between security & resources & performance
• Benefit of our complete approach:
  • Increase software performance versus uniform (~ +3%)
  • Reduce the memory security footprint (~ -50%)
  • Increase security flexibility for the designer
  • Increase logic size (~ +3%)
Values depend on the security policy & the designer's wishes

4 – End to end solution
• Context of secure boot
• System boot up case study
• Experimental results

Lightweight boot approach
• Infrequent task
• Challenges of secure boot:
  • Secure FPGA configuration
  • Secure code loading into RAM from Flash
• Again, cost-conscious:
  • Low logic boot scheme
  • Low power of boot logic during execution
• Issues to tackle:
  • Efficient & secure data loading from Flash into RAM memory
  • Initialization of ciphered data in RAM, and of IC tags & TS in on-chip memory
[Figure: FPGA containing the SW processor core and the security core, with encrypted code & data in RAM memory, encrypted applications in Flash memory, and download over Ethernet]

Boot execution scheme
• Boot done in 2 steps:
  • FPGA secure configuration
  • Application secure loading from Flash to RAM memory
[Figure: Flash memory holds the Microblaze bitstream (processor core, CHC, SMM, Hardware Security Core, protected with the FPGA supplier data protection) and the application configuration (SMM, CHC, init. vector, timestamp, policy) with the application code protected by AES with LoadGCM. After secure FPGA configuration, the application code is loaded into RAM memory under AES with ExecGCM, and the IC tag memory is filled]

AES-GCM Results
• Application secure boot time:
  [Figure: trend of boot time depending on security policy, in %; series: AES-GCM UP, AES-GCM PP, NP; plotted values 106, 104, 100]
• Very low memory overhead:
  • 32-bit counter value
  • 96-bit initialization vector
  • 128-bit IC tag
• Non protected boot time: less than 5 ms
• Extra boot time due to AES-GCM: ~500 µs
• Boot time is a small part of the system lifetime

Conclusion & perspectives
• Major contributions
• Perspectives

Conclusion
• A cost-conscious approach fitting with embedded systems resources:
  • Low cost security
• A full evaluation of the security cost:
  • Area (~75/110%), memory (~20/30%)
  • Cost flexibility (~3%)
• End to end solution:
  • Secure boot up
• Multi-application support & architecture multi-configuration:
  • Configurable SMM at boot up
  • Boot loader

Perspectives
• CAD tool:
  • Security policy & resources exploration
• Extended threat model:
  • Evaluation of the cost for DPA, fault injection, …
  • Behavior guessing protection
• Explore the FPGA reconfigurable capabilities:
  • Weakness of reconfiguration? (a new path for potential threats)
  • Strength of reconfiguration? (dynamic behavior)
• Emerging technology, a new challenge:
  • Multi-processor architecture
  • Multi-OS architecture, OS virtualization