ISENET: Integrated Security for the Expanded Internet Gookwon Edward Suh Massachusetts Institute of Technology
The Next Internet
[Diagram: cell phone, home, car, and sensor networks connected through the Internet]
• The Internet is expanding into the physical world
• Billions of devices will be "on the Internet"
• Devices on this Expanded Net will sense properties of the real world, process and store data, and even take physical action
New Applications
• Cell phones controlling physical objects through the Internet
  • Connect to a home appliance network
• Traffic control
  • Congestion/usage-based real-time tolls
  • Traffic light control based on real-time information
• Sensor networks for environmental monitoring
  • Energy usage in buildings
  • Disaster response
• More responsibilities mean more risk
• Security is a key enabling factor for these new applications
Attacks on the Expanded Net
• Software attacks (traditional)
  • Break in by exploiting one application's security holes
  • Infect other applications
• Devices are disseminated, unsupervised, and physically exposed
• Physical attacks
  • Install malicious software
  • Invasive probing
  • Non-invasive measurement
  • Environmental attacks
Traditional Approaches Don't Work
[Diagram: a system with CPU, flash, and TPM chips]
• Cryptographic primitives assume keys remain secret
• Security protocols assume channels are protected against eavesdroppers
• The firewall approach does not work for disseminated devices
  • Anyone in range can communicate with a wireless device
  • There is no natural place to put a firewall
What is ISENET?
[Diagram: attacks such as spoofing a network server or discovering a secret key can strike any layer of a node's stack, from distributed applications and services down to system software and hardware]
• Fundamental research on the security of the Expanded Net
  • Many ideas will apply to more conventional nets as well
• We need an integrated effort to secure the Expanded Net
  • Security efforts isolated to a particular layer may easily be subverted by attacks carried out at a different layer
ISENET Research Areas
[Diagram: the layered stack of distributed applications, distributed services, system software, and hardware, cross-cut by two vertical areas: theory, and privacy and policy]
ISENET Research Goals
[Diagram: secure distributed applications; isolate untrusted distributed software; provable security; anonymity; policy-aware systems; tamper-resistant hardware]
Key Ideas and Techniques in ISENET
Attacks addressed: malicious users, malicious software agents, compromised nodes/data, measurement attacks, probing attacks
Techniques:
• Runtime system and query language, with a correlation engine to secure distributed applications against environmental attacks
• Physical security theory: providing security with partially leaky hardware
• Policy-aware systems where users negotiate appropriate anonymity agreements
• Distributed compartments (AsbestOS)
• Aegis: a single-chip secure processor (PUFs)
ISENET Research Thrust Teams
• Applications: Madden, Kohler, Balakrishnan, Kaashoek, Weitzner
• Tamper-resistant hardware: Devadas, Micali, Anderson, van Dijk
• Privacy and policy: Weitzner, Abelson, Anderson, Lysyanskaya
• System software: Kaashoek, Morris, Liskov, Lynch, Kohler, Mazières
• Theory and cryptography: Rivest, Goldwasser, Micali, Devadas, Lysyanskaya, Lynch, van Dijk
• Networking, distributed computing: Morris, Balakrishnan, Abelson, Katabi, Liskov, Madden
Center for Information Security and Privacy (CISP)
• http://cisp.csail.mit.edu/
• 22 faculty associated with the Center and over 50 graduate students
• Research projects in cryptography, secure distributed systems, Internet security, and software security
• Current partners: ITRI, Nokia, Quanta, RSA, Sun
• Likely additional partners (9/2005): Cisco, BT, Philips
ISENET Secure Computing Platform
• How can we "trust" remote computation?
• We need a secure platform (the Aegis processor) that can:
  • Authenticate itself (the device)
  • Authenticate the software
  • Guarantee the integrity and privacy of execution
Existing Approaches
• Tamper-proof package: IBM 4758
  • Sensors to detect attacks
  • Expensive
  • Continually battery-powered
• Trusted Platform Module (TPM)
  • A separate chip (TPM) for security functions
  • Decrypted "secondary" keys can be read off the bus
Our Approach
• Build a secure platform with a "single-chip" processor (AEGIS) as the only trusted hardware component
  • The security kernel (the trusted part of an OS) runs in a protected environment
• A single chip is easier and cheaper to protect
• The processor authenticates itself, identifies the security kernel, and protects program state in off-chip memory
Authenticating the Processor
• Each processor should be unique (it holds a secret key) so that a user can authenticate it
  • Sign/MAC a message with the key → a user can authenticate the processor
  • Encrypt for the processor's key → only the processor can decrypt
• Key challenge: embedding secrets in the processor in a way that is resistant to physical attacks
Problem
[Diagram: probing the EEPROM/ROM attached to the processor]
Storing digital information in a device in a way that is resistant to physical attacks is difficult and expensive.
• Adversaries can physically extract secret keys from EEPROM while the processor is off
• A trusted party must embed and test secret keys in a secure location
• EEPROM adds additional complexity to manufacturing
Our Solution: Physical Random Functions (PUFs)
• Generate keys from a complex physical system that is hard to fully characterize or predict
  • A c-bit challenge configures the physical system; measuring it yields an n-bit response
  • Use the response as a secret; many secrets can be generated by changing the challenge
• Security advantages
  • Keys are generated on demand → no non-volatile secrets
  • No need to program the secret
  • Can generate multiple master keys
• What is hard to predict, but easy to measure?
Silicon PUF – Concept
• Because of random process variations, no two integrated circuits are identical, even with the same layout
  • Variation is inherent in the fabrication process
  • Hard to remove or predict
  • Relative variation increases as the fabrication process advances
• Experiments in which identical circuits with identical layouts were placed on different ICs show that path delays vary enough across ICs to use them for identification
• A combinational circuit maps a c-bit challenge to an n-bit response
A (Simple) Silicon PUF [VLSI'04]
[Diagram: a chain of c challenge-controlled switch stages races a rising edge along two paths into an arbiter (a D latch); the output is 1 if the top path is faster, else 0]
• Each challenge creates two paths through the circuit that are excited simultaneously
• The digital response of 0 or 1 is based on a comparison of the path delays by the arbiter
• We can obtain n-bit responses from this circuit by either duplicating the circuit n times or using n different challenges
• Uses only standard digital logic → no special fabrication
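The race described above can be sketched as a toy software model. The Gaussian per-stage delays, seeds, and the `make_puf`/`evaluate` helpers below are illustrative assumptions standing in for real silicon variation, not part of the actual AEGIS circuit:

```python
import random

def make_puf(n_stages, seed):
    # Each stage gets two random delay increments (top wire, bottom wire),
    # a stand-in for manufacturing variation on one particular chip.
    rng = random.Random(seed)
    return [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(n_stages)]

def evaluate(puf, challenge):
    # challenge: one 0/1 bit per stage. Each bit sets the stage's switch:
    # pass the two racing signals straight through, or cross them over.
    top = bottom = 0.0
    for (d_top, d_bot), c in zip(puf, challenge):
        if c == 1:                      # crossed switch swaps the signals
            top, bottom = bottom, top
        top += d_top
        bottom += d_bot
    return 1 if top < bottom else 0     # arbiter: 1 if top path wins the race

# Two "chips" with identical layout but different process variation.
chip_a = make_puf(64, seed=1)
chip_b = make_puf(64, seed=2)

rng = random.Random(0)
challenges = [[rng.randint(0, 1) for _ in range(64)] for _ in range(100)]
resp_a = [evaluate(chip_a, ch) for ch in challenges]
resp_b = [evaluate(chip_b, ch) for ch in challenges]
distance = sum(a != b for a, b in zip(resp_a, resp_b))
print(f"inter-chip distance over 100 response bits: {distance}")
```

In this noiseless model the same chip always gives the same response, while two chips disagree on a large fraction of bits, mirroring the inter-chip variation results reported later in the deck.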
PUF Experiments
• Fabricated 200 "identical" chips with PUFs in TSMC 0.18µm on 5 different wafer runs
• Security: what is the probability that a challenge produces different responses on two different PUFs?
• Reliability: what is the probability that a PUF output for a challenge changes with temperature? With voltage variation?
Inter-Chip Variation
• Apply random challenges and observe 100 response bits
• Measurement noise for chip X = 0.9 bits
• Distance between chip X and chip Y responses = 24.8 bits
→ We can identify individual ICs
Environmental Variations
• What happens if we change voltage and temperature?
• Measurement noise at 125°C (baseline at 20°C) = 3.5 bits
• Measurement noise with 10% voltage variation = 4 bits
→ Even with environmental variation, we can still distinguish two different PUFs
Reliable PUFs
PUFs can be made more secure and reliable by adding extra control logic.
[Diagram: the c-bit challenge feeds the PUF; its n-bit output goes through BCH encoding to produce an (n−k)-bit syndrome during calibration, through BCH decoding with the stored syndrome during re-generation, and through a one-way hash to produce the k-bit response]
• A one-way hash function (SHA-1, MD5) precludes PUF "model-building" attacks, since to obtain the raw PUF output an adversary would have to invert the hash
• An error-correcting code (ECC) can eliminate the measurement noise without compromising security
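A minimal sketch of the "correct, then hash" idea, with a toy 3x repetition code and majority vote standing in for the BCH code on the slide (the bit patterns and helper names are made up for illustration):

```python
import hashlib

def majority_decode(bits, rep=3):
    # Toy error correction: each logical bit is read out `rep` times and
    # decoded by majority vote (a stand-in for the real BCH decoder).
    out = []
    for i in range(0, len(bits), rep):
        group = bits[i:i + rep]
        out.append(1 if sum(group) > len(group) // 2 else 0)
    return out

def stable_key(raw_bits):
    corrected = majority_decode(raw_bits)
    # The one-way hash hides the corrected PUF output, blocking
    # model-building attacks on the raw responses.
    return hashlib.sha1(bytes(corrected)).hexdigest()

reference = [1, 1, 1,  0, 0, 0,  1, 1, 1,  0, 0, 0]   # calibration readout
noisy     = [1, 0, 1,  0, 0, 0,  1, 1, 1,  0, 1, 0]   # later readout, 2 bit flips

# Despite measurement noise, the same chip regenerates the same key.
assert stable_key(reference) == stable_key(noisy)
print(stable_key(noisy))
```

A repetition code tolerates only one flip per group; the BCH code in the actual design corrects many more errors per block, which is why it is used instead.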
Authentication
• The processor identifies the security kernel by computing the kernel's hash (on the l.enter.aegis instruction)
  • Similar to ideas in TCG TPM and Microsoft NGSCB
• The security kernel identifies application programs
• Both H(SKernel) and H(App) are included in a signature by the processor
  • The processor contributes H(SKernel); the security kernel contributes H(App)
• A server can thus authenticate the processor, the security kernel, and the application (e.g. DistComp)
Microsoft Xbox
• The Xbox is a trusted PC platform
• All executables are digitally signed and verified before execution
• A trusted (encrypted) bootloader/OS prevents:
  • Using copied or imported games
  • Online game cheating
  • Using the Xbox as a stock PC
Inside Xbox: How Was It Broken?
• The Southbridge ASIC holds the secure boot code and a 128-bit secret (RC4) key
• The flash ROM holds the encrypted bootloader/OS
• Broken by tapping the bus between them
  • Read out the 128-bit key
  • Cost about $50
• Observation: an adversary can easily read anything on the off-chip bus
(From Andrew "Bunnie" Huang's webpage)
Off-Chip Memory Protection
[Diagram: the processor encrypts/decrypts values on writes/reads and integrity-verifies values returning from external memory]
• Privacy protection [MICRO36]
  • Encrypt private information in off-chip memory
• Integrity verification [HPCA'03, MICRO36, IEEE S&P '05]
  • Check that a value from external memory is the most recent value stored at that address by the processor
Previous Work (1): Message Authentication Codes (MACs)
• Hash functions
  • Functions such as MD5 or SHA-1
  • Given an output, it is hard to find an input that produces it
  • It is also hard to find two inputs with the same output
• A Message Authentication Code (MAC) can be seen as a keyed hash function
  • Only an entity with the correct key can generate a valid MAC
  • Only an entity with the correct key can verify a MAC
Previous Work (1): MAC-Based Integrity Verification
• Store MAC(address, value) on writes, and check the MAC on reads (used by David Lie's XOM architecture)
• This does NOT work: replay attacks
  • An adversary can return a stale but validly-MACed pair, e.g. replay (120, MAC(0x45, 120)) after (124, MAC(0x45, 124)) was written, and the check still passes
• We need to securely remember the off-chip memory state
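The replay weakness is easy to demonstrate in a few lines. Here HMAC-SHA1 stands in for the MAC, and the in-memory `ram` dict and 4-byte address/value encoding are illustrative assumptions:

```python
import hmac
import hashlib

KEY = b"on-chip secret"          # known only to the processor

def mac(address, value):
    msg = address.to_bytes(4, "big") + value.to_bytes(4, "big")
    return hmac.new(KEY, msg, hashlib.sha1).digest()

ram = {}                          # untrusted RAM: the adversary controls this

def write(address, value):
    ram[address] = (value, mac(address, value))

def read(address):
    value, tag = ram[address]
    if not hmac.compare_digest(tag, mac(address, value)):
        raise ValueError("integrity violation")
    return value

write(0x45, 120)
stale = ram[0x45]                 # adversary records (120, valid MAC)
write(0x45, 124)
ram[0x45] = stale                 # replay the stale pair
print(read(0x45))                 # prints 120: accepted, though 124 was the last write
```

The MAC binds value to address, so a forged value is caught, but nothing binds the pair to *time*, which is exactly the gap the hash tree on the next slide closes.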
Previous Work (2): Hash Trees (Merkle Trees)
• Construct a hash tree over untrusted memory: leaves are data values (V1..V4), internal nodes hash their children (h1 = h(V1.V2), h2 = h(V3.V4), root = h(h1.h2)), and only the root is kept on-chip
• On a cache miss, the processor reads the data block and verifies the hashes along its path up to the root
• On-chip storage: 16–20 bytes
• Read/write cost: logarithmic → about a 10x slowdown
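The tree construction and path verification above can be sketched directly; this toy version over a power-of-two number of leaves keeps whole hash levels in a list (assumed structure for illustration) where real hardware would fetch sibling hashes from untrusted memory:

```python
import hashlib

def h(data):
    return hashlib.sha1(data).digest()

def build_tree(leaves):
    # Returns the list of levels: hashed leaves first, 20-byte root last.
    levels = [[h(v) for v in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def verify(leaves, index, levels):
    # Recompute the path from leaf `index` up to the trusted root.
    node = h(leaves[index])
    for level in levels[:-1]:
        sibling = level[index ^ 1]
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == levels[-1][0]      # only the root must be stored on-chip

values = [b"V1", b"V2", b"V3", b"V4"]
tree = build_tree(values)
assert verify(values, 2, tree)        # untampered value passes
values[2] = b"tampered"
assert not verify(values, 2, tree)    # replayed/altered value is caught
```

Because every read reverifies a path of log(N) hashes against the on-chip root, a replayed stale value no longer matches, at the cost of the logarithmic overhead the slide mentions.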
Cached Hash Trees [HPCA'03]
• Cache hashes on-chip; the on-chip cache is trusted
• Verification can stop as soon as it reaches a hash that is already in the cache, with no need to go all the way to the root
Hardware Implementation Issue [ISCA'05]
• What happens if a new block evicts a dirty one?
• If we cache all hashes, each access can cause another read and a write-back (to update the parent hash), which would require a large buffer for integrity verification
• Instead, cache hashes only if buffer space is available
Hiding Verification Latency
[Diagram: a load/add/store sequence (load r0,(r1); addi r0,r0,#3; store (r1),r0) in which the loaded data is used while its integrity verification is still in flight]
• Integrity verification takes at least one hash computation
  • SHA-1 has 80 rounds → 80 cycles to compute
• Should we wait for integrity verification to complete?
• There is no need for precise integrity-violation exceptions; violations should never happen, and if they do, we simply abort
• Memory verification can therefore run in the background, adding latency only for instructions that can compromise security
Simulated Performance
• Assume the entire program and data are protected
• Integrity verification alone: most cases show less than 25% degradation; the worst case is 50%
• If we add encryption, the overhead is 27% on average and 56% in the worst case
[Graph: normalized performance (IPC); L2 caches with 64B blocks]
Implementation
• Fully functional system on an FPGA board [ICS'03][ISCA'05]
  • AEGIS (Virtex2 FPGA), memory (256MB SDRAM), I/O (RS-232)
• Based on the openRISC 1200 (a simple 4-stage pipelined RISC)
• AEGIS instructions are implemented as special traps
Area Estimate
• Synopsys DC with a TSMC 0.18µm library
• New instructions and the PUF add 30K gates and 2KB of memory (1.12x larger)
• Off-chip protection adds 200K gates and 20KB of memory (1.8x larger in total)
• The area can be further optimized
[Floorplan: core, I-cache (32KB), D-cache (32KB), IV unit (5 SHA-1), encryption unit (3 AES), PUF, code ROM (11KB), scratch pad (2KB), and I/O (UART, SDRAM controller, debug unit), with per-block areas in mm²]
EEMBC/SPEC Performance
• The performance overhead comes from the off-chip protections
• 5 EEMBC kernels and 1 SPEC benchmark
• The EEMBC kernels show negligible slowdown
  • Low cache miss rate
  • Only ran 1 iteration
• SPEC twolf also shows a reasonable slowdown
Security Review
• Have we built a secure computing system?
  • Off-chip SDRAM: secure, protected by integrity verification and encryption
  • Kernel in flash: secure, verified using hashes
• With PUFs, attacks must be carried out while the processor is running
  • Probing on-chip memory
  • Side channels (address bus, power, electromagnetic fields)
  • Can be further secured using techniques from smartcards
Related Projects
• XOM (eXecution Only Memory)
  • Stated goal: protect the integrity and privacy of code and data
  • The operating system is completely untrusted
  • Memory integrity checking does not prevent replay attacks
  • Privacy is enforced for all code and data
• TCG TPM / Intel LaGrande / Microsoft NGSCB / ARM TrustZone
  • Protect against software attacks
  • Off-chip memory integrity and privacy are assumed
• AEGIS provides higher security with a smaller Trusted Computing Base (TCB)
Summary
• Physical attacks are becoming more prevalent
  • Untrusted owners, physically exposed devices
  • Trusting remote computation requires a secure hardware platform
• The trusted computing base should be small to be secure and cheap
  • Hardware: a single-chip secure processor
  • Physical random functions (PUFs)
  • Memory protection mechanisms
• AMD and ARM are currently evaluating PUF technology with a view toward building PUFs into CPUs
Performance Slowdown
• The performance overhead comes from the off-chip protections
• Synthetic benchmark
  • Reads a 4MB array with a varying stride
  • Measures the slowdown for a varying cache miss rate
• The slowdown is reasonable for realistic miss rates
  • Less than 20% for integrity
  • 5–10% additional for encryption
Protecting Program States
• On-chip registers and caches
  • The security kernel handles context switches
  • Virtual memory permission checks in the MMU
• Off-chip memory
  • Memory encryption [MICRO36]: counter-mode encryption
  • Integrity verification [HPCA'03, MICRO36, IEEE S&P '05]: hash trees and log-hash schemes
• Swapped pages in secondary storage
  • The security kernel encrypts them and verifies their integrity
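The counter-mode idea above can be sketched without a full AES implementation. This is a minimal sketch only: HMAC-SHA1 stands in for the AES block cipher that AEGIS actually uses, and the address/counter encoding is an illustrative assumption:

```python
import hmac
import hashlib

KEY = b"on-chip encryption key"   # stand-in; the real design uses an AES key

def keystream(address, counter, length):
    # Pad = PRF(key, address || counter || block index). Because the pad
    # depends only on (address, counter), it can be precomputed while the
    # data is still in flight from memory.
    seed = address.to_bytes(8, "big") + counter.to_bytes(8, "big")
    out = b""
    block = 0
    while len(out) < length:
        out += hmac.new(KEY, seed + block.to_bytes(4, "big"),
                        hashlib.sha1).digest()
        block += 1
    return out[:length]

def encrypt(address, counter, plaintext):
    pad = keystream(address, counter, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, pad))

decrypt = encrypt                  # counter mode: XOR with the same pad

line = b"secret cache line"
ct = encrypt(0x1000, 7, line)
assert ct != line
assert decrypt(0x1000, 7, ct) == line
# A fresh counter per write makes identical plaintexts encrypt differently.
assert encrypt(0x1000, 8, line) != ct
```

The counters must themselves be protected (e.g. covered by the integrity tree), since reusing a counter for the same address would leak the XOR of the two plaintexts.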
A Simple Protection Model
• How should we apply the authentication and protection mechanisms?
• What to protect? All instructions and data, for both integrity and privacy
• What to trust? The entire program code: any part of the code can read/write protected data
[Diagram: the whole memory space, program code (instructions), initialized data (.rodata, .bss), and uninitialized data (stack, heap), is encrypted and integrity-verified; the program identity is the hash of the code]
What Is Wrong?
• Large trusted code base
  • Difficult to verify as bug-free
  • How can we trust shared libraries?
• Applications and functions have varying security requirements
  • Do all code and data need privacy?
  • Do I/O functions need to be protected?
  • Blanket protection causes unnecessary performance and power overheads
• The architecture should provide flexibility so that software can choose the minimum required trust and protection
Distributed Computation Example

DistComp() {
  x = Receive();
  result = Func(x);
  sig = sys_pksign(x, result);
  Send(result, sig);
}
Func(...) { ... }   Receive() { ... }   Send(...) { ... }

• Func(x) is a private computation: needs both privacy and integrity
• Calling the signing system call into the security kernel: only needs integrity
• Receiving the input and sending the result (I/O): no need for protection, and no need to be trusted
AEGIS Memory Protection
• The architecture provides five different memory regions; applications choose how to use them
  • Static (read-only): integrity verified, or integrity verified & encrypted
  • Dynamic (read-write): integrity verified, or integrity verified & encrypted
  • Unprotected
• Only code in the verified regions is authenticated
• Example layout for DistComp: DistComp() and Func() code in the static verified region; DistComp() data in the dynamic verified region; Func() data in the dynamic encrypted region; Receive() and Send() code and data unprotected