This proposed dissertation aims to develop a novel method for detecting and mitigating hardware malicious inclusions in microprocessors using 3D fabrication techniques and execution monitor theory. The research will focus on detecting and monitoring control and data flows in untrusted processors to enhance the security of high-assurance computing systems.
Applying Security Techniques to 3D Fabrication Methods for General Purpose Processors CDR Mike Bilzor
Simple Overview • Execution Monitor Theory + 3D Fabrication Techniques = 3D Processor Execution Monitor • Develop a "recipe" for making this, from a processor's specification
Proposed Dissertation Abstract Hardware malicious inclusions, or hardware Trojans, in microprocessors present an increasing threat to U.S. high-assurance computing systems, particularly those of the Department of Defense, due to vulnerabilities at several stages in the acquisition chain. Existing testing techniques are limited in their ability to detect these maliciously modified integrated circuits. We propose a novel method, based on the evolution of three-dimensional integrated circuit fabrication techniques and on execution monitor theory, by which malicious inclusions, including those not detectable by existing means, may be detected and potentially mitigated in the lab and in fielded, real time operation. The proposed work will develop and implement techniques for detecting and mitigating hardware malicious inclusions by utilizing 3D connections to monitor the control and data flows in an untrusted, target commodity processor from a trusted attached processor called the "control plane".
Outline • Threat to General Purpose Processors • 3D Fabrication Techniques • Processor Execution • Malicious Inclusions and Existing Detection Methods • Example of a 3D Execution Monitor • Planned Experiments and Tools • Related Work and Limitations • Proposed Contributions
Microprocessor Threat • High assurance customers in DoD need reliable processors • Classified systems, weapons, aircraft • The most advanced processors are still designed in the U.S., but almost none are manufactured here • China, Taiwan, Korea, Philippines • U.S. companies becoming "fabless" – design only • DoD's Trusted Foundry program cannot support all high-assurance needs
Microprocessor Threat • "The Hunt for the Kill Switch" [Adee08] • In 2007, Israel bombs a suspected Syrian nuclear facility, but Syrian radars are not functioning – were they disabled using a "kill switch"? A New York Times article cites a source claiming knowledge of the operation • In 2008, an anonymous defense contractor reports that a European manufacturer has designed a chip that can be remotely disabled, and claims that French contractors have used the chip in military equipment • "The Hacker in Your Hardware" [Villa10] • Scientific American, August 2010
Microprocessor Threat • High-assurance supply chain contains counterfeit processors [King10, Grow08] • Over 400 fake Cisco routers seized in 2007-2008, many sold to DoD, some for classified systems • January 2007 – counterfeit chip discovered in an F-15 flight computer during maintenance at Warner-Robins AFB, GA • DoD discovered 9,356 fake electronic parts in 2008 alone • Estimate: as many as 15% of all replacement processors purchased by DoD may be counterfeit
Microprocessor Threat • Where processors can be subverted • Design and Fabrication • Changes to HDL design • Insider threat • Use/re-use of compromised modular, publicly available HDL design components • Changes to low-level, optimized layout (netlist) • Shipping, distribution, component assembly • Replacing bona fide parts with counterfeit parts • Advanced techniques like FIB milling
Processor Development • Architectural Design Specification: Instruction Set, Registers, Cache, Interrupts, Privilege Levels • High Level Processor Design: Libraries, Packages and Entities, Logic Design, VHDL, Verilog • Low Level Processor Design: Optimization, Place and Route, Netlist and Schematics • Fabrication: Wafer Mask Generation, Wafer Production and Test, Processor Finishing • Assembly and Distribution: Processor Batch Testing, Printed Circuit Board Packaging and Test, Shipping • Installation and Operation: System Integration, Operational Test
Processor Development [Figure: the processor development flow above, with each stage marked Trusted, Either / Mix, or Untrusted] Source: DARPA TRUST in Integrated Circuits Program, Industry Day Brief, 26 March 2007
Processor Development [Figure: the processor development flow, with the Targeted Threats marked and a parallel 3D execution monitor path added; path labels: 3D Execution Monitor Source Data, Parallel Design, 3D Execution Monitor Implementation, 3D Fabrication, Assembly]
3D Fabrication • Manufacturers (Intel, AMD) are motivated to develop 3D techniques for performance • 2D feature size is near its physical performance limit • Heat, timing, and leakage issues dominate • Early uses • 3D Cache memory with lower latency due to proximity • 3D Stacked coprocessor-type modules • Special uses like image sensors • Future implementations • Many-core CPU stacks • Performance monitoring, verification, and security?
Image: Synopsis 3D Fabrication • 3D methods • MCMs – "Multi-Chip Modules" • TSVs – "Through Silicon Vias" • Wire-bonded connections • Micro-RF relays • If you could connect to any circuit in the adjacent plane directly, how might that be useful?
Image: [Dav05] 3D Fabrication
Possible Layouts • Optional Control Plane – Custom, Trusted IC • “3D” Interconnect • Connection from Printed Circuit Board to Integrated Circuit • Computation Plane – Commodity, Untrusted IC • OR • Computation Plane – Commodity, Untrusted IC • “3D” Interconnect • Optional Control Plane – Custom, Trusted IC • Connection from Printed Circuit Board to Integrated Circuit
Processor Elements • How can we categorize what operations are performed in a general purpose processor? - Instruction Execution - Movement of Data - Storage and Retrieval - Arithmetic and Logic [Figure: bus-based processor datapath diagram]
Processor Elements • The basic elements that characterize what happens in a general processor: • Execution Flow (following instructions) • Data Paths • Storage and Retrieval • Arithmetic and Logic Computation • Conjecture: Everything that happens in a general purpose processor falls into one (or more) of these categories
Processor Operation • How do we trust the basic types of operations in a general-purpose processor? • Execution Flow • Data Paths • Storage and Retrieval • Arithmetic and Logic Computation • Must also add "Functional Protection" – ensuring continued processor availability • Disabling direct vulnerabilities (zero-cycle attacks)
Processor Operation • Execution Flow • Turing Machine analogy: • Read an input symbol, write to the tape, move left or right • Need to perform these functions in the specified order • Data Paths • Turing Machine analogy: • After reading an input symbol (the data), its value must not change before a transition is identified and executed, or correctness is violated • Storage and Retrieval • Turing Machine analogy: • Read and write – need to ensure that when we write something to the tape, it's the same value when we go back to read it later
Processor Operation • Math and Logic Computation • Turing Machine analogy: • If you change the encoding of the TM while it's running, or modify the input during execution, correctness is violated • Functional Protection (Keep-Alive) • Turing Machine analogy: • If you turn the machine off, it no longer works correctly (cannot fulfill its obligations)
Correct Processor Operation • Processor operation is correct only if: • Execution flow is correct, i.e. it proceeds according to the architectural design specification with respect to the instruction opcode, and the instruction opcode itself is correct • Data path integrity is preserved • Arithmetic and logic computations in the processor all function correctly, according to their specifications • Storage and retrieval in the processor always function correctly (to include storage and retrieval of instruction opcodes, either from inputs or from local memory) • There is no direct impairment: no uncommanded halts, resets, shutdowns, or other functional attacks
Malicious Inclusions • "Malicious Inclusion" = "Hardware Trojan" • Physical change to a processor that causes a deviation from its specified functionality • Details of actual attacks may be classified • Several academic investigations have demonstrated malicious inclusions in the last few years • Subverted hardware cannot be corrected using software
Malicious Inclusion Taxonomy (slightly modified from [Tehr10]) • Characteristics: Distribution, Structure, Size, Type • Activation: External Trigger, Internal Trigger, Always Active, After Trigger, Conditionally Active • Action*: Leak Information, Modify Data, Modify Functionality, Disable Functionality • *May include more than one type of malicious action
Malicious Inclusion Taxonomy • Characteristics: Distribution, Structure, Size, Type • Activation: External Trigger, Internal Trigger, Always Active, After Trigger, Conditionally Active • Action: Leak Information, Modify Data, Modify Functionality, Disable Functionality • 3D Detection or Mitigation Technique: Datapath Integrity, Math/Logic Verification, Load/Store Verification, Execution Monitor, Keep-Alive Protections [Figure: the final build of this slide marks the Research Focus]
Malicious Inclusion: HDL Example
-- Stage 1: arm the trigger when the first instruction word is seen
IF (r.d.inst(conv_integer(r.d.set)) = X"80082000") THEN
  hackStateM1 <= '1';
END IF;
-- Stage 2: once armed, the second instruction word sets the privilege bit
IF (hackStateM1 = '1' and r.d.inst(conv_integer(r.d.set)) = X"80102000") THEN
  r.w.s.s <= '1';
END IF;
Trigger: Instruction codes 80082000 and 80102000 translate to: AND R0, #0 and OR R0, #0
[Diagram labels: Instruction Register, Control Register, Privilege Bit]
[VHDL code for Leon3 processor. Example from King, Hicks]
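The two-stage trigger in the HDL example can be mimicked in software to see which instruction sequences escalate privilege. The following is only a rough sketch: the class and function names are hypothetical, and it assumes the two IF statements sit in a clocked process, so the second test sees the value of hackStateM1 from before the current cycle. It also assumes the flag is never cleared, since the excerpt shows no reset.

```python
# Hypothetical software model of the two-stage trigger; not the Leon3 source.
AND_R0_0 = 0x80082000  # first trigger word (AND R0, #0)
OR_R0_0 = 0x80102000   # second trigger word (OR R0, #0)


class TrojanTrigger:
    def __init__(self):
        self.armed = False       # models hackStateM1
        self.supervisor = False  # models the privilege bit r.w.s.s

    def clock(self, inst):
        """Advance one cycle with the instruction word seen by the decoder."""
        # Assumption: signal updates take effect after the process runs, so
        # stage 2 is evaluated against the pre-cycle value of `armed`.
        was_armed = self.armed
        if inst == AND_R0_0:
            self.armed = True
        if was_armed and inst == OR_R0_0:
            self.supervisor = True
        return self.supervisor


def run(instructions):
    """Feed a sequence of instruction words; return the final privilege bit."""
    trigger = TrojanTrigger()
    for inst in instructions:
        trigger.clock(inst)
    return trigger.supervisor
```

Under these assumptions, `run([AND_R0_0, OR_R0_0])` escalates while `run([OR_R0_0, AND_R0_0])` does not, which illustrates why random functional testing is unlikely to hit the trigger by chance.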
Execution Flow 3D Execution Monitor Example
Execution Flow • With respect to the processor only, execution flow is correct only if: • The opcode is correct: not modified since being input to the chip, or retrieved from local memory (cache) • The control flow precisely follows the design specification for that instruction opcode • In the following example, we assume the instruction opcode is correct, and look to verify that the execution follows the design specification • Consider an example bus-based MIPS architecture...
A MIPS Bus-Based Architecture (Source: MIT Open Courseware 6.823) [Figure: bus-based datapath; labeled elements: Opcode, IR, registers selected by rs, rt, rd, with 31 (PC) and 30 (Link), ALU input registers A and B, ALU, MA, On-Chip Memory, Immediate Extended, Bus; control signals: load IR, loadA, loadB, load MA, ALU Op., Reg. Sel., Ex. Sel., Mem. Wrt., Reg. Wrt., En. Mem., En. Reg., En. Imm., En. ALU; status signals: zero?, busy?]
Register-Register Add [Figure: the bus-based MIPS datapath, shown in three successive builds of a register-register add, with no data values annotated]
Malicious Inclusion Upon observing an ALU operation with special operands as triggers, this malicious inclusion commands the ALU result to be written from the bus to a specified "secret" address in local memory (in addition to being written to the destination register). As a result, the ALU result is "leaked". [Figure: Equal? comparators test A and B against the trigger values FF0A and 13D0; on a match, the inclusion drives Mem Write and load MA, with 99FF as the memory address]
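A small behavioral model makes the leak concrete. This is a sketch under stated assumptions: a 16-bit datapath (the slide values are four hex digits), FF0A as the A-side trigger and 13D0 as the B-side trigger, and hypothetical function and variable names. It is not taken from the presentation.

```python
MASK16 = 0xFFFF                       # assumed 16-bit datapath
TRIGGER_A, TRIGGER_B = 0xFF0A, 0x13D0  # trigger operands from the slide
SECRET_ADDR = 0x99FF                   # "secret" address from the slide


def alu_add(a, b, rd, registers, memory, trojan=True):
    """Register-register add; with the inclusion present, matching operands
    also cause the result to be written to SECRET_ADDR."""
    result = (a + b) & MASK16
    registers[rd] = result             # the specified behavior
    if trojan and a == TRIGGER_A and b == TRIGGER_B:
        memory[SECRET_ADDR] = result   # the unspecified extra memory write
    return result


registers, memory = {}, {}
leaked = alu_add(0xFF0A, 0x13D0, rd=2, registers=registers, memory=memory)
```

With the trigger operands, the 16-bit sum is 12DA, which lands both in the destination register and at address 99FF; any other operand pair leaves memory untouched.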
Register-Register Add [Figure: datapath with the malicious inclusion present, build 1 of 3; values shown: FF0A]
Register-Register Add [Figure: build 2 of 3; values shown: FF0A, 13D0]
Register-Register Add [Figure: build 3 of 3; values shown: FF0A, 13D0, 12DA, 99FF]
But What if We Can Monitor the Bus and the Control Signals? [Figure: the bus-based MIPS datapath with a Control Plane Monitor Point added]
Register-Register Add [Figure: monitored datapath, build 1 of 3; values shown: FF0A]
Register-Register Add [Figure: build 2 of 3; values shown: FF0A, 13D0]
Register-Register Add [Figure: build 3 of 3; values shown: FF0A, 13D0, 12DA, 99FF] Incorrect flow based on current state
Hypothesis: The totality of control-type signals forms a stateful representation of a circuit’s control flow, which can be modeled by a finite automaton [Figure: state diagram; states: FETCH_0, ADD_0, ADD_1, ADD_2, LOAD_0, MULT_0, ..., FAULT; transition labels: No_Op, Add, Load, Mult, All Else; control-signal vectors shown: 0,2,1,0,0,0,1,1,0,0,0,0,0; 0,0,0,1,1,0,0,0,0,0,0,0,0; 0,1,0,1,0,1,0,0,0,0,0,0,0]
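The hypothesis can be illustrated with a table-driven automaton. This is a minimal sketch, not the proposed monitor: the state names follow the slide, but the event names (`add_step_0`, `load_step_0`, and so on) are hypothetical stand-ins for the observed control-signal vectors, and the transition table is invented for illustration.

```python
FAULT = "FAULT"

# (current state, observed control event) -> next state.
# Any pair not listed here is treated as "All Else" and goes to FAULT.
TRANSITIONS = {
    ("FETCH_0", "no_op"): "FETCH_0",
    ("FETCH_0", "add"): "ADD_0",
    ("ADD_0", "add_step_0"): "ADD_1",
    ("ADD_1", "add_step_1"): "ADD_2",
    ("ADD_2", "add_step_2"): "FETCH_0",
    ("FETCH_0", "load"): "LOAD_0",
    ("LOAD_0", "load_step_0"): "FETCH_0",
}


def monitor(events, start="FETCH_0"):
    """Run the automaton over observed events.

    Returns (final_state, None) if no fault occurred, or (FAULT, index)
    where index is the position of the first event that was not allowed.
    """
    state = start
    for i, event in enumerate(events):
        state = TRANSITIONS.get((state, event), FAULT)
        if state == FAULT:
            return FAULT, i
    return state, None


ok = monitor(["add", "add_step_0", "add_step_1", "add_step_2"])
bad = monitor(["add", "add_step_0", "mem_write"])
```

Here `ok` ends back in FETCH_0, while `bad` faults at index 2, because a memory write is not part of the specified add sequence from state ADD_1.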
Execution Flow • Since we are describing an execution monitor in this section, we compare its claimed properties to the execution monitor (EM) definitions from [Schn00] • Ψ: the universe of all possible sequences • ΣS: a subset of Ψ corresponding to the execution of some target S • Definition: a security policy is specified by giving a predicate on sets of executions. A target S satisfies security policy P if and only if P(ΣS) is true.
Execution Flow • Execution monitors (EMs) [Schn00] • Any security policy P that can be enforced by an enforcement mechanism on an execution set Π must be specifiable by a predicate of the form: P(Π): (∀σ ∈ Π: P̂(σ)), where P̂ is a predicate on individual executions • DFAEM meets this criterion • DFAEM must also terminate an unauthorized execution after some finite time • The system would do this by halting on the FAULT state, then executing some prescribed corrective action
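The two requirements on this slide can be sketched in a few lines: a policy on a set of executions holds exactly when a per-execution predicate holds for every member, and a violating execution is rejected at a finite position. The sketch assumes this per-execution form is the one intended from [Schn00]; all names (`satisfies_policy`, `first_violation`, `step_ok`) and the example rule are hypothetical.

```python
def first_violation(sigma, step_ok):
    """Index of the first step not allowed given the prefix before it,
    or None if every step of the finite execution sigma is allowed."""
    for i in range(len(sigma)):
        if not step_ok(sigma[:i], sigma[i]):
            return i
    return None


def make_p_hat(step_ok):
    """Build a per-execution predicate from a per-step check."""
    return lambda sigma: first_violation(sigma, step_ok) is None


def satisfies_policy(executions, p_hat):
    """The policy holds on the set iff p_hat holds for each execution."""
    return all(p_hat(sigma) for sigma in executions)


# Example rule (an assumption): a memory write is only allowed
# immediately after a store has been issued.
def step_ok(prefix, event):
    return event != "mem_write" or (len(prefix) > 0 and prefix[-1] == "store")


p_hat = make_p_hat(step_ok)
good_set = [["fetch", "store", "mem_write"], ["fetch", "add"]]
bad_set = [["fetch", "add", "mem_write", "fetch"]]
```

Because the check only ever looks at a finite prefix, a monitor built this way can stop a bad execution at the offending step, which matches the finite-time termination requirement.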
Execution Flow • DFAEM meets the criteria for a security automaton, as described in [Schn00]. • Needs to be able to accept infinite-length inputs • Acceptance requires revisiting at least one accepting state an infinite number of times (in this case, the FETCH_0 state) [Figure: state diagram with states FETCH_0, ADD_0, ADD_1, ADD_2, LOAD_0, FAULT; transition labels: No_Op, Add, Load, All Else]
Execution Flow • Malicious Changes or Errors? • Though malicious inclusions may motivate 3D security designs, at this level a malicious change may be indistinguishable from a design flaw or a transient electrical error • A 3D execution monitor detects any deviation from the specified behavior, and therefore would detect malicious inclusions, design flaws, and transient errors in the same manner • Though malicious inclusions are the motivation, a processor EM might also be useful for detecting transient errors or flaws
Execution Flow • Primary Research tasks • Examine techniques for identification of the control signals which must be monitored • Show the connection between established "execution monitor" theory and the 3D monitoring of a processor's instruction execution flow • Design a working execution monitor in HDL for one or more simple processors • Demonstrate detection of execution-flow malicious inclusions, in software simulation • Demonstrate 3D execution monitor cost metrics – number of gates, time requirements, number of automaton states required