
Shimin Chen LBA Reading Group

Complete Information Flow Tracking from the Gates Up. Tiwari, Wassel, Mazloom, Mysore, Chong, Sherwood, UCSB, ASPLOS 2009. Presented by Shimin Chen, LBA Reading Group.

Presentation Transcript


  1. Complete Information Flow Tracking from the Gates Up • Tiwari, Wassel, Mazloom, Mysore, Chong, Sherwood, UCSB, ASPLOS 2009 • Shimin Chen, LBA Reading Group

  2. Introduction • In a traditional microprocessor, information is leaked practically everywhere and by everything • Can be a serious problem for exceptionally sensitive financial, military, and personal data • Cryptography, authentication • Developers in these domains are willing to go to remarkable lengths to minimize the amount of leaked information • flushing the cache before and after executing a piece of critical code (Osvik et al. 2006) • attempting to scrub the branch predictor state (Aciicmez et al. 2007) • normalizing the execution time of loops by hand (Kocher 1996) • randomizing or prioritizing the placement of data into the cache (Lee et al. 2005) • Previous work on dynamic information flow tracking (DIFT) is not adequate

  3. GLIFT: Gate-Level Information-Flow Tracking • This paper: • presents a processor architecture and implementation • can track all information flows • A novel logic discipline: GLIFT logic • Augment arbitrary logic blocks with tracking logic • Make compositions of augmented blocks • Synthesizable processor implementation with a restricted ISA • Provably-sound information-flow tracking • Allow tasks such as public-key cryptography and message authentication

  4. Theoretical Understanding • In a Turing-complete machine, the general problem of determining whether information flows in a program from variable x to variable y is undecidable: • “any procedure purported to decide it could be applied to the statement if f(x) halts then y := 0 and thus provide a solution to the halting problem for arbitrary recursive function” (Denning and Denning 1977). • The paper builds a machine: • by construction, will not allow unbounded execution • All hidden flows of information are made explicit

  5. Outline • Introduction • Gate Level Information Flow Tracking • Architecture • Evaluation • Conclusions

  6. Idea • Understand how information flows through primitive logic gates • Compose these gates together into more complex structures • Treat the whole processor as a logical function • Operates on a set of inputs • Results in a set of outputs • The trust of outputs should be determined based on the trust of inputs • Assumption: • Binary state: trusted (0) or untrusted (1)

  7. GLIFT for an AND gate • [Figures: AND gate; AND gate truth table; partial truth table for the shadow logic; shadow logic for the AND gate]
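
The figures for this slide did not survive in the transcript. As a rough sketch only, the shadow logic for an AND gate can be written out in Python using the trust encoding from slide 6 (0 = trusted, 1 = untrusted). The function names and the closed-form expression are assumptions for illustration, not copied from the slide; the second function derives the same result directly from the definition (the output is untrusted only if changing the untrusted inputs could change it), and the loop checks that the two agree.

```python
from itertools import product

def and_shadow(a, a_t, b, b_t):
    """Trust bit of o = a AND b (1 = untrusted), as a closed-form expression."""
    return (a_t & b) | (b_t & a) | (a_t & b_t)

def and_shadow_by_definition(a, a_t, b, b_t):
    """Untrusted iff varying only the untrusted inputs can change the output."""
    a_choices = (0, 1) if a_t else (a,)
    b_choices = (0, 1) if b_t else (b,)
    outputs = {x & y for x, y in product(a_choices, b_choices)}
    return int(len(outputs) > 1)

# Exhaustive check over all 16 combinations of values and trust bits.
for a, a_t, b, b_t in product((0, 1), repeat=4):
    assert and_shadow(a, a_t, b, b_t) == and_shadow_by_definition(a, a_t, b, b_t)
```

The interesting row is a trusted 0 ANDed with an untrusted input: the output is 0 regardless, so it stays trusted.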

  8. Composing Larger Functions • Use MUX as a simple example • The shadow logic can be composed from the shadow logic of the individual gates • Not minimal but always sound: for example, the composed logic does not exploit the fact that the two inputs to the OR gate can never both be 1 • If S is trusted and the selected input is trusted, o is trusted • If S is untrusted, o is untrusted unless both a and b are trusted and are equal
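
To make "not minimal but always sound" concrete, here is a small Python sketch that composes a MUX from NOT, AND, and OR gates, each carrying a (value, trust) pair, and compares the composed trust bit with the precise one obtained by brute force. The gate decomposition o = (a AND NOT s) OR (b AND s) and all function names are assumptions for illustration; the slide's own circuit is not in the transcript.

```python
from itertools import product

# Every wire is a (value, trust) pair; trust 1 means untrusted.
def not_gate(x):
    v, t = x
    return (1 - v, t)

def and_gate(x, y):
    (a, a_t), (b, b_t) = x, y
    return (a & b, (a_t & b) | (b_t & a) | (a_t & b_t))

def or_gate(x, y):
    (a, a_t), (b, b_t) = x, y
    return (a | b, (a_t & (1 - b)) | (b_t & (1 - a)) | (a_t & b_t))

def mux_composed(s, a, b):
    """o = (a AND NOT s) OR (b AND s), tracked gate by gate."""
    return or_gate(and_gate(a, not_gate(s)), and_gate(b, s))

def mux_precise_trust(s, a, b):
    """Untrusted iff varying only the untrusted inputs can change the output."""
    choices = [(0, 1) if t else (v,) for v, t in (s, a, b)]
    outputs = {(av if sv == 0 else bv) for sv, av, bv in product(*choices)}
    return int(len(outputs) > 1)

# Soundness: the composed logic never reports "trusted" when the precise
# analysis says "untrusted".
for bits in product((0, 1), repeat=6):
    s, a, b = bits[0:2], bits[2:4], bits[4:6]
    assert mux_composed(s, a, b)[1] >= mux_precise_trust(s, a, b)
```

With an untrusted select and a = b = 1, both trusted, the output cannot depend on the select, yet the composed logic still marks it untrusted. That is the conservative case the slide refers to.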

  9. Outline • Introduction • Gate Level Information Flow Tracking • Architecture • Evaluation • Conclusions

  10. Step 1: Handling Conditionals • Problem with conventional architecture • If X is untrusted, then PC becomes untrusted • Selected instruction becomes untrusted • Bits that select target register are untrusted • All of the registers may be marked as untrusted • Must keep PC trusted

  11. Solution: Predication • All the instructions are executed • If the predicate is 0, the instruction has no effect: the target register is not overwritten • PC is trusted • Predicates can become untrusted • Suppose P0 is untrusted

  12. Example • The line selecting R2 (the target) is untrusted • The other control lines are trusted • R2 will be marked untrusted whether P0 = 0 or 1 • End result: whether or not the untrusted predicate is true, the destination is marked as untrusted.
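
The end result on this slide can be mimicked in a few lines of Python. This is a deliberately conservative model, not the gate-level logic from the paper: the function name, the list-based register file, and the rule "an untrusted predicate always taints the destination" are assumptions for illustration.

```python
def predicated_write(regs, trust, dest, value, value_trust, pred, pred_trust):
    """Write value to regs[dest] only if pred == 1; trust bit 1 means untrusted."""
    if pred:
        regs[dest] = value
        trust[dest] = value_trust | pred_trust
    else:
        # The old value survives, but whether it survived depended on pred.
        trust[dest] = trust[dest] | pred_trust

# Same instruction, untrusted predicate P0, executed with P0 = 0 and P0 = 1.
for p0 in (0, 1):
    regs, trust = [0] * 8, [0] * 8
    predicated_write(regs, trust, dest=2, value=42, value_trust=0,
                     pred=p0, pred_trust=1)
    assert trust == [0, 0, 1, 0, 0, 0, 0, 0]   # only R2 is marked untrusted
```

Because the destination index itself is a trusted immediate, the damage is confined to R2 instead of spreading to the whole register file.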

  13. Step 2: Handling Loops • Loops are hard • for (i=0; i<=X; i++) A[i]=1; • Information flow from X to A[X+1] • A[X+1]==0 tells us about X • Information flow from X to A[X+n] for all n • Implicit timing channel
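
A short Python sketch shows why the array leaks X even though X is never copied into it. The array size and the helper names are assumptions for illustration.

```python
def run_loop(x, size=16):
    """Model of: for (i = 0; i <= X; i++) A[i] = 1;  (requires x + 1 < size)."""
    a = [0] * size
    for i in range(x + 1):
        a[i] = 1
    return a

def recover_x(a):
    """An observer who sees only A finds the first untouched cell, A[X+1]."""
    return a.index(0) - 1

# Every value of X that fits in the array is recovered exactly.
for secret in range(15):
    assert recover_x(run_loop(secret)) == secret
```

The cells that were not written carry as much information as the cells that were, which is why every A[X+n] has to be treated as depending on X.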

  14. Solution: Statically Specify Number of Iterations • countjump instruction: • Specify number of loop iterations • jump target address • Example (my understanding from the description) • Loop start address: … … countjump #iterations, loop start address • The first time countjump is encountered, the number of iterations is loaded into an internal loop counter register • The loop counter register is decremented every time countjump is encountered, and PC ← loop start address • When the register becomes 0, PC ← PC + 1 • countjump cannot be predicated
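
The description above leaves some details open, so the following Python interpreter is only one possible reading. It assumes a single loop counter (no nesting), an iteration count of at least 1, and that the loop body runs exactly the stated number of times; the tuple-based instruction format and the `run` function are hypothetical.

```python
def run(program, max_steps=10_000):
    """Tiny interpreter. Instructions are tuples:
    ("work", label)          appends label to the trace
    ("countjump", n, target) loops back to target until n passes are done
    """
    pc, counter, trace = 0, None, []
    for _ in range(max_steps):
        if pc >= len(program):
            return trace
        op = program[pc]
        if op[0] == "work":
            trace.append(op[1])
            pc += 1
        else:
            _, n, target = op
            if counter is None:   # first encounter: load the iteration count
                counter = n
            counter -= 1          # decremented on every encounter
            if counter > 0:
                pc = target       # PC <- loop start address
            else:
                counter = None    # counter reached 0: PC <- PC + 1
                pc += 1
    raise RuntimeError("step budget exceeded")

program = [("work", "body"), ("countjump", 3, 0), ("work", "after")]
trace = run(program)
```

Because the count is an immediate in the instruction, the number of passes through the body is fixed before any data is seen, so the PC never depends on untrusted values.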

  15. Early Termination • C has a “break” statement that can terminate a loop early • Here, the paper proposes: • Predicate all the instructions in the loop with the termination condition • When the termination condition becomes true, the loop body has no effect
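
As an illustration of replacing "break" with predication, here is a Python sketch of a linear search that always runs the full, statically known number of iterations. The function and variable names are assumptions, and the Python conditional expression only stands in for a predicated move; it is not how the hardware would execute it.

```python
def find_first(values, needle):
    """Return (index of first match or -1, iterations executed)."""
    result, done, iterations = -1, 0, 0
    for i, v in enumerate(values):         # trip count fixed by len(values)
        iterations += 1
        hit = int(v == needle)
        active = (1 - done) & hit          # fires only before termination
        result = i if active else result   # stands in for a predicated move
        done |= hit                        # termination condition, never a jump
    return result, iterations

early = find_first([4, 7, 7, 1], 7)
never = find_first([4, 7, 7, 1], 9)
```

Both calls execute four iterations, so the running time no longer reveals where, or whether, the match occurred.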

  16. Step 3: Constraining Loads and Stores • Indirect loads and stores are bad • e.g., M[reg] ← value • If reg is untrusted, then essentially all the memory locations become untrusted • “Intuitively, the problem is that accessing one untrusted address causes every other address to become implicitly untrusted by virtue of them not being accessed or modified.” • Limit the ISA to only allow: • Direct load/store: addresses are immediate constants • Loop-relative addressing: load-looprel, store-looprel • e.g., load-looprel R0, 0x100, C0 • Loads M[0x100 + C0] • C0..C7 are counters: explicitly initialized by init-counter, and incremented by a fixed value with increment-counter • counter operations cannot be predicated
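
The contrast between an indirect store and a loop-relative store can be sketched in Python as follows. This is a coarse software model, not the gate-level tracking logic; the function names are hypothetical, and the counter is assumed to be trusted because it is only ever set by unpredicated counter instructions.

```python
def store_indirect(mem, trust, addr, addr_trust, value, value_trust):
    """M[reg] <- value, where reg may be untrusted (trust bit 1 = untrusted)."""
    mem[addr] = value
    if addr_trust:
        # Any cell might have been the target, so every cell becomes suspect.
        for i in range(len(trust)):
            trust[i] = 1
    else:
        trust[addr] = value_trust

def store_looprel(mem, trust, base, counter, value, value_trust):
    """store-looprel: immediate base plus a loop counter assumed to be trusted."""
    mem[base + counter] = value
    trust[base + counter] = value_trust

mem, trust = [0] * 8, [0] * 8
store_looprel(mem, trust, base=4, counter=1, value=9, value_trust=1)
after_looprel = list(trust)    # only cell 5 is untrusted
store_indirect(mem, trust, addr=2, addr_trust=1, value=3, value_trust=0)
after_indirect = list(trust)   # every cell is untrusted
```

With loop-relative addressing an untrusted value taints only the cell it lands in, whereas one untrusted address taints the whole memory.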

  17. Proof-of-Concept Implementation • Verilog • Use Altera’s Quartus II software to synthesize it onto a Stratix II FPGA • 32-bit machine • 64KB instruction memory, 64KB data memory • Registers: • A program counter • 8 general-purpose registers • 2 predicate registers • 8 registers to store loop counters (that count down the number of iterations) • 8 other registers to store explicit array indices (used as offsets for load-looprel and store-looprel instructions) • No pipelining

  18. Augment the Processor with GLIFT Logic • Each bit of processor state is explicitly shadowed: • every register gets a shadow register • every memory has a shadow RAM • The logic and signals are shadowed by generating the proper trust propagation logic

  19. ISA

  20. A code snippet from the SubBytes function in the AES encryption algorithm • Basically, this is the following in C: for (i=0; i<16; i++) { state[i] = SBox[state[i]]; }
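
The assembly on this slide is not in the transcript. As a hedged sketch of the idea, the data-dependent lookup SBox[state[i]] can be replaced by a scan of the whole table with a predicated move, so that every access uses a fixed, counter-relative address. The function names are hypothetical, and the table below is a stand-in permutation, not the real AES S-box.

```python
def lookup_by_full_scan(table, index):
    """Return table[index] while touching every entry in a fixed order."""
    result = 0
    for j in range(len(table)):              # fixed trip count
        entry = table[j]                     # address depends only on j
        match = int(j == index)              # predicate computed from the data
        result = entry if match else result  # stands in for a predicated move
    return result

def sub_bytes(state, sbox):
    """state[i] = SBox[state[i]] without data-dependent addressing."""
    return [lookup_by_full_scan(sbox, s) for s in state]

# Stand-in table for illustration only (NOT the AES S-box).
toy_sbox = [(7 * x + 3) % 256 for x in range(256)]
state = [0, 1, 2, 255]
substituted = sub_bytes(state, toy_sbox)
```

Each lookup now costs a pass over all 256 entries, which is the kind of full table iteration that inflates the dynamic instruction counts for table-heavy kernels.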

  21. Outline • Introduction • Gate Level Information Flow Tracking • Architecture • Evaluation • Conclusions

  22. Hardware Impact • Altera’s Nios is a commercial product: RISC instruction set, reasonably optimized • Nios econ: unpipelined 6-stage core, without caches, branch predictors, etc. • Nios std: pipelined, 4KB instruction cache • GLIFT base: unpipelined, no tracking • GLIFT full: GLIFT base + tracking

  23. Hardware Impact • 70% area increase compared to GLIFT base • Small frequency degradation: adding GLIFT tracking has little impact on latency

  24. Application Kernels • Dynamic instruction counts vary substantially • FSM and AES have a lot of table look-ups, which become full table iterations

  25. Conclusions • Bigger, slower, harder to program, and computationally less powerful • For the first time, provides the ability to account for all information flows through the chip • My learning: • A deeper understanding of information leaks • Efforts to prevent leaks are very significant • Sacrifices programmability: restrictions on loops and load/store • The proof-of-concept does not even address issues such as caches