Memory Protection in Resource Constrained Sensor Nodes
Ram Kumar Rengaswamy (ram@ee.ucla.edu)
Ph.D. Defense
Memory Corruption in Mote-class Sensor Nodes
• Single address space CPU
  • Shared by apps., drivers, and the OS
• Many bugs in deployed systems come from memory corruption
• Corrupted nodes trigger network-wide failures
[Figure: sensor node address space - run-time stack plus globals and heap shared by apps., drivers, and OS, with no protection between them]
Programming Errors Cause Corruption
• Motes are difficult to program
• Error in a "well-tested" sampling module in SOS:

    hdr_size = SOS_CALL(s->get_hdr_size, proto);
    s->smsg = (SurgeMsg*)(pkt + hdr_size);
    s->smsg->type = SURGE_TYPE_SENSORREADING;

• The return value is not checked
  • SOS_CALL fails under some conditions and returns -1
  • The error code is used as an offset into a buffer, causing corruption
• Protection mechanisms prevent such corruption (a corrected version is sketched below)
Memory protection is required for building robust sensor software
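A minimal sketch of the repaired call site, assuming SOS's convention of negative error codes; the error constant and early-return style are illustrative, not the actual fix shipped in SOS:

    int8_t hdr_size = SOS_CALL(s->get_hdr_size, proto);
    if (hdr_size < 0) {
        /* SOS_CALL failed: propagate the error instead of using -1
         * as a buffer offset (the corruption described above). */
        return -EINVAL;   /* hypothetical error code */
    }
    s->smsg = (SurgeMsg *)(pkt + hdr_size);
    s->smsg->type = SURGE_TYPE_SENSORREADING;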
Desired Protection Domains
• Domains - logical partitions of the address space
  • A module is loaded into a single domain
  • Protect each domain from corruption by other domains
• Data RAM - non-contiguous partitions
• Program FLASH - contiguous partitions
[Figure: kernel, driver, app. 1, and app. 2 domains laid out across data RAM and program FLASH]
Harbor - create and enforce protection domains
MMU in Micro-Controllers
• An MMU can provide protection domains
• But embedded micro-controllers have no MMU
  • MMU hardware requires a lot of RAM
  • Increases area and power consumption
  • Poor performance - high context-switch overhead
• Microcontroller design constraints
  • Minimize price, power, and memory
  • Real-time performance requirements
Software-based Fault Isolation
• Coarse-grained protection: check all memory accesses
• Checks in-lined by a binary re-writer and separately verified
• Challenge - ensure the checks cannot be circumvented
• Sandboxing [wahbe92sfi]
  • Designed for RISC architectures
  • Static partitioning of the virtual address space
  • Dedicated sandbox registers ensure checks are not skipped
• XFI [xfi06osdi]
  • Control-flow integrity using static analysis and run-time checks
• t-kernel [gu06tkernel]
  • Naturalization - rewrite the binary on the mote to make it safe
  • Virtual memory for heap accesses provides protection
Challenge - Fault Isolation on a Mote
• SFI partitions the virtual address space
  • Leverages the large address space provided by the MMU
  • 4 GB per process on x86
  • Static, contiguous, and large domains
  • Checks have low overhead
  • Minimal state maintained per domain
• Static partitioning is impractical on motes
  • Limited address space - no MMU
  • Severely limited physical memory
  • Wastage through internal fragmentation
[Figure: static kernel with kernel extensions #1 and #2 in statically partitioned address space]
Software-based Approaches
• Application-specific virtual machines [levis05active] [balani06dvm]
  • Interpreted code is safe and efficient
  • High-level VM instructions are not type-safe
• Type-safe languages [titzer06virgil] [necula02ccured]
  • Fine-grained protection
  • Non-type-safe extensions needed for low-level code
  • Compiler output is difficult to verify
• Static analysis for resource management
  • Exchange-heap ownership tracking in Singularity [sing06lang]
  • Dynamic memory ownership tracking in SOS [roy07lighthouse]
Harbor Design Goals
• Fault isolation as a building block for protection
  • Sandbox for ASVM instructions
  • Static analysis for performance enhancement
• Motivation for Harbor: memory protection in the SOS operating system
• Provide coarse-grained memory protection
  • Protect the OS from applications
  • Protect applications from one another
• Targeted at resource-constrained systems
  • Low RAM usage
  • Minimal performance overhead
  • Amenable to software and hardware implementations
Thesis Contributions
• Designed Harbor memory protection
  • Primitives suited to mote-class sensor nodes
  • Memory Map
    • Fine-grained ownership and layout information
    • Enables protection without static partitioning of RAM
  • Control Flow Manager
    • Preserves control-flow integrity
    • Cross domain call [wahbe92sfi], safe stack [xfi06osdi]
    • Stack bounds - prevent run-time stack corruption
• Designed and evaluated two systems based on Harbor
  • Software-based memory protection for SOS
  • Hardware-enhanced memory protection unit (UMPU) for AVR
Outline
• Introduction
• Harbor Primitives: Memory Map
• Sandbox for SOS Modules
• UMPU
• Conclusion
Memory Map - Protection without Static Partitions
• Partition the address space into blocks
• Allocate memory in segments (sets of contiguous blocks)
• Store information for all blocks
• Encoded information per block (see the encoding sketch below)
  • Ownership - kernel/user ID
  • Layout - start-of-segment bit
• Block size is 8 bytes for Mica2
• Result: fine-grained layout and ownership information
[Figure: blocks in RAM tagged as user domain or kernel domain]
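A minimal C sketch of one plausible encoding, assuming 4 bits per block (a 3-bit owner ID plus the start-of-segment bit) and the Mica2's 4 KB of data RAM; the names and layout are illustrative, not the SOS implementation:

    #include <stdint.h>

    #define BLOCK_SIZE     8       /* bytes per block on Mica2          */
    #define DOM_ID_MASK    0x07    /* low 3 bits: owner domain ID       */
    #define SEG_START_BIT  0x08    /* 4th bit: first block of a segment */

    static uint8_t memmap[256];    /* 4 KB / 8 = 512 blocks, 2 per byte */

    /* Fetch the 4-bit record for the block containing addr. */
    static uint8_t memmap_get(uint16_t addr) {
        uint16_t block = addr / BLOCK_SIZE;
        uint8_t  byte  = memmap[block / 2];
        return (block & 1) ? (byte >> 4) : (byte & 0x0F);
    }

    /* Overwrite the 4-bit record for the block containing addr. */
    static void memmap_set(uint16_t addr, uint8_t rec) {
        uint16_t block = addr / BLOCK_SIZE;
        uint8_t *byte  = &memmap[block / 2];
        if (block & 1) *byte = (*byte & 0x0F) | ((rec & 0x0F) << 4);
        else           *byte = (*byte & 0xF0) | (rec & 0x0F);
    }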
Using the Memory Map for Protection
• Ownership information in the memory map is kept current
  • Only the block owner can free or transfer ownership
  • Only trusted domains can access the memory map API
  • The memory map itself is stored in protected memory
• Easy to incorporate into existing systems (see the allocator sketch below)
  • Modify the memory allocators - malloc, free
  • Track function calls that pass memory across domains
  • Changes to the SOS kernel ~ 1%
• Scheme is portable to other OSes and MCUs
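Building on the encoding sketch above, a hypothetical view of the allocator change: malloc tags every block of the new segment with the caller's domain and marks the first block as the segment start. The underlying allocator name is assumed:

    void *dom_malloc(uint16_t size, uint8_t dom_id) {
        uint8_t *seg = ker_malloc_raw(size);   /* underlying allocator (assumed) */
        if (seg == NULL) return NULL;
        for (uint16_t off = 0; off < size; off += BLOCK_SIZE) {
            uint8_t rec = (dom_id & DOM_ID_MASK) |
                          (off == 0 ? SEG_START_BIT : 0);
            memmap_set((uint16_t)(uintptr_t)seg + off, rec);
        }
        return seg;
    }

free and change_own would symmetrically verify that the caller owns the segment before clearing or re-tagging its blocks.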
Memmap Checker
• Enforces a simple protection model
  • Write access to a block is granted only to its owner
• Invoked before every write access
• Operations performed by the checker (sketched below)
  • Look up the memory map entry for the write address
  • Verify that the currently executing domain owns the block
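The checker itself reduces to a lookup and an ownership comparison; a C sketch with a hypothetical fault handler (the deployed checker is hand-written assembly, as the evaluation later discusses):

    void memmap_checker(uint16_t write_addr, uint8_t curr_dom_id) {
        uint8_t rec = memmap_get(write_addr);
        if ((rec & DOM_ID_MASK) != curr_dom_id) {
            harbor_panic();   /* hypothetical handler: abort the writing domain */
        }
    }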
Memory Map Tradeoffs
• Number of domains
  • More protection domains → more bits per block → larger memory map
• Protected range
  • Smaller protected address range → smaller memory map
• Block size
  • Larger block size → smaller memory map
  • Larger block size → greater internal fragmentation
  • Match the block size to the size of typical memory objects
  • Mica2 - 8 bytes; Cyclops - 128 bytes
[Table: memory map size for 8-byte blocks]
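As a worked example of these tradeoffs, using the 4-bit-per-block encoding assumed in the earlier sketch: the Mica2 has 4 KB of data RAM, so 8-byte blocks yield 4096 / 8 = 512 blocks, and at 4 bits per block the map occupies 512 x 4 / 8 = 256 bytes (6.25% of RAM). Dropping to 2 bits per block (one user domain plus the start-of-segment bit) halves the map to 128 bytes; doubling the block size to 16 bytes halves it again, at the cost of more internal fragmentation.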
Outline
• Introduction
• Harbor Primitives: Control Flow Manager
• Sandbox for SOS Modules
• UMPU
• Conclusion
Managing Control Flow
• Harbor needs to manage control flow
  • Prevent circumvention of the run-time checks
  • Track the currently active domain
• Enforced control-flow restrictions
  • Enter/exit a domain only through designated points
  • These restrictions can be verified statically
• Control flow can still become corrupt at run time
  • Calls through corrupt function pointers
  • Returns on a corrupted stack
  • The memory map alone cannot prevent such corruption
Cross Domain Call Stub
• Verify the call lands in the jump table
• Compute the callee domain ID
• Determine the return address
• Exported functions are registered in the jump table
[Figure: Domain A executes "call fooJT"; the jump table entry "fooJT: jmp foo" redirects into Domain B's "foo: … ret", which returns to foo_ret in program memory]
(a conceptual sketch of the stub follows)
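A conceptual C rendering of the stub's steps; the real stub is assembly, and the helper names and frame layout here are assumptions:

    extern uint8_t curr_dom_id;   /* currently active domain */

    void cross_domain_call_stub(uint16_t call_addr, uint16_t ret_addr) {
        /* 1. Verify the call lands on a registered jump table entry. */
        if (call_addr < JMP_TBL_BASE || call_addr >= JMP_TBL_BOUND)
            harbor_panic();
        /* 2. Compute the callee's domain ID from the table entry. */
        uint8_t callee = domain_of_entry(call_addr);
        /* 3. Save the return address and caller domain for the return path. */
        safe_stack_push_frame(curr_dom_id, ret_addr);
        curr_dom_id = callee;   /* domain switch; control jumps via the table */
    }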
Stack Bounds
• A single stack is shared by all domains
• The bound is set during each cross domain call
• Protection model
  • No writes beyond the stack bound
  • Prevents cross-domain stack corruption
• Enforced by the checker before all writes (see the sketch below)
[Figure: run-time stack from the stack base - caller domain stack frame, stack bound, callee domain stack frame, stack pointer]
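A conceptual sketch of how the stack-bound rule composes with the memmap check on every store, assuming the AVR's downward-growing stack (so addresses above the bound belong to caller frames); the variable names are illustrative:

    extern uint16_t stack_bound;   /* set on each cross domain call */
    extern uint16_t stack_ptr;     /* current SP                    */

    void write_checker(uint16_t write_addr, uint8_t curr_dom_id) {
        if (write_addr > stack_bound) {
            harbor_panic();        /* would corrupt a caller's stack frame  */
        } else if (write_addr > stack_ptr) {
            return;                /* inside the callee's own frame: allowed */
        } else {
            memmap_checker(write_addr, curr_dom_id);  /* heap and globals   */
        }
    }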
Safe Stack
• Extra stack maintained in protected memory
• Used for storing Harbor state information, e.g. for return address protection
• Grows up in memory towards the run-time stack
[Figure: heap and globals, safe stack, and run-time stack in RAM; the two stacks grow towards one another]
Return Address Protection
• func_entry_stub copies the return address (foo_ret) from the run-time stack to the safe stack
• func_exit_stub restores it from the safe stack before the ret executes
[Figure: "call foo / foo_ret:" and "foo: … ret"; foo_ret is copied from the run-time stack to the safe stack on entry and restored on exit]
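In C terms, the stubs behave roughly as below; the real stubs are in-lined assembly that manipulates the hardware stack directly, so the pop/push helpers here are purely illustrative:

    void func_entry_stub(void) {
        uint16_t ra = pop_return_address();   /* take foo_ret off the run-time stack */
        safe_stack_push_addr(ra);             /* stash it in protected memory        */
    }

    void func_exit_stub(void) {
        uint16_t ra = safe_stack_pop_addr();  /* retrieve the saved foo_ret          */
        push_return_address(ra);              /* restore it just before ret          */
    }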
Outline
• Introduction
• Harbor Primitives
• Sandbox for SOS Modules
• UMPU
• Conclusion
Sandbox System Overview
[Figure: on the desktop, a raw binary passes through the binary re-writer to produce a sandboxed binary; on the sensor node, the binary verifier admits a memory-safe binary that runs over the memory map and control flow manager inside SOS]
System Design Choices
• Binary re-writer
  • Independent of programming languages and compilers
  • Binary output is easier to decode and verify
  • Only the verifier implementation needs to be correct
• Verification at every node
  • No need to trust the source of a binary
  • Safety is ensured by properties of the binary itself
  • No need for secure code dissemination protocols
Binary Re-Writer
• Insert calls to the run-time checks
  • Checks are located in the trusted domain
  • In-lining the checks would increase module size
• Sandbox store instructions and computed control flow transfers
• Preserve the original control flow, e.g. branch targets
• Sandboxing sequence for a store (st Z, Rsrc):

    push X               ; save scratch registers
    push R0
    movw X, Z            ; X <- write address
    mov  R0, Rsrc        ; R0 <- value being stored
    call memmap_checker  ; validate before writing
    pop  R0
    pop  X
    st   Z, Rsrc         ; original store

• The sequence is re-entrant and works in the presence of interrupts
• The push/pop pairs can be removed by using dedicated registers
Verifier Operation
• Single in-order pass through the instruction stream (sketched below)
  • Processes one basic block at a time
  • Only constant state is maintained
• For each basic block, the verifier ensures
  • Stores are sandboxed
  • Computed control flow instructions are checked
  • Static jump/call/branch targets lie within the domain bounds
  • Certain instructions are disallowed - SPM etc.
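A sketch of that pass over a hypothetical decoded-instruction stream; the opcodes, fields, and helpers are illustrative, and the store rule is simplified (the real check sequence includes the register save/restore shown earlier):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct { uint8_t op; uint16_t target; } instr_t;
    enum { OP_ST, OP_CALL, OP_JMP, OP_ICALL, OP_IJMP, OP_SPM, OP_OTHER };

    bool verify(const instr_t *c, uint16_t n,
                uint16_t dom_lo, uint16_t dom_hi) {
        for (uint16_t i = 0; i < n; i++) {
            switch (c[i].op) {
            case OP_ST:     /* store must follow a memmap checker call */
                if (i == 0 || c[i-1].op != OP_CALL ||
                    c[i-1].target != MEMMAP_CHECKER_ADDR)
                    return false;               /* un-sandboxed store  */
                break;
            case OP_ICALL:  /* computed control flow must have been    */
            case OP_IJMP:   /* rewritten to go through the stubs       */
                return false;
            case OP_SPM:    /* self-programming instruction disallowed */
                return false;
            case OP_JMP:
            case OP_CALL:   /* static targets stay inside the domain,  */
                            /* except calls into the trusted checkers  */
                if ((c[i].target < dom_lo || c[i].target > dom_hi) &&
                    !is_trusted_entry(c[i].target))
                    return false;
                break;
            default:
                break;
            }
        }
        return true;
    }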
Multi-Word Instructions
• Multi-word instructions pose a problem for the verifier
  • e.g. st <addr>, call <addr>
  • The verifier cannot detect jumps into the middle of an instruction
  • The <addr> word could itself be decoded as a store instruction
  • Such hidden stores could circumvent the checks
• The re-writer inserts special marker symbols
  • Ensure all control transfers target the start of a basic block
  • The marker symbol is decoded as a NOP
  • It cannot be a valid instruction in the ISA
  • No valid immediate argument can equal the marker symbol
  • The marker is present at the start of basic blocks only
Comparison with t-kernel
[Figure: overhead comparison]
Overheads of both systems are comparable
Outline
• Introduction
• Harbor Primitives
• Sandbox for SOS Modules: Evaluation
• UMPU
• Conclusion
Sandbox Evaluation
• The sandbox is feasible on mote-class sensors
  • Modest resource usage
  • Minimal overhead for typical sensor workloads
• Code memory usage
  • Blank SOS kernel compiled for the Mica2 platform
  • Compiler: avr-gcc -Os
  • Jump table = 2 KB
  • Checkers + memory map API ~ 4 KB
Data Memory Usage
• Flexible data structure - trade RAM for protection
• Memory map RAM usage: 70 - 256 bytes
• Additional constant overhead: 28 bytes
[Chart: memory map RAM usage as a fraction of data memory - 9.5%, 5.8%, 5.1%, 3.4%]
Code Size of Sandboxed Modules
• Significant increase in code size: 30% - 65%
  • Each store is rewritten into a long sequence that calls the checker
• The overhead can be reduced by using dedicated registers
• Static analysis can eliminate redundant checks
Memory Map Overhead
• Overhead is introduced in the dynamic memory API
  • The memory map needs to be updated on each call
• Higher overheads for free and change_own
  • Added checks prevent illegal freeing or ownership transfer
• Averaged over a long simulation; avg. allocation size = 16 bytes
• Measured using Avrora for the Mica2 platform
Run-Time Checker Overhead
• The checkers incur high overhead despite being implemented in assembly
• Checks occur on the critical path and impact application performance
CPU-Intensive Benchmarks
• FFT - fixed-point operation on 64 samples
• Outlier detection - distance-based threshold on 4 samples
• Buffer writer - averaged over buffer sizes of 16 - 96 bytes
• Worst-case workload for Harbor
  • 100% CPU utilization while executing the benchmarks
  • Checks sit on the critical path and directly impact performance
Sandbox versus Virtual Machine
• Virtual machines provide fine-grained memory protection
  • Average interpretation overhead: 115 [ASVM]
• Interpretation overhead is significantly higher than sandboxing overhead
  • Sandboxing the VM itself slows it down by a factor of 2.6
• The VM provides fine-grained protection to scripts
• Harbor provides coarse protection to the system running the VM
Data Collector Application
• Experiment setup
  • 3-hop linear network simulated in Avrora
  • Simulation executed for 30 minutes
  • Tree Routing and Surge modules inserted into the network
  • Data packets transmitted every 4 seconds
  • Control packets transmitted every 20 seconds
• 1.7% increase in relative CPU utilization
  • Absolute increase in CPU: 8.41% to 8.56%
  • 164 run-time checks introduced
  • Checks executed ~20,000 times
SOS Sandbox Conclusions
• Sandboxing is feasible on mote-class sensors
• The memory map is flexible; tune it to the protection needs
• Low performance impact for typical workloads
• Run-time checks have high overhead
• The sandbox has lower overhead than a virtual machine
• A building block for enhanced protection schemes
SOS Sandbox Future Work
• Verifier design tradeoff
  • Static analysis can eliminate redundant checks
  • This would improve overall system performance
  • But it increases verifier complexity, making verification infeasible on the mote
• Incorporate the network into the system
  • Trusted code sources within the network
  • Sandbox on micro-servers or the backend
  • Secure code dissemination to the mote cluster
Outline
• Introduction
• Harbor Primitives
• Sandbox for SOS Modules
• UMPU
• Conclusion
UMPU Design Goals
• Exploit hardware to improve Harbor's performance
• Provide coarse-grained memory protection
• Targeted at resource-constrained processors
  • Low RAM and ROM usage
  • Minimize the area of the protection logic
• Practical system
  • No modifications to processor instructions
  • Customizable - soft IP core, configuration registers
Memory Protection Unit (MPU)
• Low-cost hardware-assisted protection
  • Static partitioning of the address space
  • Base and bound registers define the partitions
  • Max. of 8 partitions per processor
  • Access permissions defined for every region
  • Memory accesses validated against the permissions
• Simple protection features only
  • Supports only two domains
  • Static partitioning of the address space
  • Limited control-flow protection
UMPU Overview
• Hardware-software co-design, trading complexity for efficiency
[Figure: software components - memory map API, cross domain linking, UMPU device driver; hardware extensions - memory map checker and stack bounds checker (STORE), cross domain call and safe stack (CALL & RET), domain bounds checker, stack overflow checker; axes - complexity vs. efficiency]
Memory Map Controller
• Functional unit that validates store operations
• Triggered by the fetch decoder on a store instruction (ST_INSTR)
• Computes the memory map byte address for the issued write address
• Retrieves the memory map permission and validates the store
• The map is maintained in protected RAM
• RAM access is the bottleneck, i.e. a minimum of one cycle of latency
[Block diagram: fetch decoder → memory map controller → data RAM; signals CPU_ADDR, CPU_WR_EN, CPU_STALL, DATA_BUS]
MMC Store Operation
[Timing diagram over cycles 1-3 comparing regular mode and protected mode; signals CLK, CPU_ADDR, CPU_WR_EN, CPU_WR_ADDR, MMC_ADDR, MMC_RD_ADDR, MMC_WR_EN, CPU_STALL]
Cross Domain Call Unit
[Block diagram: the fetch decoder raises CALL_INSTR/CALL_ADDR; the cross domain call trigger asserts CDC_EN into the cross domain call state machine, which pushes a cross-domain frame (RET_ADDR, CURR_DOM_ID) onto the safe stack in data RAM via SS_PTR/SSP_WR_EN, and updates the domain bounds checker and PC]
Cross Domain Call Unit (contd.)
• Jump table set up by software
  • Location configurable through UMPU registers
• Cross domain call trigger
  • Fires on a call instruction from the fetch decoder
  • CDC_EN = (jmp_tbl_base <= call_addr) AND (call_addr < jmp_tbl_bnd)
• Cross domain call state machine
  • Triggered by CDC_EN
  • Operation similar to the software implementation
  • The bottleneck is the push operation to the safe stack
Domain Bounds Checker
• Restricts control flow to the address range of a domain
  • Every domain occupies an address range [dom_lb, dom_ub]
• Implemented as simple combinational logic in hardware
  • BND_PANIC is raised unless dom_lb <= PC <= dom_ub
• Domain bounds must be initialized, stored, and updated
  • Initialized by the loader, or at link time in a static image
  • Stored in a register array or RAM: an area-performance tradeoff
  • Updated on cross domain calls (indexed by CURR_DOM_ID)
UMPU Micro-Benchmarks
[Figure: micro-benchmark results]
UMPU Logic Area
• Synthesized the microcontroller using Xilinx tools
  • Design targeted at a Virtex-2 FPGA
  • 64% increase in the area of the core
  • Measured in terms of equivalent logic gates
• Spec sensor node [Jason Hill]
  • Identical processing core
  • Chip area = 6.25 mm²
  • Core area = 0.39 mm² (utilization = 6.3%)
  • UMPU-enhanced core = 0.64 mm² (utilization = 10.3%)
• Area utilization of the core increases by only 4%
  • Packed without changing the die size
  • By contrast, adding an MMU brings a 10x increase in chip area