
Practical Data Confinement


Presentation Transcript


  1. Practical Data Confinement Andrey Ermolinskiy, Sachin Katti, Scott Shenker, Lisa Fowler, Murphy McCauley

  2. Introduction • Controlling the flow of sensitive information is one of the central challenges in managing an organization • Preventing exfiltration (theft) by malicious entities • Enforcing dissemination policies

  3. Why is it so hard to secure sensitive data?

  4. Why is it so hard to secure sensitive data? • Modern software is rife with security holes that can be exploited for exfiltration

  5. Why is it so hard to secure sensitive data? • Modern software is rife with security holes that can be exploited for exfiltration • Users must be trusted to remember, understand, and obey dissemination restrictions • In practice, users are careless and often inadvertently allow data to leak • E-mail sensitive documents to the wrong parties • Transfer data to insecure machines and portable devices

  6. Our Goal • Develop a practical data confinement solution

  7. Our Goal • Develop a practical data confinement solution • Key requirement: compatibility with existing infrastructure and patterns of use • Support current operating systems, applications, and means of communication • Office productivity apps: word processing, spreadsheets, … • Communication: E-mail, IM, VoIP, FTP, DFS, … • Avoid imposing restrictions on user behavior • Allow access to untrusted Internet sites • Permit users to download and install untrusted applications

  8. Our Assumptions and Threat Model • Users • Benign, do not intentionally exfiltrate data • Make mistakes, inadvertently violate policies • Software platform (productivity applications and OS) • Non-malicious, does not exfiltrate data in pristine state • Vulnerable to attacks if exposed to external threats • Attackers • Malicious external entities seeking to exfiltrate sensitive data • Penetrate security barriers by exploiting vulnerabilities in the software platform

  9. Central Design Decisions • Policy enforcement responsibilities • Cannot rely on human users • The system must track the flow of sensitive information, enforce restrictions when the data is externalized

  10. Central Design Decisions • Policy enforcement responsibilities • Cannot rely on human users • The system must track the flow of sensitive information, enforce restrictions when the data is externalized • Granularity of information flow tracking (IFT) • Need fine-grained byte-level tracking and policy enforcement to prevent accidental partial exfiltrations

  11. Central Design Decisions • Placement of functionality • PDC inserts a thin software layer (hypervisor) between the OS and hardware • The hypervisor implements byte-level IFT and policy enforcement • A hypervisor-level solution • Retains compatibility with existing OSes and applications • Has sufficient control over hardware

  12. Central Design Decisions • Placement of functionality • PDC inserts a thin software layer (hypervisor) between the OS and hardware • The hypervisor implements byte-level IFT and policy enforcement • A hypervisor-level solution • Retains compatibility with existing OSes and applications • Has sufficient control over hardware • Resolving tension between safety and user freedom • Partition the application environment into two isolated components: a “Safe world” and a “Free world”

  13. Partitioning the User Environment [Diagram: the Safe Virtual Machine has access to sensitive data, while the Unsafe Virtual Machine permits unrestricted communication and execution of untrusted code; the hypervisor underneath performs IFT and policy enforcement on top of the hardware (CPU, Memory, Disk, NIC, USB, Printer, …).]

  14. Partitioning the User Environment [Diagram legend: sensitive vs. non-sensitive data, trusted vs. untrusted (potentially malicious) code/data, and exposure to the threat of exfiltration.]

  15. PDC Use Cases • Logical “air gaps” for high-security environments • VM-level isolation obviates the need for multiple physical networks • Preventing information leakage via e-mail • “Do not disseminate the attached document” • Digital rights management • Keeping track of copies; document self-destruct • Auto-redaction of sensitive content

  16. Talk Outline • Introduction • Requirements and Assumptions • Use Cases • PDC Architecture • Prototype Implementation • Preliminary Performance Evaluation • Current Status and Future Work

  17. PDC Architecture: Hypervisor • PDC uses an augmented hypervisor to • Ensure isolation between safe and unsafe VMs • Track the propagation of sensitive data in the safe VM • Enforce security policy at exit points • Network I/O, removable storage, printer, etc.

  18. PDC Architecture: Tag Tracking in the Safe VM • PDC associates an opaque 32-bit sensitivity tag with each byte of virtual hardware state • User-accessible CPU registers • Volatile memory • Files on disk

  19. PDC Architecture: Tag Tracking in the Safe VM • These tags are viewed as opaque identifiers • The semantics can be tailored to fit the specific needs of administrators/users • Tags can be used to specify • Security policies • Levels of security clearance • High-level data objects • High-level data types within an object
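
As a rough illustration of what byte-granular 32-bit tags imply, here is a minimal C sketch of the naïve flat layout, one tag word per guest byte. The type and names are hypothetical; PDC's actual PageTagSummary/PageTagDescriptor structures appear later in the talk.

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical sketch: one opaque 32-bit sensitivity tag per byte of
     * guest state. A flat shadow array like this is the naive layout that
     * costs four tag bytes per data byte (the 400% overhead mentioned on
     * the "Challenges" slide); PDC's real structures are more compact. */
    typedef uint32_t pdc_tag_t;

    #define PDC_TAG_NONE ((pdc_tag_t)0)   /* untagged / non-sensitive */

    struct naive_shadow_memory {
        size_t     size;   /* bytes of guest memory covered      */
        pdc_tag_t *tags;   /* tags[i] is the tag of guest byte i */
    };

    static struct naive_shadow_memory *naive_shadow_alloc(size_t guest_bytes)
    {
        struct naive_shadow_memory *sm = malloc(sizeof(*sm));
        if (!sm)
            return NULL;
        sm->size = guest_bytes;
        sm->tags = calloc(guest_bytes, sizeof(pdc_tag_t)); /* all PDC_TAG_NONE */
        if (!sm->tags) {
            free(sm);
            return NULL;
        }
        return sm;
    }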

  20. PDC Architecture: Tag Tracking in the Safe VM • An augmented x86 emulator performs fine-grained instruction-level tag tracking (the current implementation is based on QEMU) • PDC tracks explicit data flows (variable assignments, arithmetic operations) [Diagram: for add %eax, %ebx, the tag of %eax propagates into %ebx.]

  21. PDC Architecture: Tag Tracking in the Safe VM • An augmented x86 emulator performs fine-grained instruction-level tag tracking (the current implementation is based on QEMU) • PDC also tracks flows resulting from pointer dereferencing [Diagram: for mov %eax, (%ebx), the tags of %eax and %ebx are merged into the tag of the destination memory bytes.]
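
A minimal C sketch of these two propagation rules, assuming a per-register tag array and a byte-granular memory tag map. reg_tag, mem_tag_set, and tag_merge are hypothetical helpers, and the merging of the pointer register's tag on a store is inferred from the slide's diagram; in the real system the equivalent logic is emitted as a tag tracking instruction stream, described below.

    #include <stdint.h>

    typedef uint32_t pdc_tag_t;

    /* Hypothetical tag state for the sketch: one tag per emulated
     * general-purpose register and a byte-granular tag map for memory. */
    extern pdc_tag_t reg_tag[8];                  /* indexed by x86 register number */
    extern void      mem_tag_set(uint32_t paddr, pdc_tag_t tag);

    /* Simplest possible merge: "tagged if either input is tagged".
     * Real tag semantics are policy-defined and opaque to this layer. */
    static pdc_tag_t tag_merge(pdc_tag_t a, pdc_tag_t b)
    {
        return a ? a : b;
    }

    /* Explicit flow:  add %eax, %ebx  -- the destination register picks
     * up the tag of the source operand. */
    static void track_add_reg_reg(int src, int dst)
    {
        reg_tag[dst] = tag_merge(reg_tag[dst], reg_tag[src]);
    }

    /* Flow through a pointer dereference:  mov %eax, (%ebx)  -- the tags
     * of the stored value and of the pointer register are merged into the
     * tags of the destination memory bytes. */
    static void track_store_reg_mem(int src, int ptr, uint32_t paddr, int len)
    {
        pdc_tag_t t = tag_merge(reg_tag[src], reg_tag[ptr]);
        for (int i = 0; i < len; i++)
            mem_tag_set(paddr + i, t);
    }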

  22. Challenges • Tag storage overhead in memory and on disk • A naïve implementation (a 32-bit tag for every byte of data) would incur a 400% overhead • Computational overhead of online tag tracking • Tag explosion • Tag tracking across pointer dereferences exacerbates the problem • Tag erosion due to implicit flows • Bridging the semantic gap between application data units and low-level machine state • Impact of VM-level isolation on user experience

  23. Talk Outline • Introduction • Requirements and Assumptions • Use Cases • PDC Architecture • Prototype Implementation • Storing sensitivity tags in memory and on disk • Fine-grained tag tracking in QEMU • “On-demand” emulation • Policy enforcement • Performance Evaluation • Current Status and Future Work

  24. PDC Implementation: The Big Picture [Architecture diagram: the safe VM runs App1, App2, and an NFS client; Dom0 runs the policy daemon, the network daemon, QEMU / the tag tracker (hosting the emulated safe VM and the PageTagDescriptors), and an NFS server on top of VFS / PDC-ext3, reached over Xen-RPC; PDC-Xen (ring 0) maintains the shadow page tables, the safe VM page tables, the PageTagMask, an event channel, and a shared ring buffer, on top of the CPU (CR3) and NIC.]

  25. Storing Tags in Volatile Memory • PDC maintains a 64-bit PageTagSummary for each page of machine memory • Uses a 4-level tree data structure to keep PageNumber → PageTagSummary mappings [Diagram: the page number is split at bit positions 31, 29, 19, 9, and 0 to index the four tree levels; the leaves are arrays of 64-bit PageTagSummary structures.]
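
A sketch of the lookup through such a tree, in C. The level index widths are taken loosely from the 31/29/19/9/0 field boundaries on the slide and are assumptions, as are all the names.

    #include <stdint.h>

    typedef uint64_t page_tag_summary_t;   /* 64-bit PageTagSummary */

    /* Assumed split of the page number across the four levels
     * (bits 31..29, 28..19, 18..9, 8..0). */
    #define L1_SHIFT 29
    #define L2_SHIFT 19
    #define L3_SHIFT 9
    #define L1_MASK  0x7u
    #define L2_MASK  0x3ffu
    #define L3_MASK  0x3ffu
    #define L4_MASK  0x1ffu

    struct l4_node { page_tag_summary_t summary[L4_MASK + 1]; };  /* leaf array */
    struct l3_node { struct l4_node *child[L3_MASK + 1]; };
    struct l2_node { struct l3_node *child[L2_MASK + 1]; };
    struct l1_node { struct l2_node *child[L1_MASK + 1]; };

    /* Return the PageTagSummary for a machine page number, or 0 (untagged)
     * if no node has been allocated along the path. */
    static page_tag_summary_t page_tag_lookup(const struct l1_node *root,
                                              uint32_t page_number)
    {
        const struct l2_node *l2 = root->child[(page_number >> L1_SHIFT) & L1_MASK];
        if (!l2) return 0;
        const struct l3_node *l3 = l2->child[(page_number >> L2_SHIFT) & L2_MASK];
        if (!l3) return 0;
        const struct l4_node *l4 = l3->child[(page_number >> L3_SHIFT) & L3_MASK];
        if (!l4) return 0;
        return l4->summary[page_number & L4_MASK];
    }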

  26. Storing Tags in Volatile Memory • A PageTagSummary holds either a page-wide tag (for uniformly-tagged pages) or a pointer to a PageTagDescriptor otherwise • A PageTagDescriptor stores fine-grained (byte-level) tags within a page in one of two formats: a linear array of tags (indexed by page offset) or an RLE encoding
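
The two-level format might be sketched in C as follows; the real PageTagSummary packs this into a single 64-bit word, and all field names here are illustrative.

    #include <stdint.h>

    typedef uint32_t pdc_tag_t;

    enum page_tag_kind { PAGE_TAG_UNIFORM, PAGE_TAG_FINE_GRAINED };
    enum desc_format   { DESC_LINEAR_ARRAY, DESC_RLE };

    /* One RLE run: "length" consecutive bytes starting at page offset
     * "start" all carry the same tag. */
    struct rle_run {
        uint16_t  start;
        uint16_t  length;
        pdc_tag_t tag;
    };

    /* Byte-level tags for a single 4 KB page, in one of two formats. */
    struct page_tag_descriptor {
        enum desc_format format;
        union {
            pdc_tag_t linear[4096];      /* one tag per byte, indexed by page offset */
            struct {
                uint32_t        nruns;
                struct rle_run *runs;    /* RLE encoding of the byte-level tags */
            } rle;
        } u;
    };

    /* Per-page summary: either a single page-wide tag for a uniformly
     * tagged page, or a pointer to the fine-grained descriptor. */
    struct page_tag_summary {
        enum page_tag_kind kind;
        union {
            pdc_tag_t                   page_wide_tag;
            struct page_tag_descriptor *desc;
        } u;
    };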

  27. Storing Tags on Disk • PDC-ext3 provides persistent storage for the safe VM • A new i-node field holds file-level tags • Leaf indirect blocks store pointers to BlockTagDescriptors • A BlockTagDescriptor stores byte-level tags within a block (as a linear array or an RLE encoding) [Diagram: the i-node carries a FileTag; indirect blocks lead to leaf indirect blocks, whose entries point to data blocks and to the corresponding BlockTagDescriptors.]

  28. Back to the Big Picture [Architecture diagram repeated from slide 24, highlighting the emulated CPU context inside QEMU / the tag tracker.]

  29. Fine-Grained Tag Tracking • A modified version of QEMU emulates the safe VM and tracks movement of sensitive data • QEMU relies on runtime binary recompilation to achieve reasonably efficient emulation • We augment the QEMU compiler to generate a tag tracking instruction stream from the input stream of x86 instructions [Diagram: a guest machine code block (x86) is first translated into the intermediate representation (TCG), from which both a host machine code block (x86) and a tag tracking code block are generated.]

  30. Fine-Grained Tag Tracking • Tag tracking instructions manipulate the tag status of emulated CPU registers and memory • Basic instruction format: Action {Clear, Set, Merge}, Destination operand {Reg, Mem}, Source operand {Reg, Mem} • The tag tracking instruction stream executes asynchronously in a separate thread
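
An illustrative C encoding of this instruction format; the actual in-memory representation used by the tag tracker is not shown in the talk.

    #include <stdint.h>

    enum tt_action  { TT_CLEAR, TT_SET, TT_MERGE };
    enum tt_op_kind { TT_REG, TT_MEM };

    /* A tag tracking operand is either an emulated register or a memory
     * location; memory addresses that are unknown at translation time are
     * filled in from the argument log described on the next slide. */
    struct tt_operand {
        enum tt_op_kind kind;
        union {
            int      reg;
            uint32_t addr;
        } u;
    };

    /* Basic instruction format: Action, destination operand, source operand.
     * The width field accounts for the Clear4/Set4/Merge4 forms seen in the
     * recompilation example. */
    struct tt_instruction {
        enum tt_action    action;
        struct tt_operand dst;
        struct tt_operand src;
        uint8_t           width;
    };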

  31. Fine-Grained Tag Tracking • Problem: some of the instruction arguments are not known at compile time • Example: mov %eax,(%ebx) • The address of the memory operand (the value of %ebx) is not known • The main emulation thread writes the values of these arguments to a temporary log (a circular memory buffer) at runtime • The tag tracker fetches unknown values from this log
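
A sketch of that runtime log as a single-producer / single-consumer ring buffer in C; the main emulation thread is the producer and the tag tracking thread the consumer. Sizes, names, and the overflow policy are assumptions.

    #include <stdint.h>
    #include <stdatomic.h>

    #define ARG_LOG_SLOTS 4096   /* must be a power of two */

    struct tt_arg_log {
        _Atomic uint64_t head;                 /* next slot to write (producer) */
        _Atomic uint64_t tail;                 /* next slot to read (consumer)  */
        uint32_t         slot[ARG_LOG_SLOTS];
    };

    /* Producer: called from the main emulation thread with an operand value
     * (e.g. the effective address of a store) known only at runtime.
     * Returns 0 if the ring is full; a real system would stall or resize. */
    static int arg_log_push(struct tt_arg_log *log, uint32_t value)
    {
        uint64_t head = atomic_load_explicit(&log->head, memory_order_relaxed);
        uint64_t tail = atomic_load_explicit(&log->tail, memory_order_acquire);
        if (head - tail == ARG_LOG_SLOTS)
            return 0;
        log->slot[head & (ARG_LOG_SLOTS - 1)] = value;
        atomic_store_explicit(&log->head, head + 1, memory_order_release);
        return 1;
    }

    /* Consumer: called from the tag tracking thread; returns 0 if empty. */
    static int arg_log_pop(struct tt_arg_log *log, uint32_t *value)
    {
        uint64_t tail = atomic_load_explicit(&log->tail, memory_order_relaxed);
        uint64_t head = atomic_load_explicit(&log->head, memory_order_acquire);
        if (tail == head)
            return 0;
        *value = log->slot[tail & (ARG_LOG_SLOTS - 1)];
        atomic_store_explicit(&log->tail, tail + 1, memory_order_release);
        return 1;
    }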

  32. Binary Recompilation (Example)

      Input x86 instruction   Intermediate representation (TCG)   Tag tracking instructions
      mov %eax, $123          movi_i32 tmp0,$123                  Clear4 eax
                              st_i32 tmp0,env,$0x0
      push %ebp               ld_i32 tmp0,env,$0x14               Set4 mem,ebp,0
                              ld_i32 tmp2,env,$0x10               Merge4 mem,esp,0
                              movi_i32 tmp14,$0xfffffffc
                              add_i32 tmp2,tmp2,tmp14
                              qemu_st_logaddr tmp0,tmp2           (writes MachineAddr(%esp) to the tag tracking argument log)
                              st_i32 tmp2,env,$0x10

  33. Binary Recompilation • But things get more complex… • Switching between operating modes (Protected/real/virtual8086, 16/32bit)

  34. Binary Recompilation • But things get more complex… • Switching between operating modes (Protected/real/virtual8086, 16/32bit) • Recovering from exceptions in the middle of a translation block

  35. Binary Recompilation • But things get more complex… • Switching between operating modes (Protected/real/virtual8086, 16/32bit) • Recovering from exceptions in the middle of a translation block • Multiple memory addressing modes

  36. Binary Recompilation • But things get more complex… • Switching between operating modes (Protected/real/virtual8086, 16/32bit) • Recovering from exceptions in the middle of a translation block • Multiple memory addressing modes • Repeating instructions rep movs

  37. Binary Recompilation • But things get more complex… • Switching between operating modes (Protected/real/virtual8086, 16/32bit) • Recovering from exceptions in the middle of a translation block • Multiple memory addressing modes • Repeating instructions rep movs • Complex instructions whose semantics are partially determined by the runtime state (e.g., iret, which pops the saved EIP, CS, EFLAGS, ESP, and SS from the stack)

  38. Back to the Big Picture [Architecture diagram repeated from slide 24, highlighting the emulated CPU context inside QEMU / the tag tracker.]

  39. “On-Demand” Emulation • During virtualized execution, PDC-Xen uses the paging hardware to intercept accesses to sensitive data • It maintains shadow page tables, in which all memory pages containing tagged data are marked as not present • An access to a tagged page from the safe VM causes a page fault and a transfer of control to the hypervisor [Diagram: PDC-Xen (ring 0) maintains the shadow page tables, the safe VM page tables, and the PageTagMask; QEMU / the tag tracker holds the PageTagDescriptors.]
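
The control flow described above can be summarized with a C sketch; page_is_tagged, suspend_guest_and_enter_qemu, and forward_fault_to_guest are placeholder names, not Xen or PDC-Xen functions.

    #include <stdint.h>
    #include <stdbool.h>

    extern bool page_is_tagged(uint32_t page_number);   /* PageTagMask lookup */
    extern void suspend_guest_and_enter_qemu(void *guest_cpu_context);
    extern void forward_fault_to_guest(uint32_t fault_addr);

    /* Shadow page table policy: any PTE that maps a tagged page is installed
     * with the present bit cleared, so the first touch traps to the hypervisor. */
    #define PTE_PRESENT 0x1u

    static uint64_t shadow_pte_for(uint64_t guest_pte, uint32_t page_number)
    {
        if (page_is_tagged(page_number))
            return guest_pte & ~(uint64_t)PTE_PRESENT;
        return guest_pte;
    }

    /* Page fault handler (simplified): faults on tagged pages switch the
     * safe VM into emulated execution; all other faults are the guest's own. */
    static void pdc_xen_page_fault(uint32_t fault_addr, void *guest_cpu_context)
    {
        uint32_t page_number = fault_addr >> 12;

        if (page_is_tagged(page_number))
            suspend_guest_and_enter_qemu(guest_cpu_context);
        else
            forward_fault_to_guest(fault_addr);
    }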

  40. “On-Demand” Emulation • If the page fault is due to tagged data, PDC-Xen suspends the guest domain and transfers control to the emulator (QEMU) • QEMU initializes the emulated CPU context from the native processor context (saved upon entry to the page fault handler) and resumes the safe VM in emulated mode [Diagram: an access to a tagged page in the safe VM traps into the page fault handler; control passes to QEMU / the tag tracker in Dom0, which has the safe VM's memory mapped and initializes the emulated SafeVM CPU from the SafeVM VCPU state.]

  41. “On-Demand” Emulation • Returning from emulated execution • QEMU terminates the main emulation loop, waits for the tag tracker to catch up • QEMU then makes a hypercall to PDC-Xen and provides • Up-to-date processor context for the safe VM VCPU • Up-to-date PageTagMask

  42. “On-Demand” Emulation • Returning from emulated execution • QEMU terminates the main emulation loop, waits for the tag tracker to catch up • QEMU then makes a hypercall to PDC-Xen and provides • Up-to-date processor context for the safe VM VCPU • Up-to-date PageTagMask • The hypercall awakens the safe VM VCPU (blocked in the page fault handler) • The page fault handler • Overwrites the call stack with up-to-date values of CS/EIP, SS/ESP, EFLAGS • Restores other processor registers • Returns control to the safe VM
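
A sketch of that return path in C; the frame layout and helper names are hypothetical, standing in for Xen's actual per-VCPU exception frame.

    #include <stdint.h>

    struct guest_frame {            /* saved on entry to the page fault handler */
        uint32_t eip, cs, eflags, esp, ss;
    };

    struct vcpu_context {           /* up-to-date state supplied by QEMU's hypercall */
        uint32_t eip, cs, eflags, esp, ss;
        uint32_t eax, ebx, ecx, edx, esi, edi, ebp;
    };

    extern void restore_gp_registers(const struct vcpu_context *ctx);

    /* Called once the hypercall from QEMU has delivered the emulated CPU
     * context; afterwards the handler simply returns, resuming the safe VM
     * natively at the point where the emulator left off. */
    static void resume_native(struct guest_frame *frame,
                              const struct vcpu_context *ctx)
    {
        frame->eip    = ctx->eip;      /* overwrite the saved return frame ... */
        frame->cs     = ctx->cs;
        frame->eflags = ctx->eflags;
        frame->esp    = ctx->esp;
        frame->ss     = ctx->ss;
        restore_gp_registers(ctx);     /* ... and the remaining registers      */
    }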

  43. “On-Demand” Emulation - Challenges

  44. “On-Demand” Emulation - Challenges • Updating PTEs in read-only page table mappings • Solution: QEMU maintains local writable “shadow” copies, synchronizes them in background via hypercalls

  45. “On-Demand” Emulation - Challenges • Updating PTEs in read-only page table mappings • Solution: QEMU maintains local writable “shadow” copies, synchronizes them in background via hypercalls • Transferring control to the hypervisor during emulated execution (hypercall and fault handlers) • Emulating hypervisor-level code is not an option • Solution: Transient switch to native execution • Resume native execution at the instruction that causes a jump to the hypervisor (e.g., int 0x82 for hypercalls)

  46. “On-Demand” Emulation - Challenges • Delivery of timer interrupts (events) in emulated mode • The hardware clock advances faster in the emulated context (i.e., each instruction consumes more clock cycles) • Xen needs to scale the delivery of timer events accordingly

  47. “On-Demand” Emulation - Challenges • Delivery of timer interrupts (events) in emulated mode • The hardware clock advances faster in the emulated context (i.e., each instruction consumes more clock cycles) • Xen needs to scale the delivery of timer events accordingly • Use of the clock cycle counter (rdtsc instruction) • Linux timer interrupt/event handler uses the clock cycle counter to estimate timer jitter • After switching from emulated to native execution, the guest kernel observes a sudden jump forward in time

  48. Policy Enforcement • The policy controller module • Resides in dom0 and interposes between the front-end and the back-end device drivers • Fetches policies from a central policy server • Looks up the tags associated with the data in shared I/O request buffers and applies policies [Diagram: in dom0, the policy controller sits between the network-interface and block-storage back-end drivers and the corresponding front-end drivers in the safe VM.]
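
The interposition step might be summarized with a C sketch like the following, where lookup_buffer_tags and policy_allows are hypothetical helpers standing in for the tag lookup and for the policy fetched from the policy server.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef uint32_t pdc_tag_t;

    extern size_t lookup_buffer_tags(const void *buf, size_t len,
                                     pdc_tag_t *tags_out, size_t max_tags);
    extern bool   policy_allows(pdc_tag_t tag, int device_class);

    enum { DEV_NETWORK, DEV_BLOCK, DEV_USB, DEV_PRINTER };

    /* Before an I/O request from the safe VM's front-end driver reaches the
     * back-end, look up the tags of the bytes in the shared request buffer
     * and check whether they may leave through this class of device. */
    static bool policy_check_request(const void *buf, size_t len, int device_class)
    {
        pdc_tag_t tags[64];
        size_t n = lookup_buffer_tags(buf, len, tags, 64);

        for (size_t i = 0; i < n; i++) {
            if (tags[i] && !policy_allows(tags[i], device_class))
                return false;   /* block (or redact/annotate) the request */
        }
        return true;            /* untagged or permitted: forward to back-end */
    }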

  49. Network Communication • PDC annotates outgoing packets with PacketTagDescriptors carrying the sensitivity tags • The current implementation transfers annotated packets via a TCP/IP tunnel [Diagram: the original packet (EthHdr | IPHdr | TCPHdr | Payload) is carried, together with its tags, behind an outer EthHdr | IPHdr | TCPHdr added by the tunnel encapsulation.]
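
A sketch of what such an annotation could look like on the wire; the field layout of the PacketTagDescriptor is an assumption, since the talk only states that outgoing packets carry the sensitivity tags of their payload bytes.

    #include <stdint.h>

    typedef uint32_t pdc_tag_t;

    /* One run of payload bytes that share a tag. */
    struct tag_run {
        uint16_t  offset;   /* byte offset into the original payload */
        uint16_t  length;   /* number of payload bytes in this run   */
        pdc_tag_t tag;
    };

    /* Hypothetical on-the-wire annotation: after the tunnel's outer
     * Eth/IP/TCP headers comes this descriptor, its tag runs, and then
     * the original frame (EthHdr | IPHdr | TCPHdr | Payload). */
    struct packet_tag_descriptor {
        uint16_t num_runs;
        uint16_t orig_frame_len;   /* length of the encapsulated original frame */
        /* struct tag_run runs[num_runs] follow on the wire */
    };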

  50. Talk Outline • Introduction • Requirements and Assumptions • Use Cases • PDC Architecture • Prototype Implementation • Preliminary Performance Evaluation • Application-level performance overhead • Filesystem performance overhead • Network bandwidth overhead • Current Status and Future Work
