Peng Liu Pennsylvania State University University Park, PA 16802 July 20, 2007

MURI: Autonomic Recovery of Enterprise-wide Systems After Attack or Failure with Forward Correction: System-Level Design & Implementation

Outline • Recovery angle of enterprise health care • The recovery problem • The state-of-the-art • Our goal • Year-by-year plan: overview • Year one plan: zoomed-in

The recovery problem: (1) Patient systems • A human patient vs. a “patient” system • A “patient” system: app processes and OS • Components: code, stack, heap, (VM) pages, files, sockets, PCB, page tables, registers, sys. calls, drivers, … • Threat: virus infection

(2) Why a compromised system could be called a patient • A “patient” system ↔ a human patient • Process ↔ organ • Text, stack, heap, pages, files ↔ tissues • Memory unit, register, disk block, … ↔ cells • OS ↔ neuro + blood • PCB, page tables, drivers, sys. calls, scheduler, sockets, interceptions, … ↔ neuro sub-systems, blood sub-systems • Memory unit, register, disk block, … ↔ cells

The recovery problem: (3) System state transition • A system’s state is determined by the state values of its components: stack, heap, files, registers, … • Timeline: state at 8am, state at 9am, state at 10am, …, state at 5pm • Checkpoints: C-8am, C-9am, C-10am, …, C-5pm • Component x is poisoned by an attack at 9:30am • Fact: “infection” can propagate

(4) Simplest full-system recovery: before • Patient timeline: body at week 1, week 2, week 3, …, week 10 (diagram labels: lymph, gallbladder, etc.) • System timeline: state at 8am, state at 9am, state at 10am, …, state at 5pm • Checkpoints: C-8am, C-9am, C-10am, …, C-5pm • Component x (liver) is poisoned by an attack at 9:30am

(5) Simplest full-system recovery: after • Patient timeline: body at week 1, body at week 2 • Bob’s memory after week 2 is lost: very painful for Bob • System timeline: state at 8am, state at 9am • Checkpoints: C-8am, C-9am • Component x (liver) is poisoned by an attack at 9:30am • The work after 9am is lost • This is memory-less recovery
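The rollback on the two slides above can be illustrated with a minimal sketch. It assumes checkpoints are stored as time-sorted (time, state) pairs; the function name `memoryless_recover` and the sample component values are hypothetical and not part of the project.

```python
from bisect import bisect_left

def memoryless_recover(checkpoints, attack_time):
    """Return (time, state) of the latest checkpoint taken strictly before attack_time.

    checkpoints: list of (time, state) pairs sorted by time.
    """
    times = [t for t, _ in checkpoints]
    idx = bisect_left(times, attack_time) - 1
    if idx < 0:
        raise ValueError("no checkpoint predates the attack")
    return checkpoints[idx]

# Hypothetical data mirroring the slide: hours on a 24-hour clock.
checkpoints = [
    (8.0, {"x": "v0", "report": "empty"}),
    (9.0, {"x": "v1", "report": "draft"}),
    (10.0, {"x": "POISONED", "report": "half done"}),
    (17.0, {"x": "POISONED", "report": "finished"}),
]
restored_time, restored_state = memoryless_recover(checkpoints, attack_time=9.5)
```

With an attack at 9:30am the whole system returns to C-9am, so every component loses its work after 9am, including work that never depended on component x.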

(6) Memory-preserving, full-system recovery • What is memory-preserving recovery? • When we perform surgery on the liver, we do not roll back the state of the brain • When we repair an infected process, we do not roll back any uninfected process • Memory-preserving recovery requires fine-grained process-level (i.e., organ-level) and operation-level forward-correction surgeries • Memory-preserving recovery is challenging • Due to infection propagation, it is hard to know which (part of an) organ should be cut off and which should be kept
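The contrast with full rollback can be sketched as follows. This is only an illustration under strong assumptions: the set of infected processes is already known, each process state is a single opaque value, and the names (`memory_preserving_repair`, the sample processes) are hypothetical.

```python
def memory_preserving_repair(current, clean_versions, infected):
    """Restore only the infected processes; leave every other process untouched.

    current:        dict mapping process name -> current state
    clean_versions: dict mapping process name -> last known clean state
    infected:       set of process names judged to be infected
    """
    repaired = dict(current)  # copy, so the caller's view is not mutated
    for name in infected:
        repaired[name] = clean_versions[name]
    return repaired

# Hypothetical example: only the mailer is infected.
current = {"editor": "draft-v3", "mailer": "INFECTED", "db": "rows-120"}
checkpoint_9am = {"editor": "draft-v1", "mailer": "clean", "db": "rows-100"}
repaired = memory_preserving_repair(current, checkpoint_9am, infected={"mailer"})
```

The editor and the database keep their post-9am work. The sketch sidesteps the hard part named on the slide: deciding which processes (or parts of them) belong in `infected` once the infection has propagated.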

(7) Full-body anesthesia vs. local anesthesia • There are two ways to perform surgeries: • Full-body “anesthesia”: The machine is halted during recovery • Local “anesthesia”: The uninfected processes can still be executed as usual • For non-stop enterprise computing, local “anesthesia” is required

The recovery problem: (8) Infection quarantine • Why quarantine? • The under-repair components are infectious → prevent them from infecting clean processes • Execution of the uninfected processes may interfere with the surgeries → guarantee correctness • Quarantine = “disinfection” + local “anesthesia” • Quarantine strategies • Two-way quarantine • One-way quarantine

The state-of-the-art • Memory-less recovery • Re-playable systems • Process checkpointing • Process migration • Memory-preserving subsystem recovery with full-body anesthesia

The state-of-the-art: (1) Memory-less recovery • One-button recovery • A standard feature in laptops (HP, Dell, etc.) • The OS will lose all “memory” • Simplest full-system recovery • Checkpoint-based • E.g., the whole state of a VM at time t can be copied to disk (state procurement) • Will lose “memory” after the moment of attack

The state-of-the-art: (2) Re-playable systems • E.g., ReVirt can log and replay all operations of a virtual machine • Re-playable ≠ recoverable • ReVirt cannot detangle bad operations from good ones • ReVirt cannot replay only the unaffected good operations • ReVirt cannot do forward correction • ReVirt cannot do local anesthesia • ReVirt cannot quarantine infection

The state-of-the-art: (3) Process checkpointing • Per-process checkpointing: Flashback (and Rx) can checkpoint the whole state of a process at time t in RAM • Checkpoint-able ≠ recoverable • Flashback cannot detangle bad operations from good ones within the same process • Flashback cannot track taint-propagation channels • Flashback cannot do forward correction • Flashback cannot quarantine infection

The state-of-the-art: (4) Process migration • Process migration: • A Pod is a group of processes “tangled” with each other • Zap can migrate a Pod from machine A to B • Migrate-able ≠ recoverable • Zap cannot detangle bad operations from good ones • A partially infected Pod has to be totally “discarded” • Zap cannot track taint propagation • Zap cannot do forward correction

The state-of-the-art: (5) Memory-preserving subsystem recovery with full-body anesthesia • Taser can do memory-preserving recovery; however: • It is not full-system recovery: it can only repair file systems • Taser requires full-body anesthesia • Taser cannot quarantine infection • Taser cannot do on-the-fly surgeries • Compared with our blueprint, Taser lacks the capabilities to do: • Remote surgeries • Nested recovery • Replicated recovery • Non-stop recovery

Our goal • Do memory-preserving, self-recoverable, non-stop enterprise computing: • Fine-grained recovery surgeries • Forward correction • Keep good “memory” in a consistent way • Remove bad “memory” • Local “anesthesia” • Quarantine infection during recovery • Transparent to uninfected processes

Challenges: Multi-Granularity Recovery • Machine-level recovery • Processes are usually “tangled” with each other • It is not hard to checkpoint a VM, but • It is hard to detangle bad operations from good ones • Pod-level recovery • Zap can checkpoint and migrate a Pod, but • It is hard to do detangling • A partially infected Pod has to be totally “discarded” • Process-level recovery • A partially infected process has to be totally “discarded” • Need to track taint-propagation channels • Operation-level recovery: the desired granularity • Need fine-grained surgeries inside the “body” of a VM • Very hard to do selective replay or migration • Tough tradeoffs between recoverability and consistency

Recovery Services: Roadmap • Initial Capability: focus on processes and files; VMM-based logger; per-process atomicity; dependency analysis; quarantine via VMM; roll-forward correction • Silver Capability: holistic recovery (sockets, shared memory, DBMS, attributes, …; control dependencies: process forking, workflows); remote “surgeries” (EHCC sends instructions to remote surgery agents) • Gold Capability: nested recovery (intra-process checkpointing, nested transactions) • Platinum Capability: replicated recovery (heterogeneous VM replica, standby VM) • Diamond Capability: non-stop recovery (transparent switching, stateful migration)

New recovery capabilities: basic ones

New recovery capabilities: advanced ones • New capabilities can • Provide transactional atomicity & consistency • Do non-stop warm-start or hot-start recovery • Perform remote surgeries • Do intra-process checkpointing • Do nested recovery within a process • Do heterogeneous VM replication • Construct standby VM • Do stateful recovery-driven process migration • Side benefits: • Improved observation/inspection capability • Improved diagnosis/forensics capability • Improved detection capability

Year one: Initial Capability • Scope: local health care • Focus: app processes, files • Logging: VMM based • Atomicity: per-process • Dependency analysis based detangling • Local anesthesia via host kernel • Quarantine via VMM • On-the-fly, roll-forward correction

System architecture [Figure: architecture diagram. Labels shown: App A, App B, display process, stack, heap, timer, log, task structure, guest OS, dependency analyzer, VMM auditor, quarantine, roll-forward correction instruction generator, surgery agent, hook, host kernel, drivers, keyboard, ports, CPU, disks, cache]

Why run “patient” systems in a VM? • Enhanced security • App processes are isolated in a separate VM • The host kernel does not directly interact with the app processes • Although any component of a “patient” system may be compromised, the host kernel is quite safe • The audits and recovery code are well protected • Enhanced observation/inspection capability • Much easier to do local anesthesia • Much easier to quarantine • Much easier to perform repair surgeries • Downside: performance degradation

Year one work plan • Team 1: QEMU-based implementation • Team 2: UML-based implementation • Each team has two graduate students • Goal of each implementation: • Phase I: be able to do incremental logging • Phase II: be able to do damage assessment and detangling • Phase III: be able to perform on-the-fly repair “surgeries”

Phase I: do incremental logging • VM state-procurement techniques have recently been proposed, but • If the checkpoints are taken frequently → too much overhead • If the checkpoints are not taken frequently → “memory” loss • A better idea is to log only the changes • Any operation could change the state • If we log every state change → too much overhead • So which changes do not need to be logged? • Are we able to log all changes? • A QEMU-based CPU emulator can log every change • A UML-based logger can log every trap to the OS
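The idea of logging only the changes can be sketched as below. This is a toy model, assuming the system state is a flat dictionary of component values sampled at discrete times; the names `delta`, `apply_delta`, and `IncrementalLog` are hypothetical and say nothing about how the QEMU-based or UML-based loggers are built.

```python
_DELETED = object()  # sentinel marking a component that disappeared

def delta(prev, curr):
    """Return only the components that changed between two states."""
    changes = {k: v for k, v in curr.items() if k not in prev or prev[k] != v}
    for k in prev.keys() - curr.keys():
        changes[k] = _DELETED
    return changes

def apply_delta(state, changes):
    """Return a new state with the recorded changes applied."""
    new = dict(state)
    for k, v in changes.items():
        if v is _DELETED:
            new.pop(k, None)
        else:
            new[k] = v
    return new

class IncrementalLog:
    """One full base checkpoint plus a time-ordered list of deltas."""

    def __init__(self, base):
        self.base = dict(base)
        self._last = dict(base)
        self.entries = []  # list of (time, changes)

    def record(self, time, state):
        changes = delta(self._last, state)
        if changes:  # an unchanged state costs nothing
            self.entries.append((time, changes))
            self._last = dict(state)

    def state_at(self, time):
        """Rebuild the state as of `time` by replaying deltas over the base."""
        state = dict(self.base)
        for t, changes in self.entries:
            if t > time:
                break
            state = apply_delta(state, changes)
        return state
```

Compared with frequent full checkpoints, storage grows with the amount of change rather than with the size of the state, and any intermediate state can still be rebuilt. The open question on the slide, which changes can safely be left unlogged, is not addressed here.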

Phase II: do damage assessment • The goal is to detangle tainted operations from untainted operations • Dependency analysis is required in order to do detangling • We have built various kinds of dependency graphs for data processing systems • We will extend these graphs to capture the taint-propagation channels in a VM • Fine-grained VM information-flow analysis techniques have recently been proposed • Although their purpose is intrusion detection, they may be applied to serve our recovery purposes
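Damage assessment over a log can be sketched as a forward taint-propagation pass. The model is an assumption made for illustration: each logged operation names one process, the objects it reads, and the objects it writes, and taint is never cleared once set. The names `Op` and `assess_damage` are hypothetical.

```python
from typing import NamedTuple

class Op(NamedTuple):
    time: float
    process: str
    reads: frozenset
    writes: frozenset

def assess_damage(ops, seeds):
    """Split a log into tainted and untainted operations.

    seeds: components known to be poisoned by the attack.
    Rule (conservative): an operation is tainted if its process is already
    tainted or it reads a tainted object; it then taints its process and
    everything it writes.
    """
    tainted = set(seeds)
    bad_ops, good_ops = [], []
    for op in sorted(ops, key=lambda o: o.time):
        if op.process in tainted or op.reads & tainted:
            bad_ops.append(op)
            tainted.add(op.process)
            tainted |= op.writes
        else:
            good_ops.append(op)
    return tainted, bad_ops, good_ops

# Hypothetical log: component x is poisoned, A reads x, C reads A's output.
ops = [
    Op(9.6, "A", frozenset({"x"}), frozenset({"f1"})),
    Op(9.7, "B", frozenset({"f2"}), frozenset({"f3"})),
    Op(9.8, "C", frozenset({"f1"}), frozenset({"f4"})),
]
tainted, bad_ops, good_ops = assess_damage(ops, seeds={"x"})
```

Here process B and its output f3 stay clean, so B's work can be preserved. A real analysis would need finer rules, for example tracking control dependencies, which this sketch ignores.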

Phase III: perform on-the-fly repair surgeries • How to do local anesthesia? • Let the host kernel not schedule any tainted process that is under surgery • How to quarantine? • Let the VMM enforce the quarantine policy • Two-way quarantine: the tainted components are totally contained • One-way quarantine: a tainted process may access a version of an untainted component, but not vice versa • How to do forward-correction surgeries? • Naïve idea: replace the state value of every tainted component with a clean version • Real challenge: how to keep the clean versions of, e.g., 50 components consistent with each other
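The two quarantine strategies can be written as a small access-check function. This is a sketch of the policy as stated on the slide, not of the VMM mechanism; it assumes each subject and object carries a single tainted/untainted flag and that accesses between components on the same side of the boundary are always allowed. `Policy` and `access_allowed` are hypothetical names.

```python
from enum import Enum

class Policy(Enum):
    TWO_WAY = "two-way"
    ONE_WAY = "one-way"

def access_allowed(policy, subject_tainted, object_tainted):
    """Decide whether a process (subject) may access a component (object)."""
    if subject_tainted == object_tainted:
        return True  # assumption: same side of the quarantine boundary
    if policy is Policy.TWO_WAY:
        return False  # tainted components are totally contained
    # One-way: a tainted process may access (a version of) an untainted
    # component, but an untainted process may not access a tainted one.
    return subject_tainted and not object_tainted
```

For example, `access_allowed(Policy.ONE_WAY, True, False)` is permitted while the reverse direction is denied; under `Policy.TWO_WAY` both directions are denied.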

Questions?
