
VERNIER Virtualized Execution Realizing Network Infrastructures Enhancing Reliability



Presentation Transcript


  1. VERNIER: Virtualized Execution Realizing Network Infrastructures Enhancing Reliability VERNIER Project Team DARPA Application Communities Kickoff Meeting July 7, 2006

  2. Outline • Background • Project Overview • Objectives • Project Scope • Research Challenges • Breakthrough Capabilities • Expected Results • Team — Key Personnel and Roles • Technical Approach • Scenario Exemplars • Project Plan — Schedule and Milestones • Experimentation and Evaluation • Technology Transition Plan

  3. Background • Commercial-off-the-shelf (COTS) software • Large organizations, including DoD, have become dependent on it • Yet, most COTS software is not dependable enough for critical applications • Security breaches • Misconfiguration • Bugs • Large, homogeneous COTS deployments, such as those in DoD, accentuate the risk, since many users • Experience the same failures caused by the same vulnerabilities, configuration errors, and bugs • Suffer the same costly, adverse consequences • Alternatives, such as government-funded development of high-assurance systems, present significant barriers in • Cost • Functionality • Performance

  4. VERNIER Project Objectives • Develop new technologies to deliver the benefits of scaling techniques to large application communities • Provide enhanced survivability to the DoD computing infrastructure • Enhance the cost, functionality, and performance advantages of COTS computing environments • Investigate and develop new technologies aimed at enabling communities of systems running similar, widely available COTS software to perform more robustly in the face of attacks and software faults • Deliver a demonstrated, functioning, transition-ready system that implements these new AC survivability technologies • Technical approach: Augmented virtual machine monitor • Commercial transition partner: VMware, Inc.

  5. Project Scope • Collaborative detection and diagnosis of failures • Collaborative response to failures • Advanced situational awareness capabilities • Collective understanding of community state • Predictive capability: Early warning of potential future problems • Key goal: turn the size and homogeneity of the user community into an advantage by converting scattered deployments of vulnerable COTS systems into cohesive, survivable application communities that detect, diagnose, and recover from their own failures • What COTS? • Microsoft Windows, IE, Office suite, and the like

  6. Research Challenges • Extracting behavioral models from binary programs • Breakthrough novel techniques required • Quasi-static state analysis for black-box binaries • Scaled information sharing • Networked application communities sharing knowledge about the software they run • Intelligent, comprehensive recovery • Predictive situational awareness • Automatic, easy-to-understand gauges

  7. Breakthrough Capabilities

  8. Expected Results and Impact • COTS Product (VMware) with breakthrough capabilities for application communities • Scalability to 100K nodes running augmented VMware and custom Vernier software • Automatic collaborative failure diagnosis and recovery • Survivable robust system • Community-aware solution

  9. VERNIER Team • SRI International, Menlo Park, CA • Patrick Lincoln, Principal Investigator • Steve Dawson, Project manager; integration • Linda Briesemeister, Knowledge sharing; collaborative response • Hassen Saidi, Learning-based diagnosis; code analysis; situation awareness • Stanford University • John Mitchell, Stanford PI; code analysis; host-based detection and response • Dan Boneh, Knowledge sharing protocols • Mendel Rosenblum, VMM infrastructure; collaborative response; transition liaison • Palo Alto Research Center (PARC) • Jim Thornton, PARC PI; configuration monitoring and response; situation awareness • Dirk Balfanz, Community response management • Glenn Durfee, Configuration monitoring and response; situation awareness • Technology transition partner: VMware, Inc.

  10. John Boyd’s OODA Loop [diagram: Observe → Orient → Decide → Act; implicit guidance & control runs from Orient back to Observe and forward to Act; unfolding circumstances, outside information, and unfolding interaction with the environment feed observations; orientation draws on cultural traditions, genetic heritage, previous experience, new information, and analyses & synthesis; decision (hypothesis) feeds forward to action (test); feedback loops return to Observe] Note how orientation shapes observation, shapes decision, shapes action, and in turn is shaped by the feedback and other phenomena coming into our sensing or observing window. Also note how the entire “loop” (not just orientation) is an ongoing many-sided implicit cross-referencing process of projection, empathy, correlation, and rejection. From “The Essence of Winning and Losing,” John R. Boyd, January 1996. Defense and the National Interest, http://www.d-n-i.net, 2001

  11. VERNIER Technical Approach

  12. Notional Host System Architecture

  13. An Abstraction-Based Diagnosis Capability for VERNIER Hassen Saidi, SRI

  14. Objectives Based on the general principle: “much of security amounts to making sure that an application does what it is supposed to do… and nothing else!” • Build models of application behaviors (what the application is supposed to do). • Monitor application behavior and report malfunctions and unintended behaviors (deviations from the model). • Use the recorded execution traces as raw data for a set of abstraction-based diagnosis engines (why did the deviation from intended good behavior occur… to the extent that we can do a good job answering such a question). • Share the state of alerts and diagnoses among the nodes of the community (sharing the bad news… but also the good news!). • Aggregate the diagnosis outputs and the alerts into a situation awareness gauge.

  15. [Architecture diagram: COTS app binaries (App 1, App 2, …) and the app OS run on a dynamic VMM over the VERNIER OS base and VM kernel. Monitoring and control of app & OS execution, configuration, and network traffic feed runtime data to quasi-static code analysis, configuration analysis, and network traffic analysis. These drive learning-based diagnosis (local diagnosis, local response) and collaborative response (collaborative diagnosis, collaborative response) over a secure knowledge-sharing network, rolling up to a global situation awareness gauge & UI. Net effect: safe execution and increased application community survivability.]

  16. Approach We combine a set of well-known, well-established techniques: • Build increasingly accurate models of application behaviors: • Static analysis combined with predicate abstraction to build Dyck and CFG models used for static-analysis-based intrusion detection • Implement mechanisms for monitoring sequences of states and actions of an application for the following purposes: • Check whether a known bad sequence is executed (signature-based!) • Check for previously unknown variations of known bad sequences (correlation!) • Find root causes of unexpected malfunctions and malicious exploits (diagnosis) • Diagnosis is performed using techniques borrowed from: • Delta debugging (root-cause diagnosis) • Anomaly detection (correlation) • The situation awareness gauge is implemented as a platform-independent web interface

  17. Monitoring-Based Diagnosis • We combine these techniques into two phases: • Monitoring: applications are monitored, and sequences of executions along with configurations are stored. • Diagnosis: differences between good runs and bad runs are the first clues used for diagnosis. • Traces of executions are sequences of: • System calls • Method calls • Changes in configurations • The more information is stored, the better the chance that malfunctions and malicious behaviors are properly diagnosed.

  18. Quasi-static binary analysis and predicate abstraction-based intrusion detection • Use static analysis to recover the control flow graph of the application. • The CFG is generated by compilers for source code. • Recover the class hierarchy from object code of OO applications. • Build a pushdown system: a model that represents an over-approximation of the sequences of method and system calls of the application. • Deal with context sensitivity to match exit calls to return locations. • Use predicate abstraction and data flow analysis to refine the pushdown system and obtain a more accurate model. • Improve the knowledge about arguments to monitored calls.
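
The runtime side of this monitoring can be illustrated with a toy model. The sketch below checks a call trace against a set of allowed call-to-call transitions — a crude finite-state stand-in for the pushdown system described above (a real pushdown model also tracks the call stack to match calls to return locations). All names and the transition encoding are illustrative, not the project's actual implementation.

```python
class CallModel:
    """Toy over-approximation of allowed call sequences, encoded as a
    set of (previous_call, next_call) transitions."""

    def __init__(self, transitions, start="<entry>"):
        self.transitions = transitions
        self.start = start

    def check(self, trace):
        """Return the index of the first call outside the model,
        or None if the whole trace fits the over-approximation."""
        prev = self.start
        for i, call in enumerate(trace):
            if (prev, call) not in self.transitions:
                return i
            prev = call
        return None

# Hypothetical model: an app may open a file, read it, and close it.
model = CallModel({("<entry>", "open"), ("open", "read"),
                   ("read", "read"), ("read", "close")})

assert model.check(["open", "read", "read", "close"]) is None
assert model.check(["open", "exec"]) == 1   # 'exec' after 'open': deviation
```

Any trace the model rejects is a deviation worth flagging; because the model over-approximates, traces it accepts are not necessarily benign.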

  19. Better Models and Better Monitoring We are interested not just in detecting intrusions, but also in generating high-level explanations of why an application deviates from its intended behavior. • CFG and Dyck models are all over-approximations of the application's behavior (potential attacks are discovered only when the application's behavior deviates from the model). • We will use the runs of the application to generate under-approximations of the application's behavior! • Alternatively, every model representing an over-approximation has a dual that represents an under-approximation (over- and under-approximations don't have to be the same type of model!). • We will combine over- and under-approximations to reduce the risk of missing possible attacks. • We will refine the over- and under-approximations to improve the application model.

  20. Combining over and under approximations • Behavior outside the over-approximation (constructed by static analysis) is unsafe • Behavior within the under-approximation (constructed from runs) is safe • Behavior in between is suspicious and is the source of diagnosis
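
The three-way verdict on slide 20 can be sketched in a few lines, assuming both models are represented as sets of length-k call subsequences (a hypothetical encoding of ours; either model could equally be an automaton or pushdown system).

```python
def kgrams(trace, k):
    """All contiguous length-k subsequences of a trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}

def classify(trace, over, under, k=2):
    """Place a trace in one of the three regions of slide 20."""
    grams = kgrams(trace, k)
    if not grams <= over:
        return "unsafe"       # outside the static over-approximation
    if grams <= under:
        return "safe"         # within behavior already seen in good runs
    return "suspicious"       # in between: input to diagnosis

# Hypothetical models for a small file-handling app.
over = {("open", "read"), ("read", "read"), ("read", "close"), ("open", "close")}
under = {("open", "read"), ("read", "close")}

assert classify(["open", "read", "close"], over, under) == "safe"
assert classify(["open", "read", "read", "close"], over, under) == "suspicious"
assert classify(["open", "write", "close"], over, under) == "unsafe"
```

Only the "suspicious" band is handed to the diagnosis engines; "unsafe" triggers a response immediately, and "safe" is ignored.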

  21. What if we don’t have a model of the application? • We can monitor the application as a black box and intercept system calls: • Learn a model of good behaviors • Learn a model of bad behaviors • Anomalies are differences between good and bad behaviors • Borrow from delta-debugging techniques to find root causes of misbehaviors

  22. Analyzing Differences between runs [example traces: good run a b b c b c b d, bad run a b c b c b b d] • There are many differences between execution traces: • We could consider differing subsequences of arbitrary length • Differences of length k should be considered, where k is chosen depending on the application, the size of the collected data, and the sensitivity of the analysis

  23. Delta Differences, k = 2 • Good run: a b b c b c b d • Bad run: a b c b c b b d • Both runs yield the same set of 2-event subsequences: {ab, bb, bc, cb, bd}. This means that k needs to be increased: k = 2 is too abstract a way of distinguishing the two sequences.

  24. Delta Differences, k = 3 • Good run: a b b c b c b d • Bad run: a b c b c b b d • Common 3-event subsequences: bcb, cbc • Subsequences in red are those that appear only in the failing (bad) run: abc, cbb, bbd • Subsequences in blue appear only in the successful (good) run: abb, bbc, cbd
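
The k-gram comparison of slides 23–24 takes only a few lines; the two runs below are the example good and bad runs from those slides (reconstructed from the garbled slide text, consistent with the six differing 3-grams reported on slide 25).

```python
def kgrams(trace, k):
    """All contiguous length-k subsequences of a trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}

def delta(good, bad, k):
    """Return (only-in-bad, only-in-good) k-gram sets."""
    g, b = kgrams(good, k), kgrams(bad, k)
    return b - g, g - b

good = list("abbcbcbd")  # good run from slides 23-24
bad  = list("abcbcbbd")  # bad run from slides 23-24

# k = 2: both runs share the same 2-gram set, so no differences surface.
only_bad, only_good = delta(good, bad, 2)
assert only_bad == set() and only_good == set()

# k = 3: the six sequences not common to the two runs appear.
only_bad, only_good = delta(good, bad, 3)
assert only_bad  == {tuple("abc"), tuple("cbb"), tuple("bbd")}
assert only_good == {tuple("abb"), tuple("bbc"), tuple("cbd")}
```

This also shows why k is a sensitivity knob: too small and real differences cancel out; larger k exposes them at the cost of many more candidate sequences.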

  25. Diagnosis • One of the 6 sequences that are not common to the two runs is the source of the problem: which one? We can rank the sequences in order of importance based on: • Application-specific criteria: use distance to common sequences for every application-specific origin of a sequence (e.g., process identity or user identity) • Application-independent criteria: use distance to common sequences • Use distance to common sequences or known bad sequences, ignoring the order of execution of calls • Increasing k provides a better explanation, but generates a large number of sequences.
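
One order-insensitive ranking heuristic from the list above can be sketched as follows. The distance metric (symmetric multiset difference of call counts, ignoring execution order) is our assumption for illustration, not the project's specified metric.

```python
from collections import Counter

def bag_distance(a, b):
    """Number of call occurrences by which two sequences differ,
    ignoring execution order."""
    ca, cb = Counter(a), Counter(b)
    return sum(((ca - cb) + (cb - ca)).values())

def rank_suspects(suspects, common):
    """Rank suspect sequences by distance to the nearest sequence
    common to both runs, farthest (most anomalous) first."""
    if not common:
        return list(suspects)
    def score(s):
        return min(bag_distance(s, c) for c in common)
    return sorted(suspects, key=score, reverse=True)

# Using the k = 3 results from slide 24:
suspects = [tuple("abc"), tuple("cbb"), tuple("bbd")]   # only in bad run
common   = [tuple("bcb"), tuple("cbc")]                 # in both runs

ranked = rank_suspects(suspects, common)
assert ranked[-1] == tuple("cbb")   # 'cbb' is a reordering of common 'bcb'
```

Here "cbb" ranks lowest because, ignoring order, it matches a common sequence exactly; the remaining suspects are the stronger root-cause candidates.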

  26. More abstraction • There are more good runs than bad ones! We need to compare the bad runs to the union of good runs: comparing against the union cancels out, in a single pass, every sequence of a bad run that also appears in some good run. • Use average-sequence-weight ranking
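
The slide does not spell out "average-sequence-weight" ranking; one plausible reading, sketched below under our own interpretation, weights each k-gram of a bad run by how rarely it occurs across the pool of good runs and ranks the bad run's k-grams by that weight.

```python
def kgrams(trace, k):
    """All contiguous length-k subsequences of a trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}

def sequence_weight(gram, good_runs, k):
    """Fraction of good runs that never contain this k-gram:
    1.0 = unique to bad behavior, 0.0 = ubiquitous in good runs."""
    absent = sum(1 for run in good_runs if gram not in kgrams(run, k))
    return absent / len(good_runs)

def rank_by_weight(bad_run, good_runs, k):
    """Rank the bad run's k-grams; sequences absent from most good
    runs float to the top as root-cause candidates."""
    return sorted(kgrams(bad_run, k),
                  key=lambda g: sequence_weight(g, good_runs, k),
                  reverse=True)

# Hypothetical pool: two good runs, one bad run (k = 3).
good_runs = [list("abbcbcbd"), list("abbcbd")]
bad_run = list("abcbcbbd")

ranked = rank_by_weight(bad_run, good_runs, 3)
assert ranked[-1] == tuple("bcb")   # appears in every good run: weight 0.0
```

The union of good runs enters through the weight: any sequence seen in every good run gets weight 0 and drops to the bottom, so only sequences peculiar to the failure remain highly ranked.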

  27. Situation Awareness Gauge

  28. Situation Awareness Gauge • Implemented as a platform-independent web interface (e.g., Ruby on Rails) • Content is defined by the database's contents: attacks, failures, diagnoses, etc. • Gauges are simple displays of the number of attacks and failures and various parameters • Provides the user with the ability to initiate responses and diagnosis activities on other nodes via the database

  29. Configuration-based Detection, Diagnosis, Recovery, and Situational Awareness Jim Thornton, PARC

  30. Importance of Configuration • Static configuration state is highly correlated with system behavior • Many attacks/bugs/errors are introduced by way of a substantive change to configuration • “A central problem in system administration is the construction of a secure and scalable scheme for maintaining configuration integrity of a computer system over the short term, while allowing configuration to evolve gradually over the long term” – Mark Burgess, author of cfengine

  31. AC Opportunity [chart: adaptability vs. reliability; the goal is to be high on both axes] • Leverage the scale of the population to learn what the bad states in configuration space are • Today: every configuration change is an uncontrolled experiment • AC future: configuration changes managed as controlled, reversible trials

  32. Live Monitoring of Configuration State • State analysis • Comparative diagnosis • Vulnerability assessment • Clustering similar nodes and contextualizing observations • Detect change events • Cluster low-level changes into transactions • Log events for problem detection, mitigation and user interaction • Share events in real-time for situational awareness • Active learning • Automated experiments to isolate root causes • Managed testing of official changes like patch installation
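
The "cluster low-level changes into transactions" step above can be sketched with a simple heuristic: change events close together in time belong to one transaction. The time-gap rule is our assumption for illustration, not the project's actual clustering algorithm.

```python
def cluster_changes(events, max_gap=2.0):
    """Group low-level change events (sorted by timestamp) into
    transactions, starting a new transaction whenever the gap between
    consecutive events exceeds max_gap seconds."""
    transactions = []
    for t, desc in events:
        if transactions and t - transactions[-1][-1][0] <= max_gap:
            transactions[-1].append((t, desc))   # continue current transaction
        else:
            transactions.append([(t, desc)])     # start a new transaction
    return transactions

# Hypothetical event log: an install burst, then an unrelated change.
events = [
    (0.0,  "registry: add Run key"),
    (0.5,  "filesystem: write app.exe"),
    (1.2,  "filesystem: write app.ini"),
    (60.0, "registry: change proxy setting"),
]
tx = cluster_changes(events)
assert len(tx) == 2 and len(tx[0]) == 3
```

Grouping matters for both logging and response: a transaction, not an individual registry write, is the natural unit to label, share with the community, and reverse.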

  33. Live Control of Configuration State • Modification for Reversibility and Experimentation • Coarse-grained: VM rollback • Medium-grained: Installer/Uninstaller activation • Fine-grained: Direct manipulation of low-level state elements • Prevention • In-progress detection of changes • Interruption of change sequence • Reversal of partial effects
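
The fine-grained end of the reversibility spectrum can be sketched as an undo log: each applied change records its inverse, so a partial change sequence can be rolled back in reverse order (this is the "reversal of partial effects" case). Class and key names are illustrative only.

```python
_MISSING = object()  # sentinel: key did not exist before the change

class ReversibleConfig:
    """Toy reversible configuration store: every change records its
    inverse so the last n changes can be undone in reverse order."""

    def __init__(self, state=None):
        self.state = dict(state or {})
        self.undo_log = []

    def set(self, key, value):
        # Record the prior value (or absence) before overwriting.
        self.undo_log.append((key, self.state.get(key, _MISSING)))
        self.state[key] = value

    def rollback(self, n=None):
        """Undo the last n changes (all pending changes if n is None)."""
        for _ in range(len(self.undo_log) if n is None else n):
            key, old = self.undo_log.pop()
            if old is _MISSING:
                del self.state[key]
            else:
                self.state[key] = old

cfg = ReversibleConfig({"proxy": "none"})
cfg.set("proxy", "proxy.example:8080")
cfg.set("autorun", "app.exe")
cfg.rollback(1)                  # reverse only the partial last step
assert "autorun" not in cfg.state
cfg.rollback()                   # reverse everything else
assert cfg.state == {"proxy": "none"}
```

Coarse-grained VM rollback gives the same guarantee wholesale; the log-based form is what makes "interruption of change sequence, reversal of partial effects" possible without discarding unrelated state.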

  34. Identifying Badness • Objective Deterministic Criteria • Rootkit detection from structural features • Published attack signatures • Objective Heuristic Criteria • Performance outside of normal parameters • Subjective End-User Report • Dialog with user to gather info, e.g. temporal data for failure appearance • Administrative Policy • Rules specified by administrators within community

  35. Local Components [diagram: on one host, a VMM (VM kernel) (2) runs the VERNIER OS base, which hosts (1) App VMs — COTS apps (App 1, App 2) with a VERNIER agent on the app OS — an experimental VM with the same apps and agent, and a VERNIER VM containing the console (UI), communication, diagnosis, and VERNIER monitor/control components; the VERNIER VM connects to the community (3)]

  36. Key Interfaces • (1) VERNIER–Agent (TCP/IP, XML?): registry change events, filesystem change events, install events; manipulate registry, manipulate filesystem, control System Restore • (2) VERNIER–VMM (?): suspend, resume, checkpoint, revert, clone, reset, lock memory, process events, read memory, read/write disk • (3) VERNIER–Community (?): cluster management; experience reports (unknown, prevalent, known bad, presumed good); state exchange; experiment request/response

  37. Local Functions [diagram: a config-change detector (fed by the agent-inside event stream), a network-event detector (fed by a network tap behind the firewall), and a behavior-event detector (fed by the VMM) feed an analysis & diagnosis layer — configuration analysis, behavior analysis, traffic analysis — backed by a local DB holding local condition detail, event logs, labeled condition signatures, state snapshots, and experimental data; a response controller and a communication manager link the console to the community]

  38. Adapting and Extending Host-based, Run-time Win32 Bot Detection for VERNIER Liz Stinson, Stanford

  39. Overview • Background on Stanford’s botnet research • Plans for adapting and extending this work for application to VERNIER

  40. Exploit a botnet characteristic: ongoing command and control • Network-based approaches: • Filtering (protocol-, port-, host-, content-based) • Look for traffic patterns (e.g., DynDNS – Dagon) • Hard (encrypt traffic, permute to look like ‘normal’ traffic, …); bot writers control the arena. • Host-based approaches: • Ours: we have more info at the host level. Since the bot is controlled externally, use this meta-level behavioral signature as the basis of detection

  41. Our approach • Look at the syscalls made by a program • In particular, at certain of their args – our sinks • Possible sources for these sinks: • Local: { mouse, keyboard, file I/O, … } • Remote: { network I/O } • An instance of external control occurs when data from a remote source reaches a sink • This works surprisingly well: for all bots tested (ago, dsnx, evil, g-sys, sd, spy), every command that exhibited external control was detected
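
The source-to-sink idea can be sketched with a toy monitor. The version below approximates the looser "correlative" mode described later (value matching between received bytes and sink arguments) rather than true dynamic taint tracking; the class, threshold, and call names are all illustrative.

```python
class ExternalControlMonitor:
    """Toy model of the source-to-sink check: record data received
    from the network (the remote source) and flag a sink call whose
    argument contains a sufficiently long chunk of that data."""

    def __init__(self, min_len=4):
        self.remote_data = []
        self.min_len = min_len   # ignore trivially short matches

    def on_recv(self, data):
        # Network I/O: the remote source.
        self.remote_data.append(data)

    def sink_is_externally_controlled(self, arg):
        """True if any min_len-byte window of remote input shows up
        in the sink argument (e.g., a process-creation command line)."""
        for data in self.remote_data:
            for i in range(len(data) - self.min_len + 1):
                if data[i:i + self.min_len] in arg:
                    return True
        return False

mon = ExternalControlMonitor()
mon.on_recv("!exec cmd.exe /c del files")          # hypothetical bot command
assert mon.sink_is_externally_controlled("cmd.exe /c del files")
assert not mon.sink_is_externally_controlled("explorer userinit")
```

A locally initiated action shares no bytes with network input and passes the check; a bot relaying its master's command does not, which is exactly the meta-level behavioral signature the slide describes.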

  42. Big picture

  43. Design

  44. Two modes • Cause-and-effect semantics: • Tight relationship between the receipt of some data over the network and the subsequent use of some portion of that data in a sink • Correlative semantics: a looser relationship • Use of some data that is the same as some data received over the network • Why is it necessary?

  45. Behaviors: ideally disjoint; at the lowest level in the call stack

  46. Results • Looked at 6 bots: agobot, dsnxbot, evilbot, g-sysbot, sdbot, spybot • At least 4 have totally independent code bases • g-sys non-trivially extends sd • spybot borrows only the SYN flood implementation from sd • Wide variation in implementation • Every command that exhibited external control was detected; almost every instance of external control was flagged (3 false negatives)

  47. Results

  48. Correlative semantics • Why it is necessary • Why bots with C library functions statically linked in are roughly equivalent to unconstrained out-of-band copies • In general, almost as good as cause-and-effect semantics (static vs. dynamic linking) • Exceptions: commands that format received parameters (e.g., via sprintf)

  49. Comparison

  50. Comparison
