Collaborative Learning for Security and Repair in Application Communities

Performers: MIT and Determina • Michael Ernst, MIT Computer Science & Artificial Intelligence Lab • 7 July 2006

Personnel • MIT: Michael Ernst, Martin Rinard, Jeff Perkins, Stephen McCamant, Shay Artzi, … and others • Determina: Sandy Wilbourn, Derek Bruening, Saman Amarasinghe, … and others

Vulnerable monocultures • Problem: Large installed bases of similar software • Susceptible to a single catastrophic attack • Opportunity: Large community of cooperating applications • Share information about attacks, errors • Experiment with different response and recovery strategies • Disseminate successful approaches

Components of our solution • Technical ideas: • Targeted bounds enforcement • Data structure consistency learning and enforcement • Implementation platform • Determina Managed Program Execution Engine

Cooperating communities • Each computer is a sentry on watch for problems • Each computer is a testbed for evaluating solutions • Share information about problems and solutions • The system learns: it performs better over time • Example: • One machine notices an error or attack • Generate many distinct patches • Each machine loads a randomly chosen patch • Discard patches that do not yield acceptable behavior
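The example on this slide (generate many patches, give each machine a random one, discard the ones that misbehave) can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names, the `outcomes` report format, and the success-rate threshold are hypothetical and do not come from the slides.

```python
import random
from collections import defaultdict


def assign_patches(machines, patches, seed=0):
    """Give each machine one randomly chosen candidate patch (hypothetical helper)."""
    rng = random.Random(seed)
    return {machine: rng.choice(patches) for machine in machines}


def surviving_patches(assignment, outcomes, min_success_rate=1.0):
    """Keep patches whose observed success rate meets the threshold.

    assignment maps machine -> patch id.
    outcomes maps machine -> True if behavior was acceptable after patching.
    Machines that have not reported yet are ignored.
    """
    tally = defaultdict(lambda: [0, 0])  # patch -> [successes, trials]
    for machine, patch in assignment.items():
        if machine in outcomes:
            tally[patch][1] += 1
            tally[patch][0] += int(outcomes[machine])
    return sorted(
        patch
        for patch, (ok, trials) in tally.items()
        if trials > 0 and ok / trials >= min_success_rate
    )


if __name__ == "__main__":
    machines = ["m1", "m2", "m3", "m4"]
    assignment = assign_patches(machines, ["A", "B", "C"])
    # Assumption for the demo: only patch "A" yields acceptable behavior.
    outcomes = {m: assignment[m] == "A" for m in machines}
    print(surviving_patches(assignment, outcomes))
```

Lowering `min_success_rate` below 1.0 would tolerate occasional unrelated failures, at the cost of keeping patches that sometimes misbehave.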

Targeted bounds enforcement • Program errors or injected code indicates bounds violations • Generate patches to eliminate bounds errors • Evaluate patches on many machines • Filter out those that do not eliminate problems (or that cause new problems)
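To make "generate patches to eliminate bounds errors" concrete, here is a minimal sketch of three alternative patches for a single out-of-bounds write. The slides do not say which repair strategies the project generates; the discard, clamp, and wrap strategies below are assumptions chosen for illustration, and a Python list stands in for a buffer in a binary.

```python
def write_discard(buf, index, value):
    """Candidate patch 1: silently drop writes that fall outside the buffer."""
    if 0 <= index < len(buf):
        buf[index] = value


def write_clamp(buf, index, value):
    """Candidate patch 2: redirect out-of-bounds writes to the nearest valid slot."""
    if buf:
        buf[min(max(index, 0), len(buf) - 1)] = value


def write_wrap(buf, index, value):
    """Candidate patch 3: wrap the index around the buffer length."""
    if buf:
        buf[index % len(buf)] = value


# Hypothetical registry of candidates; a community would evaluate each one
# on different machines and keep only those that yield acceptable behavior.
CANDIDATE_PATCHES = {
    "discard": write_discard,
    "clamp": write_clamp,
    "wrap": write_wrap,
}

if __name__ == "__main__":
    for name, patched_write in CANDIDATE_PATCHES.items():
        buf = [0, 0, 0, 0]
        patched_write(buf, 6, 9)  # index 6 is past the end of a 4-slot buffer
        print(name, buf)
```

All three candidates prevent the write from landing outside the buffer, but they leave the program in different states, which is why the slide calls for evaluating patches on many machines and filtering out those that cause new problems.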

Data structure consistency learning and enforcement • Monitor data structures in successful runs • Machine learning generalizes to consistency properties • Use of a community minimizes over-fitting • Monitor executions for violations • Repair corrupt data structures • Learn which repairs are most successful • Helps eliminate incorrect constraints
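The learn, monitor, and repair loop on this slide can be illustrated with a deliberately simple constraint form: a value range per field. This is a sketch under stated assumptions, not the project's learning algorithm; the range-only constraints, the function names, and the clamp-to-range repair are all hypothetical simplifications.

```python
def learn_ranges(runs):
    """Infer a (low, high) range per field from observations of successful runs."""
    bounds = {}
    for observation in runs:
        for field, value in observation.items():
            lo, hi = bounds.get(field, (value, value))
            bounds[field] = (min(lo, value), max(hi, value))
    return bounds


def merge_ranges(a, b):
    """Combine constraints learned on two machines by widening each range.

    Pooling observations from the community is what keeps a single machine's
    narrow experience from over-fitting the constraints.
    """
    merged = dict(a)
    for field, (lo, hi) in b.items():
        if field in merged:
            mlo, mhi = merged[field]
            merged[field] = (min(mlo, lo), max(mhi, hi))
        else:
            merged[field] = (lo, hi)
    return merged


def violations(bounds, state):
    """Return the fields of `state` that fall outside their learned range."""
    return sorted(
        field
        for field, (lo, hi) in bounds.items()
        if field in state and not lo <= state[field] <= hi
    )


def repair(bounds, state):
    """Return a copy of `state` with each violating field clamped into range."""
    fixed = dict(state)
    for field in violations(bounds, state):
        lo, hi = bounds[field]
        fixed[field] = min(max(state[field], lo), hi)
    return fixed


if __name__ == "__main__":
    machine_a = learn_ranges([{"len": 0, "cap": 8}, {"len": 5, "cap": 8}])
    machine_b = learn_ranges([{"len": 7, "cap": 16}])
    community = merge_ranges(machine_a, machine_b)
    corrupt = {"len": -3, "cap": 12}
    print(violations(community, corrupt), repair(community, corrupt))
```

Clamping is only one possible repair; the slide's point about learning which repairs are most successful implies comparing several alternatives, in the same way candidate patches are compared.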

COTS applications • Pros: • Inexpensive, featureful, familiar, widely deployed • Cons: • Contain many (exploitable) bugs • No source code or debug symbols

Determina managed execution • Determina MPEE: Managed Program Execution Environment • Efficient emulation engine for x86 binaries • Typically <5% overhead: permits routine use • API: • Arbitrarily patch and modify the executable • Examine instructions before execution • Set breakpoints at which to suspend execution • Robust and scalable (e.g., Microsoft Office apps)
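The three API capabilities listed here (patching, examining instructions before execution, and breakpoints) can be pictured with a toy interpreter. This is emphatically not the Determina MPEE API, which the slides do not show; `ToyEngine`, its methods, and the two-opcode instruction set are invented purely to illustrate the concepts.

```python
class ToyEngine:
    """Toy managed-execution loop over (opcode, operand) pairs. Hypothetical."""

    def __init__(self, program):
        self.program = list(program)
        self.hooks = []          # callables run on each instruction before it executes
        self.breakpoints = set() # program counters at which to suspend

    def add_hook(self, hook):
        """Register hook(pc, instr); it may return a replacement instruction or None."""
        self.hooks.append(hook)

    def patch(self, pc, instr):
        """Overwrite the instruction stored at position pc."""
        self.program[pc] = instr

    def run(self, acc=0, pc=0):
        """Execute from pc; return (status, pc, accumulator)."""
        while pc < len(self.program):
            if pc in self.breakpoints:
                return ("suspended", pc, acc)
            instr = self.program[pc]
            for hook in self.hooks:
                instr = hook(pc, instr) or instr
            op, arg = instr
            if op == "add":
                acc += arg
            elif op == "mul":
                acc *= arg
            else:
                raise ValueError(f"unknown opcode: {op}")
            pc += 1
        return ("done", pc, acc)


if __name__ == "__main__":
    engine = ToyEngine([("add", 2), ("mul", 3), ("add", 1)])
    engine.breakpoints.add(1)
    status, pc, acc = engine.run()       # suspends before the multiply
    engine.patch(1, ("mul", 10))         # modify the program while suspended
    engine.breakpoints.discard(1)        # clear the breakpoint so execution can resume
    print(engine.run(acc=acc, pc=pc))
```

A real binary engine works on x86 basic blocks rather than single abstract instructions, so this sketch conveys only the control flow of hook, suspend, patch, and resume.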

Productization • Determina’s customers use its security products on commercial Windows applications • Determina partnership permits test and evaluation in COTS environments • If successful, integrate into Vulnerability Protection Suite™ product

Why this can succeed (now) • Technologies (bounds enforcement, constraint learning, and constraint enforcement) have been demonstrated in the lab • Experiments limited in some ways, but more thorough than typical initial research efforts • Determina toolset has unique capabilities • Application community permits faster and more accurate learning, and permits experimentation by reducing the cost of any single failure

Metrics • Tools for Windows binaries built on top of Determina products (MPEE, LiveShield™, etc.) • Bounds enforcement detects 95% of injected code attacks and (asymptotically) recovers from 60% of them • Data structure constraint learning and repair detects 50% of attacks and errors that corrupt data, and recovers from 30% of such errors and attacks

Outline of the presentation • Introduction/overview • Previous work on learning and repair of data structure consistency constraints • DARPA Self-Regenerative Systems program • Details on learning and repair components • Determina security products • Determina monitoring framework • Plans

Challenges • Performing whole-program analysis • Determina tools are basic-block oriented • Inferring types from the heap • Past work has relied on source code or debug symbols • Scaling research tools to very large systems • Focus on small parts of interest • Distribute work among many machines • Scale back parts of the algorithms • New repair algorithms: operate directly on data, tolerate potential conflicts among constraints • Better tolerate mislabeled inputs to the learning algorithm • Learning temporal sequences as well as data structure constraints

Activities • Injected code detection • Patch generation • Patch evaluation and filtering • Constraint learning • Constraint monitoring • Constraint repair • Repair evaluation and filtering • Infrastructure development • Evaluation

Phases of the project • Tool development • Tool integration • Experimentation • Deployment

Deliverables (1) • Enhanced Client Interface for MPEE (Determina) • Injected Code Detection (MIT) • Application State Probing (Determina) • Learning for Binaries (Determina and MIT) • LiveShield Constraint Creation Framework (Determina) • Data Structure Consistency Checking (MIT) • Patch Generation (MIT) • LiveShield Coordination Center (Determina) • Patch Distribution (MIT) • Hybrid System for Binary Analysis (Determina)

Deliverables (2) • Proactive Situation Awareness (Determina) • Vulnerability Analysis (Determina) • Custom Constraints (MIT) • Integration, Testing, and Deployment (Determina) • Alternative Repair Generation (MIT) • Merging of learning (MIT) • Type Inference for Heap Structures (MIT) • Dynamic Constraint Update (MIT) • Repair Evaluation and Filtering (MIT) • Patch Testing (MIT) • Patch Evaluation and Filtering (MIT)
