Securing web applications with static and dynamic information flow tracking. PEPM 2008. Monica S. Lam, Michael Martin, V. Benjamin Livshits, John Whaley.
INDEX
• AUTHOR
• INTRODUCTION
• SECURITY VULNERABILITY PATTERNS
• PQL LANGUAGE OVERVIEW
• TECHNIQUE
• EXPERIMENTAL RESULTS
• SOMETHING USEFUL FOR US?
Monica S. Lam (Stanford University)
• Professor in the Computer Science Department; founding CEO of moka5.
• ACM Fellow. She chaired the ACM SIGPLAN Programming Languages Design and Implementation Conference in 2000, and served on the Editorial Board of ACM Transactions on Computer Systems and on numerous program committees for conferences on languages and compilers (PLDI, POPL), operating systems (SOSP), and computer architecture (ASPLOS, ISCA).
• Research interests: mobile computing, decentralized social networks, programming and computing systems.
• Current research projects: PrPl, a Decentralized Open Trustworthy (DOT) social networking platform, part of the POMI 2020 (Programmable Open Mobile Internet) project.
• Previous research projects: Improving Program Robustness via Static Analysis and Dynamic Instrumentation; The Collective: an Appliance-Based Computing Architecture; The SUIF Compiler System.
Monica S. Lam (Stanford University)
• Recent Publications:
• Automatic dimension inference and checking for object-oriented programs. ICSE 2009
• Automatic Generation of XSS and SQL Injection Attacks with Goal-Directed Model Checking. USENIX Security Symposium 2008
• Automatic Inference of Stationary Fields: a Generalization of Java's Final Fields. POPL
• Selected Publications:
• A Practical Dynamic Buffer Overflow Detector. NDSS 2004
• Security and Manageability:
• Finding Security Vulnerabilities in Java Applications Using Static Analysis. USENIX Security Symposium
• Program Analysis:
• Context-Sensitive Program Analysis as Database Queries. SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems
• Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams. PLDI 2004 (Best Paper Award, ACM Programming Language Design and Implementation)
• A Practical Flow-Sensitive and Context-Sensitive C and C++ Memory Leak Detector. PLDI 2003
• Tracking Down Software Bugs Using Automatic Anomaly Detection. ICSE 2002
• Architecture:
• Limits of Control Flow on Parallelism. The 19th Annual International Symposium on Computer Architecture
Michael Martin (MIT)
• Professor in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, and a member of the Computer Science and Artificial Intelligence Laboratory.
• Research:
• Modular Program Analysis for Data Structure Consistency
• Acceptability-Oriented Computing
• Pointer and Escape Analysis
• Automatic Parallelization
• Commutativity Analysis
• Symbolic Analysis of Divide and Conquer Algorithms
• Synchronization Optimizations
• Credible Compilation
• Jade
Benjamin Livshits
• Researcher at Microsoft Research in Redmond, WA.
• Active projects:
• Gatekeeper: static analysis of JavaScript
• Nozzle: protecting browsers against heap spraying attacks
• Ripley: ensuring integrity of distributed Web applications
• Doloto: code splitting for faster Web 2.0 apps
• Merlin: improving and inferring specification for static analysis tools
• Retired projects:
• AjaxScope/Ajax View: distributed Web 2.0 monitoring
• Volta: advanced distributing tier-splitting .NET
• Spectator: detection and containment of JavaScript worms
• Griffin Software Security Project: protecting Web applications from security attacks using static and runtime analysis
• PQL project: specifying queries on program behavior
• bddbddb: declarative program analysis
• LAPSE: Web application security scanner for Java
• Checklipse: Finding Bugs in Eclipse Code using Eclipse
John Whaley (Moka5, Inc.)
• One of the founders of moka5. moka5 is devoted to the mission of making PCs easier to manage and use.
• Most of his research has been in the areas of compilers, program analysis, software engineering, and virtual machines. He is interested in dynamic compilation, optimization of object-oriented languages, and pointer analysis. He is also interested in using program analysis to automatically identify and fix software flaws, and in improving programmer productivity through the use of software tools.
• He received his Ph.D. in 2007 from Stanford University, working under Monica Lam in the SUIF group. From 2005 to 2007, he took a leave of absence from Stanford to help found moka5. He graduated from MIT in 1999 with a bachelor's degree in Computer Science and a master's degree in Electrical Engineering and Computer Science. At MIT, he worked mostly with Martin Rinard. From 1999 to 2000, he worked at IBM Tokyo Research Lab on their Java JIT compiler.
John Whaley (Moka5, Inc.)
• He designed and wrote the open source Joeq virtual machine and compiler infrastructure, used by many researchers throughout the world and as the basis for the compilers course at Stanford. He also created bddbddb, a powerful BDD-based program analysis tool with a very silly name.
• Open source projects:
• Joeq: a virtual machine and compiler infrastructure. LGPL.
• JavaBDD: an efficient BDD (binary decision diagram) library for Java. LGPL.
• bddbddb: BDD-Based Deductive DataBase, a tool for translating analysis specifications into efficient BDD implementations. LGPL.
• BuDDy: a popular open-source BDD library that he contributes to.
• MIT Flex compiler: a compiler infrastructure he worked on at MIT. GPL.
• Eclipse keepresident plugin: a simple plugin for Eclipse on Windows that keeps Eclipse from being swapped out, greatly reducing pause times.
• Maven: a software project management tool. He keeps his own modifications to Maven.
INTRODUCTION
• The security of Web applications has become increasingly important.
• The contributions are as follows:
• 1. Information tracking: the paper focuses on the use of information flow for securing web applications; the techniques described are also useful for other purposes, such as debugging and avoiding leakage of confidential data.
• 2. PQL: the paper describes a high-level declarative language called PQL (Program Query Language).
• 3. Integrating static and dynamic analyses. The techniques are:
    • Sound static information trackers using context-sensitive pointer alias analysis
    • Optimized dynamic instrumentation
    • Model checking
    • Dynamic error recovery
Overview of the PQL system
• PQL queries are systematically translated into Datalog queries.
• The Datalog queries are then translated into BDD operations, which makes it possible to encode the exponentially many calling contexts in large Java applications succinctly.
• The pointer alias analysis is also used to help resolve reflection accurately.
• Model checking uses an abstract model that abstracts only the user and the support libraries.
• The program's information flow is monitored dynamically.
Security Vulnerability Patterns
• 2.1 SQL Injection
• 2.2 Taint-Based Vulnerabilities
    • Cross-site scripting
    • HTTP response splitting
    • Path traversal
PQL Language Overview
• The focus of PQL is to track method invocations and accesses of fields and array elements in related objects.
• The authors model the dynamic program execution as a sequence of primitive events; the checkers find all subsequences that match the specified pattern.
• Abstract Execution Traces
• PQL Queries
Abstract Execution Traces
• The authors abstract the program execution as a trace of primitive events, each of which contains a unique event ID, an event type, and a list of attributes.
• All but the following eight event types are abstracted away:
• Field loads and stores. The attributes of these event types are the source object, the target object, and the field name.
• Array loads and stores. The attributes of these event types are the source and target objects. The array index is ignored.
• Method calls and returns. The attributes of these event types are the method invoked, the formal objects passed in as arguments, and the returned object. The return event also includes the ID of its corresponding call event.
• Object creations. The attributes of this event type are the newly returned object and its class.
• End of program. This event type has no attributes and occurs just before the Java Virtual Machine terminates.
Abstract Execution Traces
• Example 1. Abstract execution trace.
• We illustrate the concept of an abstract execution trace with the code below. The comments name the heap objects (o1 to o7) that each variable refers to over two loop iterations.
    int len = names.length;                          // names -> o1
    for (int i = 0; i < len; i++) {                  // names[i] -> o2, o6 (elements of o1)
        String s = request.getParameter(names[i]);   // request -> o3, s -> o4, o7
        con.execute(s);                              // con -> o5
    }
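To make the trace concrete, here is a minimal Java sketch of how the events for Example 1 could be represented. It is an illustration only, not the paper's implementation: the class, enum, and method names (AbstractTraceSketch, EventType, Event, exampleTrace) and the attribute encoding are assumptions, as is the choice of two loop iterations, which is read off the object labels (o2/o6 and o4/o7).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative model of an abstract execution trace; all names are hypothetical. */
final class AbstractTraceSketch {

    /** The eight event types that are kept in the abstract trace. */
    enum EventType {
        FIELD_LOAD, FIELD_STORE, ARRAY_LOAD, ARRAY_STORE,
        METHOD_CALL, METHOD_RETURN, OBJECT_CREATION, END_OF_PROGRAM
    }

    /** One primitive event: a unique ID, an event type, and a list of attributes. */
    static final class Event {
        final int id;
        final EventType type;
        final List<String> attributes;

        Event(int id, EventType type, String... attributes) {
            this.id = id;
            this.type = type;
            this.attributes = Arrays.asList(attributes);
        }

        @Override
        public String toString() {
            return id + " " + type + " " + attributes;
        }
    }

    /** Builds the events for the loop in Example 1, assuming two iterations. */
    static List<Event> exampleTrace() {
        List<Event> trace = new ArrayList<>();
        // Each row: {array element loaded from o1, string returned by getParameter}.
        String[][] iterations = { {"o2", "o4"}, {"o6", "o7"} };
        int id = 1;
        for (String[] it : iterations) {
            String element = it[0];
            String result = it[1];
            // names[i]: array load from o1; the index is not recorded.
            trace.add(new Event(id++, EventType.ARRAY_LOAD, "o1", element));
            // request.getParameter(names[i]): call, then return linked to the call ID.
            int callId = id++;
            trace.add(new Event(callId, EventType.METHOD_CALL, "getParameter", "o3", element));
            trace.add(new Event(id++, EventType.METHOD_RETURN, "getParameter", result, "call=" + callId));
            // con.execute(s): call, then return linked to the call ID.
            callId = id++;
            trace.add(new Event(callId, EventType.METHOD_CALL, "execute", "o5", result));
            trace.add(new Event(id++, EventType.METHOD_RETURN, "execute", "call=" + callId));
        }
        return trace;
    }

    public static void main(String[] args) {
        for (Event e : exampleTrace()) {
            System.out.println(e);
        }
    }
}
```

A checker looking for SQL injection would then search this sequence for a getParameter return whose object later appears as an argument of an execute call.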
PQL Queries
• The paper describes a high-level declarative language called PQL (Program Query Language).
• PQL allows programmers to describe a class of information flow as a pattern that resembles an excerpt of Java code.
• The system automatically detects, both statically and dynamically, information flow in a program that matches a specified pattern. In addition, the programmer can specify the corrective actions to take if such a pattern is detected dynamically.
• In this way, the program heals automatically rather than simply reporting an error and terminating. This auto-healing property is important because it prevents users from mounting a denial-of-service attack by crashing the program.
PQL Queries
• 3.2.1 Query Variables: Query variables correspond to objects in the program that are relevant to a match. The most common are object variables, which represent individual objects on the heap. Object variables have a class name that restricts the kind of object instances that they can match. There are also member variables, which represent the name of a field or a method. Query variables are either argument, return, or internal variables.
• 3.2.2 Statements: Most primitive statements in the query language correspond directly to the event types of the abstract execution trace.
• 3.2.3 Subqueries: Subqueries allow users to specify recursive event sequences or recursive object relations.
• 3.2.4 Reacting to a Match: Matches in PQL often correspond to notable or undesirable program behavior.
TECHNIQUE
• The techniques discussed in this paper:
• Context-Sensitive Static Information Tracking
• Dynamic Monitoring Code
• Model Checking
Context-Sensitive Static Information Tracking
• Sound information flow analysis is challenging because objects carrying the information may be passed around as heap references and method parameters throughout the program.
• The checkers use pointer information from a sound, cloning-based, context-sensitive, inclusion-based pointer alias analysis (PLDI 2004). This analysis computes the points-to relations for each distinct call path for programs without recursion. The points-to information is stored in bddbddb. The data are represented compactly as BDDs and can be accessed efficiently with queries written in the logic programming language Datalog. bddbddb is then used to resolve the queries.
• After running bddbddb, the result is a set of program objects that could participate in the match of each subquery.
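The idea of resolving an information-flow query as a Datalog-style fixpoint can be sketched in a few lines of Java. This is a deliberately simplified stand-in, not bddbddb: it uses explicit hash sets instead of BDDs, ignores calling contexts, and the relation names (source, flows, sink) and the class StaticTaintSketch are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Naive evaluation of three Datalog-like rules (hypothetical relation names):
 *   tainted(o)   :- source(o).
 *   tainted(o2)  :- tainted(o1), flows(o1, o2).
 *   violation(o) :- tainted(o), sink(o).
 */
final class StaticTaintSketch {
    private final Map<String, Set<String>> flows = new HashMap<>();
    private final Set<String> sources = new HashSet<>();
    private final Set<String> sinks = new HashSet<>();

    void addSource(String obj) { sources.add(obj); }

    void addSink(String obj) { sinks.add(obj); }

    /** Records that data may flow from one abstract object to another. */
    void addFlow(String from, String to) {
        flows.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** Computes the least fixpoint of the tainted relation with a worklist. */
    Set<String> tainted() {
        Set<String> tainted = new HashSet<>(sources);
        Deque<String> worklist = new ArrayDeque<>(sources);
        while (!worklist.isEmpty()) {
            String current = worklist.pop();
            for (String next : flows.getOrDefault(current, Collections.<String>emptySet())) {
                if (tainted.add(next)) { // only newly tainted objects are revisited
                    worklist.push(next);
                }
            }
        }
        return tainted;
    }

    /** Objects that are both tainted and used at a sink. */
    Set<String> violations() {
        Set<String> result = new TreeSet<>(tainted());
        result.retainAll(sinks);
        return result;
    }

    public static void main(String[] args) {
        StaticTaintSketch analysis = new StaticTaintSketch();
        analysis.addSource("h1");     // e.g. the string returned by getParameter
        analysis.addFlow("h1", "h2"); // e.g. a query string built from h1
        analysis.addSink("h2");       // e.g. the argument of execute
        System.out.println(analysis.violations());
    }
}
```

The set returned by violations() plays the role of the "program objects that could participate in a match"; a real analysis would derive the flows relation from the points-to results rather than take it as input.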
Dynamic Monitoring Code
• The authors use two main strategies to lower the overhead. First, they instrument the program to track objects only at program points that might generate an event of interest for the specific query. Second, instead of collecting full traces, their system tracks all the partial matches as the program executes and takes action immediately upon recognizing a match.
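The second strategy, tracking partial matches instead of recording a full trace, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the PQL runtime: the class DynamicMonitorSketch and its callback names are hypothetical, and the pattern is hard-coded as "an object returned by a source, or derived from one, reaches a sink".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical monitor that keeps only partial matches, not the whole trace. */
final class DynamicMonitorSketch {
    // Objects that have completed the first step of the pattern, compared by identity.
    private final Set<Object> partialMatches =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
    private final List<Object> matches = new ArrayList<>();

    /** Called by instrumentation when a source method returns a value. */
    void onSourceReturn(Object value) {
        partialMatches.add(value);
    }

    /** Called when one object is derived from another, e.g. by string concatenation. */
    void onDerived(Object from, Object to) {
        if (partialMatches.contains(from)) {
            partialMatches.add(to);
        }
    }

    /**
     * Called just before a sink method runs. Returns true when the pattern is
     * complete, so the caller can take corrective action immediately.
     */
    boolean onSinkCall(Object argument) {
        if (partialMatches.contains(argument)) {
            matches.add(argument);
            return true;
        }
        return false;
    }

    int matchCount() {
        return matches.size();
    }

    public static void main(String[] args) {
        DynamicMonitorSketch monitor = new DynamicMonitorSketch();
        String userInput = new String("x' OR '1'='1");
        String query = "SELECT * FROM t WHERE name = '" + userInput + "'";
        monitor.onSourceReturn(userInput);
        monitor.onDerived(userInput, query);
        System.out.println("match on sink call: " + monitor.onSinkCall(query));
    }
}
```

Because onSinkCall reports the match before the sink executes, the caller has a natural place to apply a corrective action, such as sanitizing the argument, instead of terminating the program.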
Model Checking
• Model checking helps programmers find and fix bugs. However, model checking is challenging for real programs.
• The authors developed a model checker called QED (Query-based Event Director). Instead of generating a large number of random input vectors to exercise the program, QED uses a goal-directed approach to generate the essential input vectors that exercise the portions of code that may harbor vulnerabilities.
• The input to QED is the Java bytecode instrumented to detect the vulnerability patterns of interest. The monitoring code is optimized so that only those sections that can participate in an attack, according to the context-sensitive information tracking analysis, are instrumented.
• For the sake of efficiency, they use non-deterministic choices that reflect the different paths to be explored, instead of executing a full object persistence layer.
SOMETHING USEFUL FOR US?
• Overview: the way they combine static analysis and dynamic analysis.
• Use of the bddbddb system.
• Dynamic monitoring code: tracks all the partial matches.
• Model checking: abstracts only the user and the support libraries.