Improving Software Security with Precise Static and Runtime Analysis Benjamin Livshits SUIF Compiler Group Computer Systems Lab Stanford University http://suif.stanford.edu/~livshits/work/griffin/
Security Vulnerabilities: Last 10 Years (chart; source: NIST/DHS Vulnerability DB)
“Today over 70% of attacks against a company's Web site or Web application come at the 'Application Layer' not the Network or System layer.” – Gartner Group
Which Security Vulnerabilities are Most Prevalent? • Analyzed 500 vulnerability reports from one week in November 2005 • 294 of them (58%) were Web application vulnerabilities (source: securityfocus.com)
Focusing on Input Validation Issues (chart: breakdown of Web application vulnerabilities; source: securityfocus.com)
SQL Injection Example • Web form allows the user to look up account details • Underneath: a Java J2EE Web application serving requests

  String username = req.getParameter("user");
  String password = req.getParameter("pwd");
  String query = "SELECT * FROM Users WHERE username = '" + username +
                 "' AND password = '" + password + "'";
  con.executeQuery(query);
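For contrast, here is a minimal sketch (not from the original slides) of the same lookup written as a parameterized query, which keeps user input out of the SQL text; it assumes the req and con objects from the example above.

  String username = req.getParameter("user");
  String password = req.getParameter("pwd");
  // The '?' placeholders are bound by the JDBC driver, so the input is treated as data, never as SQL.
  PreparedStatement stmt = con.prepareStatement(
      "SELECT * FROM Users WHERE username = ? AND password = ?");
  stmt.setString(1, username);
  stmt.setString(2, password);
  ResultSet rs = stmt.executeQuery();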
Injecting Malicious Data (1): the user submits the username bob and a password.

  String query = "SELECT * FROM Users WHERE username = 'bob' AND password = '********'";
Injecting Malicious Data (2): the user submits bob'-- as the username; the SQL comment turns the password check into dead text.

  String query = "SELECT * FROM Users WHERE username = 'bob'-- ' AND password = ''";
Injecting Malicious Data (3): the user submits bob'; DROP Users-- , piggybacking a destructive command onto the query.

  String query = "SELECT * FROM Users WHERE username = 'bob'; DROP Users-- ' AND password = ''";
Web Application Attack Techniques: command injection, SQL injection, HTTP request smuggling, cross-site tracing, path traversal, parameter manipulation, cross-site scripting, hidden field manipulation, HTTP request splitting, cookie poisoning, header manipulation, second-level injection
Web Application Attack Techniques, grouped:
1. Sources (inject): parameter manipulation, hidden field manipulation, header manipulation, cookie poisoning, second-level injection
2. Sinks (exploit): SQL injection, cross-site scripting, HTTP request splitting, path traversal, command injection
1. Parameter manipulation + 2. SQL injection = vulnerability
Goals of the Griffin Software Security Project
• Financial impact (Gartner group report): cost per incident $300,000+; total cost of online fraud $400B/year
• Recent incidents (2000–2006 timeline): Yahoo! mail (May 2006), sourceforge.net (Apr. 2006), Myspace.com (Apr. 2006), ebay.com (Apr. 2006), google.com (Apr. 2006), hotmail.com (Feb. 2006), ri.gov (Jan. 2006), …
Existing Application Security Solutions: client-side input validation, penetration testing (manual and automatic), application firewalls, code reviews
Outline: Overview of the Griffin Project (agenda: Static · Extensions · Dynamic · Experiments · Conclusions · Future)
Griffin Project: Framework Architecture
• Provided by user: vulnerability specification, application bytecode
• Static analysis [Livshits and Lam, Usenix Security '05] → vulnerability warnings. Pros: finds vulnerabilities early; explores all program executions; sound, finds all vulnerabilities of a particular kind. Cons: may still have imprecision.
• Dynamic analysis [Martin, Livshits, and Lam, OOPSLA '05] → instrumented application. Pros: keeps vulnerabilities from doing harm; can recover from exploits; no false positives. Cons: incurs an overhead.
Static Analysis for Bug Finding: Why Soundness?
• Recent bug-finding techniques: PREfix (Intrinsa) [SoftPractExp 2000], FindBugs [Hovemeyer, OOPSLA '04], Metal [Engler et al., PLDI '02], Saturn [Xie et al., FSE '05]
• These techniques lack soundness → best-effort attempts
• Find & fix common bugs: NULL dereferences, buffer overruns, format string violations, performance errors, memory leaks, data races, locking errors, TOCTOU
• Need to find & fix all: SQL injections, cross-site scripting
Following Unsafe Information Flow / Taint Flow
• Diagram: Servlet.getParameter("user") is a source; derivation methods such as String.substring and string concatenation ("…" + "…") propagate taint; sanitizers stop it; Statement.executeQuery(...) is a sink. Tainted data reaching a sink is a security violation.
• How do we know what these are?
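A small Java sketch of these roles (illustrative, not from the slides; req stands for the servlet request, stmt for a JDBC Statement, and escapeSql for a hypothetical sanitizer method):

  String user  = req.getParameter("user");                              // source: return value is tainted
  String hello = "Hello, " + user.substring(0);                         // derivations ("+", substring) propagate taint
  stmt.executeQuery("SELECT * FROM Log WHERE msg = '" + hello + "'");   // tainted data reaches the sink: violation
  String safe  = escapeSql(hello);                                      // hypothetical sanitizer: stops the taint
  stmt.executeQuery("SELECT * FROM Log WHERE msg = '" + safe + "'");    // sanitized flow: no violation reported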
Vulnerability Specification
• User needs to specify: source methods, sink methods, derivation methods, sanitization methods
• PQL: Program Query Language [Martin, Livshits, and Lam OOPSLA '05], a general language for describing events on objects
• Real queries are longer: 100+ lines of PQL, capture all vulnerabilities, suitable for all J2EE applications

  query simpleSQLInjection
  returns
    object String param, derived;
  uses
    object HttpServletRequest req;
    object Connection con;
    object StringBuffer temp;
  matches {
    param = req.getParameter(_);
    temp.append(param);
    derived = temp.toString();
    con.executeQuery(derived);
  }
Outline: Static Analysis (agenda: Static · Extensions · Dynamic · Experiments · Conclusions · Future)
Motivation: Why Pointer Analysis?
• What objects do username and str point to?
• This question is answered by pointer analysis, a classic compiler problem for 20+ years
• We rely on context-sensitive, inclusion-based pointer analysis [Whaley and Lam PLDI '04]

  String username = req.getParameter("user");
  list1.addFirst(username);
  ...
  String str = (String) list2.getFirst();
  con.executeQuery(str);
Pointer Analysis Precision
• Diagram: runtime heap objects o1, o2, o3 are approximated by a single static representation h
• Imprecision of pointer analysis → false positives
• Precision-enhancing pointer analysis features
Towards Greater Pointer Analysis Precision
• Precision of pointer analysis greatly affects end results
• A variety of pointer analysis techniques has been suggested (chart): base line, context sensitive [PLDI '04], object sensitive, map sensitive
Importance of Context Sensitivity
• Imprecision → excessive tainting → false positives
• Example: String id(String str) { return str; } is called from call site c1 with a tainted argument and from call site c2 with an untainted one
• A context-insensitive analysis, points-to(v : Var, h : Heap), merges the two calls, so the result at c2 is also reported as tainted
• A context-sensitive analysis, points-to(vc : VarContext, v : Var, h : Heap), keeps c1 and c2 apart
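The same situation as a Java sketch (illustrative; req and stmt are assumed from the earlier examples):

  String id(String str) { return str; }

  String a = id(req.getParameter("user"));   // call site c1: tainted argument, tainted result
  String b = id("constant");                 // call site c2: untainted argument
  stmt.executeQuery(b);                      // a context-insensitive analysis merges c1 and c2,
                                             // so b is conservatively tainted: a false positive;
                                             // a context-sensitive analysis keeps the calls apart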
Handling Containers: Object Sensitivity

  String s1 = new String(); // h1
  String s2 = new String(); // h2

  Map map1 = new HashMap();
  Map map2 = new HashMap();

  map1.put(key, s1);
  map2.put(key, s2);

  String s = (String) map2.get(key);

• Context sensitivity: points-to(vc : VarContext, v : Var, h : Heap)
• Object sensitivity: points-to(vo : Heap, v : Var, h : Heap)
• 1-level object sensitivity: points-to(vo1 : Heap, vo2 : Heap, v : Var, ho : Heap, h : Heap)
• 1-level object sensitivity + context sensitivity: points-to(vc : VarContext, vo1 : Heap, vo2 : Heap, v : Var, ho : Heap, h : Heap)
Inlining: Poor Man's Object Sensitivity
• Call graph inlining is a practical alternative
• Inline selected allocation sites: containers (HashMap, Vector, LinkedList, …) and String factories (String.toLowerCase(), StringBuffer.toString(), …)
• Generally gives precise, object-sensitive results
• Need to know what to inline, and determining that is hard: inlining too little → false positives; inlining too much → doesn't scale; an iterative process
• Can't always inline: recursion, virtual methods with >1 target
Map Sensitivity

  ...
  String username = request.getParameter("user");
  map.put("USER_NAME", username);
  ...
  String query = (String) map.get("SEARCH_QUERY");
  stmt.executeQuery(query);
  ...

• Maps with constant string keys are common in Java code
• Map sensitivity: augment the pointer analysis to model HashMap.put/get operations specially, so that "USER_NAME" ≠ "SEARCH_QUERY"
Pointer Analysis Hierarchy (each dimension from least to most precise)
• Context sensitivity: none → k-CFA → ∞-CFA (CFL reachability)
• Object sensitivity: none → 1-OS → k-OS → ∞-OS
• Map sensitivity: none → constant keys → symbolic analysis of keys
• Flow sensitivity: none → local flow → predicate-sensitive → interprocedural predicate-sensitive
• Configuration used here: ∞-CFA, partial 1-OS, constant string keys, local flow
PQL into Datalog Translation [Whaley, Avots, Carbin, Lam, APLAS '05]
• PQL query → Datalog query → Datalog solver → vulnerability warnings and relevant instrumentation points

PQL query:

  query simpleSQLInjection
  returns
    object String param, derived;
  uses
    object ServletRequest req;
    object Connection con;
    object StringBuffer temp;
  matches {
    param = req.getParameter(_);
    temp.append(param);
    derived = temp.toString();
    con.executeQuery(derived);
  }

Datalog query:

  simpleSQLInjection(hparam, hderived) :-
    ret(i1, v1), call(c1, i1, "ServletRequest.getParameter"),
    pointsto(c1, v1, hparam),
    actual(i2, v2, 0), actual(i2, v3, 1),
    call(c2, i2, "StringBuffer.append"),
    pointsto(c2, v2, htemp), pointsto(c2, v3, hparam),
    actual(i3, v4, 0), ret(i3, v5),
    call(c3, i3, "StringBuffer.toString"),
    pointsto(c3, v4, htemp), pointsto(c3, v5, hderived),
    actual(i4, v6, 0), actual(i4, v7, 1),
    call(c4, i4, "Connection.execute"),
    pointsto(c4, v6, hcon), pointsto(c4, v7, hderived).
Eclipse Interface to Analysis Results • Vulnerability traces are exported into Eclipse for review • source → o1 → o2 → … → o_n → sink
Importance of a Sound Solution • Soundness is the only way to provide guarantees on an application's security posture, and it allows us to remove instrumentation points for the runtime analysis • Soundness claim: our analysis finds all vulnerabilities in statically analyzed code that are captured by the specification
Outline: Static Analysis Extensions (agenda: Static · Extensions · Dynamic · Experiments · Conclusions · Future)
Towards Completeness • Completeness goal: analyze all code that may be executed at runtime • Two steps: specify roots, then call graph discovery
Generating a Static Analysis Harness

  public class Harness {
    public static void main(String[] args) {
      processServlets();
      processActions();
      processTags();
      processFilters();
    }
    ...
  }

Generated from each application's web.xml, e.g.:

  <servlet>
    <servlet-name>blojsomcommentapi</servlet-name>
    <servlet-class>
      org.blojsom.extension.comment.CommentAPIServlet
    </servlet-class>
    <init-param>
      <param-name>blojsom-configuration</param-name>
    </init-param>
    <init-param>
      <param-name>smtp-server</param-name>
      <param-value>localhost</param-value>
    </init-param>
    <load-on-startup>3</load-on-startup>
  </servlet>

• Applications: 500,000+ lines of code; application server (JBoss): 1,500,000+ lines of code; 2M+ lines of code total
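As an illustration, a stub for the servlet declared above might look roughly like this; a sketch under the assumption that the generator instantiates each declared servlet and drives its lifecycle methods (makeRequest and makeResponse are hypothetical factories producing opaque request/response objects):

  private static void processServlets() throws Exception {
      // One stub per <servlet> entry found in web.xml
      org.blojsom.extension.comment.CommentAPIServlet servlet =
          new org.blojsom.extension.comment.CommentAPIServlet();
      servlet.init();                              // lifecycle entry point
      HttpServletRequest req   = makeRequest();    // hypothetical: opaque, attacker-controlled request
      HttpServletResponse resp = makeResponse();   // hypothetical: opaque response
      servlet.service(req, resp);                  // makes doGet/doPost reachable for the analysis
      servlet.destroy();
  }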
Reflection in Java Applications
• Reflection is very common in large applications
• Often used to load plug-ins, avoid incompatibilities, etc.

  ...
  try {
    Class macOS = Class.forName("gruntspud.standalone.os.MacOSX");
    Class argC[] = { ViewManager.class };
    Object arg[] = { context.getViewManager() };
    Method init = macOS.getMethod("init", argC);
    Object obj = macOS.newInstance();
    init.invoke(obj, arg);
  } catch (Throwable t) {
    // not on macos
  }
  ...
Reflection Resolution [Livshits, Whaley, and Lam, APLAS '05]
• Reflective calls are resolved using constants and user-provided specification points

Original code. Q: what object does this create?

  String className = ...;
  Class c = Class.forName(className);
  Object o = c.newInstance();
  T t = (T) o;

After resolution, the newInstance call is replaced by the possible concrete allocations:

  String className = ...;
  Class c = Class.forName(className);
  Object o = new T1();   // or
  Object o = new T2();   // or
  Object o = new T3();
  T t = (T) o;
Reflection Resolution Results • Applied to 6 large Java apps, 190,000 lines of code combined • (chart: call graph sizes, measured in methods, compared)
Outline: Dynamic Analysis (agenda: Static · Extensions · Dynamic · Experiments · Conclusions · Future)
Runtime Vulnerability Prevention [Martin, Livshits, and Lam, OOPSLA '05]
• Applications running on the application server (JBoss) are instrumented according to the vulnerability specification
• Two modes: detect and stop, or detect and recover
Runtime Instrumentation Engine
• The PQL spec is translated into state machines
• These run alongside the program and keep track of partial matches; for the SQL injection query, for example, a machine advances on events such as t.append(x), t = x.toString(), and y := derived(t), tracking states like {x = o3} and {x = y = o3}, while a sanitizer transition kills the partial match (STOP)
• Recovery code is run before a match completes
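A minimal hand-written sketch of the idea behind such a matcher (illustrative only, not Griffin's generated code): instrumentation calls into a monitor at source, derivation, sanitizer, and sink events, and recovery code runs before the match completes.

  import java.util.Collections;
  import java.util.IdentityHashMap;
  import java.util.Set;

  public class TaintMonitor {
      // Identity-based set of objects currently known to carry tainted data
      private static final Set<Object> tainted =
          Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

      public static void source(Object o) { tainted.add(o); }        // e.g., result of getParameter
      public static void derive(Object from, Object to) {            // e.g., append, toString
          if (tainted.contains(from)) tainted.add(to);
      }
      public static void sanitize(Object o) { tainted.remove(o); }   // sanitizer kills the partial match
      public static void sink(Object o) {                            // e.g., argument of executeQuery
          if (tainted.contains(o)) {
              // Recovery point: stop the request (or repair the data) before the exploit completes
              throw new SecurityException("tainted data reached a sensitive sink");
          }
      }
  }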
Reducing Instrumentation Overhead
• Instrument only events on objects that may be part of a match
• Soundness of the static analysis allows us to remove the remaining instrumentation points

  query simpleSQLInjection
  returns
    object String param, derived;
  uses
    object ServletRequest req;
    object Connection con;
    object StringBuffer temp;
  matches {
    param = req.getParameter(_);
    temp.append(param);
    derived = temp.toString();
    con.executeQuery(derived);
  }

  String name = req.getParameter("name");      // instrumented: source
  StringBuffer buf1 = new StringBuffer();
  StringBuffer buf2 = new StringBuffer("def");
  buf2.append("abc");                          // not instrumented: buf2 never carries tainted data
  buf1.append(name);                           // instrumented: taint flows into buf1
  con.executeQuery(buf1.toString());           // instrumented: possible match
  con.executeQuery(buf2.toString());           // not instrumented: statically proven safe
Outline: Experimental Results (agenda: Static · Extensions · Dynamic · Experiments · Conclusions · Future)
Experimental Evaluation • Comprehensive evaluation: • SecuriBench Macro [SoftSecTools ’05] • SecuriBench Micro • Google: SecuriBench • Compare Griffin to a commercially available tool • Griffin vs. Secure Software CodeAssure • CodeAssure: March 2006 version
Benchmark Statistics (chart: lines of code for each benchmark)
Vulnerability Classification • Reported issues back to program maintainers • Most of them responded, most confirmed as exploitable • Vulnerability advisories issued
A Study of False Positives in blojsom
Q: How important are analysis features for avoiding false positives?
• Base: 114 false positives
• With context sensitivity: 84
• With object sensitivity: 43
• With map sensitivity: 5
• With sanitizers added: 0
Griffin vs. CodeAssure
Q: What is the relationship between false positives and false negatives?
(charts: results on SecuriBench Macro, 80+, and SecuriBench Micro, 40+)
Deep vs. Shallow Vulnerability Traces Q: How complex are the vulnerabilities we find?
Analyzing personalblog
Q: What is the connectivity between sources and sinks?
(figure: sources and sinks such as sf.hibernate.Session.find(…), connected through objects; Hibernate library code vs. application code)
• In roller, 1 falsely tainted object → 100+ false positives