Learn how to prevent JavaScript vulnerabilities to protect data integrity and enhance web security. Understand taint analysis, sources, sinks, and sanitizers. Get insights into complex JavaScript functionalities and taint analysis methods.
Saving the World Wide Web from Vulnerable JavaScript
International Symposium on Software Testing and Analysis (ISSTA 2011)
Salvatore Guarnieri, IBM Software Group, sguarni@us.ibm.com
Marco Pistoia, IBM T. J. Watson Research Center, pistoia@us.ibm.com
Omer Tripp, IBM Software Group, omert@il.ibm.com
Stephen Teilhet, IBM Software Group, steilhet@us.ibm.com
Julian Dolby, IBM T. J. Watson Research Center, dolby@us.ibm.com
Ryan Berg, IBM Software Group, ryan.berg@us.ibm.com
www.research.ibm.com/labasec
Consequences of Taint Violations
• Read and write access to saved data in cookies and local data stores
• Read and write access to data in the web page
• Key loggers
• Impersonation
• Phishing via page modifications or redirects
• Getting data from the DOM
• Sanitizing some, but not all, of the data
• Writing untrusted data into the web page
• Writing unchecked data to the web page

    var el1 = document.getElementById("d1");
    function foo() {
      var el2 = document.getElementById("d2");
      function bar() {
        var el3 = new Element();
        var s = encodeURIComponent(el2.innerText);
        document.write(s);
        el1.innerHTML = el2.innerText;
        document.location = el3.innerText;
      }
      bar();
    }
    foo();
    function baz(a, b) {
      a.f = document.URL;
      document.write(b.f);
    }
    var x = new Object();
    baz(x, x);
• Motivation
• Sources, Sinks, and Sanitizers
• Taint Analysis
• Results
Rules
• A rule is a triple <Sources, Sinks, Sanitizers>
• Not all sources are valid for all sinks, and not all sanitizers are valid for all sinks
• Sources
  • Seeds of untrusted data
  • Field gets or returns of function calls
  • Ex: document.URL
• Sinks
  • Security-critical operations
  • Field puts or parameters to function calls
  • Ex: element.innerHTML
• Sanitizers
  • Mark a flow as non-dangerous
  • Function calls
  • Ex: encodeURIComponent(str)
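The rule triple can be sketched as a small data structure. This is only an illustration: the object shape, the `violates` helper, and the particular members of each set are assumptions made for the example, not the format used by the tool described in the talk.

```javascript
// Hypothetical rule: the shape and the set members are illustrative assumptions.
const xssRule = {
  sources: new Set(["document.URL", "window.location", "element.innerText"]),
  sinks: new Set(["element.innerHTML", "document.write"]),
  sanitizers: new Set(["encodeURIComponent"]),
};

// A flow violates a rule only if it starts at one of the rule's sources,
// ends at one of the rule's sinks, and passes through none of its sanitizers.
function violates(rule, flow) {
  return (
    rule.sources.has(flow.source) &&
    rule.sinks.has(flow.sink) &&
    !flow.through.some((fn) => rule.sanitizers.has(fn))
  );
}

const unsanitized = { source: "document.URL", sink: "document.write", through: [] };
const sanitized = {
  source: "element.innerText",
  sink: "document.write",
  through: ["encodeURIComponent"],
};

console.log(violates(xssRule, unsanitized)); // true
console.log(violates(xssRule, sanitized));   // false
```

Keeping sources, sinks, and sanitizers together in one rule is what lets a sanitizer count for some sinks and not for others: a different rule can list the same source with a different sanitizer set.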
Complexities of JavaScript
• Reflective property access
    var a = "foo" + "bar";
    var b = obj[a];
• Prototype chain property lookup
    function F() { this.bar = document.URL; }
    function G() { }
    G.prototype = new F();
    var a = new G();
    write(a.bar);
• Lexical scoping
    function foo() {
      var y = 42;
      var bar = function() { write(y); }
    }
• Function pointers
    var m = function() ...
    var k = function(f) { f(); }
    k(m);
• eval and its relatives
    eval("document.write('evil')");
Demand Driven Taint Analysis
• The seeds are the assignments to sources or return values from sources
• The analysis proceeds by tainting variables
• Variables consist of triplets:
  • Static Single Assignment (SSA) variable ID
  • Method where SSA variable is defined
  • Access path
• Ex: (v7, m, <f, g>)
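One way to picture the taint-variable triplet is as a plain record plus a lookup over the set of tainted variables. The sketch below is an assumption-laden illustration: the names `taintVariable`, `pathMatches`, and `isTainted` are hypothetical, and `*` is read here as "any remaining fields", which the slides do not define.

```javascript
// Hypothetical encoding of a taint variable:
// (SSA variable ID, method where it is defined, access path).
function taintVariable(ssaId, method, accessPath) {
  return { ssaId, method, accessPath };
}

// Assumption: "*" in a path stands for any remaining suffix of field names.
function pathMatches(pattern, path) {
  for (let i = 0; i < pattern.length; i++) {
    if (pattern[i] === "*") return true;
    if (i >= path.length || pattern[i] !== path[i]) return false;
  }
  return pattern.length === path.length;
}

// A field access is tainted if some recorded triplet covers it.
function isTainted(taintSet, ssaId, method, path) {
  return taintSet.some(
    (tv) => tv.ssaId === ssaId && tv.method === method && pathMatches(tv.accessPath, path)
  );
}

const taintSet = [
  taintVariable("v7", "m", ["f", "g"]),
  taintVariable("v2", "foo", ["f", "*"]),
];

console.log(isTainted(taintSet, "v7", "m", ["f", "g"]));        // true
console.log(isTainted(taintSet, "v7", "m", ["f"]));             // false
console.log(isTainted(taintSet, "v2", "foo", ["f", "h", "k"])); // true
```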
Context Sensitive Taint Analysis
• Start from taint sources
• Propagate taint intra-procedurally through def-use
• Inter-procedurally propagate taint forward
• Resolve aliasing by using Andersen alias analysis
• Record constraints on call sites, recursively
• In the final constraint-propagation graph, detect paths between sources and sinks not intercepted by sanitizers
[Figure: m1(), m2(p1, p2, p3), m3(q1, q2)]
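The last step, finding source-to-sink paths that no sanitizer intercepts, can be sketched as a graph search. This is a simplified illustration, not the analysis from the talk: it leaves out context sensitivity and alias resolution, the graph is hand-built from the earlier example, and the function name `findTaintedPaths` and the node labels are assumptions.

```javascript
// graph: Map from a node label to the labels that data flows to next.
// Returns one path per (source, reachable sink) pair that avoids all sanitizers.
function findTaintedPaths(graph, sources, sinks, sanitizers) {
  const violations = [];
  for (const source of sources) {
    const stack = [[source]];
    const visited = new Set([source]);
    while (stack.length > 0) {
      const path = stack.pop();
      const node = path[path.length - 1];
      if (sinks.has(node)) {
        violations.push(path);
        continue;
      }
      for (const next of graph.get(node) || []) {
        // A sanitizer node cuts the flow, so the search never goes through it.
        if (sanitizers.has(next) || visited.has(next)) continue;
        visited.add(next);
        stack.push([...path, next]);
      }
    }
  }
  return violations;
}

// Hand-built flow graph for the earlier example (an assumption for illustration).
// The edge a.f -> b.f stands for the aliasing of a and b in baz(x, x).
const graph = new Map([
  ["el2.innerText", ["encodeURIComponent", "el1.innerHTML"]],
  ["encodeURIComponent", ["s"]],
  ["s", ["document.write(s)"]],
  ["document.URL", ["a.f"]],
  ["a.f", ["b.f"]],
  ["b.f", ["document.write(b.f)"]],
]);

const violations = findTaintedPaths(
  graph,
  ["el2.innerText", "document.URL"],
  new Set(["el1.innerHTML", "document.write(s)", "document.write(b.f)"]),
  new Set(["encodeURIComponent"])
);

for (const path of violations) console.log(path.join(" -> "));
```

The sanitized flow into `document.write(s)` is not reported because every route to it passes through `encodeURIComponent`.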
Analysis Example

    function foo(p1, p2) {
      p1.f = p2.f;
    }
    var a = new Object();
    var b = new Object();
    b.f = window.location.toString();
    var c = new Object();
    var d = new Object();
    d.f = "safe";
    foo(a, b);
    foo(c, d);
    document.write(a.f); // This is a taint violation
    document.write(c.f); // This is NOT a taint violation

• Taint variable: (v2, foo, <f, *>)
• Install taint summary for foo: p2.f -> p1.f
• Since d.f is not tainted, c.f will not be tainted
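The role of the summary `p2.f -> p1.f` can be illustrated by applying it at each call site separately. The sketch below is a toy model under stated assumptions: tainted locations are tracked as plain `"variable.field"` strings, and the summary format and the `applySummary` helper are hypothetical.

```javascript
// Hypothetical summary for foo(p1, p2): taint on p2.f flows to p1.f.
// Parameters are identified by position (p1 is 0, p2 is 1).
const fooSummary = [{ from: { param: 1, field: "f" }, to: { param: 0, field: "f" } }];

// Apply a summary at one call site: args holds the actual argument names.
function applySummary(summary, args, tainted) {
  for (const { from, to } of summary) {
    if (tainted.has(`${args[from.param]}.${from.field}`)) {
      tainted.add(`${args[to.param]}.${to.field}`);
    }
  }
}

// b.f is seeded from window.location.toString(); d.f holds the literal "safe".
const tainted = new Set(["b.f"]);

applySummary(fooSummary, ["a", "b"], tainted); // foo(a, b)
applySummary(fooSummary, ["c", "d"], tainted); // foo(c, d)

console.log(tainted.has("a.f")); // true
console.log(tainted.has("c.f")); // false
```

Because the summary is evaluated per call site, the taint entering through `foo(a, b)` does not leak into the result of `foo(c, d)`.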
Data Sets
• Developed a micro-benchmark suite of about 150 test scripts
• Downloaded Web pages and ran Actarus on them
Real-World Data Set
• Crawled portions of top Alexa Web sites and downloaded pages to disk
• Ran Actarus on a sample of the saved pages
• Ran on over 12,000 pages
• Successfully analyzed over 9,000 pages
• ~22% failure due to a 4-minute timeout
Findings
• Several vulnerable Web sites were found
• Duplicates of vulnerabilities were found on many pages from the same site
• Some exploits were found in third-party code that was shared among several Web sites
• 40% true positive rate
• Vulnerabilities can be fixed with common sanitization routines
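The slides do not say which sanitization routines were meant. As one common example, the sketch below escapes the characters that are significant in HTML before untrusted text is written into a page. The function name `escapeHtml` is an assumption, and this kind of escaping only covers HTML text and quoted attribute values, not URL or script contexts.

```javascript
// Escape the characters that have special meaning in HTML markup.
// "&" is replaced first so that later replacements are not escaped twice.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const untrusted = '<img src=x onerror="alert(1)">';
console.log(escapeHtml(untrusted));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

In a browser, assigning untrusted text to `textContent` instead of `innerHTML` avoids HTML parsing altogether and is often the simpler fix.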
User-Friendly Output
• Flows are highlighted and numbered in the source code
• JavaScript is pretty-printed to improve readability and the usefulness of line numbers
Future Work
• Use string analysis to reduce false positives
• Make the analysis modular so library code does not have to be reanalyzed
Thank You E-mail: sguarni@us.ibm.com