The SICILIAN Defense: Signature-based Whitelisting of Web JavaScript. Pratik Soni, Enrico Budianto, and Prateek Saxena. National University of Singapore. Application Whitelisting in Various Platforms. Whitelisting on the web? Problem: Script Injection Attacks.
The SICILIAN Defense: Signature-based Whitelisting of Web JavaScript • Pratik Soni, Enrico Budianto, and Prateek Saxena • National University of Singapore
Application Whitelisting in Various Platforms Whitelisting on the web?
Problem: Script Injection Attacks • Malicious CDN [1] • Typosquatting XSS [2], cross-channel scripting [3], etc. • XSS attacks [Diagram labels: web server, CDN server, bad link, bad input] • Key Idea: The browser should whitelist scripts before execution
Our Contributions • Large-scale study of JS changes: 45,066 web pages and 33,302 scripts from Alexa’s Top 500 sites + 15 popular PHP applications, over a 3-month period starting January 2015 • SICILIAN: the first practical signature-based whitelisting • A browser-assisted deployment model called progressive lockdown
Existing W3C Proposal: SRI • W3C Recommendation: Subresource Integrity (SRI) → raw signature [Diagram labels: website comp.nus.edu.sg, script jquery.min.js from jquery.com, integrity value sha256-C6CB9U…qFQmYg=]
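As an illustration of what a raw signature pins down, here is a minimal Node.js sketch that computes an SRI-style integrity value (algorithm name, a dash, then the base64 digest of the exact script bytes). The helper name `sriIntegrity` and the sample script bodies are assumptions made for this example; they do not come from the slides.

```javascript
const { createHash } = require("crypto");

// Build an SRI-style value: "<algorithm>-<base64 digest of the exact bytes>".
// (Hypothetical helper for illustration.)
function sriIntegrity(scriptSource, algorithm = "sha256") {
  const digest = createHash(algorithm).update(scriptSource, "utf8").digest("base64");
  return `${algorithm}-${digest}`;
}

// Hypothetical script bodies.
const original = "function greet(){ return 'hi'; }";
const tampered = "function greet(){ return 'hi'; } stealCookies();";

// The page author pins the value computed over the original bytes.
const pinned = sriIntegrity(original);

console.log(sriIntegrity(original) === pinned); // true: unchanged script is accepted
console.log(sriIntegrity(tampered) === pinned); // false: any byte change is rejected
```

Because the digest covers every byte, the same check that rejects an injected script also rejects a harmless edit, which is the practicality question the next slides examine.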
Is Raw Signature Effective & Practical? • Of the 33,302 scripts, 30,989 (97.99%) remain static and 2,312 change • Only 69 out of 500 websites have all of their scripts remain static • How often does a script change?
Is SRI Effective? [Chart annotations: 200 domains, 127 domains] Can we do better?
Categories of Changes in JavaScript • C1: Syntactic Changes • Changes that affect only the syntactic structure of code • C2: Data-only • Changes in JS literals • C3: Semantic Changes • Changes that introduce new JS code • C3A: Infrequent Changes • C3B: Frequent Changes • Our focus for constructing a more relaxed signature
Examples: C1 (Syntactic Change) • Comments: /*requested Sun, 15 Feb 2015 12:33:27 GMT*/ (function(){geolocation = {}; …})(); versus /*requested Sat, 14 Feb 2015 8:10:57 GMT*/ (function(){geolocation = {}; …})(); • Variable renaming: var h = parseInt(o.css("margin-top")), f = h + d; versus var f = parseInt(o.css("margin-top")), h = f + d;
Sinks in C2 (Data Change) • document.write('<link rel=\"stylesheet\" href=\"http:\/\/staticd.cdn.industrybrains.com\/css\/zones\/zone796.css\" type=\"text\/css\" \/>\n') versus document.write('<link rel=\"stylesheet\" href=\"http:\/\/staticd.cdn.industrybrains.com\/css\/zones\/zone832.css\" type=\"text\/css\" \/>\n')
A better whitelisting scheme that tolerates benign JS changes?
Overview of Whitelisting Process [Diagram labels: foo.com server, Script #1 (S# = pqrst); bar.com server, Script #2 (S# = abcde); foo.com admin, Update #2 (S# = fghij); foo.com]
Design Goals • The attacker must not be able to forge the signature • Execution of valid scripts only • Collision-hardness property • The signature scheme must be robust against benign JS changes (C1 and C2) • The signature scheme should require infrequent signature updates
Building Structural Signature • Naïve extension of raw signature: hash the code with its literal values removed • // Version 1 function Collection(){ var obj = '<div>ADS-CONTENT<div>'; this.items = []; } • SHA256('function Collection(){ var obj =; this.items = []; }') • Source: img.ifeng.com
Building “Structural” Signature • Challenge #1: Syntactic changes (minification) • Version 1 gives SHA1('function Collection(){ var obj =; this.items = []; }') • // Version 2 function Collection(){var obj = '<div>ADS-CONTENT<div>'; this.items=[];} gives SHA1('function Collection(){var obj=; this.items=[];}') • Solution #1: Use the AST as the representation basis for constructing the signature • Source: img.ifeng.com
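The limitation of the naïve text-based relaxation can be reproduced with a short Node.js sketch. It blanks out string literals with a simple regular expression and hashes what is left; the regex, the helper name `naiveSignature`, and the second ad string are assumptions for this illustration, and a real implementation would need a proper tokenizer.

```javascript
const { createHash } = require("crypto");

// Naive relaxation (illustrative only): drop string literals, hash the rest.
function naiveSignature(src) {
  const withoutStrings = src.replace(/'[^']*'|"[^"]*"/g, "");
  return createHash("sha256").update(withoutStrings).digest("hex");
}

// Version 1 from the slide, a data-only variant, and the minified Version 2.
const v1 = "function Collection(){ var obj = '<div>ADS-CONTENT<div>'; this.items = []; }";
const v1OtherAd = "function Collection(){ var obj = '<div>OTHER-AD<div>'; this.items = []; }";
const v2 = "function Collection(){var obj='<div>ADS-CONTENT<div>';this.items=[];}";

console.log(naiveSignature(v1) === naiveSignature(v1OtherAd)); // true: data change tolerated
console.log(naiveSignature(v1) === naiveSignature(v2));        // false: whitespace still breaks it
```

The second comparison fails only because of whitespace, which is why a representation that abstracts away formatting, such as the AST, is a more suitable basis.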
Signature Computation • H = cryptographic hash function (SHA256) • S#(n) = hash value at node n • lName = the node’s label name • For a node n3 with children n1 and n2: S#(n3) = H(H(lName) || H(S#(n1) || S#(n2))) • The signature of the script is S#(nr), where nr is the root (Program) node [AST diagram for var x = 10;: Program (nr) → Variable Declaration → Variable Declarator (n3, n4) → Identifier x (n1), Literal 10 (n2), …]
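The bottom-up computation can be sketched in Node.js over a hand-built tree. This is one plausible reading of the slide, assuming that a node's hash combines the hash of its label with the hash of its concatenated child signatures, and that literal values are left out of the label so data-only changes do not matter. The node shape (`type`, `children`), the helper names, and the choice to keep identifier names in the label are assumptions for this example.

```javascript
const { createHash } = require("crypto");
const H = (s) => createHash("sha256").update(s).digest("hex");

// Label of a node: its type, plus the name for identifiers (assumption).
// Literal values are deliberately omitted so data-only changes are ignored.
function label(node) {
  return node.type === "Identifier" ? `Identifier:${node.name}` : node.type;
}

// Assumed combination rule: S#(n) = H(H(label) || H(S#(c1) || ... || S#(ck))).
function signature(node) {
  const childSigs = (node.children || []).map(signature);
  return H(H(label(node)) + H(childSigs.join("")));
}

// Hand-built tree for `var <name> = <value>;` (simplified, not a full ESTree).
function varDecl(name, value) {
  return {
    type: "Program",
    children: [{
      type: "VariableDeclaration",
      children: [{
        type: "VariableDeclarator",
        children: [{ type: "Identifier", name }, { type: "Literal", value }],
      }],
    }],
  };
}

const base = signature(varDecl("x", 10));
const dataChange = signature(varDecl("x", 42)); // only the literal differs
const codeChange = signature(varDecl("y", 10)); // a different identifier

console.log(base === dataChange); // true
console.log(base === codeChange); // false
```

Since the hash is computed on the tree rather than on the text, whitespace and comments never enter the computation at all.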
Building Structural Signature • Challenge #2: Permutations of unordered properties • Isomorphism 1: Node permutation • "wgAvailableSkins":{"cologneblue":"CologneBlue", "myskin":"MySkin","simple":"Simple", "modern":"Modern", "nostalgia":"Nostalgia", "monobook":"MonoBook", "standard":"Standard", "chick":"Chick"} is isomorphic to "wgAvailableSkins":{"cologneblue":"CologneBlue", "monobook":"MonoBook","myskin":"MySkin", "simple":"Simple","chick":"Chick","modern":"Modern", "nostalgia":"Nostalgia","standard":"Standard"} • Source: www.google.com/jsapi
Computing Structural Signature • The child signatures of an unordered node are sorted before they are combined: SORT(S#(n1), S#(n2), S#(n3)) [AST diagram: key "wgAvailableSkins" with an ObjectExpression value; child nodes "cologneblue", "simple", "myskin", …; example child signatures S#(n1) = abc123, S#(n2) = def123, S#(n3) = ghi123]
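A minimal Node.js sketch of the sorting step follows. It assumes that only `ObjectExpression` children are treated as unordered, and it reuses an assumed combination rule (hash of the label joined with the hash of the child signatures). The node shape and the helper names `prop`, `obj`, and `fn` are hypothetical.

```javascript
const { createHash } = require("crypto");
const H = (s) => createHash("sha256").update(s).digest("hex");

// Node types whose children are treated as unordered in this sketch.
const UNORDERED = new Set(["ObjectExpression"]);

function signature(node) {
  let childSigs = (node.children || []).map(signature);
  // Sorting makes the result independent of the order of the children.
  if (UNORDERED.has(node.type)) childSigs = [...childSigs].sort();
  const label = node.key !== undefined ? `${node.type}:${node.key}` : node.type;
  return H(H(label) + H(childSigs.join("")));
}

// Hypothetical builders for two tiny tree shapes.
const prop = (key) => ({ type: "Property", key, children: [{ type: "Literal" }] });
const obj = (...keys) => ({ type: "ObjectExpression", children: keys.map(prop) });
const fn = (...params) => ({
  type: "FunctionDeclaration",
  children: params.map((p) => ({ type: "Identifier", key: p })),
});

const skinsA = signature(obj("cologneblue", "myskin", "simple"));
const skinsB = signature(obj("simple", "cologneblue", "myskin"));
const testA = signature(fn("foo", "bar"));
const testB = signature(fn("bar", "foo"));

console.log(skinsA === skinsB); // true: property order does not matter
console.log(testA === testB);   // false: parameter order is preserved
```

Keeping the set of unordered node types small is the conservative choice here: anything not listed keeps its order, so reordering function parameters still changes the signature.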
Building Structural Signature • Challenge #3: Variable renaming, e.g. var h = parseInt(o.css("margin-top")), f = h + d; versus var f = parseInt(o.css("margin-top")), h = f + d; • Solution #1: Completely ignore variable names • Problem: var bob = "<div>ADS</div>"; bob = "function evil(){}; evil()"; document.write(bob); and var alice = "<div>ADS</div>"; bob = "function evil(){}; evil()"; document.write(alice); both reduce to var xxx = "<div>ADS</div>"; xxx = "function evil(){}; evil()"; document.write(xxx);
Isomorphism 2: Label Renaming • Key Idea: Replace each identifier (variable name) with something that uniquely identifies it • Our proposal: The variable’s usage pattern • Example: var x = 10, y; y = x + 1 • Structural identity of x: assigned the value ‘10’; becomes part of an arithmetic operation (RHS) • Structural identity of y: assigned the value ‘undefined’; assigned a value that results from an arithmetic operation (LHS)
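The usage-pattern idea can be sketched in Node.js as follows. The tree is a simplified, hand-built stand-in for a parser's output (for instance, the assignment is not wrapped in an expression statement), and the context strings, helper names, and fallback to the original name when two identifiers share a pattern are assumptions made for this example.

```javascript
const { createHash } = require("crypto");
const H = (s) => createHash("sha256").update(s).digest("hex");

// Collect, per identifier, the ordered list of contexts in which it is used.
function usagePatterns(program) {
  const uses = new Map();
  const note = (name, what) => uses.set(name, [...(uses.get(name) || []), what]);
  const visit = (node, role) => {
    switch (node.type) {
      case "Identifier":
        note(node.name, role);
        break;
      case "VariableDeclarator":
        note(node.id.name, `declared-with:${node.init ? node.init.type : "undefined"}`);
        if (node.init) visit(node.init, "initializer");
        break;
      case "AssignmentExpression":
        note(node.left.name, `assigned-from:${node.right.type}`);
        visit(node.right, "assigned-value");
        break;
      case "BinaryExpression":
        visit(node.left, `operand-of:${node.operator}`);
        visit(node.right, `operand-of:${node.operator}`);
        break;
      case "Literal":
        break;
      default:
        (node.body || node.declarations || []).forEach((n) => visit(n, "statement"));
    }
  };
  visit(program, "root");
  return uses;
}

// Map each name to a hash of its usage pattern. If two names share a pattern,
// keep their original names instead (assumed safety fallback).
function structuralIdentities(program) {
  const ids = new Map();
  const owners = new Map();
  for (const [name, pattern] of usagePatterns(program)) {
    const identity = H(pattern.join(";"));
    ids.set(name, identity);
    owners.set(identity, [...(owners.get(identity) || []), name]);
  }
  for (const names of owners.values()) {
    if (names.length > 1) names.forEach((n) => ids.set(n, `name:${n}`));
  }
  return ids;
}

// Hand-built tree for `var <first> = 10, <second>; <second> = <first> + 1`.
const ident = (name) => ({ type: "Identifier", name });
function build(first, second) {
  return {
    type: "Program",
    body: [
      { type: "VariableDeclaration", declarations: [
        { type: "VariableDeclarator", id: ident(first), init: { type: "Literal", value: 10 } },
        { type: "VariableDeclarator", id: ident(second), init: null },
      ] },
      { type: "AssignmentExpression", operator: "=", left: ident(second),
        right: { type: "BinaryExpression", operator: "+",
                 left: ident(first), right: { type: "Literal", value: 1 } } },
    ],
  };
}

const original = structuralIdentities(build("x", "y"));
const renamed = structuralIdentities(build("f", "h"));
console.log(original.get("x") === renamed.get("f")); // true
console.log(original.get("y") === renamed.get("h")); // true
console.log(original.get("x") === original.get("y")); // false
```

Consistent renaming leaves every identity unchanged, while the two variables remain distinguishable from each other because their usage patterns differ.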
Progressive Lockdown • INIT: Compile initial whitelists for a limited set of web pages • CRAWL: Use the crowds to build the whitelist • LOCK: Freeze crawl, instruct the browser to use the whitelist
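The three phases can be pictured as a small state machine around a whitelist store. The Node.js sketch below is illustrative only: the class name `Whitelist`, its methods, the example URL, and the signature strings are hypothetical, and it assumes that scripts first seen during CRAWL are recorded and allowed, while conflict resolution by the site owner is left out.

```javascript
// Minimal whitelist store covering the INIT, CRAWL and LOCK phases.
class Whitelist {
  constructor(initial = {}) {
    // INIT: seed the store with signatures compiled for a limited set of pages.
    this.allowed = new Map(
      Object.entries(initial).map(([url, sigs]) => [url, new Set(sigs)])
    );
    this.phase = "CRAWL";
  }

  // LOCK: stop accepting new signatures.
  lock() {
    this.phase = "LOCK";
  }

  // Returns true if a script with this signature may run on the given page.
  check(pageUrl, signature) {
    const sigs = this.allowed.get(pageUrl) || new Set();
    if (sigs.has(signature)) return true;
    if (this.phase === "CRAWL") {
      // CRAWL: record signatures reported by browsers (assumed to be allowed).
      sigs.add(signature);
      this.allowed.set(pageUrl, sigs);
      return true;
    }
    return false; // LOCK: unknown signatures are rejected
  }
}

const whitelist = new Whitelist({ "https://foo.example/": ["abcde"] });
const seenInCrawl = whitelist.check("https://foo.example/", "pqrst");
whitelist.lock();
const knownAfterLock = whitelist.check("https://foo.example/", "pqrst");
const unknownAfterLock = whitelist.check("https://foo.example/", "zzzzz");

console.log(seenInCrawl, knownAfterLock, unknownAfterLock); // true true false
```

The design choice worth noting is that the store only grows before `lock()` is called; afterwards it is read-only, so an injected script cannot add itself to the whitelist.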
Implementation • Chromium version 43.0 • Layer 1: raw signature • Layer 2: structural signature (JS engine) • Raw signature + structural signature + progressive lockdown = SICILIAN • SICILIAN is browser-agnostic
Evaluation Questions to Be Answered • To how many websites can SICILIAN be fully applied? (Deployability) • What is the rate of signature updates for SRI and SICILIAN? (Rate of Signature Updates) • How large is the performance overhead after adopting SICILIAN? (Performance)
Deployability • SICILIAN can be fully applied to 84.7% of websites and 15 PHP apps (versus 15% of websites and 7 apps for SRI) • Average UF of SICILIAN = 0.057 (1 update/month) [Chart: SRI versus SICILIAN]
Performance • Sample one page of every domain in Alexa’s 500
Conclusion • SICILIAN: The first multi-layered signature-based whitelisting approach that tolerates benign JS changes • 3-month-long study of JS changes • A browser-assisted deployment model called progressive lockdown • Performance overhead of 7.02% compared to vanilla browsers • Fully applicable to 84.7% of Alexa’s Top 500 sites • About 1 whitelist update per month on average
Thank You! http://www.comp.nus.edu.sg/~enricob/2015/ccs15.pdf enricob@comp.nus.edu.sg
References • [1] Levy, Amit, Henry Corrigan-Gibbs, and Dan Boneh. "Stickler: Defending Against Malicious CDNs in an Unmodified Browser." • [2] Nikiforakis, Nick, et al. "You are what you include: large-scale evaluation of remote javascript inclusions." Proceedings of the 2012 ACM conference on Computer and communications security. ACM, 2012. • [3] Bojinov, Hristo, Elie Bursztein, and Dan Boneh. "XCS: cross channel scripting and its impact on web applications." Proceedings of the 16th ACM conference on Computer and communications security. ACM, 2009.
Our Contributions • 3-month study of how scripts change in Alexa’s Top 500 sites and 15 popular PHP apps • SICILIAN: a multi-layered whitelisting approach based on signatures to prevent script injection attacks • A browser-assisted deployment model called progressive lockdown
Longitudinal Study of JS Changes • Our study covers Alexa’s Top 500 sites + 15 popular PHP applications • 45,066 web pages • 33,302 scripts • 3-month period, starting January 2015
Research Questions • RQ1: What is the state of the art in web whitelisting? • RQ2: Is current web whitelisting practice practical for real-world websites? • RQ3: If not, what is the main problem? Can we do better?
Building Structural Signature (1) • Example: var x = 10, y; [AST diagram: Program → Variable Declaration → two Variable Declarator nodes → Identifier (x), Literal (10), …; 3 node types]
Building Structural Signature (2) • Challenge #3: What about the arguments of a CallExpression or the parameters of a FunctionDeclaration? • Such isomorphism is applied only to unordered node types, such as the properties of an object or independent code statements • Example: function test(bar, foo){ } versus function test(foo, bar){ }; applying SORT("foo", "bar") to the params of the FunctionDeclaration named "test" would make the two indistinguishable
Progressive Lockdown • CRAWL: The first time a browser sees a script, it locally compiles a whitelist for the script and sends it to the whitelist database [Diagram: whitelists for url1.com/path1/, url1.com/path4/, and url1.com/path5/ are sent to the database]
Progressive Lockdown • LOCK: Once the whitelists are sufficiently populated, the website owner stops receiving whitelists from browsers. The server can resolve any conflicting whitelists and decide on the final whitelist
Performance • Sample one page from every domain in Alexa’s Top 500 • Compare the page load time of a vanilla browser, an SRI-enabled browser, and a SICILIAN-enabled browser
Deployability • SICILIAN can be fully applied to 372 websites and 15 PHP Apps • Use the metric called Update Frequency (UF) • By the above metric, the average UF is 0.057, equivalent to 1 whitelist update per month.
Rate of Signature Updates • Out of 500 websites, 59 were excluded due to website errors • Of the remaining 441 websites, 153 domains have UF ≤ 0.1 with SRI and 334 domains have UF ≤ 0.1 with structural signatures • More websites would require very few updates if structural signatures were used [RQ3]
Building Structural Signature (2) • What about permutation of properties? Variable renaming? • Relax the signature mechanism a bit further • We allow two forms of isomorphism on the ASTs • Safety: these isomorphisms are designed so that relaxation is allowed only for scripts that differ in syntax and not in semantics
Building Structural Signature (3) • Safety: We refine the structural identities of the identifiers in order to generate a unique identity for each of them. If two identifiers have the same structural identity, we retain the names of the identifiers and do not replace them with structural identities
Existing Primitive: Signature • Great for preventing illegitimate scripts • One-size-fits-all solution against script injection attacks / XSS, content tampering at CDNs, etc. • W3C Recommendation: Subresource Integrity (SRI) [RQ1]
Existing Primitive • W3C Recommendation: Subresource Integrity (SRI) [RQ1] [Diagram labels: website, example-framework.js from example.com, signature]