A Symbolic Execution Framework for JavaScript • Prateek Saxena, Devdatta Akhawe, Steve Hanna, Feng Mao, Stephen McCamant, Dawn Song • UC Berkeley
Motivation: Rich Web Applications • Client-side JS complexity in Rich Web Applications • High cross-domain client-side data exchange • Need tools to analyze complex applications
An Important Application: Finding Client-side Code Injection Bugs • Several client-side data exchange channels for mashups (facebook.com, http://cnn.com): fragment ID, postMessage (?friendName=Joe #msg=Yo!) • Many attack vectors ….. <IMG SRC="javascript:alert('XSS')">, <IMG SRC=JaVaScRiPt:alert('XSS')> • An example • Benign data: "friendName: Joe, msg: Yo!" • Malicious data: "..msg: <img src=s onerror=javascript:alert(.." • Parse the input: var DataStr = 'var new_msg = (' + event.data + ');'; ParseData(DataStr); • Validation checks: var regex = /<script.*>.*?<\/script>/g; if (regex.test(DataStr.msg)) { return false; } • Dynamic HTML update: n.innerHTML = DataStr.msg;
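A minimal sketch of why the validation check on this slide is insufficient. The regular expression is the one shown on the slide; the helper `passesValidation` and the three sample messages are assumptions made for illustration, not code from the talk.

```javascript
// Validation check from the slide: reject messages that contain a <script> element.
const scriptFilter = /<script.*>.*?<\/script>/g;

// Hypothetical helper: true means the message would reach n.innerHTML.
function passesValidation(msg) {
  // A /g regex remembers lastIndex between test() calls, so reset it first.
  scriptFilter.lastIndex = 0;
  return !scriptFilter.test(msg);
}

// Sample messages (assumed for illustration).
const benign = "Yo!";
const blocked = "<script>alert('XSS')</script>";
const bypass = "<img src=s onerror=alert('XSS')>";

console.log(passesValidation(benign));  // true: harmless text is accepted
console.log(passesValidation(blocked)); // false: the filter catches <script>
console.log(passesValidation(bypass));  // true: the event-handler attack is not caught
```

The third message never mentions `<script>`, so a filter that only looks for that tag lets it through to the dynamic HTML update.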
Problem Definition • Automatically find code-injection vulnerabilities in JS applications • Two challenges • #1: Automatic exploration of the execution space • #2: Automatically check if data is sanitized sufficiently • Can’t distinguish parsing operations from custom validation checks • Can’t assume validation is sufficient (false negatives) or absent (false positives)
Our Contributions • Existing Approaches • Static analysis [Gatekeeper’09, StagedInfoFlow’09] • Taint-enhanced black-box fuzzing [Flax’10] • Drawbacks • Either assumes an external test suite to explore paths [Flax’10] • Or does not generate an exploit instance and can have false positives [Gatekeeper’09, StagedInfoFlow’09] • Our Contributions • A symbolic analysis approach • Kudzu: An end-to-end symbolic execution tool for JavaScript • Identify a sufficiently expressive “theory of strings” • Kaluza: A new expressive, efficient decision procedure • Supports strings, integers and booleans as first-class input variables
Outline • Problem Definition • Previous Approaches vs. Our Approach • Kudzu System Design • Kaluza Decision Procedure • Evaluation
Kudzu: Approach and Design • Input space has 2 components • Event Space: GUI explorer • Value Space: Dynamic symbolic execution • Checking sufficiency of validation checks • Symbolic analysis of validation operations on code-evaluated data • [Diagram labels: GUI Explorer, Dynamic Symbolic Interpreter, Kaluza Decision Procedure, New Input Feedback, Checking Sufficiency of Validation; components marked Application-agnostic vs. Application-specific]
Dynamic Symbolic Interpreter for JavaScript • Employed for value space exploration • [Diagram labels: Initial Input, Program, Concrete Execution, Symbolic Execution, Symbolic Formula, Kaluza Decision Procedure, New Input]
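The feedback loop on this slide can be sketched as a toy concolic explorer. Everything here is an assumption made for illustration: the `program` under test, the `branch` recording helper, and the brute-force `solve` that stands in for the decision procedure by searching a small fixed pool of candidate inputs. It is not Kudzu's implementation.

```javascript
// Toy program under test: each branch records its predicate and the outcome.
function program(input, trace) {
  const branch = (name, pred) => {
    const taken = pred(input);
    trace.push({ name, pred, taken });
    return taken;
  };
  if (branch("hasPrefix", s => s.startsWith("msg="))) {
    if (branch("longEnough", s => s.length > 6)) {
      return "sink";
    }
    return "short";
  }
  return "ignored";
}

// Stand-in for the decision procedure: brute-force search over a candidate pool.
function solve(constraints, pool) {
  return pool.find(c => constraints.every(({ pred, taken }) => pred(c) === taken));
}

// Run concretely, then negate one recorded branch at a time to get new inputs.
function explore(seed, pool) {
  const worklist = [seed];
  const seen = new Set();
  const outcomes = new Map();
  while (worklist.length > 0) {
    const input = worklist.pop();
    if (seen.has(input)) continue;
    seen.add(input);
    const trace = [];
    outcomes.set(input, program(input, trace));
    for (let i = 0; i < trace.length; i++) {
      // Keep the path prefix fixed and flip the i-th branch outcome.
      const flipped = { ...trace[i], taken: !trace[i].taken };
      const next = solve(trace.slice(0, i).concat([flipped]), pool);
      if (next !== undefined && !seen.has(next)) worklist.push(next);
    }
  }
  return outcomes;
}

const pool = ["", "x", "msg=", "msg=Yo", "msg=Yo!!"];
const outcomes = explore("x", pool);
console.log([...outcomes.entries()]);
```

Starting from the single seed `"x"`, the loop reaches all three return values of the toy program, which is the coverage effect the diagram describes.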
Checking Sufficiency of Validation Checks • To eliminate false positives • [Diagram labels: Initial Input, Code Evaluation Construct, Attack Grammar Specification, Kaluza Decision Procedure, Intersection Empty]
GUI Exploration • Events: State of GUI elements, mouse and link clicks • Event Sequence: A sequence of state-altering GUI actions • Event space exploration using a GUI explorer • Improves coverage in practice • Example: • 1 gadget vulnerability is reachable after a sequence of events is executed: the dropdown box value is changed, then delete is hit
Empirical Motivation for a Theory of Strings • Combined string and integer solver • Regular-expression-based operations are 1/3rd of the match, split, test, replace operations (9%) • Multiple string variables • 33% of regexes have capture groups • Example regex: /\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/ • [Chart: frequency of string operations: indexOf / lastIndexOf / strlen (78%), concat (8%), replace / decodeURI / encodeURI (8%), substring / charAt / charCodeAt (5%), split / match / test (1%)]
A Sufficiently Expressive Theory for JS • Practical requirements to support: • Concatenation (word equations) • Regular language membership • String length • Equality • Multiple string variables • Boolean and integer logic • Existing solvers [DPRLE’09] [HAMPI’09] [PEX’09] are not sufficiently expressive
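To make the requirement list concrete, here is a toy query that mixes a word equation, a regular language membership test, and a length constraint tied to an integer. The constraints, the function `solveSplit`, and the sample strings are all hypothetical, and the bounded search is only a stand-in: it is not Kaluza's algorithm or input format.

```javascript
// Toy bounded search, NOT Kaluza's algorithm or input format.
// Hypothetical constraints over string variables X, Y and integer n:
//   X . Y == target        (concatenation / word equation)
//   X in L(/^[a-z]+=$/)    (regular language membership)
//   len(Y) == n, n >= 2    (string length combined with integer logic)
function solveSplit(target) {
  const keyPattern = /^[a-z]+=$/;
  for (let i = 0; i <= target.length; i++) {
    const X = target.slice(0, i);
    const Y = target.slice(i);
    const n = Y.length;
    if (keyPattern.test(X) && n >= 2) {
      return { X, Y, n }; // satisfying assignment (SAT)
    }
  }
  return null; // no assignment exists (UNSAT)
}

const sat = solveSplit("msg=Yo!");
const unsat = solveSplit("msg=Y");
console.log(sat);   // { X: 'msg=', Y: 'Yo!', n: 3 }
console.log(unsat); // null
```

Even this small query needs several string variables, a regex, and an integer at once, which is the combination the slide says the earlier solvers did not cover.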
Kaluza: A New Solver Decision Procedure • Input: A boolean combination of constraints over multiple integer and variable-length string variables • Decidability vs. expressiveness • Equality between regular language variables is undecidable [STOC’81] • Full generality of replace in word constraints is undecidable [TACAS’09] • String solving approaches: language equations, word constraints • Insight: The JS-to-Kaluza reduction uses dynamic information • [Diagram labels: JavaScript Language Operations, Kaluza Core Constraints]
Kudzu System Evaluation • 18 live applications • 13 iGoogle gadgets • 5 AJAX applications • Social networking: Academia, Plaxo • Chat applications: AjaxIM, Facebook Chat • Utilities: parseURI • Setup • Untrusted sources • All cross-domain channels • Text boxes • Critical sinks • Code evaluation constructs
Results: Summary • Kudzu found 11 code injection vulnerabilities automatically • 2 previously unknown vulnerabilities • 6-hour testing period • Examples • XSS in Facebook Connect used by 2 social networking sites • Gadget overwriting attacks on Google/IG • Self-XSS on AjaxIM • No false positives • Finds all known vulnerabilities in our benchmarks [Flax’10]
Results: Code Coverage • 29% code coverage increase in 6 hours • [Charts: Initial Discovered, Initial Executed, Total Discovered, Total Executed; Initial Coverage vs. Coverage Increase, Code Coverage (in %)]
Conclusion • Kudzu: An End-to-end Symbolic Execution Tool for JS • Separates the input space analysis into 2 components • Identified a theory of strings expressive enough for JS • Kaluza: A new decision procedure for the theory • Demonstrated capabilities on 18 live web applications • Found 11 vulnerabilities with no given initial test harness • 2 new vulnerabilities
Contact • Prateek Saxena (prateeks@cs.berkeley.edu) • Kaluza, our core constraint solver, is online: • http://webblaze.cs.berkeley.edu/2010/kaluza • Please visit Webblaze, our web security research page • http://webblaze.cs.berkeley.edu • Thanks for coming to the talk
Reduction of JS Operations: Mixed Concrete and Symbolic Power • Example: replace in full generality is undecidable • Concretize the number of occurrences of the matched string • rep1 = INPUT.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@"); • [Diagram: INPUT is split into S0 T1 S1 T2 S2 T3 S3; symbolic operations are regex membership constraints with respect to R over T1..T3 and S1..S3; OUTPUT is the concatenation S0 @ S1 @ S2 @ S3]
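One way to read this slide is that, once a concrete run fixes how many times the regex matched, the replace call turns into concatenation plus regex membership constraints. The sketch below checks that reading on one concrete string. The regex is the one from the slide; `decompose`, the sample `INPUT`, and the pairing of pieces as S0 T1 S1 ... are assumptions made for illustration.

```javascript
// Regex from the slide (escape sequences inside JSON text).
const R = /\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g;

// Hypothetical helper: split input into S0, T1, S1, ..., Tn, Sn,
// where every Ti is a match of the regex and the Si are the gaps.
function decompose(input, regex) {
  const S = [];
  const T = [];
  let last = 0;
  for (const m of input.matchAll(regex)) {
    S.push(input.slice(last, m.index));
    T.push(m[0]);
    last = m.index + m[0].length;
  }
  S.push(input.slice(last));
  return { S, T };
}

// Sample input (assumed): a\nb\u0041c\"d with literal backslashes.
const INPUT = 'a\\nb\\u0041c\\"d';
const { S, T } = decompose(INPUT, R);

// With the match count concretized to T.length, the output is plain concatenation.
const OUTPUT = S.join("@");

// Membership side conditions: each Ti matches R, no Si contains a match.
const single = new RegExp(R.source);
const gapsAreClean = S.every(s => !single.test(s));

console.log(S, T, OUTPUT, gapsAreClean);
console.log(OUTPUT === INPUT.replace(R, "@")); // true
```

For this input the concrete run observes three matches, so the symbolic side needs only four gap variables and three match variables, mirroring the S0..S3 and T1..T3 labels on the slide.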
Results: Solver Performance • SAT cases: < 1 sec • UNSAT cases: 1-50 secs
Comparison of Symbolic Execution Alone with GUI Exploration • Symbolic execution alone vs. full-featured Kudzu • [Chart labels: Full-featured Kudzu, Symbolic Execution Alone]
Example Attacks: Gadget Overwriting • [Screenshot labels: Legitimate URL bar; Compromised gadget with overwritten contents; <Attack link to iGoogle page>]