Practical Static Analysis of JavaScript Applications in the Presence of Libraries and Frameworks. Magnus Madsen, Benjamin Livshits, Michael Fanning
Motivation Windows 8: • JavaScript is an officially supported language • .NET library bindings exposed to JavaScript Q: How can we use static analysis to improve the development experience?
The Challenge Modern JavaScript applications are often built using large and complex libraries: • Browser API, Win8 API, NodeJS, PhoneGap, ... • Problems: Reflection? native code? sheer size? • But: We really only care about the application! A pragmatic choice: We ignore the libraries (thus sacrificing soundness) to focus on the applications themselves
Practical Applications (which do not require soundness) • auto-complete • call graph discovery • capability usage • API usage
Practical Applications: capability usage
<Capabilities>
  <Capability name="internetClient"/>
  <Capability name="picturesLibrary"/>
  <DeviceCapability name="location"/>
  <DeviceCapability name="microphone"/>
  <DeviceCapability name="webcam"/>
</Capabilities>
Practical Applications: API usage • Windows.Devices.Sensors • Windows.Devices.Sms • Windows.Graphics.Display • Windows.Graphics.Printing • Windows.Media.Capture • Windows.Networking.Sockets • Windows.Storage.Search
Win8 & Web Applications • Windows 8 App: Builtin, DOM, WinJS, Win8 • Web App: Builtin, DOM, jQuery, … • 3000 functions
Introducing Use Analysis • elm flows into reset • elm flows into playVideo • elm must have: muted and play • elm must have: pause Conclusion: elm is an HTMLVideoElement
Use Analysis: Determines what an object is based on how it is used
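The idea of classifying an object by how it is used can be illustrated with a minimal sketch. The stub table below is a hypothetical, heavily simplified stand-in for real library stubs (real DOM types have many more members, and other media types share some of these properties), and `infer_types` is an illustrative helper, not part of the presented tool.

```python
# Hypothetical, heavily simplified library stubs: type name -> property names.
# This table is an assumption made for illustration only.
LIBRARY_STUBS = {
    "HTMLVideoElement": {"muted", "play", "pause", "videoWidth"},
    "HTMLCanvasElement": {"getContext", "width", "height"},
    "HTMLInputElement": {"value", "checked", "focus"},
}


def infer_types(used_properties, stubs=LIBRARY_STUBS):
    """Return the stub types that offer every property the code uses."""
    return sorted(
        name for name, props in stubs.items() if used_properties <= props
    )


# Properties the application accesses on `elm`, as on the slide:
# one use requires muted and play, another requires pause.
uses_of_elm = {"muted", "play"} | {"pause"}
print(infer_types(uses_of_elm))  # ['HTMLVideoElement']
```

With this toy table only one type is compatible with all three uses, which mirrors the conclusion on the slide.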
Heap Partitioning • Library Heap • Application Heap • "Symbolic Heap"
Symbolic Objects and Unification • Introduce symbolic objects where flow is dead (i.e. missing) due to libraries. • Collect information about where the symbolic objects flow and how they are used. • Unify symbolic objects with "compatible" application or library objects.
Example: Iteration 1 We discover that c is a dead return
Example: Iteration 2 We introduce a symbolic return object
Example: Iteration 3 We unify the symbolic object with the HTMLCanvasElement
Missing Flow Where can dataflow be missing when the library code is ignored? • Dead Returns • Dead Arguments • Dead Loads • Dead Prototypes • Dead Array Accesses
Unification Strategies Three unification strategies based on property names: • a single shared property name • all shared property names • all shared property names, but prioritize prototype objects [Figure: application and symbolic objects with properties x, y, z]
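The three strategies can be compared in a small sketch. Everything here is an assumption made for illustration: the function names, the candidate objects, and in particular the reading of "prioritize prototype objects" as "prefer prototype objects among the full matches, and fall back to all full matches otherwise".

```python
# Hypothetical candidate objects: name -> property names and a prototype flag.
CANDIDATES = {
    "HTMLVideoElement.prototype": {
        "props": {"play", "pause", "muted"},
        "is_prototype": True,
    },
    "appPlayer": {
        "props": {"play", "pause", "muted", "title"},
        "is_prototype": False,
    },
    "appTimer": {"props": {"pause"}, "is_prototype": False},
}


def unify_any(used, candidates):
    """Strategy 1: the candidate shares at least one used property name."""
    return [n for n, obj in candidates.items() if used & obj["props"]]


def unify_all(used, candidates):
    """Strategy 2: the candidate has every used property name."""
    return [n for n, obj in candidates.items() if used <= obj["props"]]


def unify_prototype_first(used, candidates):
    """Strategy 3 (assumed reading): like unify_all, but prefer prototypes."""
    matches = unify_all(used, candidates)
    prototypes = [n for n in matches if candidates[n]["is_prototype"]]
    return prototypes or matches


used_on_symbolic_object = {"play", "pause"}
print(unify_any(used_on_symbolic_object, CANDIDATES))
print(unify_all(used_on_symbolic_object, CANDIDATES))
print(unify_prototype_first(used_on_symbolic_object, CANDIDATES))
```

In this toy setup the first strategy matches all three candidates, the second drops `appTimer`, and the third keeps only the prototype object, so the strategies trade recall for precision in that order.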
Benchmarks 25 Windows 8 Apps: Average ~1,500 lines of code Approx. 30,000 lines of stubs
Call Graph Resolution [Charts: Pointer Analysis vs. Pointer Analysis + Use Analysis] A call site is resolved if it has a non-empty set of call targets
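The resolution metric defined above is simple to compute once each call site has a set of targets. The sketch below uses made-up call sites and targets purely to show the calculation; the numbers are not taken from the benchmarks.

```python
def resolved_fraction(call_targets):
    """Fraction of call sites that have a non-empty set of call targets."""
    if not call_targets:
        return 0.0
    resolved = sum(1 for targets in call_targets.values() if targets)
    return resolved / len(call_targets)


# Hypothetical call sites (c1..c4) and their inferred targets.
pointer_only = {"c1": {"init"}, "c2": set(), "c3": set(), "c4": {"render"}}
with_use_analysis = {
    "c1": {"init"},
    "c2": {"play"},
    "c3": set(),
    "c4": {"render"},
}

print(resolved_fraction(pointer_only))       # 0.5
print(resolved_fraction(with_use_analysis))  # 0.75
```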
Auto-complete • We compared our technique to the auto-complete in four popular IDEs: • Eclipse for JavaScript developers • IntelliJ IDEA 11 • Visual Studio 2010 • Visual Studio 2012 • In all cases, where libraries were involved, our technique was an improvement
Auto-complete: Case study [Table of per-IDE auto-complete results; row and column labels not recoverable]
Soundness & Completeness Use Analysis is inherently unsound: • library code is not analyzed • library code could have arbitrary side-effects [Figures: an example of unsoundness; an example of incompleteness] ... results of manual (human) inspection of 200 call sites
Findings • Auto-completion is improved compared to four popular IDEs • Use analysis improves call graph resolution • In practice, unsoundness is limited • Analysis time is reasonable: median of 10 s for apps averaging 1,500 lines of code
Summary Pointer analysis + Use analysis: • A technique to statically reason about JavaScript applications which rely on large and complex libraries without analyzing the libraries themselves Practical applications: • auto-complete • API usage • capability discovery • call graph construction Thank You
Architecture [Diagram components: JavaScript Application, App Facts, Analysis Rules, Pointer Analysis, Use Analysis, Introduce New Facts]
Datalog Formulation We define the following domains: • variables • heap-allocated objects • property names • call sites • integers (e.g. argument offsets) Based on Gatekeeper (Livshits et al. 2009)
Pointer Analysis
PointsTo(v, h) :- NewObj(v, h, _).
PointsTo(v1, h) :- Assign(v1, v2), PointsTo(v2, h).
PointsTo(v2, h2) :- Load(v2, v1, p), PointsTo(v1, h1), HeapPtsTo(h1, p, h2).
HeapPtsTo(h1, p, h2) :- Store(v1, p, v2), PointsTo(v1, h1), PointsTo(v2, h2).
HeapPtsTo(h1, p, h3) :- Prototype(h1, h2), HeapPtsTo(h2, p, h3).
CallGraph(c, f) :- ActualArg(c, 0, v), PointsTo(v, f).
Assign(v1, v2) :- CallGraph(c, f), FormalArg(f, i, v1), ActualArg(c, i, v2), i > 0.
Assign(v2, v1) :- CallGraph(c, f), FormalRet(f, v1), ActualRet(c, v2).
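To see how such rules reach a fixpoint, here is a minimal sketch that evaluates the first five rules naively in Python. It is not the authors' Datalog engine: the function name and tuple encodings are assumptions, `NewObj` is reduced to two fields, and the call graph rules are left out for brevity.

```python
def points_to(new_obj, assign, load, store, prototype):
    """Naive fixpoint over the PointsTo / HeapPtsTo rules (calls omitted).

    new_obj:   (v, h)       v = new object h
    assign:    (v1, v2)     v1 = v2
    load:      (v2, v1, p)  v2 = v1.p
    store:     (v1, p, v2)  v1.p = v2
    prototype: (h1, h2)     h2 is the prototype of h1
    """
    pts = set(new_obj)   # PointsTo(v, h)
    heap = set()         # HeapPtsTo(h1, p, h2)
    changed = True
    while changed:
        new_pts, new_heap = set(), set()
        for v1, v2 in assign:
            new_pts |= {(v1, h) for (v, h) in pts if v == v2}
        for v2, v1, p in load:
            for v, h1 in pts:
                if v == v1:
                    new_pts |= {
                        (v2, h2) for (a, q, h2) in heap if a == h1 and q == p
                    }
        for v1, p, v2 in store:
            for va, h1 in pts:
                if va == v1:
                    new_heap |= {(h1, p, h2) for (vb, h2) in pts if vb == v2}
        for h1, h2 in prototype:
            new_heap |= {(h1, p, h3) for (a, p, h3) in heap if a == h2}
        # Stop once an iteration derives nothing new.
        changed = not (new_pts <= pts and new_heap <= heap)
        pts |= new_pts
        heap |= new_heap
    return pts, heap


# Toy program: a = {}; b = {}; a.f = b; c = a; d = c.f
pts, heap = points_to(
    new_obj=[("a", "h_a"), ("b", "h_b")],
    assign=[("c", "a")],
    load=[("d", "c", "f")],
    store=[("a", "f", "b")],
    prototype=[],
)
print(sorted(pts))
print(sorted(heap))
```

In the toy program, `d` ends up pointing to `h_b` because `c` aliases `a` and `a.f` holds `b`. A production Datalog engine would use semi-naive evaluation and indexing rather than rescanning every fact on each pass.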
Example: Dead Returns
DeadRet(c, v) :- CallGraph(c, f), ActualRet(c, v), !ResolvedVar(v), !AppAlloc(f).
DeadArg(f, i) :- FormalArg(f, i, v), !ResolvedVar(v), AppAlloc(f).
...
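The two rules can be read as plain set comprehensions. The sketch below assumes that `ResolvedVar(v)` means the variable already has at least one known object and that `AppAlloc(f)` marks functions defined in application code; the slides do not spell these out. All fact names and values are hypothetical.

```python
def dead_returns(call_graph, actual_ret, resolved_vars, app_funcs):
    """DeadRet(c, v): a call to a non-application function whose result
    variable v has no known value."""
    return {
        (c, v)
        for (c, f) in call_graph
        for (c2, v) in actual_ret
        if c == c2 and v not in resolved_vars and f not in app_funcs
    }


def dead_args(formal_arg, resolved_vars, app_funcs):
    """DeadArg(f, i): parameter i of application function f has no known
    value, e.g. because only library code ever calls f."""
    return {
        (f, i)
        for (f, i, v) in formal_arg
        if v not in resolved_vars and f in app_funcs
    }


# Hypothetical facts for a tiny application.
call_graph = {("c1", "getElementById"), ("c2", "makeWidget")}
actual_ret = {("c1", "canvas"), ("c2", "w")}
formal_arg = {("onClick", 1, "evt"), ("makeWidget", 1, "name")}
resolved_vars = {"w", "name"}
app_funcs = {"makeWidget", "onClick"}

print(dead_returns(call_graph, actual_ret, resolved_vars, app_funcs))
print(dead_args(formal_arg, resolved_vars, app_funcs))
```

Here the result of the library call at `c1` is a dead return, and the `evt` parameter of the callback `onClick` is a dead argument.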