JavaScript Information Flow Analysis Shiyi Wei CS6204 term project
Overview • Project motivation • Literature review • Paper organization • Selected papers • Observations • Framework overview • Analysis components • On-going work & conclusion
Project Motivation • Jif: Java information flow • Type-based approach • Language extension • Imprecise • Java programming language • Static typing • Class hierarchy
Project Motivation • Information flow analysis for JavaScript • Does a type-based approach work? • Dynamic typing • Challenges • Dynamic language features • Prototype-based inheritance • Dynamic code generation • Variadic functions • Fields • Benchmark
Literature Review • Paper categories • Information flow analysis for C, C++, and Java • Analyzing dynamic languages • Performance • Correctness • Security analysis of JavaScript • Static analysis • Dynamic analysis
Literature Review • GATEKEEPER[1] • JavaScript widget • JavaScript_SAFE • Static • JavaScript_GK • Dynamic References [1] S. Guarnieri and B. Livshits. GATEKEEPER: Mostly static enforcement of security and reliability policies for JavaScript code. In Proceedings of the 18th USENIX Security Symposium (2009), pp. 151-168.
Literature Review • Staged information flow for JavaScript[2] • Integrity policy • The code loaded at any eval site must not flow into the value of document.location • Confidentiality policy • The value of document.cookie must not flow into any variable within the code loaded at any eval site • Staged information flow • Stage 1: Compute policy • Stage 2: Check policy References [2] R. Chugh, J. A. Meister, R. Jhala, and S. Lerner. Staged information flow for JavaScript. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation.
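The two stages can be illustrated with a small sketch. This is not the algorithm from [2]; it only assumes that the statically known code has already been summarized as "may flow into" edges, and the names knownFlows, computeResidualPolicy, and checkLoadedCode are hypothetical. Stage 1 derives what later-loaded code must not read or write; Stage 2 checks a summary of that code once it arrives.

```javascript
// Hypothetical flow edges for the statically known code:
// [a, b] means "the value of a may flow into b".
const knownFlows = [
  ["document.cookie", "session"],
  ["session", "cache"],
  ["userChoice", "redirectTarget"],
  ["redirectTarget", "document.location"],
];

// Collect every node reachable from `start` by following edges forward.
function reachableFrom(edges, start) {
  const seen = new Set([start]);
  const work = [start];
  while (work.length > 0) {
    const node = work.pop();
    for (const [from, to] of edges) {
      if (from === node && !seen.has(to)) {
        seen.add(to);
        work.push(to);
      }
    }
  }
  return seen;
}

// Stage 1: compute the policy that remains to be checked on loaded code.
function computeResidualPolicy(edges) {
  const reversed = edges.map(([from, to]) => [to, from]);
  return {
    // Confidentiality: anything derived from the cookie must not be read.
    mustNotRead: reachableFrom(edges, "document.cookie"),
    // Integrity: anything that reaches the location must not be written.
    mustNotWrite: reachableFrom(reversed, "document.location"),
  };
}

// Stage 2: check a (hypothetical) read/write summary of the loaded code.
function checkLoadedCode(policy, summary) {
  const violations = [];
  for (const name of summary.reads) {
    if (policy.mustNotRead.has(name)) violations.push(`reads ${name}`);
  }
  for (const name of summary.writes) {
    if (policy.mustNotWrite.has(name)) violations.push(`writes ${name}`);
  }
  return violations;
}

const policy = computeResidualPolicy(knownFlows);
const violations = checkLoadedCode(policy, {
  reads: ["cache", "counter"],
  writes: ["userChoice"],
});
console.log(violations); // [ 'reads cache', 'writes userChoice' ]
```

The point of the split is that Stage 1 can run ahead of time on the code that is available, leaving only a cheap membership check for code that appears at an eval site.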
Literature Review • JavaScript taint analysis[3] • Prototypes • Object creations • Reflective property accesses • Lexical scoping References [3] S. Guarnieri, M. Pistoia, O. Tripp, J. Dolby, S. Teilhet, and R. Berg. Saving the world wide web from vulnerable JavaScript. In Proceedings of the 2011 International Symposium on Software Testing and Analysis.
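To make the four features concrete, the following sketch shows one value reaching a result through each of them. It is only an illustration of why a taint analysis has to model these features, not code from [3]; the value tainted and all function names are hypothetical.

```javascript
// Hypothetical untrusted input; in a page this might come from the URL.
const tainted = "<untrusted>";

// Prototypes: the value is stored on a prototype and read via an instance.
function Widget() {}
Widget.prototype.label = tainted;
const viaPrototype = new Widget().label;

// Object creation: the constructor copies its argument into a new object.
function Box(value) {
  this.value = value;
}
const viaConstruction = new Box(tainted).value;

// Reflective property access: the property name is computed at run time.
const store = {};
const key = "user" + "Input";
store[key] = tainted;
const viaReflection = store["userInput"];

// Lexical scoping: a closure captures the value from an enclosing scope.
function makeGetter(secret) {
  return function () {
    return secret;
  };
}
const viaClosure = makeGetter(tainted)();

console.log([viaPrototype, viaConstruction, viaReflection, viaClosure]
  .every((v) => v === tainted)); // true
```

An analysis that ignores any one of these paths would miss the corresponding flow.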
Literature Review • Observations • Handle limited language features • Prototype[2, 4] • Property deletion • eval • Experimental design • JavaScript benchmarks not representative[5] References [4] A. Guha, S. Krishnamurthi, and T. Jim. Using static analysis for Ajax intrusion detection. In International Conference on World Wide Web (WWW), 2009. [5] G. Richards, S. Lebresne, B. Burg, and J. Vitek. An analysis of the dynamic behavior of JavaScript programs. In Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation.
Framework Overview • [Diagram; labels: Website source, Instrumented WebKit, Call graph + dynamically generated code, Static analysis infrastructure]
Analysis Components • Instrumented WebKit • TracingSafari[5] • Instrumented code • Function calls • Method signature • Arguments • Object creation sites • Dynamically generated code • eval • document.write • etc.
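The project instruments the browser itself, which is not shown here. As a rough source-level analogy, the sketch below records the same kinds of events (function calls with their argument counts, object creation sites, and dynamically generated code) using plain wrappers. The helpers traced, tracedNew, and tracedEval and the site label "main.js:12" are hypothetical.

```javascript
// Events recorded during execution, in order.
const trace = [];

// Wrap a function so each call records its name and argument count.
function traced(name, fn) {
  return function (...args) {
    trace.push({ kind: "call", name, argCount: args.length });
    return fn.apply(this, args);
  };
}

// Record an object creation site before constructing the object.
function tracedNew(site, Ctor, ...args) {
  trace.push({ kind: "new", site, ctor: Ctor.name });
  return new Ctor(...args);
}

// Record dynamically generated code before evaluating it.
function tracedEval(source) {
  trace.push({ kind: "eval", source });
  return (0, eval)(source); // indirect eval: evaluated in global scope
}

function Point(x, y) {
  this.x = x;
  this.y = y;
}

const add = traced("add", (a, b) => a + b);
const sum = add(1, 2);
const point = tracedNew("main.js:12", Point, 3, 4);
const evalResult = tracedEval("1 + 1");

console.log(trace.map((event) => event.kind)); // [ 'call', 'new', 'eval' ]
```

A trace like this is the kind of dynamic information a later static phase could import, though a browser-level tracer sees events that source-level wrappers cannot.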
Analysis Components • Static Infrastructure • WALA • IBM T.J. Watson Libraries for Analysis • Extract JavaScript code • From web site source • Import dynamic information • Dynamic call graph • Dynamically generated code
Analysis Components • Static infrastructure • Handle JavaScript language features • Variadic functions • Method definitions + arguments • Pruning with arguments.length • twitter.com, amazon.com, msn.com, … • Dynamic code generation
// F declares two parameters but can be called with any number of arguments.
function F(a, b) {
  if (arguments.length === 1) {
    …
  } else if (arguments.length === 2) {
    …
  } else if (arguments.length >= 3) {
    …
  }
}
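One way to read "pruning with arguments.length" is that argument counts recorded at run time tell the static analysis which branches of a variadic function it can skip. The sketch below assumes that reading; the branch descriptions and the observed counts are hypothetical and not taken from the project.

```javascript
// Branches of a variadic function, each guarded by a test on the
// number of arguments (mirroring the three cases in F).
const branches = [
  { label: "one argument", matches: (n) => n === 1 },
  { label: "two arguments", matches: (n) => n === 2 },
  { label: "three or more arguments", matches: (n) => n >= 3 },
];

// Keep only the branches that some observed call could have entered.
function reachableBranches(allBranches, observedCounts) {
  return allBranches
    .filter((branch) => observedCounts.some((n) => branch.matches(n)))
    .map((branch) => branch.label);
}

// Hypothetical argument counts recorded for calls to F.
const observedCounts = [1, 2, 2, 1];
const kept = reachableBranches(branches, observedCounts);
console.log(kept); // [ 'one argument', 'two arguments' ]
```

Pruning of this kind is only as complete as the recorded runs: a branch that was never exercised during tracing is dropped even if another execution could reach it.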
On-going Work & Conclusion • On-going work • Information flow algorithm • Benchmark • Handle other language features • Prototypes, etc. • Conclusion • Literature review • JavaScript information flow analysis is hard • Dynamic language features • Blended approach • Works on unsolved issues