Precise Interface Identification to Improve Testing and Analysis of Web Applications
William G.J. Halfond, Saswat Anand, and Alessandro Orso
Georgia Institute of Technology
Example Web Application
[Figure: end users, a web server, and a web application containing getQuote.jsp and buyPolicy.jsp; labels "Initial Visit" and "Quote Information".]
Example request: http://host/getQuote.jsp?action=doquote&car=jeep
Interface Identification
public void write(File outfile, String buffer, int length)
• Names of parameters
• Grouping of parameters
• Domain information
Example Web Application

public void service(HttpRequest req)
 1. String aValue = req.getIP("action")
 2. if (aValue.equals("checkeligibility"))
 3.     int userAge = getNumIP("age")
 4.     if (userAge < 16)
 5.         displayErrorMsg("Too young.")
 6.     else
 7.         displayQuotePage()
 8. if (aValue.equals("doquote"))
 9.     String nValue = req.getIP("name")
10.     String carType = req.getIP("type")
11.     int carYear = getNumIP("year")
12.     calculateQuote(carType, carYear)
    …

public int getNumIP(String name)
 1. String value = getIP(name)
 2. int param = Integer.parse(value)
 3. return param

• Names of parameters
• Grouping of parameters
• Domain information
Previous Approaches: Interface Identification

Dynamic
• Spider: web spider crawls pages of the application
  • Limitation: no guarantee of completeness

Static
• DFW¹: identifies parameter names via static analysis
  • Limitation: only identifies parameter names
• WAMdf²: uses iterative data-flow analysis
  • Limitation: assumes all paths are feasible

Code excerpts from the example:
 1. String aValue = req.getIP("action")
 2. if (aValue.equals("checkeligibility"))
    …
 8. if (aValue.equals("doquote"))

 4. if (userAge < 16)
 5.     displayErrorMsg("Too young.")
 6. else
 7.     displayQuotePage()

(action, age, name, type, year)

¹ Deng, Frankl, Wang, SEN 2004.
² Halfond and Orso, FSE 2007.
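The cost of assuming that all paths are feasible can be sketched in a few lines of Java. This is not the WAMdf algorithm. It only enumerates the outcomes of the two branches on "action" (lines 2 and 8 of the example servlet) and collects the parameters read on each combination. The class and method names are hypothetical, and the inner branch at line 4 is ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: enumerates branch outcomes for lines 2 and 8 of the example servlet.
class PathFeasibilitySketch {

    // Parameters read when line 2 (checkeligibility) and/or line 8 (doquote) is taken.
    static List<String> paramsOnPath(boolean line2Taken, boolean line8Taken) {
        List<String> params = new ArrayList<>();
        params.add("action");          // line 1 always reads "action"
        if (line2Taken) {
            params.add("age");         // line 3
        }
        if (line8Taken) {
            params.add("name");        // line 9
            params.add("type");        // line 10
            params.add("year");        // line 11
        }
        return params;
    }

    // Returns one parameter grouping per path; optionally drops infeasible paths.
    static List<List<String>> interfaces(boolean checkFeasibility) {
        List<List<String>> result = new ArrayList<>();
        boolean[] outcomes = {false, true};
        for (boolean line2 : outcomes) {
            for (boolean line8 : outcomes) {
                // "action" cannot equal both "checkeligibility" and "doquote".
                boolean infeasible = line2 && line8;
                if (checkFeasibility && infeasible) {
                    continue;
                }
                result.add(paramsOnPath(line2, line8));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("All paths:      " + interfaces(false));
        System.out.println("Feasible paths: " + interfaces(true));
    }
}
```

Without the feasibility check, the grouping that contains all five parameters is reported even though no single request can take both branches.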
Our Approach
• Program transformation
• Symbolic execution
• Interface identification
Statically identify interfaces by using symbolic execution to model input parameters and domain-constraining operations.
1 – Program Transformation
1. Introduce symbolic values
   Original: value ← getIP(name)
   Transformed:
   • s ← new SymbolicValue()
   • s.assignName(name)
   • SymbolicState.add(s, value)
   • return s
2. Replace domain-constraining operations
   • Accessing an input parameter
   • Conversion to numeric type
   • String comparison
   • Arithmetic constraints
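A minimal Java sketch of the first transformation step follows. The classes are hypothetical stand-ins, not the WAMse implementation. The slide's SymbolicState.add takes a second argument whose role is not clear from the slide, so this sketch assumes a simplified one-argument form that records the symbolic value under its parameter name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a symbolic input value; not the WAMse class.
class SymbolicValue {
    private String name;

    void assignName(String name) { this.name = name; }
    String getName() { return name; }

    @Override
    public String toString() { return "s_" + name; }
}

// Hypothetical global record of the symbolic values created so far.
// Simplification: add() takes only the symbolic value (assumption, see lead-in).
class SymbolicState {
    private static final Map<String, SymbolicValue> VALUES = new LinkedHashMap<>();

    static void add(SymbolicValue s) { VALUES.put(s.getName(), s); }
    static SymbolicValue get(String name) { return VALUES.get(name); }
}

class TransformedRequest {
    // Replacement for getIP(name): returns a named symbolic value
    // instead of the concrete string sent by the client.
    static SymbolicValue getIP(String name) {
        SymbolicValue s = new SymbolicValue();
        s.assignName(name);
        SymbolicState.add(s);
        return s;
    }

    public static void main(String[] args) {
        SymbolicValue aValue = getIP("action");
        System.out.println(aValue);
    }
}
```

Because every access to an input parameter goes through the replaced getIP, the analysis learns the parameter name at the point of access without needing a concrete request.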
2 – Symbolic Execution
Symbolically execute the transformed web application, tracking path conditions and symbolic state.
[Figure: the transformed web application (getQuote.jsp, buyPolicy.jsp) is input to symbolic execution, which outputs path conditions (built from constraints c1 to c5) and symbolic states (s_action → aValue, s_year → carYear).]
2 – Access Input Parameters
PC = Path Condition, SS = Symbolic State
   (PC, SS)
1. String aValue = req.getIP("action")
   (PC, SS[s_action → aValue])
2 – String Comparison
1. String aValue = req.getIP("action")
   (PC, SS[s_action → aValue])
2. if (aValue.equals("checkeligibility"))
   TRUE:  (PC ∧ s_action = "checkeligibility", SS[s_action → aValue])
   FALSE: (PC ∧ s_action ≠ "checkeligibility", SS[s_action → aValue])
8. if (aValue.equals("doquote"))
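The fork at a string comparison can be sketched as follows. This is a hypothetical, text-only model of a path condition, not the Java PathFinder representation. Constraints are stored as strings, and "&&" and "!=" stand in for conjunction and inequality.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical path-condition sketch: a conjunction of textual constraints.
class PathConditionSketch {
    private final List<String> constraints;

    PathConditionSketch(List<String> constraints) {
        this.constraints = Collections.unmodifiableList(new ArrayList<>(constraints));
    }

    // The path condition at the start of execution: no constraints yet.
    static PathConditionSketch empty() {
        return new PathConditionSketch(new ArrayList<String>());
    }

    // Returns a new path condition extended with one more conjunct.
    PathConditionSketch and(String constraint) {
        List<String> extended = new ArrayList<>(constraints);
        extended.add(constraint);
        return new PathConditionSketch(extended);
    }

    // Forks on symbol.equals(literal): index 0 is the TRUE branch, index 1 the FALSE branch.
    PathConditionSketch[] forkOnEquals(String symbol, String literal) {
        String quoted = "\"" + literal + "\"";
        return new PathConditionSketch[] {
            and(symbol + " = " + quoted),
            and(symbol + " != " + quoted)
        };
    }

    @Override
    public String toString() {
        return constraints.isEmpty() ? "true" : String.join(" && ", constraints);
    }

    public static void main(String[] args) {
        PathConditionSketch[] atLine2 = empty().forkOnEquals("s_action", "checkeligibility");
        System.out.println("TRUE branch:  " + atLine2[0]);
        System.out.println("FALSE branch: " + atLine2[1]);
    }
}
```

Each branch carries its own copy of the path condition, so constraints added on one path never leak into the other.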
3 – Interface Identification
 1. String aValue = req.getIP("action")
 2. if (aValue.equals("checkeligibility"))
 3.     int userAge = getNumIP("age")
 4.     if (userAge < 16)
 5.         displayErrorMsg("Too young.")
 6.     else
 7.         displayQuotePage()
    …
SS[s_action → aValue, s_age → userAge]
PC1: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age < 16
PC2: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age ≥ 16
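How a path condition yields interface information can be sketched by grouping its constraints by parameter. The classes below are hypothetical and purely illustrative. The sketch assumes that PC1 is the branch where userAge < 16 holds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical constraint on one symbolic input parameter.
class InputConstraint {
    final String param;     // parameter name, e.g. "age"
    final String relation;  // e.g. "=", "integer", "<", ">="
    final String operand;   // right-hand side, or null for unary relations

    InputConstraint(String param, String relation, String operand) {
        this.param = param;
        this.relation = relation;
        this.operand = operand;
    }
}

class InterfaceSummarySketch {
    // Groups the constraints of one path condition by parameter name.
    // Keys give the names and grouping; values give the domain information.
    static Map<String, List<String>> summarize(List<InputConstraint> pathCondition) {
        Map<String, List<String>> iface = new LinkedHashMap<>();
        for (InputConstraint c : pathCondition) {
            String text = (c.operand == null) ? c.relation : c.relation + " " + c.operand;
            iface.computeIfAbsent(c.param, k -> new ArrayList<>()).add(text);
        }
        return iface;
    }

    // PC1 from the slide, assuming it is the branch where userAge < 16 holds.
    static List<InputConstraint> pc1() {
        return Arrays.asList(
            new InputConstraint("action", "=", "\"checkeligibility\""),
            new InputConstraint("age", "integer", null),
            new InputConstraint("age", "<", "16"));
    }

    public static void main(String[] args) {
        System.out.println(summarize(pc1()));
    }
}
```

The map keys correspond to the names and grouping of parameters, and the per-parameter lists correspond to the domain information named earlier in the talk.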
Empirical Evaluation
Research Questions (RQ):
• Efficiency: Is the new approach efficient in terms of its analysis time requirements?
• Precision: Is the new approach more precise than previous approaches?
• Usefulness: Does the new approach improve the performance of quality assurance techniques?
Implementation: WAMse
• Written in Java for Java Enterprise Edition (JEE) based web applications
• Implementation modules:
  • transform
    • Customized JEE libraries
    • Stinger for analysis and automated transformation
  • se engine
    • Symbolic execution engine built on Java PathFinder
    • Constraint solver is Yices
  • pc analysis
Implementation: Other Approaches

Dynamic
• Spider: web spider crawls pages of the application
  • OWASP WebScarab Project

Static
• DFW¹: identifies parameter names via static analysis
  • Reimplementation of the author-provided code
• WAMdf²: uses iterative data-flow analysis
  • Implementation from previous work

¹ Deng, Frankl, Wang, SEN 2004.
² Halfond and Orso, FSE 2007.
Subject Applications • Subjects available online from GotoCode.com
RQ1: Efficiency
[Chart comparing Spider, DFW, WAMdf, and WAMse]
• High number of infeasible paths in subjects
• Low number of constraints per parameter
• Web applications are highly modular
RQ2: Precision
[Chart comparing WAMdf and WAMse]
On average, 80% of WAMdf interfaces were spurious.
RQ3: Usefulness
Measure improvement of three quality assurance techniques:
• Invocation Verification
• Penetration Testing
• Test Input Generation
RQ3a – Invocation Verification
[Figure: web application with getQuote.jsp and buyPolicy.jsp; one item is marked with an X.]
[Table: Verification of invocations for subject Bookstore]
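The idea behind invocation verification can be sketched as a check of a request URL against a set of identified interfaces. This is a simplified illustration, not the technique evaluated in the talk. It compares parameter names only, and the accepted groupings are written by hand from the example servlet. The second URL in the usage note is hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InvocationVerifierSketch {
    // Hand-written parameter groupings for getQuote.jsp, assumed for illustration.
    static final List<Set<String>> ACCEPTED = Arrays.asList(
        names("action", "age"),
        names("action", "name", "type", "year"));

    static Set<String> names(String... values) {
        return new LinkedHashSet<>(Arrays.asList(values));
    }

    // Extracts the parameter names from the query string of a URL.
    static Set<String> parameterNames(String url) {
        Set<String> result = new LinkedHashSet<>();
        String query = URI.create(url).getQuery();
        if (query == null || query.isEmpty()) {
            return result;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            result.add(eq < 0 ? pair : pair.substring(0, eq));
        }
        return result;
    }

    // True if the invocation's parameter names match one accepted grouping exactly.
    static boolean matchesSomeInterface(String url) {
        return ACCEPTED.contains(parameterNames(url));
    }

    public static void main(String[] args) {
        System.out.println(matchesSomeInterface(
            "http://host/getQuote.jsp?action=doquote&car=jeep"));
        System.out.println(matchesSomeInterface(
            "http://host/getQuote.jsp?action=checkeligibility&age=30"));
    }
}
```

Under these assumed groupings, the request http://host/getQuote.jsp?action=doquote&car=jeep does not match, because "car" is not a parameter that the example servlet reads.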
RQ3b – Penetration Testing
[Chart comparing Spider, DFW, WAMdf, and WAMse]
Number of vulnerabilities: 2X – 6X higher for WAMse
RQ3c – Test Input Generation
[Charts comparing Spider, DFW, WAMdf, and WAMse: % statement coverage, % branch coverage, # command forms]
• Statement coverage increase: 3%-25%
• Branch coverage increase: 3%-67%
• Command form increase: 651%-1,577%
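One way domain information can guide test input generation is by choosing values on both sides of a constraint boundary. The sketch below is an assumption about how such inputs might be built, not the generator used in the evaluation. The method name and query-string format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TestInputSketch {
    // Builds one query string for each side of an integer boundary,
    // for example age < 16 versus age >= 16 in the example servlet.
    static List<String> queriesAroundBoundary(String action, String intParam, int boundary) {
        List<String> queries = new ArrayList<>();
        for (int value : new int[] {boundary - 1, boundary}) {
            queries.add("action=" + action + "&" + intParam + "=" + value);
        }
        return queries;
    }

    public static void main(String[] args) {
        for (String query : queriesAroundBoundary("checkeligibility", "age", 16)) {
            System.out.println("http://host/getQuote.jsp?" + query);
        }
    }
}
```

Two inputs are enough here to exercise both outcomes of the age check, which is consistent with the talk's observation that precise interfaces allow smaller test suites.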
RQ3c – Test Suite Size
[Chart comparing Spider, DFW, WAMdf, and WAMse]
Test suite decrease in size: 4X – 10X
RQ3c results:
• Higher coverage for measured metrics
• Smaller average test suite
Summary of Results
• Developed an interface identification technique for web applications based on symbolic execution.
• Empirical evaluation:
  • Similar analysis time to other techniques
  • More precise than current techniques
  • Improves quality assurance techniques