1 / 61

Generating Program Inputs for Database Application Testing

Generating Program Inputs for Database Application Testing. Kai Pan , Xintao Wu University of North Carolina at Charlotte. Tao Xie North Carolina State University. 26th IEEE/ACM International Conference on Automated Software Engineering Nov 11, 2011 Lawrence, Kansas. Background.

elias
Download Presentation

Generating Program Inputs for Database Application Testing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Generating Program Inputs for Database Application Testing Kai Pan, Xintao Wu University of North Carolina at Charlotte Tao Xie • North Carolina State University 26th IEEE/ACM International Conference on Automated Software Engineering Nov 11, 2011 Lawrence, Kansas

  2. Background Functional Testing Test Generation Program Inputs

  3. Background Functional Testing Test Generation Program Inputs Database States

  4. An Example Program inputs Database

  5. Motivation

  6. Benefits to use an existing database state • Represent real-world objects’ characteristics, helping detect • faults that could cause failures in real-world settings • Reduce cost of generating new database records

  7. Dynamic Symbolic Execution (DSE) • Execute the program in both concrete and symbolic way (also called concolictesting) • Collect constraints along executed path as path condition • Negate part of the path condition and solve the new path condition to lead to new path • DSE tools for various program languages • Pex for .NET from Microsoft Research

  8. Motivation Path Condition: C1: Query construction constraints

  9. Motivation Path Condition: C1: Query construction constraints C2: Query/DB constraints

  10. Motivation Path Condition: C1: Query construction constraints C2: Query/DB constraints C3: Result manipulation constraints

  11. Motivation Path Condition: C1: Query construction constraints C2: Query/DB constraints C3: Result manipulation constraints C1 ^ C2 ^ C3

  12. Motivation Path Condition: C1: Query construction constraints C2: Query/DB constraints C3: Result manipulation constraints C1 ^ C2 ^ C3 A hard part

  13. Motivation How to derive high-covering program input values based on a given database state?

  14. Outline • Background • Approach • Evaluation • Conclusion and future work

  15. SQL query forms • Fundamental structure: SELECT, FROM, WHERE, GROUP BY, and HAVING clauses. SELECTselect-list FROMfrom-list WHERE qualification (GROUP BY grouping-list) (HAVING group-qualification)

  16. SQL query forms (cont’d) • Nested query: a query with another query embedded within it • Nested query can be unnested into equivalent single level canonical queries SELECTS.snameSELECTS.sname FROM Sailors S FROM Sailors S, Reserves R WHERE EXISTS ( SELECT * WHERE R.sid=S.sid AND R.bid=103 FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid) transoformation rules A nested query Its canonical form

  17. SQL query forms of focus • WHERE clause consisting of a disjunction of conjunctions SELECT C1, C2, ..., Ch FROMfrom-list WHERE (A11 AND ... AND A1n) OR ... OR (Am1 AND ... AND Amn)

  18. Outline • Background • Approach • Evaluation • Conclusion and future work

  19. Illustrative example

  20. Apply DSE on the existing database Step1: DSE chooses “ type=0, zip=0 ”  executed query: Q1: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15 AND C.zipcode=1 AND C.SSN=M.SSN Execution of Q1  zero record, not covering loop body

  21. Apply DSE on the existing database (cont’d) Step2: DSE flips “type == 0” to “type != 0”  “type=1, zip=0”  executed query: Q2: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=30 AND C.zipcode=1 AND C.SSN=M.SSN Execution of Q2  zero record not covering loop body

  22. Apply DSE on the existing database (cont’d) However, An input like “type=0, zip=27694”  executed query: Q3: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15 AND C.zipcode=27695 AND C.SSN=M.SSN Execution of Q3  one record {C.SSN = 001, C.income = 50000, M.balance = 20000}. Covering Line14=true and Line18=false

  23. Apply DSE on the existing database (cont’d) Furthermore, An input like “type=0, zip=28222”,  executed query: Q4: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15 AND C.zipcode=28223 AND C.SSN=M.SSN Execution of Q4  one record {C.SSN = 002, C.income = 150000, M.balance = 30000}. As a result, Line14=true and Line18=true

  24. Assist DSE to generate program inputs How to derive high-covering program input values based on a given database state?

  25. Our idea: construct auxiliary queries Auxiliary query : SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN e.g., result set includes “fzip=27695”. From “fzip=zip+1”, we derive “zip=27694”!

  26. Our idea: construct auxiliary queries (cont’d) Auxiliary query : SELECT C.zipcode, FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN e.g., result set includes “fzip=27695”. From “fzip=zip+1”, we derive “zip=27694”! Cover Line14=true and Line18=false! true false

  27. Our idea: construct auxiliary queries (cont’d) Auxiliary query : SELECT C.zipcode, FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN e.g., result set includes “fzip=27695”. From “fzip=zip+1”, we derive “zip=27694”! Cover Line14=true and Line18=false! true false Act like “Constraint Solver” for Program Constraints +DB State Constraints

  28. Approach • Collect query construction constraints • on program variables used in the executed queries from the program code

  29. Approach (cont’d) • Collect query construction constraints • on program variables used in the executed queries from the program code • Collect result manipulation constraints • on comparing with record values in the query’s result set (such as “if (diff>100000)”)

  30. Construct auxiliary queries For path “Line04=true, Line14=true”, construct the abstract query: true SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15AND C.zipcode=‘fzip’ AND C.SSN=M.SSN true

  31. Construct auxiliary queries For path “Line04=true, Line14=true”, construct the abstract query: true SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15AND C.zipcode=‘fzip’ AND C.SSN=M.SSN Our target true

  32. Construct auxiliary queries true SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15AND C.zipcode=‘fzip’ AND C.SSN=M.SSN SELECTC.zipcode Construct auxiliary query true

  33. Construct auxiliary queries true SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15AND C.zipcode=‘fzip’ AND C.SSN=M.SSN SELECTC.zipcode FROM customer C, mortgage M Construct auxiliary query true

  34. Construct auxiliary queries true SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHEREM.year=15AND C.zipcode=‘fzip’ AND C.SSN=M.SSN SELECTC.zipcode FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN Construct auxiliary query true

  35. Generate program input values Run auxiliary query: SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15AND C.SSN=M.SSN  fzip:27695 or 28223

  36. Generate program input values Run auxiliary query: SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15AND C.SSN=M.SSN  fzip: 27695 or 28223  zip: 27694 or 28222

  37. Generate program input values Input combinations: type: 0 or !0 X zip: 27694 or 28222 true “type=0, zip=27694” covers Line04=true, Line14=true, but Line18=false true false

  38. Approach (cont’d) Not enough! Program variables in branch condition after executing the query may be data-dependent on returned record values. How to cover Line18 true branch?

  39. Approach (cont’d) • To cover path Line04=true, Line14=true, Line18=true We need to extend previous auxiliary query true true true

  40. Construct auxiliary queries We extend the WHERE clause true SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN (----how to extend?----) true true

  41. Construct auxiliary queries We extend the WHERE clause true SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN (----how to extend?----) true true

  42. Construct auxiliary queries We extend the WHERE clause true SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000 true true

  43. Generate program input values Run auxiliary query: SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000  fzip=28223

  44. Generate program input values Run auxiliary query: SELECTC.zipcode, FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000  fzip=28223  zip=28222

  45. Other issues (aggregate calculation) • Extend auxiliary query with GROUP BY and HAVING clauses. Involve multiple records

  46. Other issues (aggregate calculation) SELECTC.zipcode, sum(M.balance) FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000 GROUP BY C.zipcode HAVING sum(M.balance) > 500000

  47. Other issues (cardinality constraints) SELECTC.zipcode FROM customer C, mortgage M WHEREM.year=15 AND C.SSN=M.SSN AND C.income - 1.5 * M.balance > 100000 GROUP BY C.zipcode HAVING COUNT(*) >= 3 Use a special DSE technique for dealing with input-dependent loops P. Godefroid and D. Luchaup. Automatic partial loop summarization in dynamic test generation. In ISSTA 2011.

  48. Outline • Background • Approach • Evaluation • Conclusion and future work

  49. Research questions • RQ1 (Effectiveness): What is the percentage increase in code coverage by the program inputs generated by Pex with our approach’s assistance? • RQ2 (Cost): What is the cost of our approach’s assistance?

  50. Evaluation subjects Two open source database applications • RiskIt • 4.3K LOC, database: 13 tables, 57 attributes, and >1.2 million records • 17 DB-interacting methods selected for testing • UnixUsage • 2.8K LOC, database: 8 tables, 31 attributes, and >0.25 million records • 28 DB-interacting methods selected for testing

More Related