Parallel Execution of Test Runs for Database Application Systems

Donald Kossmann (ETH Zurich, i-TV-T AG), Florian Haftmann (i-TV-T AG), Eric Lo (ETH Zurich)

Some Facts • Microsoft spends 50% of their development cost on testing • SAP product cycle = 18 months • 6 months to execute tests Testing is the most expensive phase of the software development cycle

Observation • The more test runs, the better • However, it takes more time! • Goal: Optimize Testing Time

Definition: Test Run Ti • A sequence of requests, each with an Expected Result • Example: Test Run “Login” (2 requests; figure omitted)

More Definitions • Failed Test Run: At least one request does not return the expected result • Test Database D: The state of an Application + Database at the beginning of each test • Database Reset R: Bring the database back to D

A Test Run Fails When: • The application has a real bug • Or the test database is in the wrong state due to the execution of earlier test runs → Carry out resets to find real bugs

Resetting the Test Database?
Test runs: • P.O. Insertion • Count P.O. • …
• Initial state: Database PurchaseOrder = {P1}
• TA: Insert Purchase Order P2 → <TA Success>, Database PurchaseOrder = {P1, P2}
• TB: Get Total Purchase Order on the initial state → Expected Result: 1, Actual Result: 1, <TB Success>
• TB: Get Total Purchase Order right after TA → Expected Result: 1, Actual Result: 2, <TB Fails>
• Database Reset is needed! Run TA, Reset DB, then run TB

Database Reset • Resetting a database for a large-scale application takes about 2 minutes! • Back-of-the-envelope calculation: • 10,000 test runs = 10,000 resets × 2 min ≈ 2 weeks on DB resets for 1 complete test
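The back-of-the-envelope figure can be checked directly. The sketch below assumes exactly one reset per test run and uses the 2-minute reset cost quoted above; the variable names are illustrative.

```python
from datetime import timedelta

test_runs = 10_000
reset_minutes = 2  # per-reset cost quoted on the slide

# One reset per test run, executed back to back on one machine.
total = timedelta(minutes=test_runs * reset_minutes)
total_days = total / timedelta(days=1)

print(f"{total_days:.1f} days spent on resets")  # roughly two weeks
```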

Reordering Test Runs
Test runs: • P.O. Insertion • Count P.O. • …
• Order TA, TB: TA: Insert Purchase Order P2 → Database PurchaseOrder = {P1, P2}; TB: Get Total Purchase Order → Expected Result: 1, Actual Result: 2, <TB Fails>
• Order TB, TA: TB: Get Total Purchase Order → Expected Result: 1, Actual Result: 1, <TB Success> (Database PurchaseOrder = {P1}); TA: Insert Purchase Order P2 → <TA Success>, Database PurchaseOrder = {P1, P2}
Order Matters!
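To make "Order Matters!" concrete, the sketch below counts how many extra resets each ordering of TA and TB needs when a failed test run is reset and rerun. The helper `resets_needed` and the two test functions are hypothetical stand-ins for the slide's test runs, and treating a failure on a fresh database as an error is a simplification.

```python
from itertools import permutations

INITIAL = {"P1"}  # initial content of PurchaseOrder

def insert_p2(state):
    """TA: insert purchase order P2."""
    state.add("P2")
    return "P2" in state

def count_is_one(state):
    """TB: the total number of purchase orders is expected to be 1."""
    return len(state) == 1

TESTS = {"TA": insert_p2, "TB": count_is_one}

def resets_needed(order):
    """Run the tests in the given order; on a failure, reset and rerun the test."""
    state = set(INITIAL)
    resets = 0
    for name in order:
        if not TESTS[name](state):
            state = set(INITIAL)  # database reset
            resets += 1
            if not TESTS[name](state):
                raise RuntimeError(f"{name} fails on a fresh database")
    return resets

cost = {order: resets_needed(order) for order in permutations(TESTS)}
for order, resets in cost.items():
    print(" ".join(order), "->", resets, "extra reset(s)")
```

Under these assumptions the order TA, TB costs one extra reset, while TB, TA costs none.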

Our Previous Work (CIDR 2005) • A test run depends on a correct state of a database • Control the database state • Reduce the number of database resets • Algorithms to optimize order of test runs • No parallelism in testing

Can we do better if we have > 1 machine?

Parallel Testing is a Two-dimensional Problem! • Fully utilize the available resources • Load Balancing! • As on a single machine, we still have to control the database state • Reduce the database resets!

More about the Problem • Regression test • Later stage of the development cycle • Minor changes between versions • Execute the same set of test runs • Version 1.1 • Execute test: T1 T2 T3 T4 • Version 1.2 (Bug fixed and/or minor changes) • Execute test: T1 T2 T3 T4

Parallel Testing: Shared-Nothing vs. Shared-Database

Shared-Nothing (SN) • If I work for IBM, I can install: • N applications • N databases • N machines • One more machine: • More admin. work! • More license fees! • Applications do not SHARE the database • [Diagram: test runs (e.g., T4 T31 T12 T5) are distributed over Machine 1 … Machine N, each with its own Application and Database]

Shared-Database (SDB) • If I work for PoorEric.com, I install: • N threads (e.g., open N browsers) • 1 database • 1 machine • The threads SHARE the database • Test runs interfere with each other • Does not scale as well as Shared-Nothing • [Diagram: test runs (e.g., T4 T31 T12 T5) are distributed over Thread 1 … Thread N, which all use one Application and one Database]

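Parallel Testing Framework • [Diagram: a Scheduler takes test runs (… T6 T2 T5 T1) and dispatches them to Machine/Thread 1 … Machine/Thread N, each with an Application and a Database and a "Reset?" decision; the Scheduler keeps a History per machine (M1: T1 T2 …, MN: T5 …) and a Conflicts DB]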


Execution Strategies • Optimistic Execution: • Reset the database only when it is a must • Example: R T1 T2 T3 T4 R T4 T5 • Optimistic++ Execution: • Avoid executing a test run twice • Example (Wk 1): R T1 T2 T3 T4 R T4 T5 • Example (Wk 2): R T1 T2 T3 (Next is T4?) R T4 T5, using the recorded conflict <T1 T2 T3> → T4 • Slice Reordering Heuristics: • Slice: A sequence of test runs without conflicts • Example: R T1 T2 T3 T4 R T4 T5 gives the slices <T1 T2 T3> and <T4 T5> • Collect <slice>s during each test • Graph Reordering Heuristics
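A minimal sketch of Optimistic Execution with slice collection follows. It is not the authors' implementation: `run_optimistic` and the callback `run_test(name, history)` are hypothetical, and the rule that T4 fails whenever T2 has run since the last reset is invented purely to reproduce the example schedule R T1 T2 T3 T4 R T4 T5.

```python
def run_optimistic(order, run_test):
    """Reset only when a test run fails, then rerun it on the fresh database.

    run_test(name, history) returns True if `name` passes after the test runs
    in `history` have executed since the last reset (hypothetical callback).
    Returns the executed schedule, the collected slices, and the conflicts.
    """
    schedule = ["R"]   # every test starts with a reset
    slices = []        # maximal conflict-free sequences
    conflicts = []     # (test runs executed since the last reset, failed test run)
    current = []
    for name in order:
        schedule.append(name)
        if run_test(name, tuple(current)):
            current.append(name)
            continue
        if not current:
            raise RuntimeError(f"{name} fails on a fresh database")
        conflicts.append((tuple(current), name))
        slices.append(current)
        schedule += ["R", name]  # reset, then execute the test run again
        if not run_test(name, ()):
            raise RuntimeError(f"{name} fails on a fresh database")
        current = [name]
    if current:
        slices.append(current)
    return schedule, slices, conflicts

def toy_outcome(name, history):
    # Invented rule for illustration: T4 fails if T2 ran since the last reset.
    return not (name == "T4" and "T2" in history)

schedule, slices, conflicts = run_optimistic(["T1", "T2", "T3", "T4", "T5"], toy_outcome)
print(" ".join(schedule))
```

The recorded conflict is what Optimistic++ would consult in a later test to reset before T4 instead of executing it twice.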

Parallel Testing: Shared-Nothing (SN)

Shared-Nothing • [Diagram: the Scheduler takes test runs from the Test Run Input Queue (… T6 T2 T5 T1), uses the Conflicts DB, and dispatches to Machine 1 and Machine 2, each with its own Application, Database, and Reset]

Test 1 (M1, M2: machines; R: reset; the Conflicts DB starts empty)
• M1: R | M2: R | Test Run Input Queue: T1 T5 T2 T6 T3 T7 T8
• M1: R T1 | M2: R | Queue: T5 T2 T6 T3 T7 T8
• M1: R T1 | M2: R T5 | Queue: T2 T6 T3 T7 T8
• M1: R T1 T2 | M2: R T5 | Queue: T6 T3 T7 T8
• M1: R T1 T2 | M2: R T5 T6 | Queue: T3 T7 T8
• M1: R T1 T2 T3 | M2: R T5 T6 | Queue: T7 T8
• M1: R T1 T2 T3 | M2: R T5 T6 R | Queue: T7 T8
• M1: R T1 T2 T3 | M2: R T5 T6 R T6 | Queue: T7 T8 | Conflicts DB: T5 → T6
• M1: R T1 T2 T3 R | M2: R T5 T6 R T6 | Queue: T7 T8 | Conflicts DB: T5 → T6, T1 T2 → T3
• M1: R T1 T2 T3 R | M2: R T5 T6 R T6 T7 | Queue: T8 | Conflicts DB: T5 → T6, T1 T2 → T3
• M1: R T1 T2 T3 R | M2: R T5 T6 R T6 T7 T8 | Queue: empty | Conflicts DB: T5 → T6, T1 T2 → T3
• M1: R T1 T2 T3 R T3 | M2: R T5 T6 R T6 T7 T8 | Queue: empty | Conflicts DB: T5 → T6, T1 T2 → T3

Shared-Nothing - Slice • 3 major principles: • The slices in the input queue are ordered by: • Reordering the slices on each machine locally • Merging the partial orders • All test runs of the same slice are executed on the same machine • The scheduler makes sure conflicting slices are executed on different machines as much as possible

Collect Slices
• M1: R T1 T2 T3 R T3 | M2: R T5 T6 R T6 T7 T8 | Test Run Input Queue: empty | Conflicts DB: T5 → T6, T1 T2 → T3
Reordering Slices
• Slices taken from M1: <T1 T2>, <T3> (leaving R T3 R) | Slices taken from M2: <T5>, <T6 T7 T8> (leaving R T6 R)
• The slices of each machine are reordered into Local Order M1 and Local Order M2
Merge Partial Order
• Local Order M1 and Local Order M2 are merged into the Test Run Input Queue
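The slides do not spell out how the local orders are merged. As one possible reading, the sketch below interleaves the per-machine slice lists round-robin while keeping each machine's local order intact. The function `merge_local_orders` is hypothetical, and the two local orders are assumptions chosen to be consistent with the input queue T3 T6 T7 T8 T1 T2 T5 shown for Test 10.

```python
from itertools import zip_longest

def merge_local_orders(local_orders):
    """Interleave per-machine slice lists round-robin.

    The relative order of slices from the same machine is preserved, so the
    result is one valid merge of the partial orders (assumed merge rule).
    """
    queue = []
    for round_of_slices in zip_longest(*local_orders):
        queue.extend(s for s in round_of_slices if s is not None)
    return queue

# Assumed local orders after reordering the slices on each machine.
local_order_m1 = [("T3",), ("T1", "T2")]
local_order_m2 = [("T6", "T7", "T8"), ("T5",)]

slice_queue = merge_local_orders([local_order_m1, local_order_m2])
input_queue = [test for s in slice_queue for test in s]
print(" ".join(input_queue))
```

Keeping slices as tuples makes it easy for a scheduler to hand a whole slice to one machine, in line with the second principle.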


Test 10 (Conflicts DB: T5 → T6, T1 T2 → T3, T3 → T1)
• M1: R | M2: R | Test Run Input Queue: T3 T6 T7 T8 T1 T2 T5
• M1: R T3 | M2: R | Queue: T6 T7 T8 T1 T2 T5
• M1: R T3 | M2: R T6 T7 T8 | Queue: T1 T2 T5
• M1: R T3 (Conflict?) | M2: R T6 T7 T8 | Queue: T1 T2 T5
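The "Conflict?" check can be sketched as a lookup in the Conflicts DB before a slice is assigned. Everything here is an assumption made for illustration: the entries are read as "test runs executed before → test run that failed", the check is deliberately conservative, ties are broken by the shortest history, and `may_conflict` and `pick_machine` are hypothetical names.

```python
# Assumed encoding of the Conflicts DB entries T5 → T6, T1 T2 → T3, T3 → T1.
CONFLICTS = [
    (("T5",), "T6"),
    (("T1", "T2"), "T3"),
    (("T3",), "T1"),
]

def may_conflict(history, slice_, conflicts=CONFLICTS):
    """Conservative check: some test run in the slice has failed before after
    a test run that is in this machine's history since its last reset."""
    return any(
        failing in slice_ and any(t in history for t in before)
        for before, failing in conflicts
    )

def pick_machine(slice_, histories, conflicts=CONFLICTS):
    """Prefer machines without a known conflict; among those, the shortest history."""
    safe = [m for m, h in histories.items() if not may_conflict(h, slice_, conflicts)]
    candidates = safe or list(histories)  # fall back if every machine conflicts
    return min(candidates, key=lambda m: len(histories[m]))

# State on the last slide: M1 ran T3, M2 ran T6 T7 T8, next slice is T1 T2.
histories = {"M1": ["T3"], "M2": ["T6", "T7", "T8"]}
target = pick_machine(("T1", "T2"), histories)
print(target)
```

With these assumptions the slice T1 T2 avoids M1 because of the entry T3 → T1, which matches the principle that conflicting slices should run on different machines as much as possible.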
