Corky Cartwright, Rice and Halmstad Universities, Summer 2013. Unit Testing in Java with an Emphasis on Concurrency.
Software Engineering Culture Three Guiding Visions: data-driven design, test-driven development, and mostly functional coding (no gratuitous mutation). Codified in the Design Recipe taught in How to Design Programs by Felleisen et al. (available for free online: www.htdp.org [first edition], www.ccs.neu.edu/home/matthias/HtDP2e/Draft [second edition]) and Elements of Object-Oriented Design (available online). The target languages are Scheme and Java.
Timeliness CPU clock frequencies stagnate Multi-Core CPUs provide additional processing power, but multiple threads needed to use multiple cores. Writing concurrent programs is difficult!
Tutorial Outline Introduce unit testing in single-threaded (deterministic) setting using lists Demonstrate problems introduced by concurrency and their impact on unit testing Show how some of the most basic problems can be overcome by using the right policies and tools.
(Sequential) Unit Testing Unit tests … Test parts of the program (including whole!) Integrate with program development; commits to repository must pass all unit tests Automate testing during maintenance phase Serve as documentation Prevent bugs from reoccurring Help keep the code repository clean Effective with a single thread of control
Universal Test-Driven Design Recipe Analyze the problem: define the data and determine the top-level operations. Give sample data values. Define type signatures, contracts, and headers for all top-level operations. In Java, the type signature is part of the header. Give input-output examples, including critical boundary cases, for each operation. Write a template for each operation, typically based on structural decomposition of the primary argument (the receiver in OO methods). Code each method by filling in the templates. Test every method (using the I/O examples!) and ascertain that every method is tested on a sufficient set of examples. White-box testing matters!
Sequential Case Studies: Functional Lists and Bi-Lists A List<E> is either Empty<E>(), or Cons<E>(e, l) where e is an E and l is List<E> A BiList<E> is a mutable data structure containing a possibly empty sequence of objects of type E that can be traversed in either direction using a BiListIterator<E>.
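The List<E> definition above maps directly onto a small class hierarchy, and the input-output examples of the design recipe become the tests. The following is only a sketch: the outer class FunctionalLists, the operations length and append, and the helper check are assumptions made for illustration and are not taken from the course code.

```java
// Hypothetical sketch of the functional list described on the slide.
final class FunctionalLists {
    /** A List<E> is either Empty<E>() or Cons<E>(e, l). */
    static abstract class List<E> {
        abstract int length();
        abstract List<E> append(List<E> other);
        /** Builds a new list with e in front; the receiver is never mutated. */
        List<E> cons(E e) { return new Cons<E>(e, this); }
    }

    static final class Empty<E> extends List<E> {
        int length() { return 0; }
        List<E> append(List<E> other) { return other; }
        @Override public boolean equals(Object o) { return o instanceof Empty; }
        @Override public int hashCode() { return 0; }
    }

    static final class Cons<E> extends List<E> {
        final E first;
        final List<E> rest;
        Cons(E first, List<E> rest) { this.first = first; this.rest = rest; }
        // Template: structural decomposition into first and rest.
        int length() { return 1 + rest.length(); }
        List<E> append(List<E> other) { return new Cons<E>(first, rest.append(other)); }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Cons)) return false;
            Cons<?> c = (Cons<?>) o;
            return first.equals(c.first) && rest.equals(c.rest);
        }
        @Override public int hashCode() { return 31 * first.hashCode() + rest.hashCode(); }
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        List<Integer> empty = new Empty<Integer>();
        List<Integer> oneTwoThree = empty.cons(3).cons(2).cons(1);
        // Boundary cases first, then a general case.
        check(empty.length() == 0, "length of empty list");
        check(empty.append(oneTwoThree).equals(oneTwoThree), "empty is a left identity");
        check(oneTwoThree.append(empty).equals(oneTwoThree), "empty is a right identity");
        check(oneTwoThree.append(oneTwoThree).length() == 6, "append adds lengths");
        System.out.println("all list examples passed");
    }
}
```

Because no method mutates its receiver, every example is deterministic and can be rerun in any order.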
Review Elements of Sequential Unit Testing Unit tests depend on deterministic behavior. Known input, expected output: success means correct behavior, failure means flawed code. The outcome of a test is meaningful only if the test is deterministic.
Problems Due to Concurrency Thread scheduling is nondeterministic and machine-dependent. Code may be executed under different schedules. Different schedules may produce different results. Known input, expected output(s?): success means correct behavior in this schedule, but the code may be flawed in another schedule; failure means flawed code. Success of a unit test is meaningless.
Recommended Resources on Concurrent Programming in Java Explicit Concurrency: Comp 402 web site from 2009; Brian Goetz, Java Concurrency in Practice (available online at this website). Coping with Multicore: emerging parallel extensions of Java/Scala that guarantee determinism (in a designated subset), do not require explicit synchronization, and avoid JMM issues: Habanero Java, Habanero Scala. Success of a non-deterministic unit test is not very meaningful.
Problems Due to Java Memory Model JMM is MUCH weaker than sequential consistency Writes to shared data may be held pending indefinitely unless target is declared volatile or is shielded by the same lock as subsequent reads. Why not always use locking (synchronized)? Significant overhead Increases likelihood of deadlock Extremely difficult to reason about program execution for specific inputs because so many schedules are allowed. A model that accommodates compiler writers rather than software developers.
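The visibility rule above can be illustrated with a volatile flag that publishes a plain field. This is a minimal sketch; the class VolatilePublishDemo and its members are hypothetical names. Without volatile on ready, the JMM would allow the reader to spin forever or to see a stale value of data.

```java
// Hypothetical sketch: a volatile write publishes the plain write that precedes it.
final class VolatilePublishDemo {
    static int data = 0;                    // plain (non-volatile) shared field
    static volatile boolean ready = false;  // volatile flag guarding data

    /** Returns the value of data that the reader thread observed. */
    static int publishAndRead() {
        data = 0;
        ready = false;
        final int[] seen = new int[1];
        Thread reader = new Thread(new Runnable() {
            public void run() {
                while (!ready) {            // volatile read: eventually sees the write
                    Thread.yield();
                }
                seen[0] = data;             // ordered after the volatile read of ready
            }
        });
        reader.start();
        data = 42;                          // plain write ...
        ready = true;                       // ... made visible by the volatile write
        try {
            reader.join();                  // join makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw " + publishAndRead());
    }
}
```

The write to data happens-before the volatile write to ready, which happens-before the reader's volatile read that observes it, so the reader is guaranteed to see 42.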
Hidden Pitfalls in Using JUnit to Test Concurrent Java JUnit Is Completely Broken for Concurrent Code Units: Fails to detect exceptions and failed assertions in threads other than the main thread (!) Fails to detect if an auxiliary thread is still running when the main thread terminates; all execution is aborted when the main thread terminates. Fails to ensure that all auxiliary threads were joined by the main thread before termination. (In Habanero Java, all programs are implicitly enclosed in a comprehensive join called finish(), but not in Java.)
Possible Solutions to Concurrent Testing Problems Programming Language Features Ensure that bad things cannot happen; perhaps ensure determinism (reducing testing to sequential semantics!) May restrict programmers Comprehensive Testing Testing if bad things happen in any schedule All schedules may be too stringent for programs involving GUIs Does not limit space of solutions but testing burden is greatly increased. Good testing tools are essential.
Coping with the Java Memory Model Avoid using synchronized and minimize the size of synchronized blocks to reduce likelihood of deadlock. Identify all classes that can be shared and make all fields in such classes either final or volatile. Ensures sequential consistency (almost). Array elements are still technically a problem because they cannot be marked as volatile. The ConcurrentUtilities library includes a special form of array with volatile elements.
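The ConcurrentUtilities array class mentioned above is not shown in these slides. As a stand-in, the sketch below uses the standard java.util.concurrent.atomic.AtomicIntegerArray, whose get and set have volatile semantics per element; the class VolatileArrayDemo and its method are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical sketch: an array whose elements behave like volatile fields,
// using the JDK's AtomicIntegerArray rather than the ConcurrentUtilities class.
final class VolatileArrayDemo {
    /** Two threads each increment cell 0 perThread times; returns the final value. */
    static int countWithTwoThreads(final int perThread) {
        final AtomicIntegerArray cells = new AtomicIntegerArray(1);
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    cells.incrementAndGet(0);   // atomic read-modify-write on element 0
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return cells.get(0);                    // volatile read of element 0
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(1000));
    }
}
```

With a plain int[] and cells[0]++ the same program could lose updates, because neither atomicity nor visibility of the element is guaranteed.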
Improvements to JUnit ConcJUnit, developed by my former graduate student Mathias Ricken, fixes all of the problems with JUnit. Developed for Java 6; Java 7 not yet supported. Mathias developed some other tools to help test concurrent programs, but none of them has yet reached production quality (e.g., random delays/yields). Research idea: JVM from Hell. Presumably easy to use the ConcJUnit jar instead of JUnit in Eclipse. Designed for drop-in compatibility with JUnit 4.7.
Sample JUnit Tests
    import junit.framework.TestCase;

    public class Test extends TestCase {
        public void testException() {
            throw new RuntimeException("booh!");
        }
        public void testAssertion() {
            assertEquals(0, 1);  // in effect: if (0 != 1) throw new AssertionFailedError();
        }
    }
Both tests fail.
Problematic JUnit Tests
    public class Test extends TestCase {
        public void testException() {
            new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            }).start();
        }
    }
Diagram: the main thread spawns the child thread and reaches the end of the test (success!); the child thread throws the RuntimeException, which is uncaught!
Problematic JUnit Tests The same testException again: the RuntimeException thrown in the child thread is an uncaught exception, so the test should fail, but it does not!
Problematic JUnit Tests
    public class Test extends TestCase {
        public void testFailure() {
            new Thread(new Runnable() {
                public void run() {
                    fail("This thread fails!");
                }
            }).start();
        }
    }
The failed assertion surfaces as an uncaught exception in the child thread; the test should fail but does not!
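The effect can be reproduced without JUnit. The sketch below imitates a naive test runner that treats a test as passed unless an exception reaches the calling thread; LostExceptionDemo and its members are hypothetical names, and the per-thread handler is installed only to record that the child really threw.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of why a runner that only watches its own thread misses the failure.
final class LostExceptionDemo {
    /** Returns { runner reported success, child thread threw an exception }. */
    static boolean[] run() {
        final AtomicBoolean childThrew = new AtomicBoolean(false);
        Thread child = new Thread(new Runnable() {
            public void run() {
                throw new RuntimeException("booh!");
            }
        });
        // Record the exception instead of printing the default stack trace.
        child.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                childThrew.set(true);
            }
        });
        boolean reportedSuccess;
        try {
            child.start();            // the "test body": spawn the child and return
            reportedSuccess = true;   // nothing was thrown on this thread
        } catch (RuntimeException e) {
            reportedSuccess = false;  // never reached: the exception stays in the child
        }
        try {
            child.join();             // wait only so that the demo result is stable
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new boolean[] { reportedSuccess, childThrew.get() };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("runner reported success: " + r[0] + ", child threw: " + r[1]);
    }
}
```

Both flags come back true: the child failed, yet the code that started it saw nothing.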
Thread Group for JUnit Tests Same testException code, now run in a test thread: the test thread spawns the child thread; the child thread's uncaught exception invokes TestGroup's uncaught exception handler, which the runner checks.
Thread Group for JUnit Tests Schedule for testException: the main thread spawns the test thread and waits; the test thread spawns the child thread and reaches the end of the test; the child thread's exception is uncaught and invokes the group's handler; the main thread resumes, checks the group's handler, and reports failure!
Improvements to JUnit • Uncaught exceptions and failed assertions • Not caught in child threads • Thread group with exception handler • JUnit test runs in a separate thread, not main thread • Child threads are created in same thread group • When test ends, check if handler was invoked • Detection of uncaught exceptions and failed assertions in child threads that occurred before test’s end Past tense: occurred!
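The thread-group idea in the bullets above can be sketched in plain Java. This is not ConcJUnit's implementation; GroupRunnerDemo, RecordingGroup, runInGroup, and spawningBody are hypothetical names. The sketch relies on two standard behaviors: a new thread joins its creator's thread group by default, and a thread without its own handler hands uncaught exceptions to its group's uncaughtException method.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a runner that executes a test in its own thread group.
final class GroupRunnerDemo {
    /** Thread group that remembers the first uncaught exception of any member thread. */
    static final class RecordingGroup extends ThreadGroup {
        final AtomicReference<Throwable> firstUncaught = new AtomicReference<Throwable>();
        RecordingGroup(String name) { super(name); }
        @Override public void uncaughtException(Thread t, Throwable e) {
            firstUncaught.compareAndSet(null, e);
        }
    }

    /** Runs testBody in a separate test thread; returns the recorded exception or null. */
    static Throwable runInGroup(Runnable testBody) {
        RecordingGroup group = new RecordingGroup("test-group");
        Thread testThread = new Thread(group, testBody, "test-thread");
        testThread.start();
        try {
            testThread.join();                   // main thread waits for the end of the test
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return group.firstUncaught.get();        // check whether the handler was invoked
    }

    /** A test body whose child thread throws; it joins the child so the outcome is stable. */
    static Runnable spawningBody() {
        return new Runnable() {
            public void run() {
                Thread child = new Thread(new Runnable() {   // inherits the test thread's group
                    public void run() {
                        throw new RuntimeException("booh!");
                    }
                });
                child.start();
                try {
                    child.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("recorded: " + runInGroup(spawningBody()));
    }
}
```

The check only sees exceptions that occurred before the test thread ended, which is why spawningBody joins its child here.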
Child Thread Outlives Parent Favorable schedule for testException: the main thread spawns the test thread and waits; the test thread spawns the child thread and reaches the end of the test; the child thread's exception is uncaught and invokes the group's handler; the main thread resumes, checks the group's handler, and reports failure!
Child Thread Outlives Parent Unfavorable schedule for testException: the main thread spawns the test thread and waits; the test thread spawns the child thread and reaches the end of the test; the main thread resumes, checks the group's handler, and reports success! Only then does the child thread throw its uncaught exception and invoke the group's handler. Too late!
Enforced Join
    public class Test extends TestCase {
        public void testException() throws InterruptedException {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    throw new RuntimeException("booh!");
                }
            });
            t.start();
            …
            t.join();  // the test thread waits for the child thread before the test ends
            …
        }
    }
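One way a runner could notice a missing join is to ask, once the test thread has ended, whether its thread group still contains live threads. The sketch below is an assumption about how such a check might look, not a description of ConcJUnit internals; JoinCheckDemo and its methods are hypothetical names. ThreadGroup.activeCount() is documented as an estimate, which is adequate here because no threads are started while it runs.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: detect child threads that outlive the test thread.
final class JoinCheckDemo {
    /** Runs testBody in a fresh group; returns how many group threads are alive afterwards. */
    static int lingeringThreadsAfter(Runnable testBody) {
        ThreadGroup group = new ThreadGroup("join-check");
        Thread testThread = new Thread(group, testBody, "test-thread");
        testThread.start();
        try {
            testThread.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return group.activeCount();
    }

    /** Test body forgets to join a child that is still blocked when the test ends. */
    static int unjoinedChildDemo() {
        final CountDownLatch release = new CountDownLatch(1);
        int lingering = lingeringThreadsAfter(new Runnable() {
            public void run() {
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            release.await();    // stays alive until the demo releases it
                        } catch (InterruptedException e) {
                            // fall through and exit
                        }
                    }
                }).start();                     // no join: the child outlives the test
            }
        });
        release.countDown();                    // let the child finish after the check
        return lingering;
    }

    /** Test body joins its child, so nothing is left running. */
    static int joinedChildDemo() {
        return lingeringThreadsAfter(new Runnable() {
            public void run() {
                Thread t = new Thread(new Runnable() { public void run() { } });
                t.start();
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("unjoined: " + unjoinedChildDemo() + ", joined: " + joinedChildDemo());
    }
}
```

A runner built this way could fail the test whenever the count is nonzero, which forces the test author to add the join.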
Testing Using ConcJUnit Replacement for junit.jar or as plugin JAR for JUnit 4.7 compatible with Java 6 (not 7 or 8) Available as binary and source at http://www.concutest.org/ Results from DrJava’s unit tests Child thread for communication with slave VM still alive in test Several reader and writer threads still alive in low level test (calls to join() missing) DrJava currently does not use ConcJUnit Tests based on a custom-made class extending junit.framework.TestCase Does not check if join() calls are missing
Conclusion Improved JUnit now detects problems in other threads Only in chosen schedule Needs schedule-based execution Annotations ease documentation and checking of concurrency invariants Open-source library of Java API invariants Support programs for schedule-based execution
Future Work Adversary scheduling using delays/yields (JVM from Hell) Schedule-Based Execution (Impractical?) Replay stored schedules Generate representative schedules Dynamic race detection (which races are bugs?) Randomized schedules (JVM from Hell) Support annotations from Floyd-Hoare logic Declare and check contracts (preconditions & postconditions for methods) Declare and check class invariants
Tractability of Comprehensive Testing • Test all possible schedules • Concurrent unit tests meaningful again • Number of schedules: $N = \frac{(ts)!}{(s!)^t}$ • t: # of threads, s: # of slices per thread
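To get a feel for how quickly N grows, the count can be evaluated directly. The sketch assumes the closed form N = (ts)! / (s!)^t derived on the "Extra: Number of Schedules" slide; the class ScheduleCount and its methods are hypothetical names.

```java
import java.math.BigInteger;

// Hypothetical sketch: evaluate N = (ts)! / (s!)^t for t threads with s slices each.
final class ScheduleCount {
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Number of interleavings of t threads that each run s indivisible slices. */
    static BigInteger schedules(int t, int s) {
        return factorial(t * s).divide(factorial(s).pow(t));
    }

    public static void main(String[] args) {
        int[][] cases = { {2, 2}, {2, 3}, {3, 2}, {4, 10} };
        for (int[] c : cases) {
            System.out.println("t=" + c[0] + ", s=" + c[1] + ": N=" + schedules(c[0], c[1]));
        }
    }
}
```

Two threads with two slices each already allow 6 schedules, and three threads with two slices each allow 90, so exhaustive enumeration at the level of individual time slices is hopeless for realistic programs.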
Extra: Number of Schedules Product of s-combinations: for thread 1, choose s out of ts time slices; for thread 2, choose s out of ts-s time slices; …; for thread t-1, choose s out of 2s time slices; for thread t, choose s out of s time slices. Writing the s-combinations using factorials, each term in a denominator cancels against the next numerator, leaving (ts)! in the numerator and t factors of s! in the denominator.
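Written out, the product of s-combinations described above telescopes as follows (t threads, s slices per thread, as on the slide):

```latex
\begin{align*}
N &= \binom{ts}{s}\binom{ts-s}{s}\cdots\binom{2s}{s}\binom{s}{s} \\
  &= \frac{(ts)!}{s!\,(ts-s)!}\cdot\frac{(ts-s)!}{s!\,(ts-2s)!}\cdots
     \frac{(2s)!}{s!\,s!}\cdot\frac{s!}{s!\,0!} \\
  &= \frac{(ts)!}{(s!)^{t}}
\end{align*}
```

For example, t = 2 and s = 2 give N = 4!/(2!·2!) = 6.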
Tractability of Comprehensive Testing • If program is race-free, we do not have to simulate all thread switches • Threads interfere only at “critical points”: lock operations, shared or volatile variables, etc. • Code between critical points cannot affect outcome • Simulate all possible arrangements of blocks delimited by critical points • Run dynamic race detection in parallel • Lockset algorithm (e.g. Eraser by Savage et al)
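The core of the lockset idea fits in a few lines: for every shared variable, keep the set of locks that were held on all accesses so far, and report a potential race once that set becomes empty. The sketch below shows only this basic refinement step; the Eraser paper adds a per-variable state machine to suppress false alarms during initialization, which is omitted. LocksetSketch and its methods are hypothetical names, and threads, locks, and variables are identified by plain strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of basic lockset refinement (no initialization state machine).
final class LocksetSketch {
    private final Map<String, Set<String>> candidates = new HashMap<String, Set<String>>();
    private final Map<String, Set<String>> held = new HashMap<String, Set<String>>();

    void acquire(String thread, String lock) {
        Set<String> locks = held.get(thread);
        if (locks == null) {
            locks = new HashSet<String>();
            held.put(thread, locks);
        }
        locks.add(lock);
    }

    void release(String thread, String lock) {
        Set<String> locks = held.get(thread);
        if (locks != null) {
            locks.remove(lock);
        }
    }

    /** Records an access; returns true if no single lock protected every access so far. */
    boolean access(String thread, String variable) {
        Set<String> locks = held.get(thread);
        if (locks == null) {
            locks = Collections.<String>emptySet();
        }
        Set<String> c = candidates.get(variable);
        if (c == null) {
            c = new HashSet<String>(locks);   // first access: candidates = locks held now
            candidates.put(variable, c);
        } else {
            c.retainAll(locks);               // later accesses: intersect with locks held
        }
        return c.isEmpty();
    }

    /** Two properly locked accesses followed by one unlocked access to the same variable. */
    static boolean[] demo() {
        LocksetSketch detector = new LocksetSketch();
        detector.acquire("T1", "L");
        boolean first = detector.access("T1", "shared");
        detector.release("T1", "L");
        detector.acquire("T2", "L");
        boolean second = detector.access("T2", "shared");
        detector.release("T2", "L");
        boolean third = detector.access("T2", "shared");   // lock L is no longer held
        return new boolean[] { first, second, third };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The detector flags the third access even though this particular schedule never interleaved the threads badly, which is what makes lockset checking useful alongside schedule simulation.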
Critical Points Example (diagram) Thread 1 and Thread 2 each have their own local variable; local variables don't need locking. All accesses to the shared variable are protected by the lock: each thread performs lock, access, unlock.
Fewer Schedules • Fewer critical points than thread switches • Reduces number of schedules • Example: Two threads, but no communication: N = 1 • Unit tests are small • Reduces number of schedules • Hopefully comprehensive simulation is tractable • If not, heuristics are still better than nothing
Limitations Improvements only check chosen schedule A different schedule may still fail Requires comprehensive testing to be meaningful May still miss uncaught exceptions Specify absolute parent thread group, not relative Cannot detect uncaught exceptions in a program’s uncaught exception handler (JLS limitation)
Extra: Limitations May still miss uncaught exceptions Specify absolute parent thread group, not relative (rare) Koders.com: 913 matches for ThreadGroup vs. 49,329 matches for Thread Cannot detect uncaught exceptions in a program’s uncaught exception handler (JLS limitation) Koders.com: 32 method definitions for uncaughtException method
Extra: DrJava Statistics
Year  Unit tests  Passed  Failed  Not run  Invariants  Met    Failed  % failed  KLOC  “event thread”
2004  736         610     36      90       5116        4161   965     18.83%    107   1
2006  881         881     0       0        34412       30616  3796    11.03%    129   99