Toward Dependable Software: Cyberinfrastructure Support for Controlled Experimentation with Testing and Analysis Techniques Gregg Rothermel and Matt Dwyer Computer Science and Engineering
Software Dependability Dependability: fitness for intended use, whether in avionics, spreadsheets, device drivers, or GUIs
Achieving Dependability A program (here, a small circular buffer with add() and take() operations) and its requirements (Requirement 1, Requirement 2, ...) are fed to software testing and program analysis techniques, which produce dependability information and supporting data
Software Testing The same program and a set of testing requirements are run through code coverage-based testing, producing a test results report; the buffer fragment from this figure is reconstructed below
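The add()/take() fragments on these slides elide most of the buffer. A minimal sketch of a full version follows, together with a small driver of the kind a coverage-based test would exercise; only add, take, buffer, head, tail, and size come from the slides, so the class name, the count field, the overflow/underflow checks, and the driver are assumptions.

import java.util.NoSuchElementException;

// Hypothetical completion of the circular-buffer fragment from the slides.
// Only add, take, buffer, head, tail, and size appear on the slides; the rest is assumed.
public class BoundedBuffer {
    private final Object[] buffer;
    private final int size;
    private int head = 0;   // next slot to write
    private int tail = -1;  // last slot read
    private int count = 0;  // number of stored elements

    public BoundedBuffer(int size) {
        this.size = size;
        this.buffer = new Object[size];
    }

    public void add(Object o) {
        if (count == size) {
            throw new IllegalStateException("buffer full");
        }
        buffer[head] = o;
        head = (head + 1) % size;
        count++;
    }

    public Object take() {
        if (count == 0) {
            throw new NoSuchElementException("buffer empty");
        }
        tail = (tail + 1) % size;
        count--;
        return buffer[tail];
    }

    // Tiny driver standing in for a coverage-based test: it exercises the
    // normal add/take path and the wrap-around of the head index.
    public static void main(String[] args) {
        BoundedBuffer b = new BoundedBuffer(2);
        b.add("x");
        b.add("y");
        check("x".equals(b.take()));
        b.add("z");               // head wraps around to slot 0
        check("y".equals(b.take()));
        check("z".equals(b.take()));
        System.out.println("all checks passed");
    }

    private static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("unexpected buffer behavior");
        }
    }
}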
Model Checking A model checker takes a finite-state model and a temporal logic formula, and reports either OK or an error trace
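The temporal-logic formula in the original figure did not survive extraction. Purely as an illustration, properties of the buffer example might be written in LTL as follows; the predicate names are assumptions, not part of the slides:

\Box\,(\mathit{added} \rightarrow \Diamond\,\mathit{taken})   % every added element is eventually taken
\Box\,\neg(\mathit{full} \land \mathit{adding})               % no element is added while the buffer is full

The model checker either verifies that such a formula holds on every execution of the finite-state model (OK) or returns an error trace demonstrating a violation.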
Essential (Empirical) Questions • Is a given technique more cost-effective than other techniques? • What factors influence the relative cost-effectiveness of techniques? • What models best capture the cost-benefit tradeoffs between techniques?
Experimentation In medicine, treatments T1 and T2 are each applied to subjects A, B, C, and D, and each application yields cure and side-effects information that can be compared across treatments
Experimentation with Testing and Analysis Techniques Analogously, techniques T1 and T2 are each applied to programs A, B, C, and D, and each application yields dependability and cost information that can be compared across techniques
State of the Art • In a survey of top journals/conferences: • 41% of papers included empirical studies - 17% included controlled experiments • 1% involved shared artifacts
State of the Art As empiricists in the testing & analysis sub-field, we have not evolved far.
Requirements for Empirical Research on Testing & Analysis Prepared experiment “subjects” Adequate computational resources Community engagement Knowledge of empirical study design
Experiment "Subjects" Experiments with testing and analysis require various "subjects", or artifacts Obtaining, preparing, organizing, and sharing artifacts is expensive and difficult Until recently, it has also been difficult to obtain support for these efforts
NSF Support NSF CRI program (Research Resources) $1.1 million, 4-year collaborative grant Investigators: University of Nebraska – Lincoln: Gregg Rothermel (PI), Sebastian Elbaum, Matt Dwyer Kansas State University: John Hatcliff (PI)
Project Goal Create and disseminate a repository of software-related artifacts sufficient to support rigorous controlled experimentation with testing and analysis techniques across the broad community of software engineering researchers and educators
Proposed Artifacts Code bases for C and Java applications Supporting elements: e.g., versions, faults, requirements, behavior specifications, models, test inputs Basic static and dynamic data: e.g., expected outputs, fault-exposure data, coverage patterns Mechanisms for accessing data: e.g., filters and access routines (a sketch follows below)
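As a concrete illustration of what such an access routine might look like, here is a small Java sketch, assuming a two-column test,coveredFile CSV layout for coverage data; the class name, file name, and format are hypothetical and not part of the proposed repository design.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical access routine: select the test cases of a subject program
// that cover a given source file, from a two-column "test,coveredFile" CSV.
public class CoverageFilter {

    public static List<String> testsCovering(Path coverageCsv, String sourceFile)
            throws IOException {
        return Files.readAllLines(coverageCsv).stream()
                .map(line -> line.split(","))
                .filter(cols -> cols.length == 2 && cols[1].trim().equals(sourceFile))
                .map(cols -> cols[0].trim())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // e.g., list the tests that exercise a particular source file
        for (String test : testsCovering(Path.of("coverage.csv"), "BoundedBuffer.java")) {
            System.out.println(test);
        }
    }
}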
Requirements for Empirical Research on Testing & Analysis Prepared experiment “subjects” Adequate computational resources Community engagement Knowledge of empirical study design
Obstacles to Efficient Execution of Empirical Studies Controlled experiments with testing & analysis techniques require significant computational resources Programs and test suites may be long-running, e.g., one study took 8 months to collect data Program analysis techniques may require large amounts of memory, e.g., studies run on 32-bit architectures regularly used 4 Gbytes Archiving analysis data may require significant storage, e.g., run-time trace data for a single study can easily use 100s of Gbytes
ARO and UNL Support to Overcome These Obstacles ARO DURIP program $400,000 equipment grant Investigators: University of Nebraska – Lincoln: Matt Dwyer (PI) Kansas State University: John Hatcliff (PI) UNL startup funding for Rothermel and Dwyer Dept. staff support for Research Computing
Hardware Infrastructure 2.1 Tbyte disk storage array 45-node dual-Opteron 248 cluster 32 nodes have 16 Gbytes of RAM 12 nodes have 4 Gbytes of RAM Running Linux and Sun Grid Engine Remote access to a 12-node quad-Opteron 246 cluster at Kansas State
Early Adopters Studies of the cost/effectiveness of extracting unit tests from system tests Studies of probe placement for run-time monitoring Studies of heuristic strategies for model checking concurrent programs Studies of heuristic search strategies for test suite generation Trace data collection to characterize server applications and predict "pre-crash" states Experiments on reducing the cost of inferring program specifications
Requirements for Empirical Research on Testing & Analysis Prepared experiment “subjects” Adequate computational resources Community engagement Knowledge of empirical study design
Supporting the Emerging Experimental Community A growing understanding that computing is an experimental discipline 45 experts support our efforts More education in experimental methods is needed Initial repository of subjects Distributed via a website Full-time staff will develop and maintain the site We will populate the repository with a wide range of sample subjects
Repository Lifespan We are planning for the repository to serve the community well beyond the grant duration Exploring ideas such as an open-source model for repository contributions and maintenance, building consensus in the community that subject development is valuable, and use of a conference/workshop model for review of contributions
Summary Testing and analysis are essential for software dependability, which is essential for everything else Progress in testing/analysis has been limited by a lack of empirical research Empirical software research is hard and requires subjects, computational resources, community consensus, and knowledge of empirical methods Our current efforts focus on supporting researchers with the first three of these
Looking Forward A significant remaining challenge is supporting scientists in the design of experiments Decision support for empirical studies requires development of a knowledge base and tools to query and synthesize that knowledge Scott Henninger (UNL) is leading work in this area Ultimately, our goal is to see research on software systems develop into a discipline whose experimental maturity matches its theoretical maturity