WERST – Infrastructure Group
Summary Notes, July 2004
http://www.sce.carleton.ca/squall/WERST2004/
Infrastructure Issues
• Subject programs
  • Representativeness
  • Access to programs
• Preparing subjects
  • Dealing with missing information
  • Confidentiality constraints
• Infrastructure construction
  • Cost versus validity
• Evaluating coverage criteria
  • Data collection strategies
  • Tools
  • Analysis techniques
• Sources for programs
  • Writing your own
  • From industry
  • Open source
  • From other researchers
• Reusing tools from other researchers
• Is infrastructure construction publishable?
  • General purpose
  • Where to publish
Sub-Topics
• Making experimental artifacts available
• Tools to support experimentation
• Tools to support experimental conduct
• People
• Experimental design descriptions

(Focused on controlled experiments, not case studies.)
(1) Goals of an Experimental Artifact Repository
• Support experimental replication
• Make experiments affordable
• Ensure that artifacts are representative
  • What is our population like?
• Mechanism: share and reuse artifacts
  • Increases comparability
  • Decreases the effort of experiments
Types of Experimental Software Artifacts
• Program source files
• Compiled and executable modules
• Formal specifications
• Requirements
• Design documents
• Test inputs and oracles
• Faulty versions
• Data from previous experiments with the artifacts
• Execution environment (makefiles, shell scripts, etc.)
• Metadata (size, application, …; sketched after this list)
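The metadata item is left open in the notes. As a purely illustrative sketch, here is one shape such a per-artifact record could take, written in Java; every field below is an assumption extrapolated from the artifact types listed above, not the schema of any repository mentioned in these notes.

```java
// Hypothetical per-artifact metadata record for a repository entry.
// All field names are assumptions drawn from the artifact types listed above.
public class ArtifactMetadata {
    String name;              // artifact identifier, e.g. a subject program's name
    String applicationDomain; // what kind of application the program is
    int sizeInLoc;            // size, in lines of code
    String language;          // implementation language of this variant
    String version;           // which release of the artifact this entry describes
    String variant;           // e.g. an alternative design or requirements revision
    String[] companionFiles;  // specs, designs, test suites, faulty versions, ...
}
```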
Experimental Software Artifacts: Two Crosscutting Issues
• Different versions of the artifacts
• Different variants
  • Slightly different versions of the requirements
  • Different designs
  • Different implementation languages
Repository Framework
• Quality of the contents
  • Completeness, correctness, …
  • Should the contents of the repository be verified to be of some level of quality?
• Where do the contents come from?
  • Various researchers
  • Repository "owners"
  • Created automatically
Two Models for Repositories
• Five-star hotel: requires a proactive repository owner
  • Quality of the artifacts has been verified by the owner
  • Collections of artifacts are "complete"
  • Experiments can be replicated
  • Artifact files should be stored with the repository
• YMCA / youth hostel: requires only a reactive administrator
  • Artifacts have not been verified
  • Collections may be incomplete, incorrect, …
  • Artifact files can be stored offsite
• The trade-offs: quality versus cost, and quality versus quantity
State of the Art: Current Distribution of Artifacts
• Personal direct contacts
  • Unreliable and slow
• NASA's SEL
  • Very difficult to find useful information
• NIST collection
  • No current support
• EXPSIR (Rothermel, Elbaum, Do, …)
  • A collection of artifacts (the Siemens programs) that is in active use
• SERR (Alexander, Bieman, France)
  • Under development
• SEEWeb (Offutt, Hayes)
  • Repository available, but not yet populated
Repository Future Needs
• Must have community buy-in
• Must be of satisfactory quality
• Must be widely available and accessible
• Must have extremely high usability
• Must be evolvable and extensible
• Must have available support (for both the repository and its artifacts)
• Must be well populated
• Must have some level of funding
(2) Types of Experimental Tools
• Analysis tools
  • Aristotle, CBAT, SOOT, SUIF, Eclipse
• Dynamic information collection
  • InsECT, Mothra, MuJava, Proteum, ATAC, Frankl-DF?, Pure Coverage, DIE, BCEL, JIAPI, DynaInst
• Test generation tools
  • Mothra/Godzilla, TSL tools
• Test drivers
  • JUnit (see the sketch after this list)
• Fault generators
  • Mutation
• Differencing tools
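JUnit is the one test driver the group named. As a concrete illustration, here is a minimal test driver in the JUnit 3 style current in 2004, pairing each test input with its oracle; the Triangle class and its classify method are hypothetical stand-ins for a subject program drawn from a repository.

```java
import junit.framework.TestCase;

// Minimal JUnit 3-style test driver: each method pairs a test input with its
// oracle. Triangle.classify() is a hypothetical subject-program method.
public class TriangleTest extends TestCase {
    public void testEquilateral() {
        // input: three equal side lengths; oracle: the expected classification
        assertEquals("equilateral", Triangle.classify(3, 3, 3));
    }

    public void testScalene() {
        assertEquals("scalene", Triangle.classify(3, 4, 5));
    }
}
```

Packaging inputs and oracles this way makes a repository's test suite directly executable, which lowers the cost of replication.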
Issues and Needs
• Support for the tools
• Usability of the tool and its API
• Bugs in the tools
• Unsupported language features (illustrated below)
  • the ? (conditional) operator
  • inheritance
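For concreteness, the two features called out above appear together in the small, purely hypothetical Java example below; an analysis or coverage tool must instrument the implicit branch of the conditional operator and resolve the dynamically dispatched call to handle such code correctly.

```java
// The two features named above: the ?: (conditional) operator and inheritance.
// All class and method names here are illustrative.
class Shape {
    int sides() { return 0; }
}

class Square extends Shape {          // inheritance: overrides sides()
    int sides() { return 4; }
}

class LabelDemo {
    static String label(Shape s) {
        // the conditional operator is an implicit branch that
        // instrumentation tools must recognize and cover
        return s.sides() == 4 ? "square" : "other";
    }

    public static void main(String[] args) {
        System.out.println(label(new Square())); // prints "square"
    }
}
```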
Tools Conclusions
• Current experimental tools are "all over the map"
• It can be okay to scope the problem to
  • Specific artifacts (in repositories)
  • Limited features (language subset, etc.)
• It can be a very good idea to pre-compute data from some tools (ASTs, CFGs, etc.); see the sketch after this list
• Standardization of analysis results?
• Repositories and tools need to be connected
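One way to act on the pre-computation point: build the AST or CFG once, store the result beside the artifact, and have later experiments load the cached copy instead of re-running the analysis. Below is a minimal sketch using plain Java serialization; the Cfg type and the buildCfg method are hypothetical placeholders for whatever analysis front end actually produces the graph.

```java
import java.io.*;

// Sketch: compute a CFG once and cache it next to the artifact, so that
// replications load the stored graph instead of re-running the analysis.
// Cfg and buildCfg() are hypothetical placeholders for a real front end.
public class CfgCache {
    static Cfg loadOrBuild(File source, File cache) throws Exception {
        if (cache.exists()) {
            try (ObjectInputStream in =
                     new ObjectInputStream(new FileInputStream(cache))) {
                return (Cfg) in.readObject();   // reuse the precomputed graph
            }
        }
        Cfg cfg = buildCfg(source);             // expensive analysis, done once
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(cache))) {
            out.writeObject(cfg);               // store alongside the artifact
        }
        return cfg;
    }

    // Placeholder: a real implementation would invoke an analysis tool here.
    static Cfg buildCfg(File source) {
        return new Cfg();
    }
}

// Placeholder graph type; a real CFG would hold nodes and edges.
class Cfg implements Serializable {}
```

Agreeing on the serialized form would also speak to the "standardization of analysis results" question above, since different groups' tools could then consume the same cached data.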
What Holds Back Infrastructure Distribution?
• Lack of credit and support for building infrastructure
  • Funding
  • Publishing tools is hard
  • Nobody has earned tenure or a PhD for building experimental tools
  • The value of experimental infrastructure is not recognized
  • Replicated studies are hard to publish
• Difficulty of creating sufficiently reusable tools
  • Documentation
  • Usability
  • Faults in the tools