
Assessing the Frequency of Empirical Evaluation in Software Modeling Research

Workshop on Experiences and Empirical Studies in Software Modelling (EESSMod), October 17, 2011. Jeffrey C. Carver, Eugene Syriani, and Jeff Gray (presenter), University of Alabama


Presentation Transcript


1. Assessing the Frequency of Empirical Evaluation in Software Modeling Research
Workshop on Experiences and Empirical Studies in Software Modelling (EESSMod), October 17, 2011
Jeffrey C. Carver, Eugene Syriani, and Jeff Gray (presenter)
University of Alabama, Department of Computer Science
{carver, esyriani, gray}@cs.ua.edu

2. Background
• Many creative modeling ideas
• Impression that the field has not followed the traditional Scientific Method
• Most new techniques are not (thoroughly) evaluated
• Investigate the prevalence of this phenomenon
  • Considered MODELS papers from 2006–2010
  • Also considered papers from an empirical conference (ESEM)

3. Background: Empirical Studies (“models” is used more generally on this slide)
• The understanding of a discipline evolves over time
  • We get more sophisticated in our methods
  • We are able to test and prove or disprove hypotheses
• The empirical paradigm has been used in many other fields, e.g., physics, medicine, manufacturing

4. Empirical Studies: Misconceptions
• Empirical studies are not “one-shot deals”; studies on live development projects are not the only ones that matter
• Software engineering is a laboratory science
  • Understanding our discipline involves observation, reflection, model building, and experimentation, followed by iteration
• Symbiotic nature of research and development
  • Research needs laboratories to observe and manipulate variables
  • Development needs to understand how to build systems better

5. Empirical Studies: Misconceptions
• Overall purpose
  • “We ran a study of technology X and now we know…”
    • Technology X doesn’t work (NO)
    • Technology X performed worse than technology Y in our environment (YES)
  • “Environment” includes people and their expertise, project goals, etc.
• Measuring performance implies we decided on some metric that we felt was an important indicator
• No solution is really expected to be better for all users under all conditions
[Diagram: the goal is not a yes/no certification of a technology, but to yield insights and answers, find the appropriate environment, and assist in evolution]

6. Empirical Studies: Outputs
• An empirical study can help provide information of interest to teams that might eventually adopt a technology:
  • Does it work better for certain types of people?
    • Novices: it’s a good solution for training
    • Experts: users need certain background knowledge…
  • Does it work better for certain types of systems?
    • Static/dynamic aspects, complexity
    • Familiar/unfamiliar domains
  • Does it work better in certain development environments?
    • Users [did/didn’t] have the right documentation, knowledge, amount of time, etc. to use it
(Shull, 2004)

7. Our Objective and Methodology
• Goal: determine how many recent modeling papers had some type of empirical evaluation of their claims
• Three-step methodology (a tallying sketch follows this slide):
  1. Develop an initial characterization scheme
  2. Identify candidate papers
  3. Review candidate papers and finalize the characterization
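To make step 3 concrete, here is a minimal sketch of how reviewed papers might be tallied by evaluation category. The paper records and category labels below are invented for illustration; they are not the authors' actual scheme or data.

```python
# Hypothetical sketch of the tallying step: each candidate paper has been
# reviewed and assigned one evaluation category from the (finalized) scheme.
from collections import Counter

# Illustrative records only -- invented, not the study's actual data.
papers = [
    {"title": "Paper A", "year": 2006, "category": "controlled experiment"},
    {"title": "Paper B", "year": 2008, "category": "formative case study"},
    {"title": "Paper C", "year": 2010, "category": "no evaluation"},
]

# Count how many reviewed papers fall into each category.
counts = Counter(p["category"] for p in papers)
total = len(papers)
for category, n in counts.most_common():
    print(f"{category}: {n} ({n / total:.0%} of {total} papers)")
```

In practice, each candidate paper would first be reviewed against the characterization scheme, and the tally above would then summarize how often each kind of evaluation appears across the corpus.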

8. Characterization Scheme
• Formative Case Studies: papers that gather information about the use of a technique in practice
[Table of the full characterization scheme not transcribed]

9. Results [chart]

10. Results – Summary from 2006–2010 [chart; headline figure: 17%]

11. Results – Trends [chart]

12. Results: Human-Based Controlled Experiments
• Total of 12 in 5 years! There should be more
• Observations
  • Generally, a low level of detail was reported
  • Most had fewer than 25 participants
    • 2 had over 50; 1 did not even report the number
  • Most participants were undergraduate students
• A general misunderstanding in many papers: equating “discussion” with “evaluation”

13. Results: Formative Case Studies
• Total of 10; we need to see more
• 4 did not involve humans
  • Analyzed existing source code to understand how various modeling tools would or would not work
• 6 involved humans
  • Surveys to understand how existing tools were not meeting developer needs
  • Generally, a study of output requirements for needed tools

14. Results: ESEM Focus
• The ESEM conference has three types of papers: Regular Papers, Short Papers, and Posters
• Across the same 5-year period, we found only 17 modeling papers
  • Of those 17, only 4 were Regular Papers (10 pages, IEEE or ACM format) out of 178 Regular candidates, about 2%
  • 10 were Short Papers (4 pages) out of a total of 118 Short Papers, about 8%
  • 3 were Poster summaries
• Even within the empirical area, modeling papers are not very well represented (typically just Short Papers)

15. Conclusions
• Summary:
  • The rigor of empirically validated research in software modeling is weak
  • A very large percentage of papers have no evaluation
  • We did not include technical reports or extended journal publications
  • We plan to repeat the analysis with SoSym
  • We would like to push the community to conduct more empirical evaluations
  • The paper has URLs pointing to the data from our observations
• Recommendations:
  • Team up with empirical researchers
  • Venues need to provide additional space for reporting empirical results (e.g., 2 extra pages for papers that have a clear evaluation)

  16. Questions or comments?
