Oops… tim@menzies.us, fayolapeters@gmail.com, andrian amarcus@wayne.edu. MSR’13. Inevitable, due to the complexity & novelty of our work. (But rarely reported, which is… suspicious.) What can we learn from those mistakes?
An MSR’13 paper: Cross-company learning • Can “us” learn from “them”? • Provided “us” selects the right data from “them” • Relevancy filtering: [Turhan09] (and others) • Selection guided by structure of “us” • If “us” is small and “them” is large: • Selection guided using kernel functions learned from “them” • Result #1: out-performed [Turhan09]. • Result #2: Result #1 was a coding error
Houston, we have a problem • Mar 15: paper accepted to MSR • “Better cross-company defect prediction” • Mar 29: camera-ready submitted • ?Apr 10: pre-prints go on-line • Apr 29: Hyeongmin Jeon, graduate student at Pusan Natl. Univ., • Emailed us: can’t reproduce result • May 4: Peters, checking code, found error • Manic week of experiments… • May 11: results definitely wrong • Emails to MSR organizers • Btw, < 3 weeks. Wow…
Coding error • Distance between test & training instances • Removed the classes • Ran a distance function • Re-inserted the classes • But… bad re-insert • Used the training class • Not the test class
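A minimal sketch of how this kind of re-insert bug can arise. This is not the authors' code: the row layout (class label in the last cell), the function names, and the toy data are all assumptions made for illustration.

```python
import math

def strip_class(row):
    """Split a row into (features, class label); assumes the label is the last cell."""
    return row[:-1], row[-1]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(test_row, train_rows):
    """Index of the training row closest to test_row, ignoring class labels."""
    test_x, _ = strip_class(test_row)
    dists = [euclidean(test_x, strip_class(r)[0]) for r in train_rows]
    return dists.index(min(dists))

def reinsert_buggy(test_row, train_rows):
    """Rebuild the test row, but (wrongly) with the nearest training row's label."""
    test_x, _ = strip_class(test_row)
    _, train_y = strip_class(train_rows[nearest(test_row, train_rows)])
    return test_x + [train_y]

def reinsert_fixed(test_row):
    """Rebuild the test row with its own label."""
    test_x, test_y = strip_class(test_row)
    return test_x + [test_y]

# Hypothetical toy data: two training rows and one test row.
train = [[1.0, 2.0, "defective"], [9.0, 9.0, "clean"]]
test = [1.1, 2.1, "clean"]

print(reinsert_buggy(test, train))  # label silently replaced by the training label
print(reinsert_fixed(test))         # label preserved
```

In the buggy variant the test row's label is taken from its nearest training neighbor, so any later evaluation scores the learner against labels it has effectively already seen.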
Pull the paper? • In the internet age, is that even possible? • X people now have local copies of that paper • Which Google might easily stumble across • Old pre-print, found May 15
Authors: report your mistakes, openly and honestly • We need to expect, and allow, papers with sections: “clarifications”, “errata”, “retractions” • E.g. Murphy-Hill, Parnin, Black. IEEE TSE, Jan 2012
Conference organizers: encourage research honesty • Need CFPs with text that encourages • Repeating, testing, and challenging old results
Researchers: share data, check each other’s conclusions • Reinhart & Rogoff [2010] • “countries with debt over 90% of GDP suffer notably lower economic growth.” • Thomas Herndon, 3rd-year Ph.D. student, U. Mass. • Unable to replicate with publicly available data • Asked Reinhart & Rogoff for their data • Got it (their spreadsheet) • Found errors in data on economic growth vs debt levels • A triumph for open science • Sadly, reported in the media as a grave mistake • E.g. http://goo.gl/HGugL • Immature view of the nature of science
Supervisors: encourage a culture of research honesty • What will you tell others about this paper? • A failure? Or a success of the open science method? • It’s up to you, but understand the implications • If we don’t let grad students report mistakes • Then they won’t • Students graduate, • Leave you, • The error emerges • And you are left with the problem
Specific lessons • Data mining experiments are complex software prototypes • Version control (of code and data) • Code inspections • Trap and log your random number seeds • Rewrite data rarely • Pull out the class, process, put it back? • Fuhgeddaboudit • Have data headers of different types • So (say) distance measures can skip over classes • The above error does not affect Peters & Menzies ICSE’12 and TSE’13
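Two of these lessons can be sketched in a few lines: typed headers that let a distance measure skip class columns without rewriting the data, and a random generator that logs its seed. The "!" header prefix, the function names, and the example columns are hypothetical conventions chosen for this sketch, not taken from the paper's code.

```python
import math
import random
import time

CLASS_MARK = "!"  # assumed convention: headers starting with "!" are class columns

def feature_indexes(headers):
    """Indexes of the columns a distance measure may use (non-class columns)."""
    return [i for i, h in enumerate(headers) if not h.startswith(CLASS_MARK)]

def distance(headers, row_a, row_b):
    """Euclidean distance that skips class columns instead of deleting them."""
    cols = feature_indexes(headers)
    return math.sqrt(sum((row_a[i] - row_b[i]) ** 2 for i in cols))

def seeded_rng(seed=None, log=print):
    """Create a random generator and log its seed so the run can be repeated."""
    if seed is None:
        seed = int(time.time() * 1000) % (2 ** 32)
    log(f"random seed = {seed}")
    return random.Random(seed)

# Hypothetical toy table: two numeric features and one class column.
headers = ["loc", "complexity", "!defective"]
a = [10.0, 2.0, True]
b = [13.0, 6.0, False]

print(distance(headers, a, b))  # 5.0; the class column is never read or moved
rng = seeded_rng(42)            # logs "random seed = 42"
```

Because rows are never split and rebuilt, there is no re-insert step to get wrong, and the logged seed makes it possible to rerun the exact experiment that produced a result.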
Open access science • Repeatable, improvable, • and sometimes even refutable • We should not celebrate the failed paper • But we should celebrate • The open science community that finds such errors • MSR, PROMISE, etc. • The grad students that struggle to reproduce results • Hyeongmin Jeon • The integrity of grad students whose first response on finding an error was to report it • Fayola Peters
Was this a “useful” mistake? • Is there insight within this mistake? • What does it mean if using more experience makes the defect predictor worse? • International Workshop on Transfer Learning in Software Engineering • Nov, ASE’13