
Oops….

Oops…. tim@menzies.us fayolapeters@gmail.com andrian amarcus@wayne.edu MSR’13. Inevitable, due to the complexity & novelty of our work. (But rarely reported, which is…. suspicious.) What can we learn from those mistakes?


Presentation Transcript


  1. Oops…. tim@menzies.us fayolapeters@gmail.com andrian amarcus@wayne.edu MSR’13

  2. Inevitable, due to the complexity & novelty of our work (But rarely reported, which is…. suspicious) What can we learn from those mistakes?

  3. An MSR’13 paper: Cross-company learning • Can “us” learn from “them”? • Provided “us” selects the right data from “them” • Relevancy filtering: [Turhan09] (and any others) • Selection guided by structure of “us” • If “we” is small and “them” is many: • Selection guided using kernel functions learned from “them” • Result #1: out-performed [Turhan09]. • Result #2: Result #1 was a coding error

  4. Houston, we have a problem • Mar 15: paper accepted to MSR • “Better cross-company defect prediction” • Mar 29: camera-ready submitted • ?Apr 10: pre-prints go on-line • Apr 29: Hyeongmin Jeon, graduate student at Pusan Natl. Univ., emailed us: can’t reproduce result • May 4: Peters, checking code, found error • Manic week of experiments …. • May 11: results definitely wrong • Emails to MSR organizers • Btw, < 3 weeks. Wow…

  5. Coding error • Distance between test & training instances • Removed the classes • Ran a distance function • Re-inserted the classes • But…. bad re-insert • Used the training class • Not the test class
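
The kind of mistake slide 5 describes can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: the function names, the row layout (numeric features with the class label in the last cell) and the Euclidean distance are all assumptions made for the example.

```python
# Hypothetical illustration of the slide-5 mistake; not the authors' code.
# Assumption: each row is a list of numeric features with the class label last.

def euclidean(a, b):
    """Distance over feature values only (class labels already removed)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_train(train, feats):
    """Return the training row whose features are closest to feats."""
    return min(train, key=lambda r: euclidean(r[:-1], feats))

def pair_with_nearest_buggy(train, test):
    """Strip classes, find the nearest training row, then re-insert a class."""
    out = []
    for row in test:
        feats = row[:-1]
        nearest = nearest_train(train, feats)
        # BUG: the test row is given the training row's class.
        out.append((feats + [nearest[-1]], nearest))
    return out

def pair_with_nearest_fixed(train, test):
    """Same steps, but the test row gets its own class back."""
    out = []
    for row in test:
        feats, label = row[:-1], row[-1]
        nearest = nearest_train(train, feats)
        out.append((feats + [label], nearest))
    return out
```

With a "clean" test row that sits next to a "buggy" training row, the first function silently relabels the test row as "buggy" and the second keeps it "clean". Nothing crashes in either case, which is why an error of this shape can survive until someone tries to reproduce the numbers.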

  6. Pull the paper? • In the internet age, is that even possible? • X people now have local copies of that paper • Which Google might easily stumble across • [Image: old pre-print, found May 15]

  7. Authors: report your mistakes, openly and honestly • We need to expect, allow, papers with sections: “clarifications”, “errata”, “retractions” • E.g. Murphy-Hill, Parnin, Black. IEEE TSE, Jan 2012

  8. Conference organizers: encourage research honesty • Need CFPs with text that encourages • Repeating and testing and challenging old results

  9. Researchers: Share data, check each other’s conclusions • Reinhart & Rogoff [2010] • “countries with debt over 90% of GDP suffer notably lower economic growth.” • Thomas Herndon, 3rd year Ph.D., U.Mass. • Unable to replicate with publicly available data • Asked Reinhart & Rogoff for their data • Got it (their spreadsheet) • Found errors in data on economic growth vs debt levels • A triumph for open science • Sadly, reported in media as grave mistake • E.g. http://goo.gl/HGugL • Immature view of the nature of science

  10. Supervisors: encourage a culture of research honesty • What will you tell others about this paper? • A failure? Or a success of the open science method? • It’s up to you, but understand the implications • If we don’t let grad students report mistakes • Then they won’t • Students graduate, • Leave you, • The error emerges • And you are left with the problem

  11. Specific lessons • Data mining experiments are complex software prototypes • Version control (of code and data) • Code inspections • Trap and log your random number seeds • Rewrite data rarely • Pull out the class, process, put it back? • Fuhgeddaboudit • Have data headers of different types • So (say) distance measures can skip over classes • The above error does not affect Peters & Menzies ICSE’12 and TSE’13
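
Two of the lessons on slide 11 (log your seeds; let typed headers tell the distance measure which columns to skip) might look like the following minimal sketch. Everything named here is an assumption for illustration: the "!" prefix that marks a class column, the logger name and the helper functions are invented, not taken from the authors' tools.

```python
import logging
import math
import random
import time

log = logging.getLogger("experiment")  # hypothetical logger name

def make_rng(seed=None):
    """Create a random generator and log its seed so the run can be repeated."""
    if seed is None:
        seed = int(time.time() * 1000) % (2 ** 32)
    log.info("random seed = %d", seed)
    return seed, random.Random(seed)

# Assumed header convention: a leading "!" marks a class column.
def feature_indexes(header):
    """Indexes of the columns a distance measure is allowed to read."""
    return [i for i, name in enumerate(header) if not name.startswith("!")]

def distance(header, a, b):
    """Euclidean distance that skips class columns; rows are never rewritten."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in feature_indexes(header)))
```

Because the class stays in the row and is skipped by index, there is no "pull out the class, process, put it back" step left to get wrong. Passing the logged seed back into `make_rng` reproduces the same random sequence for a later re-run.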

  12. Open access science • Repeatable, improvable, • and sometimes even refutable • We should not celebrate the failed paper • But we should celebrate • The open science community that finds such errors • MSR, PROMISE, etc. • The grad students that struggle to reproduce results • Hyeongmin Jeon • The integrity of grad students whose first response on finding an error was to report it • Fayola Peters

  13. Was this a “useful” mistake? • Is there insight within this mistake? • What does it mean if using more experience makes the defect predictor worse? • International Workshop on Transfer Learning in Software Engineering • Nov, ASE’13
