
ICSE 2013 Bug Prediction Session


Presentation Transcript


  1. ICSE 2013 Bug Prediction Session: "Does Bug Prediction Support Human Developers? Findings From a Google Case Study" and "Transfer Defect Learning"

  2. Does Bug Prediction Support Human Developers? Findings From a Google Case Study Chris Lewis, Zhongpeng Lin, Caitlin Sadowski, Xiaoyan Zhu, Rong Ou, E. James Whitehead Jr. University of California, Santa Cruz; Google Inc.; Xi’an Jiaotong University

  3. Motivations • Little empirical data validating that areas predicted to be bug-prone match the expectations of expert developers • Little data showing whether the information provided by bug prediction algorithms leads to modification of developer behavior

  4. Three Questions • Q1: According to expert opinion, given a collection of bug prediction algorithms, how many bug-prone files do they find and which algorithm is preferred? • Q2: What are the desirable characteristics a bug prediction algorithm should have? • Q3: Using the knowledge gained from the other two questions to design a suitable algorithm, do developers modify their behavior when presented with bug prediction results?

  5. Algorithm Choice • FixCache (see the sketch below): • If a file was recently changed, it is likely to contain faults • If a file contains a fault, it is likely to contain more faults • Files that change alongside faulty files are more likely to contain faults • Cache managed with LRU eviction • Problem: flags a flat 10% of files, with no severity ranking • Modifications: reduce the cache size to 20 and order the cache by duration (total commits spent in the cache) • Rahman (the other candidate algorithm)
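A minimal sketch of a FixCache-style predictor matching the bullets above (illustrative Python; the class name, cache policy details, and the duration bookkeeping are assumptions, not Google's implementation):

```python
from collections import OrderedDict

class FixCache:
    """FixCache-style sketch: keep a small LRU cache of files considered
    bug-prone; bug-fixing commits pull the fixed files (and the files
    changed alongside them) into the cache."""

    def __init__(self, cache_size=20):            # slide: cache reduced to 20 entries
        self.cache_size = cache_size
        self.cache = OrderedDict()                # file path -> commits spent in the cache

    def _fetch(self, path):
        if path in self.cache:
            self.cache.move_to_end(path)          # refresh LRU position
            return
        if len(self.cache) >= self.cache_size:
            self.cache.popitem(last=False)        # evict least recently used file
        self.cache[path] = 0

    def process_commit(self, changed_files, is_bug_fix):
        for path in self.cache:                   # track duration (total commits) per cached file
            self.cache[path] += 1
        if is_bug_fix:
            for path in changed_files:            # faulty files and their co-changed files
                self._fetch(path)

    def ranked_files(self):
        # order flagged files by how long they have stayed in the cache
        return sorted(self.cache, key=self.cache.get, reverse=True)
```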

  6. Project Choice

  7. User Studies • 19 interviewees (A: 9, B: 10) • 3 lists of files • Choices: • Bug-prone • Not bug-prone • No strong feelings either way • No experience with the file

  8. Results

  9. Results

  10. Results

  11. Results

  12. Q2: Desirable Characteristics • Actionable (developers can take clear steps that will result in the area no longer being flagged) • Obvious reasoning • Bias towards new files • Parallelizable • Effectiveness scaling

  13. Time-Weighted Risk Algorithm (TWR) • A modification of the Rahman algorithm (see the sketch below) • i: index of a bug-fixing commit • n: number of bug-fixing commits • t_i: normalized time of the i-th bug-fixing commit • w: how strong the decay should be
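A minimal sketch of the TWR score as a sum of sigmoid terms that decay for older bug-fixing commits; the constant 12 in the exponent and the default value of w are assumptions here, not figures confirmed by the slide:

```python
import math

def twr_score(bugfix_times, w=12.0):
    """Time-Weighted Risk sketch.

    bugfix_times -- normalized times t_i in [0, 1] of a file's
                    bug-fixing commits (1.0 = most recent commit)
    w            -- decay strength: larger w suppresses older commits more
    """
    return sum(1.0 / (1.0 + math.exp(-12.0 * t + w)) for t in bugfix_times)

# Example: a file with one very recent and one old bug-fixing commit
# print(twr_score([1.0, 0.1]))
```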

  14. Experiment • Mondrian (code review software) + lint • Duration: 3 months at Google Inc. • Metrics: • The average time a review containing a bug-prone file takes from submission to approval • The average number of comments on a review that contains a bug-prone file

  15. Results

  16. Conclusion • The deployment failed to change developer behavior • Attributed to TWR: it offers no actionable means of removing the bug-prone flag

  17. Transfer Defect Learning Jaechang Nam, Sinno Jialin Pan, Sunghun Kim Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, China; Institute for Infocomm Research, Singapore

  18. Motivations • Poor cross-project prediction performance • Same feature space, but different data distributions • On the basis of transfer learning, the authors propose transfer defect learning by modifying an existing method, TCA (Transfer Component Analysis) • TCA is sensitive to normalization

  19. TCA: TCA tries to learn a transformation that maps the original source- and target-domain data to a latent space where the difference between the domains is small and the data variance after transformation is large (see the objective below).
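For reference, the standard TCA optimization that this slide paraphrases (following Pan et al.'s formulation; K is the kernel matrix over the combined source and target data, L the MMD matrix encoding the domain difference, H the centering matrix, mu a trade-off parameter, and m the latent dimensionality; these symbols come from the original TCA paper, not this talk):

```latex
\min_{W}\; \operatorname{tr}\!\left(W^{\top} K L K W\right) + \mu\, \operatorname{tr}\!\left(W^{\top} W\right)
\quad \text{s.t.}\quad W^{\top} K H K W = I_m
```

Minimizing the first term shrinks the distance between the projected domains, while the constraint keeps the variance of the transformed data large.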

  20. TCA

  21. TCA+ • Chooses the normalization option automatically (see the sketch below)
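Because TCA is sensitive to normalization, TCA+ picks a normalization option based on how the source and target datasets compare. A minimal sketch of two typical options it can select between (min-max and z-score scaling; the option labels and the exact option set used in the paper are assumptions here):

```python
import numpy as np

def minmax_normalize(X):
    """Rescale every metric (column) to the [0, 1] range."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def zscore_normalize(X, mean=None, std=None):
    """Standardize each metric; passing another dataset's mean/std
    normalizes this dataset relative to that dataset."""
    mean = X.mean(axis=0) if mean is None else mean
    std = X.std(axis=0) if std is None else std
    return (X - mean) / np.where(std > 0, std, 1.0)

# Example: normalize target-project metrics using the source project's statistics
# src, tgt = np.random.rand(100, 20), np.random.rand(80, 20)
# tgt_n = zscore_normalize(tgt, mean=src.mean(axis=0), std=src.std(axis=0))
```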

  22. Rules • 5 rules for automatically selecting the normalization option

  23. Process

  24. Experiment Projects • ReLink: Apache HTTP Server, OpenIntents Safe, ZXing • AEEEM: Equinox, Eclipse JDT Core, Apache Lucene, Mylyn, Eclipse PDE UI

  25. TCA with different normalization options

  26. TCA+

  27. Contributions • First to observe improved prediction performance by applying TCA to cross-project defect prediction • Proposed TCA+
