Week 3 Presentation

This research paper explores the problem of handling a surge of bugs in large open source projects and proposes a data mining and machine learning approach to recommend bug assignments. The results show promising precision and recall rates, highlighting the potential for improving bug assignment processes in OSS projects.

garfieldm

Presentation Transcript


  1. Week 3 Presentation Istehad Chowdhury CISC 864 Mining Software Engineering Data

  2. Research Paper Who Should Fix This Bug? John Anvik, Lyndon Hiew and Gail C. Murphy Department of Computer Science University of British Columbia {janvik, lyndonh, murphy}@cs.ubc.ca

  3. Problem with Open Bug Repository • Overall: how to cope with the surge of bugs in large open source projects • “Everyday, almost 300 bugs appear that need triaging. This is far too much for only the Mozilla programmers to handle.” • Many bug reports are invalid or duplicates of other bug reports • Eclipse, 36% • Every bug report should be triaged • To check validity and duplication • To assign the bug to an appropriate developer

  4. Problem cont. • The triager may not be sure to whom the bug should be assigned • A lot of time is wasted in reassigning and regaining • 24% of reports in Eclipse are reassigned

  5. The research work • Goal: • suggest whom to assign this bug to • Technique: • Using data mining and machine learning • Result: • 60% precision and 10% recall

  6. Precision and Recall
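The two measures named on this slide can be illustrated with a minimal sketch. It assumes the usual definitions for a recommender: precision is the share of recommended developers who are appropriate, and recall is the share of appropriate developers who were recommended. The function name and the developer names are hypothetical and do not come from the slides.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one bug report.

    recommended: developers suggested by the recommender
    relevant:    developers who could appropriately fix the bug
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 3 suggestions, 2 of them appropriate,
# out of 10 developers who could have fixed the bug.
p, r = precision_recall(
    ["alice", "bob", "carol"],
    ["alice", "bob"] + [f"dev{i}" for i in range(8)],
)
print(f"precision={p:.2f} recall={r:.2f}")
```

A recommender that makes only a few suggestions per report can score well on precision while recall stays low, because many appropriate developers are never suggested.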

  7. Life Cycle of a Bug Report

  8. Roles • Reporter/Submitter • Resolver • Contributor • Triager • The roles overlap

  9. Approach to the problem • Semi-automated • Characterizing bug reports • Assigning a label to each report • Choosing reports to train the supervised machine learning algorithm • Applying the algorithm to create the classifier for recommending assignments
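The steps on this slide can be sketched as a small pipeline: characterize each report as a bag of words, pair it with a developer label, train a supervised learner, and use the resulting classifier to recommend an assignee. The slide does not name the learning algorithm, so the Naive Bayes classifier below is an assumption chosen for brevity, and the training reports and developer names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Characterize a report as a bag of lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_reports):
    """Build per-developer word counts from (text, developer) pairs."""
    word_counts = defaultdict(Counter)
    report_counts = Counter()
    vocabulary = set()
    for text, developer in labeled_reports:
        tokens = tokenize(text)
        word_counts[developer].update(tokens)
        report_counts[developer] += 1
        vocabulary.update(tokens)
    return word_counts, report_counts, vocabulary


def recommend(model, text, top_k=1):
    """Rank developers by Naive Bayes log-probability for a new report."""
    word_counts, report_counts, vocabulary = model
    total_reports = sum(report_counts.values())
    scores = {}
    for developer, n_reports in report_counts.items():
        counts = word_counts[developer]
        total_words = sum(counts.values())
        score = math.log(n_reports / total_reports)
        for token in tokenize(text):
            # Laplace smoothing so unseen words do not zero out a developer.
            score += math.log((counts[token] + 1) / (total_words + len(vocabulary)))
        scores[developer] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical training data: report text labeled with the developer who fixed it.
training = [
    ("crash when rendering css border", "alice"),
    ("css layout broken in table", "alice"),
    ("bookmark sync fails after login", "bob"),
    ("login dialog loses password", "bob"),
]
model = train(training)
print(recommend(model, "table border rendering crash"))
```

The approach stays semi-automated in this sketch: `recommend` only returns a ranked list, and a human triager would still make the final assignment.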

  10. Heuristics for labeling bug reports • FIXED (Firefox): whoever provided the last approved patch • FIXED (Eclipse): whoever marked the report as resolved • DUPLICATE (Eclipse and Firefox): whoever resolved the report of which this one is a duplicate • WORKSFORME (Firefox): unclassifiable
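The heuristics on this slide amount to a lookup from a report's resolution and project to the developer used as its training label. The sketch below is illustrative only: the dictionary keys are hypothetical field names, not the schema of any real bug tracker, and the DUPLICATE rule is read here as crediting whoever resolved the report that this one duplicates.

```python
def label_report(report, project):
    """Return the developer label for a resolved report, or None if unclassifiable.

    `report` is a hypothetical dict with the keys:
    resolution, resolved_by, last_approved_patch_by, duplicate_of_resolved_by
    """
    resolution = report.get("resolution")
    if resolution == "FIXED":
        if project == "firefox":
            # Firefox: whoever provided the last approved patch.
            return report.get("last_approved_patch_by")
        if project == "eclipse":
            # Eclipse: whoever marked the report as resolved.
            return report.get("resolved_by")
    if resolution == "DUPLICATE":
        # Assumed reading: credit whoever resolved the duplicated report.
        return report.get("duplicate_of_resolved_by")
    # WORKSFORME in Firefox, and any case the slide does not cover, stays unlabeled.
    return None


# Hypothetical report resolved as FIXED.
fixed = {
    "resolution": "FIXED",
    "resolved_by": "bob",
    "last_approved_patch_by": "alice",
}
print(label_report(fixed, "firefox"), label_report(fixed, "eclipse"))
```

Reports that come back as `None` would be left out of the training set, which is why the share of unclassifiable reports matters for each project.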

  11. Experimental Results Fig. Recommender accuracy and recall

  12. Validating Results with GCC • Why are the results so poor? • Why is recall low in all cases, especially for GCC? • Shows the need for similarity in the nature of the projects

  13. Trying Alternatives

  14. Trying Alternatives cont. • Unsupervised machine learning • Incremental machine learning • Incorporating additional sources of data • Component-based classifier

  15. Component-based classifier

  16. Points to Ponder

  17. Points to Ponder cont. • Are new developers assigned any bugs? • “Needs further study to context of which it can be applied” (empirical research)

  18. Points to Ponder cont. • Were there enough instances to evaluate using cross-validation? • For Firefox 75% and for GCC 86% of developers have fewer than 100 reports • Why was the labeling mechanism more successful for GCC and Eclipse than for Firefox? • 1% for Eclipse, 47% for Firefox

  19. Points in Favor • The research work was very intense • Thoroughly studied • Honest in identifying the limitations and smart in pointing out future work • It opens interesting doors for future research

  20. Points Against • The study may not be suitable for an environment where the active set of developers changes frequently • The findings are too project-specific and work well on “actual bugs” reports

  21. Points Against cont. • Any naivety in the heuristics also propagates to the heuristic-based filtering of the reports used to train the classifier • I liked the way the authors included the lessons-learned section. However, they should have explained in more detail how the mappings were done.

  22. Concluding Remarks • It shows promise for improving bug assignment in OSS projects • “Coordination bug reports and CVS is challenging” • The effort is worth praising • Identifies the need for further research

  23. Questions and Comments?
