/* iComment : Bugs or Bad Comments? */

/* iComment: Bugs or Bad Comments? */ Lin Tan, Ding Yuan, Gopal Krishna, Yuanyuan Zhou • Published in SOSP 2007 • Presented by Kevin Boos

In a Nutshell • iComment: static analysis + NLP • Detects code-comment mismatches • Uses both source code and comments

Roadmap • iComment Paper • Motivation • Challenges • Contributions • Approach & Methodology • Results • Related Work • Complexity • Authors’ other works

Motivation • Software bugs affect reliability. • Mismatches between code and developer assumptions
    // Caller must acquire lock.
    static int reset_hardware(...) {
        // access shared data.
    }
    static int in2000_bus_reset(...) {
        reset_hardware(...);
    }

Prevalence of Comments • Comments = developer assumptions • Must hold locks, interrupts must be disabled, etc. • Other tools do not utilize comments! • Ignore valuable information (dev. intentions)

Code vs. Comments • Developer assumptions can’t always be inferred from source code • Comments and code are redundant • or should be…

Inconsistencies • What’s wrong: comments or code? • Developer mistake • Out of date • Copy and paste error (clone detection) • Bad code might be bugs • Bad comments cause future bugs

Challenges • Parsing and understanding comments • Natural language is ambiguous and varying
    /* We need to acquire the IRQ lock before calling … */
    /* Lock must be acquired on entry to this function. */
    /* Caller must hold instance lock! */
• NLP only captures sentence structure • No concept of understanding • Decent accuracy • Comments may be grammar disasters…

Contributions • First step towards automatically analyzing comments • Combines NLP, machine learning, static analysis • Identifies inconsistent code & comments • Real-world applicability • Discovered 60 new bugs or bad comments • Only two topics: locks & calls

Approach • Two types of comments • Explanatory: /* set the access flags */ • Assumptions/Rules: /* don’t call with lock held */ • Check comment rules topic-by-topic • General framework • Users choose the hot topics

Rule Templates • <Lock L> must be held before entering <Function F>. • <Lock L> must NOT be held before entering <Function F>. • <Lock L> must be held in <Function F>. • <Lock L> must NOT be held in <Function F>. • <Function A> must be called from <Function B> • <Function A> must NOT be called from <Function B> • Other templates exist (see paper) • User can add more templates
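A minimal sketch of how the templates above could be held as data, assuming a Python representation. The `Rule` class, the `Relation` enum, and the lock name `instance_lock` are hypothetical names chosen for illustration; the paper's own data structures are not shown on the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The three template shapes listed on the slide (hypothetical encoding)."""
    HELD_BEFORE_ENTERING = "must be held before entering"
    HELD_IN = "must be held in"
    CALLED_FROM = "must be called from"


@dataclass(frozen=True)
class Rule:
    subject: str            # Lock L for lock templates, Function A for call templates
    relation: Relation
    target: str             # Function F or Function B
    negated: bool = False   # True selects the "must NOT" variant

    def render(self) -> str:
        """Fill the template text with the concrete names."""
        kind = "Function" if self.relation is Relation.CALLED_FROM else "Lock"
        verb = self.relation.value
        if self.negated:
            verb = verb.replace("must be", "must NOT be", 1)
        return f"<{kind} {self.subject}> {verb} <Function {self.target}>"


# "instance_lock" is a made-up lock name; reset_hardware comes from the Motivation slide.
lock_rule = Rule("instance_lock", Relation.HELD_BEFORE_ENTERING, "reset_hardware")
call_rule = Rule("A", Relation.CALLED_FROM, "B", negated=True)
print(lock_rule.render())
print(call_rule.render())
```

Keeping the negation as a flag rather than as separate templates means a user-added template only needs one new `Relation` member to get both the positive and the "must NOT" form.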

Handling Comments • Extract comments • NLP, keyword filters, correlated word filters • Classify comments (rule generation) • Manually label small subset • Create decision tree with machine learning • Decision tree matches comments to templates • Fill template parameters with actual variables • Training is optional for users
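The pipeline above can be sketched roughly as follows. This is not iComment's implementation: the regular expressions are a crude substitute for real comment extraction, and `classify` is a hand-written stand-in for the trained decision tree. All function names, the keyword lists, and the "bind the rule to the next function definition" heuristic are assumptions made for illustration.

```python
import re

# Matches /* ... */ block comments and // line comments (string literals are ignored).
COMMENT_RE = re.compile(r"/\*(.*?)\*/|//([^\n]*)", re.DOTALL)
# Rough match for the start of a C function definition: name(args) {
FUNC_DEF_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\([^;{}()]*\)\s*\{")

TOPIC_KEYWORDS = ("lock",)                        # the user-chosen topic
RULE_WORDS = ("must", "need", "should", "don't")  # words that hint at a rule


def extract_comments(source):
    """Yield (end_offset, text) for every comment in the C source string."""
    for match in COMMENT_RE.finditer(source):
        body = match.group(1) if match.group(1) is not None else match.group(2)
        yield match.end(), " ".join(body.split())


def is_rule_candidate(text):
    """Keyword filter: keep comments that mention the topic and sound like a rule."""
    lowered = text.lower()
    return (any(k in lowered for k in TOPIC_KEYWORDS)
            and any(w in lowered for w in RULE_WORDS))


def classify(text):
    """Hand-written stand-in for the trained decision tree: pick a template."""
    lowered = text.lower()
    negated = re.search(r"\b(not|don't|never)\b", lowered) is not None
    on_entry = any(cue in lowered for cue in ("call", "before", "entry"))
    verb = "must NOT be held" if negated else "must be held"
    return f"{verb} before entering" if on_entry else f"{verb} in"


def generate_rules(source):
    """Return (template, function) pairs, binding each rule to the next function."""
    rules = []
    for end, text in extract_comments(source):
        if not is_rule_candidate(text):
            continue
        following = FUNC_DEF_RE.search(source, end)
        if following:
            rules.append((classify(text), following.group(1)))
    return rules


# The snippet from the Motivation slide, elided arguments kept as "...".
SAMPLE = """
// Caller must acquire lock.
static int reset_hardware(...) {
    // access shared data.
}
"""

print(generate_rules(SAMPLE))
```

The explanatory comment inside the function body is dropped by the keyword filter, while the three differently worded lock comments from the Challenges slide all land on the same template under this stand-in classifier.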

Rule Checker • Static analysis • Flow sensitive and context sensitive • Scope of comments • Display the inconsistencies • Sorted by ranking (support probability)
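A toy version of the checking step, under strong assumptions: each function body is a straight-line list of events, so the walk is flow sensitive only in the trivial sense and does not follow calls into callees. Taking "support" to be the fraction of call sites that obey the rule is also an assumption about what the slide's ranking means. The callers `caller_a`, `caller_b`, `caller_c` and the function `flush_queue` are invented for the example.

```python
def check_held_on_entry(program, lock, target):
    """Check '<lock> must be held before entering <target>' at every call site.

    program maps a function name to a list of (op, arg) events, where op is
    "acquire", "release", or "call". Returns (violating callers, support).
    """
    violations = []
    call_sites = 0
    for caller, body in program.items():
        held = set()                      # locks held at the current point
        for op, arg in body:
            if op == "acquire":
                held.add(arg)
            elif op == "release":
                held.discard(arg)
            elif op == "call" and arg == target:
                call_sites += 1
                if lock not in held:
                    violations.append(caller)
    support = (call_sites - len(violations)) / call_sites if call_sites else 0.0
    return violations, support


def rank_reports(program, rules):
    """Return (support, caller, message) reports, highest support first."""
    reports = []
    for lock, target in rules:
        violations, support = check_held_on_entry(program, lock, target)
        for caller in violations:
            reports.append((support, caller, f"{lock} not held before calling {target}"))
    return sorted(reports, key=lambda report: report[0], reverse=True)


PROGRAM = {
    "in2000_bus_reset": [("call", "reset_hardware")],
    "caller_a": [("acquire", "lock"), ("call", "reset_hardware"), ("release", "lock")],
    "caller_b": [("acquire", "lock"), ("call", "reset_hardware"), ("release", "lock")],
    # The lock is released before the call, so event order matters here.
    "caller_c": [("acquire", "lock"), ("release", "lock"), ("call", "flush_queue")],
}

for support, caller, message in rank_reports(
        PROGRAM, [("lock", "reset_hardware"), ("lock", "flush_queue")]):
    print(f"{support:.2f}  {caller}: {message}")
```

In this sketch a mismatch ranks higher when most other call sites follow the rule, on the reasoning that the rule is then more likely correct and the odd caller more likely a bug.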

Evaluation • Four large software projects • Two topics: locks and function calls • Average training data: 18%

Results • Automatically detected 60 new bugs and bad comments • 19 new bugs and bad comments already confirmed by developers • False positives exist (38%) • Incorrectly generated rules • Inaccurate rule checking

Training Accuracy • Accuracy: % of correct mismatches • [Results shown for two settings: software-specific training and cross-software training]

Related Work • Extracting rules from source code • iComment employs static analysis but not dynamic traces • Annotations • Poor adoption rates • Requires manual effort per comment • Documentation generation • No usage of NLP • iComment also analyzes unstructured comments

Complexity • Detecting inconsistencies • NLP • Abstracted away by tools • Machine learning • Simple manual training rules • Code maintenance • Developers may forget to be thorough • Automatic bug detection • Locking errors are extremely complex

Author Bio • Primary author: Lin Tan • Improving software reliability • Comments • Source code • Execution traces • Manual input • HotComments – prior ideas paper

Author Bio • Secondary author: Ding Yuan • Reliability of large software systems • Better logging • Enhanced output

Author Bio • Professor: Yuanyuan Zhou • Better debuggers, software reliability • Founded PatternInsight

PatternInsight Startup • http://patterninsight.com/

Conclusion • Comment-code inconsistencies are bad • Poorer software quality and reliability • First work to automatically analyze comments • Uses NLP and static code analysis • Detected real bugs in Linux/Mozilla • Manages complexity of code consistency and maintenance
