This study explores which questions programmers ask during code change tasks, how those questions can be categorized, and how well existing tools support answering them so that changes are made correctly. The presentation reviews related work on program comprehension models and on tool usage, then describes the research approach, case study design, and analysis methods.
Asking and Answering Questions during a Programming Change Task, By Jonathan Sillito, Member, IEEE, Gail C. Murphy, Member, IEEE, and Kris De Volder. Presented by Bob Mazzi 10/30/08
Introduction • What is the best approach to code changes? • Not taught in school • No industry standards • No clear starting point
Introduction • Approach to this study • What questions do programmers ask? • Can they be grouped into general categories? • How are questions answered? • Do existing tools support answering questions to correctly make changes?
Related Work • Program comprehension • Work involving the analysis of programmers’ questions • Studies of how programmers use tools and how they carry out change tasks and other programming activities
Program Comprehension • Much work focuses on how programmers form a mental model of the code to be worked on • Top Down Model • Brooks, Koenemann and Robertson, Soloway and Ehrlich • Bottom Up Model • Pennington’s model • Shneiderman and Mayer’s cognitive framework • Combined Models • Littman et al. noted that programmers use either a “systematic” strategy or an “as-needed” strategy • Letovsky’s knowledge-based, opportunistic model • Von Mayrhauser and Vans’s “integrated metamodel” • Tools to help programmer comprehension • Von Mayrhauser and Vans - list of information needs and tool capabilities • Walenstein’s theory of cognitive support – tools to support programmers
Work involving the analysis of programmers’ questions • Johnson and Erdem & Herbsleb and Kuwana • Studies of existing questions from Usenet and actual design meetings • Letovsky • Kinds of questions - why, how, what, whether, and discrepancy • Erdos and Sneed • Types of questions • Ko et al. - Categorizing questions • 1. writing code • 2. submitting a change • 3. triaging bugs • 4. reproducing a failure • 5. understanding execution behavior • 6. reasoning about design • 7. maintaining awareness
Studies of how programmers use tools and how they carry out change tasks and other programming activities • Storey et al. • User study focused on how program understanding tools enhance or change the way that programmers understand programs • Flor and Hutchins • Study on performing a change task • Four recent studies on the use of current development environments • Robillard et al. studied how programmers navigate a code base • DeLine et al. report observing how programmers navigate • Ko et al. researched IDE design elements for code maintenance • De Alwis and Murphy report on a field study about how programmers experience disorientation
Case Study Design • First study designed to use teams • Promotes conversation • Allows easy observation to develop list of questions • Second study used individuals (and 1 team) • More common in industry • Study participants • First study used students (grad level) • Second study used experienced programmers • Code to be worked on • First study used an unfamiliar Java application • Second study used code familiar to programmers in a mix of languages
RESEARCH APPROACH • Study one • Laboratory setting • 9 participants • Code that was new to programmers • State-of-the-practice development tools • Time limited by design so that tasks would not be completed • Required changes span multiple files • Goals • Easy to observe in order to gather questions
RESEARCH APPROACH • Study two • Industrial setting • 16 participants (14 individuals, 1 team) • Familiar code • Familiar tools and environment • Most participants had 2 or more years of experience • Goals • Is the process different with familiar code? • Are different questions asked? • Focus on identifying problem and using tools
Analyzing basic study responses • Identification of questions • Simplification / reduction of near duplicates • Categorization of questions • Finding Focus Points • Questions about a given entity and other entities directly related to it • Understanding a number of entities and relationships together • How groups relate to each other or to the rest of the system • Identification of tool use • [Figure: An example of one programmer’s combination of tools from the second study]
Samples of categorizations and questions • Finding Focus Points • 5 kinds of questions • Which type represents this domain concept or this UI element or action? • Where in the code is the text in this error message or UI element? • Expanding Focus Points • 15 kinds of questions • Where is this method called or type referenced? • When during the execution is this method called? • Understanding a Subgraph • 13 kinds of questions • How are instances of these types created and assembled? • How are these types or objects related? • Questions over Groups of Subgraphs • 11 kinds of questions • How does the system behavior vary over these types or cases? • What are the differences between these files or types?
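The four categories and the number of question kinds in each can be kept in a small lookup table. The Python sketch below is illustrative only: the names `QUESTION_KINDS`, `total_kinds`, and `share` are assumptions introduced here, and the counts are copied from the slide above.

```python
# Number of question kinds per category, copied from the slide above.
QUESTION_KINDS = {
    "Finding Focus Points": 5,
    "Expanding Focus Points": 15,
    "Understanding a Subgraph": 13,
    "Questions over Groups of Subgraphs": 11,
}

# Total number of question kinds across all four categories.
total_kinds = sum(QUESTION_KINDS.values())


def share(category: str) -> float:
    """Fraction of all question kinds that fall in the given category."""
    return QUESTION_KINDS[category] / total_kinds


if __name__ == "__main__":
    for name, count in QUESTION_KINDS.items():
        print(f"{name}: {count} kinds ({share(name):.0%})")
    print(f"Total: {total_kinds} kinds")
```

Summing the per-category counts gives 44 kinds of questions in total.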
Question Frequencies • Some specific questions were more common in one study than the other • Some questions were more likely to be asked by novices than experienced coders • Some questions would have been prior knowledge for the experienced coders • Some types of questions were more common in one study than the other • Advanced questions were more likely to be asked by experienced coders • Basic questions were more likely from coders unfamiliar with the code being worked on • Study parameter differences • Because study one was designed as a team effort, some questions may have been stated aloud mainly to confirm that the partner agreed
Analysis of Tool Use for Answering Study Questions • Analyzing commercial and available research tools • Comparing identified questions with available tools • Categorizing tools by how they were able to support answering specific questions • No attempt was made to identify all possible tools, only to determine whether a tool was available that helped answer a specific question
Comparing existing tools to identified needs • Existing tools did well on basic questions • Existing tools are not well suited to answering some more complicated questions that were identified during this study • Tools do better at specific search type queries than less specific understanding type queries • [Table 9: The Number of Questions with Full or Partial Support by Category]
The gap between existing tools and identified needs • Support for More Refined or Precise Questions • Scope of a search or query • Support for Maintaining Context • Comparisons • Support for Piecing Information Together • Information gathered from different sources is not easily combined • Multiple windows of related information add confusion
Summary • These case studies have identified a list of common questions that are likely to be similar in other coding tasks • The comparison of tool capabilities to common questions has identified potential areas of improvement in tools • This paper has identified a need for tools that allow different questions and answers to be presented in a unified format
Limitations • The use of pairs of programmers vs. individuals may have introduced errors • The comparison of graduate students vs. experienced programmers may have introduced errors • The use of familiar and unfamiliar code in the two studies may have introduced errors • Because of the designed time limit, questions that would be asked later in the process may never have appeared • These studies represent a small sample size, as each study was a combination of several smaller tasks. Each task was therefore either not repeated at all or repeated only a few times, and the repeats were all among the novices • These studies used a limited set of tools. A larger or different set of tools may produce different results
Comments – Future Work • Each limitation listed may also be regarded as adding breadth to the studies, as each introduces additional questions and potential combinations • The researchers recorded gender for each study participant yet did not explore potential differences in the questions asked, tool use, or logical approach by gender
Finding Focus Points
1. Which type represents this domain concept or this UI element or action?
2. Where in the code is the text in this error message or UI element?
3. Where is there any code involved in the implementation of this behavior?
4. Is there a precedent or exemplar for this?
5. Is there an entity named something like this in that unit (project, package, or class, say)?
Expanding Focus Points
6. What are the parts of this type?
7. Which types is this type a part of?
8. Where does this type fit in the type hierarchy?
9. Does this type have any siblings in the type hierarchy?
10. Where is this field declared in the type hierarchy?
11. Who implements this interface?
12. Where is this method called or type referenced?
13. When during the execution is this method called?
14. Where are instances of this class created?
15. Where is this variable or data structure being accessed?
16. What data can we access from this object?
17. What does the declaration or definition of this look like?
18. What are the arguments to this function?
19. What are the values of these arguments at runtime?
20. What data is being modified in this code?
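As an illustration of how a tool might answer question 12 ("Where is this method called or type referenced?"), the following Python sketch walks a syntax tree and collects call sites by name. It is a minimal sketch, not a tool from the studies: the function `find_call_sites` and the `SAMPLE` source are hypothetical, the first study's code base was Java rather than Python, and the match is purely syntactic, so same-named methods on unrelated types are not distinguished.

```python
import ast


def find_call_sites(source: str, method_name: str) -> list[int]:
    """Return sorted line numbers where something named method_name is called."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain call, e.g. save(...)
            if isinstance(func, ast.Name) and func.id == method_name:
                lines.append(node.lineno)
            # Attribute call, e.g. doc.save(...)
            elif isinstance(func, ast.Attribute) and func.attr == method_name:
                lines.append(node.lineno)
    return sorted(lines)


# Hypothetical source text used only to exercise the sketch.
SAMPLE = """\
class Document:
    def save(self):
        pass

def autosave(doc):
    doc.save()

def export(doc):
    doc.save()
    print("exported")
"""

if __name__ == "__main__":
    print(find_call_sites(SAMPLE, "save"))
```

A name-only match like this corresponds to the "specific search" style of query; it does not address the understanding-oriented questions such as when during execution the method is called.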
Understanding a Subgraph
21. How are instances of these types created and assembled?
22. How are these types or objects related? (whole-part)
23. How is this feature or concern (object ownership, UI control, etc.) implemented?
24. What in this structure distinguishes these cases?
25. What is the behavior that these types provide together and how is it distributed over the types?
26. What is the “correct” way to use or access this data structure?
27. How does this data structure look at runtime?
28. How can data be passed to (or accessed at) this point in the code?
29. How is control getting (from here to) here?
30. Why isn’t control reaching this point in the code?
31. Which execution path is being taken in this case?
32. Under what circumstances is this method called or exception thrown?
33. What parts of this data structure are accessed in this code?
Questions over Groups of Subgraphs
34. How does the system behavior vary over these types or cases?
35. What are the differences between these files or types?
36. What is the difference between these similar parts of the code (e.g., between sets of methods)?
37. What is the mapping between these UI types and these model types?
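Question 35 ("What are the differences between these files or types?") is one that a plain textual comparison can partly answer. The sketch below uses Python's standard `difflib.ndiff` to list only the lines that differ between two versions of a text. The helper `changed_lines` and the two sample snippets are assumptions made for illustration; a line-based diff shows textual differences only and says nothing about behavioral ones.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return the lines removed ('- ') or added ('+ ') between two texts."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical versions of the same function, used only as sample input.
OLD_VERSION = "def area(w, h):\n    return w * h\n"
NEW_VERSION = "def area(w, h):\n    return w * h / 2\n"

if __name__ == "__main__":
    for line in changed_lines(OLD_VERSION, NEW_VERSION):
        print(line)
```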