Automatically detecting and describing high level actions within methods Presented by: Gayani Samaraweera
The problem • Given the signature and body of a method M, automatically discover each code fragment that implements a high level action forming part of the overall algorithm of M, and accurately express each high level action as a succinct natural language description
Outline • The problem • Outline • High level actions • Method • Evaluation • Concerns • Other uses • Conclusion
High level actions • Sequence fragment • A sequence of statements that when taken together represents a single high level action • Conditional fragment • A conditional block that performs an action with subtle variations based on the condition • Loop fragment • Code patterns that are commonly implemented using loop constructs that constitute a high level action
Detecting high level actions • Uses • AST (Abstract Syntax Tree) • CFG (Control Flow Graph) • Information from naming conventions and linguistic knowledge gained from observations of Java programs • Textual clues from SWUM (Software Word Usage Model)
cont.. • Word usage information from identifiers • Identifier splitting • Camel case splitting: on capital letters, underscores, numbers • Eg: childXMLElement → child XML Element • Expand identifier abbreviations • Eg: Button butSelectAll, MouseEvent evt • SWUM → action, theme, optional secondary arguments of a statement grouping • Eg: list.add(Item i); → “add item to list” (action: add, theme: item, secondary argument: list)
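The splitting rule on this slide (capital letters, underscores, numbers) can be sketched in a few lines of Java. This is only an illustrative sketch: the class name `IdentifierSplitter` and the exact regular expression are assumptions, not the tooling used by the authors, and abbreviation expansion is not covered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SWUM or the presented tool.
class IdentifierSplitter {

    // Boundaries: underscores, lower-to-upper transitions, the end of an
    // acronym (an upper-case letter followed by upper + lower), and
    // letter/digit transitions.
    private static final String BOUNDARY =
            "_+"
            + "|(?<=[a-z])(?=[A-Z])"
            + "|(?<=[A-Z])(?=[A-Z][a-z])"
            + "|(?<=[A-Za-z])(?=[0-9])"
            + "|(?<=[0-9])(?=[A-Za-z])";

    static List<String> split(String identifier) {
        List<String> words = new ArrayList<>();
        for (String part : identifier.split(BOUNDARY)) {
            if (!part.isEmpty()) { // a leading underscore yields an empty part
                words.add(part);
            }
        }
        return words;
    }
}
```

With this sketch, `IdentifierSplitter.split("childXMLElement")` yields the three words child, XML and Element, matching the example on the slide.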
Sequence as single action • Sequence fragments • Identifying sequences of statements with similar actions • Indicated by similar method calls
cont.. • Challenges • Integrate to successor statement based on similarity • Different method names • Same method name different parameter types
cont.. • Identifying fragments • Statements with one or more method calls • Eg: “Add ended panel to content panel” and “Add bid panel to content panel” • Verb → add → equal • Head word of NP → panel → equal • Preposition → to content panel → equal • All equal → integratable
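The comparison on this slide can be sketched as a simple equality check over three parts of each phrase. The `Phrase` class and its field names are hypothetical, and the sketch assumes the verb, the head word of the noun phrase and the prepositional phrase have already been extracted for each statement.

```java
// Hypothetical sketch of the "integratable" test between two adjacent statements.
class PhraseIntegration {

    // Assumed container for the phrase generated for one statement.
    static final class Phrase {
        final String verb;                // e.g. "add"
        final String theme;               // e.g. "ended panel"
        final String themeHead;           // head word of the theme NP, e.g. "panel"
        final String prepositionalPhrase; // e.g. "to content panel"

        Phrase(String verb, String theme, String themeHead, String prepositionalPhrase) {
            this.verb = verb;
            this.theme = theme;
            this.themeHead = themeHead;
            this.prepositionalPhrase = prepositionalPhrase;
        }
    }

    // Two phrases are treated as integratable when verb, head word and
    // prepositional phrase all match.
    static boolean integratable(Phrase a, Phrase b) {
        return a.verb.equalsIgnoreCase(b.verb)
                && a.themeHead.equalsIgnoreCase(b.themeHead)
                && a.prepositionalPhrase.equalsIgnoreCase(b.prepositionalPhrase);
    }
}
```

For the two phrases on the slide ("add ended panel to content panel" and "add bid panel to content panel") the check returns true, because only the modifier of the head word differs.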
cont.. • Synthesizing descriptions • If equal head word → plural • Else, eg: “Add okButton to content panel” • Head word of NP → different • But if fields of same class → “all attributes” or “different attributes”
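A minimal sketch of the two synthesis cases named on this slide follows. Several things here are assumptions: the class and method names are hypothetical, pluralization is done by naively appending "s", and the choice between "all" and "different" is passed in as a flag because the slide does not say how it is decided.

```java
// Hypothetical sketch of phrase synthesis for an integrated sequence fragment.
class SequenceDescription {

    // Case 1: equal head words, so the shared head word is pluralized.
    // Assumption: naive pluralization by appending "s".
    static String forEqualHeads(String verb, String headWord, String prepositionalPhrase) {
        return verb + " " + headWord + "s " + prepositionalPhrase;
    }

    // Case 2: the themes are fields of the same class.
    // Assumption: the caller decides whether all fields are covered.
    static String forFieldsOfSameClass(String verb, String className, boolean allFields) {
        String quantifier = allFields ? "all" : "different";
        return verb + " " + quantifier + " attributes of " + className;
    }
}
```

Under these assumptions, `forEqualHeads("add", "panel", "to content panel")` gives "add panels to content panel", and `forFieldsOfSameClass("set", "SVGApplicationModel", false)` gives "set different attributes of SVGApplicationModel".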
Abstracting conditionals • Challenges • Integrating similar statements in different branches • Integrating conditional statements guarding different branches • Integrating return statements with literals or similar method calls
cont.. • Identifying and describing conditionals • Integrate statements of each block, compare each statement with statements of parent block • For method calls Singular
cont.. • For return statements • For assignment statements Theme based on enclosing method Update, create or get
cont.. • Describing conditional expressions • Compare phrases as 'subject predicate object' • If subject and predicate are equal → “based on what <subject> <predicate>” • Eg: “Based on what os name starts with” • If only the head word of the subject is equal → “based on which <head word> <predicate> <object>” • Eg: “Based on which radio button is selected”
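The two templates on this slide can be sketched as a comparison of two condition clauses. The `Clause` class is hypothetical, the clauses are assumed to be already split into subject, head word, predicate and object, and taking the predicate and object from the first clause in the second template is an assumption of this sketch.

```java
import java.util.Optional;

// Hypothetical sketch of describing the conditions guarding two branches.
class ConditionDescriber {

    // Assumed 'subject predicate object' form of one conditional expression.
    static final class Clause {
        final String subject;     // e.g. "os name"
        final String subjectHead; // head word of the subject, e.g. "name"
        final String predicate;   // e.g. "starts with"
        final String object;      // e.g. "Windows"

        Clause(String subject, String subjectHead, String predicate, String object) {
            this.subject = subject;
            this.subjectHead = subjectHead;
            this.predicate = predicate;
            this.object = object;
        }
    }

    static Optional<String> describe(Clause a, Clause b) {
        if (a.subject.equals(b.subject) && a.predicate.equals(b.predicate)) {
            // Template 1: based on what <subject> <predicate>
            return Optional.of("based on what " + a.subject + " " + a.predicate);
        }
        if (a.subjectHead.equals(b.subjectHead)) {
            // Template 2: based on which <head word> <predicate> <object>
            String phrase = "based on which " + a.subjectHead + " " + a.predicate + " " + a.object;
            return Optional.of(phrase.trim());
        }
        return Optional.empty(); // the slide gives no template for this case
    }
}
```

Two clauses "os name starts with Windows" and "os name starts with Mac" produce "based on what os name starts with"; two clauses whose subjects are "plain radio button" and "html radio button" (hypothetical names) with predicate "is" and object "selected" produce "based on which radio button is selected".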
Finding traceable patterns in loops • Challenges • Common algorithms as finding, counting, copying • Develop identification templates • Develop heuristics to synthesize phrases for each template
cont.. • Loop abstractions implemented • Count • Contains • Find • Copy • Max-min
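To make the five loop abstractions concrete, the following sketch shows one small loop per pattern. These are illustrative loop shapes written for this summary, not the identification templates from the paper, and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative examples of loops a pattern detector might target.
class LoopPatternExamples {

    // Count: increment a counter for each item satisfying a condition.
    static int countNegatives(List<Integer> values) {
        int count = 0;
        for (int value : values) {
            if (value < 0) count++;
        }
        return count;
    }

    // Contains: return true as soon as one item satisfies the condition.
    static boolean containsNegative(List<Integer> values) {
        for (int value : values) {
            if (value < 0) return true;
        }
        return false;
    }

    // Find: return the first item satisfying the condition, or null if none.
    static Integer findFirstNegative(List<Integer> values) {
        for (Integer value : values) {
            if (value < 0) return value;
        }
        return null;
    }

    // Copy: add items satisfying the condition to another collection.
    static List<Integer> copyNegatives(List<Integer> values) {
        List<Integer> result = new ArrayList<>();
        for (Integer value : values) {
            if (value < 0) result.add(value);
        }
        return result;
    }

    // Max-min: track the largest item seen so far (assumes a non-empty list).
    static int max(List<Integer> values) {
        int best = values.get(0);
        for (int value : values) {
            if (value > best) best = value;
        }
        return best;
    }
}
```

Each method body is dominated by a single loop whose overall effect can be stated in one phrase, which is the kind of fragment the loop abstractions aim to capture.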
cont.. • Identifying fragments and synthesizing templates
cont.. • Variations in synthesis templates • 'find item (in collection) whose/which/such that <criteria>', with <criteria> in subject predicate object form • If item is subject → which • If an attribute of item is subject → whose • Default → such that
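The choice of connective in the find template can be sketched as a small lookup. The enum `SubjectKind` and the method names are hypothetical, and the sketch assumes the kind of subject in the criteria has already been determined.

```java
// Hypothetical sketch of choosing "which", "whose" or "such that" for a find loop.
class FindPhrase {

    // What the subject of the criteria refers to.
    enum SubjectKind { ITEM, ATTRIBUTE_OF_ITEM, OTHER }

    static String connective(SubjectKind kind) {
        switch (kind) {
            case ITEM:
                return "which";     // the item itself is the subject
            case ATTRIBUTE_OF_ITEM:
                return "whose";     // an attribute of the item is the subject
            default:
                return "such that"; // default case on the slide
        }
    }

    // Fills the template: find <item> in <collection> <connective> <criteria>
    static String describe(String item, String collection, SubjectKind kind, String criteria) {
        return "find " + item + " in " + collection + " " + connective(kind) + " " + criteria;
    }
}
```

For example, `describe("item", "list", FindPhrase.SubjectKind.ATTRIBUTE_OF_ITEM, "name equals key")` gives "find item in list whose name equals key"; the item, collection and criteria strings are made-up inputs.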
Evaluation • Executed on 1.2 million methods across 1000 Java programs
cont.. • How prevalent are the implemented high level methods? • Sequence (methods with >= 10 statements 12.5%) • 11% • Conditional • 40% of if-else • 24% of switch • Loop • 51% of loops classified as iterating over all items in a collection • 15% of iterator loops detected by implemented patterns
cont.. • Potential reduction in reading detail • Reduction in identified high level actions • Sequence → one phrase • 22% of original size • Conditional → two phrases • 29% of original size • Loop → varying # phrases • 25% of original size
cont.. • Precision of identification and description • 15 human evaluators, each evaluating 15 code fragments (5 sequence, 5 conditional, 5 loop) • 75 code fragments from 15 projects, each evaluated by 3 evaluators • From methods with <= 20 statements • 25 conditional, 25 sequence, 25 loop • Loops: 5 fragments from each of the 5 patterns
cont.. • Evaluators wrote an abstraction of the method • Answered the following on a scale from 1 (strongly disagree) to 5 (strongly agree) • Identification • Description
cont.. Majority agreed or strongly agreed on both P1 and P2
Concerns.. • May not generalize to other Java programs • Results may vary on larger programs • Results might not hold with novices • Reduction in reading measurement may not hold with some developers
Improving client tools • Extract method refactoring • Eg: “Create application based on what os starts with” • Eg: “Set different attributes of SVGApplicationModel”
cont.. • Internal comment generation • Instead of Extract Method refactoring, can add comments inline • Add empty lines between related code fragments • Suggesting more informative method names • Improving automatically generated summary comments for a method
Conclusions • First technique for identifying code fragments (statement sequences, conditionals and loops) that can be abstracted as a high level action • Automatically synthesizes a natural language description for each fragment
References • Giriprasad Sridhara, Lori Pollock, and K. Vijay-Shanker. Automatically detecting and describing high level actions within methods. In Proceedings of the 33rd International Conference on Software Engineering (ICSE '11). ACM, New York, NY, USA, 101-110. • G. Sridhara, E. Hill, D. Muppaneni, L. Pollock, and K. Vijay-Shanker. Towards Automatically Generating Summary Comments for Java Methods. Intl. Conf. on Automated Softw. Engg. (ASE '10), 2010. • Giriprasad Sridhara, Lori Pollock, and K. Vijay-Shanker. Generating Parameter Comments and Integrating with Method Summaries. In 2011 IEEE 19th International Conference on Program Comprehension, pp. 71-80, 2011. • E. Hill. Integrating Natural Language and Program Structure Information to Improve Software Search and Exploration. PhD Dissertation, University of Delaware, 2010.