Automatic Source Code Specification. Gabriella Cruz (1), Justin Talavera (2), Joseph Urban (3).
1. Department of Arts and Sciences, Georgia State University
2. Department of Electrical and Computer Engineering, Texas Tech University
3. Department of Industrial Engineering, Texas Tech University
Texas Tech University 2014 NSF Research Experiences for Undergraduates Site Project

Abstract
Agile development gives developers flexibility to change code between development iterations. Although an agile approach anticipates change, much time can be spent understanding the previous iteration and examining the code in depth to find the parts that need to change. Time is also spent in the maintenance phase, when later developers need to understand what the program does. In many cases, developers spend more time reading and understanding code than writing it. This imbalance comes from the lack of a specification or summary of the source code in an easy-to-read format and language. Most developers skip writing a specification because of time constraints or other factors. To leave a specification that future programmers can understand, developers need a tool that generates it automatically as the source code is written, so that a fully understandable specification costs them no extra time. This project focused on a tool that automatically creates specifications between development iterations. The tool saves developers the time of writing specifications by hand, and it also reduces ambiguity between iterations and for future developers.

Objectives
• The specification should be created automatically while the source code is being written.
• The tool should be efficient and should leave the program with a complete and correct specification when the source code is finished.
• The tool should give a full and understandable specification in natural language.
• The tool should give end users easy access to the specification and let them make changes that the developer can see.
• The tool should use the developer's manually written comments to make the specification easier to understand.
• The tool should work in real time with the source code and automatically generate a new specification while the source code is edited.
• The specification should be concise and significantly shorter than the source code.
• The finished specification should be easy to modify.

Results & Summary
• Created a source code process for producing the automatic specification.
• Created a Java splitter that splits words on capital letters, numbers, underscores, periods, and spaces for the preprocessing phase.
• Created a part-of-speech tagger to tag the split words.
This project focused on a tool that assists in writing the specification of source code. The tool saves time between development iterations and gives future developers more insight into what the source code contains. The main methods used to create the tool are the Software Word Usage Model [4] and part-of-speech (POS) tagging, which identify the linguistic properties of component words from the source code, along with camel case splitting and variable lexicalization.

Future Work
• Further approaches and methods will be evaluated for creating an automatic specification.
• The automatic source code specification tool will be prepared for real-world use.
• The automatic specification will be tested for quality and efficiency.

Having an automatic specification between developments allows for flexibility and easy understanding when changing past developments.
Time is spread evenly across the phases because the specification is automatically created for the previous development [6].

Methods
• The source code process chart outlines the steps source code would follow to obtain an automatic specification.
• The source code begins in the preprocessing phase, where the code is split into components to be analyzed.
• Next, abbreviations are automatically expanded.
• Then, linguistic elements of the component words are identified using the Software Word Usage Model [4] and part-of-speech tagging.
• The statement selection phase uses guidelines to decide which statements are omitted and which are needed in the specification.
• Statements that are included, with some exceptions, are: ending statements, void-return statements, same-action statements, data-facilitating statements, and controlling statements.
• The next phase performs variable lexicalization and then translates the selected statements into natural language.
• After the statements are translated, they are combined to create a summary of the code [4].
• The summary is then turned into a specification that adds abstractive information not present in the original source code.

Related Work
• Automatic source code summarization [3]: a tool that automatically creates a list of keywords from the original source code.
• Automatic autofolding [2]: a tool that automatically folds or hides code that is irrelevant to understanding the program.
• Automatic generation of summary comments for Java methods and classes [4, 1]: a tool that automatically creates summary comments within source code for Java methods and classes.
• Automatic program generation from formal specifications using APTS [5]: a tool that automatically creates source code from a written APTS specification.

References
[1] L. Moreno, J. Aponte, G. Sridhara, A. Marcus, L. Pollock, and K. Vijay-Shanker, "Automatic Generation of Natural Language Summaries for Java Classes," in 21st International Conference on Program Comprehension (ICPC), San Francisco, CA, USA, 2013, pp. 23-32.
[2] J. Fowkes, R. Ranca, M. Allamanis, M. Lapata, and C. Sutton, "Autofolding for Source Code Summarization," CoRR, arXiv:1403.4503v1, pp. 1-12, 2014.
[3] S. Haiduc, J. Aponte, and A. Marcus, "Supporting Program Comprehension with Source Code Summarization," in 32nd ACM/IEEE International Conference on Software Engineering, NIER track, Cape Town, South Africa, 2010, pp. 223-226.
[4] G. Sridhara, E. Hill, D. Muppaneni, L. Pollock, and K. Vijay-Shanker, "Towards Automatically Generating Summary Comments for Java Methods," in IEEE/ACM International Conference on Automated Software Engineering (ASE), New York, NY, USA, 2010, pp. 43-52.
[5] E. Leonard and C. Heitmeyer, "Automatic Program Generation from Formal Specifications using APTS," in Automatic Program Development, Netherlands: Springer, 2008, pp. 93-113.
[6] "Characteristics of Agile Methodology in Software Development," Available: http://blogs.globalteckz.com/characteristics-of-agile-methodology-in-software-development/, August 11, 2013 [July 14, 2014].

* DISCLAIMER: This material is based upon work supported by the National Science Foundation and the Department of Defense under Grant No. CNS-1263183. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the Department of Defense.

Source Code Process (chart stages, in order)
Source Code
Preprocessing
• Splitting of identifiers
• Abbreviation expanding
• POS Tagging & SWUM
Statement Selection
• Omitting non-relevant information
• Identifying relevant information
Translating Chosen Statements
• Variable lexicalization
• Translating selected statements into natural language
Combining Information
• Combining natural language translations
• Creating an abstractive specification
Completed Specification
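The identifier-splitting and abbreviation-expanding steps of the preprocessing stage can be illustrated with a short Java sketch. This is not the project's splitter. The class name `IdentifierPreprocessor`, the extra rule that separates an acronym from a following word, and the small abbreviation table are assumptions made only for illustration. The sketch splits on the boundaries the poster lists (capital letters, numbers, underscores, periods, and spaces) and then looks each word up in a hypothetical dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

public class IdentifierPreprocessor {

    // Boundaries at which an identifier is split. The separator characters
    // are consumed; the remaining alternatives are zero-width boundaries.
    private static final Pattern BOUNDARY = Pattern.compile(
            "[_.\\s]+"                      // underscores, periods, spaces
            + "|(?<=[a-z])(?=[A-Z])"        // lower to upper case: camelCase
            + "|(?<=[A-Z])(?=[A-Z][a-z])"   // acronym then word (assumed rule)
            + "|(?<=[A-Za-z])(?=[0-9])"     // letter to digit
            + "|(?<=[0-9])(?=[A-Za-z])");   // digit to letter

    // Hypothetical abbreviation table; a real tool would need a much larger,
    // context-aware dictionary.
    private static final Map<String, String> ABBREVIATIONS = Map.of(
            "buf", "buffer",
            "max", "maximum",
            "num", "number",
            "str", "string",
            "idx", "index");

    /** Splits an identifier into its component words, keeping their case. */
    public static List<String> split(String identifier) {
        List<String> words = new ArrayList<>();
        for (String part : BOUNDARY.split(identifier)) {
            if (!part.isEmpty()) {          // drop empties from leading separators
                words.add(part);
            }
        }
        return words;
    }

    /** Lowercases each word and expands it if it is a known abbreviation. */
    public static List<String> expand(List<String> words) {
        List<String> expanded = new ArrayList<>();
        for (String word : words) {
            String key = word.toLowerCase(Locale.ROOT);
            expanded.add(ABBREVIATIONS.getOrDefault(key, key));
        }
        return expanded;
    }

    public static void main(String[] args) {
        String[] samples = {"getMaxBufSize", "XMLParser2", "user_name.idx"};
        for (String id : samples) {
            System.out.println(id + " -> " + expand(split(id)));
        }
    }
}
```

With these assumptions, `getMaxBufSize` is split into `get`, `Max`, `Buf`, `Size` and expanded to `get`, `maximum`, `buffer`, `size`. The resulting word list is the kind of input that the later part-of-speech tagging stage would consume.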