OOSE 01/17 • Institute of Computer Science and Information Engineering, National Cheng Kung University • Members: Q76001074 薛弘志, P76014020 蔡文豪, F74982155 周詩御
Reference • Prasad, A.V.K. and Ramakrishna, S. (2010b), ‘Data Mining for Secure Software Engineering – Source Code Management Tool Case Study’, International Journal of Engineering Science and Technology, vol. 2(7), pp. 2667-2677.
Introduction • To improve software productivity and quality, software engineers are increasingly applying data mining algorithms to various software engineering tasks. • However, mining software engineering data poses several challenges, requiring various algorithms to effectively mine sequences, graphs, and text from such data. • Using well-established data mining techniques, practitioners can explore the potential of this valuable data to better manage their projects and produce higher-quality software systems that are delivered on time and within budget.
Introduction (cont.) • Mining algorithms for software engineering fall into four main categories: • Frequent pattern mining – finding commonly occurring patterns. • Pattern matching – finding data instances for given patterns. • Clustering – grouping data into clusters. • Classification – predicting labels of data based on already labeled data.
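As an illustration of the first category, frequent pattern mining, the following is a minimal sketch that counts which pairs of files are changed together across commits. The commit data, the file names, and the `frequent_pairs` helper are hypothetical and not taken from the referenced paper; real tools use more scalable algorithms such as Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each unordered pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical commits, each listing the files changed together.
commits = [
    ["scanner.cpp", "token.h"],
    ["scanner.cpp", "token.h", "parser.cpp"],
    ["parser.cpp", "README"],
    ["scanner.cpp", "token.h"],
]

result = frequent_pairs(commits, min_support=3)
print(result)
```

With these made-up commits, only the pair scanner.cpp and token.h reaches the support threshold of 3, which would suggest that the two files tend to change together.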
Introduction (cont.) • Software engineering data can be broadly categorized into: • Sequences, such as execution traces collected at runtime, static traces extracted from source code, and co-changed code locations. • Graphs, such as dynamic call graphs collected at runtime and static call graphs extracted from source code. • Text, such as bug reports, e-mails, code comments, and documentation.
Objectives • The objective of the research work is to propose strategic data mining tools for program source code debugging that improve software reliability and quality. • Software engineers can start with either a problem-driven approach (knowing which SE task to assist) or a data-driven approach (knowing which SE data to mine), but in practice they commonly adopt a mixture of the first two steps: collecting data to mine and determining the SE tasks to assist. • The three remaining steps are, in order: preprocessing data, adopting a mining algorithm, and postprocessing and applying the mining results.
Objectives (cont.) • Preprocessing data involves first extracting relevant data from the raw SE data. This data is then cleaned and properly formatted for the mining algorithm. • The next step produces a mining algorithm and its supporting tool, based on the mining requirements derived in the first two steps. • The final step transforms the mining algorithm's results into an appropriate format required to assist the SE task.
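The three remaining steps (preprocessing, mining, postprocessing) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the trace format with `CALL` lines, the function names `preprocess`, `mine`, and `postprocess`, and the choice of counting consecutive call pairs are all hypothetical and not the method of the referenced paper.

```python
import re
from collections import Counter


def preprocess(raw_lines):
    """Extract relevant data: keep only the function name from each CALL line."""
    calls = []
    for line in raw_lines:
        match = re.search(r"CALL\s+(\w+)", line)
        if match:
            calls.append(match.group(1))
    return calls


def mine(calls, min_count):
    """Count consecutive call pairs and keep those seen at least min_count times."""
    pairs = Counter(zip(calls, calls[1:]))
    return {pair: n for pair, n in pairs.items() if n >= min_count}


def postprocess(patterns):
    """Turn mined patterns into readable rules for the engineer."""
    return sorted(
        f"{first} is followed by {second} ({n} times)"
        for (first, second), n in patterns.items()
    )


# Hypothetical raw execution trace (the log format is an assumption).
raw_trace = [
    "12:00:01 CALL open",
    "12:00:01 INFO cache warm",
    "12:00:02 CALL read",
    "12:00:03 CALL close",
    "12:00:04 CALL open",
    "12:00:05 CALL read",
    "12:00:06 CALL close",
]

rules = postprocess(mine(preprocess(raw_trace), min_count=2))
for rule in rules:
    print(rule)
```

Keeping the three steps as separate functions mirrors the slide: the mining step can be swapped for another algorithm without touching the extraction or reporting code.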
Objectives (cont.) • Further, many such tools are general purpose and should be adapted to assist the particular task at hand. • However, software engineering researchers may lack the expertise to adapt mining algorithms, while data mining researchers may lack the background to understand mining requirements in the software engineering domain. • One promising way to reduce this gap is to foster close collaboration between the software engineering community (requirement providers) and the data mining community (solution providers).
Implementations • The management of source code is one of the greatest challenges facing programmers today. As programs become larger and more complex, the need to organize and manage source code increases. • The authors' motivation is to implement source code maintenance routines that parse tokens from an ANSI C++ file, format the file, extract header files, and colorize the file.
Implementations (cont.) • When files are shared among objects, it is difficult to track which files depend on others. • A source code maintenance program can parse the source code and produce documentation that describes each class, its member variables, and its functions. • Maintaining consistently structured code among team members is extremely difficult and time consuming because programmers must modify their individual styles. • A source code formatter offers a convenient solution to this problem.
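To make the dependency-tracking point concrete, here is a minimal sketch of extracting header files from C++ source text. It is not the authors' tool: the `extract_headers` helper and the sample source are hypothetical, and the regular expression is a simplification that does not skip `#include` lines inside block comments or inactive `#if` branches.

```python
import re

# Matches lines such as: #include <iostream>  or  #include "scanner.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*([<"])([^>"]+)[>"]', re.MULTILINE)


def extract_headers(source):
    """Return (system_headers, local_headers) in order of appearance."""
    system, local = [], []
    for delimiter, name in INCLUDE_RE.findall(source):
        # Angle brackets mark system headers; quotes mark project-local headers.
        (system if delimiter == "<" else local).append(name)
    return system, local


# Hypothetical C++ file contents.
sample = '''#include <iostream>
#include "scanner.h"
  #  include "token.h"
// not an include: "#include" in text
int main() { return 0; }
'''

system_headers, local_headers = extract_headers(sample)
print("system:", system_headers)
print("local:", local_headers)
```

Running the extraction over every file in a project and collecting the local headers per file gives a simple dependency map of which files rely on which others.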
Implementations (cont.) • Code maintenance modules receive source code as input, break the code down into tokens, and then output them in a new format. • The utility is based on three class groups: • A scanner reads the code, breaks it down into tokens, and returns them to the parser. It also identifies the type of each token it returns. • The parser requests successive tokens from the scanner and takes the appropriate action before requesting the next token. The parser's action is to write out the token.
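The scanner and parser interaction described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the class and method names (`Scanner.next_token`, `Parser.run`), the token kinds, and the small keyword set are assumptions, and the patterns cover only a fraction of C++ lexical syntax.

```python
import re
from typing import NamedTuple, Optional


class Token(NamedTuple):
    kind: str
    text: str


# Simplified token patterns, tried in order; far from full C++ lexical rules.
TOKEN_SPEC = [
    ("comment", r"//[^\n]*|/\*.*?\*/"),
    ("string", r'"(?:\\.|[^"\\])*"'),
    ("number", r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("space", r"\s+"),
    ("symbol", r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC), re.DOTALL
)
KEYWORDS = {"int", "return", "if", "else", "while", "for", "class"}


class Scanner:
    """Breaks source text into tokens and identifies the type of each one."""

    def __init__(self, source: str):
        self._matches = TOKEN_RE.finditer(source)

    def next_token(self) -> Optional[Token]:
        for match in self._matches:
            kind = match.lastgroup
            if kind == "space":
                continue  # whitespace is skipped; a formatter re-creates it
            if kind == "identifier" and match.group() in KEYWORDS:
                kind = "keyword"
            return Token(kind, match.group())
        return None  # end of input


class Parser:
    """Requests successive tokens from the scanner and writes each one out."""

    def __init__(self, scanner: Scanner):
        self._scanner = scanner

    def run(self):
        written = []
        while True:
            token = self._scanner.next_token()
            if token is None:
                break
            written.append(f"{token.kind}:{token.text}")
        return written


output = Parser(Scanner("int x = 42; // answer")).run()
print(output)
```

Because every token carries its kind, the "write out" action could be replaced by one that emits reformatted or colorized text, for example wrapping keyword tokens in a color markup.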
The sequence diagram of the overall code maintenance process
Conclusion • Mining algorithms work on software engineering data such as text, sequences, and graphs, and can improve software engineering tasks such as programming, maintenance, bug detection, and debugging. • The authors implemented only the source code management tool, which is useful for code maintenance and programming. • For a programmer, bug detection and debugging are more important, so how to use mining algorithms to assist programmers with these tasks is left as future work.