Cross-application Fan-in Analysis for Finding Application-specific Concerns
Makoto Ichii, Takashi Ishio, Katsuro Inoue
Osaka University
AOAsia 4
Coding pattern detection
• Automatic detection of crosscutting concerns helps with
  • Finding refactoring opportunities
  • Understanding application-specific coding rules
• Fung: coding pattern detection tool [Ishio, 2008][Miyake, 2007]
  • Detects coding patterns, including crosscutting concerns, in an application using a data mining technique
  • Basic idea: “code for a crosscutting concern appears frequently across an application”
[Ishio, 2008] T. Ishio, H. Date, T. Miyake and K. Inoue, "Mining Coding Patterns to Detect Crosscutting Concerns in Java Programs", Proc. WCRE 2008, 2008
[Miyake, 2007] T. Miyake, T. Ishio, K. Taniguchi and K. Inoue, "Towards Maintenance Support for Idiom-based Code Using Sequential Pattern Mining", Proc. AOAsia 3, 2007
Example of coding pattern
• Coding pattern
  • An ordered sequence of method calls and control statements that frequently appears in source code
• Process of coding pattern detection
  • Source code -> (parse & normalize) -> Method call sequence -> (sequential pattern mining) -> Coding pattern
Source code 1: … if (log.isDebugEnabled()) { log.debug(getMessage()); } …
Method call sequence 1: … isDebugEnabled() IF getMessage() debug() END_IF …
Source code 2: … String status = getStatus(); if (log.isDebugEnabled()) { log.debug(status); } …
Method call sequence 2: … getStatus() isDebugEnabled() IF debug() END_IF …
Source code 3: … if (log.isDebugEnabled()) { log.debug("QBK"); } …
Method call sequence 3: … isDebugEnabled() IF debug() END_IF …
Coding pattern: 1: isDebugEnabled() 2: IF 3: debug() 4: END_IF
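The mining step counts how often a candidate pattern occurs in the normalized method call sequences. The following is a minimal sketch of that support count only, not Fung's actual implementation. It assumes a pattern matches a sequence when its elements occur in order, possibly with other calls in between (as with getMessage() in the first sequence above). The class and method names (PatternSupport, containsInOrder, support, slideExample) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class PatternSupport {
    // True if every element of `pattern` occurs in `sequence` in the same
    // order; unrelated elements may appear in between (assumption).
    static boolean containsInOrder(List<String> sequence, List<String> pattern) {
        int matched = 0;
        for (String element : sequence) {
            if (matched < pattern.size() && element.equals(pattern.get(matched))) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    // Number of sequences that contain the pattern (its "support").
    static int support(List<List<String>> sequences, List<String> pattern) {
        int count = 0;
        for (List<String> sequence : sequences) {
            if (containsInOrder(sequence, pattern)) {
                count++;
            }
        }
        return count;
    }

    // The three method call sequences shown on the slide.
    static List<List<String>> slideExample() {
        return Arrays.asList(
            Arrays.asList("isDebugEnabled()", "IF", "getMessage()", "debug()", "END_IF"),
            Arrays.asList("getStatus()", "isDebugEnabled()", "IF", "debug()", "END_IF"),
            Arrays.asList("isDebugEnabled()", "IF", "debug()", "END_IF"));
    }
}
```

With the three sequences from the slide, the logging pattern isDebugEnabled() / IF / debug() / END_IF is contained in all of them. A real sequential pattern miner would also generate the candidate patterns and apply a minimum-support threshold, which this sketch leaves out.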
Need for application-specific concerns
• Detected coding patterns include generic idioms
  • Idioms also appear frequently across a code base
  • They are less interesting to developers who need application-specific knowledge
Patterns detected in the target application:
  Logging: 1: isDebugEnabled() 2: IF 3: debug() 4: END_IF
  Iterator idiom: 1: iterator() 2: hasNext() 3: LOOP 4: next() 5: hasNext() 6: END_LOOP
Filtering approach: cross-application fan-in analysis
• Key idea
  • Generic idioms appear in various applications
  • Application-specific patterns appear in a few applications
• Measure how widely a class/pattern is used across applications
  • “Universality” metric
Examples:
  Logging (appears in only two applications): 1: isDebugEnabled() 2: IF 3: debug() 4: END_IF
  Iterator idiom (appears in almost all applications): 1: iterator() 2: hasNext() 3: LOOP 4: next() 5: hasNext() 6: END_LOOP
Approach overview
• Collect various applications
  • Including the target application
• Analyze the use-relation between the classes in the applications
• Measure the universality metric for each class
• Filter out the patterns comprising only universally-used classes
[Figure: Application collection -> Use-relation between classes -> List of universally-used classes; Target application -> Detected patterns -> Filtered patterns]
Detected patterns shown in the figure:
  1: iterator() 2: hasNext() 3: LOOP 4: next() 5: hasNext() 6: END_LOOP
  1: indexOf 2: lastIndexOf 3: substring
  1: contains 2: IF 3: get 4: END_IF
  1: activate 2: IF 3: deactivate 4: END_IF
  1: isDebugEnabled() 2: IF 3: debug() 4: END_IF
Cross-application use-relation
• An extension of ordinary static use-relation analysis between classes in an application
• Build a use-relation graph
  • Node: class
  • Edge: static use-relation between classes
• Kinds of use-relation
  • Inheritance, method call, field access, instantiation, and variable/parameter declaration
Example (WarehouseApp):
  Source code: class Warehouse { … Liquor liq = new Liquor(); … } class Liquor { long price; String name; … }
  Use-relation graph: Warehouse -> Liquor
Cross-application use-relation
• Analyze use-relation between classes across application borders
• Analyze intra-application use-relation
  • In the same way as for a single application
• If there are several copies of a “used class” in different applications, create edges to all of them
[Figure: use-relation graph spanning WarehouseApp and StoreApp, with classes Warehouse, Store, Shelf, Liquor (two nodes) and Paper; one Liquor node is labeled "A copy of Liquor in WarehouseApp"]
Class fan-in and application fan-in
• Class fan-in of a class c
  • The number of classes using c
• Application fan-in of a class c
  • The number of applications using c
[Figure: use-relation graph spanning WarehouseApp and StoreApp, with classes Warehouse, Store, Shelf, Liquor (two nodes) and Paper]
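The two fan-in counts can be sketched as follows. This is an illustrative sketch under stated assumptions: classes are identified by name, so copies of a class in several applications count as one class, and a class using itself is ignored. The edges from Store and Shelf are made up for the example, because the edges of the figure are not available in the text. The names FanInSketch, addUse and example are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class FanInSketch {
    // used class name -> names of the classes that use it
    private final Map<String, Set<String>> userClasses = new HashMap<>();
    // used class name -> names of the applications that contain a user
    private final Map<String, Set<String>> userApps = new HashMap<>();

    // Records that `userClass`, located in application `app`, uses `usedClass`.
    void addUse(String app, String userClass, String usedClass) {
        if (userClass.equals(usedClass)) {
            return; // assumption: self-use does not contribute to fan-in
        }
        userClasses.computeIfAbsent(usedClass, k -> new HashSet<>()).add(userClass);
        userApps.computeIfAbsent(usedClass, k -> new HashSet<>()).add(app);
    }

    // Class fan-in: the number of classes using c.
    int classFanIn(String c) {
        return userClasses.getOrDefault(c, Collections.<String>emptySet()).size();
    }

    // Application fan-in: the number of applications using c.
    int applicationFanIn(String c) {
        return userApps.getOrDefault(c, Collections.<String>emptySet()).size();
    }

    static FanInSketch example() {
        FanInSketch graph = new FanInSketch();
        graph.addUse("WarehouseApp", "Warehouse", "Liquor"); // from the slide's source code
        graph.addUse("StoreApp", "Store", "Shelf");          // hypothetical edge
        graph.addUse("StoreApp", "Shelf", "Liquor");         // hypothetical edge
        graph.addUse("StoreApp", "Shelf", "Paper");          // hypothetical edge
        return graph;
    }
}
```

Under these assumed edges, Liquor is used by two classes in two applications, while Paper is used by one class in one application.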
The Class Universality Metric
• Class universality of a class c
  • Represents how widely a class is used
  • From many classes / applications
Symbols used in the definition: i_c: class fan-in of c; a_c: application fan-in of c; |C|: total number of classes; |A|: total number of applications
[Figure labels: Frequently used locally; Frequently used universally]
The Pattern Universality Metric
• Pattern universality of a pattern p
  • The minimum universality value of the classes whose methods are invoked in p
  • A universal pattern comprises only universal classes
Example:
  Coding pattern: 1: iterator() 2: hasNext() 3: LOOP 4: next() 5: hasNext() 6: END_LOOP
  Pattern universality = 0.72
[Figure: involved classes of the coding pattern]
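Taking the minimum over the involved classes can be sketched as below. The class universality values in example() are made-up placeholders, not measured results, and the choice of involved classes is an assumption. The names PatternUniversality, of and example are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PatternUniversality {
    // Pattern universality: the minimum class universality among the classes
    // whose methods are invoked in the pattern.
    static double of(Map<String, Double> classUniversality, List<String> involvedClasses) {
        if (involvedClasses.isEmpty()) {
            throw new IllegalArgumentException("pattern involves no classes");
        }
        double min = Double.POSITIVE_INFINITY;
        for (String c : involvedClasses) {
            Double u = classUniversality.get(c);
            if (u == null) {
                throw new IllegalArgumentException("no universality value for " + c);
            }
            min = Math.min(min, u);
        }
        return min;
    }

    static double example() {
        Map<String, Double> universality = new HashMap<>();
        universality.put("java.util.Collection", 0.9); // made-up value
        universality.put("java.util.Iterator", 0.6);   // made-up value
        return of(universality, Arrays.asList("java.util.Collection", "java.util.Iterator"));
    }
}
```

Using the minimum means that a single non-universal class is enough to keep a pattern from being treated as universal, which matches the statement that a universal pattern comprises only universal classes.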
Case studies
• Case Study 1
  • Measure the class universality values of actual classes
• Case Study 2
  • Measure the pattern universality values of coding patterns detected by Fung
Case Study 1 – Overview
Questions
• Q.1 What kind of classes have high universality?
• Q.2 Can universality distinguish classes that are widely used from classes that are simply frequently used?
• Q.3 What threshold value is good for filtering?
Process
• Measure the class universality of the classes in the application collection
• Investigate the results to answer the questions
  • The top 20 classes by universality [Q.1]
  • Difference between universality and fan-in [Q.2]
  • Distribution of universality [Q.3]
Target
• 39 application packages (131,328 classes)
  • Java SE 1.5
  • Various OSS packages covering a broad range of domains
  • Eclipse (IDE), Azureus (network client), Apache Tomcat (network server), Freemind (drawing tool), …
Case Study 1 – Top 20 classes in class universality
Q.1 What kind of classes have high universality?
• Fundamental / utility classes
Case Study 1 – Universality and fan-in
Q.2 Can universality distinguish classes that are widely used from classes that are simply frequently used?
• Yes.
• High universality / low fan-in
  • Classes with a fundamental / utility role
• Low universality / high fan-in
  • Classes implementing crosscutting concerns in a large application
Case Study 1 – Distribution
Q.3 What threshold value is good for filtering?
• 0.2 for finding application-specific concerns
• 0.5 for filtering out generic concerns
Distribution of class universality:
• 1.0-0.5: general-purpose classes
  • Primitive/fundamental classes, collection utilities, …
• 0.5-0.2: domain-specific classes
  • Logging utility, networking, GUI, …
• 0.2-0: application-local classes
Case Study 2 – Overview
Question
• Can the pattern universality distinguish among generic, domain-specific and application-specific patterns?
Process
• Categorize coding patterns according to pattern universality
  • 1.0 – 0.5: Generic pattern
  • 0.5 – 0.2: Domain-specific pattern
  • 0.2 – 0.0: Application-specific pattern
Target
• Coding patterns
  • Azureus (presented in [Ishio, 2008])
• Application collection
  • Same as Case Study 1
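The categorization step can be sketched as a simple threshold check. The slides give the ranges but not which category a value exactly on a boundary belongs to; this sketch assumes 0.5 counts as generic and 0.2 as domain-specific, and that universality lies in [0, 1]. The names UniversalityCategory, Category and categorize are hypothetical.

```java
final class UniversalityCategory {
    enum Category { GENERIC, DOMAIN_SPECIFIC, APPLICATION_SPECIFIC }

    static final double GENERIC_THRESHOLD = 0.5;         // from the slides
    static final double DOMAIN_SPECIFIC_THRESHOLD = 0.2; // from the slides

    // Maps a pattern universality value to one of the three categories.
    // Boundary handling (>=) is an assumption, not stated in the slides.
    static Category categorize(double universality) {
        if (Double.isNaN(universality) || universality < 0.0 || universality > 1.0) {
            throw new IllegalArgumentException("universality must be in [0, 1]");
        }
        if (universality >= GENERIC_THRESHOLD) {
            return Category.GENERIC;
        }
        if (universality >= DOMAIN_SPECIFIC_THRESHOLD) {
            return Category.DOMAIN_SPECIFIC;
        }
        return Category.APPLICATION_SPECIFIC;
    }
}
```

For instance, a pattern with universality 0.72 would fall into the generic category under these thresholds, and one with 0.1 into the application-specific category.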
Case Study 2 – Result
Q. Can the pattern universality distinguish generic / domain-specific / application-specific patterns?
• Almost yes.
• Generic patterns (2290 patterns)
  • String manipulation
    • String.lastIndexOf() / IF / String.substring() / END_IF
  • Collection manipulation
    • List.get() / IF / List.remove() / END_IF
• Domain-specific patterns (79 patterns)
  • Collection manipulation
    • Map.size() / Iterator.remove() / LinkedHashMap.get() / LinkedHashMap.remove()
    • Domain-specific?
• Application-specific patterns (2293 patterns)
  • Logging
    • LOOP / Thread.sleep() / Debug.printStackTrace() / END_LOOP
  • Synchronization
    • IF / AEMonitor.enter() / ArrayList.remove() / AEMonitor.exit() / END_IF
Discussion
• The universality metric can distinguish universally-used classes
  • Resource management classes in Eclipse/NetBeans are identified as application-specific
    • although they have large fan-in
• The universality metric value may depend on the set of applications
  • Case studies on different targets are needed
    • E.g. industrial software systems
Discussion
• Some domain-specific classes have higher class universality than general-purpose classes
  • Ideas to improve the metric
    • Propagate fan-in through important use-relations
      • E.g. inheritance
    • Combine with other metrics
• Less popular generic concerns may be more interesting than famous domain-specific ones
Summary and future work
• Cross-application fan-in analysis for filtering coding patterns
  • Measures universality, a metric that represents how widely a class/pattern is used
• Future work
  • Case studies with different applications
  • Refinement of the universality metric