An Architecture for Distributing the Computation of Software Clustering Algorithms
2001 Working Conference on Software Architecture (WICSA'01)
Brian S. Mitchell, Martin Traverso & Spiros Mancoridis
Math & Computer Science, Drexel University
Software Architecture
• Software architecture describes the:
  • System elements
  • Interaction between the system elements
  • Patterns that guide the composition of the elements
  • Constraints on the patterns
[Shaw & Garlan 1996]
Reverse Engineering Environment
[Figure: Source code (example: void main(){printf("hello");}) is processed by source code analysis tools (Acacia, Chava) into an MDG file over modules M1 to M8. The clustering tool produces a partitioned MDG file, which is displayed in the visualization tool. Two example partitions of the same modules are labeled MQ = 1.75 and MQ = 1.60.]
Software Architecture Challenges
• Determining the software architecture
  • Designer knowledge, and/or
  • Up-to-date documentation, and/or
  • Automated tooling
[Figure labels: Source Code Analysis, Clustering, Visualization, Evaluation, Design Constraint Validation]
Bunch Clustering Tool Evolution
[Figure: timeline with labels Semi-Automatic, Automatic, User Tooling. Bunch V 1.x (1998), Bunch V 2.x (1999-2000), Bunch V 3.x (2000-2001).]
• Distributed clustering added in Bunch version 2.x
• Bunch is the clustering tool produced by the Drexel University Software Engineering Research Group.
Clustering Tool Requirements
• Pluggable Algorithms
• User Knowledge Integration
• Programming Language Independence
• Tool Integration
• Source Code Analysis
• Visualization
• Evaluation
• API
• PERFORMANCE to handle large and complex systems
Bunch Challenges
• Performance was well suited to small and intermediate-sized systems (< 250 modules)
• Design/architecture changes were required to improve performance
• Clustering algorithm and implementation enhancements
• Distributed processing capabilities
Bunch Environment
[Figure: Source code passes through source code analysis to produce an MDG file. The Bunch clustering tool applies its clustering algorithms (exhaustive, hill-climbing, genetic) to the MDG and writes a partitioned MDG file for the visualization tool.]
Bunch Hill Climbing Clustering Algorithm
[Figure: flowchart with the following steps]
• Generate a random decomposition of the MDG and measure its MQ (current partition)
• Iteration step: generate the next neighbor, measure its MQ, and compare it to the best neighboring partition; if it is better, it becomes the new best neighboring partition
• The best neighboring partition for the iteration becomes the current partition
• Repeat until convergence
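The hill-climbing loop on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Bunch's implementation: the slides do not define MQ, so `mq` below is a stand-in score that rewards intra-cluster edges and penalizes inter-cluster edges, and the names `mq`, `neighbors`, and `hill_climb` are hypothetical. A neighbor is assumed to be a partition that differs by moving one module to another cluster.

```python
import random

def mq(edges, partition):
    """Stand-in quality score (an assumption, not Bunch's MQ definition):
    for each cluster, intra / (intra + 0.5 * inter), summed over clusters."""
    intra, inter = {}, {}
    for src, dst in edges:
        cs, cd = partition[src], partition[dst]
        if cs == cd:
            intra[cs] = intra.get(cs, 0) + 1
        else:
            inter[cs] = inter.get(cs, 0) + 1
            inter[cd] = inter.get(cd, 0) + 1
    return sum(mu / (mu + 0.5 * inter.get(c, 0)) for c, mu in intra.items())

def neighbors(partition, cluster_ids):
    """Yield every partition reachable by moving one module to another cluster."""
    for module, current in partition.items():
        for target in cluster_ids:
            if target != current:
                moved = dict(partition)
                moved[module] = target
                yield moved

def hill_climb(modules, edges, seed=0):
    """Start from a random decomposition and move to the best neighbor
    of each iteration until no neighbor improves the score."""
    rng = random.Random(seed)
    cluster_ids = list(range(len(modules)))
    current = {m: rng.choice(cluster_ids) for m in modules}
    current_mq = mq(edges, current)
    while True:
        best, best_mq = None, current_mq
        for candidate in neighbors(current, cluster_ids):
            candidate_mq = mq(edges, candidate)
            if candidate_mq > best_mq:  # compare to best neighboring partition
                best, best_mq = candidate, candidate_mq
        if best is None:  # convergence: no neighbor is better
            return current, current_mq
        current, current_mq = best, best_mq
```

Each iteration scores every neighbor independently of the others, which is the property the distributed version relies on.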
Bunch’s MVC Architecture and Algorithms Support Distribution
[Figure labels: Bunch Server, Bunch Client, MDG & Partitioned MDG, Bunch User Interface, View, Model, Queue, Bunch Coordinator, Bunch Clustering Algorithms, Controller]
• Clustering Activity Messages: Producer/Consumer Pattern
• Status & Management Messages: Publish/Subscribe Pattern
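The two messaging patterns named on this slide differ in who receives a message. The Python sketch below contrasts them; it is only an illustration, and `StatusBus` and `consume_all` are hypothetical names rather than Bunch classes. With publish/subscribe, every subscriber sees each status message. With producer/consumer, each work item is taken from the queue by exactly one consumer.

```python
import queue
from collections import defaultdict

class StatusBus:
    """Publish/subscribe: every subscriber to a topic sees every message."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Copy the list so a callback may subscribe without disturbing delivery.
        for callback in list(self._subscribers.get(topic, [])):
            callback(message)

def consume_all(work_queue, consumer_name, log):
    """Producer/consumer: each queued item is taken by exactly one consumer."""
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return
        log.append((consumer_name, item))
```

The split fits the slide's description: clustering work should be processed once, while status and management messages are of interest to every listener.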
Bunch Distributed Hill Climbing Clustering Algorithm
[Figure: the Clustering Manager and the Neighboring Servers exchange messages through a Message Manager. Messages: Init_Neighbor, Init_Iteration, Get_Work, Put_Result. Queues: Work Queue, Outbound Queue, Inbound Queue. Flowchart steps: move to the next neighbor, measure MQ, compare to the best neighboring partition, keep the better one as the best neighboring partition for the iteration, and continue until convergence.]
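The manager and server roles in this diagram can be approximated on one machine with threads and queues. The Python sketch below is a simplification under stated assumptions: threads stand in for remote neighboring servers, `queue.Queue` stands in for the work and inbound queues, `mq` is a stand-in score because the slides do not define MQ, and all function names are hypothetical. It does not model RMI, the outbound queue, or the Init_Neighbor and Init_Iteration messages.

```python
import queue
import threading

def mq(edges, partition):
    """Stand-in quality score (an assumption, not Bunch's MQ definition)."""
    intra, inter = {}, {}
    for src, dst in edges:
        cs, cd = partition[src], partition[dst]
        if cs == cd:
            intra[cs] = intra.get(cs, 0) + 1
        else:
            inter[cs] = inter.get(cs, 0) + 1
            inter[cd] = inter.get(cd, 0) + 1
    return sum(mu / (mu + 0.5 * inter.get(c, 0)) for c, mu in intra.items())

def neighboring_server(edges, work_queue, inbound_queue):
    """Get work, measure the score of one neighbor, put the result back."""
    while True:
        work = work_queue.get()
        if work is None:  # shutdown signal from the manager
            return
        partition, module, target = work
        moved = dict(partition)  # never mutate the shared partition
        moved[module] = target
        inbound_queue.put((mq(edges, moved), module, target))

def distributed_hill_climb(edges, partition, n_servers=3):
    """Manager loop: farm out all single-module moves, keep the best one."""
    work_queue, inbound_queue = queue.Queue(), queue.Queue()
    servers = [
        threading.Thread(target=neighboring_server,
                         args=(edges, work_queue, inbound_queue))
        for _ in range(n_servers)
    ]
    for server in servers:
        server.start()
    cluster_ids = range(len(partition))
    current, current_mq = dict(partition), mq(edges, partition)
    try:
        while True:
            moves = [(m, t) for m, c in current.items()
                     for t in cluster_ids if t != c]
            if not moves:
                return current, current_mq
            for module, target in moves:
                work_queue.put((current, module, target))
            results = [inbound_queue.get() for _ in moves]
            best_mq, module, target = max(results)
            if best_mq <= current_mq:  # convergence
                return current, current_mq
            current = dict(current)
            current[module] = target
            current_mq = best_mq
    finally:
        for _ in servers:
            work_queue.put(None)
        for server in servers:
            server.join()
```

The manager waits for all results of an iteration before choosing the best neighbor, so the outcome does not depend on which server finishes first.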
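Case Study Results
[Figure: bar chart of speedup with 12 CPUs (y-axis, scale 1 to 6) by system name and number of modules (x-axis). Systems shown: Compiler, ISpell, Bison, Grappa, Incl, Perspectives, Proprietary Compiler. Module counts shown: 13, 24, 37, 86, 174, 392, 939. The bar values are not recoverable from this text.]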
Concluding Remarks
• Distribution approach based on:
  • Optimization of clustering approach
  • Bunch’s MVC Architecture
• Performance improved for large systems; further improvement still possible
• Future improvement based on additional implementation optimizations
• Bunch written in 100% Java; DBunch uses RMI/IIOP infrastructure
Visit Bunch Online: http://serg.mcs.drexel.edu/bunch