
The Small World of Software Reverse Engineering

Ahmed E. Hassan and Richard C. Holt, SoftWare Architecture Group (SWAG), University of Waterloo.


Presentation Transcript


  1. The Small World of Software Reverse Engineering
     Ahmed E. Hassan and Richard C. Holt
     SoftWare Architecture Group (SWAG), University of Waterloo

  2. Publications
     • We study the evolution of a field through its publications
     • Publications give a picture of:
       • Collaboration: high degree in academia, in contrast to industry
       • Emergence of topics: hot topics and their effects on collaborations

  3. DBLP
     • DBLP stands for:
       • DataBase systems and Logic Programming
       • Digital Bibliography and Library Project
     • Tracks publications in several conferences and communities, such as:
       • WCRE
       • Reengineering and maintenance
       • Software engineering
     • Records for each publication:
       • Title
       • Authors
       • Conference: name and year
       • Abstract
     • Data available online as an XML file
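As a rough illustration of working with such an XML file, the sketch below pulls title, authors, venue and year out of a small inline sample. The element names (`inproceedings`, `author`, `title`, `booktitle`, `year`) follow the public DBLP dump as commonly described; the sample records and the `load_publications` helper are hypothetical. The full dump is large and relies on character entities, so a real loader would need more care than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of the DBLP XML dump; all names are placeholders.
SAMPLE = """<dblp>
  <inproceedings key="conf/wcre/Example01">
    <author>Author A</author>
    <author>Author B</author>
    <title>An Example Title.</title>
    <booktitle>WCRE</booktitle>
    <year>2001</year>
  </inproceedings>
  <inproceedings key="conf/wcre/Example02">
    <author>Author B</author>
    <author>Author C</author>
    <title>Another Example Title.</title>
    <booktitle>WCRE</booktitle>
    <year>2002</year>
  </inproceedings>
  <inproceedings key="conf/icse/Example03">
    <author>Author D</author>
    <title>Out of Scope.</title>
    <booktitle>ICSE</booktitle>
    <year>2002</year>
  </inproceedings>
</dblp>"""


def load_publications(xml_text, venues):
    """Return one dict per conference paper whose venue is in `venues`."""
    root = ET.fromstring(xml_text)
    pubs = []
    for rec in root.iter("inproceedings"):
        venue = rec.findtext("booktitle", default="")
        if venue not in venues:
            continue  # keep only the conferences under study
        pubs.append({
            "title": rec.findtext("title", default=""),
            "authors": [a.text for a in rec.findall("author")],
            "venue": venue,
            "year": int(rec.findtext("year")),
        })
    return pubs


pubs = load_publications(SAMPLE, {"WCRE"})
```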

  4. Studying Collaboration
     • Develop a social collaboration network using co-authorship data from DBLP:
       • A node exists for each author
       • An edge exists between two nodes (authors) if they co-authored a paper
       • Size of a node is proportional to its author's number of publications
       • Edges have a weight proportional to the number of co-authored papers
     • Use a force-based algorithm to lay out the network
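The network construction described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: `build_coauthor_network` and the single-letter author names are made up, and the force-based layout step is left out.

```python
from collections import Counter
from itertools import combinations


def build_coauthor_network(papers):
    """papers: iterable of author-name lists, one list per paper."""
    node_size = Counter()    # author -> number of publications
    edge_weight = Counter()  # (author_a, author_b), sorted -> co-authored papers
    for authors in papers:
        unique = sorted(set(authors))  # ignore duplicate names within one paper
        node_size.update(unique)
        edge_weight.update(combinations(unique, 2))  # one edge per author pair
    return node_size, edge_weight


# Hypothetical input: three papers, four authors.
papers = [["A", "B"], ["A", "B", "C"], ["D"]]
nodes, edges = build_coauthor_network(papers)
```

Sorting the authors before pairing them keeps each undirected edge under a single key, so `("A", "B")` and `("B", "A")` are never counted separately.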

  5. Co-Authorship Graph

  6. Co-Authorship Graph
     Legend: 1: E. Burd, 2: E. Stroulia, 3: M. Munro, 4: M. Harman, 5: E. Merlo, 6: G. Canfora, 7: K. Kontogiannis, 8: R. Koschke, 9: R. Holt

  7. The Largest Component over Time

  8. Small World Graphs
     • Large graphs with short paths connecting their nodes
     • Stanley Milgram studied them in the 60s:
       • Letters were given to people in Nebraska
       • Each person hands the letter to someone they know and who they believe can eventually deliver it to a stockbroker in Boston
       • Average chain of people between both cities is 6: six degrees of separation
     • Collaboration networks which are small world graphs:
       • Good indicator of ease of communication of knowledge between members of a community

  9. Small World Graph
     • Characteristic Path Length (L): measures, on average, how many individuals an author has to go through to reach other authors
       • The average shortest-path length between any two nodes in the largest component of the graph
     • Clustering Coefficient (C): measures how collaborative, on average, the co-authors of an author are
       • For a node, C is the ratio of the number of edges among that node's neighbors to the maximum possible number of edges among those neighbors
     • Watts and Strogatz give a formal definition of small world graphs using C, L, and random graphs: $L \gtrsim L_{random}$ and $C \gg C_{random}$
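The two measures above can be computed directly from an adjacency map. The sketch below is one plausible reading of the definitions, under two assumptions not stated on the slide: nodes with fewer than two neighbors contribute 0 to C, and L is taken over a connected graph (in practice, the largest component). The function names and the toy graph are illustrative.

```python
from collections import deque


def clustering_coefficient(adj):
    """Mean over nodes of (edges among neighbors) / (max possible edges among them)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # assumption: such nodes count as 0
        nbr_list = list(nbrs)
        links = sum(
            1
            for i, u in enumerate(nbr_list)
            for v in nbr_list[i + 1:]
            if v in adj[u]
        )
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all pairs; assumes a connected graph."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy graph: a triangle A-B-C with D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
C = clustering_coefficient(adj)
L = characteristic_path_length(adj)
```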

  10. WCRE is a small world graph!
      • Clustering coefficient is 0.76
      • Characteristic path length is 4.3
      • Author centrality:
        • Canfora: 2.76
        • Koschke: 2.88
        • Merlo: 2.94
        • De Lucia: 3.1
        • Holt: 3.2
      • "Towards a Standard Schema for C/C++" by Ferenc, Sim, Holt, Koschke and Gyimothy: 3.94  4.32

  11. Paper Titles Analysis for Emerging Terms

  12. Bigger Small Worlds in SE
      • We compare results against two other research communities:
        • Maintenance and Reengineering (MR): WCRE, IWPC, CSMR, ICSM
        • Software Engineering: MR + 17 conferences
      • DBLP data is not as complete for these conferences

  13. The Largest Component over Time
      • Slow, constant growth, then rapid growth once researchers know each other
      • Soft Eng has slower growth:
        • Fewer conferences in early days
        • Incomplete DBLP data
        • Wider scope
      • MR and Soft Eng: rapid growth since 1996
        • Internet and email?
      • MR and WCRE growing since late 90s:
        • Y2K?
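One way to track the largest component as papers accumulate is a union-find structure updated year by year. The sketch below is an assumption about how such a curve could be produced, not the authors' method; `largest_component_by_year` and the sample papers are hypothetical, and it reports raw sizes because the slides do not say whether the plot shows sizes or percentages.

```python
def largest_component_by_year(papers):
    """papers: list of (year, [authors]).

    Returns {year: (largest component size, authors seen so far)},
    measured cumulatively at the end of each year.
    """
    parent, size = {}, {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra  # attach the smaller tree under the larger one
        size[ra] += size[rb]

    result, largest = {}, 0
    for year, authors in sorted(papers, key=lambda p: p[0]):
        if not authors:
            continue
        for a in authors:
            if a not in parent:
                parent[a], size[a] = a, 1
        for a in authors[1:]:
            union(authors[0], a)  # co-authors end up in one component
        largest = max(largest, size[find(authors[0])])
        result[year] = (largest, len(parent))
    return result


# Hypothetical data: two separate pairs merge in 1996.
growth = largest_component_by_year([
    (1995, ["A", "B"]),
    (1995, ["C", "D"]),
    (1996, ["B", "C"]),
    (1997, ["E"]),
])
```

In this toy input a single bridging paper doubles the largest component, which mirrors the slide's point that growth accelerates once previously separate groups of researchers start to collaborate.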

  14. Most Central Authors over Time

  15. WCRE, MR, SE vs. other fields

  16. Title Spectrograph (joint work with J. Wu)
      [Figure: spectrograph of stemmed paper-title terms by year. Terms: Java, Reverse, object, Compon, orient, software, program, design, system, experi, engin, data, abstract. Years: 1976 to 2003.]
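The data behind a chart like this is a count of title terms per year. The sketch below approximates it with plain prefix matching, which is only a stand-in for a real stemmer; the slides do not say how the terms were extracted. The helper name `term_counts_by_year` and the sample titles are hypothetical.

```python
import re
from collections import Counter, defaultdict


def term_counts_by_year(titles, stems):
    """titles: list of (year, title); stems: lowercase word prefixes.

    For each year, count the titles containing a word that starts with each stem.
    """
    counts = defaultdict(Counter)
    for year, title in titles:
        words = set(re.findall(r"[a-z]+", title.lower()))
        for stem in stems:
            if any(w.startswith(stem) for w in words):
                counts[year][stem] += 1  # at most one hit per title
    return counts


# Hypothetical titles, chosen only to exercise the stems.
titles = [
    (1998, "Reverse Engineering of Object-Oriented Software"),
    (1998, "Component Design"),
    (2001, "Engineering Java Components"),
]
counts = term_counts_by_year(titles, ["revers", "object", "compon", "engin", "java"])
```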

  17. Conclusion
      • A meta paper on publications and collaboration networks in the WCRE, MR and SE communities
      • Small world collaboration networks facilitate the exchange of ideas and results in a community
      • Many of the techniques presented could be used to study the evolution of software systems (files or developers as nodes)

  18. Generating Small World Graphs Using Random Re-wiring
      [Figure: three graphs, labeled "Large L, Large C", "Small L, Large C", and "Small L, Small C".]
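Random re-wiring in the style of Watts and Strogatz can be sketched as follows: start from a ring lattice in which every node links to its `k` nearest neighbors, then move one end of each edge to a random node with probability `p`. This is a simplified illustration assuming `k` is even and `n > k`; the function name and parameters are hypothetical.

```python
import random


def rewired_ring(n, k, p, seed=0):
    """Ring lattice of n nodes (k nearest neighbors each), rewired with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                # candidates exclude self-loops and existing neighbors
                choices = [v for v in range(n) if v != i and v not in adj[i]]
                if not choices:
                    continue
                new = rng.choice(choices)
                adj[i].discard(old)
                adj[old].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj


regular = rewired_ring(20, 4, 0.0)      # p = 0: the untouched lattice
small_world = rewired_ring(20, 4, 0.1)  # a few shortcuts
randomized = rewired_ring(20, 4, 1.0)   # every edge considered for rewiring
```

Each rewiring removes one edge and adds one, so the edge count stays fixed across all values of `p`; only the mix of local links and long-range shortcuts changes.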

  19. Percentage of Papers in a year

  20. # of new co-authors in a year
