Paper: “Impact of Software Engineering Research on the Practice of Software Configuration Management”
Authors: Estublier, Leblang, Hoek, Conradi, Clemm, Tichy, Wiborg-Weber
Citation: ACM TOSEM, Oct 2005
The Impact Project
• Provide scientific scholarly answers to:
  – What impact has academic and industry research really had on the practice of software engineering?
  – What future impacts should be expected?
  – What future directions will software research take?
• How?
  – ACM SIGSOFT project (international)
  – NSF and SIGSOFT funding
  – EU, Japanese, private funding
  – Deliverables: journal articles, conference panels, 2000-2003
Initial Subject Areas
• Reviews/Walkthroughs – Dieter Rombach/Dewayne Perry
• Configuration Management – Jacky Estublier
• Testing and Analysis – Lori Clarke/David Rosenblum
• Middleware – Wolfgang Emmerich
• Process/workflow/lifecycle models – Volker Gruhn
• Modern Programming Languages – Mary Lou Soffa/Barbara Ryder
• Requirements Engineering – Anthony Finkelstein/Axel van Lamsweerde
• Reverse Engineering – Hausi Muller
• Cost/Economic Models
How do they define Impact?
The research must have been:
• Published (publicly available), AND
• Incorporated in actual SCM products that are (or were) on the market, commercially or free.
Other impacts not considered:
– people (graduates)
– workshops and conferences
Software Configuration Management is…
The discipline of managing change in large, complex software systems.
Goals: manage and control corrections, extensions, and adaptations throughout the lifetime of a software system
– Systematic and traceable software development process
– Managing files and directories
In the beginning…
• 1950s
  – Aerospace industry
  – Colored punch cards
• 1960s
  – Integrated within OS
• 1970s
  – Separate discipline
SCM Spectrum of Functionality
• Components (keep track): Versions, Configurations, Baselines, Project contexts
• Structure (how related): System model, Interfaces, Consistency, Selection
• Construction (create exec): Building, Snapshots, Regeneration, Optimization
• Controlling (track change): Access control, Change requests, Bug tracking, Partitioning
• Accounting (gather stats): Statistics, Status, Reports
• Auditing (archive/rollback): History, Traceability, Logging
• Process (choose tasks): Lifecycle support, Task mgmt., Communication, Documentation
• Team (conflicts): Workspaces, Merging, Families
Source: Susan Dart, SCM-3, 1991
These areas remain universally applicable: independent of programming language and application.
Partition of SCM Approaches
Product
– Versioning
– System models and selection: support aggregate artifacts (the configuration concept)
Tool
– Workspace control: distributed users? integration of change
– Building: executable
Process
– Support for general development processes to manipulate artifacts
Approaches to Versioning
• Capture artifacts as configuration items
• Track relations among items in a version graph
Edges:
• revision-of – sequential development
• variant-of – parallel development
• merge – combines variants
[Figure: version graph with nodes 1.0, 1.1, 1.2, 1.2.1.0, 1.2.1.1, 2.0, 2.1]
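A minimal sketch of such a version graph, using the version numbers from the slide. The class name `VersionGraph` and the exact parent of each node are assumptions made for illustration; the slide's figure does not survive, so the edges below are one plausible reading, not the authors' diagram.

```python
from dataclasses import dataclass, field


@dataclass
class VersionGraph:
    # edges[version] = list of (parent, kind);
    # kind is "revision-of", "variant-of", or "merge"
    edges: dict = field(default_factory=dict)

    def add(self, version, parent=None, kind="revision-of"):
        """Record a version, optionally linked to the version it came from."""
        self.edges.setdefault(version, [])
        if parent is not None:
            self.edges[version].append((parent, kind))

    def merge(self, version, *parents):
        """Record a version produced by merging several variants."""
        self.edges[version] = [(p, "merge") for p in parents]

    def ancestors(self, version):
        """Return every version reachable by following edges backwards."""
        seen, stack = set(), [version]
        while stack:
            for parent, _kind in self.edges.get(stack.pop(), []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical edge layout (assumed, since the figure is lost)
g = VersionGraph()
g.add("1.0")
g.add("1.1", "1.0")                          # sequential development
g.add("1.2", "1.1")
g.add("1.2.1.0", "1.2", kind="variant-of")   # parallel development
g.add("1.2.1.1", "1.2.1.0")
g.add("2.0", "1.2")
g.merge("2.1", "2.0", "1.2.1.1")             # merge of two variants
```

Calling `g.ancestors("2.1")` walks both merged lines of development back to `1.0`, which is the traceability that the graph is meant to provide.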
When disk space was scarce:
• Delta storage (e.g., SCCS, RCS): baseline + deltas
• Data compression
• Combination of delta and compression
• More accuracy in deltas:
  – Context-oriented
  – Operation-oriented
  – Semantics-oriented
  – Syntax-oriented
• But for generality -> classic line-based merging
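The "baseline + deltas" idea can be sketched with Python's standard `difflib`. This is a simplified forward-delta scheme written for illustration only; it is not the actual storage format of SCCS or RCS, and the names `make_delta`, `apply_delta`, and `DeltaStore` are hypothetical.

```python
import difflib


def make_delta(old, new):
    """Line-based delta: only the non-equal opcodes needed to turn old into new."""
    sm = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [(tag, i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the newer version from the older lines plus a delta."""
    out, pos = [], 0
    for _tag, i1, i2, lines in delta:
        out.extend(old[pos:i1])  # copy unchanged lines before the edit
        out.extend(lines)        # inserted or replacement lines (empty for a delete)
        pos = i2                 # skip the lines that were deleted or replaced
    out.extend(old[pos:])
    return out


class DeltaStore:
    """Stores one full baseline and a chain of deltas (assumed design)."""

    def __init__(self, baseline):
        self.baseline = list(baseline)
        self.deltas = []

    def commit(self, new_lines):
        latest = self.checkout(len(self.deltas))
        self.deltas.append(make_delta(latest, list(new_lines)))

    def checkout(self, n):
        """Version 0 is the baseline; version n applies the first n deltas."""
        lines = self.baseline
        for delta in self.deltas[:n]:
            lines = apply_delta(lines, delta)
        return list(lines)


store = DeltaStore(["a", "b", "c"])
store.commit(["a", "B", "c", "d"])
store.commit(["B", "c", "d"])
```

Only the changed lines are kept per version, so any revision can be regenerated on demand at the cost of replaying deltas.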
Advanced Versioning – Change sets
• How it works:
  – Each change is stored as a delta, independently from other changes
  – Allows more flexibility
  – Can combine changes as desired
• Not used in practice:
  – Deltas overlap/conflict, so some combinations do not work
  – For binary objects, some deltas cannot be combined
  – Too unwieldy for large projects with large changes
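A small sketch of why free combination of change sets breaks down. Here each change set is assumed to be a mapping from baseline line index to replacement text, which is a deliberate simplification; `compose` and `ChangeSetConflict` are hypothetical names.

```python
class ChangeSetConflict(Exception):
    """Raised when two selected change sets touch the same baseline line."""


def compose(baseline, change_sets):
    """Apply the chosen change sets to the baseline, in the order given.

    change_sets: list of (name, {line_index: new_text}) pairs.
    """
    result = list(baseline)
    touched = {}  # line index -> name of the change set that modified it
    for name, edits in change_sets:
        for idx, text in edits.items():
            if idx in touched:
                raise ChangeSetConflict(
                    f"{name} overlaps {touched[idx]} at line {idx}")
            touched[idx] = name
            result[idx] = text
    return result


baseline = ["open()", "read()", "close()"]
bugfix = ("bugfix", {1: "read(timeout=5)"})
feature = ("feature", {2: "close(flush=True)"})
other_fix = ("other_fix", {1: "read_all()"})

combined = compose(baseline, [bugfix, feature])  # disjoint edits combine cleanly
```

Combining `bugfix` with `other_fix` raises `ChangeSetConflict`, which mirrors the slide's point that some combinations of deltas simply do not work.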
Alternative: Change Packages
Task, Activity, Package, Subproject,…
[Figure: version trees of several artifacts, with versions ranging from 1.0 to 2.3]
– track changes at logical level
Aggregating and Accessing Multiple Artifacts
• Data Models
  – Early 1970s – SCCS and RCS = file system
  – Since then – on top of commercial database systems
• Research systems:
  – Adele, 1985: active, object-oriented, versioned model
    - Object = any entity
    - Attribute = primitive, compound, predefined
    - Relations = model associations like derivation, dependency, composition
• More advanced commercial system: Aide-De-Camp, 1989
Aggregating and Accessing Multiple Artifacts
• System Models – late 1970s onward
  – MIL (module interconnection language) to describe system structure
  – Model interfaces – provided and required functions
  – Behaviors – pre- and postconditions
  – Hierarchical construction of modules
  – System architecture; UML
• Integrate into SCM – users can manage the real organization of the software
• Major problem – keeping the evolution of model and implementation versions in sync
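The provided/required view of a system model can be illustrated with a tiny consistency check. The dictionary layout and the function name `unresolved` are assumptions for this sketch and do not reproduce the syntax of any actual MIL.

```python
def unresolved(modules):
    """Return, per module, the required functions that no module provides.

    modules: {name: {"provides": [...], "requires": [...]}}
    """
    provided = {f for m in modules.values() for f in m["provides"]}
    missing = {}
    for name, m in modules.items():
        gap = sorted(set(m["requires"]) - provided)
        if gap:
            missing[name] = gap
    return missing


# Hypothetical system model
system = {
    "parser": {"provides": ["parse"], "requires": ["tokenize"]},
    "lexer": {"provides": ["tokenize"], "requires": []},
    "report": {"provides": ["render"], "requires": ["parse", "format_date"]},
}

gaps = unresolved(system)
```

A check like this only stays meaningful if the model is updated together with the code, which is the synchronization problem the slide names.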
Aggregating and Accessing Multiple Artifacts
• Selection
  – How do I get a set of artifacts into my workspace without requesting them individually?
  – Default: all latest versions in the workspace. Fetch the rest individually.
• Other approaches:
  – Hierarchical workspaces, 1989: local, parent,…
  – General queries, 1984:
    - ((status = approved) AND (owner = Jacky)) OR (date > 6.20.83)
  – Leverage change sets, 2000:
    - Baseline 2.5 + bug-fix2.83 + bugfix.2 + feature-12
  – Rule-based, 88, 94:
    - First, my checked-out versions
    - Otherwise, the latest versions on my branch
    - Otherwise, the latest versions on the main branch
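The three rule-based steps above can be sketched as an ordered fallback. The data layout, the function name `select`, and the branch name `"main"` are assumptions for illustration.

```python
def select(artifact_versions, checked_out, my_branch, main="main"):
    """Pick one version per artifact using the rule order from the slide.

    artifact_versions: {artifact: {branch: [versions, oldest to newest]}}
    checked_out:       {artifact: version currently checked out by me}
    """
    selection = {}
    for artifact, branches in artifact_versions.items():
        if artifact in checked_out:
            # Rule 1: my checked-out version wins
            selection[artifact] = checked_out[artifact]
        elif branches.get(my_branch):
            # Rule 2: otherwise the latest version on my branch
            selection[artifact] = branches[my_branch][-1]
        elif branches.get(main):
            # Rule 3: otherwise the latest version on the main branch
            selection[artifact] = branches[main][-1]
    return selection


# Hypothetical repository contents
repo = {
    "a.c": {"main": ["1.0", "1.1"], "jacky": ["1.1.1.0"]},
    "b.c": {"main": ["1.0", "1.1", "1.2"]},
    "c.c": {"main": ["1.0"], "jacky": ["1.0.1.0", "1.0.1.1"]},
}

workspace = select(repo, checked_out={"a.c": "1.1.1.0-wip"}, my_branch="jacky")
```

One call populates the whole workspace, so no artifact has to be requested individually.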
A Typical Development Scenario
[Figure: Pete’s workspace and Ellen’s workspace, each holding copies of artifacts (A to E) from the shared CM repository; artifact C appears in both]
Workspace Control
3 functions of the workspace:
• Sandbox – freely edit; locks may apply.
• Building – expand compressed files, keep compiled/derived objects
• Isolation – allow a developer to make changes, compile, test, and debug without interference
Workspace Control
• Classic SCCS and RCS systems – no workspace management
• CVS – at first, scripts on top of RCS
• Need: avoid source file copies in 100s of workspaces
• Sun/Forte Teamware – manage projects of subprojects
• Virtual workspaces – copy only the files being edited
• ClearCase – avoid recompiling sources on builds
Building
• Make, 1979 – dependencies, date-based rebuild, fast.
• Improvement – rebuild only if the source versions now in the workspace are not exactly the same as in the last build.
• BOMs – a bill of materials for each target object built
• Language-based smart rebuild: semantic changes and dependencies
• Winking-in (ClearCase) – language independent, reuse binaries across workspaces
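The difference between date-based rebuilding and the version-based improvement can be sketched as two predicates. This is a simplified illustration, not how Make or ClearCase are implemented; the function names are hypothetical, and a content hash stands in for a version identifier.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stand-in for a version identifier: a hash of the source content."""
    return hashlib.sha256(content).hexdigest()


def needs_rebuild_by_date(target_mtime, source_mtimes):
    """Make-style rule: rebuild if the target is missing or any source is newer."""
    return target_mtime is None or any(m > target_mtime for m in source_mtimes)


def make_bom(sources: dict) -> dict:
    """Bill of materials: which exact source versions went into a target."""
    return {name: fingerprint(data) for name, data in sources.items()}


def needs_rebuild_by_bom(last_bom, sources: dict) -> bool:
    """Rebuild unless the workspace holds exactly the versions used last time."""
    return last_bom is None or last_bom != make_bom(sources)


# Build once, then switch the workspace back to an OLDER version of util.c.
built_from = {"main.c": b"int main(){return f();}", "util.c": b"int f(){return 2;}"}
bom = make_bom(built_from)
older = {"main.c": b"int main(){return f();}", "util.c": b"int f(){return 1;}"}

# The older file carries an older timestamp (10 < 100), so the date rule sees no change.
stale_by_date = needs_rebuild_by_date(target_mtime=100, source_mtimes=[50, 10])
stale_by_bom = needs_rebuild_by_bom(bom, older)
```

The date rule misses the switch to an older version, while the BOM comparison catches it; a matching BOM is also what would make it safe to reuse a binary built in another workspace.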
Process Support
Software process = sequence of activities during creation and evolution
• Change control:
  – Change request (requirement change)
  – Trouble report (malfunction issue)
Successful Transitions
• SCCS, Make, RCS – immediate, long-lasting impact
• Change sets – slow transition; impractical as proposed, but now a standard feature in the form of change packages
• Process support – advanced support for modeling and enforcing process
• Differencing/merging – binary deltas, not semantic-based
• Distributed/remote development – client-server protocol, web-based interfaces
Failed Transitions
• Semantic-based recompilation – language dependent
• Advanced system models – more power than needed
• Generic platform – research has focused on managing source code only; too much is needed to support extra artifacts.
Summary
• High-impact research – useful, easy for developers to use, general
• Low-impact research – level of complexity too high; idea not easy to master as a feature
What is Next for SCM?
• How to fit SCM with the rest of the development process/tools?
• Manage other artifacts beyond source code
• May not be language independent
Recognizing a Valuable Resource: Mining Software Repositories
• Configuration management repositories are traditionally a “depot”
  – occasional roll-back
  – occasional search for relevant information
• But what if we used the information captured by configuration management repositories to our advantage?
  – understanding software developers
  – helping software developers