Samad Paydar Web Technology Lab. Ferdowsi University of Mashhad 10th August 2011.
Samad Paydar Web Technology Lab. Ferdowsi University of Mashhad 10th August 2011 This is a review of the paper: Semantic Web Enabled Software Analysis. Jonas Tappolet, Christoph Kiefer, Abraham Bernstein. Dynamic and Distributed Information Systems, University of Zurich, Switzerland. Journal of Web Semantics, 2010
Outline • Introduction • Software ontology models • Semantic web query methods for software analysis • Experimental evaluation • Conclusion
Introduction • Software can only be developed, maintained and evolved if it is understood • How the code works • Developers’ decisions • Some reasons • Development team changes • Programmers forget what they have done • Undocumented code • Outdated comments • Multiple versions
Introduction • Therefore a code comprehension framework is needed • Mainly composed of two major steps • Converting source code to an internal representation • Performing queries on that representation
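The two steps above can be sketched in a few lines. This is only an illustration under stated assumptions: it parses Python source with the standard `ast` module (the reviewed paper targets Java and OWL), stores facts as plain tuples instead of RDF, and uses the hypothetical predicate names `hasMethod` and `invokes`.

```python
import ast

# Hypothetical predicate names, chosen for illustration only.
HAS_METHOD = "hasMethod"
INVOKES = "invokes"

def to_triples(source):
    """Step 1: convert source code into (subject, predicate, object) facts."""
    triples = set()
    for cls in ast.parse(source).body:
        if not isinstance(cls, ast.ClassDef):
            continue
        for fn in cls.body:
            if not isinstance(fn, ast.FunctionDef):
                continue
            method = f"{cls.name}.{fn.name}"
            triples.add((cls.name, HAS_METHOD, method))
            # Record calls of the form self.other(...) as invocations.
            for node in ast.walk(fn):
                if (isinstance(node, ast.Call)
                        and isinstance(node.func, ast.Attribute)
                        and isinstance(node.func.value, ast.Name)
                        and node.func.value.id == "self"):
                    triples.add((method, INVOKES, f"{cls.name}.{node.func.attr}"))
    return triples

def query(triples, s=None, p=None, o=None):
    """Step 2: match a triple pattern; None acts as a wildcard."""
    return sorted(t for t in triples
                  if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2]))

SOURCE = '''
class Cart:
    def total(self):
        return self.subtotal() + self.tax()
    def subtotal(self):
        return 10
    def tax(self):
        return 1
'''

facts = to_triples(SOURCE)
# Which methods invoke Cart.tax?
callers_of_tax = [s for s, _, _ in query(facts, p=INVOKES, o="Cart.tax")]
```

Keeping the representation as subject-predicate-object facts means every later analysis is just another pattern over the same data, which is the property the RDF-based approach relies on.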
Introduction • Further • Open source movement • Software complexity • Libraries that depend on other libraries • Software that is developed locally is a node in a world-wide network of interlinked source code • Global Call Graph
Introduction • Each node in this cloud should exhibit its information in an open, accessible and uniquely identifiable way • Therefore “we propose the usage of semantic technologies such as OWL, RDF and SPARQL as a software comprehension framework with the abilities to be interlinked with other projects”
Software ontology model • Three models for different aspects of code • Software Ontology Model (SOM) • Bug Ontology Model (BOM) • Version Ontology Model (VOM) • Connected to related ontologies • DOAP • SIOC • FOAF • WF
SOM: Software Ontology Model • Based on FAMIX (FAMOOS Information Exchange Model) • A programming language independent model for representing object-oriented source code
VOM: Version Ontology Model • For specifying the relations between files, releases, and revisions of software projects • Based on the data model of Subversion
BOM: Bug Ontology Model • Based on the bug-tracking system Bugzilla
Query Methods • Two non-standard extensions of SPARQL • iSPARQL (Imprecise SPARQL) • SPARQL-ML (SPARQL Machine learning)
iSPARQL • Introduces the idea of “virtual triples” • Are not matched against the underlying ontology graph, but used to configure similarity joins • Which pairs of variables should be joined and compared using a certain type of similarity measure
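A similarity join of this kind can be sketched in plain Python. This is not iSPARQL syntax: the function names, the sample data, and the choice of Jaccard similarity over method-name sets are all assumptions made for illustration, standing in for whichever measure a virtual triple would select.

```python
from itertools import product

def jaccard(a, b):
    """Set-overlap similarity in [0, 1]; stands in for any similarity measure."""
    return len(a & b) / len(a | b) if a | b else 1.0

def similarity_join(left, right, measure, threshold):
    """Pair every left item with every right item and keep the pairs whose
    similarity reaches the threshold, highest score first."""
    pairs = []
    for (l_name, l_val), (r_name, r_val) in product(left.items(), right.items()):
        score = measure(l_val, r_val)
        if score >= threshold:
            pairs.append((l_name, r_name, round(score, 2)))
    return sorted(pairs, key=lambda p: -p[2])

# Hypothetical data: class name -> method names, for two releases.
release_a = {
    "Parser": {"parse", "visit", "reset"},
    "Cache": {"get", "put"},
}
release_b = {
    "Parser": {"parse", "visit", "reset", "report"},
    "Cache": {"get", "put", "evict"},
    "Lexer": {"next", "peek"},
}

matches = similarity_join(release_a, release_b, jaccard, threshold=0.5)
```

The measure and the threshold are parameters of the join rather than part of the data, which mirrors the role the slide gives to virtual triples: they configure the comparison instead of being matched against the graph.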
SPARQL-ML • An extension of SPARQL with knowledge discovery capabilities • A tool for efficient relational data mining on Semantic Web data • Enables Statistical Relational Learning (SRL) methods such as Relational Probability Trees (RPTs) and Relational Bayesian Classifiers (RBCs)
SPARQL-ML • Learning phase (building prediction model)
SPARQL-ML • Test phase (making prediction)
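The split into a learning phase and a test phase can be illustrated with a deliberately small stand-in. The sketch below is a plain naive Bayes classifier over flat, categorical features, not an RPT or RBC and not SPARQL-ML syntax; the feature names, labels and training rows are hypothetical.

```python
from collections import Counter, defaultdict

def learn(examples):
    """Learning phase: count label and (feature, value, label) frequencies."""
    labels = Counter(label for _, label in examples)
    counts = defaultdict(Counter)   # (feature, label) -> Counter of values
    values = defaultdict(set)       # feature -> set of values seen
    for features, label in examples:
        for name, value in features.items():
            counts[(name, label)][value] += 1
            values[name].add(value)
    return labels, counts, values

def predict(model, features):
    """Test phase: return the label with the highest naive Bayes score."""
    labels, counts, values = model
    total = sum(labels.values())
    scores = {}
    for label, n in labels.items():
        score = n / total
        for name, value in features.items():
            # Laplace smoothing so unseen values do not zero out the score.
            score *= (counts[(name, label)][value] + 1) / (n + len(values[name]))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data: one row per file.
training = [
    ({"revisions": "many", "past_bugs": "yes"}, "buggy"),
    ({"revisions": "many", "past_bugs": "no"}, "buggy"),
    ({"revisions": "few", "past_bugs": "no"}, "clean"),
    ({"revisions": "few", "past_bugs": "no"}, "clean"),
    ({"revisions": "few", "past_bugs": "yes"}, "buggy"),
]

model = learn(training)                                           # learning phase
guess = predict(model, {"revisions": "many", "past_bugs": "yes"})  # test phase
```

The model returned by `learn` is an ordinary value that `predict` consumes later, which is the same separation the two slides describe: build the prediction model once, then apply it to unseen data.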
Experimental Evaluation • 4 years (2004-2007) of the proceedings of ICSE Workshop on Mining Software Repositories (MSR) are surveyed • Most actively investigated software analysis tasks are determined
Experimental Evaluation • Dataset: 206 releases of the org.eclipse.compare plug-in for Eclipse (average of about 150 Java classes per version) + bug tracking information • Exported to OWL
Experimental Evaluation • Task 1: software evolution analysis • Applicability of iSPARQL to software evolution visualization (i.e. visualization of code changes for a certain time span) • Compared all the classes of one major release with those of another major release using different similarity strategies
Experimental Evaluation • Task 2: computing source code metrics • Calculating OO software design metrics
Experimental Evaluation • Changing methods (CM) and changing classes (CC) • A method that is invoked by many other methods has a higher risk of causing defects in the presence of change
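As a rough sketch, both metrics can be computed from a list of invocation facts. It assumes the common definitions (CM: number of distinct methods that invoke the measured method; CC: number of distinct classes that declare those methods); the call graph and the `owner` mapping are hypothetical.

```python
def changing_metrics(invocations, owner, target):
    """Return (CM, CC) for `target`: the number of distinct methods that
    invoke it, and the number of distinct classes declaring those methods."""
    callers = {caller for caller, callee in invocations
               if callee == target and caller != target}
    return len(callers), len({owner[m] for m in callers})

# Hypothetical call graph: (caller, callee) pairs.
invocations = [
    ("A.run", "Util.log"),
    ("A.stop", "Util.log"),
    ("B.load", "Util.log"),
    ("B.load", "B.parse"),
    ("A.run", "Util.log"),   # a second call site; the caller is counted once
]
# Hypothetical mapping from each method to its declaring class.
owner = {"A.run": "A", "A.stop": "A", "B.load": "B",
         "B.parse": "B", "Util.log": "Util"}

cm, cc = changing_metrics(invocations, owner, "Util.log")
```

Here `Util.log` has three distinct callers spread over two classes, so a change to it would ripple further than a change to `B.parse`, which has a single caller.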
Experimental Evaluation • Number of methods (NOM) and number of attributes (NOA) • As indicators of God classes
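Counting members per class reduces to a grouped count over facts. In the sketch below the predicate names, the sample classes and both thresholds are assumptions; the slides do not give cut-off values for what counts as a God class.

```python
from collections import Counter

# Hypothetical thresholds; the slides do not state any cut-off values.
NOM_LIMIT = 3
NOA_LIMIT = 2

def count_members(triples, predicate):
    """Count, per class, the triples that use the given predicate."""
    return Counter(s for s, p, _ in triples if p == predicate)

# Hypothetical facts: (class, predicate, member).
triples = [
    ("Manager", "hasMethod", "load"),
    ("Manager", "hasMethod", "save"),
    ("Manager", "hasMethod", "render"),
    ("Manager", "hasMethod", "validate"),
    ("Manager", "hasAttribute", "db"),
    ("Manager", "hasAttribute", "view"),
    ("Manager", "hasAttribute", "rules"),
    ("Point", "hasMethod", "move"),
    ("Point", "hasMethod", "norm"),
    ("Point", "hasAttribute", "x"),
    ("Point", "hasAttribute", "y"),
]

nom = count_members(triples, "hasMethod")
noa = count_members(triples, "hasAttribute")
# A class is flagged only when both counts exceed their limits.
candidates = sorted(c for c in nom if nom[c] > NOM_LIMIT and noa[c] > NOA_LIMIT)
```

High counts are only an indicator: a flagged class is a candidate for inspection, not a confirmed design problem.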
Experimental Evaluation • Number of bugs (NOB) and number of revisions (NOR)
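NOB and NOR combine version and bug data, so a sketch needs a join between the two. The one below assumes that a bug is attributed to a file through the revisions that fix it; that linking rule, the file names and the identifiers are hypothetical.

```python
from collections import defaultdict

def nob_and_nor(revisions, fixes):
    """revisions: (file, revision) pairs; fixes: (bug, revision) pairs.
    Returns {file: (NOB, NOR)}."""
    revs_of = defaultdict(set)
    for file, rev in revisions:
        revs_of[file].add(rev)
    result = {}
    for file, revs in revs_of.items():
        # A bug counts once per file, however many revisions mention it.
        bugs = {bug for bug, rev in fixes if rev in revs}
        result[file] = (len(bugs), len(revs))
    return result

# Hypothetical version data and bug data.
revisions = [
    ("Diff.java", "r1"), ("Diff.java", "r2"), ("Diff.java", "r3"),
    ("Merge.java", "r2"),
]
fixes = [("bug-7", "r2"), ("bug-9", "r3"), ("bug-7", "r3")]

metrics = nob_and_nor(revisions, fixes)
```

The join key is the revision, which plays the role of the link between the version model and the bug model.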
Experimental Evaluation • Task 3: detection of code smells • Task 4: defect and evolution density • Task 5: bug prediction
Conclusion • A novel approach to analyzing software systems using Semantic Web technologies • EvoOnt provides the basis for representing source code and metadata in OWL • This representation reduces analysis tasks to simple queries in SPARQL (or its extensions) • A limitation: some information is lost because the ontology model is based on FAMIX
Conclusion • Language constructs like if-else are not modeled • Measurements cannot be conducted at the level of statements • One of the greatest impediments to widespread use of EvoOnt: the current lack of high-performance, industrial-strength triple stores and reasoning engines