Juergen Rilling
Tutorial: Application of Ontologies in Software Maintenance
Department of Computer Science and Software Engineering
Concordia University
Montréal, Québec, Canada
rilling@cse.concordia.ca
Tutorial objectives:
• Ontologies in Software Engineering
• Motivation for ontologies in Software Maintenance
• The use of ontologies in Software Maintenance
  • Source code analysis
  • Document analysis
  • Traceability
  • Process modeling
• Ongoing challenges
Tutorial: Application of Ontologies in Software Engineering
Ontologies in Software Engineering
• Replacement of XML and relational schemas
  • Verification and validation
  • Maintainability (extendibility) of ontologies
• Model Driven Architecture (MDA)
  • Currently no existing standard formalization of development process
  • Ontologies: standardized and formal
• Tool integration
  • Enabling of service sharing among tools
• However, a problem common to all of these approaches:
  • Limited use of reasoning to infer additional knowledge.
  • Ontologies are only treated as yet another exchange/information storage format.
Ontologies and Software Maintenance
• Software comprehension refers to activities that humans perform: understanding, conceptualizing and reasoning about software.
• Research in cognitive science suggests that mental models may take many forms, but the content normally constitutes an ontology [Johnson-Laird 1983].
[Johnson-Laird 1983] P. N. Johnson-Laird, “Mental Models: Towards a Cognitive Science of Language, Inference and Consciousness”. Harvard University Press, Cambridge, Mass., 1983.
Tutorial: Ontologies in Software Maintenance
Presented in this tutorial
• Ontologies applied for
  • Source code analysis
  • Document analysis
  • Traceability analysis
  • Process modeling
• Common to all of these applications
  • Formal ontological representation
  • Taking advantage of reasoning services
  • Extending the knowledge representation with new concepts
Source code analysis
Exploring the known and unknown…
Yonggang Zhang, René Witte, Volker Haarslev
Department of Computer Science and SE, Concordia University, Montréal, Québec, H3G 1M8, Canada
Source code analysis: Beyond ASTs
Current analysis approaches:
• Focus on detailed source code analysis
• Use some form of Abstract Syntax Tree (AST) representation
• AST-based analysis techniques
  • Precise, but…
  • Expensive, due to the size of the ASTs
  • Scalability issues
  • Programming language dependent
  • Low acceptance
What is really used by maintainers?
UNIX/Linux-style grep search facilities
Ontologies and Source code analysis
• Identify potential software artifacts from software documentation
• Maintainer specifies/describes identified artifacts
• Allow programmers to formulate hypotheses concerning properties of these artifacts
• Confirm or refute these hypotheses using automated reasoning.
Description Logics and Racer
• Description Logics as ontology language
  • A family of knowledge representation formalisms applied to specific application domains
  • Describe or specify concepts of a domain
  • Standard ontology languages
  • Foundation of the recently introduced Web Ontology Language (OWL) recommended by the W3C
• Racer – state-of-the-art ontology reasoner
  • Conceptual reasoning – classification, subsumption, transitive closure and consistency checking
  • Instance reasoning – retrieval, checking, consistency, realization
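To make the two kinds of reasoning service concrete, here is a minimal sketch of subsumption checking and instance retrieval over a hand-written concept hierarchy. It is not Racer and does not use its API: a real DL reasoner derives subsumption from concept definitions, whereas this toy only walks a told parent table. The concept and instance names are assumptions chosen for illustration.

```javascript
// Toy "told" hierarchy: concept -> direct parent concepts (hypothetical names).
const parents = {
  AbstractClass: ["Class"],
  Class: ["Type"],
  Interface: ["Type"],
  Type: ["SourceCodeElement"],
  SourceCodeElement: [],
};

// Conceptual reasoning: sub is subsumed by sup if sup is reachable from sub
// by following parent links (reflexive and transitive).
function isSubsumedBy(sub, sup, seen = new Set()) {
  if (sub === sup) return true;
  if (seen.has(sub)) return false; // guard against cycles
  seen.add(sub);
  return (parents[sub] || []).some((p) => isSubsumedBy(p, sup, seen));
}

// Asserted instance -> concept memberships (hypothetical ABox).
const assertions = {
  "java.util.List": "Interface",
  "java.util.AbstractList": "AbstractClass",
};

// Instance reasoning: retrieve all instances of a concept, including
// instances of its sub-concepts.
function retrieve(concept) {
  return Object.keys(assertions).filter((i) =>
    isSubsumedBy(assertions[i], concept)
  );
}
```

Asking for `retrieve("Type")` returns both individuals even though neither was asserted to be a `Type` directly, which is the kind of inferred knowledge the slides refer to.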
SOUND environment
Source Code Ontology
• Capture major concepts of (object-oriented) programming languages
  • Class, Variable, Method, etc.
• Extended ontology with some Java-specific concepts and relations
  • Interface, abstract class, etc.
• Concepts with a direct mapping to source code elements can be automatically discovered by a Java compiler
• Instances of roles are obtained by static source code analysis.
Software Ontology
[Figures: software ontology; role names in the source code ontology]
Description Logics and Query Interfaces
• Ease of use is one of the challenges
  • Most potential users are not familiar with formal representations.
  • Not specific to Description Logics; applies to all formal specification languages.
• What is needed?
  • An abstraction users (maintainers) are familiar with.
Query Interface
• nRQL – an ontology query language
  • Uses arbitrary concept and role names to specify the properties of the result
  • Query variables can be used to bind to instances that satisfy the query
• JavaScript – simplified procedural query language
  • Built-in logic functions
  • Predefined objects to compose queries and manipulate results
  • Extends nRQL by providing additional procedural language constructs
• To retrieve all methods that may throw exceptions:
    var method_throw_exception = new Query();
    method_throw_exception.declare("M", "E");               // query variables
    method_throw_exception.restrict("M", "Method");         // M is a Method
    method_throw_exception.restrict("E", "Exception");      // E is an Exception
    method_throw_exception.restrict("M", "throw", "E");     // M throws E
    method_throw_exception.retrieve("M", "E");              // variables to return
    var result = ontology.query(method_throw_exception);
• To manipulate the result:
    for (var i = 0; i < result.size(); i++) {
        out.println("Method " + result.get(i, "M") + " throws " + result.get(i, "E"));
    }
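How such a conjunctive query might be evaluated can be sketched with a small in-memory stand-in. The `Query` class below only mimics the method names shown on the slide; its internals, the `runQuery` function and the `facts` knowledge base are assumptions for illustration and are not the SOUND or Racer implementation. The sketch assumes each variable carries one concept restriction.

```javascript
// Hypothetical knowledge base: concept memberships and role (relation) pairs.
const facts = {
  concepts: { Method: ["parse", "close"], Exception: ["IOException"] },
  roles: { throw: [["parse", "IOException"]] },
};

// Mimics the slide's query object; only records the atoms of the query.
class Query {
  constructor() {
    this.vars = [];
    this.conceptAtoms = [];
    this.roleAtoms = [];
    this.out = [];
  }
  declare(...names) { this.vars.push(...names); }
  restrict(a, b, c) {
    if (c === undefined) this.conceptAtoms.push([a, b]); // a is an instance of b
    else this.roleAtoms.push([a, b, c]);                 // role b relates a to c
  }
  retrieve(...names) { this.out = names; }
}

// Naive evaluation: enumerate candidate bindings, then filter by role atoms.
function runQuery(q, kb) {
  let bindings = [{}];
  for (const v of q.vars) {
    const atom = q.conceptAtoms.find(([name]) => name === v);
    const candidates = atom ? kb.concepts[atom[1]] || [] : [];
    bindings = bindings.flatMap((b) =>
      candidates.map((c) => ({ ...b, [v]: c }))
    );
  }
  bindings = bindings.filter((b) =>
    q.roleAtoms.every(([s, role, o]) =>
      (kb.roles[role] || []).some(([x, y]) => x === b[s] && y === b[o])
    )
  );
  return bindings.map((b) => Object.fromEntries(q.out.map((v) => [v, b[v]])));
}

// Same query as on the slide, run against the toy knowledge base.
const q = new Query();
q.declare("M", "E");
q.restrict("M", "Method");
q.restrict("E", "Exception");
q.restrict("M", "throw", "E");
q.retrieve("M", "E");
const queryResult = runQuery(q, facts);
```

A reasoner-backed implementation would also return bindings that are only implied by the ontology; this sketch matches asserted facts only.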
Query Interface – UI
Application (1/2): Security validation
• To prevent access to uninitialized variables, a general guideline could be: all fields must be initialized in the constructors.
• Retrieve all classes that did not follow this specific guideline:
    var SecurityConcern3 = new Query();
    SecurityConcern3.declare("F", "I", "C");
    SecurityConcern3.restrict("F", "Field");                 // F is a field
    SecurityConcern3.restrict("I", "Constructor");           // I is a constructor
    SecurityConcern3.restrict("C", "Class");                 // C is a class
    SecurityConcern3.restrict("F", "definedIn", "C");        // F belongs to C
    SecurityConcern3.restrict("I", "definedIn", "C");        // I belongs to C
    SecurityConcern3.no_relation("I", "writeField", "F");    // I never writes F
    SecurityConcern3.retrieve("C", "I");
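The negated restriction is the core of this check: a class violates the guideline when one of its constructors has no write relation to one of its fields. A minimal sketch of that logic over plain objects follows; the class data and the `uninitializedFields` helper are hypothetical and stand in for facts that the source code ontology would provide.

```javascript
// Hypothetical facts extracted by static analysis: for each class, its
// fields and, per constructor, the fields that constructor writes.
const analyzedClasses = [
  {
    name: "Account",
    fields: ["id", "balance"],
    constructors: [{ name: "Account()", writes: ["id"] }],
  },
  {
    name: "User",
    fields: ["name"],
    constructors: [{ name: "User(String)", writes: ["name"] }],
  },
];

// Report every (class, constructor, field) triple where the constructor
// does not write the field, i.e. where no writeField relation exists.
function uninitializedFields(classList) {
  const violations = [];
  for (const c of classList) {
    for (const ctor of c.constructors) {
      for (const f of c.fields) {
        if (!ctor.writes.includes(f)) {
          violations.push({ cls: c.name, ctor: ctor.name, field: f });
        }
      }
    }
  }
  return violations;
}

const violations = uninitializedFields(analyzedClasses);
```

Unlike the query on the slide, this version also reports the offending field, which may be useful when presenting results to a maintainer.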
Application (2/2): Impact analysis
• InfoGlue*: an open source Content Management System
  • 64 KLOC, 770 classes distributed in 49 Java packages.
• Example:
  • The Event interface in InfoGlue has to be changed.
  • Identify the potential impact related to this interface change by identifying and retrieving all classes that use the Event interface:
    var impactOfEvent = new Query();           // declare new Query variable
    impactOfEvent.declare("C");                // declare new variable
    impactOfEvent.restrict("C", "Class");      // restrict variable to all classes
    // further restrict to include all direct/indirect dependencies
    impactOfEvent.restrict("C", "indirectUse", "org.infoglue.cms.entities.workflow.Event");
    impactOfEvent.retrieve("C");
    var result = ontology.query(impactOfEvent);
    out.println(result);
• The query relies on the transitive closure of the indirectUse relation between classes.
• Other applications: architectural analysis, security enforcement, etc.
* http://www.infoglue.org
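The transitive closure that the reasoner computes for this query can be illustrated with a plain graph traversal. The sketch below assumes a hypothetical table of direct-use edges; the class names are invented and are not taken from InfoGlue.

```javascript
// Hypothetical direct-use edges: class -> classes or interfaces it uses.
const directUse = {
  EventController: ["Event"],
  WorkflowFacade: ["EventController"],
  ContentView: ["WorkflowFacade"],
  Logger: [],
};

// Return every class that directly or indirectly uses `target`.
function impactedBy(target, uses) {
  // Invert the relation so we can ask "who uses X directly?"
  const usedBy = {};
  for (const [from, tos] of Object.entries(uses)) {
    for (const to of tos) {
      (usedBy[to] = usedBy[to] || []).push(from);
    }
  }
  // Breadth-first walk over the inverted edges computes the closure.
  const impacted = new Set();
  const queue = [target];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const user of usedBy[current] || []) {
      if (!impacted.has(user)) {
        impacted.add(user);
        queue.push(user);
      }
    }
  }
  return [...impacted].sort();
}

const impactOfEventChange = impactedBy("Event", directUse);
```

Here `ContentView` is reported even though it never references `Event` itself, which is the point of using the indirect relation for impact analysis.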
The missing link in software maintenance… Traceability links
René Witte, Yonggang Zhang, Volker Haarslev
Department of Computer Science and SE, Concordia University, Montréal, Québec, H3G 1M8, Canada
Traceability links
• Background
  • Traceability links help software engineers understand the relations and dependencies among various software artifacts (e.g., source code, documentation).
• Challenge
  • Links between different artifacts often get lost during the development process, for various reasons:
    • Difference in languages (natural language vs. source code).
    • Difference in abstraction level (design or requirements vs. implementation).
    • Maintenance of links is typically not enforced.
    • Lack of adequate (semi-automatic) tools to support creation and maintenance of links.
Traceability Links
• Approach
  • Automatic (semi-automatic) recovery of traceability links.
  • Use a single data model for knowledge concerning both source code and documentation artifacts: an ontology.
  • Instance information is extracted from source code using compilers and static code analysis.
  • Likewise, instance information can be obtained from documents using text mining.
  • The resulting ontologies can be aligned on the class level and linked or merged to provide traceability (and other new features).
Implementation Overview
[Architecture figure. Components shown: Ontology Browser, Document Navigator, Eclipse IDE, SOUND Plug-in, Ontology Management, Query Interface (nRQL/JavaScript), Racer (ontology reasoner), Text Mining System, Software Ontology, Source Code Ontology, Documentation Ontology]
Documentation Ontology
• A large body of concepts that can be discovered in software documents
  • Programming – languages, algorithms, data structures
  • Design – design patterns and software architectures
  • Document-specific – paragraphs, sentences, etc.
• The documentation ontology and source code ontology share many concepts from the programming language domain, which allows us to establish relations between source code and documentation, for example:
  • Which documents describe both C1 and C2?
  • Is there a reference to C1 anywhere in the documents in the context of a specific concept (e.g. concept Architecture::Layer)?
Applying Text Mining on Documents
• Using Natural Language Processing (NLP) to extract semantic information from documents
• Text Mining (TM) is not “Information Retrieval (IR)”
  • IR simply returns documents according to a query
  • TM analyzes natural language documents on a syntactic and semantic level in order to extract or create new structured or unstructured information
• Ontology-based approach
  • Ontologies (OWL-DL) are used both for result export and as a processing resource in NLP
  • Ontology format allows us to semantically link information from documents with source code
Text Mining
• Text analysis: complex and typically split into several sub-tasks
• Preprocessing
  • Tokenization, Part-of-Speech (POS) Tagging, Noun Phrase (NP) Chunking, Sentence Splitting, …
• Named Entity (NE) Detection
  • Finding NEs like Methods, Classes, Design Patterns, etc.
• NE Resolution and Normalization
  • Detect co-referring entities (“method X”, “this method”) and store them in co-reference chains
  • Normalize different names to a canonical one
• Relation detection
  • Detect relations between entities (class implements method, design pattern contains classes)
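The first two stages of this pipeline can be sketched in a few lines. The example below performs sentence splitting, tokenization and a gazetteer-based named entity lookup; it is a deliberately simplified illustration and does not use GATE or reproduce the tutorial's components. The gazetteer entries and function names are assumptions.

```javascript
// Hypothetical gazetteers; a real system could derive the class names
// from the source code ontology.
const gazetteer = {
  Class: ["EventController", "Event"],
  DesignPattern: ["Observer", "Singleton"],
};

// Preprocessing: split on whitespace that follows sentence punctuation.
function splitSentences(text) {
  return text.split(/(?<=[.!?])\s+/).filter((s) => s.length > 0);
}

// Preprocessing: identifier-like tokens only.
function tokenize(sentence) {
  return sentence.match(/[A-Za-z_][A-Za-z0-9_]*/g) || [];
}

// NE detection: tag tokens that appear in a gazetteer, remembering the
// sentence index so later stages can relate entities to each other.
function detectEntities(text) {
  const entities = [];
  splitSentences(text).forEach((sentence, index) => {
    for (const token of tokenize(sentence)) {
      for (const [type, names] of Object.entries(gazetteer)) {
        if (names.includes(token)) {
          entities.push({ sentence: index, type: type, name: token });
        }
      }
    }
  });
  return entities;
}

const sampleText =
  "The EventController class implements the Observer pattern. " +
  "It notifies each Event listener.";
const detected = detectEntities(sampleText);
```

Resolving "It" in the second sentence back to `EventController` would be the job of the co-reference stage, which this sketch leaves out.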
Text Mining Workflow
Text Mining Subsystem
• NLP subsystem has been implemented using the GATE (General Architecture for Text Engineering) environment
  • Component-based architecture
  • Documents are processed through so-called pipelines of NLP components
  • Combination of standard off-the-shelf components with custom developed ones
• Analysis results are exported into an OWL-DL ontology (ontology population)
Text Mining Subsystem
Document ontology
Dynamic document views through Ontologies
• We can use the document ontology to extract, re-structure, and re-combine documents
• E.g., create a new document that contains all the sentences or paragraphs describing a certain class in the source code
Linking Software + Documentation Ontology
• Software Analysis and Text Mining result in two instantiated OWL ontologies
  • Source code ontology
  • Documentation ontology
• One has to link the two ontologies to find information concerning an entity from both sides
  • For example, a class appears in both ontologies
• This is currently done through simple ontology alignment
  • Ontology classes appearing in both ontologies are candidates for alignment
  • Instances from those classes that share the same name (or certain properties) are assumed to be equal
• We can now either create <owl:sameAs> relations or merge the two ontologies
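The name-based alignment described above can be sketched as follows. Instances are modeled as plain objects with a concept and a name; the identifiers, sample instances and the `align` function are hypothetical and only illustrate the matching rule, not the actual SOUND implementation.

```javascript
// Hypothetical instances from the two populated ontologies.
const sourceInstances = [
  { id: "src:1", concept: "Class", name: "EventController" },
  { id: "src:2", concept: "Method", name: "publish" },
];
const docInstances = [
  { id: "doc:7", concept: "Class", name: "EventController" },
  { id: "doc:8", concept: "DesignPattern", name: "Observer" },
  { id: "doc:9", concept: "Class", name: "publish" },
];

// Align two instance sets: only concepts present on both sides are
// candidates, and instances match when concept and name are equal.
function align(left, right) {
  const sharedConcepts = new Set(
    left
      .map((i) => i.concept)
      .filter((c) => right.some((r) => r.concept === c))
  );
  const links = [];
  for (const l of left) {
    if (!sharedConcepts.has(l.concept)) continue;
    for (const r of right) {
      if (r.concept === l.concept && r.name === l.name) {
        links.push({ sameAs: [l.id, r.id] }); // would become an owl:sameAs assertion
      }
    }
  }
  return links;
}

const sameAsLinks = align(sourceInstances, docInstances);
```

Note that `doc:9` is not linked to `src:2` although the names match, because the two instances belong to different concepts; matching on name alone would produce that false link.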
Linking Software + Documentation Ontology
Querying the Combined Ontology
• Ontological linking of source code and documents allows queries across different artifact types
• This makes it possible to answer questions like:
  • Show me the documentation concerning the class containing the current method in my source code editor
  • Show me the implementation for a method described in the documentation
  • Show me the signature of all classes that are mentioned in a document within the same paragraph as the class currently displayed in my editor
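The last question above amounts to a co-occurrence lookup over document-structure facts. A minimal sketch, assuming the text mining stage has already recorded which class is mentioned in which paragraph (the `mentions` data and the `coMentioned` helper are hypothetical):

```javascript
// Hypothetical mention facts from the documentation ontology.
const mentions = [
  { paragraph: 1, className: "EventController" },
  { paragraph: 1, className: "Event" },
  { paragraph: 2, className: "Logger" },
  { paragraph: 3, className: "EventController" },
  { paragraph: 3, className: "WorkflowFacade" },
];

// Classes mentioned in the same paragraph as `current`.
function coMentioned(current, mentionList) {
  // Paragraphs in which the current class appears.
  const paragraphs = new Set(
    mentionList
      .filter((m) => m.className === current)
      .map((m) => m.paragraph)
  );
  // Every other class appearing in one of those paragraphs.
  const result = new Set();
  for (const m of mentionList) {
    if (paragraphs.has(m.paragraph) && m.className !== current) {
      result.add(m.className);
    }
  }
  return [...result].sort();
}

const related = coMentioned("EventController", mentions);
```

Looking up the signatures of the returned classes would then be a second step against the source code ontology, via the links between the two ontologies.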
Linking example (uDig System)
Manual Ontology Editing
• The JavaScript interface also allows users to manually define concepts, instances, and relations in both ontologies
• E.g., creating links between a detected design pattern and source code classes
Summary: Traceability links
• Ontology-based approach for traceability recovery
  • Using a novel approach representing both source code and natural language documents with formal (OWL-DL) ontologies.
  • Ontologies are automatically instantiated from source code analysis and text mining and subsequently aligned and linked/merged.
• Ongoing challenges
  • Establish links across document hierarchies (requirements specification, design, implementation documents, JavaDoc)
  • Detailed performance evaluation
  • Dealing with inconsistencies within a recovery process (see upcoming talk on an Ontological Process Model)
Applying Ontologies to Model a Software Maintenance Process
Wen Jun Meng (1), Yonggang Zhang (1), René Witte (1), Philippe Charland (2)
(1) Department of Computer Science and SE, Concordia University, Montréal, Québec, H3G 1M8, Canada
(2) System of Systems Section, Defence Research and Dev. Canada, Val-Bélair, Québec, Canada
Motivation
• Some estimate that up to 70% of the life cycle cost of a software system is spent on maintenance
• Maintenance is difficult. Several aspects affect software maintenance, e.g.:
  • Maintenance task
  • User expertise
  • Availability of different artifacts (documents, source code, tools, etc.)
• Current research focuses mainly on developing better techniques and tools to tackle specific aspects of the comprehension problem.
• Maintainers often have no active guidance on how to complete a maintenance task within a given context.
Software Process and their Tool support
Related work: Program comprehension processes
• Cognitive models
  • Top-down, Bottom-up, Integrated models
• Process models
  • Software maintenance, reverse engineering, architecture recovery, …
• Common to most of these models is that they:
  • Use existing information resources (e.g. source code, tools, user expertise) to help construct a mental model of a program.
  • Provide only general descriptions of steps involved in a process.
  • Lack active guidelines on how to complete these steps.
  • Lack a formalism and representation to unify resources and infer additional knowledge.
  • Lack knowledge management w.r.t. extendibility, flexibility and integration of new knowledge during the task solving process.
Objective
• We focus on a supported workflow that:
  • Provides context sensitive guidance to assist maintainers during software maintenance.
  • Unifies the formal representation of different knowledge and information resources using ontologies.
  • Provides the ability to reason and infer knowledge across the knowledge base.
Goal: Learn from other application domains
[Figure: sample online shopping workflow]
Objective (continued)
Introduce a formal maintenance process model that:
• constitutes an ontological knowledge base
  • Ontology as a means of integration for different comprehension concepts, like user expertise, reverse engineering.
• represents comprehension tools and techniques within such a common process model.
• supports
  • the management of new concepts and their relationships, as well as the enrichment/refinement of existing concepts with newly gained knowledge or resources.
  • reasoning about information from the ontological representation.
• provides
  • context-sensitive support for guiding the comprehension process itself.
Conceptual Model (partial view)
• Context sensitive support should answer questions like:
  • Which tools might directly/indirectly be required to perform a particular comprehension task (top-down)?
  • Given a current knowledge level (bottom-up), what are the potential (directly/indirectly) related tasks that can be performed?
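The top-down question can be read as backward chaining from a task to the tools that produce its required artifacts. The sketch below assumes a hypothetical process model in which each artifact has at most one producing tool; the task, tool and artifact names are invented for illustration and do not come from the tutorial's ontology.

```javascript
// Hypothetical process model fragments.
const producers = {
  TraceLinks: "TraceLinker",
  CallGraph: "CallGraphBuilder",
  AST: "Parser",
};
const toolInputs = {
  TraceLinker: ["CallGraph", "DesignDocument"],
  CallGraphBuilder: ["AST"],
  Parser: ["SourceCode"],
};
const taskNeeds = {
  ImpactAnalysis: ["CallGraph"],
  TraceabilityRecovery: ["TraceLinks"],
};

// Top-down: which tools are directly or indirectly required for a task?
function requiredTools(task) {
  const needed = new Set();
  const stack = [...(taskNeeds[task] || [])];
  while (stack.length > 0) {
    const artifact = stack.pop();
    const tool = producers[artifact];
    // Skip artifacts that are given (no producer) and tools already counted.
    if (tool === undefined || needed.has(tool)) continue;
    needed.add(tool);
    stack.push(...toolInputs[tool]); // the tool's own inputs must be produced too
  }
  return [...needed].sort();
}

const toolsForImpactAnalysis = requiredTools("ImpactAnalysis");
```

`Parser` shows up for `ImpactAnalysis` only indirectly, because `CallGraphBuilder` needs an AST first.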
Modeling Software Maintenance
[Figure: inter-relationships of the basic elements]
Main Concepts in Program Comprehension
• Task: Description of the comprehension process tasks
• User: Participant who has the main responsibility to fulfill the task
• Tool: Description of existing and available program comprehension tools
• Artifact: Description of software and comprehension process artifacts
  • Software Artifact: Description of the target software, e.g. its documents, software components, analysis artifacts etc.
  • Documents: Descriptions of the documented artifacts
  • Historical data: Information collected during the process for later (re)use
Modeling Software Maintenance (II)
Comprehension task example
• Performing a task (typically) results in new artifacts, which are added to the KB
• As a result, we might now be able to run other tools requiring these artifacts as inputs
• And thus perform new tasks or continue with the current ones.
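This incremental effect can be sketched as a fixed-point computation: keep running any tool whose inputs are all in the knowledge base until nothing new can be added. The tool descriptions and the `reachableArtifacts` function below are hypothetical and only illustrate the idea.

```javascript
// Hypothetical tool descriptions: required inputs and produced artifacts.
const availableTools = [
  { name: "Parser", needs: ["SourceCode"], produces: ["AST"] },
  { name: "CallGraphBuilder", needs: ["AST"], produces: ["CallGraph"] },
  {
    name: "TraceLinker",
    needs: ["CallGraph", "DesignDocument"],
    produces: ["TraceLinks"],
  },
];

// Bottom-up: starting from the artifacts in the KB, repeatedly run every
// tool whose inputs are available and add its outputs to the KB.
function reachableArtifacts(initial, toolList) {
  const kb = new Set(initial);
  const toolsRun = [];
  let changed = true;
  while (changed) {
    changed = false;
    for (const t of toolList) {
      if (toolsRun.includes(t.name)) continue; // each tool runs once
      if (t.needs.every((a) => kb.has(a))) {
        t.produces.forEach((a) => kb.add(a));
        toolsRun.push(t.name);
        changed = true;
      }
    }
  }
  return { artifacts: [...kb].sort(), toolsRun: toolsRun };
}

const withCodeOnly = reachableArtifacts(["SourceCode"], availableTools);
const withDesignDoc = reachableArtifacts(
  ["SourceCode", "DesignDocument"],
  availableTools
);
```

With source code alone, `TraceLinker` never becomes runnable; adding the design document to the KB unlocks it, which mirrors how a newly available artifact enables further tasks.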
Process Management
• Users become immersed in the maintenance process
  • users interact with different resources (e.g., support from tools, techniques, documents and expertise)
  • or other experienced users to complete a task
• A process manager is introduced to facilitate the communication and interactions between users, and the process and ontology manager
• The process itself is interactive and incremental
  • user feedback is collected during the process to enrich the ontology or trigger the next iteration/phase of the maintenance process
  • we explicitly deal with differing viewpoints and inconsistencies through an enriched knowledge management facility
Comprehension Scenario
Knowledge Management Challenges
• Need to manage knowledge from:
  • many sources (tools, users, …)
  • many different artifacts (source code, documents, …)
• Unrealistic to expect a common, consistent view during a process for all users, tools, and tasks
  • rather, there will be inconsistencies, uncertainties, etc.
• Explicitly represent knowledge using nested environments containing viewpoints and topics
  • topic – knowledge pertaining to a subject
  • viewpoint – contains topics or other viewpoints
• We can now manage knowledge in separate, nested environments
  • not only store what a user believes, but also what he believes another user (or tool) believes
  • these beliefs can be contradictory
  • reconciliation through (automatic or manual) belief processes
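One possible way to represent such nested environments is sketched below. A viewpoint holds topics (lists of statements) and further viewpoints, so a user's beliefs about another agent's beliefs live in a nested viewpoint. The data layout, the statement format and all function names are assumptions for illustration, not the tutorial's knowledge management facility.

```javascript
// A viewpoint contains topics and other (nested) viewpoints.
function makeViewpoint(owner) {
  return { owner: owner, topics: {}, viewpoints: {} };
}

// Record a statement {claim, holds} under a topic of a viewpoint.
function believe(viewpoint, topic, claim, holds) {
  (viewpoint.topics[topic] = viewpoint.topics[topic] || []).push({
    claim: claim,
    holds: holds,
  });
}

// Get (or create) what `viewpoint`'s owner believes `other` believes.
function nested(viewpoint, other) {
  if (!viewpoint.viewpoints[other]) {
    viewpoint.viewpoints[other] = makeViewpoint(other);
  }
  return viewpoint.viewpoints[other];
}

// Claims on a topic that two viewpoints judge differently.
function contradictions(a, b, topic) {
  const mine = a.topics[topic] || [];
  const theirs = b.topics[topic] || [];
  return mine
    .filter((m) => theirs.some((t) => t.claim === m.claim && t.holds !== m.holds))
    .map((m) => m.claim);
}

// A user believes Event is thread-safe, but believes a (hypothetical)
// analysis tool believes the opposite.
const alice = makeViewpoint("alice");
believe(alice, "Event", "Event is thread-safe", true);
const toolAsSeenByAlice = nested(alice, "AnalysisTool");
believe(toolAsSeenByAlice, "Event", "Event is thread-safe", false);
const conflicts = contradictions(alice, toolAsSeenByAlice, "Event");
```

Keeping the contradictory statement inside the nested viewpoint leaves the user's own topic consistent while still exposing the conflict to a later reconciliation step.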
System Architecture