Formalizing the Evolution Process
Giorgos Flouris, George Konstantinidis
{fgeo,gconstan}@ics.forth.gr
Institute of Computer Science, Foundation for Research and Technology – Hellas, Heraklion, Greece

Introduction
• Change management is a critical process in knowledge-intensive applications
• Lots of research on the subject
  • Several fields dealing with change management
  • Several change algorithms and tools have emerged for different contexts
  • Dedicated events (e.g., IWOD yearly series)
• We define change (evolution) as the process of adapting (revising) a corpus of knowledge based on new information

Motivation
• Original motivation
  • Create “yet another” change algorithm (for RDF/S ontologies)
• But, in the process…
  • Uncovered a “pattern of change”
  • Change algorithms can be viewed as different manifestations of the same general idea
• Similarity offers an opportunity for abstraction
  • Study change at a more fundamental level
  • Apply this study to practical problems

Basic Notions
• Consider a pool of elements L (the language)
• A Knowledge Base (KB) K is any set of elements from L (a subset of L)
• An update is a request to add elements to K and remove elements from K
  • U=(U+,U-)
  • U+: the elements to add (subset of L)
  • U-: the elements to remove (subset of L)
• Update operation: K●U
  • A function that returns a KB, given a KB and an update

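The notions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Update` and `apply_raw` are hypothetical, and applying the update literally (set union followed by set difference, with no side-effects) is just one naive reading of K●U, not the operation the slides go on to define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """An update U = (U+, U-); both parts are subsets of the language L."""
    add: frozenset      # U+: the elements to add
    remove: frozenset   # U-: the elements to remove


def apply_raw(kb: frozenset, update: Update) -> frozenset:
    """Apply the update literally, without any side-effects (assumed reading)."""
    return (kb | update.add) - update.remove


kb = frozenset({"a", "b"})
u = Update(add=frozenset({"c"}), remove=frozenset({"a"}))
result = apply_raw(kb, u)
```

Here `result` contains `b` and `c`. An element listed in both U+ and U- ends up removed in this sketch, which is an arbitrary choice.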
Change Principles
• General principles (inspired by research on belief revision)
  • Principle of Algorithmic Adequacy: well-defined and deterministic operation; the output is a KB
  • Principle of Irrelevance of Syntax: semantical considerations only (syntax is not important)
  • Principle of Validity: output is valid
  • Principle of Success: update takes precedence over existing knowledge
  • Principle of Minimal Change: protect existing knowledge (no unnecessary changes)
• Principles are widely applicable, but not in all contexts (exceptions exist); here, we adopt these principles

Change Process (Intuition and Problems)
[Diagram: an update (request for element additions and removals) transforms a KB (set of elements) into a new KB (set of elements)]
• Apply the update (must be successful: Principle of Success)
• But KBs are not just sets of elements
  • They must be valid (Principle of Validity) [Validity Model]
  • Inference applies (Principle of Validity, Principle of Success) [Inference Model]
• Apply side-effects, to guarantee validity and success (Principle of Validity, Principle of Success)
• But there may be several different options for the side-effects (minimality considerations: Principle of Minimal Change)
• Determine the minimal side-effects, to guarantee minimality (Principle of Minimal Change) [Minimality Model]
• Main challenge: to determine the minimal set of side-effects that guarantee success and validity, taking into account only semantical considerations, in the presence of inference, and to apply the side-effects (and the update) upon the original KB

Algorithmic Scheme (Meta-algorithm)
• Input: KB K, update U
• Apply U upon K
• Valid and successful?
  • YES (easy part): no side-effects
  • NO (hard part): determine the side-effects; the result must be successful, valid and minimal
• Apply U and the side-effects upon K; return the result

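The scheme can be made concrete with a toy sketch. Everything specific here is an assumption for illustration: the function names, the brute-force enumeration of side-effects over a small finite language, and the cost function (fewest extra removals first, then fewest extra additions). The slides leave the validity and minimality models open at this level.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of items, smallest first, in a deterministic order."""
    items = sorted(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def change(kb, add, remove, language, is_valid):
    """Toy meta-algorithm; returns the new KB, or None if the update is infeasible."""
    def successful(k):
        # Success: everything in U+ is present, nothing in U- is present.
        return add <= k and not (remove & k)

    # Easy part: apply U directly; no side-effects needed if this already works.
    direct = (kb | add) - remove
    if is_valid(direct) and successful(direct):
        return direct

    # Hard part: search all candidate side-effects (only workable for tiny languages).
    best, best_cost = None, None
    for extra_add in powerset(language - kb - add - remove):
        for extra_remove in powerset(kb - add - remove):
            candidate = (kb | add | set(extra_add)) - remove - set(extra_remove)
            if not (is_valid(candidate) and successful(candidate)):
                continue
            # Assumed minimality model: prefer fewer removals, then fewer additions.
            cost = (len(extra_remove), len(extra_add))
            if best_cost is None or cost < best_cost:
                best, best_cost = candidate, cost
    return best


language = {"A_isA_B", "B_isA_C", "A_isA_C"}


def is_valid(k):
    # Toy validity rule: the two subclass facts together require the third.
    return (not ({"A_isA_B", "B_isA_C"} <= k)) or ("A_isA_C" in k)


result = change({"A_isA_B"}, {"B_isA_C"}, set(), language, is_valid)
```

In this example the direct application is invalid, so the search adds `A_isA_C` as a side-effect rather than dropping the existing fact `A_isA_B`. Ties between equally cheap candidates are broken by enumeration order, which a real selection mechanism would have to rule out.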
Parameters of Change
• Change algorithms follow the same general scheme, but:
  • Are applicable to very different contexts
  • Give different results (select different side-effects to apply)
• There must be some parameters hidden in the algorithmic scheme, which affect the algorithms’ behaviour
  • Language
  • Domain of application
  • Validity model
  • Inference model
  • Minimality model
• An algorithm’s expected result and behaviour are determined by the values of these five parameters

List of Parameters
• The language
  • Determines the pool of elements for the KBs and updates
• The domain of application
  • Determines the supported <KB, update> pairs
• Inference model
  • Determines the inference mechanism
• Validity model
  • Determines the valid KBs
• Selection process
  • Selects one of the possible sets of side-effects
  • Determines the minimality model

Effect of Parameters
• Language determines the pool of elements (for the KB and the update)
• Domain of application determines the acceptable input of the algorithm
• Validity determines the side-effects that satisfy validity
• Inference affects the side-effects that satisfy validity and success
• Selection mechanism determines the side-effects to apply (and, consequently, the expected result of the algorithm)
[Diagram labels: All updates (side-effects); Satisfying success; Satisfying validity; Infeasible updates; Minimal?]

Levels of Abstraction (1/2)
• The values of the parameters determine:
  • The working hypotheses and context of the change algorithm
  • The expected result of the change algorithm
• So far, we have not specified:
  • The possible values of the parameters
  • How to implement an algorithm returning the expected result
• We study these questions at different abstraction levels
  • Parameters can be specific, or range over a vast pool of possibilities
  • This affects our ability to develop (design) an algorithm returning the expected result

Levels of Abstraction (2/2)
• Level 1: Meta-algorithm
  • Most general; applicable in any context; framework; no implementation hints
• Level 2: General-purpose algorithm
  • Quite general; widely applicable, parameters range over a family of possibilities; implementation hints; special case of level 1
• Level 3: RDF-specific algorithm
  • Specific; usable only for RDF/S; directly implementable; proof of concept; special case of level 2
[Diagram axes: Generality, Covered Contexts; Specificity, Implementability, Practical Applicability]

Introduction to RDF, RDFS
• RDF: triple-based representation
  • (s, p, o): s (subject) has property p (predicate) with value o (object)
  • Example: (myCar, hasColor, Red)
  • Triples connect resources (anything with a URI)
• RDFS: adds semantics to RDF
  • Adds special resources (e.g., property rdfs:subClassOf)
  • Example: (Car, rdfs:subClassOf, Vehicle)
  • Adds semantics to these resources (e.g., rdfs:subClassOf denotes the subsumption relationship, so it is transitive)

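As a small illustration of triples and of the transitivity of rdfs:subClassOf, the sketch below stores triples as Python tuples and computes the subclass facts implied by transitivity. The function name `subclass_closure` and the triple `(Vehicle, rdfs:subClassOf, Thing)` are assumptions added for the example; RDFS defines further semantics that this sketch does not cover.

```python
SUBCLASS = "rdfs:subClassOf"


def subclass_closure(triples):
    """Return the triples plus every rdfs:subClassOf triple implied by transitivity."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current subclass pairs, then look for chains a -> b -> d.
        pairs = [(s, o) for (s, p, o) in closed if p == SUBCLASS]
        for a, b in pairs:
            for c, d in pairs:
                if b == c and (a, SUBCLASS, d) not in closed:
                    closed.add((a, SUBCLASS, d))
                    changed = True
    return closed


graph = {
    ("myCar", "hasColor", "Red"),
    ("Car", SUBCLASS, "Vehicle"),
    ("Vehicle", SUBCLASS, "Thing"),  # hypothetical triple, not from the slides
}
closed = subclass_closure(graph)
```

After the call, `closed` additionally contains `("Car", "rdfs:subClassOf", "Thing")`, while the `hasColor` triple is left untouched.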
Setting the Language (1/2)
• Level 1: a non-empty set
• Level 2: based on a finite set of predicates P and a set of constants Σ
  • The ground fact Q(x) represents some fact
  • Only relational ground facts in L (no logical operators such as ¬ and other connectives)
  • Directly representable in a database
  • Examples: Q(x), R(x,y), …
• The original language may be of a different form
  • Provide a mapping

Setting the Language (2/2)
• Example (RDF/S)
  • Subclass relationship (rdfs:subClassOf) denoted by C_IsA
  • (Car, rdfs:subClassOf, Vehicle) mapped into C_IsA(Car, Vehicle)
  • This allows mapping sets of triples (KBs in RDF) to sets of relational ground facts (KBs in L)
• Level 3: a particular set of predicates P and constants Σ and the respective mapping
  • See (Konstantinidis, 2008) for details

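The mapping from triples to relational ground facts might look as follows. Only the rdfs:subClassOf to C_IsA entry comes from the slides; the function names and the generic `Triple` fallback predicate are hypothetical, and the full mapping is the one given in (Konstantinidis, 2008).

```python
# Only this entry is taken from the slides; a real mapping would list more predicates.
PREDICATE_MAP = {"rdfs:subClassOf": "C_IsA"}


def triple_to_fact(triple):
    """Map one RDF triple to a relational ground fact, encoded as a tuple."""
    s, p, o = triple
    if p in PREDICATE_MAP:
        return (PREDICATE_MAP[p], s, o)
    # Hypothetical fallback for unmapped properties (an assumption of this sketch).
    return ("Triple", s, p, o)


def map_kb(triples):
    """Map a KB in RDF (a set of triples) to a KB in L (a set of ground facts)."""
    return {triple_to_fact(t) for t in triples}


facts = map_kb({("Car", "rdfs:subClassOf", "Vehicle")})
```

Here `facts` holds the single ground fact `("C_IsA", "Car", "Vehicle")`, the tuple encoding of C_IsA(Car, Vehicle).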
Setting the Domain of Application
• Level 1: a non-empty set D of <KB, update> pairs
• Level 2: some restrictions apply
  • KB must be valid
  • Update must not be infeasible
  • D={<K,U> | K: valid, U: not infeasible}
• Level 3: the same for the RDF/S context

Setting the Validity Model
• Level 1: a set V of valid KBs
• Level 2: validity decided through a set of validity rules
  • A KB is valid if and only if it satisfies the validity rules
  • The set V consists of the KBs that satisfy the validity rules
  • Validity rules are disjunctive embedded dependencies (DEDs), a general class of first-order logic axioms
• Level 3: a particular set of validity rules (DEDs), suitable for the RDF/S context
  • See (Konstantinidis, 2008) for a full list

Setting the Inference Model (1/2)
• Level 1: a function Cn from KBs to KBs
• Level 2: unusual handling
  • Inference rules (DEDs) determine the implications of a KB
  • Inference rules are included in the validity rules, not in the Cn function
• No inference as such
  • If a KB is valid, then it is also “closed” with respect to “inference”
  • For a KB K, Cn(K)=K (no inference)
  • No implicit information, in the strict sense

Setting the Inference Model (2/2)
• Practical implications
  • No implicit information, so validity and success checks need not take inference into account
  • Simplifies algorithm design
  • Simplifies the definition of the validity rules
  • Guarantees satisfaction of the Principle of Irrelevance of Syntax: equivalent (and valid) KBs are equal
• Level 3: a particular set of inference rules, applicable to the RDF/S context, added to the validity rules
  • See (Konstantinidis, 2008) for a full list

Setting the Selection Mechanism
• Level 1: a function σ that selects one update out of a set of updates
• Level 2: a relation (<) comparing the different updates
  • Compares sets of effects plus side-effects
  • Determines (i.e., selects) the minimal update (must be unique)
  • Relation assumed to satisfy certain properties (conflict sensitivity, totality, partial antisymmetry, monotonicity, transitivity)
• Level 3: via a particular relation, suitable for the application at hand (RDF/S)
  • See (Konstantinidis, 2008) for details

Need for an Algorithm
• Knowing the expected result is not the same as being able to produce it algorithmically
  • No algorithm to determine the candidate side-effects
  • The candidates may be infinite
• Designing an algorithm is only possible for levels 2 and 3
  • RDF-specific algorithm is an application of the general-purpose one
[Diagram labels: All updates (side-effects); Satisfying success; Satisfying validity; Minimal]

General-Purpose Algorithm
• Starting with U, we enhance it (add side-effects) in an effort to guarantee both validity and success
• Possible solutions are compared (using <) and filtered
• Some paths lead to infeasible solutions (rejected)
• Tree-like search space (rooted in U)
[Diagram labels: All updates (side-effects); Satisfying success; Satisfying validity; U]

Algorithm: Determining the Paths
• If validity is not satisfied, then at least one rule is violated
• Path determination is based on the violated rule(s)
  • ∀x,y,… Q1(x,y,…) ∧ Q2(x,y,…) → ∃z,w,… Q3(x,y,…,z,w,…) ∨ …
• Each branch represents one way to restore a violated rule
• Each node represents one violated rule that is restored

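One way to read "each branch represents one way to restore a violated rule" is sketched below. This is an assumption, not the authors' definition: for a ground instance of a rule whose premises all hold and whose alternative conclusions all fail, a branch either removes one premise fact or adds the facts of one alternative conclusion. Existential variables are ignored (the conclusions are taken as already ground), and the name `repair_branches` is hypothetical.

```python
def repair_branches(premises, conclusions):
    """List candidate side-effects that restore one violated ground rule instance.

    premises:    ground facts that currently all hold in the KB
    conclusions: alternative sets of ground facts, none of which currently holds
    Returns a list of (facts_to_add, facts_to_remove) pairs, one per branch.
    """
    branches = []
    # Option 1: falsify the premise by removing one of its facts.
    for fact in premises:
        branches.append((frozenset(), frozenset({fact})))
    # Option 2: satisfy the conclusion by adding one alternative.
    for alternative in conclusions:
        branches.append((frozenset(alternative), frozenset()))
    return branches


# Violated transitivity instance: C_IsA(A,B) and C_IsA(B,C) hold, C_IsA(A,C) does not.
branches = repair_branches(
    premises=[("C_IsA", "A", "B"), ("C_IsA", "B", "C")],
    conclusions=[[("C_IsA", "A", "C")]],
)
```

For this instance there are three branches: remove either premise fact, or add C_IsA(A,C). Each branch may violate other rules in turn, which is what makes the search space tree-like.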
Algorithm: Correctness
• The nature of the algorithm, the properties of < and our hypotheses (language, validity rules etc.) guarantee that:
  • The algorithm will always converge to a valid and successful set of side-effects
    • Unless the update is infeasible, in which case infeasibility will be detected
  • The minimal set of side-effects will be discovered (i.e., at least one path will lead to it)
  • The result will satisfy all the principles

Algorithm: Termination
• Unfortunately, we cannot, in general, guarantee that:
  • There is a finite number of paths
  • Each path will converge in a finite number of steps
• For some selections of the parameters, the algorithm may not terminate
  • Applying the algorithm to specific contexts presupposes careful selection of the parameters
  • For the parameters used for level 3, termination is guaranteed

Algorithm Summary (Levels 2 and 3)
• Level 2:
  • The presented algorithm identifies the updates to compare
  • The minimality relation is used to compare them
  • Correctness is guaranteed
  • Termination is not guaranteed (for some parameters)
• Level 3:
  • Uses the general-purpose algorithm
  • Applied for the particular parameters of level 3
  • Correctness is guaranteed
  • Termination is guaranteed (for the specific parameters)

Comparison With Related Work
• Related work: the pattern (algorithmic scheme) is followed, in general
  • Side-effects decided, per-case, during design time
  • Decisions hard-coded in the algorithm
• Our approach takes the related decisions at run-time
  • Avoids error-prone checking for all possible cases at design-time
  • Easier to guarantee “global policy” towards updates
  • Easier to prove formal properties of the algorithm
  • Can support an infinite number of different updates
  • Allows experimentation with certain parameters, even as part of user’s input
• Less efficient
  • Special-purpose algorithms can be developed
  • Heuristics and application-specific optimizations can be used

Contributions (Level 1)
• Uncovers fundamental properties of change algorithms
  • Easier understanding of existing algorithms
  • Aids the development of new algorithms
  • Modularizes the development of new algorithms
• Allows the understanding of change at a fundamental level
  • When you understand how you make changes in your mind, it is easier to encode and implement them
  • Abstracts away from the peculiarities of each application
• Not implementable

Contributions (Level 2)
• Allows the design of a general-purpose algorithm
  • Returns the expected result, per its parameters
  • Provides formal guarantees (correctness, consistent policy etc.)
• Applicable in several different contexts
  • Specific contexts are simple applications of the general algorithm
• Algorithm design can be reduced to the setting of the parameters
  • Designer only has to determine the correct parameters
  • Delegates all the hard work to run-time
  • Algorithm (partly) orthogonal to the context
• Application-specific optimizations necessary for efficiency
• Termination is not guaranteed for all possible parameters

Contributions (Level 3)
• Implemented algorithm for changing RDF/S ontologies
  • An application of the general-purpose algorithm for some specific set of parameters, suitable for RDF/S
  • Proof of concept
• Enjoys all the nice formal properties and guarantees of the general-purpose algorithm (e.g., correctness)
  • Termination (for the particular parameterization)
• Optimizations possible (application-specific)
  • Heuristics
  • Special-purpose algorithms
• Very specific, only applicable to the RDF/S context

Conclusions and Future Work
• RDF-specific algorithm implemented in the SWKM (Semantic Web Knowledge Middleware) platform
  • Large-scale real-time system based on web services, developed at FORTH
  • SWKM web site: http://athena.ics.forth.gr:9090/SWKM/
• Future work
  • Detailed experimental performance evaluation of the RDF-specific algorithm
  • Optimizations, heuristics
  • Applications to ontology debugging

Thank You
References:
• George Konstantinidis. Belief Change in Semantic Web Environments. Master’s Thesis, Computer Science Department, University of Crete, 2008.
• George Konstantinidis, Giorgos Flouris, Grigoris Antoniou and Vassilis Christophides. “A Formal Approach for RDF/S Ontology Evolution”. In Proceedings of the 18th European Conference on Artificial Intelligence (ECAI-08), pages 405-409, 2008.
• SWKM web site: http://athena.ics.forth.gr:9090/SWKM/

EXTRA SLIDES

Rules and Language: Semantics
• The language contains only relational atoms
  • No inference rules (only validity rules)
• The language assumes closed-world semantics
  • Q(x) is implied by K if and only if Q(x)∈K (explicitly)
  • ¬Q(x) is implied by K if and only if Q(x)∉K (explicitly)
• Checking the validity of a rule becomes simple:
  • ∀x,y C_IsA(x,y)→¬C_IsA(y,x) is satisfied by K if and only if, for all x,y, it holds that C_IsA(y,x)∉K whenever C_IsA(x,y)∈K

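Under closed-world semantics, the check described above reduces to set membership tests. The sketch below is an illustration under assumptions: ground facts are encoded as tuples `(predicate, x, y)`, the KB contains only binary facts, and the function names are hypothetical.

```python
def isa_violations(kb):
    """Pairs (x, y) with both C_IsA(x,y) and C_IsA(y,x) explicitly in the KB."""
    return sorted(
        (x, y)
        for (pred, x, y) in kb
        if pred == "C_IsA" and ("C_IsA", y, x) in kb
    )


def satisfies_rule(kb):
    """True iff C_IsA(y,x) is absent from the KB whenever C_IsA(x,y) is present."""
    return not isa_violations(kb)


good_kb = {("C_IsA", "Car", "Vehicle")}
bad_kb = good_kb | {("C_IsA", "Vehicle", "Car")}
```

With these inputs, `satisfies_rule(good_kb)` holds and `satisfies_rule(bad_kb)` does not, and no inference step is needed in either case. Under this reading, a fact such as C_IsA(a,a) violates the rule on its own.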
Summary
• Set some principles for rational updates
• Expected update result is determined by five parameters:
  • Language
  • Domain of Application
  • Inference Model
  • Validity Model
  • Selection Mechanism
• Implementing an algorithm returning the expected result is a different thing
• Three levels of abstraction
  • Different restrictions on parameters’ values and different opportunities for algorithm design