Decisional DNA Ontology-based
Ontology • In philosophy: • It is the most fundamental branch of metaphysics. • It studies being or existence. • It tries to find out what entities and what types of entities exist. • In Computer Science: • It is the explicit specification of a conceptualization (Tom Gruber's widely accepted definition). • It is a description of the concepts and relationships in a domain.
Ontology-based Technology • It is the field in which computer-based semantic tools and systems are developed. • Its main focus is information sharing and knowledge management for querying and classification purposes. • It has several domains of application: medical, chemical, legal, cultural, etc. • It is commonly used in Artificial Intelligence (AI) and Knowledge Representation (KR).
Ontology-based Technology • Computer programs can use ontologies for a variety of purposes: inductive reasoning, classification, and problem-solving techniques. • Ontologies support communication and sharing of information among different systems. • Emerging semantic web systems use ontologies for better interaction and understanding between different web-based agent systems.
Modeling Ontologies • Ontologies can be modelled using several languages, such as RDF and OWL (both can be expressed in eXtensible Markup Language, XML). • OWL (Web Ontology Language) is a W3C Recommendation. • OWL facilitates machine interpretability of web content by providing additional vocabulary along with formal semantics. • OWL has three increasingly expressive sublanguages: OWL Lite, OWL DL, and OWL Full. • An ontology-based Set of Experience can serve as a scenario for exploiting semantic data.
Set of Experience in XML
<rule>
  <joint>
    <condition>
      <factor>
        <coef>1/1.2</coef>
        <variable>Firing</variable>
      </factor>
      <sym>&lt;=</sym>
      <variable>Competitor's Firing</variable>
    </condition>
  </joint>
  <consequence>
    <factor>
      <variable>Status of Firing</variable>
    </factor>
    <sym>=</sym>
    <value>VERY GOOD</value>
  </consequence>
</rule>
<function>
  <obj>Min</obj>
  <fn_name>Payment Level</fn_name>
  <sym>=</sym>
  <factor>
    <coef>3</coef>
    <variable>X1</variable>
  </factor>
  <factor>
    <oper>+</oper>
    <coef>2</coef>
    <variable>X2</variable>
    <poten>2</poten>
  </factor>
  <unit>money</unit>
</function>
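A minimal sketch of how the function element above might be read and evaluated with Python's standard library. It assumes that each factor contributes coef × variable^poten, that a missing coef or poten defaults to 1, and that oper carries the sign of the term; these readings, and the name evaluate_function, are illustrative assumptions rather than part of the Set of Experience specification.

```python
import xml.etree.ElementTree as ET

# The <function> element from the slide, embedded as a string.
FUNCTION_XML = """
<function>
  <obj>Min</obj>
  <fn_name>Payment Level</fn_name>
  <sym>=</sym>
  <factor><coef>3</coef><variable>X1</variable></factor>
  <factor><oper>+</oper><coef>2</coef><variable>X2</variable><poten>2</poten></factor>
  <unit>money</unit>
</function>
"""


def evaluate_function(xml_text, values):
    """Sum coef * variable ** poten over every <factor> (assumed semantics)."""
    root = ET.fromstring(xml_text)
    total = 0.0
    for factor in root.findall("factor"):
        coef = float(factor.findtext("coef", default="1"))
        power = float(factor.findtext("poten", default="1"))
        # Assumption: <oper>-</oper> negates the term, anything else adds it.
        sign = -1.0 if factor.findtext("oper", default="+") == "-" else 1.0
        total += sign * coef * values[factor.findtext("variable")] ** power
    return total


# Payment Level = 3*X1 + 2*X2^2 for hypothetical values X1 = 10, X2 = 2.
payment_level = evaluate_function(FUNCTION_XML, {"X1": 10.0, "X2": 2.0})
```

With the hypothetical inputs shown, the two factors contribute 30 and 8 respectively.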
Modeling Set of Experience Ontology-based Ontology perspective tag.
Modeling Set of Experience Ontology-based Relationships among the different classes of the Ontology can be seen using a plug-in for Ontology visualization.
Growing System • Decisional DNA is shared. • It is a community of practice distributing knowledge. • It is built on ontology web technology. • e-Decisional Community
KSCS: Dissimilar SOEKS [Figure: KSCS architecture diagram. Labels shown: Integration layer, Internal Analyzer layer, External Analyzer layer, Risk Analyzer layer, Knowledge-base layer, Experience Creator, Intuition Creator, Ruler Creator, PRIORITIES; outputs: DIAGNOSIS, PROGNOSIS, SOLUTION, KNOWLEDGE.]
Unification & Negotiation • The KSCS collects information from many different applications in the shape of SOEKS, providing the elements needed for the platform to perform. The group of SOE resulting from the multiple applications makes up the set O. • Variables are unified into a unique system of names, and values are unified into a unique system of measurement. • However, uniformity of the SOEi ∈ O is not assured. The SOE can be dissimilar in terms of value, dimension, structure and phenotype. • Thus, the set O is the target group to be homogenized and unified.
Unification & Negotiation • S: the universe of SOE, • O: the set of SOE from the applications, • U: the set of valid SOE that follow any Ci or Ri, • W: the set of valid SOE produced by mixing the different Oi ∈ O, • I: the set of invalid SOE, • H: a transitional set produced by the mix function, • T: a set of chosen valid SOE Ei ∈ W to be used as an initial population for a GA. PLAN • Mix O → set H • Evaluate H • If Hi is valid → Wi • If Hi is invalid → Ii • Create T from W • Get a solution using a GA with initial population T
The mix Function Given the set O, Sets of Experience are reformulated by mixing all their components (i.e. mixing variables, functions, constraints, and rules). It is an integration of multiple formal decision events implemented by applying the mix function: mix: O → H, with mix(O = {O1, O2, …, On}) = H. Step 1. Initialization of the set H: it places each Oi in H without being mixed. Step 2. Mixing the SOE …
The mix Function Step 2 continues until it generates the sets W and I as follows: Step a. Test each existing Hi ∈ H for validity. Step b. Test each Hi while changing existing values of Vi. Step c. Test each Hi while changing through all possible values of each Vi: • if boolean, then switch; • if numeric, then change to an existing value of Vi; • if categorical, then change to the next category.
The Set H The set H, the product of applying the mix function, can have three kinds of problems: • Dimensionality problem: the resulting Sets of Experience in the set H can have different dimensions than their predecessors in the set O, • Weight-efficiency problem: the resulting Sets of Experience in the set H need new definitions for weights and efficiency values, • Validity problem: the resulting Sets of Experience in the set H can be invalid according to any rule or constraint in any Oi ∈ O.
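Step c can be sketched as a small helper that proposes the next candidate value for a variable according to its type. This is only an illustration: the name next_value, the choice of the first other known value for numeric variables, and the wrap-around from the last category to the first are assumptions, not part of the mix function as defined.

```python
def next_value(value, known_values=(), categories=()):
    """Propose the next candidate value for a variable, by type (assumed rules)."""
    # bool must be tested before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return not value  # boolean: switch
    if isinstance(value, (int, float)):
        # numeric: move to another value already present for this variable
        others = [v for v in known_values if v != value]
        return others[0] if others else value
    # categorical: move to the next category (wrap-around is an assumption)
    index = list(categories).index(value)
    return list(categories)[(index + 1) % len(categories)]


flipped = next_value(True)
changed = next_value(3, known_values=[3, 5, 8])
advanced = next_value("LOW", categories=["LOW", "MEDIUM", "HIGH"])
```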
The Dimensionality Problem Our approach to this problem is to expand the produced SOE by homogenizing the whole group of elements. The resulting collection of SOE follows the equation: SOEKS E1...n = (VTotal # VBLES, FTotal # FNS, CTotal # CTNS, RTotal # RLES) where, in the definition of VTotal # VBLES, n is the number of SOE Oi ∈ O and m is the number of variables in the SOE Oi. It is important to highlight that such expansion, in the case of variables, corresponds initially to the creation of the elements that compose the optimal (mixed) SOE.
Default Values A new problem emerges here: some variables' fields can be empty as a result of the expansion of the SOE; thus, only filling such fields with valid values allows those SOE to be considered among the candidate optimal solutions. Valid values used to fill empty fields are called 'Default Values'. These default values can derive from: • modulus or neutral values, or • values defined as the minimum or maximum acceptable value for the variable, or • the variable's values in any of the existing SOE Oi.
The Weight-efficiency Problem All the SOE Oi ∈ O are considered equally important; thus, the weight associated with each of them is equal to 1/n. However, the weights associated with the different components of the SOE and its phenotype have to be redefined as part of the negotiation. The weight associated with the variable i, wvi, is defined as: Similarly, functions, constraints, and rules are defined. The implementation of a phenotype equation follows the weighted-sum-of-objective-functions method from MOO: Phenotype of Ei = Any duplication in any component is resolved by means of the union function performed previously.
The Validity Problem Constraints and rules are what define the concept of validity. Every SOE has to be evaluated against the whole group of constraints and rules in order to avoid contradictions. Every Hi, after being filled with the default values, is evaluated for validity; if an Hi is valid, it is taken into the set W, otherwise it goes to the set I. Given a SOE Hi ∈ H, its validity is defined as: Hi is valid iff: • ∀ Cij ∈ Hi, Cij is true under the given Vi values, and • ∀ Rij ∈ Hi, Rij is true under the given Vi values.
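The validity test and the split of H into W and I can be sketched as follows. This is a minimal illustration under stated assumptions: each SOE is a dictionary with a "variables" mapping, constraints and rules are plain Python predicates over that mapping, and a rule is read as an implication (it holds when its condition is false or its consequence is true). The variable names and the sample constraint and rule are hypothetical.

```python
def is_valid(soe, constraints, rules):
    """An SOE is valid iff every constraint and every rule holds for its values."""
    values = soe["variables"]
    return all(c(values) for c in constraints) and all(r(values) for r in rules)


def partition(H, constraints, rules):
    """Split the transitional set H into valid (W) and invalid (I) SOE."""
    W, I = [], []
    for soe in H:
        (W if is_valid(soe, constraints, rules) else I).append(soe)
    return W, I


# Hypothetical constraint: x1 must not exceed 10.
constraints = [lambda v: v["x1"] <= 10]
# Hypothetical rule: IF x1 > 5 THEN status = "HIGH" (read as an implication).
rules = [lambda v: (not v["x1"] > 5) or v["status"] == "HIGH"]

H = [
    {"variables": {"x1": 3, "status": "LOW"}},    # valid
    {"variables": {"x1": 8, "status": "LOW"}},    # breaks the rule
    {"variables": {"x1": 12, "status": "HIGH"}},  # breaks the constraint
]
W, I = partition(H, constraints, rules)
```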
A Solution In the process of mixing, a question remains about whether the produced set H contains the universal optimal solution or not. Two types of solutions can be obtained: • it is assumed that the resulting set W contains the universal optimum; in such a case, the process of mixing produced a unique optimum SOE Ei* ∈ W, or • it is assumed that the resulting set W does NOT contain the universal optimum; in such a case, the process of mixing would produce multiple local optima, and the user is comfortable with one of these.
Solution It is assumed that the resulting set W does NOT contain the universal optimum; in such a case, the process of mixing would produce multiple local optima. If, however, the user is NOT comfortable with any of these local optima, it is possible to select a population (T) of SOE Ei ∈ W, such that an implementation of an Evolutionary Algorithm can lead to an improved set of Pareto-optimal solutions from which the user can choose one.
Multiobjective Evolutionary Algorithms • Multiple MOEA have been proposed. • Among the best-known techniques are: • VEGA - Vector Evaluated Genetic Algorithm (Schaffer, 1985) • WEGA - Weighting-based Genetic Algorithm (Hajela and Lin, 1992) • MOGA - Multiobjective Genetic Algorithm (Fonseca and Fleming, 1993) • NPGA - Niched Pareto Genetic Algorithm (Horn et al., 1994) • NSGA - Nondominated Sorting Genetic Algorithm (Srinivas and Deb, 1994) • SPEA - Strength Pareto Evolutionary Algorithm (Zitzler and Thiele, 1999)
SPEA • Like other algorithms, SPEA: • stores the nondominated solutions found externally, • uses the concept of Pareto optimality, and • performs clustering to reduce the number of nondominated solutions stored. • However, SPEA is unique in that: • all solutions in the external nondominated set participate in the selection.
Solution Under any circumstance, once the set W is built, each Wi can proceed to the analysers to be solved according to the principles set out by Sanin and Szczerbicki (2004, 2007). At the end, the KSCS offers a set of solutions to the user; however, it is the set of the user's priorities that defines which among them is the chosen optimal model or optimal Decisional Gene to be added to the Decisional DNA.
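The nondominated set that SPEA keeps externally rests on the Pareto dominance relation. The sketch below shows only that dominance filter for minimization problems; it is not an implementation of SPEA (no fitness assignment, clustering, or selection), and the function names and sample objective vectors are illustrative.

```python
def dominates(a, b):
    """True if a dominates b under minimization:
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (two objectives, both minimized).
population = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = nondominated(population)
```

Here (3, 3) and (4, 4) are both dominated by (2, 2), so they are excluded from the front.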
Set of Experience Knowledge Structure Sets of experience are grouped according to their phenotype creating chromosomes. Groups of chromosomes construct the decisional DNA of a company.
Similarity • Data mining (DM) and knowledge discovery (KD) are research areas that develop tools to use information and knowledge in an effective manner. Computers are used to recognize patterns automatically in large multivariate data sets for supporting decision-making (Duda 2001). • In searching for patterns, it is normally not enough to consider only equality or inequality of data. Instead, it must be possible to calculate how much two objects differ from each other. • Similarity between objects can help, for example, in predictions, hypothesis testing, and rule discovery (White 1996); additionally, it can offer rankings of results for choosing the best matching objects.
Similarity • Evaluating similarity between data objects depends essentially on the type of the data. • It is possible to have several types of similarity measures on a single set of data, i.e. heterogeneous similarity measures. • Approaches generally used for defining similarity between two data objects consider their attributes in pairs and compare them as isolated elements. Hence, similarity metrics can be calculated based on a composed addition of many different comparisons.
Similarity • A common approach defines similarity between objects in terms of a notion of distance (i.e. a geometric approach). A data object is represented by its coordinates in a similarity space, such that similar data objects are plotted closer to one another in the multi-dimensional attribute space than less similar ones (Fabrikant 2001). • The most common distance measures may be considered part of the Minkowski family of distance metrics, such as the Euclidean and Hamming metrics. This approach takes two attribute vectors, say for objects si and sj, and calculates the similarity measure d as follows:
Similarity • A problem with this model is that the method seems inappropriate for data objects that have a number of qualitative attributes. Some researchers have assumed that an object represented by such a set of qualitative attributes may be prearranged as binary variables. • Other techniques to estimate a similarity metric belong to the group of event sequences. Similarity between event sequences is based on the idea that a similarity notion should reflect how much work is needed to transform one event sequence into another.
Similarity • Additional techniques include additive trees, which represent data objects as terminal nodes in a tree, so that similarity between two objects is calculated by the length of the path between them; and additive clustering, which defines a number of clusters with associated weights, so that similarity between two objects is estimated by the sum of the weights of their common clusters. • Many more similarity measures have been proposed, such as information content, mutual information, Dice coefficient, cosine coefficient, and the feature contrast model, among others. A problem with the preceding similarity measures is that each of them is tied to a particular domain model. • There is, at present, no agreement about the best similarity metric because there is no consensus on what input data a similarity function should integrate (Lee 1999).
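A minimal sketch of a Minkowski-family distance, assuming the standard form d(si, sj) = (Σk |sik − sjk|^r)^(1/r). With r = 2 it gives the Euclidean distance; with r = 1 it gives the city-block distance, which coincides with the Hamming distance on binary vectors. The function name and the parameter r are illustrative, and any weighting or normalization used by the authors is not included.

```python
def minkowski(si, sj, r=2.0):
    """Minkowski distance of order r between two equal-length attribute vectors."""
    if len(si) != len(sj):
        raise ValueError("attribute vectors must have the same length")
    return sum(abs(a - b) ** r for a, b in zip(si, sj)) ** (1.0 / r)


euclidean = minkowski((0, 0), (3, 4))                # r = 2
city_block = minkowski((0, 1, 1), (1, 1, 0), r=1.0)  # equals Hamming on binary data
```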
Similarity Metric The similarity measure d is called a metric if it satisfies the following conditions for all sets of experience si, sj and sk in the universe U: • d(si, sj) ≥ 0 (non-negativity) • d(si, sj) = 0 if and only if si = sj (identity) • d(si, sj) = d(sj, si) (symmetry) • d(si, sk) ≤ d(si, sj) + d(sj, sk) (triangle inequality)
Similarity Metric of SEKS Let C be a subset of the universe U of sets of experience, named the context set. A boolean expression θ containing one or many restrictions on elements of the head of the set of experience is called a selection condition on sets of experience, e.g. θ = (area = "Human Resources" AND subarea = "Salary Office" AND aim function = "Payment Level"). The context set is the subset of U that satisfies θ. It is defined as C = {s ∈ U | s satisfies θ}.
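Selecting the context set can be sketched as filtering the universe with a predicate that plays the role of θ. The dictionary layout of a set of experience, the key names (head, area, subarea, aim_function), and the sample records are hypothetical.

```python
def context_set(universe, theta):
    """Return the sets of experience in the universe that satisfy theta."""
    return [s for s in universe if theta(s)]


def theta(s):
    """Selection condition on the head of a set of experience (example from the slide)."""
    head = s["head"]
    return (
        head["area"] == "Human Resources"
        and head["subarea"] == "Salary Office"
        and head["aim_function"] == "Payment Level"
    )


# Hypothetical universe U of sets of experience.
U = [
    {"id": 1, "head": {"area": "Human Resources", "subarea": "Salary Office",
                       "aim_function": "Payment Level"}},
    {"id": 2, "head": {"area": "Human Resources", "subarea": "Recruitment",
                       "aim_function": "Payment Level"}},
    {"id": 3, "head": {"area": "Finance", "subarea": "Salary Office",
                       "aim_function": "Payment Level"}},
]
C = context_set(U, theta)
```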
Similarity Metric of SEKS Because each set of experience has different elements that comprise it and each element has its own characteristics, similarity is examined separately for each of the elements; afterwards, a unique similarity measure is offered by combining their separate results. VARIABLES FUNCTIONS CONSTRAINTS RULES
Similarity Metric of SEKS: Variables • Qualitative variables are prearranged. • Euclidean Metric with normalization. • Similar = 0, Non-similar = 1.
Similarity Metric of SEKS: Functions Qualitative functions: • Similar = 0, Non-similar = 1.
Similarity Metric of SEKS: Functions Quantitative functions: • Similarity is estimated by comparing their factors. • SF = SFQl + SFQt • Similar = 0, Non-similar = 1.
Similarity Metric of SEKS: Constraints Calculated as Functions: • Similarity is estimated by comparing their factors. • Comparative relationship (<,>, <=, etc). • SC = SCQl + SCQt • Similar = 0, Non-similar = 1.
Similarity Metric of SEKS: Rules • Rules are elements that regulate the decision event, but they are not elements of comparison because of their conditional nature. • Rules are part of the decision event and influence the variables according to certain conditions; however, not all of them are executed, and in consequence they are not part of the similarity estimation.
Similarity Metric of SEKS Defining a similarity metric for SEKS is as simple as adding up the estimations for each of the previous elements (variables, functions, and constraints), assuming that all of them are assigned the same weight because they are equally important in influencing the formal decision event. S(i, j) = (SV + SF + SC)/3
Similarity Metric of SEKS S1 = 0.68, S2 = 0.31, S3 = 0.74, S4 = 0.26, S5 = 0.53, S6 = 0.12, S7 = 0.07
[Figure: DATA, INFORMATION, KNOWLEDGE, ?]
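The equal-weight combination S(i, j) = (SV + SF + SC)/3 and a ranking of candidates by it can be sketched as follows. The sketch assumes each component has already been normalized to [0, 1], where 0 means similar and 1 means non-similar; the function names and the component values are hypothetical.

```python
def seks_similarity(sv, sf, sc):
    """Equal-weight mean of variable, function, and constraint similarity."""
    for name, part in (("SV", sv), ("SF", sf), ("SC", sc)):
        # Assumption: every component is normalized to the interval [0, 1].
        if not 0.0 <= part <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {part}")
    return (sv + sf + sc) / 3.0


def rank_by_similarity(scores):
    """Order candidate names from most similar (lowest S) to least similar."""
    return sorted(scores, key=scores.get)


# Hypothetical component values for three stored sets of experience.
candidates = {
    "A": seks_similarity(0.2, 0.4, 0.6),
    "B": seks_similarity(0.0, 0.1, 0.2),
    "C": seks_similarity(0.9, 0.9, 0.9),
}
ranking = rank_by_similarity(candidates)
```

Because 0 denotes the most similar case, the ranking is ascending in S.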
Reflexive Ontologies • Reflexivity addresses the property of an abstract structure of a knowledge base (in this case, an ontology and its instances) to know about itself. • When an abstract ontology knowledge structure is able to maintain, in a persistent manner, every query performed on it, and store those queries as individuals of a class that extends the original ontology, it is said that such ontology is reflexive. • “A Reflexive Ontology is a description of the concepts and the relations of such concepts in a specific domain, enhanced by an explicit self contained set of queries over the instances”
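The reflexive idea can be illustrated with a toy in-memory store that keeps every query performed on it alongside its instances. This is only a sketch of the concept, not an OWL-based implementation: the class name ReflexiveStore, the use of Python predicates as queries, and the decision to keep each query's result with the query are all assumptions.

```python
class ReflexiveStore:
    """Toy knowledge base that persists the queries made over its instances."""

    def __init__(self, instances):
        self.instances = list(instances)
        # Stored queries: query text -> instances that satisfied it.
        # This plays the role of the class that extends the original ontology.
        self.queries = {}

    def query(self, text, predicate):
        """Answer a query, storing it so the structure 'knows' it was asked."""
        if text not in self.queries:
            self.queries[text] = [i for i in self.instances if predicate(i)]
        return self.queries[text]


# Hypothetical instances and query.
store = ReflexiveStore([
    {"name": "s1", "area": "Human Resources"},
    {"name": "s2", "area": "Finance"},
])
result = store.query("area = Human Resources",
                     lambda i: i["area"] == "Human Resources")
```

Asking the same query again returns the stored answer instead of scanning the instances a second time.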