Constraint Generation and Reasoning in OWL. Dissertation Defense Thomas H. Briggs, VI Advisor: Dr. Yun Peng University of Maryland, Baltimore County. Introduction. Property Constraints Important to defining the semantics of an ontology Properties may have domain / range constraints
Constraint Generation and Reasoning in OWL. Dissertation Defense. Thomas H. Briggs, VI. Advisor: Dr. Yun Peng. University of Maryland, Baltimore County.
Introduction • Property Constraints • Important to defining the semantics of an ontology • Properties may have domain / range constraints • Global consequences from local assertions • 75% of properties are unconstrained • Property Constraint Generation • Uses information in the ontology to generate constraints • Can be used to determine missing, suggest new, or analyze existing constraints • Creates default knowledge that must be treated differently than other asserted or inferred knowledge.
Thesis The purpose of this research is to investigate methods for generating domain and range constraints for a property from its defining ontology and to evaluate the quality of the generated constraints. This work will also investigate the default reasoning necessary to support generated constraints. A specific focus will be on management of the default facts in the knowledge base, including tracking default facts and efficient retraction operations to restore consistency.
Research Outcomes • Outcomes of this work are: • An algorithmic framework to generate and evaluate domain and range constraints • A quantitative comparison of the relationship between generated and specified constraints • An inference procedure that enables a limited form of default reasoning while maintaining the completeness and correctness of OWL reasoners
Description Logics • Description Logics: • are a branch of crisp logics • include well-researched languages • AL, CLASSIC, RACER • have a long history • are the basis of the Semantic Web • have fast and efficient reasoners (for some DLs) • FACT, Pellet
Description Logics • Describe some world by • Defining classes, properties, and individuals • Classes define types of individuals • Properties define relationships between individuals • Individuals are things that are instances of classes, and are related to other individuals through properties. • Similar to first order logic
Constraints • An assertion about the types of fillers of a property • Subject is type of domain of property • Object is type of range of property • Unconstrained defaults to Thing/Top • Different interpretation than traditional languages • Define valid types of individuals • May force a type cast, but error otherwise • Traditional language example: void foo(double z) { printf("%f\n", z); } char x[] = "33.0"; foo(x); • Result: Error: strings cannot be doubles! • OWL example: teaches: domain(Teacher), range(Student); teaches(Adam, Bill) • Result: Adam is a teacher, Bill a student
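The OWL reading of the teaches example can be sketched in a few lines of Python. This is only an illustration of how domain and range act as type inferences rather than type checks; the function name `infer_types` and the dictionary layout are assumptions made for this sketch, not part of any OWL reasoner.

```python
def infer_types(domain, range_, assertions):
    """Return {individual: set of class names} inferred from property assertions.

    domain / range_ map a property name to a class name (hypothetical layout).
    assertions is a list of (property, subject, object) triples.
    """
    types = {}
    for prop, subject, obj in assertions:
        if prop in domain:
            # The subject of the property is typed with the domain class.
            types.setdefault(subject, set()).add(domain[prop])
        if prop in range_:
            # The object of the property is typed with the range class.
            types.setdefault(obj, set()).add(range_[prop])
    return types


domain = {"teaches": "Teacher"}
range_ = {"teaches": "Student"}
inferred = infer_types(domain, range_, [("teaches", "Adam", "Bill")])
print(inferred)
```

No error is raised for Adam or Bill; the assertion simply adds types to both individuals.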
Open World Assumption • Open World Assumption (OWA) • Anything that isn't asserted is considered unknown. • Leads to monotonicity in the reasoner. • Closed World Assumption (CWA) • Assume all facts are known • Default knowledge • Example: hasChild(ALICE, BOB). Does Alice have exactly one child? • Closed World: Yes! • Open World: No!?
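The hasChild question can be answered both ways in a small Python sketch. It is a simplification under stated assumptions: the open-world function returns None for "unknown", and the `max_children` parameter is a hypothetical stand-in for an asserted maximum-cardinality restriction.

```python
def exactly_one_child_closed(facts, parent):
    """Closed world: the asserted facts are all the facts there are."""
    children = {child for (p, child) in facts if p == parent}
    return len(children) == 1


def exactly_one_child_open(facts, parent, max_children=None):
    """Open world: return True only when it is entailed, else None (unknown).

    max_children is a hypothetical stand-in for an asserted cardinality bound.
    """
    known = {child for (p, child) in facts if p == parent}
    if known and max_children == 1:
        # At least one child is asserted and at most one is allowed.
        return True
    # Unasserted children may exist, so nothing can be concluded.
    return None


facts = {("ALICE", "BOB")}
closed_answer = exactly_one_child_closed(facts, "ALICE")
open_answer = exactly_one_child_open(facts, "ALICE")
print(closed_answer, open_answer)
```

Adding an explicit bound, as in `exactly_one_child_open(facts, "ALICE", max_children=1)`, is what lets the open-world answer become definite.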
Unique Name Assumption • Assumption that the name of an item is sufficient to make it unique (UNA). • We make this for classes and properties • Do not make this for individuals • True only when same individual • Open World Assumption: because we didn't say they were different, the reasoner can conclude that they are the same individual to make the model true
Unconstrained Properties • Domain and range assert types to fillers of property • Unconstrained properties lack these type assertions • Reasons • Information is unknown • Artifact of ontology generator • Avoid conflicts with reuse • Faulty semantics
Constraint Generation • Unconstrained properties are a problem • Constraint generation is a non-trivial process: • Omitted constraints may be intentional or may not • Open World Assumption – information may not be there • Two sources of information on constraints: • ABox • TBox
ABox Generation • ABox generation problematic • Depends on individuals' class membership • Individuals may not be defined / UNA • Frequently do not have a complete set of class assertions • Class assertions overlap • What should the domain and range of drives be?
TBox Generation • Terminology provides definition of the relationship between classes. • Class Vehicle: subClassOf: Thing and (drivenBy some Person) • Class Civic: subClassOf: Thing and (madeBy only HONDA) and (drivenBy some Person) • Is the domain of drivenBy Vehicle? • Generation Lemma: the domain must subsume Vehicle union Civic (Vehicle or Civic) • [Figure: candidate domains, Vehicle versus Vehicle or X]
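A minimal Python sketch of the check implied by the Generation Lemma, using only the told class hierarchy from the Vehicle/Civic example. The helper names `ancestors` and `satisfies_lemma` are hypothetical, and a real implementation would presumably ask a reasoner for subsumption instead of walking asserted subclass links.

```python
def ancestors(cls, parents):
    """All named superclasses of cls (including cls) in the told hierarchy."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def satisfies_lemma(candidate, users, parents):
    """candidate is a set of class names read as a disjunction.

    It satisfies the lemma if every class that uses the property in a
    restriction is subsumed by at least one disjunct.
    """
    return all(ancestors(user, parents) & candidate for user in users)


# Told hierarchy from the slide: both classes are direct subclasses of Thing.
parents = {"Vehicle": ["Thing"], "Civic": ["Thing"]}
users_of_driven_by = {"Vehicle", "Civic"}

print(satisfies_lemma({"Vehicle"}, users_of_driven_by, parents))
print(satisfies_lemma({"Vehicle", "Civic"}, users_of_driven_by, parents))
```

With this hierarchy Vehicle alone is rejected because Civic is not asserted to be a Vehicle, while Vehicle or Civic (and trivially Thing) is accepted.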
Finding “Best” • Using terminology to find “best” • Intractable – exponential growth • Requires utility function to measure goodness • Requires future knowledge or omniscience
Generation Methods • Generation Methods • Construct a constraint that satisfies generation lemma • Three Generation Methods • Disjunction Method • Least-Common Named Subsumer • Vivification
Disjunction • Based on Generation Lemma • Computes the Least Common Subsumer (LCS) • In languages with disjunction, the LCS is simply the disjunction of the concepts • Generation time is linear w.r.t. the number of classes and properties • Reasoning time is exponential.
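The disjunction method can be sketched as a single pass over the restrictions in the terminology. This Python fragment assumes restrictions are available as (class, property, filler) triples and that the range is built from the fillers in the same way the domain is built from the restricted classes; both are assumptions for illustration, and `disjunction_constraints` is a hypothetical name.

```python
from collections import defaultdict


def disjunction_constraints(restrictions):
    """Build domain and range disjunctions per property in one linear pass.

    restrictions: iterable of (class_name, property, filler) triples.
    Returns (domains, ranges), each mapping property -> "A or B" string.
    """
    domain_sets, range_sets = defaultdict(set), defaultdict(set)
    for cls, prop, filler in restrictions:
        domain_sets[prop].add(cls)      # class that carries the restriction
        range_sets[prop].add(filler)    # class used as the filler
    domains = {p: " or ".join(sorted(cs)) for p, cs in domain_sets.items()}
    ranges = {p: " or ".join(sorted(fs)) for p, fs in range_sets.items()}
    return domains, ranges


restrictions = [
    ("Vehicle", "drivenBy", "Person"),
    ("Civic", "madeBy", "HONDA"),
    ("Civic", "drivenBy", "Person"),
]
domains, ranges = disjunction_constraints(restrictions)
print(domains["drivenBy"])
```

Generation is cheap here; the cost the slide warns about arises later, when a reasoner has to work with the resulting disjunctions.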
Disjunction Example Domain for P: Range for P: C
Disjunction Discussion • Disjunction is good because • It is simple to compute • Most specific / accurate statement of constraint • Disjunction is bad because • Does not add useful information • Disjunction adds non-determinism to reasoner
Least Common Named Subsumer • Select a named concept that subsumes concepts • Trade-off in specificity for concept description • Quality depends on existence of named concepts • May be expensive to compute • Runtime is
LCNS Algorithm • Subsumption checking is the dominant cost
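Since the algorithm itself did not survive in this transcript, the following Python fragment is only a sketch of one way to pick a least common named subsumer, not the author's procedure. It assumes a told hierarchy stored as a child-to-parents dictionary, and the class names Car, Truck, and Person are made up for the example.

```python
def ancestors(cls, parents):
    """All named superclasses of cls (including cls) in the told hierarchy."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def lcns(concepts, parents):
    """Most specific named classes that subsume every concept in the list."""
    common = set.intersection(*(ancestors(c, parents) for c in concepts))
    # Drop any common subsumer that has another common subsumer below it.
    return sorted(
        c for c in common
        if not any(d != c and c in ancestors(d, parents) for d in common)
    )


parents = {
    "Car": ["Vehicle"],
    "Truck": ["Vehicle"],
    "Vehicle": ["Thing"],
    "Person": ["Thing"],
}
print(lcns(["Car", "Truck"], parents))
print(lcns(["Car", "Person"], parents))
```

The second call shows the weakness noted in the discussion: when no useful named class exists, the answer collapses to Thing.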
LCNS Example Disjunction Domain of P: LCNS Domain of P: A LCNS Range of P: C
LCNS Discussion • LCNS is good because • It selects a named class in the ontology • Runtime bound to cost for subsumption checking • Generalizes concepts from disjunction • LCNS is bad because • Requires existence of a named class or LCNS is Thing • Tends to over-generalize in other cases as well • Over-generalization discards too much information
Vivification • Balance specificity and over-generalization • First proposed by Cohen & Hirsh 1992 • Difference here is partial absorption • Starts with the disjunction and uses the inheritance relationship to summarize terms with common direct super-classes. • Only terms that do not share a common super-class remain in the disjunction
Absorption • Moderates the generalization process • Uses the class inheritance structure for operation
Vivification Algorithm • Perform Absorption • Vivify a concept list in L for a given absorption criterion Beta.
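The algorithm body is not recoverable from the slide text, so the Python below is a rough sketch of partial absorption as described in the bullets, not the author's algorithm. In particular, the meaning of Beta is an assumption: here `beta` is the minimum number of disjuncts that must share a direct superclass before that superclass absorbs them. The single-parent `parent` dictionary and the class names are also hypothetical.

```python
from collections import defaultdict


def vivify(terms, parent, beta=2):
    """Summarize a disjunction by absorbing terms into shared direct superclasses.

    terms: class names in the disjunction.
    parent: maps a class to its direct superclass (missing means Thing).
    beta: ASSUMED absorption criterion, the minimum group size to absorb.
    """
    groups = defaultdict(set)
    for term in terms:
        groups[parent.get(term, "Thing")].add(term)

    result = set()
    for superclass, members in groups.items():
        if superclass != "Thing" and len(members) >= beta:
            result.add(superclass)   # group absorbed into the named superclass
        else:
            result |= members        # outliers remain in the disjunction
    return sorted(result)


parent = {"Car": "Vehicle", "Truck": "Vehicle", "Dog": "Animal"}
print(vivify(["Car", "Truck", "Dog"], parent))
```

For these three terms the plain disjunction would keep all three classes and a least common named subsumer would be Thing, while this sketch keeps Dog as an outlier and summarizes the other two as Vehicle.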
Vivification Example • Property P is used in the definition of the three yellow classes. Disjunction Domain: LCNS Domain: Vivification Domain:
Vivification Discussion • Vivification is good because • It creates general concepts that summarize over common super-classes, selecting named subsumers • It preserves outliers • It is fast • Vivification is bad because • Disjunctions may remain after summarization • Depends on the completeness of the terminology • Ignores individual assertions
Results - Domain • Generated constraint was equal to the originally specified one. Positive outcome. Correctly generated constraint with equal specificity.
Results - Domain • Original more specific than generated. In all cases the original constraint was subsumed by the generated one, making it more specific than the generated one.
Results - Domain • Original more general than generated. A negative to neutral outcome. The original constraint was more general than its present usage.
Results - Domain • Original Top, Generated Top. Both the original and generated concepts were Top. It is a special case of row 1, where the concepts are equal.
Results - Domain • Original Top, Generated More Specific. Strongly positive results. A constraint was generated for a property that previously lacked one.
Results - Domain • Generated Top, Original More Specific. A neutral to negative result. A constraint was generated as Top when the original was not Top. An example was an ontology that defined hasAunt as the union of Niece and Nephew, which was equivalent to Person, and Person was equivalent to everything; hence the generation created Top.
Results - Domain • Property Unused. Neutral results. A constraint could not be generated because there were no role restrictions to define the constraints.
Results - Domain Processor or Reasoner Failed. There was a runtime failure of the processor or reasoners.
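The result rows above amount to a comparison of the original and generated constraints by subsumption. A hedged Python sketch of such a comparison follows; the function names, the use of None for an unused property, and the extra "incomparable" outcome are assumptions of this sketch and do not appear in the slides. Processor or reasoner failures would surface as exceptions and are not modelled.

```python
def ancestors(cls, parents):
    """All named superclasses of cls (including cls) in the told hierarchy."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def classify(original, generated, parents):
    """Place an (original, generated) pair of named constraints in a result row."""
    if generated is None:
        return "property unused"
    if original == "Thing" and generated == "Thing":
        return "original Top, generated Top"
    if original == "Thing":
        return "original Top, generated more specific"
    if generated == "Thing":
        return "generated Top, original more specific"
    if original == generated:
        return "equal"
    if generated in ancestors(original, parents):
        return "original more specific than generated"
    if original in ancestors(generated, parents):
        return "original more general than generated"
    return "incomparable"  # not a row in the slides; added for completeness


parents = {"Student": ["Person"], "Person": ["Thing"]}
print(classify("Thing", "Student", parents))
```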
Results – Range Range results were similar to domain.
Results - Normalized Generation strategies created improved constraints almost 80% of time. Vivification created constraints nearly as specific as Disjunction.
Results - Runtime • [Chart: Time for Load, Reason, Generate, Build, Reason over 1000 ontologies] • [Chart: Load Time / Reasoning Time] • Hypothesis Testing • Vivification faster than disjunction at 92.6% degree of confidence. • Vivification faster than LCNS at 76.4% degree of confidence.
Results – Discussion • Generation • Removing unused properties gives a better picture of the future as technologies mature. • Generation is a viable method • Vivification was the dominant method • Generated constraints with near equal specificity to LCS • Able to generalize at appropriate times • Avoided the over-generalization of LCNS • All-around best performance for generation and reasoning
Default Reasoning • Monotonicity • One goal of OWL is to maintain monotonicity: the property of a reasoner that adding new facts to the knowledge base does not cause existing facts to be retracted. • Default Knowledge / Rules • Default knowledge and rules about the terminology make use of Closed World Semantics and give up monotonicity. • A default rule may conflict with future statements • Statements must be retracted.
Contraction • When a clash occurs in a knowledge base with default statements, those default facts must be removed to restore consistency. This is called a contraction. • How to tell default from non-default? • Inference leads to multi-path problem • Default and non-default facts can be used to infer new facts • Default facts may block non-default facts from being generated
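The multi-path problem can be illustrated with a small Python sketch. It is deliberately naive: instead of an efficient retraction it recomputes what is still derivable from the non-default facts, and the justification table, fact strings, and the class A2 are hypothetical names invented for the example.

```python
def contract(asserted, justifications):
    """Return the facts that survive once every default fact is retracted.

    asserted: set of non-default base facts.
    justifications: {derived_fact: [set_of_premises, ...]}, one set per
    derivation path. Default facts are simply never seeded, so anything
    supported only through them does not come back.
    """
    alive = set(asserted)
    changed = True
    while changed:
        changed = False
        for fact, supports in justifications.items():
            if fact not in alive and any(s <= alive for s in supports):
                alive.add(fact)
                changed = True
    return alive


asserted = {"P(I,J)", "Type(I,A2)", "SubClassOf(A2,A)"}
justifications = {
    # Two paths to the same fact: one through a default domain, one without.
    "Type(I,A)": [{"P(I,J)", "Domain(P,A)"}, {"Type(I,A2)", "SubClassOf(A2,A)"}],
    # Only path goes through the default range.
    "Type(J,B)": [{"P(I,J)", "Range(P,B)"}],
}
survivors = contract(asserted, justifications)
print(sorted(survivors))
```

Type(I,A) survives because a non-default path supports it, while Type(J,B) disappears along with the default range it depended on.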
Default Example
Before generation:
Class: A SubClassOf: Thing, P some B
Class: B SubClassOf: Thing
Class: C SubClassOf: Thing
ObjectProperty: P Domain: Thing Range: Thing
Individual: J
Individual: I Facts: P(I,J)
After property generation (domain and range on P were generated / default):
Class: A SubClassOf: Thing, P some B
Class: B SubClassOf: Thing
Class: C SubClassOf: Thing
ObjectProperty: P Domain: A Range: B
Individual: J Types: B
Individual: I Types: A Facts: P(I,J)
What if the domain expert adds C SubClassOf: P some B? Now the domain of P is generated as A union C, and I is no longer in A!
Modifications • Default Descriptor • Indicates the defaultness of a statement or assertion. • Does not change the meaning of the term • Inference • Inference rules modified to propagate descriptor • Non-default statement must replace default statement
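A possible shape for the modified inference rule, sketched in Python. The knowledge base is reduced to a dictionary from fact to a boolean default descriptor; that representation, the `derive` function, and the class A2 are assumptions of this sketch rather than the author's design.

```python
def derive(kb, fact, premises):
    """Record that `fact` follows from `premises` and propagate the descriptor.

    kb maps each fact to True (default) or False (non-default).
    A derivation is default if any premise is default. A non-default
    derivation replaces a default one, never the other way around.
    """
    derivation_is_default = any(kb[p] for p in premises)
    kb[fact] = kb.get(fact, True) and derivation_is_default
    return kb[fact]


kb = {
    "P(I,J)": False,            # asserted by the author
    "Domain(P,A)": True,        # generated, hence default
    "Type(I,A2)": False,
    "SubClassOf(A2,A)": False,
}
first = derive(kb, "Type(I,A)", ["P(I,J)", "Domain(P,A)"])
second = derive(kb, "Type(I,A)", ["Type(I,A2)", "SubClassOf(A2,A)"])
third = derive(kb, "Type(I,A)", ["P(I,J)", "Domain(P,A)"])
print(first, second, third)
```

The descriptor changes only how the fact is tracked, not what it means: Type(I,A) is the same statement before and after it becomes non-default.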
Concept Strength • Concept Strength between concepts C and D • Strength Relationship: • If C is default and D is not, then C is weaker than D • If C and D have the same defaultness, then they are equal • If C is not default and D is, then C is stronger than D
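The three rules translate directly into a comparison function. In this Python sketch the encoding of the result as -1, 0, or 1 and the function name `strength` are choices made for illustration.

```python
def strength(c_is_default, d_is_default):
    """Compare concept C with concept D by defaultness.

    Returns -1 if C is weaker than D, 0 if equal, 1 if C is stronger than D.
    """
    if c_is_default == d_is_default:
        return 0          # same defaultness: equal strength
    if c_is_default:
        return -1         # C is default, D is not: C is weaker
    return 1              # C is not default, D is: C is stronger


print(strength(True, False), strength(True, True), strength(False, True))
```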