A Library of Components for Classification Problem Solving
Wenjin Lu and Enrico Motta
Knowledge Media Institute
Four Main Goals
• To carry out a knowledge-level analysis of classification
• To develop a practical resource to support the development of classification applications
• To provide a concrete set of components to act as a test case for the IBROW brokering system and IRS
• To evaluate the UPML framework and the OCML modelling language on a non-trivial test case
Detailed Modelling in OCML
• Supports domain, task and PSM specification
• Large Library (>90 Ontologies)
• Extensive experience (~20 projects, 5 years)
• Robust Infrastructure
• Both web-based and ‘vanilla’ development environments
• Integration of specification and operationalization is a good thing!
• Rapid development and validation
• Result = both analytical and engineering resource
Amalgamating UPML and OCML
• OCML Base Ontology was revised to comply with UPML
• Tasks and PSMs become assumption-based
Classification
Classification can be seen as the problem of finding the solution (class) which best explains a set of known facts (observables), according to some criterion.
[Diagram labels: Observables, Candidate Sols., Criterion, Classification, Solution]
Example
Observables: {background=green; area=china...}
Candidate Sols.: {chinese-granny, dutch-granny, etc.}
Solution: {chinese-granny}
Criterion: Complete-coverage-criterion (every observable has to be explained)
Observables
Observables = set_of (Observable); Observable = {feature, value}
Well-defined Observables (obs):
({f1, v1} ∈ obs ∧ {f1, v2} ∈ obs) -> v1 = v2
({f1, v1} ∈ obs) -> legal_feature_value (f1, v1)
Solutions
Solution = set_of (Feature_Spec); Feature_Spec = {Feature, Feature_value_spec}
Feature_value_spec = Unary_Relation
Well-defined Solution (sol):
({f1, s1} ∈ sol ∧ holds (s1, v1)) -> legal_feature_value (f1, v1)
Matching
Observable = {f1, v1} matches Solution = sol iff:
{f1, c} ∈ sol ∧ holds (c, v1)
Matching Sets of Obs to a Solution
Sol: {{fsol1, c1}...{fsolm, cm}}; Obs: {{fob1, v1}...{fobn, vn}}
Four possible cases:
{fj, cj} ∈ sol ∧ {fj, vj} ∈ obs ∧ holds (cj, vj) -> Explained (fj)
{fj, cj} ∈ sol ∧ {fj, vj} ∈ obs ∧ not holds (cj, vj) -> Inconsistent (fj)
{fj, vj} ∈ obs ∧ {fj, cj} ∉ sol -> Unexplained (fj)
{fj, vj} ∉ obs ∧ {fj, cj} ∈ sol -> Missing (fj)
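The four cases above can be sketched in Python. This is only an illustration, not the library's OCML code: it assumes observables are held in a dict (which enforces one value per feature) and that a solution maps each feature name to a predicate standing in for the unary relation. The names `MatchScore`, `match` and the sample apple data are hypothetical.

```python
from typing import Any, Callable, Dict, NamedTuple, Set


class MatchScore(NamedTuple):
    """Feature sets in the order <I, E, U, M> used by the default match score."""
    inconsistent: Set[str]
    explained: Set[str]
    unexplained: Set[str]
    missing: Set[str]


def match(obs: Dict[str, Any], sol: Dict[str, Callable[[Any], bool]]) -> MatchScore:
    """Sort every feature into one of the four cases."""
    inconsistent, explained, unexplained, missing = set(), set(), set(), set()
    for feature, value in obs.items():
        if feature not in sol:
            unexplained.add(feature)      # observed, but the solution says nothing
        elif sol[feature](value):
            explained.add(feature)        # holds(c, v)
        else:
            inconsistent.add(feature)     # not holds(c, v)
    for feature in sol:
        if feature not in obs:
            missing.add(feature)          # specified by the solution, never observed
    return MatchScore(inconsistent, explained, unexplained, missing)


# Hypothetical data, loosely modelled on the apple example.
chinese_granny = {
    "background": lambda v: v == "green",
    "area": lambda v: v == "china",
    "sweet-level": lambda v: v in {"medium", "high"},
}
obs = {"background": "green", "area": "china", "rusty": "no"}
score = match(obs, chinese_granny)
```

Here `background` and `area` come out as explained, `rusty` as unexplained and `sweet-level` as missing.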
Default Match Criterion
Match Score: Vector: <I, E, U, M>
Match Comparison Relation
S1 = (i1, e1, u1, m1); S2 = (i2, e2, u2, m2)
S1 better_score than S2 iff:
(i1 < i2) ∨
(i2 = i1 ∧ e2 < e1) ∨
(i2 = i1 ∧ e2 = e1 ∧ u1 < u2) ∨
(i2 = i1 ∧ e2 = e1 ∧ u2 = u1 ∧ m1 < m2)
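The comparison relation is a lexicographic ordering: fewer inconsistent features first, then more explained, then fewer unexplained, then fewer missing. A minimal Python sketch, assuming each score is reduced to a tuple of counts `(i, e, u, m)`; the function name `better_score` follows the slide, the tuple representation is an assumption.

```python
from typing import Tuple

Score = Tuple[int, int, int, int]  # (i, e, u, m) counts; representation assumed


def better_score(s1: Score, s2: Score) -> bool:
    """True iff s1 is strictly better than s2 under the default comparison."""
    i1, e1, u1, m1 = s1
    i2, e2, u2, m2 = s2
    # Negate the components where smaller is better, then rely on Python's
    # lexicographic tuple comparison.
    return (-i1, e1, -u1, -m1) > (-i2, e2, -u2, -m2)
```

For instance, `better_score((0, 2, 1, 1), (1, 5, 0, 0))` holds because inconsistency dominates every later component.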
Possible Solution Criteria
• Positive Coverage
  • Some feature is explained and none is inconsistent
• Complete Coverage
  • All features are explained and none is inconsistent
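Both criteria can be read as predicates over a match score. The sketch below assumes the score is a tuple of counts `(i, e, u, m)` and takes "all features are explained" to mean that no observed feature is left unexplained. The function names are hypothetical.

```python
from typing import Tuple

Score = Tuple[int, int, int, int]  # (i, e, u, m) counts; representation assumed


def positive_coverage(score: Score) -> bool:
    """Some feature is explained and none is inconsistent."""
    i, e, _u, _m = score
    return i == 0 and e > 0


def complete_coverage(score: Score) -> bool:
    """No feature is inconsistent and no observed feature is unexplained."""
    i, _e, u, _m = score
    return i == 0 and u == 0
```

Under this reading, a score of `(0, 2, 1, 0)` satisfies positive coverage but not complete coverage.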
Hierarchy of Criteria
[Diagram labels: Solution Criterion, Match Criterion, Match Score Mechanism, Match Score Comparison Rel, Macro Score Mechanism, Feature Score Mechanism]
Observables
(def-class observables (set) ?obs
  "This is simply a set of observables. An important constraint is that there cannot be two values for the same feature in a set of observables"
  :iff-def (every ?obs observable)
  :constraint (not (exists (?ob1 ?ob2)
                           (and (member ?ob1 ?obs)
                                (member ?ob2 ?obs)
                                (has-observable-feature ?ob1 ?f)
                                (has-observable-feature ?ob2 ?f)
                                (has-observable-value ?ob1 ?v1)
                                (has-observable-value ?ob2 ?v2)
                                (not (= ?v1 ?v2))))))
Solutions
(def-class solution () ?x
  "A solution is a set of feature definitions"
  :iff-def (every ?x feature-definition))

(def-class feature-definition () ?x
  ((has-feature-name :type feature)
   (has-feature-value-spec :type unary-relation))
  :constraint (=> (and (has-feature-name ?x ?f)
                       (has-feature-value-spec ?x ?spec))
                  (=> (holds ?spec ?v)
                      (legal-feature-value ?f ?v))))
Solution Criterion
(def-class solution-admissibility-criterion () ?c
  ((applies-to-match-score-type :type match-score-type)
   (has-solution-admissibility-relation :type unary-relation))
  :constraint (=> (and (solution-admissibility-criterion ?c)
                       (has-solution-admissibility-relation ?c ?r)
                       (domain ?r ?d))
                  (subclass-of ?d match-score)))
Monotonicity of Admissible Solutions
(def-axiom admissibility-is-monotonic
  "This axiom states that the admissibility criterion is monotonic. That is, if a solution, ?sol, is admissible, then any solution which is better than ?sol will also be admissible"
  (forall (?sol1 ?sol2 ?obs ?criterion)
          (=> (and (admissible-solution ?sol1
                                        (apply-match-criterion ?criterion ?obs ?sol1)
                                        ?criterion)
                   (better-match-than ?sol2 ?sol1 ?obs ?criterion))
              (admissible-solution ?sol2
                                   (apply-match-criterion ?criterion ?obs ?sol2)
                                   ?criterion))))
Complete Coverage
(def-instance complete-coverage-admissibility-criterion solution-admissibility-criterion
  ((applies-to-match-score-type default-match-score)
   (has-solution-admissibility-relation complete-coverage-admissibility-relation)))

(def-relation complete-coverage-admissibility-relation (?score)
  "a solution should be consistent and explain all features"
  :constraint (default-match-score ?score)
  :iff-def (and (= (length (first ?score)) 0)    ;; no inconsistency
                (= (length (third ?score)) 0)))  ;; no unexplained
Classification Task Ontology
• 42 Definitions
• Provides both a theory of classification and a vocabulary to describe classification problems
• Ontology is separated from task specifications
Generic Classification Task
• Input roles
  • Candidate Solutions, Match Criterion, Solution Criterion, Observables
• Precondition
  • Both observables and candidate solutions have to be provided
• Goal
  • To find a solution from the candidate solutions which is admissible with respect to the given observables, solution criterion and match criterion
Specific Classification Tasks
• Single-Solution Classification Task
  • Single-solution assumption
• Optimal Classification Tasks
  • Goal requires optimality
Problem Solving Library
• Based on heuristic classification model
• Supports both data-directed and solution-directed classification
• Based on search paradigm
• Supported by a method ontology
Method Ontology: Main Concepts
• Abstractors
  • Mechanism for performing abstraction on observables
  • Abstractor: Obs* -> Obs
• Refiners
  • Mechanism for specialising a solution
  • Refiner: Sol -> Sol*
• Candidate Exclusion Criterion
  • A criterion which is used to decide when a search path is a dead-end
  • Default criterion rules out inconsistent solutions
Monotonicity of Exclusion Criterion
(def-axiom exclusion-is-monotonic
  (forall (?sol1 ?sol2 ?obs ?criterion)
          (=> (and (ruled-out-solution ?sol1 (the-match-score ?sol1) ?criterion)
                   (not (better-match-than ?sol2 ?sol1 ?obs ?criterion)))
              (ruled-out-solution ?sol2 (the-match-score ?sol2) ?criterion))))
Axiom of Congruence
(def-axiom CONGRUENT-ADMISSIBILITY-AND-EXCLUSION-CRITERIA
  (forall (?sol ?task)
          (=> (member ?sol (the-solution-space ?task))
              (not (and (admissible-solution
                         ?sol (the-match-score ?sol)
                         (role-value ?task 'has-solution-admissibility-criterion))
                        (ruled-out-solution
                         ?sol (the-match-score ?sol)
                         (role-value ?psm 'has-solution-exclusion-criterion)))))))
Three Heuristic Classification PSMs
• Two Data-directed
  • Admissible Solution Classifier
    • Finds one admissible solution according to the given criteria
    • Uses backtracking hill climbing
  • Optimal Classifier
    • Performs complete search looking for optimal solution
    • Uses best-first strategy
    • Uses candidate exclusion criterion to prune search space
• One Solution-directed
  • Goes down the solution hierarchy, acquiring observables as needed
  • Asks for observables with max discrimination power
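One way to picture the Admissible Solution Classifier is a depth-first descent of the refinement hierarchy that always tries the best-scoring refinement first and backtracks on failure. The Python sketch below is an interpretation under stated assumptions, not the library's PSM. It assumes scores are count tuples `(i, e, u, m)`, that the exclusion criterion is also used for pruning here, and that the hierarchy is acyclic. All names and the toy apple data are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Spec = Dict[str, Callable[[Any], bool]]
Score = Tuple[int, int, int, int]  # (i, e, u, m) counts; representation assumed


def match_score(obs: Dict[str, Any], spec: Spec) -> Score:
    i = sum(1 for f, v in obs.items() if f in spec and not spec[f](v))
    e = sum(1 for f, v in obs.items() if f in spec and spec[f](v))
    u = sum(1 for f in obs if f not in spec)
    m = sum(1 for f in spec if f not in obs)
    return (i, e, u, m)


def rank(score: Score) -> Tuple[int, int, int, int]:
    """Sort key: a larger key means a better score under the default comparison."""
    i, e, u, m = score
    return (-i, e, -u, -m)


def find_admissible(sol: str, obs: Dict[str, Any], specs: Dict[str, Spec],
                    children: Dict[str, List[str]],
                    admissible: Callable[[Score], bool],
                    ruled_out: Callable[[Score], bool]) -> Optional[str]:
    score = match_score(obs, specs[sol])
    if ruled_out(score):
        return None  # dead end: refinements are assumed to be ruled out as well
    if admissible(score):
        return sol
    # Try the most promising refinement first; backtrack if it fails.
    ordered = sorted(children.get(sol, []),
                     key=lambda c: rank(match_score(obs, specs[c])), reverse=True)
    for child in ordered:
        found = find_admissible(child, obs, specs, children, admissible, ruled_out)
        if found is not None:
            return found
    return None


# Hypothetical toy hierarchy.
specs: Dict[str, Spec] = {
    "apple": {},
    "granny": {"background": lambda v: v == "green"},
    "chinese-granny": {"background": lambda v: v == "green",
                       "area": lambda v: v == "china"},
    "dutch-granny": {"background": lambda v: v == "green",
                     "area": lambda v: v == "netherlands"},
    "delicious": {"background": lambda v: v == "red"},
}
children = {"apple": ["granny", "delicious"],
            "granny": ["chinese-granny", "dutch-granny"]}
complete = lambda s: s[0] == 0 and s[2] == 0   # complete coverage
inconsistent = lambda s: s[0] > 0              # default exclusion criterion

result = find_admissible("apple", {"background": "green", "area": "china"},
                         specs, children, complete, inconsistent)
```

With these inputs the search descends from `apple` through `granny` to `chinese-granny`. The pruning step is only sound under the assumption that refinements of a ruled-out solution are themselves ruled out.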
Four Assumptions in Main PSMs
• No cycles in abstraction hierarchy
• No cycles in refinement hierarchy
• At least one class in the solution space is an admissible solution
• The solution refinement hierarchy is consistent with the candidate exclusion criterion. That is, if sol is ruled out, all refinements of sol can also be ruled out
Example
• Apple Domain
• Originally developed in Amsterdam
• Solutions = Apple Types = {granny, noble, delicious...}
• Hierarchy of Apple Types
• Features = {bkg-colour, fg-colour, rusty....}
• Pretty trivial really!
Mapping Solutions and Obs to Apples
(def-relation-mapping solution :up
  ((solution ?x)
   if
   (or (= ?x apple)
       (subclass-of ?x apple))))

(def-relation-mapping observable :up
  ((observable ?x)
   if
   (or (== ?x (?f ?v ?obs))
       (== ?x (?f ?v)))))
More Relation Mappings
(def-relation-mapping has-observable-feature :up
  ((has-observable-feature ?x ?f)
   if
   (or (== ?x (?f ?v ?obs))
       (== ?x (?f ?v)))))

(def-relation-mapping has-observable-value :up
  ((has-observable-value ?x ?v)
   if
   (or (== ?x (?f ?v ?obs))
       (== ?x (?f ?v)))))

(def-relation-mapping directly-abstracts-from :up
  ((directly-abstracts-from ?ob ?obs)
   if
   (== ?ob (?f ?v ?obs))))
Sample Abstractor
(def-instance sugar-abstractor abstractor
  ((has-body '(lambda (?obs)
                (in-environment
                 ((?v . (observables-feature-value ?obs 'sugar)))
                 (cond ((>= ?v 70)
                        (list-of 'sweet-level 'high (list-of (list-of 'sugar ?v))))
                       ((and (< ?v 70) (> ?v 40))
                        (list-of 'sweet-level 'medium (list-of (list-of 'sugar ?v))))
                       ((<= ?v 40)
                        (list-of 'sweet-level 'low (list-of (list-of 'sugar ?v))))))))
   (applicability-condition
    (kappa (?obs)
           (member 'sugar (all-features-in-observables ?obs))))))
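For readers less familiar with OCML, the same abstractor can be approximated in Python. This is a rough transliteration for illustration only: observables are assumed to be a dict from feature name to value, and the result mirrors the (feature value source-observables) triple built by the OCML body. Returning `None` stands in for the applicability condition failing.

```python
from typing import Any, Dict, List, Optional, Tuple

Abstracted = Tuple[str, str, List[Tuple[str, Any]]]


def sugar_abstractor(obs: Dict[str, Any]) -> Optional[Abstracted]:
    """Abstract a numeric 'sugar' reading into a qualitative 'sweet-level'."""
    if "sugar" not in obs:        # applicability condition
        return None
    v = obs["sugar"]
    if v >= 70:
        level = "high"
    elif v > 40:                  # i.e. 40 < v < 70
        level = "medium"
    else:                         # v <= 40
        level = "low"
    # The third element records the observable this abstraction was derived from.
    return ("sweet-level", level, [("sugar", v)])
```

Keeping the source observable in the result matches the `directly-abstracts-from` mapping, which reads it back out of the third position.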
Generic (reusable) Refiner
(def-instance refinement-through-subclass-of-links refiner
  "If the solution space is specified by means of classes arranged in a subclass-of hierarchy, then this is a good refiner to use"
  ((has-body '(lambda (?sol)
                (setofall ?sub (direct-subclass-of ?sub ?sol))))
   (applicability-condition
    (kappa (?sol)
           (and (class ?sol)
                (exists ?sub (direct-subclass-of ?sub ?sol)))))))
Evaluation/Results
• All PSMs successfully tested on the apple domain
• Assumptions also successfully tested in the domain
• Library available online in WebOnto
Next Tasks
• Start work on Internet Reasoning Service
• Approach: Ever increasing levels of intelligent support
  • Browsing/Navigation/Manual PSM Configuration
  • Intelligent Assistant
    • Semi-automated component selection/configuration
  • Intelligent Broker
    • Multiple libraries/multiple platforms/symbol-level interoperability
• Application to more complex domains
  • Scientific Classification, Selection of Manufacturing Tech.
Possible Platforms for IRS
• Specialized WebOnto Configuration
• Protégé
  • Integrate Protégé with OCML Library
  • Collaboration with Stanford (i.e., Monica)
  • Dedicated Tabs to support PSM selection/reuse
• New Java/Lisp Tool
  • Java Applets interfaced with library sitting on Lisp server
Classification Library in OCML (at the end of IBROW 1)
• Task spec (TaskSpec1)
• Flat classification PSM (GenPSM1)
• Applied to apple and Rocky-III domains