Knowledge in Learning (Logical formulation of Learning) CS570 Lecture Notes by Jin Hyung Kim Computer Science Department KAIST
Learning General Logical Descriptions • Find (general) logical descriptions consistent with sample data (examples) • logical connections among examples and the goal (concept) • Iteratively refine the hypothesis space by observing examples
Logical description • Description of an example X1 • Classification of the object • Examples: positive examples and negative examples • Training set: the conjunction of all description and classification sentences
Extension • Each hypothesis predicts a set of examples of the goal predicate • This set of examples is called the extension of the predicate • Two hypotheses with different extensions are logically inconsistent with each other • They disagree on at least one example • Hypothesis space H = { H1, H2, …, Hn } • The hypothesis space is H1 ∨ H2 ∨ H3 ∨ H4 • If example I is inconsistent with H2 and H3, • then the new hypothesis space is H1 ∨ H4
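The elimination step above can be sketched in a few lines of Python. This is a minimal illustration (the hypotheses H1–H4 and the toy universe are made up), representing each hypothesis directly by its extension:

```python
# Hypothetical example: four hypotheses over a toy universe, each
# represented by its extension (the set of objects it predicts positive).
universe = {"a", "b", "c", "d"}
hypotheses = {"H1": {"a", "b"}, "H2": {"a", "c"}, "H3": {"c"}, "H4": {"a", "b", "d"}}

def agrees(ext, example, is_positive):
    # A hypothesis agrees with a positive example iff the example lies in
    # its extension, and with a negative example iff it does not.
    return (example in ext) == is_positive

# A positive example "b" eliminates H2 and H3, leaving H1 v H4:
remaining = {name for name, ext in hypotheses.items() if agrees(ext, "b", True)}
print(remaining)
```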
Learning is a Search Process • Inductive learning is a search process that gradually eliminates inconsistent hypotheses • The hypothesis space is vast (often infinite) • Enumeration is impossible • Current-best hypothesis search • Least-commitment search (version space)
Inconsistent Examples • false-negative example • Hi says negative, but example is positive • need generalization • false-positive example • Hi says positive, but example is negative • need specialization
Generalization / Specialization • Specialization and generalization relationship • C1 ⇒ C2, e.g., (blue ∧ book) ⇒ book • The more general C2 is derived from C1 by dropping conditions • The weaker definition allows a larger set of positive examples • The relationships are transitive • Generalization example • Hypothesis: ∀x boy(x) ∧ KAIST(x) ⇒ smart(x) • Example: ¬boy(x1) ∧ KAIST(x1) ∧ smart(x1) (a false negative) • Generalization: ∀x KAIST(x) ⇒ smart(x)
Specialization • By adding an extra condition to the definition • Or by removing disjuncts from a disjunctive definition • Specialization example • Hypothesis: ∀x KAIST(x) ⇒ smart(x) • Example: ¬boy(x2) ∧ KAIST(x2) ∧ ¬smart(x2) (a false positive) • Specialization: ∀x boy(x) ∧ KAIST(x) ⇒ smart(x) • Removing a disjunct: KAIST(x2) vs KAIST(x2) ∨ Postech(x2)
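Both operations can be checked on extensions. Below is a small Python sketch (the mini-database of people and the property names are hypothetical) showing that dropping a condition, as in the generalization example above, can only enlarge the extension:

```python
# Hypothetical mini-database: each object maps to the predicates it satisfies.
people = {"x1": {"boy", "KAIST"}, "x2": {"KAIST"}, "x3": {"boy", "Postech"}}

def extension(conditions):
    """Objects satisfying every predicate in the conjunction."""
    return {x for x, props in people.items() if conditions <= props}

h_specific = {"boy", "KAIST"}  # boy(x) ∧ KAIST(x) ⇒ smart(x)
h_general = {"KAIST"}          # generalization: drop the boy(x) condition

# Generalization (dropping a condition) can only enlarge the extension;
# specialization (adding a condition) can only shrink it.
assert extension(h_specific) <= extension(h_general)
print(extension(h_specific), extension(h_general))
```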
Why Search? • We want the simplest hypothesis • There are several possible specializations and generalizations • We have to choose the best guess • A wrong choice may lead to an unrecoverable situation • Backtracking
Winston’s Arch Learning [Diagram: arch example with Support and Left-of relations between blocks] Copied/modified from cs.haifa.ac.il/courses/nn/AI.Learning.ppt
Least-commitment search • Instead of choosing a single hypothesis • Keep all, and only, the hypotheses consistent with the examples so far • Each instance either • has no effect, or • eliminates some of the hypotheses
Version Space Method V is the version space • V ← H • For every example x in the training set D do • Eliminate from V every hypothesis that does not agree with x • If V is empty then return failure • Return V But the size of V is enormous!!! Idea: Define a partial ordering on the hypotheses in H and only represent the upper and lower bounds of V for this ordering • Compared to the decision-tree method, this algorithm is: • incremental • least-commitment
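The naive filtering loop above is direct to implement. A minimal sketch, assuming a toy hypothesis class of integer thresholds (hypothesis t classifies x as positive iff x >= t):

```python
def version_space(H, data, agrees):
    """Naive version-space filter: keep only the hypotheses that agree
    with every example seen so far; fail if none remain."""
    V = set(H)
    for x, is_positive in data:
        V = {h for h in V if agrees(h, x, is_positive)}
        if not V:
            return None  # no consistent hypothesis
    return V

# Toy hypothesis class (assumed): threshold t labels x positive iff x >= t.
H = range(6)
agrees = lambda t, x, pos: (x >= t) == pos

# Positive example 3 forces t <= 3; negative example 1 forces t >= 2.
V = version_space(H, [(3, True), (1, False)], agrees)
print(V)
```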
Version Space • The generalization and specialization relations are represented as a graph called the version space • node: concept • arc: immediate specialization • General boundary set (G-set) • Each member of the G-set is consistent so far • No consistent hypothesis is more general • Specific boundary set (S-set) • Each member of the S-set is consistent so far • No consistent hypothesis is more specific • Everything between the G-set and the S-set is consistent
Version Space Algorithm • Iteratively eliminate candidates (version space update) • False positive for Si (too general): remove Si from the S-set • False negative for Si (too specific): replace Si by its immediate generalizations • False positive for Gi (too general): replace Gi by its immediate specializations • False negative for Gi (too specific): remove Gi from the G-set • Termination • one concept left: unique H • either the S-set or the G-set becomes empty: no consistent H • several H’s remain: disjunction of the H’s
Card Example copied from www.stanford.edu/class/cs121/slides/Lecture14-version-space.ppt Rewarded Card Example (r=1) ∨ … ∨ (r=10) ∨ (r=J) ∨ (r=Q) ∨ (r=K) ⇔ ANY-RANK(r) (r=1) ∨ … ∨ (r=10) ⇔ NUM(r) (r=J) ∨ (r=Q) ∨ (r=K) ⇔ FACE(r) (s=♠) ∨ (s=♥) ∨ (s=♦) ∨ (s=♣) ⇔ ANY-SUIT(s) (s=♠) ∨ (s=♣) ⇔ BLACK(s) (s=♥) ∨ (s=♦) ⇔ RED(s) • A hypothesis is any sentence of the form: R(r) ∧ S(s) ⇒ REWARD([r,s]) • where: • R(r) is ANY-RANK(r), NUM(r), FACE(r), or (r=j) • S(s) is ANY-SUIT(s), BLACK(s), RED(s), or (s=k)
Simplified Representation • For simplicity, we represent a concept by rs, with: • r = a, n, f, 1, …, 10, j, q, k • s = a, b, r, ♠, ♥, ♦, ♣ For example: • n♥ represents: NUM(r) ∧ (s=♥) ⇒ REWARD([r,s]) • aa represents: ANY-RANK(r) ∧ ANY-SUIT(s) ⇒ REWARD([r,s])
Extension of a Hypothesis The extension of a hypothesis h is the set of objects that verify h • Examples: • The extension of f♥ is: {j♥, q♥, k♥} • The extension of aa is the set of all cards
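The card domain and extensions can be sketched in Python. In this illustration the suit letters s/h/d/c stand in for ♠/♥/♦/♣, and the dictionary names are my own:

```python
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["s", "h", "d", "c"]  # stand-ins for spades, hearts, diamonds, clubs
CARDS = set(product(RANKS, SUITS))

R_PREDS = {"a": lambda r: True,                  # ANY-RANK
           "n": lambda r: r in range(1, 11),     # NUM
           "f": lambda r: r in ("j", "q", "k")}  # FACE
S_PREDS = {"a": lambda s: True,                  # ANY-SUIT
           "b": lambda s: s in ("s", "c"),       # BLACK
           "r": lambda s: s in ("h", "d")}       # RED

def extension(h):
    """Set of cards verifying hypothesis h = (R, S); a letter not in the
    predicate tables is read as a specific rank or suit."""
    R, S = h
    return {(r, s) for (r, s) in CARDS
            if (R_PREDS[R](r) if R in R_PREDS else r == R)
            and (S_PREDS[S](s) if S in S_PREDS else s == S)}

# The extension of f♥ (face of hearts) is {j♥, q♥, k♥}:
print(extension(("f", "h")))
# The extension of aa is the set of all 52 cards:
print(len(extension(("a", "a"))))
```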
More General/Specific Relation • Let h1 and h2 be two hypotheses in H • h1 is more general than h2 iff the extension of h1 is a proper superset of the extension of h2 • Examples: • aa is more general than f♥ • f♥ is more general than q♥ • fr and nr are not comparable
More General/Specific Relation • Let h1 and h2 be two hypotheses in H • h1 is more general than h2 iff the extension of h1 is a proper superset of h2’s • The inverse of the “more general” relation is the “more specific” relation • The “more general” relation defines a partial ordering on the hypotheses in H
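The "more general" test reduces to a proper-superset check on extensions, and as a partial order it leaves some pairs incomparable. A sketch with illustrative extensions over the ranks of one fixed suit:

```python
faces = {"j", "q", "k"}
nums = set(range(1, 11))
# Illustrative extensions (ranks of a single suit, for brevity).
ext = {"aa": nums | faces, "f": faces, "q": {"q"}, "n": nums}

def more_general(h1, h2):
    # h1 is more general than h2 iff ext(h1) is a PROPER superset of ext(h2).
    return ext[h1] > ext[h2]

assert more_general("aa", "f") and more_general("f", "q")
# f and n are incomparable: neither extension contains the other.
assert not more_general("f", "n") and not more_general("n", "f")
```

Using the proper superset (`>`) rather than `>=` matters: a hypothesis is never "more general" than one with the same extension.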
Example: Subset of the Partial Order [Diagram: fragment of the lattice: aa at the top; na and ab below it; then 4a, nb, a♣; then 4b, n♣; and 4♣ at the bottom]
Construction of the Ordering Relation [Diagram: rank ordering: a above n and f; n above 1, …, 10; f above j, …, k; the suit ordering over a, b, r, and the four suits is analogous]
G-Boundary / S-Boundary of V • A hypothesis in V is most general iff no hypothesis in V is more general • G-boundary G of V: set of most general hypotheses in V • A hypothesis in V is most specific iff no hypothesis in V is more specific • S-boundary S of V: set of most specific hypotheses in V
Example: G-/S-Boundaries of V [Diagram: G = {aa}; S = the set of single-card hypotheses 1♠, …, k♣] Now suppose that 4♣ is given as a positive example. We replace every hypothesis in S whose extension does not contain 4♣ by its generalization set.
Example: G-/S-Boundaries of V [Diagram: G = {aa}, S = {4♣}] Here, both G and S have size 1. This is not the case in general!
Example: G-/S-Boundaries of V The generalization set of a hypothesis h is the set of the hypotheses that are immediately more general than h. [Diagram: generalization set of 4♣ highlighted in the lattice] Let 7♣ be the next (positive) example.
Example: G-/S-Boundaries of V [Diagram: S is minimally generalized, from 4♣ to n♣, to cover the positive example 7♣]
Example: G-/S-Boundaries of V [Diagram: specialization set of aa highlighted in the lattice] Let 5♥ be the next (negative) example.
Example: G-/S-Boundaries of V G and S, and all hypotheses in between, form exactly the version space. 1. If a hypothesis between G and S disagreed with an example x, then a hypothesis in G or S would also disagree with x, and hence would have been removed.
Example: G-/S-Boundaries of V G and S, and all hypotheses in between, form exactly the version space. 2. If there were a hypothesis not in this set that agreed with all examples, then it would have to be either no more specific than any member of G (but then it would be in G) or no more general than any member of S (but then it would be in S).
Example: G-/S-Boundaries of V At this stage, the version space is {ab, nb, a♣, n♣}. Do 8♥, 6♣, j♠ satisfy CONCEPT? No, Yes, Maybe
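The No/Yes/Maybe answers follow from a simple voting rule over the version space: a query is positive only if every remaining hypothesis accepts it, negative if none does, and undetermined otherwise. A sketch (the two extensions below are illustrative stand-ins, not the card lattice):

```python
def predict(version_space_extensions, x):
    """Classify x by unanimity over the remaining hypotheses."""
    votes = [x in ext for ext in version_space_extensions]
    if all(votes):
        return "Yes"
    if not any(votes):
        return "No"
    return "Maybe"

# Hypothetical extensions of two surviving hypotheses:
V = [{"4c", "7c", "2s"}, {"4c", "7c", "2s", "6c"}]
print(predict(V, "4c"), predict(V, "8h"), predict(V, "6c"))
```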
Example: G-/S-Boundaries of V [Diagram: version space {ab, nb, a♣, n♣}] Let 2♠ be the next (positive) example.
Example: G-/S-Boundaries of V [Diagram: version space {ab, nb}] Let j♠ be the next (negative) example.
Example: G-/S-Boundaries of V Examples so far: + 4♣, 7♣, 2♠; − 5♥, j♠. The version space is reduced to the single hypothesis nb: NUM(r) ∧ BLACK(s) ⇒ REWARD([r,s])
Example: G-/S-Boundaries of V Let us return to the version space {ab, nb, a♣, n♣} … and let 8♣ be the next (negative) example. The only most specific hypothesis disagrees with this example, hence no hypothesis in H agrees with all examples.
Example: G-/S-Boundaries of V Let us return to the version space {ab, nb, a♣, n♣} … and let j♥ be the next (positive) example. The only most general hypothesis disagrees with this example, hence no hypothesis in H agrees with all examples.
Version Space Update • x ← new example • If x is positive then (G,S) ← POSITIVE-UPDATE(G,S,x) • Else (G,S) ← NEGATIVE-UPDATE(G,S,x) • If G or S is empty then return failure
POSITIVE-UPDATE(G,S,x) • Eliminate all hypotheses in G that do not agree with x • Minimally generalize all hypotheses in S until they are consistent with x (using the generalization sets of the hypotheses) • Remove from S every hypothesis that is neither more specific than nor equal to a hypothesis in G (this step was not needed in the card example) • Remove from S every hypothesis that is more general than another hypothesis in S • Return (G,S)
NEGATIVE-UPDATE(G,S,x) • Eliminate all hypotheses in S that do not agree with x • Minimally specialize all hypotheses in G until they are consistent with x • Remove from G every hypothesis that is neither more general than nor equal to a hypothesis in S • Remove from G every hypothesis that is more specific than another hypothesis in G • Return (G,S)
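Under the extension-based ordering, the whole card example can be replayed by brute force: filter all 112 hypotheses against each example, then read off G and S as the maximal and minimal consistent hypotheses. This sketch (suit letters s/h/d/c stand in for the suit symbols) recomputes V from scratch rather than updating G and S incrementally, so it checks the boundaries rather than implementing the update procedures above:

```python
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["s", "h", "d", "c"]  # stand-ins for spades, hearts, diamonds, clubs
CARDS = set(product(RANKS, SUITS))

def r_ok(R, r):
    if R == "a": return True                 # ANY-RANK
    if R == "n": return r in range(1, 11)    # NUM
    if R == "f": return r in ("j", "q", "k") # FACE
    return r == R                            # specific rank

def s_ok(S, s):
    if S == "a": return True                 # ANY-SUIT
    if S == "b": return s in ("s", "c")      # BLACK
    if S == "r": return s in ("h", "d")      # RED
    return s == S                            # specific suit

def extension(h):
    R, S = h
    return {(r, s) for (r, s) in CARDS if r_ok(R, r) and s_ok(S, s)}

# All 16 x 7 = 112 hypotheses of the form R(r) ∧ S(s) ⇒ REWARD([r,s]).
H = list(product(["a", "n", "f"] + RANKS, ["a", "b", "r"] + SUITS))

def version_space(examples):
    """Brute-force V, with G/S as maximal/minimal extensions."""
    V = [h for h in H
         if all((ex in extension(h)) == pos for ex, pos in examples)]
    exts = {h: extension(h) for h in V}
    G = [h for h in V if not any(exts[g] > exts[h] for g in V)]
    S = [h for h in V if not any(exts[g] < exts[h] for g in V)]
    return V, G, S

examples = [((4, "c"), True), ((7, "c"), True), ((5, "h"), False),
            ((2, "s"), True), (("j", "s"), False)]

# After +4♣, +7♣, −5♥ the boundaries are G = {ab} and S = {n♣}:
_, G3, S3 = version_space(examples[:3])
print(G3, S3)
# All five examples collapse V to the single hypothesis nb (NUM ∧ BLACK):
V, G, S = version_space(examples)
print(V)
```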
VSL vs DTL • Decision tree learning (DTL) is more efficient if all examples are given in advance; else, it may produce successive hypotheses, each poorly related to the previous one • Version space learning (VSL) is incremental • DTL can produce simplified hypotheses that do not agree with all examples • DTL has been more widely used in practice