Knowledge in Learning (Logical formulation of Learning) CS570 Lecture Notes by Jin Hyung Kim Computer Science Department KAIST
Learning General Logical Descriptions • Find (general) logical descriptions consistent with sample data (examples) • logical connections among examples and the goal (concept) • Iteratively refine the hypothesis space by observing examples
Logical description • Description of an example X1 • Classification of the object • Examples: positive examples and negative examples • Training set: the conjunction of all description and classification sentences
Extension • Each hypothesis predicts a set of examples of the goal predicate • This set of examples is called the extension of the predicate • Two hypotheses with different extensions are logically inconsistent with each other • They disagree on at least one example • Hypothesis space H = { H1, H2, …, Hn } • The hypothesis space is H1 ∨ H2 ∨ H3 ∨ H4 • If example I is inconsistent with H2 and H3, • then the new hypothesis space is H1 ∨ H4
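The elimination step above can be sketched in a few lines of Python. This is a minimal illustration (the hypotheses H1–H4 and the toy universe are made up), representing each hypothesis directly by its extension:

```python
# Hypothetical example: four hypotheses over a toy universe, each
# represented by its extension (the set of objects it predicts positive).
universe = {"a", "b", "c", "d"}
hypotheses = {"H1": {"a", "b"}, "H2": {"a", "c"}, "H3": {"c"}, "H4": {"a", "b", "d"}}

def agrees(ext, example, is_positive):
    # A hypothesis agrees with a positive example iff the example lies in
    # its extension, and with a negative example iff it does not.
    return (example in ext) == is_positive

# A positive example "b" eliminates H2 and H3, leaving H1 v H4:
remaining = {name for name, ext in hypotheses.items() if agrees(ext, "b", True)}
print(remaining)
```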
Learning is a Search Process • Inductive learning is a search process that gradually eliminates inconsistent hypotheses • The hypothesis space is vast (often infinite) • Enumeration is impossible • Current-best hypothesis search • Least-commitment search (version space)
Inconsistent Examples • false-negative example • Hi says negative, but example is positive • need generalization • false-positive example • Hi says positive, but example is negative • need specialization
Generalization / Specialization • Specialization and generalization relationship • C1 ⇒ C2, e.g., (blue ∧ book) ⇒ book • The more general C2 is derived from C1 by dropping conditions • The weaker definition allows a larger set of positive examples • The relationships are transitive • Generalization example • Hypothesis: ∀x boy(x) ∧ KAIST(x) ⇒ smart(x) • Example: ¬boy(x1) ∧ KAIST(x1) ∧ smart(x1) (a false negative) • Generalization: ∀x KAIST(x) ⇒ smart(x)
Specialization • By adding an extra condition to the definition • Or by removing disjuncts from a disjunctive definition • Specialization example • Hypothesis: ∀x KAIST(x) ⇒ smart(x) • Example: ¬boy(x2) ∧ KAIST(x2) ∧ ¬smart(x2) (a false positive) • Specialization: ∀x boy(x) ∧ KAIST(x) ⇒ smart(x) • Removing a disjunct: KAIST(x2) vs KAIST(x2) ∨ Postech(x2)
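Both operations can be checked on extensions. Below is a small Python sketch (the mini-database of people and the property names are hypothetical) showing that dropping a condition, as in the generalization example above, can only enlarge the extension:

```python
# Hypothetical mini-database: each object maps to the predicates it satisfies.
people = {"x1": {"boy", "KAIST"}, "x2": {"KAIST"}, "x3": {"boy", "Postech"}}

def extension(conditions):
    """Objects satisfying every predicate in the conjunction."""
    return {x for x, props in people.items() if conditions <= props}

h_specific = {"boy", "KAIST"}  # boy(x) ∧ KAIST(x) ⇒ smart(x)
h_general = {"KAIST"}          # generalization: drop the boy(x) condition

# Generalization (dropping a condition) can only enlarge the extension;
# specialization (adding a condition) can only shrink it.
assert extension(h_specific) <= extension(h_general)
print(extension(h_specific), extension(h_general))
```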
Why Search? • We want the simplest hypothesis • There are several possible specializations and generalizations • We have to choose the best guess • A wrong choice may lead to an unrecoverable situation • Backtracking
Winston’s Arch Learning [Diagram: arch example with Support and Left-of relations between blocks] Copied/modified from cs.haifa.ac.il/courses/nn/AI.Learning.ppt
Least-commitment search • Instead of choosing a single hypothesis • Keep all, and only, the hypotheses consistent with the examples so far • Each instance either • has no effect, or • eliminates some of the hypotheses
Version Space Method V is the version space • V ← H • For every example x in the training set D do • Eliminate from V every hypothesis that does not agree with x • If V is empty then return failure • Return V But the size of V is enormous!!! Idea: Define a partial ordering on the hypotheses in H and only represent the upper and lower bounds of V for this ordering • Compared to the decision-tree method, this algorithm is: • incremental • least-commitment
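The naive filtering loop above is direct to implement. A minimal sketch, assuming a toy hypothesis class of integer thresholds (hypothesis t classifies x as positive iff x >= t):

```python
def version_space(H, data, agrees):
    """Naive version-space filter: keep only the hypotheses that agree
    with every example seen so far; fail if none remain."""
    V = set(H)
    for x, is_positive in data:
        V = {h for h in V if agrees(h, x, is_positive)}
        if not V:
            return None  # no consistent hypothesis
    return V

# Toy hypothesis class (assumed): threshold t labels x positive iff x >= t.
H = range(6)
agrees = lambda t, x, pos: (x >= t) == pos

# Positive example 3 forces t <= 3; negative example 1 forces t >= 2.
V = version_space(H, [(3, True), (1, False)], agrees)
print(V)
```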
Version Space • The generalization and specialization relations are represented as a graph called the version space • node: concept • arc: immediate specialization • General boundary set (G-set) • Each member of the G-set is consistent so far • No consistent hypothesis is more general • Specific boundary set (S-set) • Each member of the S-set is consistent so far • No consistent hypothesis is more specific • Everything between the G-set and the S-set is consistent
Version Space Algorithm • Iteratively eliminate candidates (version space update) • False positive for Si (too general): remove Si from the S-set • False negative for Si (too specific): replace Si by its immediate generalizations • False positive for Gi (too general): replace Gi by its immediate specializations • False negative for Gi (too specific): remove Gi from the G-set • Termination • one concept left: unique H • either the S-set or the G-set becomes empty: no consistent H • several H’s remain: disjunction of the H’s
Card Example copied from www.stanford.edu/class/cs121/slides/Lecture14-version-space.ppt Rewarded Card Example (r=1) ∨ … ∨ (r=10) ∨ (r=J) ∨ (r=Q) ∨ (r=K) ⇔ ANY-RANK(r) (r=1) ∨ … ∨ (r=10) ⇔ NUM(r) (r=J) ∨ (r=Q) ∨ (r=K) ⇔ FACE(r) (s=♠) ∨ (s=♥) ∨ (s=♦) ∨ (s=♣) ⇔ ANY-SUIT(s) (s=♠) ∨ (s=♣) ⇔ BLACK(s) (s=♥) ∨ (s=♦) ⇔ RED(s) • A hypothesis is any sentence of the form: R(r) ∧ S(s) ⇒ REWARD([r,s]) • where: • R(r) is ANY-RANK(r), NUM(r), FACE(r), or (r=j) • S(s) is ANY-SUIT(s), BLACK(s), RED(s), or (s=k)
Simplified Representation • For simplicity, we represent a concept by rs, with: • r = a, n, f, 1, …, 10, j, q, k • s = a, b, r, ♠, ♥, ♦, ♣ For example: • n♥ represents: NUM(r) ∧ (s=♥) ⇒ REWARD([r,s]) • aa represents: ANY-RANK(r) ∧ ANY-SUIT(s) ⇒ REWARD([r,s])
Extension of a Hypothesis The extension of a hypothesis h is the set of objects that verify h • Examples: • The extension of f♥ is: {j♥, q♥, k♥} • The extension of aa is the set of all cards
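The card domain and extensions can be sketched in Python. In this illustration the suit letters s/h/d/c stand in for ♠/♥/♦/♣, and the dictionary names are my own:

```python
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["s", "h", "d", "c"]  # stand-ins for spades, hearts, diamonds, clubs
CARDS = set(product(RANKS, SUITS))

R_PREDS = {"a": lambda r: True,                  # ANY-RANK
           "n": lambda r: r in range(1, 11),     # NUM
           "f": lambda r: r in ("j", "q", "k")}  # FACE
S_PREDS = {"a": lambda s: True,                  # ANY-SUIT
           "b": lambda s: s in ("s", "c"),       # BLACK
           "r": lambda s: s in ("h", "d")}       # RED

def extension(h):
    """Set of cards verifying hypothesis h = (R, S); a letter not in the
    predicate tables is read as a specific rank or suit."""
    R, S = h
    return {(r, s) for (r, s) in CARDS
            if (R_PREDS[R](r) if R in R_PREDS else r == R)
            and (S_PREDS[S](s) if S in S_PREDS else s == S)}

# The extension of f♥ (face of hearts) is {j♥, q♥, k♥}:
print(extension(("f", "h")))
# The extension of aa is the set of all 52 cards:
print(len(extension(("a", "a"))))
```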
More General/Specific Relation • Let h1 and h2 be two hypotheses in H • h1 is more general than h2 iff the extension of h1 is a proper superset of the extension of h2 • Examples: • aa is more general than f♥ • f♥ is more general than q♥ • fr and nr are not comparable
More General/Specific Relation • Let h1 and h2 be two hypotheses in H • h1 is more general than h2 iff the extension of h1 is a proper superset of h2’s • The inverse of the “more general” relation is the “more specific” relation • The “more general” relation defines a partial ordering on the hypotheses in H
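The "more general" test reduces to a proper-superset check on extensions, and as a partial order it leaves some pairs incomparable. A sketch with illustrative extensions over the ranks of one fixed suit:

```python
faces = {"j", "q", "k"}
nums = set(range(1, 11))
# Illustrative extensions (ranks of a single suit, for brevity).
ext = {"aa": nums | faces, "f": faces, "q": {"q"}, "n": nums}

def more_general(h1, h2):
    # h1 is more general than h2 iff ext(h1) is a PROPER superset of ext(h2).
    return ext[h1] > ext[h2]

assert more_general("aa", "f") and more_general("f", "q")
# f and n are incomparable: neither extension contains the other.
assert not more_general("f", "n") and not more_general("n", "f")
```

Using the proper superset (`>`) rather than `>=` matters: a hypothesis is never "more general" than one with the same extension.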
Example: Subset of the Partial Order [Diagram: fragment of the lattice: aa at the top; na and ab below it; then 4a, nb, a♣; then 4b, n♣; and 4♣ at the bottom]
Construction of the Ordering Relation [Diagram: rank ordering: a above n and f; n above 1, …, 10; f above j, …, k; the suit ordering over a, b, r, and the four suits is analogous]
G-Boundary / S-Boundary of V • A hypothesis in V is most general iff no hypothesis in V is more general • G-boundary G of V: set of most general hypotheses in V • A hypothesis in V is most specific iff no hypothesis in V is more specific • S-boundary S of V: set of most specific hypotheses in V
Example: G-/S-Boundaries of V [Diagram: G = {aa}; S = the set of single-card hypotheses 1♠, …, k♣] Now suppose that 4♣ is given as a positive example. We replace every hypothesis in S whose extension does not contain 4♣ by its generalization set.
Example: G-/S-Boundaries of V [Diagram: G = {aa}, S = {4♣}] Here, both G and S have size 1. This is not the case in general!
Example: G-/S-Boundaries of V The generalization set of a hypothesis h is the set of the hypotheses that are immediately more general than h. [Diagram: generalization set of 4♣ highlighted in the lattice] Let 7♣ be the next (positive) example.
Example: G-/S-Boundaries of V [Diagram: S is minimally generalized, from 4♣ to n♣, to cover the positive example 7♣]
Example: G-/S-Boundaries of V [Diagram: specialization set of aa highlighted in the lattice] Let 5♥ be the next (negative) example.
Example: G-/S-Boundaries of V G and S, and all hypotheses in between, form exactly the version space. 1. If a hypothesis between G and S disagreed with an example x, then a hypothesis in G or S would also disagree with x, and hence would have been removed.
Example: G-/S-Boundaries of V G and S, and all hypotheses in between, form exactly the version space. 2. If there were a hypothesis not in this set that agreed with all examples, then it would have to be either no more specific than any member of G (but then it would be in G) or no more general than any member of S (but then it would be in S).
Example: G-/S-Boundaries of V At this stage, the version space is {ab, nb, a♣, n♣}. Do 8♥, 6♣, j♠ satisfy CONCEPT? No, Yes, Maybe
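The No/Yes/Maybe answers follow from a simple voting rule over the version space: a query is positive only if every remaining hypothesis accepts it, negative if none does, and undetermined otherwise. A sketch (the two extensions below are illustrative stand-ins, not the card lattice):

```python
def predict(version_space_extensions, x):
    """Classify x by unanimity over the remaining hypotheses."""
    votes = [x in ext for ext in version_space_extensions]
    if all(votes):
        return "Yes"
    if not any(votes):
        return "No"
    return "Maybe"

# Hypothetical extensions of two surviving hypotheses:
V = [{"4c", "7c", "2s"}, {"4c", "7c", "2s", "6c"}]
print(predict(V, "4c"), predict(V, "8h"), predict(V, "6c"))
```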
Example: G-/S-Boundaries of V [Diagram: version space {ab, nb, a♣, n♣}] Let 2♠ be the next (positive) example.
Example: G-/S-Boundaries of V [Diagram: version space {ab, nb}] Let j♠ be the next (negative) example.
Example: G-/S-Boundaries of V Examples so far: + 4♣, 7♣, 2♠; − 5♥, j♠. The version space is reduced to the single hypothesis nb: NUM(r) ∧ BLACK(s) ⇒ REWARD([r,s])
Example: G-/S-Boundaries of V Let us return to the version space {ab, nb, a♣, n♣} … and let 8♣ be the next (negative) example. The only most specific hypothesis disagrees with this example, hence no hypothesis in H agrees with all examples.
Example: G-/S-Boundaries of V Let us return to the version space {ab, nb, a♣, n♣} … and let j♥ be the next (positive) example. The only most general hypothesis disagrees with this example, hence no hypothesis in H agrees with all examples.
Version Space Update • x ← new example • If x is positive then (G,S) ← POSITIVE-UPDATE(G,S,x) • Else (G,S) ← NEGATIVE-UPDATE(G,S,x) • If G or S is empty then return failure
POSITIVE-UPDATE(G,S,x) • Eliminate all hypotheses in G that do not agree with x • Minimally generalize all hypotheses in S until they are consistent with x (using the generalization sets of the hypotheses) • Remove from S every hypothesis that is neither more specific than nor equal to a hypothesis in G (this step was not needed in the card example) • Remove from S every hypothesis that is more general than another hypothesis in S • Return (G,S)
NEGATIVE-UPDATE(G,S,x) • Eliminate all hypotheses in S that do not agree with x • Minimally specialize all hypotheses in G until they are consistent with x • Remove from G every hypothesis that is neither more general than nor equal to a hypothesis in S • Remove from G every hypothesis that is more specific than another hypothesis in G • Return (G,S)
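Under the extension-based ordering, the whole card example can be replayed by brute force: filter all 112 hypotheses against each example, then read off G and S as the maximal and minimal consistent hypotheses. This sketch (suit letters s/h/d/c stand in for the suit symbols) recomputes V from scratch rather than updating G and S incrementally, so it checks the boundaries rather than implementing the update procedures above:

```python
from itertools import product

RANKS = list(range(1, 11)) + ["j", "q", "k"]
SUITS = ["s", "h", "d", "c"]  # stand-ins for spades, hearts, diamonds, clubs
CARDS = set(product(RANKS, SUITS))

def r_ok(R, r):
    if R == "a": return True                 # ANY-RANK
    if R == "n": return r in range(1, 11)    # NUM
    if R == "f": return r in ("j", "q", "k") # FACE
    return r == R                            # specific rank

def s_ok(S, s):
    if S == "a": return True                 # ANY-SUIT
    if S == "b": return s in ("s", "c")      # BLACK
    if S == "r": return s in ("h", "d")      # RED
    return s == S                            # specific suit

def extension(h):
    R, S = h
    return {(r, s) for (r, s) in CARDS if r_ok(R, r) and s_ok(S, s)}

# All 16 x 7 = 112 hypotheses of the form R(r) ∧ S(s) ⇒ REWARD([r,s]).
H = list(product(["a", "n", "f"] + RANKS, ["a", "b", "r"] + SUITS))

def version_space(examples):
    """Brute-force V, with G/S as maximal/minimal extensions."""
    V = [h for h in H
         if all((ex in extension(h)) == pos for ex, pos in examples)]
    exts = {h: extension(h) for h in V}
    G = [h for h in V if not any(exts[g] > exts[h] for g in V)]
    S = [h for h in V if not any(exts[g] < exts[h] for g in V)]
    return V, G, S

examples = [((4, "c"), True), ((7, "c"), True), ((5, "h"), False),
            ((2, "s"), True), (("j", "s"), False)]

# After +4♣, +7♣, −5♥ the boundaries are G = {ab} and S = {n♣}:
_, G3, S3 = version_space(examples[:3])
print(G3, S3)
# All five examples collapse V to the single hypothesis nb (NUM ∧ BLACK):
V, G, S = version_space(examples)
print(V)
```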
VSL vs DTL • Decision tree learning (DTL) is more efficient if all examples are given in advance; else, it may produce successive hypotheses, each poorly related to the previous one • Version space learning (VSL) is incremental • DTL can produce simplified hypotheses that do not agree with all examples • DTL has been more widely used in practice