Rough Sets Theory / Logical Analysis of Data. Monday, November 26, 2007. Johanna GOLD
Introduction • Comparison of two theories for rule induction. • Different methodologies • Same results?
Generalities • Set of objects described by attributes. • Each object belongs to a class. • We want decision rules.
Approaches • There are two approaches: • Rough Sets Theory (RST) • Logical Analysis of Data (LAD) • Goal : compare them
Contents • Rough Sets Theory • Logical Analysis of Data • Comparison • Inconsistencies
Inconsistencies • Two examples having exactly the same values for all attributes, but belonging to two different classes. • Example: two sick people have the same symptoms but different diseases.
Covered by RST • RST does not correct or remove inconsistencies. • For each class: determination of lower and upper approximations.
Approximations • Lower: objects that certainly belong to the class. • Upper: objects that may belong to the class.
Impact on rules • Lower approximation → certain rules • Upper approximation → possible rules
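As a sketch, the two approximations can be computed by grouping objects that are indiscernible (identical on all attributes); the decision table below is a hypothetical toy example, not the deck's data:

```python
from collections import defaultdict

def approximations(table, target_class):
    """table: list of (attribute_tuple, class_label) pairs."""
    # Group objects that are indiscernible (identical attribute values).
    groups = defaultdict(list)
    for i, (attrs, _label) in enumerate(table):
        groups[attrs].append(i)
    lower, upper = set(), set()
    for members in groups.values():
        labels = {table[i][1] for i in members}
        if labels == {target_class}:
            lower.update(members)   # certainly in the class -> certain rules
        if target_class in labels:
            upper.update(members)   # possibly in the class -> possible rules
    return lower, upper

table = [(("high", "yes"), "+"),
         (("high", "yes"), "-"),   # inconsistent with the first object
         (("low",  "no"),  "+")]
low, up = approximations(table, "+")
```

The two inconsistent objects fall outside the lower approximation but inside the upper one, which is exactly what separates certain rules from possible rules.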
Pretreatment • Rule induction directly on numerical data → poor rules → too many rules. • Pretreatment is needed.
Discretization • Goal : convert numerical data into discrete data. • Principle : determination of cut points in order to divide domains into successive intervals.
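A minimal discretization sketch: place a cut point at the midpoint between consecutive distinct values whose class sets differ. This is a simplification (MLEM2 and MODLEM use more refined criteria); the heights and classes below are reconstructed from the example slides later in the deck:

```python
def cut_points(values, labels):
    """Midpoints between consecutive distinct values whose class sets differ."""
    by_value = {}
    for v, c in zip(values, labels):
        by_value.setdefault(v, set()).add(c)   # classes seen at each value
    vs = sorted(by_value)
    cuts = []
    for a, b in zip(vs, vs[1:]):
        if by_value[a] != by_value[b]:         # class mix changes -> cut here
            cuts.append((a + b) / 2)
    return cuts

# Heights of objects 1, 3, 5, 2, 4 with their Attraction classes
cuts = cut_points([160, 160, 160, 170, 180], ["-", "+", "-", "+", "-"])
```

On these values the procedure yields the two cut points 165 and 175 used in the deck's example.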
Algorithms • First algorithm: LEM2 • Improved algorithms: • Include the pretreatment • MLEM2, MODLEM, …
LEM2 • Induction of certain rules from the lower approximation. • Induction of possible rules from the upper approximation. • Same procedure
Definitions (1) • For an attribute x and its value v, a block [(x,v)] of attribute-value pair (x,v) is all the cases where the attribute x has the value v. • Ex : [(Age,21)]=[Martha] [(Age,22)]=[David ; Audrey]
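The blocks [(x,v)] can be sketched as a simple grouping over the decision table; the table below reuses the slide's Age example:

```python
from collections import defaultdict

def blocks(table):
    """table: dict object_id -> dict attribute -> value.
    Returns the block [(x, v)]: all cases where attribute x has value v."""
    b = defaultdict(set)
    for obj, attrs in table.items():
        for x, v in attrs.items():
            b[(x, v)].add(obj)
    return b

people = {"Martha": {"Age": 21}, "David": {"Age": 22}, "Audrey": {"Age": 22}}
bl = blocks(people)
```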
Definitions (2) • Let B be a non-empty lower or upper approximation of a concept represented by a decision-value pair (d,w). • Ex : (level,middle)→B=[obj1 ; obj5 ; obj7]
Definitions (3) • Let T be a set of attribute-value pairs (a,v). • Set B depends on set T if and only if: ∅ ≠ [T] = ∩_{(a,v)∈T} [(a,v)] ⊆ B
Definitions (4) • A set T is a minimal complex of B if and only if B depends on T and there is no proper subset T' of T such that B depends on T'.
Definitions (5) • Let 𝒯 be a non-empty collection of non-empty sets of attribute-value pairs. • Each member T of 𝒯 is a set of attribute-value pairs (a,v).
Definitions (6) • 𝒯 is a local cover of B if and only if: • each member T of 𝒯 is a minimal complex of B, • ∪_{T∈𝒯} [T] = B, • 𝒯 is minimal (it contains the smallest possible number of minimal complexes).
Algorithm principle • LEM2's output is a local cover for each approximation of each concept of the decision table. • It then converts the local covers into decision rules.
Heuristics details Among the possible blocks, we choose the one: • With the highest priority • With the largest intersection with the goal • With the smallest cardinality
Heuristics details • As long as the term is not a minimal complex, pairs are added. • As long as the collection is not a local cover, minimal complexes are added.
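The pair-selection heuristic above can be sketched as follows (the priority criterion is omitted for brevity; block contents are taken from the example slides later in the deck):

```python
def pick_pair(goal, blocks):
    """blocks: dict mapping (attribute, value) pairs to sets of object ids.
    Pick the pair whose block has the largest intersection with the goal,
    breaking ties by the smallest block cardinality."""
    return max(blocks,
               key=lambda pair: (len(blocks[pair] & goal), -len(blocks[pair])))

blocks = {("Hair", "Blond"): {1, 2},
          ("Hair", "Black"): {4, 5, 6},
          ("Height", "160..165"): {1, 3, 5}}
chosen = pick_pair({1, 4, 5, 6}, blocks)
```

With the goal [(Attraction,-)] = {1,4,5,6}, the block [(Hair, Black)] has the largest intersection (three objects), matching the choice made in the example.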
Illustration • Illustration through an example. • We consider that the pretreatment has already been done.
Cut points • For the attribute Height, we have the values 160, 170 and 180. • The pretreatment gives us two cut points: 165 and 175.
Blocks [(a,v)] • [(Height, 160..165)]={1,3,5} • [(Height, 165..180)]={2,4} • [(Height, 160..175)]={1,2,3,5} • [(Height, 175..180)]={4} • [(Hair, Blond)]={1,2} • [(Hair, Red)]={3} • [(Hair, Black)]={4,5,6}
First concept • G = B = [(Attraction,-)] = {1,4,5,6} • Here there are no inconsistencies. If there were, this is where we would have to choose between the lower and the upper approximation.
Eligible pairs • Pairs (a,v) such that [(a,v)]∩[(Attraction,-)]≠Ø • (Height,160..165) • (Height,165..180) • (Height,160..175) • (Height,175..180) • (Hair,Blond) • (Hair,Black)
Choice of a pair • We choose the most appropriate pair, that is, the (a,v) for which | [(a,v)] ∩ [(Attraction,-)] | is the largest. • Here: (Hair, Black)
Minimal complex • The pair (Hair, Black) is a minimal complex because: [(Hair, Black)] = {4,5,6} ⊆ {1,4,5,6} = [(Attraction,-)]
New concept • B = [(Attraction,-)] – [(Hair,Black)] = {1,4,5,6} - {4,5,6} = {1}
Choice of a pair (1) • Among the pairs (Height,160..165), (Height,160..175) and (Hair, Blond). • The intersections having the same cardinality, we choose the pair whose block has the smallest cardinality: (Hair, Blond)
Choice of a pair (2) • Problem: • (Hair, Blond) is not a minimal complex on its own: [(Hair, Blond)] = {1,2} is not included in the concept. • We add the next pair: (Height,160..165).
Minimal Complex • {(Hair, Blond),(Height,160..165)} is a second minimal complex.
End of the concept • {{(Hair, Black)}, {(Hair, Blond), (Height, 160..165)}} is a local cover of [(Attraction,-)].
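As a check, the local cover obtained above can be verified directly: each minimal complex's block must lie inside the concept, and together they must cover it (block contents taken from the "Blocks [(a,v)]" slide):

```python
blocks = {("Hair", "Black"): {4, 5, 6},
          ("Hair", "Blond"): {1, 2},
          ("Height", "160..165"): {1, 3, 5}}
cover = [[("Hair", "Black")],
         [("Hair", "Blond"), ("Height", "160..165")]]
B = {1, 4, 5, 6}   # [(Attraction,-)]

def block_of(term):
    # [T] is the intersection of the blocks of the term's pairs
    return set.intersection(*(blocks[p] for p in term))

inside = all(block_of(t) <= B for t in cover)
union = set().union(*(block_of(t) for t in cover))
```

Here [(Hair, Blond)] ∩ [(Height, 160..165)] = {1}, and {1} ∪ {4,5,6} is exactly the concept.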
Rules • (Hair, Red) → (Attraction,+) • (Hair, Blond) & (Height,165..180) → (Attraction,+) • (Hair, Black) → (Attraction,-) • (Hair, Blond) & (Height,160..165) → (Attraction,-)
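The induced rules can be applied as a simple first-match classifier; this is a sketch of rule application, not part of LEM2 itself:

```python
# The four rules from the example, as (conditions, decision) pairs.
rules = [({"Hair": "Red"}, "+"),
         ({"Hair": "Blond", "Height": "165..180"}, "+"),
         ({"Hair": "Black"}, "-"),
         ({"Hair": "Blond", "Height": "160..165"}, "-")]

def classify(obj):
    """Return the decision of the first rule whose conditions all match."""
    for conditions, decision in rules:
        if all(obj.get(a) == v for a, v in conditions.items()):
            return decision
    return None   # no rule fires

verdict = classify({"Hair": "Blond", "Height": "160..165"})
```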
Contents • Rough Sets Theory • Logical Analysis of Data • Comparison • Inconsistencies
Principle • Works on binary data. • Extension of the Boolean approach to the non-binary case.
Definitions (1) • Let S be the set of all observations. • Each observation is described by n attributes. • Each observation belongs to a class.
Definitions (2) • The classification can be considered as a partition of S into two sets: the positive examples S⁺ and the negative examples S⁻. • An archive is represented by a partially defined Boolean function Φ: Φ(p) = 1 if p ∈ S⁺, Φ(p) = 0 if p ∈ S⁻.
Definitions (3) • A literal is a Boolean variable xᵢ or its negation x̄ᵢ. • A term is a conjunction of distinct literals: T = ⋀_{i∈P} xᵢ ∧ ⋀_{j∈N} x̄ⱼ • The degree of a term is its number of literals.
Definitions (4) • A term T covers a point p ∈ {0,1}ⁿ if T(p)=1. • The characteristic term of a point p is the unique term of degree n covering p. • Ex: the characteristic term of p = (1,0,1) is x₁x̄₂x₃.
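Terms can be represented as lists of signed literals; the encoding below is an illustrative choice, not from the deck:

```python
def characteristic_term(p):
    """The unique degree-n term covering p: (i, True) stands for x_i,
    (i, False) for its negation."""
    return [(i, bit == 1) for i, bit in enumerate(p)]

def covers(term, p):
    """T(p) = 1 iff every literal of the term agrees with p."""
    return all((p[i] == 1) == positive for i, positive in term)

t = characteristic_term((1, 0, 1))   # x1 ∧ not-x2 ∧ x3
hit = covers(t, (1, 0, 1))
miss = covers(t, (1, 1, 1))
```

By construction a characteristic term covers its own point and no other point of {0,1}ⁿ.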
Definitions (5) • A term T is an implicant of a Boolean function f if T(p) ≤ f(p) for all p ∈ {0,1}ⁿ. • An implicant is called prime if it is minimal: removing any literal yields a term that is no longer an implicant.
Definitions (6) • A positive pattern is a term covering at least one positive example and no negative example. • A negative pattern is a term covering at least one negative example and no positive example. • A pattern is prime if removing any of its literals yields a term that is no longer a pattern.
Example • The term T is a positive pattern: • there is no negative example p such that T(p)=1, • and it covers one positive example: the 3rd line. • It is a positive prime pattern: • deleting its first literal gives a term covering one negative example: the 4th line, • deleting its other literal gives a term covering one negative example: the 5th line.
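The pattern and primality checks can be sketched directly from the definitions; the points below are a hypothetical toy archive, not the deck's table:

```python
def covers(term, p):
    # term: list of (index, sign); sign True for x_i, False for its negation
    return all((p[i] == 1) == sign for i, sign in term)

def is_positive_pattern(term, positives, negatives):
    return (any(covers(term, p) for p in positives)
            and not any(covers(term, q) for q in negatives))

def is_prime(term, positives, negatives):
    # prime: deleting any single literal destroys the pattern property
    return all(not is_positive_pattern([l for l in term if l != lit],
                                       positives, negatives)
               for lit in term)

positives = [(1, 0, 1)]
negatives = [(0, 0, 1), (1, 1, 0)]
term = [(0, True), (2, True)]          # x1 ∧ x3
ok = is_positive_pattern(term, positives, negatives)
prime = is_prime(term, positives, negatives)
```

Here x1 ∧ x3 covers the positive point and neither negative one, and both degree-1 subterms each cover a negative point, so the pattern is prime.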
Pattern generation • There is a symmetry between positive and negative patterns. • Two approaches: • Top-down • Bottom-up
Top-down • Each positive example is associated with its characteristic term → it is already a pattern. • Literals are taken out one by one until the pattern is prime.
Bottom-up • We begin with terms of degree one: • if a term covers at least one positive example and no negative example, it is a pattern; • if not, we add literals until we have a pattern.
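The top-down procedure can be sketched as follows. Since removing literals only enlarges a term's coverage, the starting point stays covered throughout, so only the negative examples need to be checked (toy data, hypothetical):

```python
def top_down_prime(point, negatives):
    """Start from the characteristic term of a positive point and drop
    literals while the term still covers no negative example."""
    def covers(term, p):
        return all((p[i] == 1) == sign for i, sign in term)
    # characteristic term of the point
    term = [(i, bit == 1) for i, bit in enumerate(point)]
    changed = True
    while changed:
        changed = False
        for lit in list(term):
            candidate = [l for l in term if l != lit]
            if not any(covers(candidate, q) for q in negatives):
                term = candidate   # still a pattern: keep the shorter term
                changed = True
                break
    return term

negatives = [(0, 0, 1), (1, 1, 0)]
pattern = top_down_prime((1, 0, 1), negatives)   # x1 ∧ x3
```

The result is prime: dropping either remaining literal would make the term cover one of the negative points.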