Iterative Dichotomiser 3
By Christopher Archibald
Decision Trees
• A decision tree is a tree of branching nodes, each offering a choice between 2 or more alternatives.
• Decision node: a node at which a choice is made.
• Leaf node: a terminal node giving the result reached at that point of the tree.
Decision Trees
• Will it rain?
• If it is sunny, it will not rain.
• If it is cloudy, it will rain.
• If it is partially cloudy, it depends on whether or not it is humid.
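To make the rain example concrete, here is a minimal sketch of that tree in Python. The dict-based node format, attribute names, and `classify` helper are assumptions made for illustration; they are not from the original slides.

```python
# Minimal sketch of the "Will it rain?" decision tree.
# An internal (decision) node is {"attribute": ..., "branches": {...}};
# a leaf node is just the resulting label.
rain_tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": "no rain",
        "cloudy": "rain",
        "partially cloudy": {
            "attribute": "humid",
            "branches": {"yes": "rain", "no": "no rain"},
        },
    },
}

def classify(tree, example):
    """Walk from the root, following the branch matching each attribute value,
    until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][example[tree["attribute"]]]
    return tree

print(classify(rain_tree, {"outlook": "partially cloudy", "humid": "yes"}))  # rain
```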
ID3
• Invented by J. Ross Quinlan.
• Employs a top-down, greedy search through the space of possible decision trees.
• At each node, it selects the attribute that is most useful for classifying the examples (the attribute with the highest information gain); see the sketch below.
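As a rough illustration of this top-down greedy recursion, here is a compact ID3 sketch in Python. It assumes each training example is a dict with one entry per attribute plus a "label" key; that data format and the helper names are illustrative assumptions, not part of the slides.

```python
import math
from collections import Counter

def entropy(examples):
    """Entropy of the label distribution over a set of examples."""
    counts = Counter(ex["label"] for ex in examples)
    total = len(examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(examples, attribute):
    """Expected reduction in entropy from splitting on `attribute`."""
    total = len(examples)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(examples) - remainder

def id3(examples, attributes):
    """Top-down greedy construction: pick the best attribute, then recurse."""
    labels = {ex["label"] for ex in examples}
    if len(labels) == 1:                 # pure node -> leaf
        return labels.pop()
    if not attributes:                   # no attributes left -> majority leaf
        return Counter(ex["label"] for ex in examples).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a))
    branches = {}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        branches[value] = id3(subset, [a for a in attributes if a != best])
    return {"attribute": best, "branches": branches}
```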
Entropy
• Entropy tells us how mixed a collection of examples is with respect to the target classification class; it is the basis for measuring how well an attribute separates the examples.
• Entropy(S) = -p_pos log2(p_pos) - p_neg log2(p_neg)
• p_pos = proportion of positive examples
• p_neg = proportion of negative examples
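A direct transcription of this two-class formula into Python might look like the following sketch; the function name and the convention of passing counts rather than proportions are assumptions for illustration.

```python
import math

def entropy_two_class(n_pos, n_neg):
    """Entropy(S) = -p_pos*log2(p_pos) - p_neg*log2(p_neg).

    n_pos / n_neg are counts of positive / negative examples.
    A class with proportion 0 contributes 0 (p*log2(p) -> 0 as p -> 0).
    """
    total = n_pos + n_neg
    result = 0.0
    for n in (n_pos, n_neg):
        p = n / total
        if p > 0:
            result -= p * math.log2(p)
    return result
```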
Entropy Example
• If S is a collection of 15 examples with 10 YES and 5 NO, then:
Entropy(S) = -(10/15) log2(10/15) - (5/15) log2(5/15) = 0.918
• On a calculator whose log key is base 10, you would enter
-((10/15) log(10/15))/log(2) - ((5/15) log(5/15))/log(2)
because log(x)/log(2) converts base 10 to the base 2 that entropy requires.
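The same base-conversion trick can be checked in a few lines of Python; math.log here is the natural log, but the ratio log(x)/log(2) works for any base.

```python
import math

# Entropy of 10 YES / 5 NO, using log(x)/log(2) to get base 2.
h = (-(10/15) * math.log(10/15) / math.log(2)
     - (5/15) * math.log(5/15) / math.log(2))
print(round(h, 3))  # 0.918
```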
Information Gain
• Measures the expected reduction in entropy: the higher the information gain, the greater the expected reduction in entropy.
• The equation for information gain is:
Gain(S, A) = Entropy(S) - sum over v in Values(A) of (|S_v| / |S|) * Entropy(S_v)
Information Gain
• A is an attribute of collection S
• S_v = the subset of S for which attribute A has value v
• |S_v| = number of elements in S_v
• |S| = number of elements in S
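Translated directly into Python, the gain equation might look like this sketch. The data format, a list of (attribute value, class label) pairs for one attribute A over the collection S, is an assumption for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain(pairs):
    """Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v),
    where `pairs` holds (value of attribute A, class label) for each example."""
    labels = [label for _, label in pairs]
    remainder = 0.0
    for v in {value for value, _ in pairs}:
        subset = [label for value, label in pairs if value == v]
        remainder += (len(subset) / len(pairs)) * entropy(subset)
    return entropy(labels) - remainder
```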
Example
• Entropy(S) = -p_pos log2(p_pos) - p_neg log2(p_neg)
• For a collection S of 6 examples with 4 YES and 2 NO:
Entropy(4Y, 2N) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.91829
• Now that we know the entropy, we're going to use that answer to find the information gain.
Example (cont.)
• For the attribute Contains Cars:
S = [4Y, 2N]
S_Yes = [3Y, 2N], E(S_Yes) = 0.97095
S_No = [1Y, 0N], E(S_No) = 0
Gain(S, Contains Cars) = 0.91829 - [(5/6)*0.97095 + (1/6)*0] = 0.10916
Example (cont.)
• For the attribute Contains Rally Cars:
S = [4Y, 2N]
S_Yes = [0Y, 1N], E(S_Yes) = 0
S_No = [4Y, 1N], E(S_No) = 0.7219
Gain(S, Contains Rally Cars) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
Example (cont.)
• For the attribute Races:
S = [4Y, 2N]
S_Yes = [0Y, 1N], E(S_Yes) = 0
S_No = [4Y, 1N], E(S_No) = 0.7219
Gain(S, Races) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
Example (cont.)
• Gain(S, Contains Cars) = 0.10916
• Gain(S, Contains Rally Cars) = 0.3167
• Gain(S, Races) = 0.3167
• Contains Rally Cars and Races tie for the highest information gain, so ID3 would choose one of them as the root decision node; a verification sketch follows.
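These numbers can be reproduced from the counts on the previous slides; the helper below recomputes each gain directly from the (YES, NO) counts of S and of its two subsets. The function names are assumptions for illustration, and the small difference from the slide's 0.10916 comes from the slide rounding intermediate values.

```python
import math

def entropy(yes, no):
    """Two-class entropy from counts, with 0*log2(0) treated as 0."""
    total = yes + no
    return -sum((n / total) * math.log2(n / total)
                for n in (yes, no) if n > 0)

def gain(s, s_yes, s_no):
    """Gain from splitting S (a (yes, no) count pair) into S_Yes and S_No."""
    n = sum(s)
    return (entropy(*s)
            - (sum(s_yes) / n) * entropy(*s_yes)
            - (sum(s_no) / n) * entropy(*s_no))

s = (4, 2)  # 4 YES, 2 NO
print(f"Contains Cars:       {gain(s, (3, 2), (1, 0)):.4f}")  # 0.1092
print(f"Contains Rally Cars: {gain(s, (0, 1), (4, 1)):.4f}")  # 0.3167
print(f"Races:               {gain(s, (0, 1), (4, 1)):.4f}")  # 0.3167
```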
Source • Dr. Lee’s Slides, San Jose State University, Spring 2008 • http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm • http://decisiontrees.net/node/27