Iterative Dichotomiser 3
By Christopher Archibald
Decision Trees
• A decision tree is a tree of branching nodes, each offering a choice between 2 or more alternatives.
• Decision node: a node at which a choice is made.
• Leaf node: a terminal node giving the result reached at that point of the tree.
Decision Trees
• Will it rain?
• If it is sunny, it will not rain.
• If it is cloudy, it will rain.
• If it is partially cloudy, it depends on whether or not it is humid.
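To make the rain example concrete, here is a minimal sketch of that tree in Python. The dict-based node format, attribute names, and `classify` helper are assumptions made for illustration; they are not from the original slides.

```python
# Minimal sketch of the "Will it rain?" decision tree.
# An internal (decision) node is {"attribute": ..., "branches": {...}};
# a leaf node is just the resulting label.
rain_tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": "no rain",
        "cloudy": "rain",
        "partially cloudy": {
            "attribute": "humid",
            "branches": {"yes": "rain", "no": "no rain"},
        },
    },
}

def classify(tree, example):
    """Walk from the root, following the branch matching each attribute value,
    until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][example[tree["attribute"]]]
    return tree

print(classify(rain_tree, {"outlook": "partially cloudy", "humid": "yes"}))  # rain
```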
ID3
• Invented by J. Ross Quinlan.
• Employs a top-down, greedy search through the space of possible decision trees.
• At each node, it selects the attribute that is most useful for classifying the examples (the attribute with the highest information gain); see the sketch below.
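As a rough illustration of this top-down greedy recursion, here is a compact ID3 sketch in Python. It assumes each training example is a dict with one entry per attribute plus a "label" key; that data format and the helper names are illustrative assumptions, not part of the slides.

```python
import math
from collections import Counter

def entropy(examples):
    """Entropy of the label distribution over a set of examples."""
    counts = Counter(ex["label"] for ex in examples)
    total = len(examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(examples, attribute):
    """Expected reduction in entropy from splitting on `attribute`."""
    total = len(examples)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(examples) - remainder

def id3(examples, attributes):
    """Top-down greedy construction: pick the best attribute, then recurse."""
    labels = {ex["label"] for ex in examples}
    if len(labels) == 1:                 # pure node -> leaf
        return labels.pop()
    if not attributes:                   # no attributes left -> majority leaf
        return Counter(ex["label"] for ex in examples).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a))
    branches = {}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        branches[value] = id3(subset, [a for a in attributes if a != best])
    return {"attribute": best, "branches": branches}
```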
Entropy
• Entropy tells us how mixed a collection of examples is with respect to the target classification class; it is the basis for measuring how well an attribute separates the examples.
• Entropy(S) = -p_pos log2(p_pos) - p_neg log2(p_neg)
• p_pos = proportion of positive examples
• p_neg = proportion of negative examples
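A direct transcription of this two-class formula into Python might look like the following sketch; the function name and the convention of passing counts rather than proportions are assumptions for illustration.

```python
import math

def entropy_two_class(n_pos, n_neg):
    """Entropy(S) = -p_pos*log2(p_pos) - p_neg*log2(p_neg).

    n_pos / n_neg are counts of positive / negative examples.
    A class with proportion 0 contributes 0 (p*log2(p) -> 0 as p -> 0).
    """
    total = n_pos + n_neg
    result = 0.0
    for n in (n_pos, n_neg):
        p = n / total
        if p > 0:
            result -= p * math.log2(p)
    return result
```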
Entropy Example
• If S is a collection of 15 examples with 10 YES and 5 NO, then:
Entropy(S) = -(10/15) log2(10/15) - (5/15) log2(5/15) = 0.918
• On a calculator whose log key is base 10, you would enter
-((10/15) log(10/15))/log(2) - ((5/15) log(5/15))/log(2)
because log(x)/log(2) converts base 10 to the base 2 that entropy requires.
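The same base-conversion trick can be checked in a few lines of Python; math.log here is the natural log, but the ratio log(x)/log(2) works for any base.

```python
import math

# Entropy of 10 YES / 5 NO, using log(x)/log(2) to get base 2.
h = (-(10/15) * math.log(10/15) / math.log(2)
     - (5/15) * math.log(5/15) / math.log(2))
print(round(h, 3))  # 0.918
```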
Information Gain
• Measures the expected reduction in entropy: the higher the information gain, the greater the expected reduction in entropy.
• The equation for information gain is:
Gain(S, A) = Entropy(S) - sum over v in Values(A) of (|S_v| / |S|) * Entropy(S_v)
Information Gain
• A is an attribute of collection S
• S_v = the subset of S for which attribute A has value v
• |S_v| = number of elements in S_v
• |S| = number of elements in S
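Translated directly into Python, the gain equation might look like this sketch. The data format, a list of (attribute value, class label) pairs for one attribute A over the collection S, is an assumption for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain(pairs):
    """Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v),
    where `pairs` holds (value of attribute A, class label) for each example."""
    labels = [label for _, label in pairs]
    remainder = 0.0
    for v in {value for value, _ in pairs}:
        subset = [label for value, label in pairs if value == v]
        remainder += (len(subset) / len(pairs)) * entropy(subset)
    return entropy(labels) - remainder
```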
Example
• Entropy(S) = -p_pos log2(p_pos) - p_neg log2(p_neg)
• For a collection S of 6 examples with 4 YES and 2 NO:
Entropy(4Y, 2N) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.91829
• Now that we know the entropy, we're going to use that answer to find the information gain.
Example (cont.)
• For the attribute Contains Cars:
S = [4Y, 2N]
S_Yes = [3Y, 2N], E(S_Yes) = 0.97095
S_No = [1Y, 0N], E(S_No) = 0
Gain(S, Contains Cars) = 0.91829 - [(5/6)*0.97095 + (1/6)*0] = 0.10916
Example (cont.)
• For the attribute Contains Rally Cars:
S = [4Y, 2N]
S_Yes = [0Y, 1N], E(S_Yes) = 0
S_No = [4Y, 1N], E(S_No) = 0.7219
Gain(S, Contains Rally Cars) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
Example (cont.)
• For the attribute Races:
S = [4Y, 2N]
S_Yes = [0Y, 1N], E(S_Yes) = 0
S_No = [4Y, 1N], E(S_No) = 0.7219
Gain(S, Races) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
Example (cont.)
• Gain(S, Contains Cars) = 0.10916
• Gain(S, Contains Rally Cars) = 0.3167
• Gain(S, Races) = 0.3167
• Contains Rally Cars and Races tie for the highest information gain, so ID3 would choose one of them as the root decision node; a verification sketch follows.
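These numbers can be reproduced from the counts on the previous slides; the helper below recomputes each gain directly from the (YES, NO) counts of S and of its two subsets. The function names are assumptions for illustration, and the small difference from the slide's 0.10916 comes from the slide rounding intermediate values.

```python
import math

def entropy(yes, no):
    """Two-class entropy from counts, with 0*log2(0) treated as 0."""
    total = yes + no
    return -sum((n / total) * math.log2(n / total)
                for n in (yes, no) if n > 0)

def gain(s, s_yes, s_no):
    """Gain from splitting S (a (yes, no) count pair) into S_Yes and S_No."""
    n = sum(s)
    return (entropy(*s)
            - (sum(s_yes) / n) * entropy(*s_yes)
            - (sum(s_no) / n) * entropy(*s_no))

s = (4, 2)  # 4 YES, 2 NO
print(f"Contains Cars:       {gain(s, (3, 2), (1, 0)):.4f}")  # 0.1092
print(f"Contains Rally Cars: {gain(s, (0, 1), (4, 1)):.4f}")  # 0.3167
print(f"Races:               {gain(s, (0, 1), (4, 1)):.4f}")  # 0.3167
```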
Source • Dr. Lee’s Slides, San Jose State University, Spring 2008 • http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm • http://decisiontrees.net/node/27