
ID3 and Decision tree



Presentation Transcript


  1. ID3 and Decision tree by Tuan Nguyen

  2. ID3 and Decision tree ID3 algorithm • An algorithm for constructing a decision tree • Uses entropy to compute the information gain of each attribute • The attribute with the highest gain is then selected

  3. ID3 and Decision tree Entropy • The complete formula for entropy is: E(S) = -(p+)*log2(p+) - (p-)*log2(p-) • Where p+ is the proportion of positive examples in S • Where p- is the proportion of negative examples in S • Where S is the sample of training examples
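As a quick illustration of the formula (a minimal sketch, not part of the original slides; the function name entropy and its count-based signature are assumptions made for this example):

```python
import math

def entropy(pos, neg):
    """Entropy of a sample with pos positive and neg negative examples."""
    total = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:  # treat 0 * log2(0) as 0
            p = count / total
            e -= p * math.log2(p)
    return e

print(entropy(29, 35))  # ~0.9937 (the sample used on the next slide)
```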

  4. ID3 and Decision tree Example Consider a sample S = [29+, 35-] that attribute A1 splits into True = [21+, 5-] and False = [8+, 30-]. The entropies are computed as follows: • The entropy of S: E(S) = -29/64*log2(29/64) - 35/64*log2(35/64) = 0.9937 • The entropy of True: E(TRUE) = -21/26*log2(21/26) - 5/26*log2(5/26) = 0.7063 • The entropy of False: E(FALSE) = -8/38*log2(8/38) - 30/38*log2(30/38) = 0.7425

  5. ID3 and Decision tree Information Gain • Gain(Sample, Attributes) or Gain(S,A) is the expected reduction in entropy due to sorting S on attribute A: Gain(S,A) = Entropy(S) - Σ v∈values(A) (|Sv|/|S|) * Entropy(Sv) So, for the previous example, the information gain is calculated: G(A1) = E(S) - (21+5)/(29+35) * E(TRUE) - (8+30)/(29+35) * E(FALSE) = 0.9937 - 26/64 * 0.7063 - 38/64 * 0.7425 = 0.2659
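A hedged sketch of the gain computation in the same style (again not from the slides; the entropy helper is repeated so the snippet stands alone). It reproduces the numbers for the A1 example:

```python
import math

def entropy(pos, neg):
    """E(S) for a sample with pos positive and neg negative examples."""
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

def gain(parent, branches):
    """Gain(S, A): entropy of S minus the weighted entropy of its branches.

    parent and each branch are (pos, neg) count pairs.
    """
    total = sum(p + n for p, n in branches)
    return entropy(*parent) - sum(
        (p + n) / total * entropy(p, n) for p, n in branches
    )

# S = [29+, 35-] split by A1 into True = [21+, 5-] and False = [8+, 30-]
print(entropy(21, 5))                      # ~0.7063
print(entropy(8, 30))                      # ~0.7425
print(gain((29, 35), [(21, 5), (8, 30)]))  # ~0.2659
```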

  6. ID3 and Decision tree The complete example Consider the following table (shown as an image in the original slides; reproduced here is the standard PlayTennis dataset from Mitchell's Machine Learning, which matches the counts and day labels used on the following slides):

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

  7. ID3 and Decision tree Decision tree • We want to build a decision tree for the tennis matches • The schedule of matches depends on the weather (Outlook, Temperature, Humidity, and Wind) • We now apply what we know to build a decision tree from this table

  8. ID3 and Decision tree Example • Calculate the information gain for each of the weather attributes: • For the Wind • For the Humidity • For the Outlook

  9. ID3 and Decision tree For the Wind S = [9+, 5-], E(S) = 0.940. Wind splits S into Weak = [6+, 2-] (E = 0.811) and Strong = [3+, 3-] (E = 1.0). Gain(S,Wind) = 0.940 - (8/14)*0.811 - (6/14)*1.0 = 0.048

  10. ID3 and Decision tree For the Humidity S = [9+, 5-], E(S) = 0.940. Humidity splits S into High = [3+, 4-] (E = 0.985) and Normal = [6+, 1-] (E = 0.592). Gain(S,Humidity) = 0.940 - (7/14)*0.985 - (7/14)*0.592 = 0.151

  11. ID3 and Decision tree For the Outlook S = [9+, 5-], E(S) = 0.940. Outlook splits S into Sunny = [2+, 3-] (E = 0.971), Overcast = [4+, 0-] (E = 0.0), and Rain = [3+, 2-] (E = 0.971). Gain(S,Outlook) = 0.940 - (5/14)*0.971 - (4/14)*0.0 - (5/14)*0.971 = 0.247
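To tie slides 9-11 together, the sketch below recomputes all three gains directly from the PlayTennis table; the DATA encoding and the helper names are my own, not the presentation's. Note that Humidity prints 0.152 because the snippet does not round the branch entropies before combining, while the slide rounds them first and gets 0.151:

```python
import math
from collections import defaultdict

# PlayTennis rows D1..D14: (Outlook, Temperature, Humidity, Wind, label)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

def entropy(rows):
    """Entropy of the Yes/No label distribution over rows."""
    n = len(rows)
    counts = (sum(1 for r in rows if r[-1] == lab) for lab in ("Yes", "No"))
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def gain(rows, attr):
    """Information gain of splitting rows on the named attribute."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[ATTRS[attr]]].append(r)
    return entropy(rows) - sum(
        len(g) / len(rows) * entropy(g) for g in groups.values()
    )

for a in ("Wind", "Humidity", "Outlook"):
    print(a, round(gain(DATA, a), 3))  # Wind 0.048, Humidity 0.152, Outlook 0.247
```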

  12. ID3 and Decision tree Complete tree • Then here is the complete tree:

Outlook
  Sunny -> Humidity
    High -> No [D1, D2, D8]
    Normal -> Yes [D9, D11]
  Overcast -> Yes [D3, D7, D12, D13]
  Rain -> Wind
    Strong -> No [D6, D14]
    Weak -> Yes [D4, D5, D10]
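Finally, a compact recursive ID3 sketch that reconstructs exactly this tree from the table. This is an illustration under the same assumptions as the previous snippet (the dataset literal and function names are mine), not the presentation's own code:

```python
import math
from collections import defaultdict

# PlayTennis rows D1..D14: (Outlook, Temperature, Humidity, Wind, label)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

def entropy(rows):
    """Entropy of the Yes/No label distribution over rows."""
    n = len(rows)
    counts = (sum(1 for r in rows if r[-1] == lab) for lab in ("Yes", "No"))
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def split(rows, attr):
    """Group rows by their value of the named attribute."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[ATTRS[attr]]].append(r)
    return groups

def gain(rows, attr):
    """Information gain of splitting rows on the named attribute."""
    return entropy(rows) - sum(
        len(g) / len(rows) * entropy(g) for g in split(rows, attr).values()
    )

def id3(rows, attrs, indent=""):
    """Grow the tree: pick the highest-gain attribute, recurse per value."""
    labels = {r[-1] for r in rows}
    if len(labels) == 1 or not attrs:
        # leaf: pure class (or attributes exhausted -> majority vote)
        majority = max(("Yes", "No"),
                       key=lambda lab: sum(r[-1] == lab for r in rows))
        print(indent + "-> " + majority)
        return
    best = max(attrs, key=lambda a: gain(rows, a))
    for value, subset in sorted(split(rows, best).items()):
        print(f"{indent}{best} = {value}")
        id3(subset, [a for a in attrs if a != best], indent + "  ")

id3(DATA, list(ATTRS))  # prints the same tree as slide 12, with Outlook at the root
```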

  13. References: • Dr. Lee's slides, San Jose State University, Spring 2007 • "Building Decision Trees with the ID3 Algorithm", Andrew Colin, Dr. Dobb's Journal, June 1996 • "Incremental Induction of Decision Trees", Paul E. Utgoff, Kluwer Academic Publishers, 1989 • http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm • http://decisiontrees.net/node/27
