
Classification & ID3



  1. Classification & ID3 Dr. Riggs Spring 2004

2. Classification Problem • Given some number of observed features • Predict an unobserved feature (the ‘class’) • Example: given the features of a borrower, predict whether he will default • An interesting problem is learning such rules from examples

3. Example Data
?id ?size ?color ?shape ?class
(item 1 medium blue brick yes)
(item 2 small red sphere yes)
(item 3 large green pillar yes)
(item 4 large green sphere yes)
(item 5 small red wedge no)
(item 6 large red wedge no)
(item 7 large red pillar no)
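To make the worked examples below checkable, here is one way to hold this learning set in code. A minimal sketch in Python; the tuple layout and the names EXAMPLES and FEATURES are my own, not from the slides:

```python
# Hypothetical encoding of the slide's learning set.
# Each row: (id, size, color, shape, class).
EXAMPLES = [
    (1, "medium", "blue",  "brick",  "yes"),
    (2, "small",  "red",   "sphere", "yes"),
    (3, "large",  "green", "pillar", "yes"),
    (4, "large",  "green", "sphere", "yes"),
    (5, "small",  "red",   "wedge",  "no"),
    (6, "large",  "red",   "wedge",  "no"),
    (7, "large",  "red",   "pillar", "no"),
]
# Column index of each feature within a row (assumed helper).
FEATURES = {"size": 1, "color": 2, "shape": 3}
```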

4. Distinguish ALL By Size
{1,2,3,4,5,6,7} = All
medium → {1}        Class: Y
small  → {2 5}      Class: Y N
large  → {3 4 6 7}  Class: Y Y N N
The medium subset is pure, so:
Rule: (feature ?id size medium) => (class ?id yes)

5. Distinguish {2 5} By Shape
{2 5} = size: small
sphere → {2}  Class: Y
wedge  → {5}  Class: N
(feature ?id size small) (feature ?id shape sphere) => (class ?id yes)
(feature ?id size small) (feature ?id shape wedge) => (class ?id no)

6. Distinguish {3 4 6 7} By Color
{3 4 6 7} = size: large
green → {3 4}  Class: Y Y
red   → {6 7}  Class: N N
(feature ?id size large) (feature ?id color green) => (class ?id yes)
(feature ?id size large) (feature ?id color red) => (class ?id no)
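Slides 4–6 together define a complete decision tree. As a sketch only, here is the same tree written as Python control flow rather than CLIPS-style rules (the function name classify is mine):

```python
def classify(size, color, shape):
    # Mirrors the rules derived on slides 4-6: split on size first,
    # then on shape (for small items) or color (for large items).
    if size == "medium":                        # slide 4: medium => yes
        return "yes"
    if size == "small":                         # slide 5: {2 5} split by shape
        return "yes" if shape == "sphere" else "no"
    return "yes" if color == "green" else "no"  # slide 6: {3 4 6 7} split by color

classify("small", "red", "wedge")  # -> "no", matching item 5
```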

7. Considerations • Are the examples enough? • The examples must be enough to tell the classes apart, and in general this cannot be verified from the examples alone • Are the rules the most efficient? • We could have made other choices • What should we use to compare choices?

8. Entropy • Measures ‘disorder’ • Def: H(m1..mn) = −Σ_{i=1..n} Pr(mi) · lg(Pr(mi)), where lg = log2 • Example (entropy of the learning set): • Messages (m1…m7): Y Y Y Y N N N • Pr(Y) = 4/7, Pr(N) = 3/7 • H = −[4/7·lg(4/7) + 3/7·lg(3/7)] = −[.571·(−.807) + .429·(−1.222)] = .985
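The definition translates directly to code. A minimal sketch (the function name entropy is mine):

```python
from math import log2

def entropy(labels):
    # H = -sum over distinct values v of Pr(v) * lg(Pr(v))
    n = len(labels)
    return -sum(
        (labels.count(v) / n) * log2(labels.count(v) / n)
        for v in set(labels)
    )

entropy(["yes"] * 4 + ["no"] * 3)  # ~0.985, the slide's H for 4 Y / 3 N
```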

9. Gain • If a set is partitioned by a feature into subsets, the gain in entropy is: original entropy − the weighted sum of subset entropies • E.g.: partition ALL = {1,2,3,4,5,6,7} by COLOR → {blue: 1} {red: 2 5 6 7} {green: 3 4} • Mapped to classes: {blue: Y} {red: Y N N N} {green: Y Y} • H(blue) = 0, H(red) = .811, H(green) = 0 • GAIN(color) = H(all) − Σ_{ss ∈ {blue,red,green}} |ss|/|all| · H(ss) = .985 − (1/7·0 + 4/7·.811 + 2/7·0) = .522
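In code, gain is the whole-set entropy minus the size-weighted entropies of the subsets. A sketch building on the EXAMPLES and entropy definitions from the blocks above (gain and its column-index argument are my own naming):

```python
def gain(examples, col):
    # Split the examples on the value in column `col`, then subtract
    # the weighted subset entropies from the entropy of the whole set.
    labels = [ex[-1] for ex in examples]
    subsets = {}
    for ex in examples:
        subsets.setdefault(ex[col], []).append(ex[-1])
    weighted = sum(
        len(s) / len(examples) * entropy(s) for s in subsets.values()
    )
    return entropy(labels) - weighted

round(gain(EXAMPLES, FEATURES["color"]), 3)  # 0.522, as on the slide
```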

10. Distinguish All By Color
{1,2,3,4,5,6,7} = All
blue  → {1}        map: Y        H = 0
red   → {2 5 6 7}  map: Y N N N  H = −(1/4)·lg(1/4) − (3/4)·lg(3/4) = .811
green → {3 4}      map: Y Y      H = 0
wH = 1/7·0 + 4/7·(.5 + .311) + 2/7·0 = .464
Gain = .985 − .464 = .522

11. Distinguish All By Shape
{1,2,3,4,5,6,7} = All
brick  → {1}    C: Y    H = 0
sphere → {2 4}  C: Y Y  H = 0
wedge  → {5 6}  C: N N  H = 0
pillar → {3 7}  C: Y N  H = 1
wH = 1/7·0 + 2/7·0 + 2/7·0 + 2/7·1 = .286
Gain = .985 − .286 = .699
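Running the gain sketch over all three features reproduces the slides' numbers and shows why slide 7's efficiency question matters: the highest-gain split is shape, not the size split chosen manually on slides 4–6:

```python
for name, col in FEATURES.items():
    print(name, round(gain(EXAMPLES, col), 3))
# size  0.128   (the manual choice on slide 4)
# color 0.522
# shape 0.699   <- highest gain
```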

12. ID3 • Given: a learning set (LS) of examples with features & outcome (class) • Use each (feature, value) pair to partition the LS into subsets Pf,v • Calculate H for each partition Pf,v • Calculate the gain for each feature f: Gain(f) = H(LS) − Σ_v |Pf,v|/|LS| · H(Pf,v) • Partition by the feature with the highest gain • Apply ID3 recursively to any subset Pf,v with H > 0 (a sketch follows below)
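Putting the pieces together, here is a compact recursive sketch of ID3 over the tuple encoding used above. This is not the original implementation; the helper names and the nested-tuple tree representation are mine:

```python
def id3(examples, features):
    # Stop on a pure subset (H == 0) or when no features remain.
    labels = [ex[-1] for ex in examples]
    if len(set(labels)) == 1:
        return labels[0]
    if not features:
        return max(set(labels), key=labels.count)  # majority class
    # Partition by the feature with the highest gain, then recurse.
    best = max(features, key=lambda f: gain(examples, features[f]))
    col = features[best]
    rest = {f: c for f, c in features.items() if f != best}
    return (best, {
        v: id3([ex for ex in examples if ex[col] == v], rest)
        for v in set(ex[col] for ex in examples)
    })

id3(EXAMPLES, FEATURES)
# ('shape', {'brick': 'yes', 'sphere': 'yes', 'wedge': 'no',
#            'pillar': ('color', {'green': 'yes', 'red': 'no'})})
# (subtree value order may vary)
```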
