
Collective Classification for Network Data With Noise



  1. Collective Classification for Network Data With Noise • Advisor: Prof. Sing Ling Lee • Student: Chao Chih Wang • Date: 2012.10.11

  2. Outline • Introduction • Network data • Collective Classification • ICA • Problem • Algorithm For Collective Inference With Noise • Experiments • Conclusions

  3. Introduction – Network data • traditional data: • instances are independent of each other • network data: • instances may be related to each other • applications: • email • web pages • paper citations (figure: independent vs. related instances)

  4. Introduction – Collective Classification • classify interrelated instances using relational features. • Related instances: • the instances to be classified are related to one another. • Classifier: • the base classifier uses content features and relational features. • Collective inference: • update the class labels • recompute the relational feature values

  5. Introduction – ICA • ICA: Iterative Classification Algorithm
  Initial (step 1): train the local classifier; use content features to predict the unlabeled instances.
  Iterative (step 2): repeat {
    for each unlabeled instance {
      set the unlabeled instance's relational features
      use the local classifier to predict the unlabeled instance
    }
  }
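A minimal Python sketch of the two steps above, assuming a scikit-learn-style logistic regression as the local classifier (the slides do not name one) and count-based neighbor aggregation; the names content, graph, train_idx, and test_idx are placeholders, not from the slides:

```python
# Minimal ICA sketch (illustrative). `content` is a 2-D numpy array of
# content features, `graph[i]` lists the neighbors of node i, and `labels`
# maps node -> class index (None for unlabeled nodes before bootstrapping).
import numpy as np
from sklearn.linear_model import LogisticRegression

def relational_features(i, labels, graph, n_classes):
    # Fraction of i's labeled neighbors in each class (count-based).
    counts = np.zeros(n_classes)
    for nb in graph[i]:
        if labels[nb] is not None:
            counts[labels[nb]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def ica(content, graph, labels, train_idx, test_idx, n_classes, n_iters=10):
    # Step 1: bootstrap the unlabeled nodes with a content-only classifier.
    clf0 = LogisticRegression(max_iter=1000).fit(
        content[train_idx], [labels[i] for i in train_idx])
    for i in test_idx:
        labels[i] = int(clf0.predict(content[i].reshape(1, -1))[0])
    # Train the local classifier on content + relational features.
    X = [np.concatenate([content[i],
                         relational_features(i, labels, graph, n_classes)])
         for i in train_idx]
    clf = LogisticRegression(max_iter=1000).fit(X, [labels[i] for i in train_idx])
    # Step 2: iteratively recompute relational features and re-predict.
    for _ in range(n_iters):
        for i in test_idx:
            x = np.concatenate([content[i],
                                relational_features(i, labels, graph, n_classes)])
            labels[i] = int(clf.predict(x.reshape(1, -1))[0])
    return labels
```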

  6. Introduction – ICA Example • class labels: 1, 2, 3; the training data are labeled and nodes A–H are unlabeled • Initial: use content features to predict the unlabeled instances • Iteration 1: 1. set each unlabeled instance's relational features 2. use the local classifier to re-predict the unlabeled instances • Iteration 2: the same two steps again (figure: a small graph whose predicted labels change across the iterations)

  7. Problem – Noise • noise: an instance labeled with the wrong class • sources: the labeler makes a mistake, or the instance is genuinely difficult to judge (figure: a graph in which one node's label could plausibly be 1 or 2)

  8. Problem – using ICA • unlabeled data: A; its neighborhood contains a noisy node B • ICA: • Initial: use content features to predict the unlabeled instances • Iterations 1 and 2: set each unlabeled instance's relational features, then re-predict with the local classifier • A's relational feature vector over classes (1, 2, 3): iteration 1: (2/3, 1/3, 0); iteration 2: (2/3, 1/3, 0) • every neighbor counts equally, so the noisy label keeps its full weight and the feature never changes
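Reading the example through its table: A appears to have three relevant neighbors, two predicted as class 1 and one (the noisy B) as class 2. A quick check, with those neighbor labels taken as an assumption:

```python
# Count-based aggregation gives every neighbor weight 1, so the noisy
# neighbor never loses influence and A's feature is identical each iteration.
neighbor_labels = [1, 1, 2]   # assumed from the example; the 2 is the noisy B
feature = [neighbor_labels.count(c) / len(neighbor_labels) for c in (1, 2, 3)]
print(feature)                # [0.666..., 0.333..., 0.0], i.e. 2/3, 1/3, 0
```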

  9. ACIN • ACIN: Algorithm for Collective Inference with Noise
  Initial (step 1): train the local classifier; use content features to predict the unlabeled instances.
  Iterative: repeat {
    for each unlabeled instance A {
      (step 2) for each neighbor nb of A: if nb needs to be predicted again, (class label, probability) = local classifier(nb)
      (step 3) set A's relational features
      (step 4) (class label, probability) = local classifier(A)
    }
    (step 5) retrain the local classifier
  }
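A minimal Python sketch of steps 1 through 5, in the same setup as the ICA sketch above. The (label, probability) bookkeeping and the re-predict flag anticipate Analyses #1 and #2 below; every helper name is illustrative, and this is a plausible reading of the slides rather than the authors' exact code:

```python
# Minimal ACIN sketch (illustrative). Setup as in the ICA sketch: `content`
# is a 2-D array, `graph[i]` lists neighbors of node i, `labels` is mutable.
import numpy as np
from sklearn.linear_model import LogisticRegression

def predict_with_prob(clf, x):
    p = clf.predict_proba(x.reshape(1, -1))[0]
    return int(np.argmax(p)), float(p.max())

def weighted_relational(i, labels, probs, graph, n_classes):
    # Probability-weighted neighbor aggregation (Analysis #1).
    w = np.zeros(n_classes)
    for nb in graph[i]:
        if labels[nb] is not None:
            w[labels[nb]] += probs[nb]
    s = w.sum()
    return w / s if s > 0 else w

def vec(i, content, labels, probs, graph, n_classes):
    return np.concatenate(
        [content[i], weighted_relational(i, labels, probs, graph, n_classes)])

def acin(content, graph, labels, train_idx, test_idx, n_classes, n_iters=5):
    probs = {i: 1.0 for i in train_idx}        # training labels are certain
    # Step 1: content-only bootstrap, as in ICA.
    clf0 = LogisticRegression(max_iter=1000).fit(
        content[train_idx], [labels[i] for i in train_idx])
    for i in test_idx:
        labels[i], probs[i] = predict_with_prob(clf0, content[i])
    repredict = set(test_idx)                  # every unlabeled node starts flagged

    def fit(idx):
        X = [vec(i, content, labels, probs, graph, n_classes) for i in idx]
        return LogisticRegression(max_iter=1000).fit(X, [labels[i] for i in idx])

    clf = fit(train_idx)
    for _ in range(n_iters):
        for i in test_idx:
            # Step 2: re-predict flagged neighbors before trusting them.
            for nb in graph[i]:
                if nb in repredict:
                    lab, p = predict_with_prob(
                        clf, vec(nb, content, labels, probs, graph, n_classes))
                    if lab == labels[nb]:      # labels agree: average, unflag
                        probs[nb] = (probs[nb] + p) / 2
                        repredict.discard(nb)
                    else:                      # labels differ: keep the label
                        probs[nb] = 0.0        # but zero its weight (Method 3)
            # Steps 3-4: set relational features, then predict this instance.
            labels[i], probs[i] = predict_with_prob(
                clf, vec(i, content, labels, probs, graph, n_classes))
        # Step 5: retrain on the training nodes plus the predicted nodes.
        clf = fit(list(train_idx) + list(test_idx))
    return labels, probs
```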

  10. ACIN – Example • same data as the ICA example, with the noisy node B • ACIN: • Initial: use content features to predict the unlabeled instances; every prediction is a (class label, probability) pair, e.g. (1, 60%) • Iteration 1: 1. predict each unlabeled instance's neighbors 2. set its relational features 3. predict it with the local classifier • Iteration 2: 1. re-predict the neighbors that still need it 2. set relational features 3. predict again • A's relational feature vector over classes (1, 2, 3): iteration 1: (70/130, 60/130, 0); iteration 2: (60/120, 60/120, 0) (figure: the graph with each node's (label, probability) pair per iteration)
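One reading of the table's numbers, taking the neighbor (label, probability) pairs as assumptions recovered from the garbled figure:

```python
# Probability-weighted features reproducing the example's table.
import numpy as np

def weighted(pairs, n_classes=3):
    w = np.zeros(n_classes)
    for lab, p in pairs:          # pairs of (class label, probability)
        w[lab - 1] += p
    return w / w.sum()

print(weighted([(1, 0.70), (2, 0.60)]))   # iteration 1: 70/130, 60/130, 0
print(weighted([(1, 0.60), (2, 0.60)]))   # iteration 2: 60/120, 60/120, 0
```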

  11. ACIN – Analysis • compared with ICA, ACIN differs in three ways: • 1. a different method for computing the relational features • 2. predicting an unlabeled instance's neighbors again • 3. retraining the local classifier

  12. ACIN – Analysis #1 • computing the relational features • use the prediction probabilities • general method (counts): with three neighbors predicted as classes 1, 2, and 3, each class gets 1/3 • our method (probability-weighted): with neighbor predictions (1, 80%), (2, 60%), (3, 70%): class 1: 80/(80+60+70), class 2: 60/(80+60+70), class 3: 70/(80+60+70)
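A worked version of this slide's numbers (the three neighbor predictions are taken directly from the example):

```python
import numpy as np

neighbor_preds = [(1, 0.80), (2, 0.60), (3, 0.70)]   # (class, probability)
counts, weights = np.zeros(3), np.zeros(3)
for c, p in neighbor_preds:
    counts[c - 1] += 1       # general method: every neighbor counts once
    weights[c - 1] += p      # our method: weight by prediction probability
print(counts / counts.sum())     # [0.333 0.333 0.333]
print(weights / weights.sum())   # [0.381 0.286 0.333] = 80/210, 60/210, 70/210
```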

  13. ACIN – Analysis #2 • predicting an unlabeled instance's neighbors again • in the first iteration every neighbor needs to be predicted again • if the original and the newly predicted labels differ: • the neighbor must be predicted again next iteration • the new prediction is not adopted this iteration • if the original and the newly predicted labels agree: • no need to predict again next iteration • average the two probabilities • example: a neighbor stored as (2, 60%) is re-predicted as (2, 80%); the labels agree, so it becomes (2, 70%) and is not flagged again

  14. ACIN – Analysis #2 (cont.) • example: B is stored as (1, 50%), its true label is 2, and the new prediction is (2, 60%) • three candidate rules: • Method 1: keep the original prediction, B = (1, 50%) • Method 2: adopt the new prediction, B = (2, 60%) • Method 3: keep the original label with zero probability, B = (1, 0%) • if B is noise: Method 2 > Method 3 > Method 1 • if B is not noise: Method 1 > Method 3 > Method 2 • Methods 1 and 2 are both too extreme, so we choose Method 3 (a sketch of the three rules follows)
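The three rules as functions over (label, probability) pairs; the readings of Methods 1 and 2 are inferred from the example values on this slide, so treat them as assumptions:

```python
def method1(old, new):
    return old                    # always keep the original: B stays (1, 0.50)

def method2(old, new):
    return new                    # always adopt the new: B becomes (2, 0.60)

def method3(old, new):
    if new[0] == old[0]:          # labels agree: average the confidences
        return old[0], (old[1] + new[1]) / 2
    return old[0], 0.0            # labels differ: keep the label, zero its weight

print(method3((1, 0.50), (2, 0.60)))   # (1, 0.0), matching the slide
```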

  15. ACIN – Analysis #3 • retraining the local classifier • initial (as in ICA): the classifier is trained only on the original training data • retrain: the predicted unlabeled instances, with their (label, probability) pairs, are added to the training set and the classifier is trained again (figure: the training graph before and after predicted nodes are added)
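A sketch of one plausible retraining step: predicted unlabeled nodes are appended to the training set with their current labels before the classifier is refit. The slide does not say whether a confidence threshold is applied, so the threshold parameter here is an assumption:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def retrain(X, labels, probs, train_idx, test_idx, threshold=0.0):
    # X is the full feature matrix (content + relational features).
    # Add predicted nodes (optionally only the confident ones) to the
    # training set, then refit the local classifier from scratch.
    idx = list(train_idx) + [i for i in test_idx if probs[i] >= threshold]
    return LogisticRegression(max_iter=1000).fit(
        X[idx], [labels[i] for i in idx])
```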

  16. Experiments • Data sets: • Cora • CiteSeer • WebKB

  17. Conclusions
