Ambiguous Nodes in Networked Data Based on Measuring Reliable Neighboring Probabilities
Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date: 2013.01.04
Outline
• Introduction
  • Network data
  • Traditional vs. networked data classification
  • Collective classification
  • ICA
• Problem
• Our Method
  • Collective Inference with Ambiguous Nodes (CIAN)
• Experiments
• Conclusion
Introduction – Network Data
• Traditional data: instances are independent of each other
• Network data: instances may be related to each other
• Applications: emails, web pages, paper citations
Introduction – Traditional vs. Network Data Classification
[Figure: the same graph of nodes A–H classified two ways; traditional classification uses only each node's content, while network data classification also uses the links between nodes when assigning classes 1 and 2]
Introduction – Collective Classification
• Classify interrelated instances using both content features and link features.
[Figure: node D's input vector is its content features concatenated with a link feature; with one neighbor of class 1 and one of class 2, D's link feature is the class-proportion vector (1/2, 1/2, 0)]
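As a concrete illustration (a sketch, not the authors' code), a link feature can be computed as the class-proportion vector over a node's labeled neighbors; the function and graph names below are illustrative:

```python
import numpy as np

def link_feature(node, graph, labels, n_classes):
    """Class-proportion vector over a node's currently labeled neighbors."""
    counts = np.zeros(n_classes)
    for nb in graph[node]:                 # graph: dict node -> list of neighbors
        if labels.get(nb) is not None:     # skip neighbors that are still unlabeled
            counts[labels[nb]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

# D's neighbors B (class 0) and E (class 1) give the proportions (1/2, 1/2, 0)
graph = {"D": ["B", "E"]}
labels = {"B": 0, "E": 1}
print(link_feature("D", graph, labels, 3))  # [0.5 0.5 0. ]
```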
Introduction – ICA
• ICA: Iterative Classification Algorithm
Step 1: train the local classifier; use content features to predict the unlabeled instances.
Step 2: iterate {
  for each unlabeled instance {
    set the instance's link feature from its neighbors' current labels
    use the local classifier to re-predict the instance
  }
}
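A minimal Python sketch of ICA under some assumptions: a scikit-learn-style local classifier (naive Bayes, one of the classifiers used in the experiments later), content features stored per node, and the link_feature helper from the previous sketch. Bootstrapping here computes link features from labeled neighbors only, a simplification; all names are illustrative:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def ica(graph, content, train_ids, y_train, unlab_ids, n_classes, n_iter=10):
    """Iterative Classification Algorithm (sketch).
    graph:   dict node -> list of neighbors
    content: dict node -> 1-D content feature vector"""
    labels = dict(zip(train_ids, y_train))   # known labels are never changed

    def features(node):                      # content features + link feature
        return np.hstack([content[node], link_feature(node, graph, labels, n_classes)])

    # Step 1: train the local classifier and bootstrap the unlabeled instances
    clf = GaussianNB()
    clf.fit([features(n) for n in train_ids], y_train)
    for node in unlab_ids:
        labels[node] = clf.predict([features(node)])[0]

    # Step 2: iterate, refreshing each link feature from the current predictions
    for _ in range(n_iter):
        for node in unlab_ids:
            labels[node] = clf.predict([features(node)])[0]
    return {n: labels[n] for n in unlab_ids}
```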
Introduction – ICA Example
[Figure: a training graph and an unlabeled graph with classes 1–3; link features computed from neighbors' labels include A: (2/3, 0, 1/3) and (1/3, 1/3, 1/3), B: (1/2, 1/2, 0) and (1/4, 1/2, 1/4)]
Problem – Ambiguous Nodes
• An ambiguous node is one whose label is difficult to judge, so the classifier easily labels it with the wrong class.
[Figure: node E's neighbors are split between classes 1 and 2, so E could be labeled 1 or 2]
Problem – Using ICA
[Figure: the unlabeled graph contains an ambiguous node B; ICA propagates B's unreliable label through link features such as A: (2/3, 1/3, 0) and C: (2/3, 1/3, 0), so B's neighbors can be misclassified]
Idea
• Make a new prediction for the neighbors of each unlabeled instance
• Use probabilities to compute the link feature
• Retrain the CC classifier
Our Method – Method #1
• Compute the link feature from prediction probabilities instead of hard label counts.
General method: neighbors labeled 1, 2, 3 give the link feature (1/3, 1/3, 1/3).
Our method: with neighbor predictions (1, 80%), (2, 60%), (3, 70%), the link feature is (80/210, 60/210, 70/210).
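A short sketch of Method #1's probability-weighted link feature; prob_link_feature and the example graph are illustrative names, not from the paper:

```python
import numpy as np

def prob_link_feature(node, graph, preds, n_classes):
    """Method #1 (sketch): link feature weighted by prediction confidence.
    preds: dict node -> (class_label, probability)."""
    weights = np.zeros(n_classes)
    for nb in graph[node]:
        if nb not in preds:          # neighbor has no prediction yet: skip it
            continue
        label, prob = preds[nb]
        weights[label] += prob
    total = weights.sum()
    return weights / total if total > 0 else weights

# Neighbors predicted (class 0, 80%), (class 1, 60%), (class 2, 70%)
graph = {"A": ["x", "y", "z"]}
preds = {"x": (0, 0.8), "y": (1, 0.6), "z": (2, 0.7)}
print(prob_link_feature("A", graph, preds, 3))  # [0.381 0.286 0.333] = 80/210, 60/210, 70/210
```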
Our Method – Method #2
• Predict each unlabeled instance's neighbors again before adopting their labels.
[Figure: two graphs with true classes shown; in both, B is originally predicted (1, 70%), but predicting B again yields class 2 instead, whether B is an ambiguous node or a noise node]
Our Method – Method #2
• Predict each unlabeled instance's neighbors again:
  • In the first iteration, every neighbor needs to be predicted again.
  • If the new label differs from the original label: do not adopt this neighbor in this iteration, and predict it again in the next iteration.
  • If the new label matches the original label: average the two probabilities, and do not predict again in the next iteration.
Example: A's neighbor B, originally (2, 60%) and re-predicted (2, 80%), becomes (2, 70%); its neighbor C, originally (2, 60%) but re-predicted (1, 80%), is not adopted.
Our Method – Method #2
[Figure: node x (true label 2) with neighbors w, y, z; x's original prediction is (1, 60%) and its new prediction is (2, 70%)]
Three candidate rules when the new prediction disagrees with the original, e.g., for x: Method A keeps the original label, (1, 50%); Method B changes to the new label, (2, 60%); Method C adopts neither this iteration, keeping the label but zeroing its weight, (1, 0%).
• If x is an ambiguous (or noise) node: Method B > Method C > Method A
• If x is not an ambiguous (or noise) node: Method A > Method C > Method B
Methods A and B are both too extreme, so we choose Method C (see the sketch below).
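A minimal sketch of the Method #2 combination rule, assuming "not adopt" (Method C) means keeping the old label but zeroing its weight, as the (1, 0%) entry on the slide suggests:

```python
def combine(original, renewed):
    """Method #2 rule (sketch): merge a neighbor's original prediction with its
    re-prediction. Each prediction is a (class_label, probability) pair.
    Returns the merged prediction and whether to re-predict next iteration."""
    old_label, old_prob = original
    new_label, new_prob = renewed
    if old_label == new_label:
        # Same label: average the probabilities, stop re-predicting this node
        return (old_label, (old_prob + new_prob) / 2), False
    # Different label (Method C): keep the old label but zero its weight, so the
    # node contributes nothing to link features this iteration; re-predict next time
    return (old_label, 0.0), True

print(combine((2, 0.60), (2, 0.80)))  # ((2, 0.7), False)
print(combine((2, 0.60), (1, 0.80)))  # ((2, 0.0), True)
```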
Our Method – Method #2: Accuracy [chart]
Our Method – Method #3
• Retrain the CC classifier.
[Figure: the classifier is first trained as in ICA on the labeled graph; once the unlabeled nodes carry predictions with probabilities, e.g., D: (3, 70%), A: (1, 80%), B: (2, 70%), E: (2, 60%), C: (1, 90%), the local classifier is retrained]
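One way to read Method #3 as code (an assumption, not the authors' implementation): refit the local classifier on every node, treating training labels as certain (probability 1.0) and using the current predictions for the unlabeled nodes, with prob_link_feature from the Method #1 sketch:

```python
import numpy as np

def retrain(clf, graph, content, preds, n_classes):
    """Method #3 (sketch): refit the local classifier on all nodes, using the
    current predictions as labels for the unlabeled ones. Assumes preds maps
    every node to (class_label, probability), training nodes at probability 1.0."""
    X = [np.hstack([content[n], prob_link_feature(n, graph, preds, n_classes)])
         for n in preds]
    y = [preds[n][0] for n in preds]
    clf.fit(X, y)
    return clf
```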
CIAN Example – Ambiguous
[Figure: B is an ambiguous node; its original prediction (1, 60%) changes to (2, 60%) when predicted again, so B is not adopted this iteration; with probability-weighted link features such as (1/2, 1/2, 0), CIAN labels B's neighbors correctly where ICA misclassifies them]
CIAN Example – Noise
[Figure: B is a noise node; predicting B again yields a different label, so B is not adopted this iteration; CIAN's link features lead to the correct neighbor labels where ICA misclassifies them]
CIAN
• CIAN: Collective Inference with Ambiguous Nodes
Step 1: train the local classifier; use content features to predict the unlabeled instances.
Iterate {
  for each unlabeled instance A {
    Step 2: for each neighbor nb of A {
      if nb needs to be predicted again:
        (class label, probability) = local classifier(nb)
    }
    Step 3: set A's link feature from the neighbors' (class label, probability) pairs
    Step 4: (class label, probability) = local classifier(A)
  }
  Step 5: retrain the local classifier
}
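Putting the three methods together, a hedged end-to-end sketch of CIAN; it reuses prob_link_feature, combine, and retrain from the earlier sketches, and every name is illustrative rather than from the paper:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def cian(graph, content, train_ids, y_train, unlab_ids, n_classes, n_iter=10):
    """CIAN (sketch): ICA extended with probability-weighted link features
    (Method #1), neighbor re-prediction (Method #2), and retraining (Method #3)."""
    preds = {n: (y, 1.0) for n, y in zip(train_ids, y_train)}  # training labels are certain
    clf = GaussianNB()

    def features(node):
        return np.hstack([content[node], prob_link_feature(node, graph, preds, n_classes)])

    def predict(node):
        proba = clf.predict_proba([features(node)])[0]
        return int(np.argmax(proba)), float(proba.max())

    # Step 1: train the local classifier and bootstrap the unlabeled instances
    clf.fit([features(n) for n in train_ids], y_train)
    for node in unlab_ids:
        preds[node] = predict(node)

    repredict = dict.fromkeys(unlab_ids, True)   # first iteration re-predicts everyone
    for _ in range(n_iter):
        for node in unlab_ids:
            # Step 2: re-predict the neighbors that still need it
            for nb in graph[node]:
                if repredict.get(nb):            # falsy for training neighbors
                    preds[nb], repredict[nb] = combine(preds[nb], predict(nb))
            # Steps 3-4: features() refreshes the link feature; re-predict the node
            preds[node] = predict(node)
        retrain(clf, graph, content, preds, n_classes)   # Step 5
    return {n: preds[n] for n in unlab_ids}
```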
Experiments – Experimental Setting
• Fixed parameters
• Compare CO, ICA, and CIAN
Experiments
• 1. Misclassified nodes: proportion of misclassified nodes (0%–30%, 80%)
• 2. Ambiguous nodes: NB vs. SVM
• 3. Misclassified and ambiguous nodes: proportion of misclassified and ambiguous nodes (0%–30%, 80%)
• 4. Iterations and stability: number of iterations
Experiments – 1. misclassified • CiteSeer
Experiments – 1. misclassified • WebKB-texas
Experiments – 1. misclassified • WebKB-washington
Experiments – 1. misclassified • 80% of misclassified nodes
Experiments – 2. ambiguous • Cora Max ambiguous nodes : 429 Max ambiguous nodes : 356
Experiments – 2. ambiguous • CiteSeer Max ambiguous nodes : 590 Max ambiguous nodes : 365
Experiments – 2. ambiguous • WebKB-texas Max ambiguous nodes : 52 Max ambiguous nodes : 20
Experiments – 2. ambiguous • WebKB-washington Max ambiguous nodes : 33 Max ambiguous nodes : 31
Experiments – 2. ambiguous • How many ambiguous nodes do NB and SVM have in common?
Experiments – 3. misclassified and ambiguous • CiteSeer
Experiments – 3. misclassified and ambiguous • WebKB-texas
Experiments – 3. misclassified and ambiguous • WebKB-washington
Experiments – 3. misclassified and ambiguous • 80% of misclassified and ambiguous nodes
Experiments • When is the accuracy of ICA lower than that of CO?
Experiments – 4. iteration & stable • CiteSeer
Experiments – 4. iteration & stable • WebKB-texas
Experiments – 4. iteration & stable • WebKB-washington