Multi-Label Collective Classification
Xiangnan Kong, Xiaoxiao Shi, Philip S. Yu
University of Illinois at Chicago
Collective Classification
• Conventional classification approaches assume that instances are independent and identically distributed (i.i.d.).
[Diagram: instances x1, x2 with labels y1, y2; the instances are independent]
• In relational data and information networks, instances are correlated with each other.
[Diagram: instances x1, x2, x3 with labels y1, y2, y3; the instances are related]
Example: Collective Classification
• Given a set of web pages linked with each other, we need to classify their categories.
• Task: predict the labels of the web pages collectively, while considering dependencies among linked web pages.
[Figure: linked web pages with labels y1 to y6, split into training data and test data whose labels ("?") are unknown]
Examples
• Coauthor Networks
• Business Network
Collective Classification
• Collective classification: given a set of instances that are related to each other, how can we predict their labels simultaneously?
• Existing methods
  • Exploit the dependencies among related instances
  • Focus on single-label settings
  • Assume each instance can have only one label
Multi-label Collective Classification
• How can we effectively predict the label sets of a group of related instances?
[Figure: researchers linked by collaborations, each labeled with one or more research areas such as DM, IR, DB, and AI]
The Problem
• Multiple labels: the number of possible label sets is very large (the power set of all labels). With 20 labels there are 2^20, roughly 1 million, possible label sets. The key is to exploit the correlations among the multiple labels.
• Relational data: the label sets of related instances are correlated with each other.
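The "20 labels, 1 million label sets" figure can be checked directly. The following is a minimal Python sketch; the function names `count_label_sets` and `enumerate_label_sets` are illustrative and do not come from the slides.

```python
from itertools import chain, combinations


def count_label_sets(num_labels: int) -> int:
    """Size of the power set of a vocabulary with num_labels labels."""
    return 2 ** num_labels


def enumerate_label_sets(labels):
    """Return an iterator over every subset of labels.

    Only feasible for small vocabularies, which is exactly the point
    the slide makes about the size of the output space.
    """
    labels = list(labels)
    return chain.from_iterable(
        combinations(labels, size) for size in range(len(labels) + 1)
    )


print(count_label_sets(20))  # 1048576
print(len(list(enumerate_label_sets(["DB", "DM", "IR"]))))  # 8
```

Because the count doubles with every added label, treating each label set as its own class quickly becomes impractical, which is why the slide turns to label correlations instead.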
Intra-instance Cross-label Dependency
• e.g. "X1 is more likely to be DM, if labeled with DB or ML"
• e.g. "X1 is unlikely to be Bio, if labeled with OS"
[Diagram: instances x1, x2, x3, each with labels Y1, …, Yk, …, Ym]
Inter-instance Single-label Dependency
• e.g. "X1 is more likely to be DM, if collaborators (X2, X3) are labeled with DM"
[Diagram: instances x1, x2, x3, each with labels Y1, …, Yk, …, Ym]
Inter-instance Cross-label Dependency
• e.g. "X1 is more likely to be DM, if collaborators (X2, X3) are labeled with DB or ML"
[Diagram: instances x1, x2, x3, each with labels Y1, …, Yk, …, Ym]
All dependencies: our approach
[Diagram: instances x1, x2, x3, each with labels Y1, …, Yk, …, Ym]
Relational Feature Aggregation
[Figure: for each instance (x1, x2, …) and each label (Y1, Y2, Y3), a binary feature vector made of content features and relational features, alongside the labels]
Relational features come from three kinds of dependency:
• (1) Intra-Instance Cross-Label
• (2) Inter-Instance Single-Label
• (3) Inter-Instance Cross-Label
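The three kinds of relational features named on this slide could be assembled along the following lines. This is a sketch under stated assumptions, not the authors' implementation: the slide does not say how neighbor labels are aggregated, so the fraction of neighbors carrying a label is an assumed choice, and the name `relational_features` and the data layout are hypothetical.

```python
def relational_features(i, k, labels, neighbors):
    """Relational features for instance i with respect to target label k.

    labels:    list of 0/1 rows, labels[i][j] = 1 if instance i has label j
    neighbors: mapping from an instance index to its linked instance indices
    Aggregating by neighbor fraction is an assumption made for this sketch.
    """
    num_labels = len(labels[i])
    other_labels = [j for j in range(num_labels) if j != k]
    linked = neighbors[i]

    def neighbor_fraction(j):
        # Share of linked instances that carry label j (0.0 if isolated).
        if not linked:
            return 0.0
        return sum(labels[u][j] for u in linked) / len(linked)

    # (1) Intra-instance cross-label: the instance's own other labels.
    intra_cross = [labels[i][j] for j in other_labels]
    # (2) Inter-instance single-label: label k among the neighbors.
    inter_single = [neighbor_fraction(k)]
    # (3) Inter-instance cross-label: the other labels among the neighbors.
    inter_cross = [neighbor_fraction(j) for j in other_labels]
    return intra_cross + inter_single + inter_cross


labels = [[1, 0, 1], [1, 1, 0], [0, 1, 0]]
neighbors = {0: [1, 2], 1: [0], 2: [0]}
print(relational_features(0, 0, labels, neighbors))  # [0, 1, 0.5, 1.0, 0.0]
```

The returned vector would be concatenated with the instance's content features before being passed to the local model for label k.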
Inference: Iterative Classification of Multiple Labels
• Initialize label sets (only use content features)
• Update relational features (using predicted label sets)
• Update label sets (using content features + relational features)
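The inference steps on this slide can be sketched as a loop. The sketch below makes several assumptions that the slide does not state: the two update steps repeat until the label sets stop changing or an iteration cap is hit, all instances are updated synchronously, and neighbor labels are aggregated as fractions. The names `iterative_inference`, `content_model`, and `full_model` are hypothetical stand-ins for trained local classifiers.

```python
def relational_features(i, k, labels, neighbors):
    """Own other labels, then neighbor fractions for label k and the others."""
    others = [j for j in range(len(labels[i])) if j != k]
    linked = neighbors[i]

    def frac(j):
        return sum(labels[u][j] for u in linked) / len(linked) if linked else 0.0

    return [labels[i][j] for j in others] + [frac(k)] + [frac(j) for j in others]


def iterative_inference(features, neighbors, num_labels,
                        content_model, full_model, max_iter=10):
    """Predict a 0/1 label row per instance.

    content_model(k, x)    -> 0/1 using content features only
    full_model(k, x, rel)  -> 0/1 using content + relational features
    """
    n = len(features)
    # Step 1: initialize label sets from content features alone.
    labels = [[content_model(k, features[i]) for k in range(num_labels)]
              for i in range(n)]
    for _ in range(max_iter):
        updated = []
        for i in range(n):
            row = []
            for k in range(num_labels):
                # Step 2: rebuild relational features from current predictions.
                rel = relational_features(i, k, labels, neighbors)
                # Step 3: re-predict label k with content + relational features.
                row.append(full_model(k, features[i], rel))
            updated.append(row)
        if updated == labels:  # assumed stopping rule: no label changed
            break
        labels = updated
    return labels
```

In use, `content_model` and `full_model` would wrap one trained classifier per label; any binary classifier could fill that role.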
The ICML Approach
Properties:
• Simple and efficient: train multiple local models to perform collective classification on multiple labels.
• Effective: by considering the dependencies among related instances and multiple labels, the classification performance can be greatly improved over independent models.
Experiments: Compared Methods
• Binary classification
  • Binary SVM: binary decomposition + SVM [Boutell et al., PR'04]. Dependencies exploited: none
• Multi-label classification
  • ECC & CC: ensemble + classifier chains [Read et al., ECML'09]. Dependencies exploited: 1
• Collective classification
  • ICA: iterative classification algorithm [Lu & Getoor, ICML'03]. Dependencies exploited: 2
• Multi-label collective classification
  • ML-ICA: a proposed baseline [this paper]. Dependencies exploited: 1, 2
  • ICML: the proposed approach [this paper]. Dependencies exploited: 1, 2, 3
Legend: 1 = Intra-Instance Cross-Label, 2 = Inter-Instance Single-Label, 3 = Inter-Instance Cross-Label
Experiments: Data Sets
• Research Collaboration Networks (DBLP)
  • Node: researcher
  • Features: bag-of-words for paper titles
  • Link: collaboration
  • Label: research area (DB, AI, IR, OS, etc.)
• Movie Database (IMDB)
  • Node: movie
  • Features: bag-of-words for movie plot
  • Link: shared director
  • Label: movie type (comedy, horror, etc.)
Evaluation
• Multi-label metrics (↓ the smaller the better, ↑ the larger the better)
  • Hamming Loss ↓ [Elisseeff & Weston, NIPS'02]: average number of labels misclassified
  • Subset 0/1 Loss ↓ [Ghamrawi & McCallum, CIKM'05]: average number of label sets misclassified
  • Micro-F1 ↑ [Ghamrawi & McCallum, CIKM'05]: micro average of F1 score
  • Macro-F1 ↑ [Ghamrawi & McCallum, CIKM'05]: macro average of F1 score
• 5-fold cross-validation
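For concreteness, the four metrics can be computed from 0/1 label matrices as sketched below. This follows the common textbook definitions rather than anything spelled out on the slide; in particular, averaging Macro-F1 over labels and scoring F1 as 0 when a label has no true or predicted positives are assumptions, and the function names are illustrative.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (instance, label) entries that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred)
                for t, p in zip(tr, pr))
    return wrong / total


def subset_zero_one_loss(y_true, y_pred):
    """Fraction of instances whose predicted label set is not exactly right."""
    wrong = sum(list(tr) != list(pr) for tr, pr in zip(y_true, y_pred))
    return wrong / len(y_true)


def _counts(y_true, y_pred, k):
    """True positives, false positives, false negatives for label k."""
    tp = sum(1 for tr, pr in zip(y_true, y_pred) if tr[k] and pr[k])
    fp = sum(1 for tr, pr in zip(y_true, y_pred) if not tr[k] and pr[k])
    fn = sum(1 for tr, pr in zip(y_true, y_pred) if tr[k] and not pr[k])
    return tp, fp, fn


def _f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # assumed convention for 0/0


def micro_f1(y_true, y_pred):
    """F1 computed on counts pooled over all labels."""
    per_label = [_counts(y_true, y_pred, k) for k in range(len(y_true[0]))]
    tp, fp, fn = (sum(col) for col in zip(*per_label))
    return _f1(tp, fp, fn)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    num_labels = len(y_true[0])
    return sum(_f1(*_counts(y_true, y_pred, k))
               for k in range(num_labels)) / num_labels


y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(subset_zero_one_loss(y_true, y_pred))  # 0.5
print(micro_f1(y_true, y_pred))              # 0.8
```

Micro-F1 is dominated by frequent labels because counts are pooled, while Macro-F1 weights every label equally, so rare labels affect it more.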
Experiment Results: DBLP-A Dataset
[Diagram: instances x1 and x2, each with labels Y1, …, Yk, …, Ym]
Experiment Results: DBLP-A Dataset
[Diagram: instances x1 and x2, each with labels Y1, …, Yk, …, Ym. Legend: (1) Intra-Instance Cross-Label]
Experiment Results: DBLP-A Dataset
[Diagram: instances x1 and x2, each with labels Y1, …, Yk, …, Ym. Legend: (2) Inter-Instance Single-Label]
Experiment Results: DBLP-A Dataset
[Diagram: instances x1 and x2, each with labels Y1, …, Yk, …, Ym. Legend: (1) Intra-Instance Cross-Label, (2) Inter-Instance Single-Label]
Experiment Results: DBLP-A Dataset
[Diagram: instances x1 and x2, each with labels Y1, …, Yk, …, Ym. Legend: (1) Intra-Instance Cross-Label, (2) Inter-Instance Single-Label, (3) Inter-Instance Cross-Label]
Experiment Results DBLP-B Dataset
Experiment Results: IMDB Dataset
• Our approach performed best on both the DBLP and IMDB datasets.
Experiment Results
[Figure: the ICML approach on the DBLP-A dataset, plotted against the number of iterations]
Conclusions
• Multi-label Collective Classification
  • Propose an algorithm to exploit the dependencies among label sets of related instances:
    • Intra-instance Cross-label Dependency
    • Inter-instance Single-label Dependency
    • Inter-instance Cross-label Dependency
  • Classification performance can be improved by considering the dependencies among instances and different labels.
Thank you!