
Pairwise Constraint Propagation for Semi-Supervised Classification

This work presents a method for classifying data from pairwise constraints and unlabeled samples, exploring a novel approach to constraint propagation. The framework learns a nonlinear mapping that satisfies the constraints and reshapes the data for classification, with emphasis on smoothness and on the distances between objects. Experimental results demonstrate the effectiveness of the proposed method. Future work includes improving the handling of noisy constraints and extending the approach to practical applications.


Presentation Transcript


  1. Pairwise Constraint Propagation by Semidefinite Programming for Semi-Supervised Classification Zhenguo Li (Joint work with Jianzhuang Liu and Xiaoou Tang) Department of Information Engineering The Chinese University of Hong Kong

  2. Outline • Semi-Supervised Classification • Our Work • Experimental Results • Conclusions and Future Work

  3. Traditional Semi-Supervised Classification • Learning from labeled and unlabeled data. • Assumption • Nearby objects tend to be in the same class (cluster assumption). • Idea • The known class labels are propagated smoothly to unlabeled data (label propagation).
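
Label propagation is only named above; as a point of reference, the following is a minimal sketch of one standard instantiation (in the spirit of Zhou et al.'s local-and-global-consistency method), not a detail taken from this talk. The names W, Y, and alpha are illustrative.

  import numpy as np

  def propagate_labels(W, Y, alpha=0.99):
      # W: (n, n) symmetric affinity matrix over all points.
      # Y: (n, c) one-hot label matrix with zero rows for unlabeled points.
      d = W.sum(axis=1)
      S = W / np.sqrt(np.outer(d, d))                      # D^{-1/2} W D^{-1/2}
      F = np.linalg.solve(np.eye(len(W)) - alpha * S, Y)   # closed-form propagation
      return F.argmax(axis=1)                              # predicted class per point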

  4. Challenges • The distributions of real-world data are often more complex than expected: • a class may consist of multiple separate groups; • different classes may be close to each other or even overlap. • Pairwise constraints are natural in these circumstances: they specify whether two objects are in the same class or not (must-link and cannot-link). • Techniques for label propagation are not readily extended to handle pairwise constraints.

  5. Our Work • We consider the general problem of classification from pairwise constraints and unlabeled data. • It is more general than traditional semi-supervised classification. • In contrast to label propagation, we attempt to explore an approach for pairwise constraint propagation.

  6. A Toy Classification Example

  7. The Global Viewpoint • The must-link constraint asks to merge the outer and inner circles into one class; • The cannot-link constraint asks to keep the middle and outer circles in different classes.

  8. Our Assumptions • Cluster Assumption • Nearby objects should be in the same class. • Pairwise Constraint Assumption • Objects similar to two must-link objects respectively should be in the same class; • Objects similar to two cannot-link objects respectively should be in different classes. • Our goal is to implement both assumptions in a unified framework.

  9. Our Idea • Learn a nonlinear mapping to reshape the data such that • Nearby objects are mapped nearby; • Two must-link objects are mapped close and two cannot-link objects are mapped far apart; • Objects similar to two must-link objects respectively are mapped close, and objects similar to two cannot-link objects respectively are mapped far apart.  • In doing so, the pairwise constraints will be propagated to the entire data set.

  10. The General Framework

  11. Interpretation • Constraint Satisfaction: the inequalities require two must-link objects to be mapped close and two cannot-link objects to be mapped far apart. • Constraint Propagation: by enforcing smoothness on the mapping, two objects similar to two must-link objects respectively are mapped close, and two objects similar to two cannot-link objects respectively are mapped far apart. • After the mapping, hopefully each class becomes compact and different classes become far apart.

  12. The Unit Hypersphere Model • All the objects are mapped onto the unit hypersphere. • Two must-link objects are mapped to the same point. • Two cannot-link objects are mapped to be orthogonal. • A smoothness measure is imposed on the mapping (see the sketch below).
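
One way to write these requirements out (a hedged reconstruction; f, w_{ij}, and d_i are notation introduced here for the mapping, the affinities, and the degrees, since the slide's own formulas are not shown):

  \|f(x_i)\| = 1 for all i (unit hypersphere),
  \langle f(x_i), f(x_j) \rangle = 1 for must-link pairs (mapped to the same point),
  \langle f(x_i), f(x_j) \rangle = 0 for cannot-link pairs (orthogonal),
  \Omega(f) = \tfrac{1}{2} \sum_{i,j} w_{ij} \big\| \tfrac{f(x_i)}{\sqrt{d_i}} - \tfrac{f(x_j)}{\sqrt{d_j}} \big\|^2 as one standard smoothness measure.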

  13. Learning a Kernel Matrix • Let K be the Gram matrix of the mapped objects. • The matrix K can be thought of as a kernel over the data set, and the mapping is just the feature map induced by K. • (Kernel Trick) We can implicitly obtain the feature map by explicitly pursuing the corresponding kernel matrix.
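
In symbols (again notation introduced here for illustration): K_{ij} = \langle f(x_i), f(x_j) \rangle, i.e. K = F^{\top} F where the i-th column of F is f(x_i). Any symmetric positive semidefinite matrix arises this way, which is why the feature map can be recovered implicitly once K is known.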

  14. Learning a Kernel Matrix • In terms of the kernel matrix K, the must-link and cannot-link constraints become constraints on individual entries of K. • The smoothness measure becomes a function that is linear in K (one possible form is sketched below).
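
One plausible reading of these conditions in kernel form (hedged, since the slide's own formulas are not shown): K_{ii} = 1 for all i, K_{ij} = 1 for must-link pairs, K_{ij} = 0 for cannot-link pairs, and \Omega = \mathrm{tr}(\bar{L} K), where \bar{L} = I - D^{-1/2} W D^{-1/2} is the normalized graph Laplacian of the affinity matrix W. All of these are linear in K, which is what makes a semidefinite-programming formulation possible.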

  15. The SDP Problem
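
Putting slides 12-15 together, here is a minimal sketch of the resulting semidefinite program, written with cvxpy for concreteness; the equality form of the constraints and the trace objective are one reading of the slides, not code from the authors. L_norm (the normalized Laplacian), must_links, and cannot_links are assumed to be built elsewhere from the affinity graph and the constraint sets.

  import cvxpy as cp

  def learn_kernel(L_norm, n, must_links, cannot_links):
      K = cp.Variable((n, n), PSD=True)                  # kernel matrix to learn
      cons = [cp.diag(K) == 1]                           # unit hypersphere: K_ii = 1
      cons += [K[i, j] == 1 for (i, j) in must_links]    # must-link: mapped to the same point
      cons += [K[i, j] == 0 for (i, j) in cannot_links]  # cannot-link: orthogonal
      prob = cp.Problem(cp.Minimize(cp.trace(L_norm @ K)), cons)  # smoothness tr(L K)
      prob.solve()
      return K.value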

  16. Kernel K-means • Finally, we apply kernel K-means to the learned kernel matrix to obtain the k classes of the objects.
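
The talk does not spell out the kernel K-means step; one concrete way to realize it (exact for a positive semidefinite kernel) is to recover an embedding from the eigendecomposition of the learned K and run ordinary K-means on it, as in this sketch:

  import numpy as np
  from sklearn.cluster import KMeans

  def kernel_kmeans(K, n_classes, random_state=0):
      eigvals, eigvecs = np.linalg.eigh(K)        # K is symmetric PSD
      eigvals = np.clip(eigvals, 0.0, None)       # guard against tiny negative eigenvalues
      embedding = eigvecs * np.sqrt(eigvals)      # rows have pairwise inner products equal to K
      km = KMeans(n_clusters=n_classes, n_init=10, random_state=random_state)
      return km.fit_predict(embedding)            # one label per object, k classes in total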

  17. Experimental Results: Toy Data • Distance matrices before and after the mapping 
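
The "distance after the mapping" in such a comparison can be read directly off the learned kernel matrix, since \|f(x_i) - f(x_j)\|^2 = K_{ii} + K_{jj} - 2K_{ij}; a small helper (illustrative, not from the talk):

  import numpy as np

  def distances_from_kernel(K):
      d = np.diag(K)
      sq = d[:, None] + d[None, :] - 2.0 * K      # squared distances in the mapped space
      return np.sqrt(np.clip(sq, 0.0, None))      # clip small numerical negatives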

  18. Experimental Results: UCI Data

  19. Experimental Results: Image Data

  20. Conclusions • We have proposed a framework, PCP, for learning from pairwise constraints and unlabeled data: • It can effectively propagate pairwise constraints; • It is formulated as an SDP problem. • Future work includes • accelerating PCP; • handling noisy constraints effectively; • applying PCP to practical applications.

  21. Thank You!
