1 / 19

Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problems




Presentation Transcript


  1. Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problems (JMLR 2006). Tijl De Bie and Nello Cristianini. Presented by Lihan He, March 16, 2007.

  2. Outline • Statement of the problem • Spectral relaxation and eigenvector • SDP relaxation and Lagrange dual • Generalization: between spectral and SDP • Transduction and side information • Experiments • Conclusions

  3. Statement of the problem
  Data set: S = {x_1, …, x_n}.
  Affinity matrix: A, symmetric, with A(i,j) ≥ 0 measuring the similarity between x_i and x_j.
  Objective: graph cut clustering. Divide the data points into two sets, P and N, such that the total affinity between the two sets is small.
  No labels: clustering. With some labels: transduction.

  4. Statement of the problem: normalized graph cut (NCut)
  Cost function: NCut(P, N) = cut(P, N) · (1/d(P) + 1/d(N))
  Cut cost: cut(P, N) = Σ_{i∈P, j∈N} A(i,j), the total affinity severed by the cut.
  Balance factor: 1/d(P) + 1/d(N) measures how well the clusters are balanced, where d = A·1 is the degree vector and d(P) = Σ_{i∈P} d(i).

  5. Statement of the problem: normalized graph cut (NCut)
  Unknown label vector: y ∈ {−1, +1}^n, with y(i) = +1 for i ∈ P and y(i) = −1 for i ∈ N.
  Let D = diag(d) and L = D − A (the graph Laplacian). Write the cluster indicator as a vector v with v(i) ∈ {1, −b} for some b > 0.
  The NCut problem can then be rewritten as a combinatorial optimization problem:
    min_v (v^T L v) / (v^T D v)  s.t.  v^T D 1 = 0,  v(i) ∈ {1, −b}, b > 0.   (1)
  This is an NP-complete problem: the number of candidate labelings grows exponentially with n.
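To make the combinatorial nature concrete, here is a small Python sketch (my own illustration, not the paper's code) that evaluates the NCut cost of a bipartition and finds the optimum by brute force. The enumeration over all sign patterns is exactly what makes the problem intractable at scale.

```python
import itertools
import numpy as np

def ncut_cost(A, labels):
    """NCut cost of a bipartition given as a +/-1 label vector:
    cut(P,N) * (1/d(P) + 1/d(N)), where d = A @ 1 is the degree vector."""
    d = A.sum(axis=1)
    P, N = labels == 1, labels == -1
    cut = A[np.ix_(P, N)].sum()          # total affinity severed by the cut
    return cut * (1.0 / d[P].sum() + 1.0 / d[N].sum())

def brute_force_ncut(A):
    """Enumerate all 2^(n-1) - 1 nontrivial bipartitions (exponential in n)."""
    n = A.shape[0]
    best, best_y = np.inf, None
    for bits in itertools.product([1, -1], repeat=n - 1):
        y = np.array((1,) + bits)        # fix y[0] = 1: labels are sign-symmetric
        if np.all(y == 1):
            continue                     # skip the trivial one-sided partition
        c = ncut_cost(A, y)
        if c < best:
            best, best_y = c, y
    return best, best_y

# toy graph: two obvious clusters {0,1} and {2,3}, weakly connected
A = np.array([[0, 5, 1, 0],
              [5, 0, 0, 1],
              [1, 0, 0, 5],
              [0, 1, 5, 0]], float)
cost, y = brute_force_ncut(A)            # recovers the {0,1} vs {2,3} split
```

Fixing the first label exploits the sign symmetry of the cost, so each unordered bipartition is evaluated once; the search is still exponential in n.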

  6. Spectral relaxation
  Let v range over all of R^n. Dropping the combinatorial constraint v(i) ∈ {1, −b} and adding the normalization v^T D v = 1, we obtain the spectral clustering relaxation:
    min_v v^T L v  s.t.  v^T D v = 1,  v^T D 1 = 0.   (2)

  7. Spectral relaxation: eigenvector
  Solution: the eigenvector corresponding to the second smallest generalized eigenvalue of L v = λ D v.
  Solve the constrained optimization via the Lagrangian L(v, λ) = v^T L v − λ (v^T D v − 1); setting the gradient to zero gives L v = λ D v.
  The second constraint is automatically satisfied: since L 1 = 0, the constant vector 1 is the generalized eigenvector of the smallest eigenvalue λ = 0, and eigenvectors of distinct generalized eigenvalues are D-orthogonal, so the second eigenvector obeys v^T D 1 = 0.
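The spectral relaxation is cheap to compute. A minimal sketch using `scipy.linalg.eigh` for the generalized eigenproblem L v = λ D v (the helper name `spectral_cut` is mine):

```python
import numpy as np
from scipy.linalg import eigh

def spectral_cut(A):
    """Spectral relaxation (2): min v'Lv s.t. v'Dv = 1, v'D1 = 0.

    Solved by the generalized eigenvector of L v = lambda D v with the
    second smallest eigenvalue; sign(v) gives rounded +/-1 labels."""
    d = A.sum(axis=1)
    D = np.diag(d)
    L = D - A                      # unnormalized graph Laplacian
    w, V = eigh(L, D)              # generalized eigenpairs, ascending order
    v = V[:, 1]                    # second smallest: the relaxed label vector
    return np.where(v >= 0, 1, -1), v

A = np.array([[0, 5, 1, 0],
              [5, 0, 0, 1],
              [1, 0, 0, 5],
              [0, 1, 5, 0]], float)
labels, v = spectral_cut(A)        # splits {0,1} from {2,3}
```

`eigh(L, D)` returns eigenvectors normalized so that v^T D v = 1, so the objective value v^T L v is the generalized eigenvalue itself.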

  8. SDP relaxation
  Let Γ = v v^T; the problem becomes linear in Γ: the objective turns into ⟨L, Γ⟩, and the constraints of (1) turn into linear constraints on Γ.
  Note that Γ = v v^T for some v if and only if Γ ⪰ 0 and rank(Γ) = 1.
  Relax by adding the constraint Γ ⪰ 0 together with the linearized constraints, and dropping the non-convex rank-one and combinatorial constraints. This yields the SDP relaxation (3), a convex problem that can be solved to global optimality.

  9. SDP relaxation: Lagrange dual
  Forming the Lagrangian of (3) and eliminating Γ yields the dual problem (4), which has only n + 1 variables. Strong duality holds, so the dual optimum equals the optimum of (3).

  10. Generalization: between spectral and SDP
  A cascade of relaxations, tighter than spectral and looser than SDP: restrict the relaxation to the column space of an n × m matrix W. The n constraints of the full SDP collapse to m constraints, and the dual has only m + 1 variables; the result is looser than the full SDP.
  Two design choices: how to relax the constraints, and the structure of W.

  11. Generalization: between spectral and SDP
  • rank(W) = n: the original SDP relaxation.
  • rank(W) = 1 (m = 1, W = d): the spectral relaxation.
  • A relaxation is tighter than another if the column space of its W contains the full column space of the other's W.
  • If d is chosen within the column space of W, all relaxations in the cascade are tighter than the spectral relaxation.
  • One approach to designing W, proposed by the authors:
    • Sort the entries of the label vector (the 2nd eigenvector) from the spectral relaxation;
    • Reorder the data points by this sorted order;
    • Partition them into m contiguous subsets that are roughly equally large (about n/m points each);
    • W is the resulting n × m 0/1 block matrix: W(i,j) = 1 if point i falls in subset j.
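The construction of W described above is straightforward to implement. A sketch (the function name `build_W` is my own, and I assume a contiguous, roughly equal split of the sorted order):

```python
import numpy as np

def build_W(v_spectral, m):
    """Subspace matrix W for the intermediate relaxations.

    Sort the points by the spectral label vector, split the sorted order
    into m contiguous, roughly equal groups, and let column j of W be the
    0/1 indicator of group j."""
    n = len(v_spectral)
    order = np.argsort(v_spectral)                  # reorder by spectral score
    bounds = np.linspace(0, n, m + 1).astype(int)   # ~n/m points per group
    W = np.zeros((n, m))
    for j in range(m):
        W[order[bounds[j]:bounds[j + 1]], j] = 1.0
    return W

# toy spectral label vector for 6 points, split into m = 3 groups
v = np.array([0.9, -0.3, 0.1, -0.8, 0.5, -0.1])
W = build_W(v, 3)
```

Each row of W has exactly one nonzero entry, so constraining the relaxed label vector to the column space of W forces the labels to be constant within each group.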

  12. Transduction
  Given some labels, written as a label vector y_t: the transductive problem.
  Reparameterize the label vector as y = L g, with the block matrix
    L = [ y_t  0 ]
        [  0   I ]
  where the top block covers the labeled points and the bottom block the unlabeled points.
  The label constraints are then imposed automatically:
  • Rows (columns) corresponding to oppositely labeled training points are each other's opposite;
  • Rows (columns) corresponding to same-labeled training points are equal to each other.
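The block reparameterization y = L g can be written down directly; a small sketch (helper name mine):

```python
import numpy as np

def transduction_map(y_t, n_unlabeled):
    """Block matrix L from the slide, so that y = L @ g.

    The first column is the known label vector y_t: all labeled points
    share one free variable, with sign flips between the two classes.
    The bottom-right identity block leaves the unlabeled points free."""
    n_t = len(y_t)
    L = np.zeros((n_t + n_unlabeled, 1 + n_unlabeled))
    L[:n_t, 0] = y_t
    L[n_t:, 1:] = np.eye(n_unlabeled)
    return L

y_t = np.array([1.0, -1.0, 1.0])      # three labeled points
L = transduction_map(y_t, 2)          # plus two unlabeled points
# choosing g = [1, a, b] reproduces the training labels exactly
y = L @ np.array([1.0, 0.3, -0.7])
```

The parameter count drops from n to 1 + n_unlabeled, which is where the "n_test + 2 variables" of the transductive dual on the next slide comes from.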

  13. Transduction
  The transductive NCut relaxation has only n_test + 2 variables.

  14. General constraints
  • An equivalence constraint between two sets of data points specifies that they belong to the same class;
  • An inequivalence constraint specifies that two sets of data points belong to opposite classes.
  • In both cases, no detailed label information is provided.

  15. Experiments
  1. Toy problems, with the affinity matrix built from a kernel on the data points.

  16. Experiments
  2. Clustering and transduction on text.
  Data set: 195 articles in 4 languages, covering several topics.
  Distance between two articles: cosine distance on the bag-of-words representation (after defining a dictionary).
  Affinity matrix, from the 20 nearest neighbors: A(i,j) = 1 if i and j appear in each other's 20-nearest-neighbor lists, 0.5 if only one of them does, and 0 otherwise.
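As I read the flattened slide, the affinity gives weight 1 to mutual nearest neighbors and 0.5 to one-directional ones. A sketch of that construction under cosine distance (function name mine; a small k in place of 20 for the toy usage):

```python
import numpy as np

def knn_affinity(X, k):
    """A(i,j) = 1 if i and j are in each other's k-NN lists under cosine
    distance, 0.5 if only one of them is, 0 otherwise."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Dist = 1.0 - Xn @ Xn.T                 # cosine distance on unit vectors
    np.fill_diagonal(Dist, np.inf)         # a point is not its own neighbor
    nn = np.argsort(Dist, axis=1)[:, :k]   # k nearest neighbors per row
    member = np.zeros_like(Dist, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    member[rows, nn.ravel()] = True        # member[i,j]: j is a k-NN of i
    A = 0.5 * (member.astype(float) + member.T.astype(float))
    np.fill_diagonal(A, 0.0)
    return A

# four toy "bag-of-words" vectors forming two tight pairs, k = 1
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
A = knn_affinity(X, 1)                     # mutual pairs (0,1) and (2,3)
```

Averaging `member` with its transpose yields exactly the 1 / 0.5 / 0 scheme and makes A symmetric, as a graph cut formulation requires.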

  17. Experiments
  2. Clustering and transduction on text: cost.
  [Two plots of cut cost versus the fraction of labeled data points: clustering by language (left) and by topic (right). Curves: spectral and SDP, each shown as a randomized-rounding cost and a lower bound.]
  For each relaxation: randomized rounding cost ≥ optimal cost ≥ lower bound.
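The "randomized rounding" curves come from rounding the relaxed matrix solution back to ±1 labels. A Goemans-Williamson-style sketch, which I assume matches the paper's procedure in spirit: sample g ~ N(0, Γ) and keep the best sign pattern.

```python
import numpy as np

def randomized_rounding(Gamma, A, n_samples=100, rng=None):
    """Round a relaxed solution Gamma back to +/-1 labels: draw
    g ~ N(0, Gamma), take signs, and keep the sample with the best
    NCut cost (a sketch of one standard rounding scheme)."""
    rng = np.random.default_rng(rng)
    d = A.sum(axis=1)
    w, V = np.linalg.eigh(Gamma)           # Gamma is PSD: factor it
    F = V * np.sqrt(np.clip(w, 0, None))   # Gamma = F @ F.T
    best, best_y = np.inf, None
    for _ in range(n_samples):
        y = np.where(F @ rng.standard_normal(len(d)) >= 0, 1, -1)
        if np.all(y == y[0]):
            continue                       # skip one-sided sign patterns
        P, N = y == 1, y == -1
        cost = A[np.ix_(P, N)].sum() * (1 / d[P].sum() + 1 / d[N].sum())
        if cost < best:
            best, best_y = cost, y
    return best_y, best

# sanity check: a rank-one Gamma = y* y*' is rounded back to +/- y*
y_star = np.array([1, 1, -1, -1])
Gamma = np.outer(y_star, y_star).astype(float)
A = np.array([[0, 5, 1, 0],
              [5, 0, 0, 1],
              [1, 0, 0, 5],
              [0, 1, 5, 0]], float)
yr, cost = randomized_rounding(Gamma, A, n_samples=20, rng=0)
```

The cost of the rounded labeling upper-bounds the combinatorial optimum, while the relaxation's objective lower-bounds it, giving the sandwich shown on the slide.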

  18. Experiments
  2. Clustering and transduction on text: accuracy.
  [Two plots of accuracy versus the fraction of labeled data points: by language (left) and by topic (right). Curves: SDP (randomized rounding) and spectral (randomized rounding).]

  19. Conclusions
  • Proposed a new cascade of SDP relaxations of the NP-complete normalized graph cut optimization problem;
  • One extreme of the cascade is the spectral relaxation; the other is the newly proposed SDP relaxation;
  • Applies to unsupervised and semi-supervised learning, and to more general constraints;
  • The cascade balances computational cost against relaxation accuracy.
