Improving the Graph Mincut Approach to Learning from Labeled and Unlabeled Examples

Avrim Blum, John Lafferty, Raja Reddy, Mugizi Rwebangira

Outline
• Often have little labeled data but lots of unlabeled data
• Graph mincuts: based on a belief that most ‘close’ examples have the same classification
• Problem: mincut does not say where it is most confident
• Our approach: add noise to edges to extract confidence scores

Learning using Graph Mincuts: Blum and Chawla (ICML 2001)

Construct a Graph

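Add sink and source [figure: graph with + and - terminal nodes]

Obtain s-t mincut [figure: the mincut separating + from -]

Classification [figure: examples labeled + or - by their side of the mincut]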
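The four steps above (construct a graph, add source and sink, obtain the s-t mincut, classify) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the slides do not say how edge weights are chosen, so `edges` is simply assumed as input, the function name `min_cut_labels` is hypothetical, and Edmonds-Karp is used here only because it is short to write down.

```python
from collections import deque

def min_cut_labels(n, edges, positives, negatives):
    """Label n examples with +1 / -1 using an s-t minimum cut.

    edges: dict mapping (i, j) to a non-negative similarity weight (assumed given).
    positives, negatives: disjoint lists of indices of the labeled examples.
    """
    s, t = n, n + 1                       # extra source (+) and sink (-) nodes
    big = sum(edges.values()) + 1.0       # heavier than any cut of example edges
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add(u, v, w):
        cap[u][v] = cap[u].get(v, 0.0) + w
        cap[v].setdefault(u, 0.0)         # reverse residual edge

    for (i, j), w in edges.items():       # similarity edges are undirected
        add(i, j, w)
        add(j, i, w)
    for i in positives:
        add(s, i, big)
    for i in negatives:
        add(i, t, big)

    def bfs():
        """Return the BFS tree (node -> parent) of the residual graph from s."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: push flow along shortest residual paths until none is left.
    while True:
        parent = bfs()
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= flow
            cap[b][a] += flow

    # Nodes still reachable from the source lie on the + side of the cut.
    reachable = bfs()
    return [1 if i in reachable else -1 for i in range(n)]

# Toy chain 0-1-2-3-4 with one weak edge between examples 1 and 2.
chain = {(0, 1): 3.0, (1, 2): 1.0, (2, 3): 3.0, (3, 4): 3.0}
print(min_cut_labels(5, chain, positives=[0], negatives=[4]))  # [1, 1, -1, -1, -1]
```

Examples that are not reachable from the source are labeled negative here; that tie-breaking convention is a choice of this sketch.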

Goal
• To obtain a measure of confidence on each classification
Our approach
• Add random noise to the edges
• Run min cut several times
• For each unlabeled example take majority vote
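The approach above can be sketched as follows. Several details are assumptions made only for illustration: the slides do not specify the noise distribution, so uniform noise on `[0, noise]` is used; the confidence score is taken to be the fraction of runs agreeing with the majority label; and the min cut is found by exhaustive search, which works only on tiny graphs and stands in for a real max-flow solver. All function and parameter names are hypothetical.

```python
import itertools
import random

def brute_force_min_cut(n, edges, labeled):
    """Exhaustive min cut for tiny graphs (stand-in for a max-flow solver).

    edges: dict (i, j) -> weight; labeled: dict index -> +1 / -1.
    Returns the labeling of all n examples that cuts the least edge weight.
    """
    free = [i for i in range(n) if i not in labeled]
    best_cost, best = None, None
    for signs in itertools.product((1, -1), repeat=len(free)):
        y = dict(labeled)
        y.update(zip(free, signs))
        cost = sum(w for (i, j), w in edges.items() if y[i] != y[j])
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, y
    return [best[i] for i in range(n)]

def noisy_mincut_vote(n, edges, labeled, runs=100, noise=0.5, seed=0):
    """Return (majority label, confidence) per example over noisy mincut runs."""
    rng = random.Random(seed)
    plus_votes = [0] * n
    for _ in range(runs):
        # Assumed noise model: independent uniform noise added to every edge.
        noisy = {e: w + rng.uniform(0.0, noise) for e, w in edges.items()}
        for i, y in enumerate(brute_force_min_cut(n, noisy, labeled)):
            plus_votes[i] += (y == 1)
    results = []
    for v in plus_votes:
        label = 1 if 2 * v >= runs else -1
        results.append((label, max(v, runs - v) / runs))
    return results

# Toy chain 0-1-2-3-4: example 2 sits between two equally weak edges.
chain = {(0, 1): 5.0, (1, 2): 1.0, (2, 3): 1.0, (3, 4): 5.0}
for i, (label, conf) in enumerate(noisy_mincut_vote(5, chain, {0: 1, 4: -1})):
    print(i, label, conf)
```

In this toy chain, examples 1 and 3 get the same label in every run, while example 2 flips depending on which weak edge the noise makes cheaper, so its confidence is lower.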

Experiments
• Digits data set (each digit is a 16 × 16 integer array)
• 100 labeled examples
• 3900 unlabeled examples
• 100 runs of mincut

Results

Conclusions
• 3% error on 80% of the data
• Standard mincut only gives us 6% error on all the data
Future Work
• Conduct further experiments on other data sets
• Compare with the similar algorithm of Jerry Zhu
• Investigate the properties of the distribution we get by selecting minimum cuts in this way
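One plausible reading of "3% error on 80% of the data" is the error rate measured on only the 80% of examples with the highest confidence scores; the slides do not spell this out, so treat that reading as an assumption. Under it, the figure could be computed roughly as below. The function name and the toy inputs are hypothetical and are not the experimental data.

```python
def error_at_coverage(predictions, confidences, truth, coverage=0.8):
    """Error rate on the most confident `coverage` fraction of the examples."""
    order = sorted(range(len(predictions)),
                   key=lambda i: confidences[i], reverse=True)
    keep = order[:max(1, round(coverage * len(order)))]
    wrong = sum(predictions[i] != truth[i] for i in keep)
    return wrong / len(keep)

# Made-up toy values, only to show the calculation.
pred = [1, 1, -1, -1, 1]
conf = [1.0, 0.9, 0.95, 0.8, 0.55]
true = [1, 1, -1, -1, -1]
print(error_at_coverage(pred, conf, true, coverage=0.8))  # 0.0
print(error_at_coverage(pred, conf, true, coverage=1.0))  # 0.2
```

With `coverage=1.0` the same function gives the error on all the data, which is the quantity the standard mincut figure refers to.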

Questions?

