Learning using Graph Mincuts
Shuchi Chawla, Carnegie Mellon University
1/11/2003
Learning from Labeled and Unlabeled Data
• Cheap and available in large amounts
• Gives information about the distribution of examples
• Useful in combination with a prior
• Our prior: ‘close’ examples have a similar classification
Classification using Graph Mincut
• Suppose the quality of a classification is defined by pairwise relationships between examples: if two examples are similar but classified differently, we incur a penalty (e.g. Markov Random Fields)
• A graph mincut minimizes this penalty (see the sketch below)
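A minimal sketch of this construction, assuming a pairwise similarity function `sim` and index sets `labeled`/`unlabeled` (illustrative names, not from the talk): tie each labeled example to an artificial source or sink with infinite-capacity edges, weight pairwise edges by similarity, and read the classification off a minimum s-t cut.

```python
# Sketch of mincut classification; `sim`, `labeled`, `unlabeled` are
# assumed inputs, not names from the talk.
import networkx as nx

def mincut_classify(X, y, labeled, unlabeled, sim):
    G = nx.Graph()
    # Tie labeled examples to artificial terminals with infinite capacity,
    # so the cut can never separate an example from its known label.
    for i in labeled:
        G.add_edge('s' if y[i] == 1 else 't', i, capacity=float('inf'))
    # Pairwise edges: cutting a heavy edge (a similar pair) costs more.
    nodes = list(labeled) + list(unlabeled)
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            i, j = nodes[a], nodes[b]
            G.add_edge(i, j, capacity=sim(X[i], X[j]))
    # The minimum s-t cut is the labeling with the least total penalty.
    _, (source_side, _) = nx.minimum_cut(G, 's', 't')
    return {i: 1 if i in source_side else 0 for i in nodes}
```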
Design Issues
• What is the right energy function?
• Given an energy function, find a graph that represents it
• We deal with a simpler question: given a distance metric on the data, “learn” a graph (edge weights) that gives a good clustering
Assigning Edge Weights
• Some decreasing function of the distance between nodes, e.g. an exponential decrease with an appropriate slope
• Unit-weight edges:
– Connect nodes if they are within a distance d. What is a good value of d?
– Connect every node to its k nearest neighbours. What is a good value of k?
• A sparser graph => a faster algorithm
(All three schemes are sketched below.)
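In the sketches below, `D` is an assumed precomputed pairwise distance matrix with a zero diagonal, and `alpha`, `d`, `k` are illustrative knobs.

```python
# Sketches of the three edge-weighting schemes above.
import numpy as np

def exp_weights(D, alpha):
    # Exponential decrease with distance; alpha sets the slope.
    return np.exp(-alpha * D)

def unit_weights_within_d(D, d):
    # Unit-weight edge iff a pair lies within distance d.
    W = (D <= d).astype(float)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def knn_weights(D, k):
    # Unit-weight edge from each node to its k nearest neighbours.
    W = np.zeros_like(D)
    for i in range(len(D)):
        nbrs = np.argsort(D[i])[1:k + 1]  # position 0 is the node itself
        W[i, nbrs] = 1.0
    return np.maximum(W, W.T)  # symmetrize the neighbour relation
```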
Connecting “nearby” nodes
• Connect every pair with distance less than d
• Need a method for finding a “good” d; this is very problem dependent
• Possible approach: use the degree of connectivity, the density of edges, or the value of the cut to pick the right value
Connecting “nearby” nodes
• As d increases, the value of the cut increases
• Cut value = 0 => a supposedly no-error situation (“Mincut-d0”)
• Very sensitive to ambiguity in the classification or to noise in the dataset
• Should allow longer-distance dependencies
Connecting “nearby” nodes
• Grow d until the graph becomes sufficiently well connected
• Growing until the largest component contains half the nodes seems to work well (“Mincut-½”); a sketch follows
• Reasonably robust to noise
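A sketch of the Mincut-½ rule, under the assumption that the candidate thresholds are just the observed pairwise distances (`D` is again an assumed distance matrix):

```python
# Grow d until the largest connected component covers half the nodes.
import numpy as np
import networkx as nx

def choose_d_half(D):
    n = len(D)
    # Candidate thresholds: every observed pairwise distance, ascending.
    for d in np.sort(D[np.triu_indices(n, k=1)]):
        adj = ((D <= d) & (D > 0)).astype(int)
        G = nx.from_numpy_array(adj)
        if max(len(c) for c in nx.connected_components(G)) >= n / 2:
            return d
    return None  # no threshold reached the half-coverage mark
```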
A sample of results
Which mincut is the “correct” mincut?
• There can be “many” mincuts in the graph
• Assign a high confidence value to examples on which all mincuts agree
• Overall accuracy is related to the fraction of examples that get a “high-confidence” label
• Grow d until a reasonable fraction of examples gets a high-confidence label
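One way to make “all mincuts agree” operational, sketched here as an assumption rather than the talk’s actual procedure: rerun the mincut under small random perturbations of the edge weights and score each unlabeled example by how often the runs agree. This reuses `mincut_classify` from the earlier sketch; `trials` and `eps` are invented knobs.

```python
# Confidence via agreement across perturbed mincuts (an illustrative
# stand-in for checking whether "all mincuts agree").
import numpy as np

def mincut_confidence(X, y, labeled, unlabeled, sim, trials=20, eps=0.1):
    rng = np.random.default_rng(0)
    votes = {i: 0 for i in unlabeled}
    for _ in range(trials):
        # Jitter each similarity slightly so that different mincuts of
        # the unperturbed graph win on different runs.
        noisy_sim = lambda a, b: sim(a, b) * (1.0 + eps * rng.random())
        labels = mincut_classify(X, y, labeled, unlabeled, noisy_sim)
        for i in unlabeled:
            votes[i] += labels[i]
    # Confidence = fraction of runs agreeing with the majority label.
    return {i: max(votes[i], trials - votes[i]) / trials for i in unlabeled}
```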
Connecting to nearest neighbours
• Connect every node to its k nearest neighbours
• If k is too small, the graph may break into small disconnected components
• Remedy: connect every node to its m nearest labeled examples and its k other nearest neighbours (sketched below)
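A sketch of that mixed rule, with `D` an assumed distance matrix and `m`, `k` assumed parameters:

```python
# Connect each node to its m nearest labeled examples plus its k
# nearest remaining neighbours.
import numpy as np

def mixed_knn_edges(D, labeled, m, k):
    labeled = set(labeled)
    edges = set()
    for i in range(len(D)):
        order = [j for j in np.argsort(D[i]) if j != i]
        nearest_labeled = [j for j in order if j in labeled][:m]
        nearest_other = [j for j in order if j not in labeled][:k]
        for j in nearest_labeled + nearest_other:
            edges.add((min(i, j), max(i, j)))  # undirected edge
    return edges
```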
Other “hacks”
• Weigh edges to labeled and unlabeled examples differently
• Weigh different attributes differently, e.g. using information gain as in decision trees (one possible instantiation is sketched below)
• Weigh edges to positive and negative examples differently, for a more balanced cut
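For the attribute-weighting hack, one concrete (assumed) instantiation: stretch each attribute by an information-gain-like score estimated on the labeled examples before computing distances. Here scikit-learn’s `mutual_info_classif` stands in for decision-tree information gain.

```python
# Scale attributes by an information-gain-like score, then build
# pairwise distances in the stretched space.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def attribute_weighted_distances(X, y, labeled):
    # Estimate per-attribute relevance from the labeled examples only.
    gains = mutual_info_classif(X[labeled], y[labeled])
    Xw = X * gains  # informative attributes count for more
    diff = Xw[:, None, :] - Xw[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```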