Similarity-based Classifiers: Problems and Solutions
Classifying based on similarities: given example paintings by Van Gogh and by Monet, is a new painting a Van Gogh or a Monet?
The Similarity-based Classification Problem: given only pairwise similarities between samples (paintings), predict each sample's class label (painter).
Examples of Similarity Functions

Computational Biology
• Smith-Waterman algorithm (Smith & Waterman, 1981)
• FASTA algorithm (Lipman & Pearson, 1985)
• BLAST algorithm (Altschul et al., 1990)

Computer Vision
• Tangent distance (Duda et al., 2001)
• Earth mover's distance (Rubner et al., 2000)
• Shape matching distance (Belongie et al., 2002)
• Pyramid match kernel (Grauman & Darrell, 2007)

Information Retrieval
• Levenshtein distance (Levenshtein, 1966)
• Cosine similarity between tf-idf vectors (Manning & Schütze, 1999)
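To make the last entry concrete, here is a minimal sketch of one such similarity function: cosine similarity between tf-idf vectors, using standard scikit-learn calls (the toy corpus is invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus, invented for illustration
docs = [
    "the starry night was painted by van gogh",
    "monet painted the water lilies series",
    "van gogh painted sunflowers and the starry night",
]

# Map each document to a tf-idf vector, then compute pairwise cosine similarity
tfidf = TfidfVectorizer().fit_transform(docs)
S = cosine_similarity(tfidf)   # S[i, j] = cosine similarity of docs i and j
print(S.round(2))
```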
Example: Amazon similarity
[Figure: 96 × 96 matrix of pairwise Amazon similarities between 96 books.]
Example: Amazon similarity
[Figure: eigenvalues of the symmetrized 96 × 96 Amazon similarity matrix plotted against rank. The spectrum contains negative eigenvalues, so the similarity matrix is indefinite: it is not a valid kernel matrix.]
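A quick way to see this in practice is to symmetrize a similarity matrix and inspect its spectrum. A minimal numpy sketch, with a random stand-in for the real 96 × 96 Amazon matrix:

```python
import numpy as np

# Stand-in similarity matrix; in the slide, n = 96 Amazon books.
rng = np.random.default_rng(0)
S = rng.random((96, 96))

# Similarities need not be symmetric; symmetrize before eigenanalysis.
S_sym = 0.5 * (S + S.T)
eigvals = np.linalg.eigvalsh(S_sym)   # sorted ascending

print("smallest eigenvalue:", eigvals[0])
print("number of negative eigenvalues:", np.sum(eigvals < 0))
# Negative eigenvalues mean S_sym is indefinite, i.e. not a kernel matrix.
```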
Well, let’s just make S be a kernel matrix: Flip, Clip or Shift?
• Flip: replace each eigenvalue with its absolute value.
• Clip: set each negative eigenvalue to zero.
• Shift: add a constant to every eigenvalue so the smallest becomes zero.
Best bet is Clip.
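A sketch of the three spectrum modifications, assuming a symmetric (or symmetrized) similarity matrix:

```python
import numpy as np

def make_psd(S, method="clip"):
    """Turn a similarity matrix into a PSD (kernel) matrix by
    modifying its eigenvalue spectrum."""
    lam, U = np.linalg.eigh(0.5 * (S + S.T))   # symmetrize, eigendecompose
    if method == "clip":      # zero out negative eigenvalues
        lam = np.maximum(lam, 0.0)
    elif method == "flip":    # take absolute values of eigenvalues
        lam = np.abs(lam)
    elif method == "shift":   # raise all eigenvalues so the smallest is 0
        lam = lam - lam.min() if lam.min() < 0 else lam
    return (U * lam) @ U.T    # reassemble U diag(lam) U^T
```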
Well, let’s just make S be a kernel matrix
Alternatively, learn the best kernel matrix for the SVM (Luss & d'Aspremont, NIPS 2007; Chen et al., ICML 2009).
Let the similarities to the training samples be features (see the sketch below)
• SVM (Graepel et al., 1998; Liao & Noble, 2003)
• Linear programming (LP) machine (Graepel et al., 1999)
• Linear discriminant analysis (LDA) (Pekalska et al., 2001)
• Quadratic discriminant analysis (QDA) (Pekalska & Duin, 2002)
• Potential support vector machine (P-SVM) (Hochreiter & Obermayer, 2006; Knebel et al., 2008)
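A minimal sketch of the similarities-as-features idea with a linear SVM; the similarity matrix and labels here are random placeholders:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Placeholder data: S_train is the n x n matrix of similarities among
# training samples, y the n-vector of labels.
n = 100
rng = np.random.default_rng(0)
S_train = rng.random((n, n))
y = rng.integers(0, 2, n)

# Treat row i of S_train -- the similarities of sample i to all training
# samples -- as an ordinary feature vector, and train a linear SVM on it.
clf = LinearSVC().fit(S_train, y)

# A test sample is represented by its similarities to the training samples.
s_test = rng.random((1, n))
print(clf.predict(s_test))
```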
Weighted Nearest-Neighbors
Take a weighted vote of the k nearest neighbors: classify the test point as the class receiving the largest total weight,
$\hat{y} = \arg\max_{g} \sum_{i \in \mathcal{N}_k(x)} w_i \, \mathbb{1}(y_i = g).$
This is the algorithmic parallel of the exemplar model of human learning.
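A minimal sketch of the weighted vote, assuming the k nearest neighbors and their weights have already been found (the values are invented):

```python
import numpy as np

def weighted_knn_vote(y_neighbors, weights):
    """Return the class with the largest total weight among the k neighbors."""
    classes = np.unique(y_neighbors)
    totals = np.array([weights[y_neighbors == c].sum() for c in classes])
    return classes[totals.argmax()]

# Example: 3 neighbors with invented labels and weights
y_nn = np.array([1, 0, 1])
w = np.array([0.5, 0.3, 0.2])
print(weighted_knn_vote(y_nn, w))   # -> 1, since 0.5 + 0.2 > 0.3
```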
Design Goals for the Weights (Chen et al., JMLR 2009)
Design Goal 1 (Affinity): $w_i$ should be an increasing function of $\psi(x, x_i)$, the similarity of the test point to neighbor $x_i$.
Design Goal 2 (Diversity): $w_i$ should be a decreasing function of $\psi(x_i, x_j)$, the similarity of neighbor $x_i$ to the other neighbors $x_j$.
Linear Interpolation Weights
Linear interpolation weights meet these goals: choose the weights that best reconstruct the test point from its neighbors,
$\min_{w} \Big\| \sum_i w_i x_i - x \Big\|^2 \quad \text{s.t. } \sum_i w_i = 1,\; w_i \ge 0.$
LIME weights
Linear interpolation with maximum entropy (LIME) weights (Gupta et al., IEEE PAMI 2006) add an entropy regularizer that spreads weight across the neighbors:
$\min_{w} \Big\| \sum_i w_i x_i - x \Big\|^2 - \lambda H(w) \quad \text{s.t. } \sum_i w_i = 1,\; w_i \ge 0,$
where $H(w) = -\sum_i w_i \log w_i$.
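A numerical sketch of LIME weights under the formulation above, solved on the probability simplex with scipy (the value of λ and the log guard are illustrative, not the authors' settings):

```python
import numpy as np
from scipy.optimize import minimize

def lime_weights(X, x, lam=0.1, eps=1e-12):
    """Sketch of LIME weights (linear interpolation + max-entropy term).

    X   : (k, d) array, rows are the k nearest neighbors
    x   : (d,) test point
    lam : trade-off between interpolation error and weight entropy
    """
    k = X.shape[0]

    def objective(w):
        interp_err = np.sum((w @ X - x) ** 2)      # ||sum_i w_i x_i - x||^2
        neg_entropy = np.sum(w * np.log(w + eps))  # -H(w), eps guards log(0)
        return interp_err + lam * neg_entropy

    res = minimize(objective, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    return res.x
```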
Kernelize Linear Interpolation
Replace the entropy term with a ridge penalty, which regularizes the variance of the weights:
$\min_{w} \Big\| \sum_i w_i x_i - x \Big\|^2 + \lambda \|w\|^2 \quad \text{s.t. } \sum_i w_i = 1,\; w_i \ge 0.$
Kernelize Linear Interpolation
Expanding the objective shows it depends on the samples only through inner products, which can be replaced with kernel values or similarities:
$\min_{w} \; w^\top S w - 2 s^\top w + \lambda \|w\|^2 \quad \text{s.t. } \sum_i w_i = 1,\; w_i \ge 0,$
where $S_{ij} = \psi(x_i, x_j)$ and $s_i = \psi(x, x_i)$.
KRI Weights Satisfy Design Goals
Kernel ridge interpolation (KRI) weights solve
$\min_{w} \; w^\top S w - 2 s^\top w + \lambda \|w\|^2 \quad \text{s.t. } \sum_i w_i = 1,\; w_i \ge 0.$
Affinity: the linear term $-2 \sum_i w_i \psi(x, x_i)$ pushes weight onto neighbors similar to the test point.
Diversity: the quadratic term $\sum_{i,j} w_i w_j \psi(x_i, x_j)$ penalizes putting weight on neighbors that are similar to each other.
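A sketch of the KRI weights as a small constrained quadratic program, again solved with scipy (the solver choice is illustrative; any QP solver works):

```python
import numpy as np
from scipy.optimize import minimize

def kri_weights(S_nn, s, lam=1.0):
    """Sketch of KRI weights: minimize w^T S w - 2 s^T w + lam ||w||^2
    over the probability simplex.

    S_nn : (k, k) similarities among the k nearest neighbors
    s    : (k,) similarities of the test point to those neighbors
    """
    k = len(s)

    def objective(w):
        return w @ S_nn @ w - 2.0 * s @ w + lam * w @ w

    res = minimize(objective, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    return res.x
```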
KRI Weights Satisfy Design Goals
Remove the constraints on the weights and the problem becomes unconstrained ridge regression, with the closed-form solution $w = (S + \lambda I)^{-1} s$. One can show this is equivalent to local ridge regression: the KRR weights.
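With the constraints removed, the weights come from a single linear solve; a minimal sketch, assuming S_nn has been symmetrized as above:

```python
import numpy as np

def krr_weights(S_nn, s, lam=1.0):
    """KRR weights: drop the simplex constraints from the KRI problem.
    The minimizer of w^T S w - 2 s^T w + lam ||w||^2 then satisfies
    (S + lam I) w = s, as in ridge regression."""
    k = len(s)
    return np.linalg.solve(S_nn + lam * np.eye(k), s)
```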
Weighted k-NN: Examples 1–3
[Figures: three example neighbor configurations, each comparing the resulting KRI weights and KRR weights side by side.]