Kernels CMPUT 466/551 Nilanjan Ray
Agenda • Kernel functions in SVM: A quick recapitulation • Kernels in regression • Kernels in k-nearest neighbor classifier • Kernel function: a deeper understanding • A case study
Kernel Functions: SVM
• The dual cost function:
  $L_D = \sum_{i=1}^N \alpha_i - \frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \alpha_i \alpha_j y_i y_j K(x_i, x_j)$
• The non-linear classifier in dual variables:
  $f(x) = \sum_{i=1}^N \alpha_i y_i K(x, x_i) + \beta_0$
• The kernel function K is symmetric and positive (semi)definite by definition
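As a concrete sketch (not from the slides): the dual solver never needs the feature map, only the Gram matrix K. A minimal Python example, assuming scikit-learn's SVC with a precomputed kernel; the RBF kernel and the toy data are illustrative choices:

```python
# Minimal sketch: a kernelized SVM via its dual form, using scikit-learn's SVC
# with a precomputed Gram matrix K(x_i, x_j). RBF kernel and data are assumed,
# illustrative choices.
import numpy as np
from sklearn.svm import SVC

def rbf_kernel(A, B, gamma=0.5):
    # K[i, j] = exp(-gamma * ||A[i] - B[j]||^2): symmetric, positive semidefinite
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0]**2 + X[:, 1]**2 - 1.0)    # non-linear decision boundary

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(rbf_kernel(X, X), y)                  # the solver sees only K

X_new = rng.normal(size=(5, 2))
print(clf.predict(rbf_kernel(X_new, X)))      # f(x) = sum_i alpha_i y_i K(x, x_i) + b
```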
Input Space to Feature Space
[Figure from Kernel Methods for Pattern Analysis by Shawe-Taylor and Cristianini]
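A small numerical check (not from the slides) that a kernel evaluation in the input space matches an explicit inner product in feature space: for the degree-2 polynomial kernel $k(x, y) = (x^T y)^2$ on $\mathbb{R}^2$, one explicit map is $h(x) = (x_1^2,\; x_2^2,\; \sqrt{2}\,x_1 x_2)$:

```python
# Illustrative check: the degree-2 polynomial kernel (x.y)^2 on R^2 equals the
# inner product <h(x), h(y)> for the explicit map h(x) = (x1^2, x2^2, sqrt(2) x1 x2).
import numpy as np

def h(x):
    return np.array([x[0]**2, x[1]**2, np.sqrt(2) * x[0] * x[1]])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

print((x @ y)**2)        # kernel in the input space: 1.0
print(h(x) @ h(y))       # same value via the feature map: 1.0
```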
Kernel Ridge Regression
• Consider the regression problem: fit the function
  $f(x) = \sum_{m=1}^M \beta_m h_m(x) = h(x)^T \beta$
  to N data points $(x_i, y_i)$. The basis functions $h_m$ are non-linear in x.
• Form the cost function:
  $J(\beta) = (y - H\beta)^T (y - H\beta) + \lambda\, \beta^T \beta$, where $H$ is the $N \times M$ matrix with rows $h(x_i)^T$
• The solution is given by:
  $\beta = (H^T H + \lambda I)^{-1} H^T y$
• Using the identity $(H^T H + \lambda I)^{-1} H^T = H^T (H H^T + \lambda I)^{-1}$ (Exercise: prove this identity), we have
  $\beta = H^T (H H^T + \lambda I)^{-1} y$
• We have defined $K = H H^T$, with entries $K_{ij} = h(x_i)^T h(x_j) = k(x_i, x_j)$. Note that $K$ is the kernel (Gram) matrix.
• Finally the solution is given by
  $f(x) = h(x)^T \beta = \sum_{i=1}^N \alpha_i k(x, x_i)$, where $\alpha = (K + \lambda I)^{-1} y$
• The basis functions h have disappeared!
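A minimal sketch of the resulting algorithm, assuming an illustrative RBF kernel and synthetic data: solve $(K + \lambda I)\alpha = y$ once, then predict with $f(x) = \sum_i \alpha_i k(x, x_i)$, never forming h explicitly:

```python
# Minimal sketch of kernel ridge regression: alpha = (K + lambda I)^{-1} y,
# f(x) = sum_i alpha_i k(x, x_i). RBF kernel and data are illustrative choices.
import numpy as np

def rbf(A, B, gamma=10.0):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(50, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=50)

lam = 1e-2
K = rbf(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # dual coefficients

X_test = np.linspace(0, 1, 5)[:, None]
f_test = rbf(X_test, X) @ alpha                         # predictions, no h(x) needed
print(f_test)
```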
Kernel k-Nearest Neighbor Classifier
• Consider the k-NN classification problem in the feature space $h(x)$. The basis functions are typically non-linear in x.
• The Euclidean distance in the feature space can be written as follows:
  $\|h(x) - h(y)\|^2 = h(x)^T h(x) - 2\,h(x)^T h(y) + h(y)^T h(y) = k(x, x) - 2\,k(x, y) + k(y, y)$
• Once again, the basis functions h have disappeared!
• Note also that a kernel function essentially provides the similarity between two points in the input space (the opposite of a distance measure!)
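A minimal sketch of a kernelized nearest-neighbor classifier built from this distance formula; the RBF kernel and toy data are illustrative assumptions:

```python
# Minimal sketch of a kernelized k-NN classifier: feature-space distances are
# computed from the kernel alone via d^2(x, z) = k(x,x) - 2 k(x,z) + k(z,z).
import numpy as np

def rbf(a, b, gamma=0.5):
    return np.exp(-gamma * np.sum((a - b)**2))

def kernel_knn_predict(x, X_train, y_train, k_fn, k=3):
    d2 = np.array([k_fn(x, x) - 2 * k_fn(x, xi) + k_fn(xi, xi) for xi in X_train])
    nearest = np.argsort(d2)[:k]                  # k smallest feature-space distances
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # majority vote

X = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
y = np.array([0, 0, 1, 1])
print(kernel_knn_predict(np.array([4.5, 5.5]), X, y, rbf, k=3))  # -> 1
```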
The Kernel Architecture
[Figure from Learning with Kernels by Schölkopf and Smola]
Inside Kernels
[Figure from Learning with Kernels by Schölkopf and Smola]
Inside Kernels…
• Given a point x in the input space, $k(\cdot, x)$ is itself a function. So x is mapped into a function space, known as a reproducing kernel Hilbert space (RKHS).
• When we measure the similarity of two points x and y in the input space, we are actually measuring the similarity between the two functions $k(\cdot, x)$ and $k(\cdot, y)$ in the RKHS.
• How is this similarity defined in the RKHS? By a (defined) inner product: $\langle k(\cdot, x), k(\cdot, y)\rangle = k(x, y)$, and more generally $\langle f, k(\cdot, x)\rangle = f(x)$ (the reproducing property).
• All the solutions we have obtained so far have the form $f(x) = \sum_i \alpha_i k(x, x_i)$, which means these solutions are functions in the RKHS.
• Functions in the RKHS are nice: they are smooth and have a finite-dimensional representation, which is good for computations and practical solutions.
• See Learning with Kernels for more; read G. Wahba's work to learn more about RKHSs vis-à-vis machine learning.
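To spell out the reproducing property numerically (an illustrative sketch, not from the slides): for an expansion $f = \sum_i \alpha_i k(\cdot, x_i)$, the RKHS inner product gives $\langle f, k(\cdot, x)\rangle = \sum_i \alpha_i k(x_i, x)$, which is exactly the point evaluation $f(x)$:

```python
# Illustrative sketch: for f = sum_i alpha_i k(., x_i), the RKHS inner product
# <f, k(., x)> = sum_i alpha_i k(x_i, x) reproduces the point evaluation f(x).
import numpy as np

def k(a, b, gamma=0.5):
    # Gaussian kernel on the real line (an assumed, illustrative choice)
    return np.exp(-gamma * (a - b)**2)

X = np.array([0.0, 1.0, 2.0])          # expansion points defining f
alpha = np.array([0.3, -1.0, 0.7])     # expansion coefficients of f

def f(x):
    # evaluate f pointwise from its kernel expansion
    return np.sum(alpha * k(X, x))

x = 1.4
inner_product = np.sum(alpha * k(X, x))  # <f, k(., x)> via the RKHS inner product
print(inner_product, f(x))               # the two values coincide: f(x) is "reproduced"
```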