Kernels
CMPUT 466/551
Nilanjan Ray
Agenda
• Kernel functions in SVM: a quick recapitulation
• Kernels in regression
• Kernels in the k-nearest neighbor classifier
• Kernel function: a deeper understanding
• A case study
Kernel Functions: SVM

The dual cost function:

$$L_D = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$

The non-linear classifier in dual variables:

$$f(x) = \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b$$

The kernel function K is symmetric and positive (semi)definite by definition.
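To make the dual-form classifier concrete, here is a minimal NumPy sketch (not from the slides) using a Gaussian RBF kernel; the training set X, labels y, multipliers alpha, and bias b are made-up placeholders, since in practice alpha and b come from solving the dual QP:

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def svm_decision(x, X, y, alpha, b, kernel=rbf_kernel):
    """Dual-form classifier: f(x) = sum_i alpha_i y_i K(x, x_i) + b."""
    return sum(a * yi * kernel(x, xi) for a, yi, xi in zip(alpha, y, X)) + b

# Toy usage; alpha and b would normally come from the dual QP solver.
X = np.array([[0.0, 0.0], [1.0, 1.0]])   # training inputs
y = np.array([-1.0, 1.0])                # class labels in {-1, +1}
alpha = np.array([0.5, 0.5])             # dual variables (made up here)
b = 0.0
print(np.sign(svm_decision(np.array([0.9, 0.9]), X, y, alpha, b)))  # +1
```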
Input Space to Feature Space

Picture taken from: Kernel Methods for Pattern Analysis by Shawe-Taylor and Cristianini
Kernel Ridge Regression

Consider the regression problem: fit the function

$$f(x) = \sum_{m=1}^{M} \beta_m h_m(x) = h(x)^T \beta$$

to N data points $(x_i, y_i)$. The basis functions $h_m$ are non-linear in $x$.

Form the ridge cost function:

$$J(\beta) = \sum_{i=1}^{N} \left( y_i - h(x_i)^T \beta \right)^2 + \lambda \beta^T \beta = (y - H\beta)^T (y - H\beta) + \lambda \beta^T \beta,$$

where $H$ is the $N \times M$ matrix whose $i$-th row is $h(x_i)^T$. The solution is given by:

$$\hat{\beta} = (H^T H + \lambda I)^{-1} H^T y$$

Using the identity

$$(H^T H + \lambda I)^{-1} H^T = H^T (H H^T + \lambda I)^{-1}$$

(Ex.: prove this identity), we have

$$\hat{\beta} = H^T (H H^T + \lambda I)^{-1} y = H^T \alpha, \qquad \alpha = (K + \lambda I)^{-1} y$$

We have defined $K = H H^T$. Note that $K$ is the kernel (Gram) matrix: $K_{ij} = h(x_i)^T h(x_j) = k(x_i, x_j)$.

Finally, the solution is given by

$$\hat{f}(x) = h(x)^T \hat{\beta} = \sum_{i=1}^{N} \alpha_i k(x, x_i)$$

The basis functions h have disappeared!
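A minimal NumPy sketch of this derivation (not part of the original slides), assuming a Gaussian RBF kernel for concreteness: fit $\alpha = (K + \lambda I)^{-1} y$, then predict $f(x) = \sum_i \alpha_i k(x, x_i)$.

```python
import numpy as np

def rbf_kernel_matrix(A, B, gamma=1.0):
    """Gram matrix K with K[i, j] = exp(-gamma * ||a_i - b_j||^2)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, lam=0.1, gamma=1.0):
    """Solve alpha = (K + lam I)^{-1} y; the basis h never appears."""
    K = rbf_kernel_matrix(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(Xnew, X, alpha, gamma=1.0):
    """f(x) = sum_i alpha_i k(x, x_i), evaluated for each row of Xnew."""
    return rbf_kernel_matrix(Xnew, X, gamma) @ alpha

# Toy usage: fit a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, (50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha = kernel_ridge_fit(X, y, lam=0.1, gamma=1.0)
print(kernel_ridge_predict(np.array([[np.pi / 2]]), X, alpha))  # close to 1.0
```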
Kernel k-Nearest Neighbor Classifier

Consider the k-nn classification problem in the feature space $h(x) = (h_1(x), \ldots, h_M(x))^T$. The basis functions are typically non-linear in $x$. The Euclidean distance in the feature space can be written as follows:

$$\| h(x) - h(y) \|^2 = h(x)^T h(x) - 2\, h(x)^T h(y) + h(y)^T h(y) = k(x, x) - 2\, k(x, y) + k(y, y)$$

Once again, the basis functions h have disappeared! Note also that a kernel function essentially provides a similarity between two points in the input space (the opposite of a distance measure!). A sketch of a classifier built on this distance follows.
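A minimal sketch (an illustration, not from the slides) of a kernelized k-nn classifier built on the feature-space distance above; the polynomial kernel choice and the helper names are assumptions:

```python
import numpy as np

def poly_kernel(x, z, d=2):
    """Polynomial kernel: k(x, z) = (1 + <x, z>)^d."""
    return (1.0 + np.dot(x, z)) ** d

def kernel_sq_distance(x, z, kernel=poly_kernel):
    """Squared feature-space distance:
    ||h(x) - h(z)||^2 = k(x, x) - 2 k(x, z) + k(z, z)."""
    return kernel(x, x) - 2.0 * kernel(x, z) + kernel(z, z)

def kernel_knn_predict(x, X, y, k=3, kernel=poly_kernel):
    """Majority vote among the k training points nearest in feature space."""
    d2 = np.array([kernel_sq_distance(x, xi, kernel) for xi in X])
    nearest = y[np.argsort(d2)[:k]]
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[np.argmax(counts)]

# Toy usage:
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0, 0, 1, 1])
print(kernel_knn_predict(np.array([0.9, 0.2]), X, y, k=3))  # predicts 1
```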
The Kernel Architecture

Picture taken from: Learning with Kernels by Schölkopf and Smola
Inside Kernels

Picture taken from: Learning with Kernels
Inside Kernels…

Given a point x in the input space, the function k(·, x) is essentially a function of the remaining argument. So x is mapped into a function space, known as a reproducing kernel Hilbert space (RKHS). When we measure the similarity of two points x and y in the input space, we are actually measuring the similarity between the two functions k(·, x) and k(·, y) in the RKHS. How is this similarity defined in the RKHS? By a (defined) inner product in the RKHS:

$$\langle k(\cdot, x), k(\cdot, y) \rangle = k(x, y)$$

This is the reproducing property; more generally, $\langle f, k(\cdot, x) \rangle = f(x)$ for any $f$ in the RKHS.

All the solutions we have obtained so far have the form

$$f(x) = \sum_{i=1}^{N} \alpha_i k(x, x_i),$$

which means these solutions are functions in the RKHS. Functions in the RKHS are nicer: they are smooth and have a finite-dimensional representation, which is good for computations and practical solutions. See "Learning with Kernels" for more; read G. Wahba's work to learn more on RKHS vis-à-vis machine learning.
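To see the reproducing property at work numerically, here is a small sketch (assuming a 1-D Gaussian kernel; the coefficients and points are illustrative): for $f = \sum_i a_i k(\cdot, x_i)$, the RKHS inner product $\langle f, k(\cdot, x) \rangle$ evaluates exactly to $f(x)$.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    """1-D Gaussian kernel k(x, z) = exp(-gamma * (x - z)^2)."""
    return np.exp(-gamma * (x - z) ** 2)

# A function in the span of the RKHS: f = sum_i a_i k(., x_i)
xs = np.array([0.0, 1.0, 2.0])
a = np.array([0.3, -0.5, 0.8])

def f(x):
    return np.sum(a * rbf(xs, x))

def rkhs_inner(a, xs, b, zs):
    """Inner product of span elements:
    <sum_i a_i k(., x_i), sum_j b_j k(., z_j)> = sum_ij a_i b_j k(x_i, z_j)."""
    return a @ rbf(xs[:, None], zs[None, :]) @ b

# Reproducing property: <f, k(., x)> = f(x)
x = 0.7
print(rkhs_inner(a, xs, np.array([1.0]), np.array([x])), f(x))  # identical
```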