1 / 27

CS 489/698 : Intro to ML

CS 489/698 : Intro to ML. Lecture 0 8 : Kernels. Announcement. Final exam: Dec 15 , 2017, Friday, 12:30 – 3:00 pm Room TBA Project proposal due tonight ICLR challenge. Outline. Feature map Kernels The Kernel Trick Advanced. XOR revisited. Quadratic classifier. Weights

marybrown
Download Presentation

CS 489/698 : Intro to ML

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS489/698: IntrotoML Lecture 08: Kernels Yao-Liang Yu

  2. Announcement • Finalexam:Dec15,2017,Friday,12:30–3:00pm • RoomTBA • Projectproposalduetonight • ICLRchallenge Yao-Liang Yu

  3. Outline • Feature map • Kernels • The Kernel Trick • Advanced Yao-Liang Yu

  4. XOR revisited Yao-Liang Yu

  5. Quadratic classifier Weights (to be learned) Yao-Liang Yu

  6. The power of lifting Rd Rd*d+d+1 Feature map Yao-Liang Yu

  7. Example Yao-Liang Yu

  8. Doesitwork? Yao-Liang Yu

  9. Curse of dimensionality? • This is still computable in O(d)! computation in this space now But, all we need is the dot product !!! Yao-Liang Yu

  10. Feature transform • NN: learn ϕ simultaneously with w • Here: choose a nonlinear ϕ so that for some f : RR save computation Yao-Liang Yu

  11. Outline • Feature map • Kernels • The Kernel Trick • Advanced Yao-Liang Yu

  12. Reverse engineering • Start with some function , s.t. exists feature transform ϕ with • As long as k is efficiently computable, don’t care the dim of ϕ (could be infinite!) • Such k is called a (reproducing) kernel. Yao-Liang Yu

  13. Examples • Polynomial kernel • Gaussian Kernel • Laplace Kernel • Matérn Kernel Yao-Liang Yu

  14. Verifying a kernel For any n, for any x1, x2, …, xn, the kernel matrix K with is symmetric and positive semidefinite( ) • Symmetric: Kij = Kji • Positive semidefinite (PSD): for all Yao-Liang Yu

  15. Kernel calculus • If k is a kernel, so is λk for any λ ≥ 0 • If k1 and k2 are kernels, so is k1+k2 • If k1 and k2 are kernels, so is k1k2 Yao-Liang Yu

  16. Outline • Feature map • Kernels • The Kernel Trick • Advanced Yao-Liang Yu

  17. Kernel SVM (dual) With α, but ϕis implicit… Yao-Liang Yu

  18. Does it work? Yao-Liang Yu

  19. Testing • Given test sample x’, how to perform testing? No explicit access to ϕ, again! dual variables kernel training set Yao-Liang Yu

  20. Tradeoff • Previously: training O(nd), test O(d) • Kernel: training O(n2d), test O(nd) • Nice to avoid explicit dependence on h (could be inf) • But if n is also large…(maybe later) Yao-Liang Yu

  21. Learning the kernel (Lanckriet et al.’04) • Nonnegative combination of t pre-selected kernels, with coefficients ζ simultaneously learned Yao-Liang Yu

  22. Logistic regression revisited Representer Theorem (Wabha, Schölkopf, Herbrich, Smola, Dinuzzo, …). The optimal w has the following form: kernelize Yao-Liang Yu

  23. Kernel Logistic Regression (primal) uncertain predictions get bigger weight Yao-Liang Yu

  24. Outline • Feature map • Kernels • The Kernel Trick • Advanced Yao-Liang Yu

  25. Universal approximation (Micchelli, Xu, Zhang’06) Universal kernel. For any compact set Z, for any continuous function f : Z  R, for any ε > 0, there exist x1, x2, …, xnin Z and α1,α2,…,αn in R such that kernel methods decision boundary Example. The Gaussian kernel. Yao-Liang Yu

  26. Kernel mean embedding (Smola, Song, Gretton, Schölkopf, …) • Characteristickernel: the above mapping is 1-1 • Completely preserve the information in the distribution P • Lots of applications feature map of some kernel Yao-Liang Yu

  27. Questions? Yao-Liang Yu

More Related