
Kernel Density Estimation



Presentation Transcript


  1. Kernel Density Estimation: Theory and Application in Discriminant Analysis. Thomas Ledl, Universität Wien

  2. Contents: • Introduction • Theory • Aspects of Application • Simulation Study • Summary

  3. Introduction

  4. 25 observations: Which distribution?

  5. [Figure: several candidate distributions for the 25 observations, each marked with a „?“]

  6. The kernel density estimator model, with K(.) and h to choose:
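
For reference, the standard univariate kernel density estimator with kernel K(.) and bandwidth h > 0 is

\[
\hat f_h(x) \;=\; \frac{1}{n h}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right),
\qquad \int K(u)\,du = 1,
\]

where X_1, ..., X_n denote the observed sample.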

  7. Kernel and bandwidth: triangular vs. Gaussian kernel; „large“ h vs. „small“ h.
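
A minimal sketch of these choices in Python (the bandwidth values 0.2 and 1.0 and the simulated sample are illustrative only):

import numpy as np

def kde(x_grid, data, h, kernel="gaussian"):
    """Univariate kernel density estimate of `data`, evaluated on `x_grid`."""
    u = (x_grid[:, None] - data[None, :]) / h            # scaled distances, shape (grid, n)
    if kernel == "gaussian":
        k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    elif kernel == "triangular":
        k = np.clip(1.0 - np.abs(u), 0.0, None)
    else:
        raise ValueError("unknown kernel")
    return k.mean(axis=1) / h                             # (1/(n h)) * sum_i K((x - X_i)/h)

rng = np.random.default_rng(0)
data = rng.normal(size=25)                                # 25 observations, as on slide 4
grid = np.linspace(-4, 4, 200)
f_wiggly = kde(grid, data, h=0.2, kernel="triangular")    # „small“ h: rough, wiggly estimate
f_smooth = kde(grid, data, h=1.0, kernel="gaussian")      # „large“ h: smooth, possibly oversmoothed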

  8. Question 1: Which choice of K(.) and h is best for a descriptive purpose?

  9. Classification: [figure]

  10. Classification: level plot, LDA (based on the assumption of a multivariate normal distribution).

  11. Classification: [figure]

  12. Classification: level plot, KDE classifier.

  13. Question 2: How does classification based on KDE perform in more than 2 dimensions?

  14. Theory

  15. Essential issues: • Optimization criteria • Improvements of the standard model • Resulting optimal choices of the model parameters K(.) and h


  17. Optimization criteria: Lp-distances.
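
In general, the Lp-distance between the estimate \hat f and the target density f is

\[
d_p(\hat f, f) \;=\; \left( \int \bigl|\hat f(x) - f(x)\bigr|^{p}\, dx \right)^{1/p},
\]

with p = 1 giving the IAE, p = 2 the ISE, and p = ∞ the maximum vertical distance discussed on the following slides.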

  18. [Figure: two curves f(.) and g(.)]


  20. ∫|f − g| = IAE („integrated absolute error“); ∫(f − g)² = ISE („integrated squared error“).


  22. Minimization of the maximum vertical distance. Other ideas: • Consideration of horizontal distances for a more intuitive fit (Marron and Tsybakov, 1995) • Comparison of the number and position of modes

  23. Overview of some minimization criteria: • L1-distance = IAE (difficult mathematical tractability) • L∞-distance = maximum difference (does not consider the overall fit) • „Modern“ criteria, which include a kind of measure of the horizontal distances (difficult mathematical tractability) • L2-distance = ISE, MISE, AMISE, ... (most commonly used)

  24. ISE, MISE, AMISE, ...: the ISE is a random variable; MISE = E(ISE), the expectation of the ISE; AMISE is a Taylor approximation of the MISE that is easier to calculate.
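
Writing R(g) = ∫ g(x)² dx and μ₂(K) = ∫ u² K(u) du, these criteria are

\[
\mathrm{ISE}(h) = \int \bigl(\hat f_h(x) - f(x)\bigr)^{2} dx,
\qquad
\mathrm{MISE}(h) = E\bigl[\mathrm{ISE}(h)\bigr],
\]
\[
\mathrm{AMISE}(h) = \frac{R(K)}{n h} + \frac{h^{4}}{4}\,\mu_2(K)^{2}\, R(f'').
\]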


  26. The AMISE-optimal bandwidth:
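
Minimizing the AMISE above over h gives

\[
h_{\mathrm{AMISE}} = \left( \frac{R(K)}{\mu_2(K)^{2}\, R(f'')\, n} \right)^{1/5},
\]

which depends on the kernel through R(K) and μ₂(K), and on the unknown density through R(f'').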

  27. The AMISE-optimal bandwidth depends on the kernel function K(.); over all kernels, the AMISE is minimized by the „Epanechnikov kernel“.

  28. The AMISE-optimal bandwidth also depends on the unknown density f(.), through R(f''). How to proceed?

  29. Data-driven bandwidth selection methods. Leave-one-out selectors: • Maximum likelihood cross-validation • Least-squares cross-validation (Bowman, 1984). Criteria based on substituting R(f'') in the AMISE formula: • „Normal rule“ („rule of thumb“; Silverman, 1986) • Plug-in methods (Sheather and Jones, 1991; Park and Marron, 1990) • Smoothed bootstrap


  31. Least-squares cross-validation (LSCV): • Undisputed selector in the 1980s • Gives an unbiased estimator of the ISE, up to a constant not depending on h • Suffers from more than one local minimizer, with no agreement about which one to use • Poor convergence rate for the resulting bandwidth h_opt
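
A minimal sketch of the LSCV selector with a Gaussian kernel; the score implements LSCV(h) = ∫ \hat f_h² − (2/n) Σ_i \hat f_{h,−i}(X_i), and the candidate bandwidth grid and simulated data are illustrative only:

import numpy as np

def lscv(h, data):
    """Least-squares cross-validation score for a Gaussian-kernel KDE."""
    n = len(data)
    d = data[:, None] - data[None, :]                    # pairwise differences X_i - X_j
    # closed form of the integral term  int fhat_h(x)^2 dx  for a Gaussian kernel
    int_f2 = np.exp(-d**2 / (4 * h**2)).sum() / (n**2 * 2 * h * np.sqrt(np.pi))
    # leave-one-out density estimates  fhat_{h,-i}(X_i)
    k = np.exp(-d**2 / (2 * h**2)) / np.sqrt(2 * np.pi)
    loo = (k.sum(axis=1) - k.diagonal()) / ((n - 1) * h)
    return int_f2 - 2 * loo.mean()

rng = np.random.default_rng(1)
data = rng.normal(size=100)
grid = np.linspace(0.05, 1.0, 60)                        # illustrative candidate bandwidths
h_lscv = grid[np.argmin([lscv(h, data) for h in grid])]  # grid minimizer; note possible multiple local minima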


  33. Normal rule („Rule of thumb“): • Assumes f(x) to be N(μ, σ²) • Easiest selector • Often oversmooths the function. The resulting bandwidth is given by:
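
Plugging the N(μ, σ²) reference density and a Gaussian kernel into the AMISE-optimal bandwidth yields

\[
h_{\mathrm{rot}} = \left(\frac{4}{3 n}\right)^{1/5} \hat\sigma \;\approx\; 1.06\,\hat\sigma\, n^{-1/5},
\]

with \hat\sigma an estimate of the standard deviation.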


  35. Plug-in methods (Sheather and Jones, 1991; Park and Marron, 1990): • Do not substitute R(f'') in the AMISE formula directly, but estimate it via R(f^(IV)), R(f^(IV)) via R(f^(VI)), etc. • Another parameter to choose (the number of stages to go back); one stage is usually sufficient • Better rates of convergence • Do not ultimately circumvent the problem of the unknown density, either
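
In generic form, a plug-in selector replaces the unknown functional R(f'') in the AMISE-optimal bandwidth by a pilot kernel estimate:

\[
\hat h_{\mathrm{PI}} = \left( \frac{R(K)}{\mu_2(K)^{2}\, R\bigl(\hat f''_{g}\bigr)\, n} \right)^{1/5},
\]

where \hat f''_g is a kernel estimate of f'' with its own pilot bandwidth g, whose choice in turn involves R(f^(IV)), and so on for the chosen number of stages.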

  36. The multivariate case: the scalar bandwidth h is replaced by H, the bandwidth matrix.
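
With a symmetric, positive definite bandwidth matrix H, the d-dimensional estimator becomes

\[
\hat f_{H}(\mathbf{x}) \;=\; \frac{1}{n\,|H|^{1/2}} \sum_{i=1}^{n} K\!\bigl(H^{-1/2}(\mathbf{x} - \mathbf{X}_i)\bigr).
\]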

  37. Issues of generalization in d dimensions: • Up to d² bandwidth parameters instead of one • Unstable estimates • Bandwidth selectors are essentially straightforward to generalize • For plug-in methods it is „too difficult“ to give succinct expressions for d > 2 dimensions

  38. Aspects of Application

  39. Essential issues: • Curse of dimensionality • Connection between goodness-of-fit and optimal classification • Two methods for discriminatory purposes


  41. The „curse of dimensionality“: the data „disappears“ into the distribution tails in high dimensions; as d grows, a good fit in the tails is desired!

  42. The „curse of dimensionality“: much data is necessary to maintain a constant estimation error in high dimensions.


  44. AMISE-optimal parameter choice vs. optimal classification (in high dimensions): • AMISE-optimal choice: L2-optimal; worse fit in the tails • Optimal classification: L1-optimal (misclassification rate); estimation of the tails is important • Calculation-intensive for large n; many observations are required for a reasonable fit


  46. Method 1: • Reduce the data onto a subspace which allows a reasonably accurate estimation but does not destroy too much information (a „trade-off“) • Use the multivariate kernel density concept to estimate the class densities
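
A minimal sketch of Method 1, assuming a principal-component projection as the subspace reduction and Gaussian-kernel class densities via scipy.stats.gaussian_kde; the number of retained components and the classification by largest prior-weighted density are illustrative choices, not prescriptions from the slides:

import numpy as np
from scipy.stats import gaussian_kde

def fit_method1(X, y, n_components=2):
    """Project onto the leading principal components, then fit one KDE per class."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    W = vt[:n_components].T                                      # projection matrix (d, n_components)
    Z = (X - mean) @ W
    kdes = {c: gaussian_kde(Z[y == c].T) for c in np.unique(y)}  # class densities in the subspace
    priors = {c: np.mean(y == c) for c in np.unique(y)}
    return mean, W, kdes, priors

def predict_method1(model, X_new):
    """Assign each observation to the class with the largest prior * estimated density."""
    mean, W, kdes, priors = model
    Z = (X_new - mean) @ W
    classes = list(kdes)
    scores = np.column_stack([priors[c] * kdes[c](Z.T) for c in classes])
    return np.array(classes)[scores.argmax(axis=1)]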

  47. Method 2: • Use the univariate concept to „normalize“ the data nonparametrically • Use classical methods like LDA and QDA for classification • Drawback: calculation-intensive

  48. Method 2: [diagram]
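
A minimal sketch of Method 2 as described above, assuming a Gaussian-kernel estimate of each marginal CDF for the nonparametric „normalization“ and scikit-learn's LinearDiscriminantAnalysis for the classical classification step; the fixed bandwidth h = 0.3 is purely illustrative:

import numpy as np
from scipy.stats import norm
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def kernel_cdf(x, sample, h):
    """Gaussian-kernel estimate of the marginal CDF of `sample`, evaluated at x."""
    return norm.cdf((x[:, None] - sample[None, :]) / h).mean(axis=1)

def normalize_marginals(X_train, X, h=0.3):
    """Map each variable to Phi^{-1}(Fhat_j(x_j)) so its marginal looks roughly normal."""
    cols = []
    for j in range(X_train.shape[1]):
        u = kernel_cdf(X[:, j], X_train[:, j], h)
        cols.append(norm.ppf(np.clip(u, 1e-6, 1 - 1e-6)))        # clip to avoid +/- infinity
    return np.column_stack(cols)

# usage sketch (X_train, y_train, X_test assumed given):
# Z_train = normalize_marginals(X_train, X_train)
# Z_test  = normalize_marginals(X_train, X_test)
# lda = LinearDiscriminantAnalysis().fit(Z_train, y_train)
# y_pred = lda.predict(Z_test)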

  49. Simulation Study

  50. Criticism of former simulation studies: • Carried out 20-30 years ago • Outdated parameter selectors • Restriction to uncorrelated normals • Fruitless estimation because of high dimensions • No dimension reduction
