Nonparametric Density Estimation

Riu Baring, CIS 8526 Machine Learning, Temple University, Fall 2007. Christopher M. Bishop, Pattern Recognition and Machine Learning, Chapter 2.5. Some slides from http://courses.cs.tamu.edu/rgutier/cpsc689_f07/.

Presentation Transcript


  1. Nonparametric Density Estimation. Riu Baring, CIS 8526 Machine Learning, Temple University, Fall 2007. Christopher M. Bishop, Pattern Recognition and Machine Learning, Chapter 2.5. Some slides from http://courses.cs.tamu.edu/rgutier/cpsc689_f07/

  2. Overview
     • Density estimation
       • Given: a finite set of observations x1,…,xN
       • Task: model the probability distribution p(x)
     • Parametric distributions
       • Governed by adaptive parameters (for example, the mean and variance of a Gaussian distribution)
       • Need a procedure to determine suitable values for the parameters
       • Discrete random variables: binomial and multinomial distributions
       • Continuous random variables: Gaussian distributions
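As a point of comparison for the nonparametric methods that follow, here is a minimal sketch of the parametric route for a 1-D Gaussian. It assumes maximum likelihood as the fitting procedure (the slide does not name one); the function names `fit_gaussian` and `gaussian_pdf` are illustrative, not from the deck.

```python
import math

def fit_gaussian(xs):
    """Maximum-likelihood mean and variance of a 1-D sample (assumed procedure)."""
    n = len(xs)
    mean = sum(xs) / n
    # The maximum-likelihood variance divides by n, not n - 1.
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

mean, var = fit_gaussian([1.0, 2.0, 3.0, 4.0, 5.0])
print(mean, var, gaussian_pdf(3.0, mean, var))
```

Once the two parameters are fitted, the data can be discarded; this is the main practical contrast with the methods on the later slides.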

  3. Nonparametric Method
     • Attempt to estimate the density directly from the data, without making any parametric assumptions about the underlying distribution

  4. Histogram
     • Divide the sample space into a number of bins
     • Approximate the density at the center of each bin by the fraction of training points that fall into that bin, divided by the bin width (the bin volume in more than one dimension)
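The histogram estimate can be sketched in a few lines for the 1-D case. This is an illustration under stated assumptions: equal-width bins on a fixed range `[lo, hi)`, and a hypothetical helper name `histogram_density`. Each bin's fraction of points is divided by the bin width so that the estimate integrates to one.

```python
def histogram_density(data, lo, hi, n_bins):
    """Return (bin_width, densities) for equal-width bins on [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        if lo <= x < hi:
            # min() guards against rounding pushing a point into bin n_bins.
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
    n = len(data)
    # Density in bin i is count_i / (N * width).
    return width, [c / (n * width) for c in counts]

width, dens = histogram_density([0.1, 0.2, 0.25, 0.6, 0.9], 0.0, 1.0, 4)
print(width, dens)
```

Changing `n_bins` changes the bin width, which is the single smoothing parameter of the method: too few bins hide structure, too many give a spiky estimate.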

  5. Histogram
     • Parameter: bin width

  6. Histogram - Drawbacks
     • The discontinuities of the estimate are not due to the underlying density; they are only an artifact of the chosen bin locations
     • These discontinuities make it very difficult (for the naïve analyst) to grasp the structure of the data
     • A much more serious problem is the curse of dimensionality: the number of bins grows exponentially with the number of dimensions
     • In high dimensions we would need a very large number of examples, or else most of the bins would be empty
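The exponential growth is easy to make concrete. The sketch below assumes the same number of bins along every dimension; the helper names are illustrative. With M bins per dimension and D dimensions there are M^D bins, and N points can occupy at most N of them.

```python
def n_bins_total(bins_per_dim, n_dims):
    """Total number of bins when every dimension is split into bins_per_dim bins."""
    return bins_per_dim ** n_dims

def max_occupied_fraction(n_points, bins_per_dim, n_dims):
    """Upper bound on the fraction of bins that can contain at least one point."""
    return min(1.0, n_points / n_bins_total(bins_per_dim, n_dims))

for d in (1, 2, 5, 10):
    print(d, n_bins_total(10, d), max_occupied_fraction(10**6, 10, d))
```

With 10 bins per dimension in 10 dimensions, even a million examples leave at least 99.99% of the bins empty.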

  7. Nonparametric DE

  8. Nonparametric DE

  9. Nonparametric DE
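The bodies of slides 7 to 9 are not preserved in this transcript. For orientation, the general argument in Bishop, Section 2.5, which the deck cites, can be sketched as follows. The symbols (a small region R containing x, its volume V, and the number K of the N data points that fall inside it) follow that text and are an assumption about what the slides showed.

```latex
% Probability mass of a small region R containing x
P = \int_{\mathcal{R}} p(\mathbf{x})\,\mathrm{d}\mathbf{x}

% The number K of the N points that fall in R is binomially distributed
\mathrm{Bin}(K \mid N, P) = \frac{N!}{K!\,(N-K)!}\,P^{K}(1-P)^{N-K},
\qquad
\mathbb{E}\!\left[\tfrac{K}{N}\right] = P,
\qquad
\operatorname{var}\!\left[\tfrac{K}{N}\right] = \frac{P(1-P)}{N}

% For large N the distribution is sharply peaked, and for small R the density is roughly constant
K \simeq N P, \qquad P \simeq p(\mathbf{x})\,V
\quad\Longrightarrow\quad
p(\mathbf{x}) = \frac{K}{N V}
```

Fixing V and counting K leads to the kernel density estimator; fixing K and finding the V that contains K points leads to the k-nearest-neighbors estimate.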

  10. Kernel Density Estimator

  11. Kernel Density Estimator
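A kernel density estimator can be sketched for the 1-D case as below. The kernel used on these slides is not preserved in the transcript, so a Gaussian kernel is assumed here; `h` is the bandwidth and the function name `gaussian_kde` is illustrative. The estimate places one Gaussian bump of standard deviation h on each data point and averages them.

```python
import math

def gaussian_kde(x, data, h):
    """Kernel density estimate at x with a Gaussian kernel of bandwidth h (1-D)."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xn) / h) ** 2) for xn in data)

data = [0.0, 1.0, 3.0]
for h in (0.1, 0.5, 2.0):
    # Small h gives a spiky estimate, large h an over-smoothed one.
    print(h, [round(gaussian_kde(x, data, h), 4) for x in (0.0, 0.5, 2.0)])
```

Every evaluation loops over the whole data set, so the cost of a single query grows linearly with the number of stored points.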

  12. k-nearest-neighbors
      • To estimate p(x):
        • Consider a small sphere centered on the point x
        • Allow the radius of the sphere to grow until it contains K data points
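The growing-sphere procedure can be sketched in one dimension, where the "sphere" is an interval of length 2r and r is the distance to the K-th nearest data point. The density is then taken as K / (N V), following Bishop, Section 2.5; the function name `knn_density` and the zero-radius check are illustrative choices.

```python
def knn_density(x, data, k):
    """K-nearest-neighbor density estimate at x for 1-D data."""
    dists = sorted(abs(x - xn) for xn in data)
    radius = dists[k - 1]       # distance to the k-th nearest point
    if radius == 0.0:
        raise ValueError("zero radius: x coincides with at least k data points")
    volume = 2.0 * radius       # length of the interval [x - r, x + r]
    return k / (len(data) * volume)

print(knn_density(1.5, [0.0, 1.0, 2.0, 4.0], 2))
```

Bishop notes that the resulting estimate is not a true density model, because its integral over all space diverges.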

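  13. k-nearest-neighbors
      • Data set comprising Nk points in class Ck, with N points in total, so that Σk Nk = N
      • Suppose the sphere has volume V and contains Kk points from class Ck
      • Class-conditional density estimate: p(x|Ck) = Kk / (Nk V)
      • Unconditional density: p(x) = K / (N V)
      • Class prior: p(Ck) = Nk / N
      • Posterior probability of class membership: p(Ck|x) = Kk / K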
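The posterior on this slide follows from Bayes' theorem applied to the three estimates it names (class-conditional density, unconditional density, class prior). Written out in the notation of Bishop, Section 2.5.2, with K the total number of points inside the sphere:

```latex
p(\mathcal{C}_k \mid \mathbf{x})
  = \frac{p(\mathbf{x} \mid \mathcal{C}_k)\,p(\mathcal{C}_k)}{p(\mathbf{x})}
  = \frac{\dfrac{K_k}{N_k V}\cdot\dfrac{N_k}{N}}{\dfrac{K}{N V}}
  = \frac{K_k}{K}
```

The volume V and the counts N and N_k cancel, so the posterior is simply the fraction of the K neighbors that belong to class C_k.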

  14. k-nearest-neighbors
      • To classify a new point x:
        • Identify the K nearest neighbors from the training data
        • Assign x to the class having the largest number of representatives among them
      • Parameter: K
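The classification rule above can be sketched as a majority vote over the K nearest training points. Euclidean distance is assumed (the slide does not specify a metric), and the function names are illustrative. `knn_posterior` returns the neighbor fractions Kk / K, and `knn_classify` picks the class with the largest one.

```python
import math
from collections import Counter

def knn_votes(x, points, labels, k):
    """Count the class labels among the k training points nearest to x."""
    order = sorted(range(len(points)), key=lambda i: math.dist(x, points[i]))
    return Counter(labels[i] for i in order[:k])

def knn_posterior(x, points, labels, k):
    """Estimated posterior p(C|x) = K_C / K for each class seen among the neighbors."""
    return {c: n / k for c, n in knn_votes(x, points, labels, k).items()}

def knn_classify(x, points, labels, k):
    """Assign x to the class with the most representatives among its k neighbors."""
    return knn_votes(x, points, labels, k).most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify((0.5, 0.5), points, labels, 3))
print(knn_posterior((0.5, 0.5), points, labels, 4))
```

Ties between classes are broken here by whichever tied class appears first among the sorted neighbors; other tie-breaking rules are equally valid.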

  15. My thoughts
      • KDE and KNN require the entire training data set to be stored
        • This leads to expensive computation when the data set is large
      • Each method still has a “parameter” to tweak
        • KDE: bandwidth h
        • KNN: K
