Non-Parametric Learning Prof. A.L. Yuille Stat 231. Fall 2004. Chp 4.1 – 4.3.
Parametric versus Non-Parametric • Previous lectures on MLE learning assumed a functional form for the probability distribution. • We now consider an alternative, non-parametric method based on window functions.
Non-Parametric • It is hard to develop probability models for some data. • Example: estimate the distribution of annual rainfall in the U.S.A. Want to model p(x,y) – probability that a raindrop hits a position (x,y). • Problems: (i) multi-modal density is difficult for parametric models, (ii) difficult/impossible to collect enough data at each point (x,y).
Intuition • Assume that the probability density is locally smooth. • Goal: estimate the class density model p(x) from data • Method 1: Windows based on points x in space.
Windows • For each point x, form a window centred at x with volume $V_n$. Count the number of samples $k_n$ (out of $n$ in total) that fall in the window. • Probability density is estimated as: $p_n(x) = \frac{k_n / n}{V_n}$
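The counting estimate on this slide can be sketched in one dimension, where the window is an interval and its "volume" is its length. This is a minimal illustration only: the function name `window_density`, the width `h`, and the sample values are assumptions, not part of the lecture.

```python
def window_density(x, samples, h):
    """Estimate the density at x as (k / n) / V, using an interval window (1-D)."""
    n = len(samples)
    # k: number of samples inside the window [x - h/2, x + h/2]
    k = sum(1 for s in samples if abs(s - x) <= h / 2)
    volume = h  # in one dimension the window volume is its length
    return k / (n * volume)


# Made-up sample values, purely for illustration.
samples = [0.1, 0.2, 0.25, 0.4, 0.9, 1.5, 1.6, 2.0]
estimate = window_density(0.25, samples, h=0.5)
print(estimate)
```

Four of the eight samples fall in the window of width 0.5, so the estimate is (4/8)/0.5.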
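Non-Parametric • Goal: to design a sequence of windows with volumes $V_1, V_2, \ldots, V_n$ so that at each point x the estimate from $n$ samples converges: $p_n(x) \to f(x)$ as $n \to \infty$ • ($f(x)$ is the true density). • Conditions for window design, where $k_n$ is the number of samples in the window: • (i) $V_n \to 0$: increasing spatial resolution. (ii) $k_n \to \infty$: many samples at each point. (iii) $k_n / n \to 0$: the window holds a vanishing fraction of the samples.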
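Two Design Methods • Parzen Window: Fix the window size as a function of the number of samples $n$, e.g. $V_n = V_1 / \sqrt{n}$. • K-NN: Fix the no. of samples in the window, e.g. $k_n = \sqrt{n}$, and grow the window until it contains them.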
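The two design rules can be compared side by side in one dimension. The sketch below assumes the square-root schedules $V_n = V_1/\sqrt{n}$ and $k_n = \sqrt{n}$; the function names and the evenly spaced test data are hypothetical.

```python
import math


def parzen_estimate(x, samples, v1=1.0):
    """Fixed-volume rule: set V_n = V_1 / sqrt(n), then count samples in the window."""
    n = len(samples)
    volume = v1 / math.sqrt(n)
    k = sum(1 for s in samples if abs(s - x) <= volume / 2)
    return k / (n * volume)


def knn_estimate(x, samples):
    """Fixed-count rule: set k_n = sqrt(n), then grow the window to hold k_n samples."""
    n = len(samples)
    k = max(1, round(math.sqrt(n)))
    distances = sorted(abs(s - x) for s in samples)
    radius = distances[k - 1]  # distance to the k-th nearest sample
    volume = 2 * radius        # interval [x - radius, x + radius]
    # Note: volume is 0 if the k nearest samples all coincide with x.
    return k / (n * volume)


# Evenly spaced points on [0, 1): a stand-in for a uniform density of height 1.
samples = [i / 100 for i in range(100)]
p_parzen = parzen_estimate(0.503, samples)
p_knn = knn_estimate(0.503, samples)
print(p_parzen, p_knn)
```

Both estimates come out close to 1 here. The Parzen rule fixes the window and lets the count vary, while the K-NN rule fixes the count and lets the window adapt to the local sample spacing.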
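Parzen Window • The Parzen window method uses a window function $\varphi(u)$. • Examples: • (i) Unit hypercube: $\varphi(u) = 1$ if $|u_j| \le 1/2$ for $j = 1, \ldots, d$, and 0 otherwise. • (ii) Gaussian in d-dimensions: $\varphi(u) = (2\pi)^{-d/2} \exp\left(-\|u\|^2 / 2\right)$.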
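Parzen Windows • No. of samples in the hypercube of side $h_n$ centred at x, with $\varphi$ the unit-hypercube window function, is $k_n = \sum_{i=1}^{n} \varphi\left(\frac{x - x_i}{h_n}\right)$ • Volume $V_n = h_n^d$ • The estimate of the distribution is: $p_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V_n} \varphi\left(\frac{x - x_i}{h_n}\right)$ • More generally, the window interpolates the data.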
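As a rough sketch of the Parzen estimate in d dimensions, the code below sums a window function over the scaled offsets to every sample and divides by n times the window volume (side length to the power d). The hypercube and Gaussian windows follow the standard definitions; all function names and data points are illustrative assumptions.

```python
import math


def hypercube_window(u):
    """Unit hypercube: 1 if every coordinate satisfies |u_j| <= 1/2, else 0."""
    return 1.0 if all(abs(uj) <= 0.5 for uj in u) else 0.0


def gaussian_window(u):
    """Standard Gaussian in d = len(u) dimensions."""
    d = len(u)
    return math.exp(-0.5 * sum(uj * uj for uj in u)) / (2 * math.pi) ** (d / 2)


def parzen_density(x, samples, h, window):
    """Parzen estimate: (1 / n) * sum_i (1 / V) * window((x - x_i) / h), with V = h**d."""
    n = len(samples)
    volume = h ** len(x)
    total = sum(
        window([(xj - sj) / h for xj, sj in zip(x, s)])
        for s in samples
    )
    return total / (n * volume)


# Made-up 2-D samples, purely for illustration.
samples = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (3.0, 3.0)]
x = (0.1, 0.1)
p_cube = parzen_density(x, samples, h=1.0, window=hypercube_window)
p_gauss = parzen_density(x, samples, h=1.0, window=gaussian_window)
print(p_cube, p_gauss)
```

With the hypercube window, two of the four samples lie inside the unit square around x, giving 2/(4 · 1) = 0.5. The Gaussian window instead gives every sample a weight that decays smoothly with distance.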
Parzen Window Example • Estimate a density with five modes using Gaussian windows at scales h = 1, 0.5, 0.2.
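The effect of the scale h can be explored numerically. The slide does not specify the five-mode density, so the sketch below assumes an equal-weight mixture of five narrow Gaussians; the mixture, the sample size, the grid, and the "roughness" measure (total variation along the grid) are all illustrative choices.

```python
import math
import random


def gaussian_parzen(x, samples, h):
    """1-D Parzen estimate with a Gaussian window of scale h."""
    norm = len(samples) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


# Assumed five-mode density: equal-weight Gaussians (std 0.3) at these centres.
rng = random.Random(0)
modes = [-4.0, -2.0, 0.0, 2.0, 4.0]
samples = [rng.gauss(rng.choice(modes), 0.3) for _ in range(500)]

step = 0.05
grid = [-8.0 + step * i for i in range(321)]  # covers [-8, 8]

mass = {}
roughness = {}
for h in (1.0, 0.5, 0.2):
    values = [gaussian_parzen(x, samples, h) for x in grid]
    mass[h] = step * sum(values)  # Riemann sum, should be close to 1
    roughness[h] = sum(abs(b - a) for a, b in zip(values, values[1:]))
    print(f"h={h}: mass={mass[h]:.3f}, roughness={roughness[h]:.3f}")
```

Each estimate integrates to about 1 regardless of h. The large scale smooths the five modes together, while the small scale resolves them and also follows more of the sampling noise.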
Convergence Proof. • We will show that the Parzen window estimator converges to the true density at each point x with increasing number of samples.
Proof Strategy. • The Parzen estimate is a random variable, because it depends on the samples used to compute it. • We therefore take the expectation of the estimate with respect to the samples. • We show that the expected value of the Parzen estimate tends to the true density, and that its variance tends to 0 as the no. of samples gets large.
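Convergence of the Mean • With samples $x_1, \ldots, x_n$ drawn independently from the true density $f$, window function $\varphi$ (integrating to 1), width $h_n$ and volume $V_n = h_n^d$: $E[p_n(x)] = \frac{1}{n} \sum_{i=1}^{n} E\left[\frac{1}{V_n} \varphi\left(\frac{x - x_i}{h_n}\right)\right] = \int \frac{1}{V_n} \varphi\left(\frac{x - v}{h_n}\right) f(v)\, dv$ • As $V_n \to 0$, the scaled window $\frac{1}{V_n} \varphi\left(\frac{x - v}{h_n}\right)$ tends to a delta function $\delta(x - v)$, so $E[p_n(x)] \to f(x)$ wherever $f$ is continuous. • Result follows.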
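Convergence of Variance • Variance: the Parzen estimate $p_n(x)$ is an average of $n$ independent, identically distributed terms, so $\mathrm{Var}[p_n(x)] = \frac{1}{n} \mathrm{Var}\left[\frac{1}{V_n} \varphi\left(\frac{x - x_1}{h_n}\right)\right] \le \frac{\sup(\varphi)\, E[p_n(x)]}{n V_n}$ • The variance therefore tends to 0 provided $n V_n \to \infty$, e.g. with $V_n = V_1 / \sqrt{n}$.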
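The convergence claims can be checked empirically with a small Monte Carlo experiment: draw many independent sample sets, compute the Parzen estimate at one point for each, and look at the mean and variance across sets. The sketch assumes a standard normal true density, a Gaussian window, and the schedule $h_n = h_1/\sqrt{n}$; these choices and all names are illustrative.

```python
import math
import random
import statistics


def gaussian_parzen(x, samples, h):
    """1-D Parzen estimate with a Gaussian window of scale h."""
    norm = len(samples) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


def simulate(n, trials, rng, h1=1.0):
    """Mean and variance of the estimate at x = 0 over repeated sample sets of size n."""
    h = h1 / math.sqrt(n)  # window shrinks as n grows, while n * h still grows
    estimates = []
    for _ in range(trials):
        samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
        estimates.append(gaussian_parzen(0.0, samples, h))
    return statistics.mean(estimates), statistics.variance(estimates)


rng = random.Random(1)
true_value = 1 / math.sqrt(2 * math.pi)  # standard normal density at 0
results = {n: simulate(n, 200, rng) for n in (10, 100, 1000)}
for n, (mean, var) in results.items():
    print(f"n={n}: mean={mean:.4f} (true {true_value:.4f}), variance={var:.5f}")
```

As n increases, the mean of the estimate approaches the true density value and the spread across sample sets shrinks, which is the behaviour the mean and variance arguments predict.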
Example of Parzen Window • Underlying density is Gaussian. Window volume decreases as the number of samples increases.
Example of Parzen Window • Underlying Density is bi-modal.
Parzen Window and Interpolation. • In practice, we do not have an infinite number of samples. • The choice of window shape is important. This effectively interpolates the data. • If the window shape fits the local structure of the density, then Parzen windows are effective.