A gentle introduction to Gaussian distribution
Review
• Random variable
• Coin flip experiment
• X: random variable, taking the value X = 0 or X = 1
Review
• Probability mass function (discrete): P(x) >= 0
• Any other constraints? Hint: What is the sum?
• Example: Coin flip experiment
[Figure: P(x) plotted against x, at x = 0 and x = 1]
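Review
• Probability density function (continuous): f(x) >= 0
• Unlike the discrete case, the density function does not represent a probability. It is the rate of change of the cumulative probability, i.e. probability per unit of x, and its value at a point is called the “likelihood”.
• Examples?
[Figure: f(x) plotted against x]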
Review
• Probability density function (continuous): f(x) >= 0 and integrates to 1.0
• For a small interval dx: P( x0 < x < x0+dx ) ≈ f(x0).dx
• But P( x = x0 ) = 0
[Figure: f(x) plotted against x, with the interval from x0 to x0+dx marked]
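As one possible answer to "Examples?", the sketch below uses a uniform density on [0, 0.5]. The choice of distribution, the helper names `uniform_pdf` and `integrate`, and the numbers are all assumptions made for illustration, not part of the slides. It shows a density value larger than 1, a total area close to 1, and an interval probability close to f(x0).dx.

```python
def uniform_pdf(x, a=0.0, b=0.5):
    """Density of a uniform distribution on [a, b] (illustrative choice)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


x0 = 0.2
dx = 1e-3

# A density is not a probability: here it equals 2 everywhere on [0, 0.5].
density_at_x0 = uniform_pdf(x0)

# The density integrates to (approximately) 1 over the whole real line.
total_area = integrate(uniform_pdf, -1.0, 1.0)

# Probability of landing in a small interval, compared with f(x0) * dx.
interval_prob = integrate(uniform_pdf, x0, x0 + dx)

print(density_at_x0, total_area, interval_prob, density_at_x0 * dx)
```

Shrinking `dx` towards zero shrinks the interval probability towards zero as well, which is the sense in which P( x = x0 ) = 0.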
The Gaussian Distribution Courtesy: http://research.microsoft.com/~cmbishop/PRML/index.htm
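The formula on this slide did not survive extraction. For reference, the standard univariate Gaussian density with mean μ and variance σ² is given below. The notation is assumed to follow the cited source and may differ from what the slide showed.

```latex
\mathcal{N}(x \mid \mu, \sigma^2)
  = \frac{1}{(2\pi\sigma^2)^{1/2}}
    \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\},
\qquad
\mathcal{N}(x \mid \mu, \sigma^2) > 0,
\qquad
\int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^2)\, dx = 1
```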
Central Limit Theorem
• The distribution of the sum of N i.i.d. random variables (with finite variance) becomes increasingly Gaussian as N grows.
• Example: N uniform [0,1] random variables.
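The uniform example can be tried numerically. The sketch below is a rough illustration under assumed settings (sample count, seed, and the "fraction within one standard deviation" check are choices made here, not taken from the slides). For a Gaussian that fraction is about 0.683, so the printed values should move towards it as N grows.

```python
import random
import statistics


def sample_sums(n_terms, n_samples, rng):
    """Draw n_samples sums, each of n_terms independent uniform[0, 1] variables."""
    return [sum(rng.random() for _ in range(n_terms)) for _ in range(n_samples)]


def fraction_within_one_sd(values):
    """Fraction of values lying within one standard deviation of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= sd for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
fractions = {
    n: fraction_within_one_sd(sample_sums(n, 20_000, rng))
    for n in (1, 2, 10)
}

for n, frac in fractions.items():
    print(f"N = {n:2d}: fraction within one standard deviation = {frac:.3f}")
```

A histogram of the sums would show the same trend visually: flat for N = 1, triangular for N = 2, and bell-shaped by N = 10.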
Central Limit Theorem (Coin flip)
• Flip coin N times
• Each outcome has an associated random variable Xi (= 1 if heads, otherwise 0)
• Number of heads NH is a random variable
• NH is the sum of N i.i.d. random variables: NH = X1 + X2 + … + XN
Central Limit Theorem (Coin flip)
• Probability mass function of NH
• P(Head) = 0.5 (fair coin)
[Figure: three panels showing the probability mass function of NH for N = 5, N = 10, and N = 40]
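The three panels can be recomputed directly, since the number of heads in N independent flips follows a binomial distribution. The sketch below is a minimal illustration; the function name `heads_pmf` is hypothetical.

```python
from math import comb


def heads_pmf(n_flips, p_head=0.5):
    """Probability of each possible number of heads (0..n_flips) in n_flips flips."""
    return [
        comb(n_flips, k) * p_head**k * (1 - p_head) ** (n_flips - k)
        for k in range(n_flips + 1)
    ]


for n in (5, 10, 40):
    pmf = heads_pmf(n)
    mode = max(range(n + 1), key=pmf.__getitem__)  # first most likely count
    print(f"N = {n:2d}: sum = {sum(pmf):.6f}, most likely NH = {mode}, P = {pmf[mode]:.4f}")
```

Plotting `heads_pmf(40)` as a bar chart gives a shape that is already close to a Gaussian bell curve centred on N/2.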
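Moments of the Multivariate Gaussian (1)
• With the change of variable z = x − μ, the term that is odd in z integrates to zero, thanks to anti-symmetry of z, leaving E[x] = μ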
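The equation this remark refers to was lost in extraction. A plausible reconstruction, assuming the slide follows the standard derivation of the mean of a D-dimensional Gaussian with the substitution z = x − μ, is:

```latex
\mathbb{E}[x]
  = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma|^{1/2}}
    \int \exp\left\{ -\frac{1}{2}(x - \mu)^{\mathrm{T}} \Sigma^{-1} (x - \mu) \right\} x \, dx
  = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma|^{1/2}}
    \int \exp\left\{ -\frac{1}{2} z^{\mathrm{T}} \Sigma^{-1} z \right\} (z + \mu) \, dz
  = \mu
```

The exponent is even in z, so the part of the integrand that multiplies z is odd and integrates to zero. The remaining part is μ times a normalized density, which integrates to μ.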
Maximum likelihood
• Fit a probability density model p(x | θ) to the data
• Estimate θ
• Given independent identically distributed (i.i.d.) data X = (x1, x2, …, xN)
• Likelihood: $p(X \mid \theta) = \prod_{n=1}^{N} p(x_n \mid \theta)$
• Log likelihood: $\ln p(X \mid \theta) = \sum_{n=1}^{N} \ln p(x_n \mid \theta)$
• Maximum likelihood: Maximize ln p(X | θ) w.r.t. θ
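Maximum Likelihood for the Gaussian (1)
• Given i.i.d. data $X = (x_1, \ldots, x_N)$, the log likelihood function is given by
$\ln p(X \mid \mu, \Sigma) = -\frac{ND}{2}\ln(2\pi) - \frac{N}{2}\ln|\Sigma| - \frac{1}{2}\sum_{n=1}^{N}(x_n - \mu)^{\mathrm{T}}\Sigma^{-1}(x_n - \mu)$
• Sufficient statistics: $\sum_{n=1}^{N} x_n$ and $\sum_{n=1}^{N} x_n x_n^{\mathrm{T}}$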
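Maximum Likelihood for the Gaussian (2)
• Set the derivative of the log likelihood function to zero, $\frac{\partial}{\partial \mu}\ln p(X \mid \mu, \Sigma) = \sum_{n=1}^{N}\Sigma^{-1}(x_n - \mu) = 0$, and solve to obtain $\mu_{\mathrm{ML}} = \frac{1}{N}\sum_{n=1}^{N} x_n$
• Similarly, $\Sigma_{\mathrm{ML}} = \frac{1}{N}\sum_{n=1}^{N}(x_n - \mu_{\mathrm{ML}})(x_n - \mu_{\mathrm{ML}})^{\mathrm{T}}$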
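To make the result concrete, here is a small sketch for the one-dimensional case. That is a simplifying assumption, since the slide covers the multivariate Gaussian, and the data values are made up. It computes the sample mean and the variance with a 1/N factor, then checks that perturbing either parameter does not raise the log likelihood.

```python
import math


def fit_gaussian_ml(data):
    """Maximum likelihood mean and variance of a 1-D Gaussian."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # divides by N, not N - 1
    return mu, var


def log_likelihood(data, mu, var):
    """Log likelihood of i.i.d. data under a 1-D Gaussian N(mu, var)."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - sum((x - mu) ** 2 for x in data) / (2 * var))


data = [2.1, 1.9, 3.2, 2.8, 2.5]  # made-up observations
mu_ml, var_ml = fit_gaussian_ml(data)
best = log_likelihood(data, mu_ml, var_ml)

print(f"mu_ML = {mu_ml:.3f}, var_ML = {var_ml:.3f}, log likelihood = {best:.3f}")
print("shifted mean:   ", log_likelihood(data, mu_ml + 0.1, var_ml))
print("larger variance:", log_likelihood(data, mu_ml, var_ml * 1.2))
```

Note that only the sum of the data and the sum of squares are needed to compute both estimates, which is what makes them sufficient statistics.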
Mixtures of Gaussians (1)
• Old Faithful data set
[Figure: two panels, the data fitted with a single Gaussian and with a mixture of two Gaussians]
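Mixtures of Gaussians (2)
• Combine simple models into a complex model: $p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$
• Component: $\mathcal{N}(x \mid \mu_k, \Sigma_k)$
• Mixing coefficient: $\pi_k$
[Figure: example with K = 3]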
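A weighted sum of Gaussian components is itself a valid density as long as the mixing coefficients are non-negative and sum to 1. The sketch below checks this numerically for K = 3 in one dimension. The parameter values are invented for illustration, and the univariate setting is an assumption made to keep the code short.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Mixture density: sum over components of weight * component density."""
    return sum(
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )


# Made-up parameters for K = 3 components.
weights = [0.5, 0.3, 0.2]      # mixing coefficients, sum to 1
means = [-2.0, 0.0, 3.0]
variances = [0.5, 1.0, 2.0]

# Riemann-sum check that the mixture integrates to about 1 over [-15, 15].
step = 0.01
area = sum(
    mixture_pdf(-15.0 + i * step, weights, means, variances)
    for i in range(3001)
) * step

print(f"sum of mixing coefficients = {sum(weights):.3f}, area under mixture = {area:.6f}")
```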
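Mixtures of Gaussians (4)
• Determining parameters μ, Σ, and π using maximum log likelihood: $\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln\left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}$
• Log of a sum; no closed form maximum.
• Solution: use standard, iterative, numeric optimization methods or the expectation maximization algorithm (Chapter 9).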
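The slide only names the expectation maximization algorithm. As a rough sketch of what one such iterative scheme can look like, the code below runs the textbook EM updates for a one-dimensional mixture of two Gaussians. The univariate setting, the data, the starting values, and the function names are all assumptions made for illustration, and no safeguards against collapsing variances are included.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Sum over data points of the log of the mixture density (log of a sum)."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances):
    """One EM update for a 1-D Gaussian mixture; returns new parameters."""
    k_range = range(len(weights))
    # E step: responsibility of each component for each data point.
    resp = []
    for x in data:
        joint = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in k_range]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M step: re-estimate parameters from responsibility-weighted data.
    n_k = [sum(r[k] for r in resp) for k in k_range]
    new_means = [sum(r[k] * x for r, x in zip(resp, data)) / n_k[k] for k in k_range]
    new_vars = [
        sum(r[k] * (x - new_means[k]) ** 2 for r, x in zip(resp, data)) / n_k[k]
        for k in k_range
    ]
    new_weights = [n / len(data) for n in n_k]
    return new_weights, new_means, new_vars


data = [1.0, 1.2, 0.8, 1.1, 4.9, 5.2, 5.0, 4.8]  # made-up, two clusters
params = ([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])    # arbitrary starting guess
history = [log_likelihood(data, *params)]
for _ in range(20):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))

print("means:", params[1])
print("log likelihood: start", history[0], "end", history[-1])
```

Each update should leave the log likelihood unchanged or higher, which is why tracking `history` is a common sanity check when experimenting with this kind of procedure.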