Monte Carlo simulation Gil McVean, Department of Statistics Thursday February 12th 2009
Simulating random variables • Often we wish to simulate random variables from a given distribution • Explore properties of the distribution • Assess properties of estimators • Check model fit • Suppose we have a distribution defined by the cumulative distribution function F(x) • A natural method of simulation is inversion of the CDF • Simulate a pseudorandom number X ~ U(0,1) • Invert the CDF: Y = F⁻¹(X) • [Figure: inversion of the exponential CDF]
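A minimal sketch of CDF inversion in Python (numpy assumed; the function name is just illustrative), for the exponential case shown in the figure: F(y) = 1 − exp(−λy), so F⁻¹(x) = −log(1 − x)/λ.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_exponential(rate, n):
    """Inversion sampling: F(y) = 1 - exp(-rate*y), so F^-1(x) = -log(1-x)/rate."""
    x = rng.uniform(0.0, 1.0, size=n)   # X ~ U(0,1)
    return -np.log(1.0 - x) / rate      # Y = F^-1(X) ~ Exponential(rate)

samples = sample_exponential(rate=0.5, n=10_000)
print(samples.mean())                    # should be close to 1/rate = 2
```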
Simulating discrete random variables • For discrete distributions, the CDF is a step function • To invert, find the minimum value of Y for which F(Y) ≥ X • Often it is best to approximate a discrete distribution by a continuous one (e.g. Poisson by normal) as a first step • [Figure: CDF of Poisson(3); for X = 0.27, F⁻¹(X) = 2]
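A sketch of the discrete version for the Poisson(3) example (numpy and scipy assumed; the cut-off k_max is an arbitrary illustration choice): tabulate the step CDF and search for the first step that reaches X.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(2)

def sample_poisson_by_inversion(mu, n, k_max=100):
    """Find the smallest k with F(k) >= X, where F is the Poisson(mu) CDF."""
    cdf = poisson.cdf(np.arange(k_max + 1), mu)   # step CDF evaluated at k = 0..k_max
    x = rng.uniform(0.0, 1.0, size=n)
    return np.searchsorted(cdf, x)                # first index k with F(k) >= X

draws = sample_poisson_by_inversion(mu=3.0, n=10_000)
print(draws.mean())                               # should be close to 3
```

For the value in the slide, X = 0.27 falls between F(1) ≈ 0.199 and F(2) ≈ 0.423 for Poisson(3), so the search returns 2.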
What if you can’t invert the CDF? • For several distributions, such as the normal distribution, the CDF cannot be inverted analytically • In some cases there are clever tricks that get round the problem • Polar transformation of two iid normal random variables (X, Y): under this transformation the angle, θ, and the radius, R, are independent random variables. θ is uniform on [0, 2π], while R² has an exponential density with parameter ½. To simulate two iid normal random variables, simulate θ and R², then apply the transformation X = R cos θ, Y = R sin θ • [Figure: polar coordinates (R, θ) of the point (X, Y)]
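A sketch of this polar (Box–Muller) construction in numpy; note that an exponential with parameter ½ (rate ½) has mean 2, which is the `scale` argument below.

```python
import numpy as np

rng = np.random.default_rng(3)

def box_muller(n):
    """Simulate pairs of iid N(0,1) variables via the polar transformation."""
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)   # angle ~ U[0, 2*pi)
    r2 = rng.exponential(scale=2.0, size=n)         # R^2 ~ Exponential(rate 1/2), mean 2
    r = np.sqrt(r2)
    return r * np.cos(theta), r * np.sin(theta)     # X = R cos(theta), Y = R sin(theta)

x, y = box_muller(100_000)
print(x.mean(), x.std())                            # should be close to 0 and 1
```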
Rejection sampling • For many pdfs of interest we can neither analytically invert the CDF nor find a clever trick! • However, we may be able to find a function, M(x), that envelopes the target density, f(x), over its support, A • If we can sample from the pdf, m(x), obtained by normalising the envelope function over the support of f(x), we can also sample from f(x) by rejection • [Figure: envelope M(x) lying above the target density f(x)]
The rejection step • Generate a random variable, T, from the distribution m(x) • Calculate the ratio f(T)/M(T) • Accept T if a U[0,1] random variable is less than or equal to f(T)/M(T) • The set of accepted samples follows f(x) • The efficiency of the method is measured by the acceptance rate: the (constant) ratio m(x)/M(x), i.e. the reciprocal of the area under the envelope M(x)
Sampling from a Gamma distribution • Suppose we wish to sample from a Gamma(2,2) distribution • We can envelope its density with the function M(x) = 2 exp(-x) • The acceptance rate of the algorithm is 1/2
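A sketch of the rejection sampler for this example, assuming the shape–rate parameterisation (so f(x) = 4x exp(-2x)); proposals come from the normalised envelope m(x) = exp(-x), i.e. Exponential(1). Function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def gamma22_pdf(x):
    """Gamma(shape=2, rate=2) density: f(x) = 4*x*exp(-2x) for x >= 0."""
    return 4.0 * x * np.exp(-2.0 * x)

def sample_gamma22_rejection(n):
    """Rejection sampling with envelope M(x) = 2*exp(-x); proposals T ~ Exponential(1)."""
    out = []
    while len(out) < n:
        t = rng.exponential(scale=1.0)                      # T ~ m(x) = exp(-x)
        accept_prob = gamma22_pdf(t) / (2.0 * np.exp(-t))   # f(T)/M(T), always <= 1 here
        if rng.uniform() <= accept_prob:                    # accept with probability f(T)/M(T)
            out.append(t)
    return np.array(out)

samples = sample_gamma22_rejection(10_000)
print(samples.mean())   # should be close to the Gamma(2,2) mean of 1
```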
Importance sampling • In rejection sampling much effort is wasted; importance sampling allows us to use all samples • Simulate a series of random variables (X1, X2, …, Xn) from a proposal distribution, g(x). For each, calculate an importance weight, f(Xi)/g(Xi). Resample from the simulated variables according to their importance weights. • For example, simulate Exponential(mean = 1) random variables. The resampling importance weights for the Gamma(2,2) target are highly skewed
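A sketch of sampling–importance–resampling for the same Gamma(2,2) target with an Exponential(1) proposal; the weights f(x)/g(x) = 4x exp(-x) below again assume the shape–rate parameterisation.

```python
import numpy as np

rng = np.random.default_rng(5)

n = 100_000
x = rng.exponential(scale=1.0, size=n)        # draws from the proposal g(x) = exp(-x)
weights = 4.0 * x * np.exp(-x)                # importance weights f(x)/g(x) = 4x exp(-x)
probs = weights / weights.sum()               # normalise the weights

# Sampling-importance-resampling: resample according to the normalised weights
resampled = rng.choice(x, size=10_000, replace=True, p=probs)
print(resampled.mean())                        # should be close to the Gamma(2,2) mean of 1
```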
Other uses of importance sampling • Importance sampling is more generally used to evaluate expectations of a function φ under f by Monte Carlo integration: simulate Xi from g(x) and estimate E_f[φ(X)] by the average of φ(Xi) f(Xi)/g(Xi) • The efficacy of the importance sampling approach depends on the match between the proposal and target distributions • A poor match results in highly variable importance weights, such that a few observations dominate the estimate • The optimal choice of proposal distribution, g*(x), is the one for which all samples have the same weight, i.e. g*(x) ∝ |φ(x)| f(x)
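A sketch of importance-sampling Monte Carlo integration under the same assumptions (Gamma(2,2) target, Exponential(1) proposal); φ is supplied by the caller and the helper name is illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def is_estimate(phi, n=200_000):
    """Estimate E_f[phi(X)] for f = Gamma(2,2) using an Exponential(1) proposal."""
    x = rng.exponential(scale=1.0, size=n)    # Xi ~ g(x) = exp(-x)
    w = 4.0 * x * np.exp(-x)                  # f(x)/g(x) for f(x) = 4x exp(-2x)
    return np.mean(phi(x) * w)                # average of phi(Xi) * f(Xi)/g(Xi)

print(is_estimate(lambda x: x))               # E[X]   = 1   for Gamma(2,2)
print(is_estimate(lambda x: x ** 2))          # E[X^2] = 1.5 for Gamma(2,2)
```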
Metropolised independence sampling • A very powerful technique in modern statistics is Markov chain Monte Carlo • A simple application of the idea can be used to simulate from a required distribution • Start with an initial value, X0 • Draw a random variable, Z, from a proposal distribution, g(x), whose support includes that of the target distribution, f(x) • Set X1 = Z with probability min( 1, [f(Z) g(X0)] / [f(X0) g(Z)] ) • Otherwise set X1 = X0 • Repeat • After convergence, the resulting Xi are draws from the required distribution
In action • Suppose we wish to sample from the Gamma(2,2) distribution, using an Exponential(1) proposal distribution • It’s a good idea to use a proposal distribution that has fatter tails than the target distribution • [Figure: trace of Xt across iterations alongside the target density f(x)]
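A sketch of the metropolised independence sampler for this example (Gamma(2,2) target, Exponential(1) proposal), using the acceptance probability given above; function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def gamma22_pdf(x):
    return 4.0 * x * np.exp(-2.0 * x)   # target f: Gamma(2,2), shape-rate parameterisation

def exp1_pdf(x):
    return np.exp(-x)                   # proposal g: Exponential(1)

def metropolised_independence_sampler(n_iter, x0=1.0):
    x = x0
    chain = np.empty(n_iter)
    for i in range(n_iter):
        z = rng.exponential(scale=1.0)                        # Z ~ g
        accept = min(1.0, (gamma22_pdf(z) * exp1_pdf(x)) /
                          (gamma22_pdf(x) * exp1_pdf(z)))     # min(1, f(Z)g(X)/[f(X)g(Z)])
        if rng.uniform() <= accept:
            x = z                                             # accept: move to Z
        chain[i] = x                                          # otherwise stay at current value
    return chain

chain = metropolised_independence_sampler(50_000)
print(chain.mean())   # should be close to the Gamma(2,2) mean of 1
```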