The Monte Carlo method
What is Monte Carlo? • A method to evaluate integrals • Popular in all areas of science, economics, sociology, etc. • Involves: • Random sampling of domain • Evaluation of function • Acceptance/rejection of points
Why Monte Carlo? • The domain of integration is very complex: • Find the integral of a complicated solvent-solute energy function outside the boundary of a protein. • Find the volume of an object from a tomographic data set. • The integrals are very high-dimensional: • A partition function or free energy for a protein • Analyzing metabolic networks • Predicting trends in the stock market
Basic Monte Carlo • Numerical integral: divide the region [0,1] evenly into M slices, giving ∫ f(x) dx ≈ (1/M) Σ f(xi) • Equivalent to evaluating the same sum at M points sampled as [0,1] uniform random numbers • Error typically decays as 1/√M
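A minimal Python sketch of this basic estimator (the integrand f(x) = x² and the sample count are illustrative choices, not from the slides):

    # Basic Monte Carlo integration of f over [0, 1].
    import random

    def mc_integrate(f, n=100_000):
        """Estimate the integral of f over [0, 1] from n uniform samples."""
        total = total_sq = 0.0
        for _ in range(n):
            fx = f(random.random())      # sample x uniformly in [0, 1]
            total += fx
            total_sq += fx * fx
        mean = total / n
        var = total_sq / n - mean * mean
        return mean, (var / n) ** 0.5    # estimate and its ~1/sqrt(n) error

    estimate, err = mc_integrate(lambda x: x * x)
    print(estimate, "+/-", err)          # exact value is 1/3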
Error analysis • Standard deviation of the integrand: σ² = ⟨f²⟩ − ⟨f⟩² • Error from the Monte Carlo estimate: ~ σ/√N, independent of dimension • Error from the trapezoid rule: ~ N^(−2/d) in d dimensions • High-dimensional problems (statistical physics, the many-body problem): for d > 10, the Monte Carlo method is the only workable quadrature
Basic Monte Carlo Algorithm • Suppose we want to approximate an integral ∫ f(x) dx over a high-dimensional space • For i = 1 to n • Pick a point xi at random • Accept or reject the point based on a criterion • If accepted, add f(xi) to the total sum • Error estimates are “free” by also accumulating sums of squares • Error typically decays as 1/√n
Example: the area of a circle • Sample points randomly from square surrounding circle of radius 5 • 10,000 sample points • Acceptance criterion: inside circle • Actual area: 78.54 • Calculated area: 78.66
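A Python sketch of this example (10,000 samples and radius 5 as in the slide; the estimate varies from run to run):

    # Estimate the area of a circle of radius 5 by accept/reject sampling
    # of the enclosing 10 x 10 square.
    import random

    R = 5.0
    n = 10_000
    accepted = 0
    for _ in range(n):
        x = random.uniform(-R, R)
        y = random.uniform(-R, R)
        if x * x + y * y <= R * R:       # acceptance criterion: inside circle
            accepted += 1
    area = (2 * R) ** 2 * accepted / n   # accepted fraction times square area
    print(area)                          # ~78.5 (actual: pi * R**2 = 78.54)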
Example: multiple dimensions • What is the average of a variable over an N-dimensional probability distribution? • Two approaches: • Quadrature • Discretize each dimension into a set of n points (n^N points in total) • Possibly use adaptivity to guide discretization • For a reasonably smooth function and m total points, error decreases as ~ m^(−2/N) (trapezoid rule) • Monte Carlo • Sample m points from the space • Possibly weight sampling based on a reference function • Error decreases as m^(−1/2), independent of N
Problems: sampling tails of distributions • We want to • Integrate a sharply-peaked function • Use Monte Carlo with uniformly-distributed random numbers • What happens? • Very few points contribute to the integral (~9%) • Poor computational efficiency/convergence • Can we ignore the tails? NO! • Solution: use a different distribution
Improved sampling: change of variables • One way to improve sampling is to change variables: • New distribution is flatter • Uniform variates more useful • Advantages: • Simplicity • Very useful for generating distributions of non-uniform variates (coming up) • Disadvantages • Most useful for invertible functions
Change of variables: method • Given an integral I = ∫ f(x) dx • Transform variables: u = g(x), du = g′(x) dx • Choose g to make the transformed integrand (nearly) flat • Integrate: I = ∫ f(x(u)) / g′(x(u)) du
Change of variables application: exponential • Given the integral I = ∫ e^(−x) dx over [0, ∞) • Transform variables: u = 1 − e^(−x), du = e^(−x) dx • The transformed integrand is exactly flat • Integrate: I = ∫ du over [0, 1] = 1
Change of variables example: exponential • Before transformation: • 0.5% of points in domain contribute to integral • Slow convergence • After transformation: • All of the points in domain contribute • Rapid (exact) convergence • In practice: • Speed-up best when inversion is nearly exact
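A Python sketch of the before/after comparison, assuming the integral is ∫ e^(−x) dx over [0, ∞) with the infinite domain truncated at an illustrative cutoff of 20:

    # Change of variables for the exponential integrand.
    import math, random

    n = 10_000
    cutoff = 20.0

    # Before: uniform samples on [0, cutoff]; almost all land in the tail
    # where exp(-x) is negligible, so convergence is slow.
    naive = sum(math.exp(-random.uniform(0, cutoff)) for _ in range(n)) * cutoff / n

    # After: substitute u = 1 - exp(-x), du = exp(-x) dx; the integrand
    # becomes exactly 1 on [0, 1 - exp(-cutoff)], so the result is exact.
    exact = 1.0 - math.exp(-cutoff)

    # The same inversion turns uniform variates into exponential variates:
    u = random.random()
    x = -math.log(1.0 - u)               # x is exponentially distributed

    print(naive, exact)                  # both ~1; only the second is exact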
Importance sampling • Functions aren’t always invertible • Is there another way to improve sampling of “important” regions of the function? • Find flat distributions • Bias sampling • Find a distribution g(x) that is almost proportional to the one of interest: f(x) ≈ c·g(x) • Rewrite your integral as a “weighted” integral: ∫ f(x) dx = ∫ [f(x)/g(x)] g(x) dx, i.e., the average of f/g over samples from g
Importance sampling example: a lumpy Gaussian • Our original integrand f(x) is a Gaussian with extra “lumps” • This is close to a pure Gaussian g(x) • Therefore: • Sample random numbers from g (we’ll talk about how to do this later) • Evaluate the weighted integrand f/g over the random number distribution
Importance sampling example: a lumpy Gaussian • Convergence is pretty good (actual value 1.41377…)
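A Python sketch of the technique with a hypothetical “lumpy” integrand standing in for the one on the slide (the original function and its value 1.41377… are not reproduced here):

    # Importance sampling: integrate f by sampling from a nearby Gaussian g
    # and averaging the weights f/g.
    import math, random

    def f(x):   # hypothetical lumpy Gaussian
        return math.exp(-x * x / 2) * (1 + 0.2 * math.sin(3 * x) ** 2)

    def g(x):   # standard normal density, the reference distribution
        return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

    n = 100_000
    total = 0.0
    for _ in range(n):
        x = random.gauss(0.0, 1.0)       # sample from g
        total += f(x) / g(x)             # weight by f/g
    print(total / n)                     # estimate of the integral of f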
Evolution of Monte Carlo methods so far… • Uniform points and original integrand… …but this had very poor efficiency… • Uniform points and transformed integrand… …but this only worked for certain integrands… • Non-uniform points and scaled integrand… …but this is very cumbersome for complicated integrands… • Now, we try Markov chain approaches…
Markov chains • Properties • A sequence of randomly-chosen states • The probability of transitions between states is independent of history • The entire chain represents a stationary probability distribution • Examples • Random number generators • Brownian motion • Hidden Markov Models • Financial markets
Markov chain Monte Carlo • Assembling the entire distribution for MC is usually hard: • Complicated energy landscapes • High-dimensional systems • Extraordinarily difficult normalization • Solution: • Build up distribution from Markov chain • Choose local transition probabilities which generate distribution of interest (i.e., ensure detailed balance) • Each random variable is chosen based on the previous variable in the chain • “Walk” along the Markov chain until convergence reached • Result: Normalization not required, calculations are local
Detailed balance: example • Markov chain with a Gaussian stationary probability distribution • Detailed balance satisfied
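A Python sketch of such a chain: a symmetric random-walk proposal plus Metropolis acceptance satisfies detailed balance with respect to p(x) ∝ exp(−x²/2) (proposal width and chain length are illustrative):

    # Markov chain with a standard Gaussian stationary distribution.
    import math, random

    x = 0.0
    samples = []
    for _ in range(100_000):
        x_new = x + random.uniform(-1.0, 1.0)   # symmetric proposal
        # accept with probability min(1, p(x_new)/p(x))
        if random.random() < math.exp((x * x - x_new * x_new) / 2):
            x = x_new
        samples.append(x)

    mean = sum(samples) / len(samples)
    var = sum(s * s for s in samples) / len(samples) - mean ** 2
    print(mean, var)                     # should approach 0 and 1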
Application: stochastic transitions • A two-state system A ⇄ B • This is a very simplistic version of kinetic Monte Carlo… • Each Monte Carlo step corresponds to a time interval • The probability of moving between states in that time interval is related to the rate constants • Simulate to give mean first-passage times, transient populations, etc.
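A Python sketch of the two-state A ⇄ B picture, with hypothetical rate constants k_ab, k_ba and time step dt (none of these values come from the slides):

    # Kinetic Monte Carlo for a two-state system: each step advances time
    # by dt, and hop probabilities are rate * dt.
    import random

    k_ab, k_ba, dt = 0.1, 0.05, 1.0      # illustrative rates and time step
    state = "A"
    first_passage = None
    pop = {"A": 0, "B": 0}
    for step in range(10_000):
        r = random.random()
        if state == "A" and r < k_ab * dt:
            state = "B"
            if first_passage is None:
                first_passage = step * dt    # first A -> B passage time
        elif state == "B" and r < k_ba * dt:
            state = "A"
        pop[state] += 1

    print(first_passage, pop)            # populations tend to k_ba : k_ab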
Metropolis Monte Carlo • Start with the detailed balance condition: p(o) P(o→n) = p(n) P(n→o) • Derive an “acceptance ratio” condition (for a symmetric proposal): acc(o→n) / acc(n→o) = p(n) / p(o) • Choose a particular acceptance ratio that satisfies it: acc(o→n) = min[1, p(n)/p(o)]
Application: canonical ensemble • Our un-normalized probabilities look like Boltzmann factors: p(i) ∝ exp(−Ei/kT) • Our acceptance ratio is therefore: acc(o→n) = min[1, exp(−(En − Eo)/kT)]
Algorithm: NVT Metropolis Monte Carlo • Propose a trial move with a chosen step size • Draw a [0,1] uniform random number and accept the move if it falls below the acceptance ratio • Tune the step size for an acceptance rate of ~50% • (Figure: Metropolis MC in a harmonic potential)
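A Python sketch of this algorithm for the harmonic potential U(x) = ½kx² from the figure (spring constant, step size, and energy units of kT are illustrative assumptions):

    # NVT Metropolis Monte Carlo in a 1-D harmonic potential.
    import math, random

    k = 1.0          # spring constant, in units of kT per length^2
    delta = 2.0      # trial step size, tuned toward ~50% acceptance
    x = 0.0
    accepted = 0
    n = 100_000
    for _ in range(n):
        x_trial = x + delta * (random.random() - 0.5)   # symmetric trial move
        dE = 0.5 * k * (x_trial ** 2 - x ** 2)          # energy change / kT
        if dE <= 0 or random.random() < math.exp(-dE):  # Metropolis criterion
            x = x_trial
            accepted += 1
    print(accepted / n)   # acceptance rate; retune delta if far from 0.5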
Advantages of Metropolis MC simulations • Does not require forces • Rapidly-changing energy functions • No differentiation required • Amenable to complex move sets: • Torsions • Rotamers • Tautomers • Etc.
Monte Carlo moves • Trial moves • Rigid body translation • Rigid body rotation • Internal conformational changes (soft vs. stiff modes) • Titration/electronic states • … • Questions: • How “big” a move should we take? • Move one particle or many?
Monte Carlo moves • How “big” a move should we take? • Smaller moves: better acceptance rate, slower sampling • Bigger moves: faster sampling, poorer acceptance rate • Maximize mean-squared displacement per unit of CPU time • “Rule of thumb”: tune for roughly 50% acceptance • Move one particle or many? • Possible to achieve more efficient sampling with correct multi-particle moves • One-particle moves must choose particles at random
Summary • A method for sampling • Integrals • Configuration space • Errors (estimates come along for free) • Any distribution that can be represented by transitions can be sampled • Sampling can be accelerated using various tricks