Universita’ dell’Insubria, Como, Italy
Introduction to Quantum Monte Carlo
Dario Bressanini
http://scienze-como.uninsubria.it/bressanini
UNAM, Mexico City, 2007
Why do simulations? • Simulations are a general method for “solving” many-body problems; other methods usually involve approximations. • Experiments are limited and expensive; simulations can complement them. • Simulations remain feasible even for complex systems. • They scale with the available computer power.
Simulations • “The general theory of quantum mechanics is now almost complete. The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.” (Dirac, 1929)
General strategy • How do we solve a deterministic problem using a Monte Carlo method? • Rephrase the problem in terms of a probability distribution • “Measure” the quantity of interest A by sampling that probability distribution
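In formulas (a standard reconstruction of the relation the slide refers to; A(R) and P(R) are generic placeholders, not taken from the original image):
\[ A \;=\; \int A(R)\, P(R)\, dR \;\approx\; \frac{1}{M} \sum_{i=1}^{M} A(R_i), \qquad R_i \sim P(R) \]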
Monte Carlo Methods • The points Ri are generated using random numbers: this is why the methods are called Monte Carlo methods • We introduce noise into the problem!! Our results have error bars... • ... Nevertheless it might be a good way to proceed • Metropolis, Ulam, Fermi, Von Neumann (~1945)
Stanislaw Ulam (1909-1984) • S. Ulam is credited as the inventor of the Monte Carlo method in the 1940s: it solves mathematical problems using statistical sampling.
Why Monte Carlo? • We can approximate the numerical value of a definite integral directly from its definition: • where we use L uniformly spaced points xi.
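The formula the slide refers to is presumably the simple quadrature sum (a reconstruction, not the original image):
\[ \int_a^b f(x)\, dx \;\approx\; \frac{b-a}{L} \sum_{i=1}^{L} f(x_i) \]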
Error in Quadrature • Consider an integral in D dimensions: • N = L^D uniformly spaced points, proportional to the CPU time • The error with N sampling points scales as N^(-1/D) for a simple rule (see below)
Monte Carlo Estimates of Integrals • If we sample the points not on a regular grid but randomly (uniformly distributed), then • where we assume the integration domain is a regular box of volume V = L^D.
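A reconstruction of the uniform-sampling estimator the slide shows (standard form):
\[ \int_V f(R)\, dR \;\approx\; \frac{V}{N} \sum_{i=1}^{N} f(R_i), \qquad R_i \ \text{uniform in } V \]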
Monte Carlo Error • From probability theory one can show that the Monte Carlo error decreases with sample size N as N^(-1/2) • Independent of the dimension D (good). • To get another decimal place takes 100 times longer! (bad)
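Explicitly (standard central-limit result, reconstructed rather than copied from the slide):
\[ \epsilon \;\approx\; V \sqrt{\frac{\langle f^2 \rangle - \langle f \rangle^2}{N}} \;\propto\; \frac{1}{\sqrt{N}} \]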
MC is advantageous for large dimensions • Error by simple quadrature: N^(-1/D) • Using smarter quadrature: N^(-A/D) • Error by Monte Carlo: always N^(-1/2) • Monte Carlo is more efficient for large D (usually D > 4 - 6)
Monte Carlo Estimates of π • We can estimate π using Monte Carlo: throw points uniformly in the unit square and count the fraction that falls inside the quarter circle of unit radius. (Figure: quarter circle inscribed in the unit square, with the point (1,0) marked on the axis.)
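A minimal Python sketch of this estimate (not from the original slides; simple hit/miss counting inside the quarter circle):

import random

def estimate_pi(n_samples=1_000_000):
    """Estimate pi by sampling points uniformly in the unit square
    and counting the fraction that falls inside the quarter circle."""
    hits = 0
    for _ in range(n_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_samples

print(estimate_pi())   # ~3.14, with an error decreasing as 1/sqrt(N)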
Monte Carlo Integration • Note that we can automatically estimate the error by computing the standard deviation of the sampled function values • All points generated are independent • All points generated are used
Inefficient? • If the function is strongly peaked, the process is inefficient • We should generate more points where the function is large • Use a non-uniform distribution!
General Monte Carlo • If the samples are not drawn uniformly but from some probability distribution p(R), we can compute the integral by Monte Carlo as shown below, where p(R) is normalized to one.
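A reconstruction of the importance-sampling estimator (standard form):
\[ I \;=\; \int f(R)\, dR \;=\; \int \frac{f(R)}{p(R)}\, p(R)\, dR \;\approx\; \frac{1}{N} \sum_{i=1}^{N} \frac{f(R_i)}{p(R_i)}, \qquad R_i \sim p(R), \quad \int p(R)\, dR = 1 \]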
Monte Carlo • Convergence is guaranteed by the Central Limit Theorem • The statistical error → 0 as p(R) → f(R) (up to normalization): the closer p is to f, the faster the convergence
Warning! • Beware of Monte Carlo integration routines in libraries: they usually cannot assume anything about your functions, since they must be general. • They can be quite inefficient • Also beware of standard compiler-supplied Random Number Generators (they are known to be bad!!)
Equation of state of a fluid • The problem: compute the equation of state (p as a function of the particle density N/V) of a fluid in a box, given some interaction potential between the particles • Assume that for every configuration R of the particles we can compute the potential energy E(R)
The Statistical Mechanics Problem • For equilibrium properties we just need to compute Boltzmann-weighted multi-dimensional integrals • where the energy E(R) is usually a sum of pair interactions
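A reconstruction of the thermal average the slide refers to (standard canonical form; v(r) is the assumed pair potential):
\[ \langle A \rangle \;=\; \frac{\int A(R)\, e^{-E(R)/k_B T}\, dR}{\int e^{-E(R)/k_B T}\, dR}, \qquad E(R) \;=\; \sum_{i<j} v(r_{ij}) \]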
An inefficient recipe • For 100 particles (not really the thermodynamic limit), the integrals are in 300 dimensions. • The naïve MC procedure would be to distribute the particles uniformly in the box, throwing them in at random. • If the density is high, throwing particles at random will put some of them too close to each other. • Almost all the points generated this way give a negligible contribution, due to the Boltzmann factor
An inefficient recipe • E(R) becomes very large and positive, so the Boltzmann factor is essentially zero • We should try to generate more points where E(R) is close to its minima
How do we do it? The Metropolis Algorithm • Use the Metropolis algorithm (M(RT)², Metropolis, Rosenbluth, Rosenbluth, Teller and Teller, 1953) ... ... and a powerful computer • The algorithm is a random walk (Markov chain) in configuration space. Points are not independent. • “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” John Von Neumann
Importance Sampling • The idea is to use importance sampling, that is, to sample more where the function is large • “…, instead of choosing configurations randomly, …, we choose configurations with a probability exp(-E/kBT) and weight them evenly.” - from the M(RT)² paper
The key Ideas • Points are no longer independent! • We consider a point (a Walker) that moves in configuration space according to some rules
A Markov Chain • A Markov chain is a random walk through configuration space: R1 → R2 → R3 → R4 → … • Given Rn, there is a transition probability to go to the next point Rn+1: p(Rn → Rn+1), the stochastic matrix • In a Markov chain, the distribution of Rn+1 depends only on Rn: there is no memory • We must use an ergodic Markov chain
The key Ideas • Choose an appropriate p(Rn → Rn+1) so that at equilibrium we sample a given distribution π(R) (for this problem it is just π ∝ exp(-E/kBT)) • A sufficient condition is to impose detailed balance. • Consider an infinite number of walkers and two positions R and R’ • At equilibrium, the number of walkers that go from R → R’ is equal to the number of walkers that go from R’ → R, even though in general p(R → R’) ≠ p(R’ → R)
The Detailed Balance • π(R) is the distribution we want to sample • The detailed balance condition is written below • We have the freedom to choose p(R → R’)
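The detailed balance condition (standard form, reconstructed from the surrounding text):
\[ \pi(R)\; p(R \to R') \;=\; \pi(R')\; p(R' \to R) \]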
Rejecting points • The third key idea is to use rejection to enforce detailed balance • p(R → R’) is split into a transition step and an acceptance/rejection step • T(R → R’) generates the next “candidate” point • A(R → R’) decides whether to accept or reject this point
The Acceptance probability • Given some T, a possible choice for A is the Metropolis rule written below • For a symmetric T it reduces to a simple ratio of the π values
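A reconstruction of the standard acceptance probability (generalized form and its symmetric-T limit):
\[ A(R \to R') \;=\; \min\!\left[\, 1,\; \frac{\pi(R')\, T(R' \to R)}{\pi(R)\, T(R \to R')} \,\right] \;\longrightarrow\; \min\!\left[\, 1,\; \frac{\pi(R')}{\pi(R)} \,\right] \quad (T \text{ symmetric}) \]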
What it does • Suppose π(R’) ≥ π(R) • move is always accepted • Suppose π(R’) < π(R) • move is accepted with probability π(R’)/π(R) • Flip a coin • The algorithm samples regions of large π(R) • Convergence is guaranteed but the rate is not!!
IMPORTANT! • Accepted and rejected states count the same! • When a point is rejected, you add the previous one to the averages • Measure acceptance ratio. Set to roughly 1/2 by varying the “step size” • Exact: no time step error, no ergodic problems in principle (but no dynamics).
Quantum Mechanics • We wish to solve HΨ = EΨ to high accuracy (chemical accuracy ~ 0.001 hartree ~ 0.027 eV) • The solution usually involves computing integrals in high dimensions: 3-30000 • The “classic” approach (from 1929): • Find an approximate Ψ (... but a good one ...) • ... whose integrals are analytically computable (gaussians) • Compute the approximate energy
VMC: Variational Monte Carlo • Start from the Variational Principle • Translate it into Monte Carlo language (see the formulas below)
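A reconstruction of the formulas behind this translation (the standard VMC definitions; the slide images are not available):
\[ E_V \;=\; \frac{\int \Psi_T^*(R)\, H\, \Psi_T(R)\, dR}{\int |\Psi_T(R)|^2\, dR} \;=\; \int P(R)\, E_L(R)\, dR \;\ge\; E_0 \]
\[ P(R) \;=\; \frac{|\Psi_T(R)|^2}{\int |\Psi_T(R')|^2\, dR'}, \qquad E_L(R) \;=\; \frac{H \Psi_T(R)}{\Psi_T(R)} \]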
VMC: Variational Monte Carlo • EV is a statistical average of the local energy EL over P(R) • Recipe: • take an appropriate trial wave function • distribute N points according to P(R) • compute the average of the local energy
Error bars estimation • In Monte Carlo it is easy to estimate the statistical error on the energy • If we average N sampled local energies, the associated statistical error is given below
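A reconstruction of the usual estimate (valid for nearly independent samples; correlated Metropolis samples require blocking, which is beyond this slide):
\[ \bar{E} \;=\; \frac{1}{N} \sum_{i=1}^{N} E_L(R_i), \qquad \sigma(\bar{E}) \;\approx\; \sqrt{\frac{\langle E_L^2 \rangle - \langle E_L \rangle^2}{N-1}} \]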
The Metropolis Algorithm (flow): from the current point Ri, propose a move to Rtry; call the Oracle (the acceptance test); if the move is accepted, Ri+1 = Rtry, otherwise (reject) Ri+1 = Ri; then compute the averages and repeat.
The Oracle (the acceptance test of the Metropolis Algorithm):

if p >= 1          /* accept always */
    accept move
if 0 <= p < 1      /* accept with probability p */
    if p > rnd()
        accept move
    else
        reject move
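A minimal Python sketch of one Metropolis step (not from the original slides; pi_of stands for the unnormalized target distribution π(R), step is the assumed step size):

import random

def metropolis_step(R, pi_of, step):
    """One Metropolis move: propose a symmetric displacement,
    accept with probability min(1, pi(R_try)/pi(R))."""
    R_try = [x + step * (random.random() - 0.5) for x in R]
    p = pi_of(R_try) / pi_of(R)           # the "oracle" ratio
    if p >= 1 or random.random() < p:
        return R_try                      # move accepted
    return R                              # move rejected: keep (and count) R again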
VMC: Variational Monte Carlo • No need to analytically compute integrals: complete freedom in the choice of the trial wave function • Can use explicitly correlated wave functions • Can satisfy the cusp conditions • (Figure: He atom, with electron-nucleus distances r1, r2 and electron-electron distance r12)
VMC advantages • Can go beyond the Born-Oppenheimer approximation, with any potential, in any number of dimensions. • Can compute lower bounds • (Example/figure: the Ps2 molecule (e+e+e-e-) in 2D and 3D, and M+m+M-m- systems as a function of the mass ratio M/m)
Properties of the Local energy • For an exact eigenstate, EL is a constant • At particle coalescence points, the divergence of V must be cancelled by the divergence of the kinetic term • For an approximate trial function, EL is not constant
Reducing Errors • For a trial function, if EL can diverge, the statistical error will be large • To eliminate the divergences we impose Kato’s cusp conditions
Kato’s cusp conditions on Ψ • We can include the correct analytical structure in the trial function • electron - electron cusps: • electron - nucleus cusps:
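A reconstruction of the standard cusp conditions (spherically averaged Ψ, Coulomb potential, atomic units; the slide images are not available):
\[ \left.\frac{1}{\Psi}\frac{\partial \Psi}{\partial r_{ij}}\right|_{r_{ij}=0} = \frac{1}{2} \ \ \text{(electron-electron, antiparallel spins)}, \qquad \left.\frac{1}{\Psi}\frac{\partial \Psi}{\partial r_{iA}}\right|_{r_{iA}=0} = -Z_A \ \ \text{(electron-nucleus)} \]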
Optimization of Ψ • Suppose we have variational parameters in the trial wave function that we want to optimize • The straightforward optimization of E is numerically unstable, because EL can diverge • For a finite sample size N the estimate can be unbounded • Also, our energies have error bars, so different parameter sets can be difficult to compare
Optimization of Ψ • It is better to optimize σ², the variance of the local energy (written below) • It is a measure of the quality of the trial function • Even for finite N it is numerically stable • The wave function with the lowest σ will not have the lowest E, but it is usually close
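A reconstruction of the quantity being minimized (the standard variance of the local energy, which vanishes for an exact eigenstate):
\[ \sigma^2 \;=\; \int P(R)\, \bigl[\, E_L(R) - E_V \,\bigr]^2\, dR \;\ge\; 0 \]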
Optimization of Ψ • Meaning of the optimization of σ: we are trying to reduce the distance between an upper and a lower bound to the energy • For which potential V’ is ΨT an eigenfunction? • We want V’ to be “close” to the real V
VMC drawbacks • The error bar goes down only as N^(-1/2) • It is computationally demanding • The optimization of Ψ becomes difficult as the number of nonlinear parameters increases • It depends critically on our skill in inventing a good Ψ • There exist exact, automatic ways to get better wave functions. Let the computer do the work ... To be continued...
In the last episode: VMC Today: DMC