1 / 155

An Introduction to Monte Carlo Methods for Lattice Quantum Field Theory

An Introduction to Monte Carlo Methods for Lattice Quantum Field Theory. A D Kennedy University of Edinburgh. Model or Theory?. QCD is highly constrained by symmetry Only very few free parameters Coupling constant g Quark masses m Scaling behaviour well understood

Download Presentation

An Introduction to Monte Carlo Methods for Lattice Quantum Field Theory

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Introduction to Monte Carlo Methods for Lattice Quantum Field Theory A D Kennedy University of Edinburgh XI Seminario Nazionale di Fisica Teorica

  2. Model or Theory? • QCD is highly constrained by symmetry • Only very few free parameters • Coupling constant g • Quark masses m • Scaling behaviour well understood • Approach to continuum limit • Approach to thermodynamic (infinite volume) limit XI Seminario Nazionale di Fisica Teorica 

  3. Progress in Algorithms • Important advances made • None so far have had a major impact on machine architecture • Inclusion of dynamical fermion effects • Now routine, but expensive • Improved actions • New formulations for lattice fermions • Chiral fermions can be defined satisfactorily • Need algorithms to use new formalism • Very compute intensive XI Seminario Nazionale di Fisica Teorica 

  4. Algorithms • Hybrid Monte Carlo • Full QCD including dynamical quarks • Most of the computer time spent integrating Hamilton’s equations in fictitious (fifth dimensional) time • Symmetric symplectic integrators used • Known as leapfrog to its friends • Higher-order integration schemes go unstable for smaller integration step sizes XI Seminario Nazionale di Fisica Teorica 

  5. Sparse Linear Solvers • Non-local dynamical fermion effects require solution of large system of linear equations for each time step • Krylov space methods used • Conjugate gradients, BiCGStab,… • Typical matrix size > 107 107 • 324 lattice • 3 colours • 4 spinor components XI Seminario Nazionale di Fisica Teorica 

  6. The Expectation value of an operator is defined non-perturbatively by the Functional Integral • Normalisation constant Zchosen such that • d is the appropriate functional measure Functional Integrals and QFT • The action is S() • Defined in Euclidean space-time • Lattice regularisation • Continuum limit: lattice spacing a  0 • Thermodynamic limit: physical volume V   XI Seminario Nazionale di Fisica Teorica 

  7. Monte Carlo Integration • Monte Carlo integration is based on the identification of probabilities with measures • There are much better methods of carrying out low dimensional quadrature • All other methods become hopelessly expensive for large dimensions • In lattice QFT there is one integration per degree of freedom • We are approximating an infinite dimensional functional integral XI Seminario Nazionale di Fisica Teorica 

  8. Example • Let’s integrate sin x over the interval [0,2π] • The integral is • Its variance is • The error (or standard deviation) of our Monte Carlo estimate is thus σ = ±1.7725 (σ2 = V) XI Seminario Nazionale di Fisica Teorica 

  9. Generate a sequence of random field configurations chosen from the probability distribution • Measure the value of  on each configuration and compute the average Monte Carlo Method XI Seminario Nazionale di Fisica Teorica 

  10. Law of Large Numbers • Central Limit Theorem • The variance of the distribution of  is • The Laplace–DeMoivre Central Limit theorem is an asymptotic expansion for the probability distribution of Central Limit Theorem • Distribution of values for a single sample  = () XI Seminario Nazionale di Fisica Teorica 

  11. Generating function for connected moments • The first few cumulants are Cumulant Expansion • Note that this is an asymptotic expansion XI Seminario Nazionale di Fisica Teorica 

  12. Distribution of the average of N samples • Connected generating function Distribution of the Average XI Seminario Nazionale di Fisica Teorica 

  13. Take inverse Fourier transform to obtain distribution Central Limit Theorem XI Seminario Nazionale di Fisica Teorica 

  14. Central Limit Theorem XI Seminario Nazionale di Fisica Teorica 

  15. Re-scale to show convergence to Gaussian distribution • where and Central Limit Theorem XI Seminario Nazionale di Fisica Teorica 

  16. Integral • Sample from distribution • Probability • Normalisation • Estimator of integral • Estimator of variance Importance Sampling XI Seminario Nazionale di Fisica Teorica 

  17. Optimal measure • Optimal variance Optimal Importance Sampling • Minimise variance • Constraint N=1 • Lagrange multiplier  XI Seminario Nazionale di Fisica Teorica 

  18. Example • Optimal weight • Optimal variance Example XI Seminario Nazionale di Fisica Teorica 

  19. 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 Binning 1 Construct cheap approximation to |sin(x)| 2 Calculate relative area within each rectangle 3 Choose a random number uniformly 4 Select corresponding rectangle 5 Choose another random number uniformly 6 Select corresponding x value within rectangle 7 Compute |sin(x)| XI Seminario Nazionale di Fisica Teorica 

  20. … but we can do better! • For which Example • With 100 rectangles we have V = 16.02328561 • With 100 rectangles we have V = 0.011642808 XI Seminario Nazionale di Fisica Teorica 

  21. Distribution converges to unique fixed point Markov chains • State space  • Deterministic evolution of probability distribution P: Q  Q • (Ergodic) stochastic transitions P’:   XI Seminario Nazionale di Fisica Teorica 

  22. Define a metric on the space of (equivalence classes of) probability distributions • Prove that with  > 0, so the Markov process P is a contraction mapping • The space of probability distributions is complete, so the sequence converges to a unique fixed point distribution Convergence of Markov Chains • The sequence Q, PQ, P2Q, P3Q,…is Cauchy XI Seminario Nazionale di Fisica Teorica 

  23. Convergence of Markov Chains XI Seminario Nazionale di Fisica Teorica 

  24. Convergence of Markov Chains XI Seminario Nazionale di Fisica Teorica 

  25. Use of Markov Chains • Use Markov chains to sample from Q • Suppose we can construct an ergodic Markov processP which has distribution Q as its fixed point • Start with an arbitrary state (“field configuration”) • Iterate the Markov process until it has converged (“thermalized”) • Thereafter, successive configurations will be distributed according to Q • But in general they will be correlated • To construct P we only need relative probabilities of states • Don’t know the normalisation of Q • Cannot use Markov chains to compute integrals directly • We can compute ratios of integrals XI Seminario Nazionale di Fisica Teorica 

  26. How do we construct a Markov process witha specified fixed point? • Detailed balance • Metropolis algorithm • other choices are possible, e.g., Metropolis Algorithm • integrate w.r.t. y to obtain fixed point condition • sufficient but not necessary for fixed point • consider cases Q(x)>Q(y)and Q(x)<Q(y)separately to obtain detailed balance condition • sufficient but not necessary for detailed balance XI Seminario Nazionale di Fisica Teorica 

  27. Composite Markov Steps • Composition of Markov steps • Let P1and P2be two Markov steps which have the desired fixed point distribution... • … they need not be ergodic • Then the composition of the two steps P1P2 will also have the desired fixed point... • … and it may be ergodic • This trivially generalises to any (fixed) number of steps • For the case where Pis not ergodic but Pnis the terminology “weakly” and “strongly” ergodic are sometimes used XI Seminario Nazionale di Fisica Teorica 

  28. Site-by-Site Updates • This result justifies “sweeping” through a lattice performing single site updates • Each individual single site update has the desired fixed point because it satisfies detailed balance • The entire sweep therefore has the desired fixed point, and is ergodic... • … but the entire sweep does not satisfy detailed balance • Of course it would satisfy detailed balance if the sites were updated in a random order… • … but this is not necessary • … and it is undesirable because it puts too much randomness into the system XI Seminario Nazionale di Fisica Teorica 

  29. This corresponds to the exponential autocorrelation time Exponential Autocorrelations • The unique fixed point of an ergodic Markov process corresponds to the unique eigenvector with eigenvalue 1 • All its other eigenvalues must lie within the unit circle • In particular, the largest subleading eigenvalue is |λmax|<1 XI Seminario Nazionale di Fisica Teorica 

  30. Without loss of generality we may assume Integrated Autocorrelations • Consider autocorrelation of some operatorΩ • Define the autocorrelation function XI Seminario Nazionale di Fisica Teorica 

  31. The autocorrelation function falls faster that the exponential autocorrelation • Define integrated autocorrelation function Integrated Autocorrelations • For a sufficiently large number of samples XI Seminario Nazionale di Fisica Teorica 

  32. Coupling From The Past • Propp and Wilson (1996) • Use fixed set of random numbers • Flypaper principle: if states coalesce they stay together forever • Eventually, all states coalesce to some state  with probability one • Any state from t = - will coalesce to  •  is a sample from the fixed point distribution  XI Seminario Nazionale di Fisica Teorica 

  33. Continuum Limits • We are not interested in lattice QFTs per se, but in their continuum limit as a  0 • This corresponds to a continuous phase transition of the discrete lattice model • For the continuum theory to have a finite correlation length a(inverse mass gap) in physical units this correlation length must diverge in lattice units • We expect such systems to exhibit universal behaviour as they approach a continuous phase transition • The nature of the continuum theory is characterised by its symmetries • The details of the microscopic interactions are unimportant at macroscopic length scales • Universality is a consequence of the way that the the theory scales as a  0 while a is kept fixed XI Seminario Nazionale di Fisica Teorica 

  34. Continuum Limits • The nature of the continuum field theory depends on the way that physical quantities behave as the system approaches the continuum limit • The scaling of the parameters in the action required is described by the renormalisation group equation (RGE) • The RGE states the “reparameterisation invariance” of the theory as the choice of scale at which we choose to fix the renormalisation conditions • As a  0 we expect the details of the regularisation scheme (cut off effects, lattice artefacts) to vanish, so the effect of changing ais just an RGE transformation • On the lattice the a renormalisation group transformation may be implemented as a “block spin” transformation of the fields • Strictly speaking, the renormalisation “group” is a semigroup on the lattice, as blockings are not invertible XI Seminario Nazionale di Fisica Teorica 

  35. Continuum Limits • The continuum limit of a lattice QFT corresponds to a fixed point of the renormalisation group • At such a fixed point the form of the action does not change under an RG transformation • The parameters in the action scale according to a set of critical exponents • All known four dimensional QFTs correspond to trivial (Gaussian) fixed points • For such a fixed point the UV nature of the theory may by analysed using perturbation theory • Monomials in the action may be classified according to their power counting dimension • d < 4 relevant (superrenormalisable) • d = 4 marginal (renormalisable) • d > 4 irrelevant (nonrenormalisable) XI Seminario Nazionale di Fisica Teorica 

  36. the dynamical critical exponent ztells us how the cost of generating an independent configuration grows as the correlation length of the system is taken to , Continuum Limits • The behaviour of our Markov chains as the system approaches a continuous phase transition is described by its dynamical critical exponents • these describe how badly the system (model + algorithm) undergo critical slowing down • this is closely related (but not always identical) to the dynamical critical exponent for the exponential or integrated autocorrelations XI Seminario Nazionale di Fisica Teorica 

  37. Global Heatbath • The ideal generator selects field configurations randomly and independently from the desired distribution • It is hard to construct such global heatbath generators for any but the simplest systems • They can be built by selecting sufficiently independent configurations from a Markov process… • … or better yet using CFTP which guarantees that the samples are truly uncorrelated • But this such generators are expensive! XI Seminario Nazionale di Fisica Teorica 

  38. Global Heatbath • For the purposes of Monte Carlo integration these is no need for the configurations to be completely uncorrelated • We just need to take the autocorrelations into account in our estimate of the variance of the resulting integral • Using all (or most) of the configurations generated by a Markov process is more cost-effective than just using independent ones • The optimal choice balances the cost of applying each Markov step with the cost of making measurements XI Seminario Nazionale di Fisica Teorica 

  39. Local Heatbath • For systems with a local bosonic action we can build a Markov process with the fixed point distribution  exp(-S) out of steps which update a single site with this fixed point • If this update generates a new site variable value which is completely independent of its old value then it is called a local heatbath • For free field theory we just need to generate a Gaussian-distributed random variable XI Seminario Nazionale di Fisica Teorica 

  40. If are uniformly distributed random numbers then is approximately Gaussian by the Central Limit theorem Gaussian Generators • This is neither cheap nor accurate XI Seminario Nazionale di Fisica Teorica 

  41. If xis uniformly distributed and fis a monotonically increasing function then f(x)is distributed as • Choosing we obtain • Therefore generate two uniform random numbers x1and x2, set ,then are two independent Gaussian distributed random numbers Gaussian Generators XI Seminario Nazionale di Fisica Teorica 

  42. Gaussian Generators • Even better methods exist • The Rectangle-Wedge-Tail (RWT) method • The Ziggurat method • These do not require special function evaluations • They can be more interesting to implement for parallel computers XI Seminario Nazionale di Fisica Teorica 

  43. Local Heatbath • For pure gauge theories the field variables live on the links of the lattice and take their values in a representation of the gauge group • For SU(2) Creutz gave an exact local heatbath algorithm • It requires a rejection test: this is different from the Metropolis accept/reject step in that one must continue generating candidate group elements until one is accepted • For SU(3) the “quasi-heatbath” method of Cabibbo and Marinari is widely used • Update a sequence of SU(2) subgroups • This is not quite an SU(3) heatbath method… • … but sweeping through the lattice updating SU(2) subgroups is also a valid algorithm, as long as the entire sweep is ergodic • For a higher acceptance rate there is an alternative SU(2) subgroup heatbath algorithm XI Seminario Nazionale di Fisica Teorica 

  44. Hybrid Monte Carlo • In order to carry out Monte Carlo computations including the effects of dynamical fermions we would like to find an algorithm which • Updates the fields globally • Because single link updates are not cheap if the action is not local • Takes large steps through configuration space • Because small-step methods carry out a random walk which leads to critical slowing down with a dynamical critical exponent z=2 • Does not introduce any systematic errors XI Seminario Nazionale di Fisica Teorica 

  45. Hybrid Monte Carlo • A useful class of algorithms with these properties is the (Generalised) Hybrid Monte Carlo (HMC) method • Introduce a “fictitious momentum” p corresponding to each dynamical degree of freedom q • Find a Markov chain with fixed point  exp[-H(q,p)] where H is the “fictitious Hamiltonian” ½ p2 + S(q) • The action S of the underlying QFT plays the rôle of the potential in the “fictitious” classical mechanical system • This gives the evolution of the system in a fifth dimension, “fictitious” or computer time • This generates the desired distribution exp[-S(q)] if we ignore the momenta q (i.e., the marginal distribution) XI Seminario Nazionale di Fisica Teorica 

  46. Hybrid Monte Carlo • The GHMC Markov chain alternates two Markov steps • Molecular Dynamics Monte Carlo (MDMC) • Partial Momentum Refreshment • Both have the desired fixed point • Together they are ergodic XI Seminario Nazionale di Fisica Teorica 

  47. Molecular Dynamics • If we could integrate Hamilton’s equations exactly we could follow a trajectory of constant fictitious energy • This corresponds to a set of equiprobable fictitious phase space configurations • Liouville’s theorem tells us that this also preserves the functional integral measure dp dq as required • Any approximate integration scheme which is reversible and area preserving may be used to suggest configurations to a Metropolis accept/reject test • With acceptance probability min[1,exp(-H)] XI Seminario Nazionale di Fisica Teorica 

  48. Molecular Dynamics (MD), an approximate integrator which is exactly • Area preserving • Reversible • A momentum flip • The composition of these gives MDMC • We build the MDMC step out of three parts • A Metropolis accept/reject step • With y being uniformly distributed in [0,1] XI Seminario Nazionale di Fisica Teorica 

  49. This mixes the Gaussian distributed momenta pwith Gaussian noise  Momentum Refreshment • The Gaussian distribution of p is invariant under F • the extra momentum flip F ensures that for small the momenta are reversed after a rejection rather than after an acceptance • for =/2 all momentum flips are irrelevant XI Seminario Nazionale di Fisica Teorica 

  50. Special Cases of GHMC • The Hybrid Monte Carlo (HMC) algorithm is the special case where =/2 • =0 corresponds to an exact version of the Molecular Dynamics (MD) or Microcanonical algorithm (which is in general non-ergodic) • The L2MC algorithm of Horowitz corresponds to choosing arbitrary  but MDMC trajectories of a single leapfrog integration step (=). This method is also called Kramers algorithm. • The Langevin Monte Carlo algorithm corresponds to choosing =/2 and MDMC trajectories of a single leapfrog integration step (=). XI Seminario Nazionale di Fisica Teorica 

More Related