Slow and Fast Mixing of Tempering and Swapping for the Potts Model
Nayantara Bhatnagar, UC Berkeley
Dana Randall, Georgia Tech
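Markov Chains
$K = (\Omega, P)$
[Figure: states X and Y with transition probabilities $P(X,Y)$ and $P(Y,X)$]
• Theorem: If $K$ is connected and “aperiodic”, the Markov chain $X_0, X_1, \ldots$ converges in the limit to a unique stationary distribution $\pi$ over $\Omega$:
$\lim_{t \to \infty} \Pr[X_t = Y \mid X_0] = \pi(Y)$
• If $P(X,Y) = P(Y,X)$, $\pi$ is uniform over $\Omega$.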
Markov Chains
• Matchings: Broder’s Markov chain
• Independent Sets: Glauber dynamics
• Partition functions of Ising, Potts models: Glauber dynamics
• Volume of a convex body: Ball walk, Lattice walk
Introduction: Markov Chain Monte Carlo
• Markov Chains:
  • Matchings – Broder’s Markov chain
  • Colorings – Glauber dynamics
  • Independent Sets – Glauber dynamics
  • Ising, Potts model – Glauber dynamics
  • Volume – Ball walk, Lattice walk
• Mixing time, T: time to get within 1/4 of π in variation distance.
• Rapid mixing (T polynomial), slow mixing (T exponential).
• Techniques for proving rapid mixing:
  • Coupling, Spectral Gap, Conductance and isoperimetry, Multicommodity flows, Decomposition, Comparison ...
What if the natural Markov chain is slowly mixing?
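To make the mixing time definition above concrete, here is a minimal Python sketch that computes T by brute force for a small chain given as an explicit transition matrix. The helper names (`step`, `tv_distance`, `mixing_time`, `lazy_cycle`) and the lazy-cycle example are illustrative assumptions, not part of the talk.

```python
def step(dist, P):
    """One step of the chain: row vector dist times transition matrix P."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]


def tv_distance(mu, nu):
    """Total variation distance between two distributions on the same set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def mixing_time(P, pi, eps=0.25, max_steps=10_000):
    """Smallest t with max over start states x of TV(P^t(x, .), pi) <= eps."""
    n = len(P)
    # One point-mass start distribution per state.
    dists = [[1.0 if y == x else 0.0 for y in range(n)] for x in range(n)]
    for t in range(max_steps + 1):
        if max(tv_distance(d, pi) for d in dists) <= eps:
            return t
        dists = [step(d, P) for d in dists]
    return None  # did not mix within max_steps


def lazy_cycle(n):
    """Hypothetical example chain: lazy random walk on an n-cycle (n >= 3)."""
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        P[x][x] = 0.5
        P[x][(x + 1) % n] += 0.25
        P[x][(x - 1) % n] += 0.25
    return P
```

For example, `mixing_time(lazy_cycle(6), [1/6] * 6)` returns the number of steps the lazy walk on a 6-cycle needs from its worst start state. Enumerating the full matrix is only feasible for tiny state spaces, which is why the proof techniques listed above are needed in general.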
The q-state Potts Model
q-state Ferromagnetic Potts Model:
• Underlying graph: $G(V,E)$
• Configurations: $\Omega = \{x : x \in [q]^n\}$
• Inverse temperature $\beta > 0$: $\pi_\beta(x) \propto e^{\beta H(x)}$, where $H(x) = \sum_{(i,j) \in E} \delta_{x_i = x_j}$
Glauber dynamics Markov Chain:
• Choose $(v, c_{t+1}(v)) \in_R V \times [q]$.
• Update $c_t(v)$ to $c_{t+1}(v)$ with Metropolis probabilities.
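The Glauber dynamics update described above can be sketched in Python as follows. This is a sketch under stated assumptions: the graph is passed as an adjacency dictionary, colors are the integers 0..q-1, and "Metropolis probabilities" is read as accepting with probability min(1, e^{β ΔH}). The function names are hypothetical.

```python
import math
import random


def hamiltonian(x, edges):
    """H(x): number of monochromatic edges."""
    return sum(1 for i, j in edges if x[i] == x[j])


def glauber_step(x, adj, q, beta, rng):
    """One Metropolis update: pick a vertex and a color uniformly at random."""
    v = rng.randrange(len(x))
    c_new = rng.randrange(q)
    c_old = x[v]
    # Only edges at v change, so the change in H is local.
    delta = sum((x[u] == c_new) - (x[u] == c_old) for u in adj[v])
    # Ferromagnetic weights e^{beta * H}: accept w.p. min(1, e^{beta * delta}).
    if rng.random() < math.exp(min(0.0, beta * delta)):
        x[v] = c_new
    return x


def complete_graph(n):
    """Adjacency dict and edge list of K_n."""
    adj = {v: [u for u in range(n) if u != v] for v in range(n)}
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return adj, edges
```

A typical use is `adj, edges = complete_graph(10)` followed by repeated calls to `glauber_step(x, adj, 3, beta, random.Random(0))` on a list `x` of colors.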
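Why Simulated Tempering
Glauber dynamics mixes slowly for the q-state Potts model on $K_n$ for $q \ge 2$, at large enough $\beta$.
[Figure: axes $\pi_\beta(x)$ and $H(x)$; a set $S$ and its complement $S^c$]
Conductance [Jerrum-Sinclair ’89, Lawler-Sokal ’88]:
$\Phi_S = \Pr[X_{t+1} \notin S \mid X_t \sim \pi(S)]$, $\Phi = \min_{S : \pi(S) \le 1/2} \Phi_S$
Theorem: $\frac{c_1}{\Phi} \le T \le \frac{c_2}{\Phi^2}$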
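For a chain small enough to enumerate, the conductance defined above can be computed directly by trying every set S with π(S) ≤ ½. The sketch below assumes an explicit transition matrix `P` and stationary vector `pi`; the function name `conductance` is hypothetical.

```python
from itertools import combinations


def conductance(P, pi):
    """Brute-force Phi = min over S with 0 < pi(S) <= 1/2 of Phi_S."""
    n = len(P)
    best = float("inf")
    for size in range(1, n):
        for S in combinations(range(n), size):
            mass = sum(pi[x] for x in S)
            if mass <= 0.0 or mass > 0.5 + 1e-12:
                continue
            inside = set(S)
            # Stationary probability flow leaving S in one step.
            flow = sum(pi[x] * P[x][y]
                       for x in S for y in range(n) if y not in inside)
            best = min(best, flow / mass)
    return best
```

The loop is exponential in the number of states, so this only illustrates the definition. For the Potts model the talk bounds Φ analytically by exhibiting one bad cut.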
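Simulated Tempering [Marinari-Parisi ’92]
Define inverse temperatures $0 = \beta_0 < \ldots < \beta_M = \beta$, with $\beta_i = \beta_M \cdot \frac{i}{M}$, and distributions $\pi_0, \pi_1, \ldots, \pi_M = \pi_\beta$ on $\Omega$.
$\hat{\Omega} = \Omega \times [M+1]$, $\hat{\pi}(x,i) = \frac{1}{M+1}\, \pi_i(x)$
Tempering Markov Chain: from $(x,i)$,
• W.p. ½, Glauber dynamics at $\beta_i$
• W.p. ½, randomly move to $(x, i \pm 1)$
[Figure: stack of distributions $\pi_M, \ldots, \pi_0$]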
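One tempering step could be sketched as below. The slide does not spell out the acceptance rule for the temperature move, so this sketch assumes a Metropolis rule with respect to the uniform-in-temperature target, which needs the ratio Z_i / Z_j of partition functions. Here those come from brute-force enumeration, which is only possible for toy state spaces. All function and parameter names are hypothetical.

```python
import math
import random


def make_betas(beta_max, M):
    """Linear ladder beta_i = beta_max * i / M for i = 0..M."""
    return [beta_max * i / M for i in range(M + 1)]


def log_partition(H_values, beta):
    """log Z_beta = log sum_x e^{beta H(x)}, by enumerating all H(x)."""
    m = max(beta * h for h in H_values)
    return m + math.log(sum(math.exp(beta * h - m) for h in H_values))


def tempering_step(x, i, H, betas, logZ, base_step, rng):
    """From (x, i): w.p. 1/2 update x at beta_i, else propose i -> i +/- 1."""
    M = len(betas) - 1
    if rng.random() < 0.5:
        return base_step(x, betas[i], rng), i
    j = i + rng.choice((-1, 1))
    if j < 0 or j > M:
        return x, i  # proposal off the ladder: stay put
    # Assumed Metropolis ratio pi_j(x) / pi_i(x); the 1/(M+1) factors cancel.
    log_ratio = (betas[j] - betas[i]) * H(x) - (logZ[j] - logZ[i])
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return x, j
    return x, i
```

Here `base_step(x, beta, rng)` stands for any fixed-temperature update, for example one Glauber move, and `logZ[i]` is `log_partition` evaluated at `betas[i]`.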
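Swapping [Geyer ’91]
Define inverse temperatures $0 = \beta_0 < \ldots < \beta_M = \beta$, with $\beta_i = \beta_M \cdot \frac{i}{M}$, and distributions $\pi_0, \pi_1, \ldots, \pi_M = \pi_\beta$ on $\Omega$.
$\hat{\Omega} = \Omega^{[M+1]}$, $\hat{\pi}(x) = \prod_i \pi_i(x_i)$
Swapping Markov Chain: from $x$, choose random $i$,
• W.p. ½, Glauber dynamics at $\beta_i$
• W.p. ½, move to $x_{(i,i+1)}$ (components $i$ and $i+1$ exchanged)
[Figure: stack of distributions $\pi_M, \ldots, \pi_0$]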
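For the swap move, a Metropolis acceptance rule with respect to the product distribution has the convenient property that the normalizing constants cancel: the ratio π_i(x_{i+1}) π_{i+1}(x_i) / (π_i(x_i) π_{i+1}(x_{i+1})) equals exp((β_i − β_{i+1})(H(x_{i+1}) − H(x_i))). The sketch below assumes that rule, which the slide does not state explicitly. The function names are hypothetical.

```python
import math
import random


def swap_acceptance(beta_i, beta_j, H_xi, H_xj):
    """min(1, exp((beta_i - beta_j) * (H(x_j) - H(x_i)))) for weights e^{beta H}."""
    return math.exp(min(0.0, (beta_i - beta_j) * (H_xj - H_xi)))


def swap_move(xs, i, betas, H, rng):
    """Propose exchanging components i and i+1 of xs in place."""
    if i + 1 >= len(xs):
        return False  # no neighbor above the top temperature
    a = swap_acceptance(betas[i], betas[i + 1], H(xs[i]), H(xs[i + 1]))
    if rng.random() < a:
        xs[i], xs[i + 1] = xs[i + 1], xs[i]
        return True
    return False
```

Because no partition function appears, the swap move can be implemented without estimating Z_i, unlike the temperature move in simulated tempering.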
Theoretical Results
• Madras-Zheng ’99:
  • Tempering mixes rapidly at all temperatures for the ferromagnetic Ising model (Potts model, q = 2) on $K_n$.
  • Rapid mixing for symmetric bimodal exponential distribution on an interval.
• Zheng ’99:
  • Rapid mixing of swapping implies tempering mixes rapidly.
• B-Randall ’04:
  • Simulated Tempering mixes slowly for 3 state ferromagnetic Potts model on $K_n$.
  • Modified swapping algorithm is rapidly mixing for mean-field Ising model with an external field.
• Woodard, Schmidler, Huber ’08:
  • Sufficient conditions for rapid mixing of tempering and swapping.
  • Sufficient conditions for torpid mixing of tempering and swapping.
In This Talk:
B-Randall ’04: Tempering and swapping for the mean-field Potts model.
• Slow Mixing
  • Tempering can be slowly mixing for any choice of temperatures.
• Rapid Mixing
  • Alternative tempered distributions for rapid mixing.
Tempering for Potts Model
Theorem [BR]: There exists $\beta_{crit} > 0$ such that tempering for the Potts model on $K_n$ at $\beta_{crit}$ mixes slowly.
Proof idea: Bound conductance on $\hat{\Omega} = \Omega \times [M+1]$.
• Cut depends on number of vertices of each color.
• Induces the same cut on $\Omega$ at each $\beta_i$.
• The space $\Omega$ is partitioned into equivalence classes $\sigma$.
[Figure: equivalence classes $\sigma$, with labeled points (n,0,0), (n/2, 0, n/2), (0,0,n)]
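Stationary Distribution of Tempering Chain
$\pi_i(\sigma) \propto \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}\, e^{\beta_i \left( (\sigma_R)^2 + (\sigma_B)^2 + (\sigma_G)^2 \right)}$
[Figure: $\pi_i$ at $\beta_0$, at $0 < \beta_i < \beta_{crit}$, and at $\beta_{crit}$ (labeled: ordered mode, disordered mode)]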
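The distribution over color classes on this slide can be tabulated exactly for small n. The following sketch assumes the weight is the multinomial coefficient times e^{β((σ_R)² + (σ_B)² + (σ_G)²)}, as the slide's fragments suggest. With this exponent the entropy term grows like n and the energy term like βn², so the interesting range of β is on the order of 1/n. The function names are hypothetical.

```python
import math


def log_multinomial(n, parts):
    """log of n! / (parts[0]! * parts[1]! * ...)."""
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in parts)


def class_distribution(n, beta):
    """Normalized weights over classes sigma = (R, B, G) with R + B + G = n."""
    logw = {}
    for r in range(n + 1):
        for b in range(n - r + 1):
            g = n - r - b
            energy = r * r + b * b + g * g
            logw[(r, b, g)] = log_multinomial(n, (r, b, g)) + beta * energy
    m = max(logw.values())  # subtract the max for numerical stability
    Z = sum(math.exp(v - m) for v in logw.values())
    return {s: math.exp(v - m) / Z for s, v in logw.items()}
```

At β = 0 the mass concentrates on the balanced (disordered) class, and at large β on the monochromatic (ordered) classes. Scanning β between these regimes shows where both modes coexist.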
Tempering Fails to Converge
At $\beta_{crit}$, tempering mixes slowly for any set of intermediate temperatures.
[Figure: distributions at $\beta_0$, at $0 < \beta_i < \beta_{crit}$, and at $\beta_{crit}$]
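Swapping and Tempering for Asymmetric Distributions – Rapid Mixing
• Asymmetric exponential: $\pi(x) \propto C^{|x|}$, $x \in [-n_1, n_2]$, $n_1 > n_2$
• Ising model with an external field: $\pi_\beta(x) \propto e^{\beta H(x)}$, $H(x) = \sum_{(i,j)} \delta_{x_i = x_j} + B \sum_i \delta_{x_i = +}$
• Potts model on $K_R$, the line $\sigma_B = \sigma_G \le n/3$: $\pi_\beta(x) \propto e^{\beta H(x)}$, $H(x) = \sum_{(i,j)} \delta_{x_i = x_j}$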
Decomposition of Swapping Chain
$\pi_i(x) \propto C^{|x| \frac{i}{M}}$
Madras-Randall ’02: Decomposition for Markov chains
• Mixing of restricted chains $R_{0,i}$ and $R_{1,i}$ at each temperature.
• Mixing of the projection chain $P$.
$T_{swap}$ is bounded, up to a constant $C$, in terms of $\min_{b \in \{0,1\},\, i \le M} T_{R_{b,i}} \times T_P$.
Decomposition of Swapping Chain
$\pi_i(x) \propto C^{|x| \frac{i}{M}}$
Projection for swapping chain; Weighted Cube (WC)
[Figure: bit strings 011010, 010110, 011011 for the projection of the swapping chain; 011010, 010010 for the weighted cube]
Decomposition of Swapping Chain
Projection for swapping chain; Weighted Cube (WC)
Up to polynomials, $\pi_i(0) \approx C^{n_1 i/M} / Z_i$ and $\pi_i(1) \approx C^{n_2 i/M} / Z_i$.
Lemma: If for $i > j$, $\pi_i(1)\,\pi_j(0) \le p(n)\,\pi_i(0)\,\pi_j(1)$, then $T_P \le q(n)\, T_{WC}$.
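Flat-Swap: Fast Mixing for Mean-Field Models
• Modify more than just temperature.
• Define $\pi'_M, \ldots, \pi'_0$ so cut is not preserved.
$\pi_i(\sigma) \propto \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}\, e^{\beta_i \left( (\sigma_R)^2 + (\sigma_B)^2 + (\sigma_G)^2 \right)}$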
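Flat-Swap: Fast Mixing for Mean-Field Models
• Modify more than just temperature.
• Define $\pi'_M, \ldots, \pi'_0$ so cut is not preserved.
$\pi'_i(\sigma) = \pi_i(\sigma)\, f_i(\sigma) = \pi_i(\sigma) \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}^{\frac{i-M}{M}}$
$\pi'_i(\sigma) \propto \binom{n}{\sigma_R\ \sigma_B\ \sigma_G}^{\frac{i}{M}} e^{\beta_i \left( (\sigma_R)^2 + (\sigma_B)^2 + (\sigma_G)^2 \right)}$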
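The flattened weights can be written down directly. This sketch assumes the reading in which the multinomial (entropy) term is raised to the power i/M while the energy term uses β_i = β_M · i/M, so that the distribution at i = 0 is uniform over color classes and the one at i = M is the original. The function names are hypothetical.

```python
import math


def log_multinomial(n, parts):
    """log of n! / (parts[0]! * parts[1]! * ...)."""
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in parts)


def original_log_weight(sigma, i, M, beta_max):
    """log of multinomial * exp(beta_i * sum of squares), unnormalized."""
    beta_i = beta_max * i / M
    n = sum(sigma)
    return log_multinomial(n, sigma) + beta_i * sum(s * s for s in sigma)


def flat_log_weight(sigma, i, M, beta_max):
    """Assumed flat-swap weight: the entropy term is also scaled by i/M."""
    beta_i = beta_max * i / M
    n = sum(sigma)
    return (i / M) * log_multinomial(n, sigma) + beta_i * sum(s * s for s in sigma)
```

The difference between the two log-weights is ((i − M)/M) times the log multinomial, which corresponds to the factor f_i(σ) on the slide.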
Flat Swap for Mean-Field Models
• Modify more than just temperature.
• Define $\pi'_M, \ldots, \pi'_0$ so cut is not preserved.
Lemma: For $i > j$, $\pi'_i(0)\,\pi'_j(1) \le p(n)\,\pi'_i(1)\,\pi'_j(0)$.
Theorem [B-Randall]:
• Flat swap for the 3-state Potts model on $K_R$ using the distributions $\pi'_M, \ldots, \pi'_0$ mixes rapidly at every temperature.
• Flat swap mixes rapidly for the mean-field Ising model at every temperature and for any external field $B$.
Summary and Open Problems
Summary
• Insight into why tempering can fail to converge.
• Designing more robust tempering algorithms.
Open Problems
• Simulated tempering algorithms for other problems?
• Relative complexity of swapping and tempering
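Tempering vs. Fixed Temperature
Theorem [BR]: On the line $K_R$, $\sigma_G = \sigma_B \le n/3$, tempering mixes slower than Metropolis at $\beta_M > \beta_{crit}$ by an exponential factor.
[Figure: distributions from $\beta_0$ up to $\beta_M > \beta_{crit}$, with a set $S$ marked]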