Adaptive annealing: a near-optimal connection between sampling and counting
Daniel Štefankovič (University of Rochester)
Santosh Vempala, Eric Vigoda (Georgia Tech)
Counting: independent sets, spanning trees, matchings, perfect matchings, k-colorings
(Approx) counting via sampling: Valleau, Card '72 (physical chemistry), Babai '79 (for matchings and colorings), Jerrum, Valiant, V. Vazirani '86.
The outcome of the JVV reduction: random variables X1, X2, ..., Xt such that
1) E[X1 X2 ⋯ Xt] = "WANTED"
2) the Xi are easy to estimate: squared coefficient of variation (SCV) V[Xi]/E[Xi]² = O(1)
(Approx) counting via sampling:
1) E[X1 X2 ⋯ Xt] = "WANTED"
2) the Xi are easy to estimate: V[Xi]/E[Xi]² = O(1)
Theorem (Dyer-Frieze '91): O(t²/ε²) samples (O(t/ε²) from each Xi) give a (1±ε)-estimator of "WANTED" with probability ≥ 3/4.
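As a concrete illustration, here is a minimal sketch of the product estimator behind the Dyer-Frieze theorem; the toy Bernoulli variables and the name product_estimator are illustrative, not from the talk:

```python
import random

def product_estimator(samplers, eps):
    """Estimate E[X1]*...*E[Xt] for independent Xi with SCV O(1).

    Per Dyer-Frieze '91, O(t/eps^2) samples from each Xi give a
    (1 +/- eps)-estimate of the product with probability >= 3/4;
    the constant chosen here is illustrative.
    """
    t = len(samplers)
    m = max(1, int(t / eps ** 2))      # samples drawn from each Xi
    est = 1.0
    for draw in samplers:
        est *= sum(draw() for _ in range(m)) / m   # sample mean of Xi
    return est

# Toy usage: four Bernoulli(0.75) variables, so the true product is 0.75^4.
samplers = [lambda: float(random.random() < 0.75) for _ in range(4)]
print(product_estimator(samplers, eps=0.1))
```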
JVV for independent sets
GOAL: given a graph G, estimate the number of independent sets of G.
1 / (# independent sets) = P(∅), the probability that a uniformly random independent set is the empty set.
JVV for independent sets
By the chain rule P(A ∩ B) = P(A)·P(B|A), the probability of the empty independent set factors vertex by vertex:
P(∅) = X1 X2 X3 X4,
where Xi ∈ [0,1] indicates that vertex i is absent from a sample, conditioned on vertices 1,...,i−1 being absent. Each E[Xi] ≥ ½, so V[Xi]/E[Xi]² = O(1).
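A brute-force sketch of this chain-rule estimator, assuming a toy graph small enough that the sampler oracle can be simulated by exact enumeration (all names and sample counts are illustrative):

```python
import itertools, random

def independent_sets(vertices, edges):
    """Enumerate all independent sets of the graph (brute force)."""
    sets = []
    for r in range(len(vertices) + 1):
        for s in itertools.combinations(vertices, r):
            ss = set(s)
            if all(not (u in ss and v in ss) for u, v in edges):
                sets.append(ss)
    return sets

def estimate_num_independent_sets(vertices, edges, samples=2000):
    """JVV: #IS = 1/P(empty set), with P(empty set) factored by the chain rule.

    The i-th factor P(v_i not in S | v_1..v_{i-1} not in S) equals
    P(v_i not in S') for S' uniform over independent sets of the graph
    with v_1..v_{i-1} removed; each factor is >= 1/2, so SCV is O(1).
    """
    inv_p = 1.0
    for i, v in enumerate(vertices):
        remaining = vertices[i:]
        sub_edges = [(a, b) for a, b in edges if a in remaining and b in remaining]
        pool = independent_sets(remaining, sub_edges)   # stands in for the oracle
        hits = sum(v not in random.choice(pool) for _ in range(samples))
        inv_p *= samples / hits
    return inv_p

# Toy usage: the 4-cycle has exactly 7 independent sets.
print(estimate_num_independent_sets([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]))
```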
JVV: if we have a sampler oracle (input: graph G; output: a random independent set of G), then there is an FPRAS using O(n²) samples.
ŠVV: if we have a sampler oracle (input: graph G and inverse temperature β; output: a set from the gas-model Gibbs distribution at β), then there is an FPRAS using O*(n) samples.
Application – independent sets
O*(|V|) samples suffice for counting.
Cost per sample (Vigoda '01, Dyer-Greenhill '01): time = O*(|V|) for graphs of degree ≤ 4.
Total running time: O*(|V|²).
Other applications (total running time):
matchings: O*(n²m) (using Jerrum, Sinclair '89)
spin systems, e.g. Ising model: O*(n²) for β < βC (using Marinelli, Olivieri '95)
k-colorings: O*(n²) for k > 2Δ (using Jerrum '95)
easy = hot, hard = cold
Hamiltonian [figure: an example configuration space with Hamiltonian values 4, 2, 1, 0]
Big set Ω, Hamiltonian H : Ω → {0,...,n}
Goal: estimate |H⁻¹(0)|.
|H⁻¹(0)| = E[X1] ⋯ E[Xt]
Distributions between hot and cold
β = inverse temperature
β = 0: hot, uniform on Ω
β = ∞: cold, uniform on H⁻¹(0)
μβ(x) ∝ exp(−β·H(x)) (Gibbs distributions)
Distributions between hot and cold
μβ(x) ∝ exp(−β·H(x)), i.e., μβ(x) = exp(−β·H(x)) / Z(β).
Normalizing factor = partition function: Z(β) = Σx exp(−β·H(x))
Partition function
Z(β) = Σx exp(−β·H(x))
have: Z(0) = |Ω|
want: Z(∞) = |H⁻¹(0)|
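A tiny brute-force check of these two identities; the 4-cycle graph and the choice H(x) = number of edges inside x (so that H⁻¹(0) is the set of independent sets) are an illustrative assumption:

```python
import itertools, math

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]          # 4-cycle, for illustration

def H(x):
    """Hamiltonian: number of edges with both endpoints in x."""
    return sum(u in x and v in x for u, v in edges)

def Z(beta):
    """Partition function Z(beta) = sum over all subsets of exp(-beta*H)."""
    return sum(math.exp(-beta * H(set(s)))
               for r in range(len(vertices) + 1)
               for s in itertools.combinations(vertices, r))

print(Z(0.0))    # 16 = |Omega| = 2^4
print(Z(50.0))   # ~7 = |H^-1(0)|, the number of independent sets
```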
Assumption: we have a sampler oracle for μβ(x) = exp(−β·H(x)) / Z(β) (input: graph G and β; output: a subset of V drawn from μβ).
Draw W from μβ and set X = exp(H(W)·(β − β')).
Then we can obtain the following ratio:
E[X] = Σs μβ(s) X(s) = Z(β') / Z(β)
Our goal restated
Partition function Z(β) = Σx exp(−β·H(x)). Goal: estimate Z(∞) = |H⁻¹(0)|.
Z(∞) = Z(0) · [Z(β1)/Z(β0)] · [Z(β2)/Z(β1)] ⋯ [Z(βt)/Z(βt−1)]
where β0 = 0 < β1 < β2 < ... < βt = ∞
Our goal restated
Cooling schedule: β0 = 0 < β1 < β2 < ... < βt = ∞.
How to choose the cooling schedule? Minimize its length t, while keeping each ratio easy to estimate:
E[Xi] = Z(βi)/Z(βi−1) and V[Xi]/E[Xi]² = O(1). A sketch of the whole pipeline follows below.
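Putting the pieces together, a minimal end-to-end sketch: a brute-force stand-in for the sampler oracle, the estimator X from the previous slide, and the telescoping product along a hand-picked (hypothetical) schedule:

```python
import itertools, math, random

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]          # toy 4-cycle again
subsets = [set(s) for r in range(5) for s in itertools.combinations(vertices, r)]

def H(x):
    return sum(u in x and v in x for u, v in edges)

def gibbs_sample(beta):
    """Stands in for the sampler oracle: exact sampling from mu_beta."""
    weights = [math.exp(-beta * H(x)) for x in subsets]
    return random.choices(subsets, weights=weights)[0]

def anneal_count(schedule, samples=3000):
    """Estimate Z(infinity) = |H^-1(0)| as Z(0) * prod of Z(b_i)/Z(b_{i-1}).

    Each ratio is estimated via E[X] = Z(b')/Z(b) for X = exp(H(W)(b - b')),
    W ~ mu_b, exactly as on the previous slides.
    """
    est = float(len(subsets))                       # Z(0) = |Omega|
    for b, b_next in zip(schedule, schedule[1:]):
        xs = [math.exp(H(gibbs_sample(b)) * (b - b_next)) for _ in range(samples)]
        est *= sum(xs) / samples
    return est

# A hypothetical short schedule; the final large beta stands in for infinity.
print(anneal_count([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 50.0]))   # ~7
```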
Parameters: A and n
Z(β) = Σx exp(−β·H(x)) = Σ_{k=0}^{n} ak·e^{−β·k}, where ak = |H⁻¹(k)|
Z(0) = A, H : Ω → {0,...,n}
Parameters: Z(0) = A, H : Ω → {0,...,n}

problem            | A      | n
independent sets   | 2^|V|  | |E|
matchings          | |V|!   | |V|
perfect matchings  | |V|!   | |V|
k-colorings        | k^|V|  | |E|
Previous cooling schedules
Z(0) = A, H : Ω → {0,...,n}, schedule β0 = 0 < β1 < β2 < ... < βt = ∞.
"Safe steps" (Bezáková, Štefankovič, Vigoda, V. Vazirani '06):
β → β + 1/n
β → β·(1 + 1/ln A)
ln A → ∞
These yield cooling schedules of length O(n·ln A) and O((ln n)(ln A)) (Bezáková, Štefankovič, Vigoda, V. Vazirani '06).
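A sketch of a non-adaptive schedule generator built from these safe steps; the parameter values in the usage line are illustrative:

```python
import math

def safe_schedule(n, A):
    """Non-adaptive cooling schedule from the BSVV'06 safe steps:
    beta -> beta + 1/n, beta -> beta*(1 + 1/ln A), and ln A -> infinity.
    One additive step off zero, then geometric steps up to ln A."""
    lnA = math.log(A)
    schedule = [0.0, 1.0 / n]            # one additive safe step
    beta = 1.0 / n
    while beta < lnA:
        beta *= 1.0 + 1.0 / lnA          # multiplicative safe step
        schedule.append(min(beta, lnA))
    schedule.append(math.inf)            # the jump ln A -> infinity is safe
    return schedule

s = safe_schedule(n=16, A=2 ** 16)       # e.g. independent sets: A = 2^|V|, n = |E|
print(len(s), s[:5])
```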
No better fixed schedule possible
Z(0) = A, H : Ω → {0,...,n}.
A schedule that works for all Za(β) = (A/(1+a))·(1 + a·e^{−β·n}) (with a ∈ [0, A−1]) has length Ω((ln n)(ln A)).
Parameters: Z(0) = A, H : Ω → {0,...,n}
Our main result: can get an adaptive schedule of length O*((ln A)^{1/2}).
Previously: non-adaptive schedules of length Θ*(ln A).
Existential part
Lemma: for every partition function there exists a cooling schedule of length O*((ln A)^{1/2}).
Express the SCV using the partition function (going from β to β'):
W ← μβ, X = exp(H(W)·(β − β')), E[X] = Z(β')/Z(β)
C := E[X²]/E[X]² = Z(2β'−β)·Z(β) / Z(β')²
C = E[X²]/E[X]² = Z(2β'−β)·Z(β) / Z(β')²
Proof idea: let f(β) = ln Z(β). Then
ln C = f(2β'−β) + f(β) − 2·f(β'),
so (ln C)/2 is how far f(β') lies below the chord through (β, f(β)) and (2β'−β, f(2β'−β)).
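A quick numeric check of this identity, using the a_k of the 4-cycle example above (the chosen β, β' are arbitrary):

```python
import math

a = [7, 4, 4, 0, 1]                               # a_k = |H^-1(k)| for the 4-cycle

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

def f(beta):
    return math.log(Z(beta))

beta, beta2 = 0.3, 0.8                            # a step beta -> beta'
lnC = math.log(Z(2 * beta2 - beta) * Z(beta) / Z(beta2) ** 2)
print(lnC, f(2 * beta2 - beta) + f(beta) - 2 * f(beta2))   # the two agree
```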
f(β) = ln Z(β) is decreasing and convex, with f'(0) ≥ −n and f(0) ≤ ln A.
Key point of the proof: in each step of the schedule, either f or ln|f'| changes a lot, and the budgets f(0) ≤ ln A and |f'(0)| ≤ n bound how often that can happen.
f : [a,b] → R, convex, decreasing, can be "approximated" using
( (f(a) − f(b)) · ln( f'(a)/f'(b) ) )^{1/2}
segments.
Technicality: getting to 2β'−β
The SCV bound involves Z(2β'−β), which need not be a schedule point. Proof: interleave the schedule points βi, βi+1, βi+2, βi+3, ... with the required points of the form 2β'−β; this costs only ln ln A extra steps.
Existential → algorithmic
From "there exists an adaptive schedule of length O*((ln A)^{1/2})" to "we can construct an adaptive schedule of length O*((ln A)^{1/2})".
Algorithmic construction
Our main result: using a sampler oracle for μβ(x) = exp(−β·H(x))/Z(β), we can construct a cooling schedule of length ≤ 38·(ln A)^{1/2}·(ln ln A)·(ln n).
Total number of oracle calls: ≤ 10^7·(ln A)·(ln ln A + ln n)^7·ln(1/δ).
Algorithmic construction
Current inverse temperature β. Ideally move to β' such that
B1 ≤ E[X²]/E[X]² ≤ B2, where E[X] = Z(β')/Z(β).
The upper bound B2 keeps X "easy to estimate"; the lower bound guarantees we make progress (assuming B1 > 1).
We need to construct a "feeler" for
E[X²]/E[X]² = [Z(β)/Z(β')] · [Z(2β'−β)/Z(β')].
(X itself would be a bad "feeler".)
Rough estimator for Z(β')/Z(β)
Z(β) = Σ_{k=0}^{n} ak·e^{−β·k}.
For W ← μβ we have P(H(W)=k) = ak·e^{−β·k} / Z(β).
For U ← μβ' we have P(H(U)=k) = ak·e^{−β'·k} / Z(β').
If H(·)=k is likely at both β and β', this yields a rough estimator:
P(H(U)=k) / P(H(W)=k) = e^{k(β−β')} · Z(β)/Z(β').
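A sketch of this rough estimator on the 4-cycle toy instance; choosing k = 0, which is likely at both temperatures here, is an illustrative assumption:

```python
import itertools, math, random

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
subsets = [set(s) for r in range(5) for s in itertools.combinations(vertices, r)]

def H(x):
    return sum(u in x and v in x for u, v in edges)

def Z(beta):
    return sum(math.exp(-beta * H(x)) for x in subsets)

def sample(beta):
    w = [math.exp(-beta * H(x)) for x in subsets]
    return random.choices(subsets, weights=w)[0]

def rough_ratio(beta, beta2, k, samples=20000):
    """Rough estimator of Z(beta')/Z(beta), from the identity
    P(H(U)=k)/P(H(W)=k) = e^{k(beta-beta')} * Z(beta)/Z(beta')."""
    p_w = sum(H(sample(beta)) == k for _ in range(samples)) / samples
    p_u = sum(H(sample(beta2)) == k for _ in range(samples)) / samples
    return math.exp(k * (beta - beta2)) * p_w / p_u

# Compare the rough estimate against the exact ratio.
print(rough_ratio(0.2, 0.6, k=0), Z(0.6) / Z(0.2))
```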
Rough estimator for Z(β')/Z(β), interval version
For W ← μβ we have P(H(W) ∈ [c,d]) = Σ_{k=c}^{d} ak·e^{−β·k} / Z(β).
If |β−β'|·|d−c| ≤ 1, then
1/e ≤ [ e^{c(β−β')} · P(H(W) ∈ [c,d]) / P(H(U) ∈ [c,d]) ] · Z(β)/Z(β') ≤ e,
i.e., e^{c(β−β')}·P(H(W) ∈ [c,d]) / P(H(U) ∈ [c,d]) estimates Z(β')/Z(β) within a factor of e.
We also need P(H(U) ∈ [c,d]) and P(H(W) ∈ [c,d]) to be large.
Split {0,1,...,n} into h ≤ 4·(ln n)·(ln A) intervals
[0], [1], [2], ..., [c, c(1+1/ln A)], ...
For any inverse temperature β there exists an interval I with P(H(W) ∈ I) ≥ 1/(8h).
We say that I is HEAVY for β.
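A sketch of the interval decomposition and the heavy-interval check; the histogram in the usage lines is hypothetical:

```python
import math

def intervals(n, A):
    """Split {0,...,n} into h <= 4 (ln n)(ln A) intervals:
    singletons [0],[1],[2],... then geometric blocks [c, c(1+1/ln A)]."""
    lnA = math.log(A)
    ivals, c = [], 0
    while c <= n:
        d = max(c, int(c * (1.0 + 1.0 / lnA)))     # singleton while c is small
        ivals.append((c, min(d, n)))
        c = min(d, n) + 1
    return ivals

def heavy_interval(hist_counts, ivals):
    """Given a histogram of H-values of samples from mu_beta, return an
    interval I with empirical P(H(W) in I) >= 1/(8h); one must exist."""
    total = sum(hist_counts)
    h = len(ivals)
    for (c, d) in ivals:
        if sum(hist_counts[c:d + 1]) / total >= 1.0 / (8 * h):
            return (c, d)

ivals = intervals(n=100, A=2 ** 16)
hist = [70, 20, 8, 0, 2] + [0] * 96                # hypothetical H-histogram
print(heavy_interval(hist, ivals))                 # -> (0, 0) here
```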