
Adaptive annealing: a near-optimal connection between sampling and counting






Presentation Transcript


  1. Adaptive annealing: a near-optimal connection between sampling and counting Daniel Štefankovič (University of Rochester) Santosh Vempala Eric Vigoda (Georgia Tech)

  2. Counting: independent sets, spanning trees, matchings, perfect matchings, k-colorings

  3. (approx) counting ⇐ sampling. Valleau, Card'72 (physical chemistry), Babai'79 (for matchings and colorings), Jerrum, Valiant, V.Vazirani'86. The outcome of the JVV reduction: random variables X1, X2, ..., Xt such that 1) E[X1 X2 ... Xt] = "WANTED" and 2) the Xi are easy to estimate: the squared coefficient of variation (SCV) V[Xi]/E[Xi]² = O(1).

  4. (approx) counting ⇐ sampling. 1) E[X1 X2 ... Xt] = "WANTED", 2) the Xi are easy to estimate: V[Xi]/E[Xi]² = O(1). Theorem (Dyer, Frieze'91): O(t²/ε²) samples (O(t/ε²) from each Xi) give a (1 ± ε)-estimator of "WANTED" with probability ≥ 3/4.
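The Dyer-Frieze step can be sketched in Python. A minimal illustrative sketch, assuming each Xi can be drawn i.i.d. from a zero-argument sampler function; `product_estimator` and the toy Bernoulli samplers are invented names, not from the talk:

```python
import random

def product_estimator(samplers, eps):
    """Estimate E[X1]*...*E[Xt]: average O(t/eps^2) draws of each Xi
    and multiply the sample means (Dyer-Frieze style)."""
    t = len(samplers)
    m = max(1, int(t / eps**2))          # samples per variable
    est = 1.0
    for draw in samplers:
        est *= sum(draw() for _ in range(m)) / m
    return est

random.seed(1)
# toy example: two Bernoulli variables with means 0.5 and 0.25,
# so the true product is 0.125
samplers = [lambda: random.random() < 0.5,
            lambda: random.random() < 0.25]
approx = product_estimator(samplers, eps=0.1)
```

The total sample count is t · O(t/ε²) = O(t²/ε²), matching the theorem's bound.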

  5. JVV for independent sets. GOAL: given a graph G, estimate the number of independent sets of G. 1/#independent sets = P(a uniformly random independent set is the empty set).

  6. P(AB)=P(A)P(B|A) JVV for independent sets ? ? ? ? ? ? P() = P() P() P( ) P( ) X1 X2 X3 X4 V[Xi] Xi [0,1] and E[Xi] ½  = O(1) E[Xi]2

  7. JVV: If we have a sampler oracle (graph G → random independent set of G), then FPRAS using O(n²) samples.

  8. JVV: If we have a sampler oracle (graph G → random independent set of G), then FPRAS using O(n²) samples. ŠVV: If we have a sampler oracle (graph G, β → set from gas-model Gibbs distribution at β), then FPRAS using O*(n) samples.

  9. Application – independent sets. O*(|V|) samples suffice for counting. Cost per sample (Vigoda'01, Dyer-Greenhill'01): time = O*(|V|) for graphs of degree ≤ 4. Total running time: O*(|V|²).

  10. Other applications (total running time): matchings O*(n²m) (using Jerrum, Sinclair'89); spin systems, e.g. Ising model, O*(n²) for β < βC (using Marinelli, Olivieri'95); k-colorings O*(n²) for k > 2Δ (using Jerrum'95).

  11. easy = hot hard = cold

  12. Hamiltonian [figure: example configurations with H values 4, 2, 1, 0 omitted]

  13. Big set = Ω. Hamiltonian H: Ω → {0,...,n}. Goal: estimate |H⁻¹(0)|. |H⁻¹(0)| = E[X1] ⋅ ... ⋅ E[Xt]

  14. Distributions between hot and cold. β = inverse temperature. β = 0: hot, uniform on Ω. β = ∞: cold, uniform on H⁻¹(0). μβ(x) ∝ exp(−βH(x)) (Gibbs distributions).

  15. Distributions between hot and cold. μβ(x) = exp(−βH(x)) / Z(β). Normalizing factor = partition function: Z(β) = Σ_{x∈Ω} exp(−βH(x)).

  16. Partition function. Z(β) = Σ_{x∈Ω} exp(−βH(x)). Have: Z(0) = |Ω|. Want: Z(∞) = |H⁻¹(0)|.
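The two endpoints can be checked numerically. A minimal sketch with a hypothetical level-count vector `a` (a_k = |H⁻¹(k)|), not taken from the talk:

```python
import math

# hypothetical level sizes a_k = |H^{-1}(k)|, so |Omega| = sum(a) = 12
a = [2, 5, 4, 1]

def Z(beta):
    """Partition function Z(beta) = sum_k a_k * exp(-beta * k)."""
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

hot  = Z(0.0)    # equals |Omega| = 12
cold = Z(50.0)   # approaches a_0 = |H^{-1}(0)| = 2 as beta grows
```

At β = 0 every configuration contributes weight 1; as β → ∞ only the H = 0 level survives.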

  17. Assumption: we have a sampler oracle for μβ(x) = exp(−βH(x)) / Z(β). SAMPLER ORACLE: graph G, β → subset of V drawn from μβ.

  18. Assumption: we have a sampler oracle for μβ(x) = exp(−βH(x)) / Z(β). Draw W ← μβ.

  19. Assumption: we have a sampler oracle for μβ(x) = exp(−βH(x)) / Z(β). Draw W ← μβ and set X = exp(H(W)(β − α)).

  20. Assumption: we have a sampler oracle for μβ(x) = exp(−βH(x)) / Z(β). Draw W ← μβ and set X = exp(H(W)(β − α)). We can obtain the following ratio: E[X] = Σ_s μβ(s) X(s) = Z(α) / Z(β).
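The unbiasedness claim E[X] = Z(α)/Z(β) can be verified exactly by summing over levels instead of sampling. A minimal sketch with a hypothetical level-count vector `a`; `E_X` is an invented helper name:

```python
import math

a = [2, 5, 4, 1]          # hypothetical level sizes a_k = |H^{-1}(k)|

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

def E_X(beta, alpha):
    """Exact E[exp(H(W)(beta - alpha))] for W drawn from mu_beta,
    computed by summing over the levels k of H."""
    return sum(ak * math.exp(-beta * k)        # Gibbs weight of level k
               * math.exp(k * (beta - alpha))  # value of X on that level
               for k, ak in enumerate(a)) / Z(beta)

beta, alpha = 0.7, 1.3
# identity to check: E_X(beta, alpha) == Z(alpha) / Z(beta)
```

The β-dependence cancels term by term, leaving the α-weights, which is exactly the slide's computation.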

  21. Our goal restated. Partition function Z(β) = Σ_{x∈Ω} exp(−βH(x)). Goal: estimate Z(∞) = |H⁻¹(0)|. Z(∞) = Z(0) ⋅ Z(β1)/Z(β0) ⋅ Z(β2)/Z(β1) ⋅ ... ⋅ Z(βt)/Z(βt−1), where 0 = β0 < β1 < β2 < ... < βt = ∞.

  22. Our goal restated. Z(∞) = Z(0) ⋅ Z(β1)/Z(β0) ⋅ ... ⋅ Z(βt)/Z(βt−1). Cooling schedule: 0 = β0 < β1 < β2 < ... < βt = ∞. How to choose the cooling schedule? Minimize its length while satisfying V[Xi]/E[Xi]² = O(1), where E[Xi] = Z(βi)/Z(βi−1).
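The telescoping product can be demonstrated deterministically, using exact ratios in place of the Monte Carlo estimates. A minimal sketch; the level counts `a` and the fixed schedule are illustrative, with the last temperature large enough that Z there is essentially a_0:

```python
import math

a = [2, 5, 4, 1]          # hypothetical level sizes; a_0 = |H^{-1}(0)| = 2

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

# fixed schedule 0 = b0 < b1 < ... < bt, ending at a large beta
schedule = [0.0, 0.5, 1.0, 2.0, 4.0, 40.0]

prod = Z(0.0)                       # start from the known Z(0) = |Omega|
for b_prev, b_next in zip(schedule, schedule[1:]):
    prod *= Z(b_next) / Z(b_prev)   # each ratio is E[X_i] for one step
# prod telescopes to Z(40.0), which is a_0 up to e^{-40} terms
```

In the algorithm each ratio is replaced by a sample average, so the SCV condition controls how the errors multiply.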

  23. Parameters: A and n. Z(β) = Σ_{x∈Ω} exp(−βH(x)) = Σ_{k=0}^n a_k e^{−βk}, where a_k = |H⁻¹(k)|; H: Ω → {0,...,n}; Z(0) = A.

  24. Parameters. Z(0) = A, H: Ω → {0,...,n}.
  independent sets: A = 2^|V|, n = |E|
  matchings: A = |V|!, n = |V|
  perfect matchings: A = |V|!, n = |V|
  k-colorings: A = k^|V|, n = |E|

  25. Previous cooling schedules. Z(0) = A, H: Ω → {0,...,n}. 0 = β0 < β1 < ... < βt = ∞. "Safe steps": β → β + 1/n; β → β(1 + 1/ln A); ln A → ∞ (Bezáková, Štefankovič, Vigoda, V.Vazirani'06). Cooling schedules of length O(n ln A) and O((ln n)(ln A)) (Bezáková, Štefankovič, Vigoda, V.Vazirani'06).

  26. No better fixed schedule possible. Z(0) = A, H: Ω → {0,...,n}. A schedule that works for all of the partition functions Z_a(β) = (A/(1+a)) (1 + a e^{−βn}) (with a ∈ [0, A−1]) has LENGTH Ω((ln n)(ln A)).

  27. Parameters. Z(0) = A, H: Ω → {0,...,n}. Our main result: can get an adaptive schedule of length O*((ln A)^{1/2}). Previously: non-adaptive schedules of length Θ*(ln A).

  28. Existential part. Lemma: for every partition function there exists a cooling schedule of length O*((ln A)^{1/2}).

  29. Express SCV using the partition function (going from β to α). W ← μβ, X = exp(H(W)(β − α)). E[X] = Z(α)/Z(β) and E[X²]/E[X]² = Z(2α−β) Z(β) / Z(α)²; we want this ≤ C.

  30. E[X²]/E[X]² = Z(2α−β) Z(β) / Z(α)² ≤ C. With f(β) = ln Z(β) the condition reads f(2α−β) + f(β) − 2f(α) ≤ ln C. Proof: [figure: f evaluated at β ≤ α ≤ 2α−β omitted] C' = (ln C)/2.
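The closed form for the second moment can be verified exactly, since E[X²] is just E[X] with the exponent doubled. A minimal sketch with the same hypothetical level counts; `scv_plus_one` is an invented name for the ratio E[X²]/E[X]²:

```python
import math

a = [2, 5, 4, 1]          # hypothetical level sizes a_k

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

def scv_plus_one(beta, alpha):
    """Exact E[X^2]/E[X]^2 for X = exp(H(W)(beta - alpha)), W ~ mu_beta."""
    EX  = sum(ak * math.exp(-beta * k) * math.exp(k * (beta - alpha))
              for k, ak in enumerate(a)) / Z(beta)
    EX2 = sum(ak * math.exp(-beta * k) * math.exp(2 * k * (beta - alpha))
              for k, ak in enumerate(a)) / Z(beta)
    return EX2 / EX**2

beta, alpha = 0.4, 1.0
# identity to check: E[X^2]/E[X]^2 == Z(2*alpha - beta) * Z(beta) / Z(alpha)**2
```

Indeed E[X²] = Z(2α−β)/Z(β) by the same cancellation as for E[X], which gives the slide's formula.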

  31. f(β) = ln Z(β): f is decreasing, f is convex, f'(0) ≥ −n, f(0) ≤ ln A. Key point: along the schedule, either f or ln |f'| changes a lot in each step. Proof: [figure and bound omitted]

  32. f: [a,b] → R convex and decreasing can be "approximated" using √( (f(a) − f(b)) ⋅ ln(f'(a)/f'(b)) ) segments.

  33. Technicality: getting to 2α−β. Proof: [figure: β, α, 2α−β omitted]

  34. Technicality: getting to 2α−β. Proof: [figure: βi, β, α, 2α−β, βi+1 omitted]

  35. Technicality: getting to 2α−β. Proof: [figure: βi, β, α, 2α−β, βi+1, βi+2 omitted]

  36. Technicality: getting to 2α−β. ln ln A extra steps. Proof: [figure: βi, β, α, 2α−β, βi+1, βi+2, βi+3 omitted]

  37. Existential ⇒ Algorithmic. There exists a cooling schedule of length O*((ln A)^{1/2}) ⇒ we can construct an adaptive schedule of length O*((ln A)^{1/2}).

  38. Algorithmic construction. μβ(x) = exp(−βH(x)) / Z(β). Our main result: using a sampler oracle for μβ we can construct a cooling schedule of length ≤ 38 (ln A)^{1/2} (ln ln A)(ln n). Total number of oracle calls: ≤ 10⁷ (ln A) (ln ln A + ln n)⁷ ln(1/δ).

  39. Algorithmic construction. Current inverse temperature β; ideally move to α such that B1 ≤ E[X²]/E[X]² ≤ B2, where E[X] = Z(α)/Z(β).

  40. Algorithmic construction. Ideally move to α such that B1 ≤ E[X²]/E[X]² ≤ B2: the upper bound keeps X "easy to estimate".

  41. Algorithmic construction. Ideally move to α such that B1 ≤ E[X²]/E[X]² ≤ B2: the lower bound means we make progress (assuming B1 > 1).

  42. Algorithmic construction. Ideally move to α such that B1 ≤ E[X²]/E[X]² ≤ B2; we need to construct a "feeler" for this quantity.

  43. Algorithmic construction. Ideally move to α such that B1 ≤ E[X²]/E[X]² = Z(2α−β) Z(β) / Z(α)² ≤ B2; we need to construct a "feeler" for this quantity.

  44. Algorithmic construction. Ideally move to α such that B1 ≤ E[X²]/E[X]² = Z(2α−β) Z(β) / Z(α)² ≤ B2; we need to construct a "feeler" for this quantity (a naive estimator is a bad "feeler").
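An idealized version of this adaptive step can be sketched in Python. This is only a toy sketch, not the paper's algorithm: it uses the exact partition function of a hypothetical level-count vector `a` as a perfect "feeler", enforces only the upper bound B2, and caps the temperature at an arbitrary 50; `next_beta` is an invented name. Binary search is valid because ln Z is convex, so the feeler is monotone in α:

```python
import math

a = [2, 5, 4, 1]          # hypothetical level sizes; a_0 = 2 ground states

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

def next_beta(beta, B2=3.0, hi=50.0, iters=60):
    """Binary-search the largest alpha <= hi whose idealized feeler
    Z(2*alpha - beta) * Z(beta) / Z(alpha)**2  (exact E[X^2]/E[X]^2)
    stays below B2; monotone in alpha since ln Z is convex."""
    lo = beta
    for _ in range(iters):
        mid = (lo + hi) / 2
        if Z(2 * mid - beta) * Z(beta) / Z(mid) ** 2 <= B2:
            lo = mid
        else:
            hi = mid
    return lo

schedule = [0.0]
while Z(schedule[-1]) > 1.01 * a[0]:     # stop once Z is essentially a_0
    schedule.append(next_beta(schedule[-1]))
```

The real algorithm must estimate the feeler from samples, which is where the heavy-interval machinery of the later slides comes in.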

  45. Rough estimator for Z(α)/Z(β). Z(β) = Σ_{k=0}^n a_k e^{−βk}. For W ← μβ we have P(H(W)=k) = a_k e^{−βk} / Z(β).

  46. Rough estimator for Z(α)/Z(β). If H(X)=k is likely at both β and α, we get a rough estimator. For W ← μβ: P(H(W)=k) = a_k e^{−βk} / Z(β). For U ← μα: P(H(U)=k) = a_k e^{−αk} / Z(α).

  47. Rough estimator for Z(α)/Z(β). For W ← μβ: P(H(W)=k) = a_k e^{−βk} / Z(β). For U ← μα: P(H(U)=k) = a_k e^{−αk} / Z(α). Hence P(H(U)=k) / P(H(W)=k) = e^{k(β−α)} Z(β)/Z(α).
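The cancellation behind this estimator can be checked exactly: the unknown a_k drops out of the ratio. A minimal sketch with the same hypothetical level counts; `P` is an invented helper for the level distribution:

```python
import math

a = [2, 5, 4, 1]          # hypothetical level sizes a_k

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

def P(k, beta):
    """P(H(W) = k) for W drawn from the Gibbs distribution mu_beta."""
    return a[k] * math.exp(-beta * k) / Z(beta)

beta, alpha, k = 0.5, 1.2, 2
lhs = P(k, alpha) / P(k, beta)
rhs = math.exp(k * (beta - alpha)) * Z(beta) / Z(alpha)
# lhs == rhs, so comparing the observed frequency of level k at the two
# temperatures recovers Z(alpha)/Z(beta) up to the known factor e^{k(beta-alpha)}
```

In practice the two probabilities are estimated from oracle samples, so k should be likely at both temperatures.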

  48. Rough estimator for Z(α)/Z(β). Z(β) = Σ_{k=0}^n a_k e^{−βk}. For W ← μβ we have P(H(W)∈[c,d]) = Σ_{k=c}^d a_k e^{−βk} / Z(β).

  49. Rough estimator for Z(α)/Z(β). If |β−α| ⋅ |d−c| ≤ 1, then e^{c(β−α)} P(H(W)∈[c,d]) / P(H(U)∈[c,d]) estimates Z(α)/Z(β) within a factor of e. We also need P(H(U)∈[c,d]) and P(H(W)∈[c,d]) to be large.
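The factor-e guarantee can be checked numerically at a point where |β−α|·|d−c| = 1. A minimal sketch with the same hypothetical level counts; `P_interval` is an invented helper name:

```python
import math

a = [2, 5, 4, 1]          # hypothetical level sizes a_k

def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))

def P_interval(c, d, beta):
    """P(H(W) in [c,d]) for W drawn from mu_beta."""
    return sum(a[k] * math.exp(-beta * k) for k in range(c, d + 1)) / Z(beta)

beta, alpha, c, d = 0.5, 1.0, 1, 3        # |beta - alpha| * |d - c| = 1
est = (math.exp(c * (beta - alpha))
       * P_interval(c, d, beta) / P_interval(c, d, alpha))
true = Z(alpha) / Z(beta)
ratio = est / true        # should lie within [1/e, e]
```

Within the interval, e^{(β−α)(k−c)} varies by at most a factor e when |β−α|·|d−c| ≤ 1, which is exactly where the bound comes from.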

  50. Split {0,1,...,n} into h ≤ 4 (ln n)(ln A) intervals [0], [1], [2], ..., [c, c(1+1/ln A)], ... For any inverse temperature β there exists an interval I with P(H(W)∈I) ≥ 1/(8h). We say that I is HEAVY for β.
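The interval partition on this slide can be constructed explicitly. A minimal sketch; the values n = 1000 and A = 2^1000 are illustrative, and `intervals` is an invented name:

```python
import math

def intervals(n, A):
    """Partition {0,...,n} into singletons [0],[1],... and then
    geometric blocks [c, c(1 + 1/ln A)] once c/ln A exceeds 1."""
    out, c = [], 0
    while c <= n:
        d = max(c, math.floor(c * (1 + 1 / math.log(A))))
        out.append((c, min(d, n)))
        c = out[-1][1] + 1
    return out

ivs = intervals(n=1000, A=2.0 ** 1000)    # ln A ~ 693
h = len(ivs)
# h = O((ln n)(ln A)); by averaging, for any temperature some interval I
# must carry probability P(H(W) in I) >= 1/(8h) up to the slide's constants
```

The intervals are short enough for the factor-e estimator of the previous slide, yet few enough that one of them is always heavy.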
