Simulated annealing for convex optimization
Adam Tauman Kalai, TTI-Chicago
Santosh Vempala, MIT
Three points of this talk
• Design an efficient algorithm for a convex optimization problem
  • We get the current best (worst-case) bounds
• Analysis of simulated annealing showing provable efficiency
  • Better understand simulated annealing
  • Simulated annealing is also a type of interior-point algorithm
• Rapid convergence to a local/global min (we say nothing about local vs. global min)
Outline
• The optimization problem
• Previous approaches
• Simulated annealing
• Results
  • Simulated annealing works fast
  • Geometric "cooling schedule" is optimal
  • Issues with shape/covariance
The optimization problem
• Linear optimization (f(x) = c·x) over a convex set K
• x* = argmin_{x ∈ K} c·x
• Inputs:
  • n = number of dimensions (large)
  • unit vector c ∈ ℝ^n
  • accuracy ε > 0
  • convex set K ⊂ ℝ^n
    • membership oracle K(x) = 1 if x ∈ K, 0 otherwise
    • starting point x₀ ∈ K
    • K contains a radius-r ball and is contained in a radius-R ball
• Goal: output x where c·x ≤ c·x* + ε
[Figure: K with inner radius-r and outer radius-R balls, starting point x₀, objective direction c, optimum x*]
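A toy instance of this interface, as a minimal Python sketch; the particular set K, the dimension, and the oracle below are illustrative assumptions, not from the talk:

```python
import numpy as np

n = 20                         # number of dimensions (toy value)
c = np.zeros(n); c[0] = 1.0    # unit objective vector

def membership(x):
    """Oracle K(x): 1 if x is in K, 0 otherwise.  Here K is the unit
    ball cut by the halfspace {x : x[0] >= -0.5} (an assumed example)."""
    return 1 if (np.linalg.norm(x) <= 1.0 and x[0] >= -0.5) else 0

x0 = np.zeros(n)               # starting point, inside K
eps = 1e-3                     # target accuracy: want c.x <= c.x* + eps
# For this K, r = 0.5 (inscribed ball) and R = 1.0 (circumscribed ball).
```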
Previous approaches
• min_{x ∈ K} c·x, c ∈ ℝ^n, convex K ⊂ ℝ^n
• Ellipsoid method can solve this problem in O*(n^10) membership queries
• O*(nS): Bertsimas-Vempala stochastic search
  • Uses a "uniform sample from convex set" subroutine: given a "good" starting point, a random walk finds an almost uniformly random point in K in S = O*(n^4) steps
  • Repeatedly cuts off sections of K
• We get O*(n^{1/2} S)
(* hides logarithmic factors: O*(n^10) = O(n^10 log^c(nR/rε)))
[Figure: K with sample points x₁, x₂, x₃ and objective direction c]
O*(nS) algorithm [BV03]
• Elegant analysis
• Requires Ω(n) phases in the worst case
  • In an n-dimensional cone, most of the mass is within a 1/n fraction of the top ⇒ ≈ n phases to cut the height in half
[Figure: n-dim. cone with objective direction c]
Simulated annealing
• Goal: minimize f(x) over a set K (discrete or continuous)
• Approach: decreasing temperature T
  • Phase i: temp Tᵢ = αTᵢ₋₁, T₀ large; a "geometric" cooling schedule (α < 1)
• Biased random walk
  • During phase i, the stationary distribution is dᵢ(x) ∝ exp(−f(x)/Tᵢ)
[Figure: energy landscape, from T = 1 (completely random) down to T = 0 (global minimum x*), with a move from x to x′]
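For reference, a minimal Metropolis-style sketch of this generic scheme in Python; the Gaussian proposal, step size, and loop counts are assumptions of the sketch, not the talk's walk (which is hit-and-run, next slide):

```python
import numpy as np

def simulated_annealing(f, inside, x, T0, alpha, phases, steps, step, rng):
    """Generic Metropolis-style annealing sketch.  Within phase i the
    walk's stationary density on K is proportional to exp(-f(x)/T_i)."""
    T = T0
    for _ in range(phases):
        for _ in range(steps):
            y = x + step * rng.standard_normal(x.shape)
            # Metropolis filter: accept with prob min(1, e^{-(f(y)-f(x))/T}),
            # rejecting any proposal that leaves K
            if inside(y) and rng.random() < np.exp(min(0.0, (f(x) - f(y)) / T)):
                x = y
        T *= alpha  # geometric cooling: T_i = alpha * T_{i-1}
    return x
```

With the toy instance above one could call, e.g., simulated_annealing(lambda z: c @ z, membership, x0, T0=1.0, alpha=0.95, phases=200, steps=100, step=0.05, rng=np.random.default_rng(0)); all of those parameter values are arbitrary choices for illustration.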
Simulated annealing alg. for our problem
• T₀ = R (radius of the containing ball)
• At temperature Tᵢ, sample from density dᵢ(x) ∝ exp(−(c·x)/Tᵢ)
  • Repeat "hit-and-run" random walk S times:
    • At x, pick a random line L passing through x
    • Pick a random x′ on K ∩ L with prob. ∝ exp(−(c·x′)/Tᵢ)
• Tᵢ₊₁ = (1 − n^{-1/2})Tᵢ, so the temperature is cut in half every ≈ n^{1/2} phases
• Stop at T_final = ε/n
[Figure: K with line L through x and new point x′]
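A Python sketch of a single hit-and-run step with the exponential chord density; the bisection chord finder, its tolerances, and t_max are assumptions made to keep the sketch self-contained:

```python
import numpy as np

def chord(x, u, membership, t_max=10.0, iters=40):
    """Approximate endpoints [t_lo, t_hi] of K's chord through x along u,
    by bisection on the 0/1 membership oracle (assumes the chord fits
    inside [-t_max, t_max])."""
    ends = []
    for sign in (+1.0, -1.0):
        lo, hi = 0.0, t_max          # invariant: x + sign*lo*u stays in K
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if membership(x + sign * mid * u):
                lo = mid
            else:
                hi = mid
        ends.append(sign * lo)
    return min(ends), max(ends)

def hit_and_run_step(x, c, T, membership, rng):
    """One hit-and-run step targeting the density d(x') ~ exp(-(c.x')/T)."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)                    # random line L through x
    t_lo, t_hi = chord(x, u, membership)
    a = np.dot(c, u) / T                      # along L the density is ~ e^{-a t}
    if abs(a) < 1e-12:
        t = rng.uniform(t_lo, t_hi)           # flat chord: uniform sample
    else:
        # inverse-CDF sample of the exponential truncated to [t_lo, t_hi]
        w = rng.random()
        t = -np.log((1 - w) * np.exp(-a * t_lo) + w * np.exp(-a * t_hi)) / a
    return x + t * u
```

The outer loop of the slide then repeats this step S times per phase, sets T ← (1 − n^{-1/2})T, and stops once T ≤ ε/n.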
Analysis
• Sampling at temperature T_final = ε/n brings you within ε of opt = c·x*
• With a "good" starting point, after S = O*(n^4) steps, hit-and-run is located in K according to density dᵢ(x) ∝ exp(−(c·x)/Tᵢ) (true for any log-concave density) [LV03]
• "Good" start is a technical condition
  • dᵢ(x) and dᵢ₋₁(x) must be close
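A quick back-of-the-envelope check (my arithmetic, consistent with the slide's parameters) of how many phases the schedule needs:

```latex
% Starting at T_0 = R with T_{i+1} = (1 - n^{-1/2}) T_i and stopping at
% T_{final} = \varepsilon / n, the number of phases N satisfies
\[
  \bigl(1 - n^{-1/2}\bigr)^{N} R = \frac{\varepsilon}{n}
  \quad\Longrightarrow\quad
  N \approx \sqrt{n}\,\ln\frac{nR}{\varepsilon},
\]
% i.e. O^*(\sqrt{n}) phases of S = O^*(n^4) walk steps each,
% matching the O^*(n^{1/2} S) bound claimed earlier.
```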
Uniform distribution over a truncated cone has a small std. dev.; dᵢ(x) ∝ exp(−(c·x)/T) has a much larger std. dev. (a factor of n^{1/2} larger)
[Figure: cone truncated at levels i-1 and i along direction c, comparing the two distributions]
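A worked example on a cone makes the n^{1/2} gap concrete (my derivation, under the assumption that the relevant marginal is along the objective direction t = c·x):

```latex
% Uniform over a cone truncated at height h: the marginal density of t is
% proportional to t^{n-1} on [0, h], so the mass concentrates near h with
\[
  \mathbb{E}[t] = \frac{n}{n+1}\,h, \qquad
  \sigma_{\text{uniform}} \approx \frac{h}{n}.
\]
% Boltzmann density on the cone: marginal proportional to t^{n-1} e^{-t/T},
% i.e. a Gamma(n, T) distribution, with
\[
  \mathbb{E}[t] = nT, \qquad \sigma_{\text{Boltzmann}} = \sqrt{n}\,T.
\]
% Relative spread n^{-1/2} instead of n^{-1}: this factor-sqrt(n) gap is
% what lets consecutive temperatures differ by (1 - n^{-1/2}) while the
% densities d_i and d_{i-1} stay close.
```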
Optimal distributions and schedule
• Cannot do better than n^{1/2} phases
• Assumptions:
  • Using a sequence of probability densities dᵢ(x)
  • dᵢ(x) is log-concave, i.e., log(dᵢ(x)) is concave
  • Variation distance |dᵢ − dᵢ₋₁| ≤ 1 − 1/poly(n)
• Boltzmann distributions with a geometric cooling schedule are worst-case optimal for this class of stochastic search strategies
Shape estimation and covariance (I lied)
• To do the random walk, it's important to have an estimate of the shape of the object
• For "isotropic" shapes, can just step in a random direction
• For non-isotropic shapes:
  • Maintain a sample of n points at all times
  • Use the covariance matrix of the current sample to bias direction selection (see the sketch below)
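A minimal sketch of the covariance-biased direction choice, assuming a maintained point sample; the regularizer reg is an assumption of this sketch, added for numerical stability:

```python
import numpy as np

def biased_direction(sample_points, rng, reg=1e-8):
    """Draw a random direction shaped by the empirical covariance of the
    maintained sample, so steps stretch along the set's long axes."""
    X = np.asarray(sample_points)                 # shape (m, n), m points kept
    cov = np.cov(X, rowvar=False) + reg * np.eye(X.shape[1])
    L = np.linalg.cholesky(cov)                   # cov = L @ L.T
    u = L @ rng.standard_normal(X.shape[1])       # sample ~ N(0, cov)
    return u / np.linalg.norm(u)
```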
Conclusions
• In addition to possibly helping avoid local optima, S.A. converges rapidly to a local opt
• Simulated annealing ≈ interior-point method
• Justification for Boltzmann distributions with a geometric cooling schedule
• Future work: same analysis for convex functions
• Future work: understand how simulated annealing helps avoid local minima…
• Reverse annealing used for volume estimation [LV04]