Blind online optimization: Gradient descent without a gradient
Abie Flaxman (CMU), Adam Tauman Kalai (TTI), Brendan McMahan (CMU)
Standard convex optimization
• Convex feasible set S ⊂ ℝ^d
• Concave function f : S → ℝ
• Goal: find x with f(x) ≥ max_{z ∈ S} f(z) − ε = f(x*) − ε
[Figure: feasible set S in ℝ^d with maximizer x*]
Steepest ascent
• Move in the direction of steepest ascent
• Compute f′(x) (∇f(x) in higher dimensions)
• Works for convex optimization (and many other problems)
[Figure: iterates x1, x2, x3, x4 climbing the curve]
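As a concrete reference point, here is a minimal sketch of this full-gradient baseline. The concave quadratic objective, the step size eta, and the unit-ball feasible set are illustrative assumptions, not part of the talk.

```python
import numpy as np

def gradient_ascent(grad, project, x0, eta=0.1, steps=200):
    """Steepest ascent: step along the gradient, then project back onto S."""
    x = x0
    for _ in range(steps):
        x = project(x + eta * grad(x))
    return x

# Illustrative example: maximize f(x) = -||x - c||^2 over the unit ball.
c = np.array([2.0, 0.5])
grad = lambda x: -2.0 * (x - c)                      # exact gradient of f
project = lambda x: x / max(1.0, np.linalg.norm(x))  # Euclidean projection onto the ball
print(gradient_ascent(grad, project, np.zeros(2)))   # converges to c/||c||
```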
Typical application
• Company produces certain numbers of cars per month
• Vector x ∈ ℝ^d (#Corollas, #Camrys, …)
• Profit of the company is a concave function of the production vector
• Maximize total (equivalently, average) profit
PROBLEMS
Problem definition and results
• Sequence of unknown concave functions f_1, f_2, … : S → ℝ, with S ⊂ ℝ^d convex
• Period t: pick x_t ∈ S, find out only f_t(x_t)
Theorem: expected regret is O(T^{5/6}) (polynomial in d and the diameter of S)
Online model: expected regret
E[regret] = max_{x ∈ S} Σ_t f_t(x) − E[Σ_t f_t(x_t)]
• Holds for arbitrary sequences
• Stronger than the stochastic model:
  • f_1, f_2, … i.i.d. from D
  • x* = arg max_{x ∈ S} E_D[f(x)]
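To pin down the quantity being bounded, here is a tiny hypothetical helper computing regret against a given comparator x*:

```python
def regret(fs, xs, x_star):
    """sum_t f_t(x*) - sum_t f_t(x_t): shortfall of the online plays xs
    against the best fixed point x* in hindsight."""
    return sum(f(x_star) for f in fs) - sum(f(x) for f, x in zip(fs, xs))
```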
Outline • Problem definition • Simple algorithm • Analysis sketch • Variations • Related work & applications
First try
Zinkevich ’03: if we could only compute gradients… online gradient ascent, x_{t+1} = P_S(x_t + η ∇f_t(x_t)), as sketched below.
[Figure: profit curves f_1…f_4 vs. #Camrys, with plays x_1…x_4 and optimum x*]
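A sketch of what Zinkevich-style online gradient ascent would look like if the gradient of each f_t were available; the projection helper, step size, and oracle interface are assumptions for illustration.

```python
def online_gradient_ascent(grad_oracles, project, x0, eta):
    """Zinkevich '03 style: play x_t, then step along grad f_t(x_t)
    and project back onto the feasible set S."""
    x, plays = x0, []
    for grad_t in grad_oracles:   # one gradient oracle per period t
        plays.append(x)
        x = project(x + eta * grad_t(x))
    return plays
```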
Idea: one-point gradient estimate
• With probability ½, estimate = f(x + δ)/δ
• With probability ½, estimate = −f(x − δ)/δ
• E[estimate] = (f(x+δ) − f(x−δ)) / (2δ) ≈ f′(x)
[Figure: profit vs. #Camrys, showing x−δ, x, x+δ]
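A quick numerical check of the one-point idea in one dimension; the concave test function and δ = 0.1 are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_point_estimate(f, x, delta):
    """One evaluation of f per call; the random sign flip makes the mean
    equal to the centered difference (f(x+delta) - f(x-delta)) / (2*delta)."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta

f = lambda x: -(x - 1.0) ** 2    # concave; f'(0) = 2
samples = [one_point_estimate(f, 0.0, 0.1) for _ in range(100000)]
print(np.mean(samples))          # ~ 2.0, though each sample is very noisy
```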
d-dimensional online algorithm
• Pick u_t uniformly at random from the unit sphere
• Play y_t = x_t + δu_t and observe only f_t(y_t)
• Gradient estimate: g_t = (d/δ) f_t(y_t) u_t
• Update: x_{t+1} = P_S(x_t + η g_t)
[Figure: iterates x_1…x_4 inside S, each perturbed to the nearby played point]
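Putting the pieces together, a sketch of the algorithm as described above. The parameter names, the payoff callback, and the projection helper are assumptions; note the projection must keep x_t far enough inside S that the played point x_t + δu_t stays feasible.

```python
import numpy as np

rng = np.random.default_rng(0)

def bandit_gradient_ascent(payoff, project, x0, d, delta, eta, T):
    """Each period: perturb x_t by a random unit vector, observe a single
    payoff, and treat (d/delta) * f_t(y_t) * u_t as the gradient."""
    x = x0
    for t in range(T):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        y = x + delta * u               # the point actually played
        value = payoff(t, y)            # the only feedback ever observed
        g = (d / delta) * value * u     # one-point gradient estimate
        x = project(x + eta * g)        # project onto S shrunk by delta
    return x
```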
Outline • Problem definition • Simple algorithm • Analysis sketch • Variations • Related work & applications
Analysis ingredients
• E[1-point estimate] is the gradient of a smoothed function f̂(x) = E_v[f(x + δv)], v uniform in the unit ball
• |f̂(x) − f(x)| is small
• Online gradient ascent analysis [Z03]
• Online expected gradient ascent analysis
• (Hidden complications)
1-pt gradient analysis
E[estimate] = ½ · f(x+δ)/δ − ½ · f(x−δ)/δ = (f(x+δ) − f(x−δ)) / (2δ),
which is exactly f̂′(x) for the smoothed function f̂(x) = (1/2δ) ∫_{x−δ}^{x+δ} f(z) dz, and f̂ ≈ f for small δ.
[Figure: profit vs. #Camrys, with evaluations at x−δ and x+δ]
1-pt gradient analysis (d-dim)
• E[(d/δ) f(x + δu) u] = ∇f̂(x) for u uniform on the unit sphere, where f̂(x) = E_v[f(x + δv)] with v uniform in the unit ball
• |f̂(x) − f(x)| is small, since f̂ just averages f over a radius-δ ball
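A Monte Carlo sanity check of the first ingredient, using a concave quadratic: perturbing a quadratic only shifts it by a constant, so ∇f̂ = ∇f exactly and the sample mean should match the true gradient. The dimension, δ, and center c below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
d, delta, n = 3, 0.5, 200000
c = np.array([1.0, -0.5, 2.0])
x = np.zeros(d)

u = rng.standard_normal((n, d))
u /= np.linalg.norm(u, axis=1, keepdims=True)     # uniform on the unit sphere
vals = -np.sum((x + delta * u - c) ** 2, axis=1)  # f(x + delta*u) for f(z) = -||z - c||^2
est = (d / delta) * (vals[:, None] * u).mean(axis=0)

print(est)           # Monte Carlo estimate of grad f_hat(x)
print(-2 * (x - c))  # true gradient; for quadratics grad f_hat = grad f
```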
Online gradient ascent [Z03]
Theorem: for concave f_t with ‖∇f_t‖ ≤ G on a set of diameter R, the updates x_{t+1} = P_S(x_t + η ∇f_t(x_t)) with a suitable fixed step η achieve regret O(RG√T).
Expected gradient ascent analysis
• If E[g_t | x_t] = ∇ĥ_t(x_t), then ascent on the random estimates g_t inherits, in expectation, the guarantee of regular deterministic gradient ascent on the ĥ_t (concave, bounded gradient)
Hidden complication… thin sets are bad
The played point x_t + δu_t can land outside a thin set, and shrinking S enough to prevent this can move the optimum too far.
[Figure: a long, thin convex set S]
Hidden complication… round sets are good
…reshape S into “isotropic position” [LV03] with an affine transformation
[Figure: the thin set reshaped into a round one]
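A sketch of the reshaping step on a deliberately thin set: estimate the covariance of (approximately) uniform samples from S and whiten. [LV03] obtains the samples with random-walk methods; plain rejection sampling is used here only because the example set is simple.

```python
import numpy as np

rng = np.random.default_rng(2)

# Thin ellipse S = {x : x0^2 + (10*x1)^2 <= 1}, sampled by rejection.
pts = rng.uniform(-1.0, 1.0, size=(400000, 2))
pts = pts[pts[:, 0] ** 2 + (10.0 * pts[:, 1]) ** 2 <= 1.0]

# Whitening: subtract the mean, multiply by cov^{-1/2}.
mu = pts.mean(axis=0)
w, v = np.linalg.eigh(np.cov(pts.T))
T = v @ np.diag(w ** -0.5) @ v.T      # symmetric inverse square root
rounded = (pts - mu) @ T.T

print(np.cov(rounded.T))              # ~ identity: S is now "round"
```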
Outline • Problem definition • Simple algorithm • Analysis sketch • Variations • Related work & applications
Variations
• Works against an adaptive adversary, which chooses f_t knowing x_1, x_2, …, x_{t−1}
• Also works if we only get a noisy estimate of f_t(x_t), i.e. E[h_t(x_t) | x_t] = f_t(x_t)
(bounds depend on the diameter of S and the gradient bound)
Related convex optimization
• Deterministic: gradient descent, …; ellipsoid; random walk [BV02]; simulated annealing [KV05]. From values only: finite differences.
• Stochastic: gradient descent (stoch.). From values only: 1-pt. gradient appx. [G89,S97]; finite differences.
• Online: gradient descent (online) [Z03]. From values only: 1-pt. gradient appx. [BKM04]; finite differences [Kleinberg04].
Multi-armed bandit (experts) [R52,ACFS95,…]
[Figure: a row of slot machines with a table of per-period payoffs]
Driving to work (online routing) [TW02,KV02,AK04,BM04]
• Exponentially many paths… exponentially many slot machines?
• Finite dimensions
• Exploration/exploitation tradeoff
[Figure: road network from home to work]
Conclusions and future work
• Can “learn” to optimize a sequence of unrelated functions from evaluations
• An answer to: “What is the sound of one hand clapping?”
• Applications: cholesterol, paper airplanes, advertising
• Future work: many players using the same algorithm (game theory)