
Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization


Presentation Transcript


  1. Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization Chaitanya Swamy University of Waterloo Joint work with David Shmoys Cornell University

  2. Stochastic Optimization • Way of modeling uncertainty. • Exact data is unavailable or expensive – data is uncertain, specified by a probability distribution. Want to make the best decisions given this uncertainty in the data. • Applications in logistics, transportation models, financial instruments, network design, production planning, … • Dates back to the 1950s and the work of Dantzig.

  3. Stochastic Recourse Models Given: a probability distribution over inputs. Stage I: make some advance decisions – plan ahead or hedge against uncertainty. Uncertainty evolves through various stages; we learn new information in each stage. Can take recourse actions in each stage – can augment the earlier solution, paying a recourse cost. Choose the initial (stage I) decisions to minimize (stage I cost) + (expected recourse cost).

  4. 2-stage problem ≡ 2 decision points; k-stage problem ≡ k decision points. [Figure: two scenario trees – a 2-stage tree branching from stage I into stage II scenarios, and a k-stage tree branching through intermediate stages into the scenarios in stage k, with branch probabilities on the edges.]

  5. 2-stage problem ≡ 2 decision points; k-stage problem ≡ k decision points. [Figure: the same 2-stage and k-stage scenario trees as the previous slide.] Choose stage I decisions to minimize expected total cost = (stage I cost) + E{all scenarios}[cost of stages 2, …, k].

  6. Stochastic Set Cover (SSC)
[Figure: scenario tree from stage I to stage-k scenarios A1 ⊆ U, …, Ak ⊆ U.]
Universe U = {e1, …, en}, subsets S1, S2, …, Sm ⊆ U; set S has weight wS.
Deterministic problem: pick a minimum-weight collection of sets that covers each element.
• Stochastic version: the target set of elements to be covered is given by a probability distribution.
• the target subset A ⊆ U to be covered (the scenario) is revealed after k stages
• choose some sets Si initially – stage I
• can pick additional sets Si in each stage, paying a recourse cost.
Minimize expected total cost = E{scenarios A ⊆ U}[cost of sets picked for scenario A in stages 1, …, k].

  7.–8. [Figure (two slides): a 3-stage scenario tree over elements A, B, C, D – stage I at the root; stage II branches with probabilities 0.8 and 0.2; stage III branches with probabilities 0.5/0.5 and 0.3/0.7; each stage III node specifies the target subset of {A, B, C, D} to cover.]

  9. Stochastic Set Cover (SSC) Universe U = {e1, …, en}, subsets S1, S2, …, Sm ⊆ U; set S has weight wS. Deterministic problem: pick a minimum-weight collection of sets that covers each element. Stochastic version: the target set of elements to be covered is given by a probability distribution. • How is the probability distribution on subsets specified? • A short (polynomial) list of possible scenarios • Independent probabilities that each element exists • A black box that can be sampled.
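
To make the black-box model concrete, here is a minimal sketch in Python: the algorithm's only access to the scenario distribution is a sampling routine it can call. The universe and the toy distribution below are illustrative, not from the talk.

```python
# A minimal sketch of the black-box model: the algorithm may only draw
# samples from the scenario distribution. Names and the toy distribution
# are illustrative, not from the talk.
import random

UNIVERSE = ["e1", "e2", "e3", "e4"]

def sample_scenario():
    """Black box: return a random target subset A of elements to cover.
    Here, purely for illustration, each element appears independently with
    probability 1/2 (the independent-activation special case above)."""
    return frozenset(e for e in UNIVERSE if random.random() < 0.5)

# The SAA algorithm of the later slides only ever calls sample_scenario();
# it never inspects the distribution itself.
samples = [sample_scenario() for _ in range(5)]
```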

  10. Approximation Algorithm • Hard to solve the problem exactly – even special cases are #P-hard. • Settle for approximate solutions: give a polytime algorithm that always finds near-optimal solutions. • A is an α-approximation algorithm if: • A runs in polynomial time • A(I) ≤ α·OPT(I) on all instances I. α is called the approximation ratio of A.

  11. Previous Models Considered • 2-stage problems • polynomial-scenario model: Dye, Stougie & Tomasgard; Ravi & Sinha; Immorlica, Karger, Minkoff & Mirrokni. • Immorlica et al. also consider the independent-activation model with proportional costs: (stage II cost) = λ·(stage I cost), e.g., wA,S = λ·wS for each set S, in each scenario A. • Gupta, Pál, Ravi & Sinha: black-box model, but also with proportional costs. • Shmoys, S (SS04): black-box model with arbitrary costs – gave an approximation scheme for 2-stage LPs, plus a rounding procedure that "reduces" stochastic problems to their deterministic versions.

  12. Previous Models (contd.) • Multi-stage problems • Hayrapetyan, S & Tardos: O(k)-approximation algorithm for k-stage Steiner tree. • Gupta, Pál, Ravi & Sinha: also other k-stage problems – 2k-approximation algorithm for Steiner tree; factors exponential in k for vertex cover and facility location. Both consider only proportional, scenario-dependent costs.

  13. Our Results • Give the first fully polynomial approximation scheme (FPAS) for a broad class of k-stage stochastic linear programs, for any fixed k. • black-box model: arbitrary distribution • no assumptions on costs • the algorithm is the Sample Average Approximation (SAA) method. First proof that SAA works for (a class of) k-stage LPs with a polynomially bounded sample size.
Related work:
• Shapiro '05: k-stage programs, but with independent stages
• Kleywegt, Shapiro & Homem-de-Mello '01: bounds for 2-stage programs
• S, Shmoys '05: unpublished note that SAA works for 2-stage LPs
• Charikar, Chekuri & Pál '05: another proof that SAA works for (a class of) 2-stage programs

  14. Results (contd.) • FPAS + the rounding technique of SS04 gives approximation algorithms for k-stage stochastic integer programs. • no assumptions on distribution or costs • improves upon various results obtained in more restricted models: e.g., O(k)-approx. for k-stage vertex cover (VC) and facility location. Munagala; Srinivasan: improved the factor for k-stage VC to 2.

  15. A Linear Program for 2-stage SSC
[Figure: the 2-stage scenario tree – stage I branching into stage II scenarios A ⊆ U.]
pA : probability of scenario A ⊆ U. Let cost wA,S = WS for each set S, in every scenario A.
xS : 1 if set S is picked in stage I; yA,S : 1 if S is picked in scenario A.
Minimize ∑S wS xS + ∑A⊆U pA ∑S WS yA,S
s.t. ∑S:e∈S xS + ∑S:e∈S yA,S ≥ 1 for each A ⊆ U, e ∈ A
xS, yA,S ≥ 0 for each S, A.
Exponentially many variables and constraints. Equivalent compact, convex program:
Minimize h(x) = ∑S wS xS + ∑A⊆U pA fA(x) s.t. 0 ≤ xS ≤ 1 for each S, where
fA(x) = min{ ∑S WS yA,S : ∑S:e∈S yA,S ≥ 1 − ∑S:e∈S xS for each e ∈ A; yA,S ≥ 0 for each S }.
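
For intuition, here is a hedged sketch of the extensive-form LP above, instantiated in the polynomial-scenario model so that a single solver call suffices. The toy instance (three sets, two scenarios) is invented for illustration, and scipy's HiGHS LP solver stands in for a generic LP oracle.

```python
# A minimal sketch of the slide's extensive-form LP for an explicit, toy
# scenario list. Instance data is illustrative, not from the talk.
import numpy as np
from scipy.optimize import linprog

sets = [frozenset({"e1", "e2"}), frozenset({"e2", "e3"}), frozenset({"e3"})]
w = np.array([1.0, 1.0, 1.0])        # stage-I weights w_S
W = np.array([3.0, 3.0, 3.0])        # stage-II weights W_S (here W_S = 3 w_S)
scenarios = [(0.6, frozenset({"e1"})), (0.4, frozenset({"e2", "e3"}))]  # (p_A, A)

m, K = len(sets), len(scenarios)
# Variable layout: x_S (m entries), then y_{A,S} per scenario (K blocks of m).
c = np.concatenate([w] + [p * W for p, _ in scenarios])

rows, b = [], []
for k, (_, A) in enumerate(scenarios):
    for e in A:                       # coverage: sum_{S∋e} (x_S + y_{A,S}) >= 1
        row = np.zeros(m * (K + 1))
        for j, S in enumerate(sets):
            if e in S:
                row[j] = -1.0                 # -x_S   (linprog uses <=)
                row[m * (1 + k) + j] = -1.0   # -y_{A,S}
        rows.append(row)
        b.append(-1.0)

res = linprog(c, A_ub=np.array(rows), b_ub=np.array(b),
              bounds=[(0, 1)] * m + [(0, None)] * (m * K), method="highs")
print("optimal value h(x*):", res.fun, " stage-I x:", res.x[:m])
```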

  16. Sample Average Approximation
• SAA method: sample N times from the distribution; estimate pA by qA = frequency of occurrence of scenario A = nA/N.
• True problem: min_{x∈P} h(x) = w·x + ∑A⊆U pA fA(x) (P)
• Sample average problem: min_{x∈P} h'(x) = w·x + ∑A⊆U qA fA(x) (SA-P)
• The size of (SA-P) as an LP depends on N – how large should N be?
Wanted result: with polynomially bounded N, x̄ solves (SA-P) ⇒ h(x̄) ≈ OPT.
Possible approach: try to show that h'(·) and h(·) take similar values.
Problem: rare scenarios can significantly influence the value of h(·), but will almost never be sampled.
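
A minimal sketch of the sampling step, reusing the sample_scenario black box assumed in the earlier sketch: draw N scenarios and form the empirical weights qA = nA/N.

```python
# A minimal sketch of the SAA estimate q_A = n_A / N from this slide.
# sample_scenario is the black box assumed in the earlier sketch; N is the
# sample size whose required magnitude is exactly the question on this slide.
from collections import Counter

def empirical_distribution(sample_scenario, N):
    """Draw N scenarios and return {A: q_A} with q_A = n_A / N."""
    counts = Counter(sample_scenario() for _ in range(N))
    return {A: n_A / N for A, n_A in counts.items()}

q = empirical_distribution(sample_scenario, N=1000)
# (SA-P) is then the LP of slide 15 restricted to the sampled scenarios,
# with the empirical weights q_A in place of the true probabilities p_A.
```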

  17. Sample Average Approximation (contd.)
[Figure: plots of h(x) and h'(x) over x, with the true minimizer x* and the SAA minimizer x̄.]
Wanted result: with polynomially bounded N, x̄ solves (SA-P) ⇒ h(x̄) ≈ OPT.
• Possible approach: try to show that h'(·) and h(·) take similar values.
• Problem: rare scenarios can significantly influence the value of h(·), but will almost never be sampled.
• Key insight: rare scenarios do not much affect the optimal first-stage decisions x*
⇒ instead of function values, look at how the functions vary with x
⇒ show that the "slopes" of h'(·) and h(·) are "close" to each other.

  18. Closeness-in-subgradients
True problem: min_{x∈P} h(x) = w·x + ∑A⊆U pA fA(x) (P)
Sample average problem: min_{x∈P} h'(x) = w·x + ∑A⊆U qA fA(x) (SA-P)
Slope ≡ subgradient. d ∈ ℝ^m is a subgradient of h(·) at u if, for all v, h(v) − h(u) ≥ d·(v − u).
d is an ε-subgradient of h(·) at u if, for all v ∈ P, h(v) − h(u) ≥ d·(v − u) − ε·h(v) − ε·h(u).
Closeness-in-subgradients: at "most" points u in P, there exists a vector d'u such that
(*) d'u is a subgradient of h'(·) at u, AND an ε-subgradient of h(·) at u.
This holds with high probability for h(·) and h'(·).
Lemma: for any convex functions g(·), g'(·), if (*) holds, then x̄ solves min_{x∈P} g'(x) ⇒ x̄ is a near-optimal solution to min_{x∈P} g(x).

  19. Closeness-in-subgradients (contd.)
Recall: d ∈ ℝ^m is a subgradient of h(·) at u if h(v) − h(u) ≥ d·(v − u) for all v, and an ε-subgradient if h(v) − h(u) ≥ d·(v − u) − ε·h(v) − ε·h(u) for all v ∈ P; (*) says that at "most" points u in P there is a d'u that is a subgradient of h'(·) and an ε-subgradient of h(·) at u.
Lemma: for any convex functions g(·), g'(·), if (*) holds, then x̄ solves min_{x∈P} g'(x) ⇒ x̄ is a near-optimal solution to min_{x∈P} g(x).
Intuition:
• the minimizer of a convex function is determined by its subgradients;
• the ellipsoid-based algorithm of SS04 for convex minimization uses only (ε-)subgradients: it uses the (ε-)subgradient to cut the ellipsoid at a feasible point u in P;
• (*) ⇒ we can run the SS04 algorithm on both min_{x∈P} g(x) and min_{x∈P} g'(x) using the same vector d'u to cut the ellipsoid at u ∈ P
⇒ the algorithm returns an x̄ that is near-optimal for both problems.
[Figure: the feasible region P, with the level set {x : g(x) ≤ g(u)} cut off at u by the hyperplane normal to du.]
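
The definitions above are easy to test numerically. Below is a small, illustrative checker for the ε-subgradient inequality at a finite set of trial points; with eps = 0 it reduces to the ordinary subgradient test. All names are ours, not the talk's.

```python
# A small numeric checker, in our own notation, for the definitions above:
# d is an ε-subgradient of h at u if h(v) - h(u) >= d·(v-u) - ε·h(v) - ε·h(u)
# for all v in P; we test a finite list of trial points. Purely illustrative.
import numpy as np

def is_eps_subgradient(h, d, u, trial_points, eps=0.0, tol=1e-9):
    u, d = np.asarray(u, float), np.asarray(d, float)
    return all(
        h(v) - h(u) >= d @ (np.asarray(v, float) - u) - eps * (h(v) + h(u)) - tol
        for v in trial_points
    )

# Example: h(x) = x1^2 + x2^2 has (sub)gradient 2u at u.
h = lambda x: float(np.sum(np.asarray(x) ** 2))
u = np.array([0.3, 0.4])
print(is_eps_subgradient(h, 2 * u, u, [np.array([0.1, 0.9]), np.array([1.0, 0.0])]))
```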

  20. Proof for 2-stage SSC
True problem: min_{x∈P} h(x) = w·x + ∑A⊆U pA fA(x) (P)
Sample average problem: min_{x∈P} h'(x) = w·x + ∑A⊆U qA fA(x) (SA-P)
Let λ = maxS WS/wS, and let zA be an optimal dual solution for scenario A at the point u ∈ P.
Facts from SS04:
(A) the vector du = {du,S} with du,S = wS − ∑A pA ∑e∈A∩S zA,e is a subgradient of h(·) at u; we can write du,S = E[XS], where XS = wS − ∑e∈A∩S zA,e in scenario A;
(B) XS ∈ [−WS, wS] ⇒ Var[XS] ≤ WS² for every set S;
(C) if d' = {d'S} is a vector such that |d'S − du,S| ≤ ε·wS for every set S, then d' is an ε-subgradient of h(·) at u.
(A) ⇒ the vector d'u with components d'u,S = wS − ∑A qA ∑e∈A∩S zA,e = Eq[XS] is a subgradient of h'(·) at u.
(B), (C) ⇒ with poly(λ²/ε²·log(1/δ)) samples, d'u is an ε-subgradient of h(·) at u with probability ≥ 1 − δ
⇒ polynomially many samples ensure that, with high probability, d'u is an ε-subgradient of h(·) at u at "most" points u ∈ P – property (*).
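
A sketch of how the sampled vector d'u above could be computed: for each sampled scenario A, solve the LP dual of the recourse problem fA(u) and average the resulting XS over the samples. The instance format matches the earlier toy LP sketch, and scipy's HiGHS solver again stands in for the dual oracle; this illustrates the formula, not the authors' implementation.

```python
# A sketch of the sampled subgradient d'_u from this slide, in our own toy
# format (sets as frozensets, weight arrays w_S and W_S). For each sampled
# scenario A we solve the dual of the recourse LP f_A(u),
#   max sum_{e in A} (1 - sum_{S∋e} u_S) z_e   s.t. sum_{e in A∩S} z_e <= W_S,
# and average X_S = w_S - sum_{e in A∩S} z_{A,e} over the samples.
import numpy as np
from scipy.optimize import linprog

def recourse_dual(u, A, sets, W):
    """Optimal dual {z_e} of f_A(u) for scenario A (empty dict if A is empty)."""
    elems = sorted(A)
    if not elems:
        return {}
    c = -np.array([1.0 - sum(u[j] for j, S in enumerate(sets) if e in S)
                   for e in elems])          # negate: linprog minimizes
    A_ub = np.array([[1.0 if e in S else 0.0 for e in elems] for S in sets])
    res = linprog(c, A_ub=A_ub, b_ub=W, bounds=[(0, None)] * len(elems),
                  method="highs")
    return dict(zip(elems, res.x))

def sampled_subgradient(u, samples, sets, w, W):
    """d'_{u,S} = w_S - average over sampled A of sum_{e in A∩S} z_{A,e}."""
    d = np.array(w, dtype=float)
    for A in samples:
        z = recourse_dual(u, A, sets, W)
        for j, S in enumerate(sets):
            d[j] -= sum(z[e] for e in A if e in S) / len(samples)
    return d
```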

  21. 3-stage SSC
[Figure: two 3-stage scenario trees. True distribution: stage I; stage II node A reached with probability pA; the subtree TA branches with probabilities pA,B to stage III. Sampled distribution: the same tree shape with estimates qA and qA,B. A stage III scenario (A,B) specifies the set of elements to cover.]
• The true distribution pA is estimated by qA.
• The true distribution {pA,B} in TA is only estimated by the distribution {qA,B}
⇒ the true and sample average problems solve different recourse problems for a given scenario A.
True problem: min_{x∈P} h(x) = w·x + ∑A pA fA(x) (3-P)
Sample avg. problem: min_{x∈P} h'(x) = w·x + ∑A qA gA(x) (3SA-P)
fA(x), gA(x) ≡ 2-stage set-cover problems specified by the tree TA.

  22. 3-stage SSC (contd.)
True problem: min_{x∈P} h(x) = w·x + ∑A pA fA(x) (3-P)
Sample avg. problem: min_{x∈P} h'(x) = w·x + ∑A qA gA(x) (3SA-P)
Main difficulty: h(·) and h'(·) solve different recourse problems.
• From the 2-stage theorem above, one can infer that for "most" x ∈ P, any second-stage solution y that minimizes gA(x) also "nearly" minimizes fA(x) – is this enough to prove the desired theorem for h(·) and h'(·)?
• Suppose H(x) = min_y a(x,y) and H'(x) = min_y b(x,y), such that for every x, each y that minimizes b(x,·) also minimizes a(x,·). If x̄ minimizes H'(·), does it also approximately minimize H(·)?
• NO: e.g., a(x,y) = A(x) + (y − y0)² and b(x,y) = B(x) + (y − y0)², where A(·) is an increasing function of x, B(·) is a decreasing function of x, and a(·,·), b(·,·) are convex functions.
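
The counterexample is easy to verify numerically. In the toy instance below (our choice of A(x) = x, B(x) = −x, y0 = 0.5, P = [0, 1]), every y minimizing b(x,·) also minimizes a(x,·), yet the minimizer of H'(·) is x = 1 while the minimizer of H(·) is x = 0.

```python
# Toy instance of the slide's counterexample: a(x,y) = A(x) + (y-y0)^2 and
# b(x,y) = B(x) + (y-y0)^2 with A increasing and B decreasing in x. Both are
# convex, and for every x the unique minimizer over y is y0, so minimizing
# b(x,.) over y also minimizes a(x,.). All values here are illustrative.
y0 = 0.5
A = lambda x: x        # increasing in x
B = lambda x: -x       # decreasing in x
a = lambda x, y: A(x) + (y - y0) ** 2   # H(x)  = min_y a(x,y) = A(x)
b = lambda x, y: B(x) + (y - y0) ** 2   # H'(x) = min_y b(x,y) = B(x)

xs = [i / 10 for i in range(11)]        # grid over P = [0, 1]
x_min_H  = min(xs, key=lambda x: a(x, y0))   # minimizes H  -> 0.0
x_min_Hp = min(xs, key=lambda x: b(x, y0))   # minimizes H' -> 1.0
print(x_min_H, x_min_Hp)  # 0.0 1.0: the H'-minimizer is the worst point for H
```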

  23. Proof sketch for 3-stage SSC
True problem: min_{x∈P} h(x) = w·x + ∑A pA fA(x) (3-P)
Sample avg. problem: min_{x∈P} h'(x) = w·x + ∑A qA gA(x) (3SA-P)
Main difficulty: h(·) and h'(·) solve different recourse problems.
We will show that h(·) and h'(·) are close in subgradients.
Subgradient of h(·) at u is du with du,S = wS − ∑A pA·(dual soln. to fA(u)).
Subgradient of h'(·) at u is d'u with d'u,S = wS − ∑A qA·(dual soln. to gA(u)).
To show d'u is an ε-subgradient of h(·), we need that (dual soln. to gA(u)) is a near-optimal (dual soln. to fA(u)).
This is a sample-average theorem for the dual of a 2-stage problem!

  24. Proof sketch for 3-stage SSC (contd.)
True problem: min_{x∈P} h(x) = w·x + ∑A pA fA(x) (3-P)
Sample average problem: min_{x∈P} h'(x) = w·x + ∑A qA gA(x) (3SA-P)
Subgradient of h(·) at u is du with du,S = wS − ∑A pA·(dual soln. to fA(u)).
Subgradient of h'(·) at u is d'u with d'u,S = wS − ∑A qA·(dual soln. to gA(u)).
To show d'u is an ε-subgradient of h(·), we need that (dual soln. to gA(u)) is a near-optimal (dual soln. to fA(u)).
Idea: show that the two dual objective functions are close in subgradients.
Problem: closeness-in-subgradients cannot be obtained from the standard exponential-size LP dual of fA(x), gA(x).

  25. [Figure: the true-distribution tree – stage I; stage II node A with probability pA; subtree TA with probabilities pA,B; a stage III scenario (A,B) specifies the set E(A,B) of elements to cover.]
fA(x) = min ∑S wA,S yA,S + ∑scenarios (A,B),S pA,B·wAB,S zA,B,S
s.t. ∑S:e∈S yA,S + ∑S:e∈S zA,B,S ≥ 1 − ∑S:e∈S xS for all scenarios (A,B) and all e ∈ E(A,B)
yA,S, zA,B,S ≥ 0 for all scenarios (A,B) and all S.
The dual is
max ∑A,B,e (1 − ∑S:e∈S xS)·αA,B,e
s.t. ∑scenarios (A,B), e∈S αA,B,e ≤ wA,S for all S
∑e∈S αA,B,e ≤ pA,B·wAB,S for all scenarios (A,B) and all S
αA,B,e ≥ 0 for all scenarios (A,B) and all e ∈ E(A,B).

  26. Proof sketch for 3-stage SSC (contd.)
True problem: min_{x∈P} h(x) = w·x + ∑A pA fA(x) (3-P)
Sample average problem: min_{x∈P} h'(x) = w·x + ∑A qA gA(x) (3SA-P)
• Subgradient of h(·) at u is du with du,S = wS − ∑A pA·(dual soln. to fA(u)); subgradient of h'(·) at u is d'u with d'u,S = wS − ∑A qA·(dual soln. to gA(u)).
• To show d'u is an ε-subgradient of h(·), we need that (dual soln. to gA(u)) is a near-optimal (dual soln. to fA(u)).
• Idea: show that the two dual objective functions are close in subgradients.
• Problem: closeness-in-subgradients cannot be obtained from the standard exponential-size LP dual of fA(x), gA(x); so
• formulate a new compact, non-linear dual of polynomial size;
• an (approximate) subgradient of this dual objective function comes from a (near-)optimal solution to a 2-stage primal LP: use the earlier SAA result.
• Recursively apply this idea to solve k-stage stochastic LPs.

  27. Summary of Results • Give the first approximation scheme to solve a broad class of k-stage stochastic linear programs, for any fixed k. • prove that the Sample Average Approximation method works for our class of k-stage programs. • Obtain approximation algorithms for k-stage stochastic integer problems – no assumptions on costs or distribution. • k·log n-approx. for k-stage set cover. (Srinivasan: log n) • O(k)-approx. for k-stage vertex cover, multicut on trees, uncapacitated facility location (FL), and some other FL variants. • (1+ε)-approx. for multicommodity flow. These results improve previous results obtained in restricted k-stage models.

  28. Open Questions • Obtain approximation factors independent of k for k-stage (integer) problems: e.g., k-stage FL, k-stage Steiner tree. • Improve the analysis of the SAA method, or obtain some other (polynomial) sampling algorithm: • any α-approx. solution to the constructed problem gives an (α+ε)-approx. solution to the true problem; • better dependence on k – are exp(k) samples required? • improved sample bounds when stages are independent?

  29. Thank You.
