Sampling-based Approximation Algorithms for Multi-stage Stochastic Optimization
Chaitanya Swamy, Caltech and U. Waterloo
Joint work with David Shmoys, Cornell University
Stochastic Optimization • A way of modeling uncertainty. • Exact data is unavailable or expensive – data is uncertain, specified by a probability distribution. Want to make the best decisions given this uncertainty in the data. • Applications in logistics, transportation models, financial instruments, network design, production planning, … • Dates back to the 1950s and the work of Dantzig.
Stochastic Recourse Models
Given: a probability distribution over inputs.
Stage I: Make some advance decisions – plan ahead or hedge against uncertainty.
Uncertainty evolves through various stages.
• Learn new information in each stage.
• Can take recourse actions in each stage – can augment the earlier solution, paying a recourse cost.
Choose initial (stage I) decisions to minimize (stage I cost) + (expected recourse cost).
2-stage problem ≡ 2 decision points; k-stage problem ≡ k decision points.
[Figure: a 2-stage scenario tree (stage I branching, with probabilities on the edges, into stage II scenarios) alongside a k-stage tree (stage I branching through intermediate stages into the scenarios in stage k).]
Choose stage I decisions to minimize expected total cost = (stage I cost) + E_{all scenarios}[cost of stages 2, …, k].
Stochastic Set Cover (SSC)
Universe U = {e_1, …, e_n}, subsets S_1, S_2, …, S_m ⊆ U; set S has weight w_S.
Deterministic problem: Pick a minimum-weight collection of sets that covers each element.
Stochastic version: The set of elements to be covered is given by a probability distribution.
• the subset A ⊆ U to be covered (the scenario) is revealed after k stages
• choose some sets initially – stage I
• can pick additional sets in each stage, paying a recourse cost.
Minimize Expected Total Cost = E_{scenarios A⊆U}[cost of sets picked for scenario A in stages 1, …, k].
How is the probability distribution on subsets specified?
• A short (polynomial) list of possible scenarios
• Independent probabilities that each element exists
• A black box that can be sampled.
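The black-box model can be made concrete as a sampling oracle. Below is a minimal sketch; the names (`ScenarioOracle`, `activation_probs`) are illustrative assumptions, not from the talk, and the independent-activation distribution is used only as an example of what the box could hide.

```python
import random

class ScenarioOracle:
    """Black-box distribution over scenarios A ⊆ U: the algorithm may only
    draw independent samples; it never sees the distribution itself."""
    def __init__(self, activation_probs):
        # Example instance: element e appears independently with prob. p_e.
        self.activation_probs = activation_probs  # {element: p_e}

    def sample(self):
        return frozenset(e for e, p in self.activation_probs.items()
                         if random.random() < p)
```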
Approximation Algorithm
• Hard to solve the problem exactly; even special cases are #P-hard.
• Settle for approximate solutions: give a polytime algorithm that always finds near-optimal solutions.
• A is an α-approximation algorithm if:
• A runs in polynomial time;
• A(I) ≤ α·OPT(I) on all instances I.
• α is called the approximation ratio of A.
Previous Models Considered
• 2-stage problems
• polynomial-scenario model: Dye, Stougie & Tomasgard; Ravi & Sinha; Immorlica, Karger, Minkoff & Mirrokni.
• Immorlica et al. also consider the independent-activation model with proportional costs: (stage II cost) = λ·(stage I cost), e.g., w_S^A = λ·w_S for each set S, in each scenario A.
• Gupta, Pál, Ravi & Sinha: black-box model, but also with proportional costs.
• Shmoys, S (SS04): black-box model with arbitrary costs.
• gave an approximation scheme for 2-stage LPs + a rounding procedure that "reduces" stochastic problems to their deterministic versions.
Previous Models (contd.)
• Multi-stage problems
• Hayrapetyan, S & Tardos: 2k-approximation algorithm for k-stage Steiner tree.
• Gupta, Pál, Ravi & Sinha: also other k-stage problems; 2k-approximation algorithm for Steiner tree, factors exponential in k for vertex cover, facility location.
Both only consider proportional, scenario-dependent costs.
Results from S, Shmoys ’05
• Give the first fully polynomial approximation scheme (FPAS) for a large class of k-stage stochastic linear programs for any fixed k.
• black-box model: arbitrary distribution.
• no assumptions on costs.
• algorithm is the Sample Average Approximation (SAA) method. First proof that SAA works for (a class of) k-stage LPs with a polynomially bounded sample size.
• Shapiro ’05: k-stage programs, but with independent stages.
• Kleywegt, Shapiro & Homem-de-Mello ’01: bounds for 2-stage programs.
• Charikar, Chekuri & Pál ’05: another proof that SAA works for (a class of) 2-stage programs.
Results (contd.)
• FPAS + the rounding technique of SS04 gives approximation algorithms for k-stage stochastic integer programs.
• no assumptions on distribution or costs.
• improve upon various results obtained in more restricted models: e.g., O(k)-approx. for k-stage vertex cover (VC), facility location. Munagala has since improved the factor for k-stage VC to 2.
A Linear Program for 2-stage SSC
p_A: probability of scenario A ⊆ U. Let cost w_S^A = W_S for each set S and scenario A; w_S = stage I cost of set S.
Variables: x_S = 1 if set S is picked in stage I; y_{A,S} = 1 if S is picked in stage II scenario A ⊆ U.

Minimize ∑_S w_S x_S + ∑_{A⊆U} p_A ∑_S W_S y_{A,S}
s.t. ∑_{S: e∈S} x_S + ∑_{S: e∈S} y_{A,S} ≥ 1 for each A ⊆ U, e ∈ A
x_S, y_{A,S} ≥ 0 for each S, A.

Exponentially many variables and constraints.

Equivalent compact, convex program:
Minimize h(x) = ∑_S w_S x_S + ∑_{A⊆U} p_A f_A(x) s.t. 0 ≤ x_S ≤ 1 for each S, where
f_A(x) = min { ∑_S W_S y_{A,S} : ∑_{S: e∈S} y_{A,S} ≥ 1 – ∑_{S: e∈S} x_S for each e ∈ A; y_{A,S} ≥ 0 for each S }.
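For a fixed stage-I vector x and a scenario A, the recourse value f_A(x) is itself a small LP. Below is a minimal sketch of evaluating it with an off-the-shelf solver; the data layout (`sets` as a list of frozensets, `W` the stage-II cost vector) is an assumption for illustration, not the talk's implementation.

```python
import numpy as np
from scipy.optimize import linprog

def recourse_cost(scenario, sets, W, x):
    """f_A(x) = min sum_S W_S y_S  s.t.
       sum_{S: e in S} y_S >= 1 - sum_{S: e in S} x_S for each e in A, y >= 0."""
    if not scenario:
        return 0.0
    elements = sorted(scenario)
    # linprog uses <= constraints, so negate each covering constraint.
    A_ub = np.array([[-1.0 if e in S else 0.0 for S in sets] for e in elements])
    b_ub = np.array([-(1.0 - sum(x[i] for i, S in enumerate(sets) if e in S))
                     for e in elements])
    res = linprog(c=W, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * len(sets))
    return res.fun if res.success else float("inf")
```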
Sample Average Approximation
The Sample Average Approximation (SAA) method:
• Sample N times from the distribution.
• Estimate p_A by q_A = frequency of occurrence of scenario A = n_A/N.
True problem: min_{x∈P} h(x) = w·x + ∑_{A⊆U} p_A f_A(x) (P)
Sample-average problem: min_{x∈P} h'(x) = w·x + ∑_{A⊆U} q_A f_A(x) (SA-P)
Size of (SA-P) as an LP depends on N – how large should N be?
Wanted result: with polynomial N, x solves (SA-P) ⇒ h(x) ≈ OPT.
• Possible approach: try to show that h'(.) and h(.) take similar values.
• Problem: rare scenarios can significantly influence the value of h(.), but will almost never be sampled.
• Key insight: rare scenarios do not much affect the optimal solution x*
⇒ instead of function values, look at how the functions vary with x
⇒ show that the "slopes" of h'(.) and h(.) are "close" to each other.
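The SAA loop itself is short; here is a sketch under the same hypothetical interfaces as before. `solve_lp` stands in for any finite-scenario 2-stage LP solver (e.g., on the extensive form with one y-block per sampled scenario) and is an assumption, not a real library call.

```python
from collections import Counter

def solve_saa(oracle, w, W, N, solve_lp):
    """Sample N scenarios, weight each distinct scenario A by its empirical
    frequency q_A = n_A/N, and solve min_x h'(x) = w.x + sum_A q_A f_A(x)."""
    counts = Counter(oracle.sample() for _ in range(N))
    weighted_scenarios = [(A, n / N) for A, n in counts.items()]  # (A, q_A)
    return solve_lp(w, W, weighted_scenarios)  # returns the stage-I vector x
```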
Closeness-in-Subgradients
True problem: min_{x∈P} h(x) = w·x + ∑_{A⊆U} p_A f_A(x) (P)
Sample-average problem: min_{x∈P} h'(x) = w·x + ∑_{A⊆U} q_A f_A(x) (SA-P)
Slope ≡ subgradient.
d ∈ ℝ^m is a subgradient of h(.) at u if, for all v, h(v) – h(u) ≥ d·(v–u).
d is an ε-subgradient of h(.) at u if, for all v ∈ P, h(v) – h(u) ≥ d·(v–u) – ε·h(v) – ε·h(u).
Closeness-in-subgradients: at "many" points u in P, there exists a vector d'_u such that
(*) d'_u is a subgradient of h'(.) at u, AND an ε-subgradient of h(.) at u.
This holds with high probability for h(.) and h'(.).
Lemma: For any convex functions g(.), g'(.), if (*) holds, then x solves min_{x∈P} g'(x) ⇒ x is a near-optimal solution to min_{x∈P} g(x).
Intuition:
• The subgradient determines the minimizer of a convex function.
• The ellipsoid-based algorithm of SS04 for convex minimization only uses (ε-)subgradients: it uses an (ε-)subgradient to cut the ellipsoid at a feasible point u in P.
[Figure: the polytope P with the cut at u ∈ P retaining the halfspace g(x) ≤ g(u), along the direction d_u.]
• (*) ⇒ can run the algorithm on both min_{x∈P} g(x) and min_{x∈P} g'(x) using the same vector d'_u at each u ∈ P ⇒ the algorithm returns an x that is near-optimal for both problems.
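The ε-subgradient inequality is easy to spot-check numerically. The following illustrative helper (hypothetical, not from the talk) tests the defining inequality at a finite sample of points v; the actual definition of course quantifies over all of P.

```python
import numpy as np

def is_eps_subgradient(h, d, u, eps, test_points):
    """Check h(v) - h(u) >= d.(v - u) - eps*h(v) - eps*h(u) for each v in
    test_points (numpy vectors in P). A necessary check only."""
    hu = h(u)
    return all(h(v) - hu >= np.dot(d, v - u) - eps * h(v) - eps * hu
               for v in test_points)
```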
Proof for 2-stage SSC
True problem: min_{x∈P} h(x) = w·x + ∑_{A⊆U} p_A f_A(x) (P)
Sample-average problem: min_{x∈P} h'(x) = w·x + ∑_{A⊆U} q_A f_A(x) (SA-P)
Let λ = max_S W_S/w_S, and let z_A ≡ the optimal solution to the dual of f_A(x) at the point x = u ∈ P (one component z_{A,e} per element e ∈ A).
Facts from SS04:
A. The vector d_u = {d_{u,S}} with d_{u,S} = w_S – ∑_A p_A ∑_{e∈A∩S} z_{A,e} is a subgradient of h(.) at u; can write d_{u,S} = E[X_S], where X_S = w_S – ∑_{e∈A∩S} z_{A,e} in scenario A.
B. X_S ∈ [–W_S, w_S] ⇒ Var[X_S] ≤ W_S² for every set S.
C. If d' = {d'_S} is a vector such that |d'_S – d_{u,S}| ≤ ε·w_S for every set S, then d' is an ε-subgradient of h(.) at u.
A ⇒ the vector d'_u with components d'_{u,S} = w_S – ∑_A q_A ∑_{e∈A∩S} z_{A,e} = E_q[X_S] is a subgradient of h'(.) at u.
B, C ⇒ with poly(λ²/ε²·log(1/δ)) samples, d'_u is an ε-subgradient of h(.) at u with probability ≥ 1 – δ
⇒ polynomially many samples ensure that, with high probability, at "many" points u ∈ P, d'_u is an ε-subgradient of h(.) at u: property (*).
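In code, the sample-average subgradient d'_u is just a Monte Carlo estimate of E[X_S]. A sketch under the earlier hypothetical interfaces; `recourse_duals(A, u)` is assumed to return the optimal dual of f_A(u) as a dict {element: z_{A,e}}.

```python
def estimate_subgradient(oracle, sets, w, u, num_samples, recourse_duals):
    """Estimate d_{u,S} = w_S - E[sum_{e in A∩S} z_{A,e}] by sampling.
    With poly(lambda^2/eps^2 * log(1/delta)) samples this is w.h.p. an
    eps-subgradient of h(.) at u (facts B, C above)."""
    totals = [0.0] * len(sets)
    for _ in range(num_samples):
        A = oracle.sample()
        z = recourse_duals(A, u)          # optimal duals for scenario A
        for i, S in enumerate(sets):
            totals[i] += sum(z[e] for e in A & S)
    return [w[i] - t / num_samples for i, t in enumerate(totals)]
```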
3-stage SSC
[Figure: two 3-stage scenario trees – the true distribution (stage II probabilities p_A, stage III probabilities p_{A,B}) and the sampled distribution (q_A, q_{A,B}); a stage III scenario (A, B) specifies the set of elements to cover; T_A is the subtree rooted at stage II scenario A.]
The true distribution {p_{A,B}} in T_A is only estimated by the distribution {q_{A,B}} ⇒ the true and sample-average problems solve different recourse problems for a given scenario A.
True problem: min_{x∈P} h(x) = w·x + ∑_A p_A f_A(x) (3-P)
Sample-average problem: min_{x∈P} h'(x) = w·x + ∑_A q_A g_A(x) (3SA-P)
f_A(x), g_A(x) ≡ 2-stage set-cover problems specified by the tree T_A.
Proof Sketch for 3-stage SSC
True problem: min_{x∈P} h(x) = w·x + ∑_A p_A f_A(x) (3-P)
Sample-average problem: min_{x∈P} h'(x) = w·x + ∑_A q_A g_A(x) (3SA-P)
Want to show that h(.) and h'(.) are close in subgradients; the main difficulty is that h(.) and h'(.) solve different recourse problems.
• Subgradient of h(.) at u is d_u with d_{u,S} = w_S – ∑_A p_A·(dual soln. to f_A(u)).
• Subgradient of h'(.) at u is d'_u with d'_{u,S} = w_S – ∑_A q_A·(dual soln. to g_A(u)).
• To show d'_u is an ε-subgradient of h(.), we need: (dual soln. to g_A(u)) is a near-optimal (dual soln. to f_A(u)).
This is a sample-average theorem for the dual of a 2-stage problem!
Idea: Show that the two dual objective functions are close in subgradients.
• Problem: cannot get closeness-in-subgradients from the standard exponential-size LP dual of f_A(x), g_A(x)
⇒ formulate a new compact, non-linear dual of polynomial size.
• An (approximate) subgradient of this dual objective function comes from a (near-)optimal solution to a 2-stage primal LP: use the earlier SAA result.
• Recursively apply this idea to solve k-stage stochastic LPs.
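Structurally, the recursion bottoms out in a sampled scenario tree: each node draws its own samples, and for fixed k the tree has polynomial size. A highly simplified sketch follows (all interfaces hypothetical, and using the same sample count N at every stage for brevity, though the actual sample-size bounds can differ by stage).

```python
from collections import Counter

def sample_tree(oracle, k, N):
    """Recursively sample a k-stage scenario tree. oracle.subtree(A) is
    assumed to return the black box conditioned on reaching scenario A;
    each child carries its empirical probability q_A = n_A/N."""
    if k <= 1:
        return {}                                  # leaf: no further stages
    counts = Counter(oracle.sample() for _ in range(N))
    return {A: (n / N, sample_tree(oracle.subtree(A), k - 1, N))
            for A, n in counts.items()}
```

Solving the deterministic-equivalent LP on this sampled tree gives the k-stage sample-average problem; the recursive dual argument above is what certifies that its optimum is near-optimal for the true problem.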
Summary of Results
• Give the first approximation scheme to solve a broad class of k-stage stochastic linear programs for any fixed k.
• prove that the Sample Average Approximation method works for our class of k-stage programs.
• Obtain approximation algorithms for k-stage stochastic integer problems – no assumptions on costs or distribution.
• (k·log n)-approx. for k-stage set cover.
• O(k)-approx. for k-stage vertex cover, multicut on trees, uncapacitated facility location (FL), some other FL variants.
• (1+ε)-approx. for multicommodity flow.
Results generalize and/or improve previous results obtained in more restricted models.