Dependent Randomized Rounding in Matroid Polytopes (& Related Results)

Chandra Chekuri (Univ. of Illinois), Jan Vondrak (IBM Research), Rico Zenklusen (MIT)

Example: Congestion Minimization
• Choose a path for each pair $(s_i, t_i)$
• Minimize the maximum number of paths using any edge (congestion)
• Special case: Edge-Disjoint Paths
• [Raghavan-Thompson’87]
  • Solve the multicommodity-flow relaxation (LP)
  • Randomly pick a path for each pair according to the fractional solution
  • Chernoff bounds show an approximation ratio of $O(\log n / \log\log n)$
[Figure: graph G with terminal pairs (s1, t1), (s2, t2), (s3, t3); candidate paths are labeled with fractional LP values (0.7, 0.5, 0.25, 0.3, 0.1, 0.15, 0.2, 0.15, 0.65)]
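The independent rounding step of [Raghavan-Thompson’87] can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the LP solution is already decomposed into paths with weights summing to 1 per pair, and the names `round_paths`, `congestion`, and the toy instance are hypothetical (the instance is not the graph from the slide).

```python
import random
from collections import Counter

def round_paths(fractional, rng=random):
    """Pick one path per pair, independently, with probability equal to its LP weight.

    fractional: dict mapping a pair to a list of (path, weight) entries,
    where a path is a tuple of edges and the weights of a pair sum to 1.
    """
    chosen = {}
    for pair, options in fractional.items():
        paths = [p for p, _ in options]
        weights = [w for _, w in options]
        chosen[pair] = rng.choices(paths, weights=weights, k=1)[0]
    return chosen

def congestion(chosen):
    """Maximum number of chosen paths that share a single edge."""
    load = Counter(edge for path in chosen.values() for edge in path)
    return max(load.values())

# Hypothetical toy instance: two pairs, two candidate paths each.
fractional = {
    ("s1", "t1"): [((("s1", "a"), ("a", "t1")), 0.7),
                   ((("s1", "b"), ("b", "t1")), 0.3)],
    ("s2", "t2"): [((("s2", "a"), ("a", "t1"), ("t1", "t2")), 0.5),
                   ((("s2", "t2"),), 0.5)],
}
chosen = round_paths(fractional, random.Random(0))
print(chosen, congestion(chosen))
```

Because the pairs are rounded independently, the load on each edge is a sum of independent {0,1} variables, which is what makes the Chernoff bound applicable.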

Chernoff-Hoeffding Concentration Bounds
• $X_1, X_2, \ldots, X_n$ independent {0,1} random variables
• $E[X_i] = \Pr[X_i = 1] = x_i$
• $a_1, a_2, \ldots, a_n$ numbers in [0,1]
• $\mu = E[\sum_i a_i X_i] = \sum_i a_i x_i$
Theorem:
• $\Pr[\sum_i a_i X_i > (1+\delta)\mu] \le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\mu}$
• $\Pr[\sum_i a_i X_i < (1-\delta)\mu] \le \exp(-\mu\delta^2/2)$
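To get a feel for the two tail bounds, they can be evaluated numerically. The sketch below is illustrative only; the function names and the choice of a $1/n^2$ failure target are assumptions made for this example. With $\mu = 1$, the smallest threshold at which the upper tail drops below $1/n^2$ grows roughly like $\log n / \log\log n$.

```python
import math

def chernoff_upper(mu, delta):
    """Upper-tail bound (e^delta / (1+delta)^(1+delta))^mu, computed in log space."""
    return math.exp(mu * (delta - (1 + delta) * math.log1p(delta)))

def chernoff_lower(mu, delta):
    """Lower-tail bound exp(-mu * delta^2 / 2)."""
    return math.exp(-mu * delta ** 2 / 2)

def congestion_threshold(n, mu=1.0):
    """Smallest integer c with Pr[sum > c * mu] bounded by 1/n^2 (assumed target)."""
    c = 2
    while chernoff_upper(mu, c - 1) > 1.0 / n ** 2:
        c += 1
    return c

for n in (10, 1000, 10 ** 6):
    print(n, congestion_threshold(n))
```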

Example: Multipath Routing
• Choose $k_i$ paths for pair $(s_i, t_i)$ (assume the paths for a pair are disjoint)
• Minimize the maximum number of paths using any edge (congestion)
• [Srinivasan’99]
  • Solve the multicommodity-flow relaxation (LP)
  • Randomized pipage rounding
  • $O(\log n / \log\log n)$ approximation via negative correlation
[Figure: graph G with terminal pairs (s1, t1), (s2, t2), (s3, t3) and demands k1 = 2, k2 = 1, k3 = 2; candidate paths are labeled with fractional LP values (0.7, 0.5, 0.3, 0.3, 0.25, 0.95, 0.7, 0.5, 0.8)]

Dependent Randomized Rounding
• Randomized rounding while maintaining some dependency/correlation between variables
• Several variants in the literature
• This talk: dependent randomized rounding to satisfy a matroid base constraint while retaining concentration bounds similar to independent rounding
• Briefly: related work on matroid intersection and non-bipartite graph matchings

Crossing Spanning Trees and ATSP
• Undirected graph $G = (V, E)$
• Cuts $S_1, S_2, \ldots, S_m$
• Find a spanning tree T that minimizes the maximum number of edges crossing any given cut [Bilo-Goyal-Ravi-Singh’04] [Fekete-Lubbecke-Meijer’04]
• [Asadpour et al.]
  • Solve LP: x is a point in the spanning tree polytope of G
  • Dependent rounding via maximum entropy sampling
  • $O(\log m / \log\log m)$ approximation
  • Also $O(\log n / \log\log n)$ for ATSP (using several other ideas)
[Figure: graph with fractional edge values 0.6, 0.4, 0.7, 1, 0.3, 0.7, 0.9, 1, 0.4]

Tool: Negative Correlation
• $X_1, X_2$ two binary ({0,1}) random variables
• $X_1, X_2$ are negatively correlated if $E[X_1 X_2] \le E[X_1]\,E[X_2]$
• That is, $\Pr[X_1 = 1 \mid X_2 = 1] \le \Pr[X_1 = 1]$ and $\Pr[X_2 = 1 \mid X_1 = 1] \le \Pr[X_2 = 1]$
• This also implies that $(1 - X_1)$, $(1 - X_2)$ are negatively correlated

Negative Correlation
• $X_1, X_2, \ldots, X_n$ binary random variables
• $X_1, X_2, \ldots, X_n$ are negatively correlated if for any index set $J \subseteq \{1, 2, \ldots, n\}$
  • $E[\prod_{i \in J} X_i] \le \prod_{i \in J} E[X_i]$ and
  • $E[\prod_{i \in J} (1 - X_i)] \le \prod_{i \in J} E[1 - X_i]$
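For small n, the definition can be checked by brute force over all index sets J. The sketch below assumes the joint distribution is given explicitly as a dictionary from 0/1 tuples to probabilities; the function name and the two example distributions are hypothetical.

```python
from itertools import combinations
from math import prod

def is_negatively_correlated(dist, n, tol=1e-12):
    """Check both product inequalities for every index set J with |J| >= 2.

    dist: dict mapping 0/1 tuples of length n to their probabilities.
    """
    def expect(fn):
        return sum(p * fn(x) for x, p in dist.items())

    marg = [expect(lambda x, i=i: x[i]) for i in range(n)]
    for size in range(2, n + 1):
        for J in combinations(range(n), size):
            e_ones = expect(lambda x: prod(x[i] for i in J))
            e_zeros = expect(lambda x: prod(1 - x[i] for i in J))
            if e_ones > prod(marg[i] for i in J) + tol:
                return False
            if e_zeros > prod(1 - marg[i] for i in J) + tol:
                return False
    return True

# Exactly one of three elements is chosen uniformly (a base of a rank-1 uniform matroid).
one_of_three = {(1, 0, 0): 1 / 3, (0, 1, 0): 1 / 3, (0, 0, 1): 1 / 3}
# Two perfectly coupled bits: positively correlated.
coupled = {(1, 1): 0.5, (0, 0): 0.5}
print(is_negatively_correlated(one_of_three, 3), is_negatively_correlated(coupled, 2))
```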

Negative Correlation and Concentration
• $X_1, X_2, \ldots, X_n$ binary random variables that are negatively correlated (can be dependent)
• $E[X_i] = \Pr[X_i = 1] = x_i$
• $a_1, a_2, \ldots, a_n$ numbers in [0,1]
• $\mu = E[\sum_i a_i X_i] = \sum_i a_i x_i$
Theorem [Panconesi-Srinivasan’97]:
• $\Pr[\sum_i a_i X_i > (1+\delta)\mu] \le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\mu}$
• $\Pr[\sum_i a_i X_i < (1-\delta)\mu] \le \exp(-\mu\delta^2/2)$

Connecting the dots ...
What is common between the two applications?
Integer program: $\min \lambda$ s.t. $Ax \le \lambda b$, x is a base in a matroid
• $\lambda$ is the congestion
• A is a non-negative matrix (packing constraints)
• Multipath: x corresponds to choosing $k_i$ paths for pair $(s_i, t_i)$ from $P_i$
• Crossing tree: x induces a spanning tree

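Matroids
$M = (N, \mathcal{I})$ where N is a finite ground set and $\mathcal{I} \subseteq 2^N$ is a family of independent sets such that
• $\mathcal{I}$ is not empty
• $\mathcal{I}$ is downward closed: $B \in \mathcal{I}$ and $A \subseteq B$ imply $A \in \mathcal{I}$
• $A, B \in \mathcal{I}$ and $|A| < |B|$ imply there is $i \in B \setminus A$ such that $A + i \in \mathcal{I}$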

Matroid Examples
• Uniform matroid: $\mathcal{I} = \{ S : |S| \le k \}$
• Partition matroid: $\mathcal{I} = \{ S : |S \cap N_j| \le k_j,\ 1 \le j \le h \}$ where $N_1, \ldots, N_h$ partition N, and the $k_j$ are integers
• Graphic matroid: $G = (V, E)$ is a graph and $M = (E, \mathcal{I})$ where $\mathcal{I} = \{ S \subseteq E : S \text{ induces a forest} \}$
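Each of the three examples can be represented by an independence oracle, that is, a function answering "is S independent?". The sketch below is one possible encoding; the constructor names are hypothetical, and the graphic matroid uses a small union-find to detect cycles.

```python
def uniform_matroid(k):
    """Independent sets are all sets of size at most k."""
    return lambda S: len(S) <= k

def partition_matroid(parts, caps):
    """parts: list of sets partitioning N; caps: the bound k_j for each part."""
    return lambda S: all(len(set(S) & part) <= cap for part, cap in zip(parts, caps))

def graphic_matroid():
    """S is a collection of edges (u, v); it is independent iff it is a forest."""
    def independent(S):
        parent = {}

        def find(v):
            parent.setdefault(v, v)
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v

        for u, v in S:
            ru, rv = find(u), find(v)
            if ru == rv:  # u and v already connected: adding the edge closes a cycle
                return False
            parent[ru] = rv
        return True
    return independent

print(uniform_matroid(2)({1, 2}))                              # size 2 <= k
print(partition_matroid([{1, 2}, {3, 4}], [1, 2])({1, 2}))     # two elements from N_1, k_1 = 1
print(graphic_matroid()([(1, 2), (2, 3), (1, 3)]))             # a triangle is not a forest
```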

Bases in Matroid
• $B \in \mathcal{I}$ is a base of a matroid $M = (N, \mathcal{I})$ if B is a maximal independent set
• All bases have the same cardinality
• Matroids can also be defined via bases
• Example: spanning trees in a graph

Base Exchange Theorem
B' and B'' are distinct bases in a matroid $M = (N, \mathcal{I})$
Strong Base Exchange Theorem: There are elements $i \in B' \setminus B''$ and $j \in B'' \setminus B'$ such that $B' - i + j$ and $B'' - j + i$ are both bases.
[Figure: Venn diagram of B' and B'', with i in B' \ B'' and j in B'' \ B' being exchanged]
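The exchange property can be observed by brute force on a small graphic matroid. The sketch below lists all pairs (i, j) for which both exchanged sets are bases; the helper names and the example graph (a 4-cycle with one chord) are assumptions made for illustration.

```python
def is_spanning_tree(edges, vertices):
    """Base oracle for the graphic matroid of a connected graph."""
    if len(edges) != len(vertices) - 1:
        return False
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

def exchange_pairs(B1, B2, is_base):
    """All (i, j) with i in B1 \\ B2, j in B2 \\ B1 such that both swaps give bases."""
    B1, B2 = frozenset(B1), frozenset(B2)
    return [(i, j)
            for i in sorted(B1 - B2) for j in sorted(B2 - B1)
            if is_base((B1 - {i}) | {j}) and is_base((B2 - {j}) | {i})]

V = {1, 2, 3, 4}
B1 = {(1, 2), (2, 3), (3, 4)}          # path 1-2-3-4
B2 = {(4, 1), (1, 3), (2, 3)}          # tree 4-1-3-2
pairs = exchange_pairs(B1, B2, lambda S: is_spanning_tree(S, V))
print(pairs)
```

Not every pair works (some swaps would close a triangle), but the theorem guarantees that the list is never empty for distinct bases.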

Dependent Rounding in Matroids
• $M = (N, \mathcal{I})$ is a matroid with $|N| = n$
• $B(M)$ is the base polytope: $\mathrm{conv}\{\mathbf{1}_B : B \text{ is a base}\}$
• x is a fractional point in $B(M)$
• Round x to a random base B such that
  • $\Pr[i \in B] = x_i$ for each $i \in N$
  • the variables $X_i$ (indicator for $i \in B$) are negatively correlated

Our Work
Two methods for arbitrary matroids:
• Randomized pipage rounding for matroids [Calinescu-C-Pal-Vondrak’07,’09]
• Randomized swap rounding [C-Vondrak-Zenklusen’09]
This talk: randomized swap rounding

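Randomized Swap Rounding
• Express $x = \sum_{j=1}^{m} \beta_j B_j$ (convex combination of bases)
• $C_1 = B_1$, $\beta = \beta_1$
• For k = 1 to m-1 do
  • Randomly merge $\beta C_k$ and $\beta_{k+1} B_{k+1}$ into $(\beta + \beta_{k+1})\, C_{k+1}$
  • Set $\beta = \beta + \beta_{k+1}$
• Output $C_m$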

Swap Rounding
x = 0.2 B1 + 0.1 B2 + 0.5 B3 + 0.15 B4 + 0.05 B5
  = 0.2 C1 + 0.1 B2 + 0.5 B3 + 0.15 B4 + 0.05 B5
  → 0.3 C2 + 0.5 B3 + 0.15 B4 + 0.05 B5
  → 0.8 C3 + 0.15 B4 + 0.05 B5
  → 0.95 C4 + 0.05 B5
  → C5
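The accumulated weights in this trace, and the probability with which each merge favours the current base $C_k$, can be reproduced with exact arithmetic. This is only a bookkeeping sketch; the function name is hypothetical, and the probability $\beta/(\beta + \beta_{k+1})$ assumes the merge treats the two bases in proportion to their weights.

```python
from fractions import Fraction

def merge_schedule(betas):
    """For each merge k, return (k, weight of C_k, weight of B_{k+1}, prob. favouring C_k)."""
    acc = betas[0]
    steps = []
    for k, b in enumerate(betas[1:], start=1):
        steps.append((k, acc, b, acc / (acc + b)))
        acc += b  # C_{k+1} carries the combined weight
    return steps, acc

betas = [Fraction(s) for s in ("0.2", "0.1", "0.5", "0.15", "0.05")]
steps, total = merge_schedule(betas)
for k, acc, b, p in steps:
    print(f"merge {k}: {acc} C{k} with {b} B{k + 1}, favour C{k} w.p. {p}")
print("final weight:", total)
```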


Merging two Bases
• Merge B' and B'' into a random base B that looks like B' with probability p and like B'' with probability (1-p)
• Option: pick B' with prob. p and B'' with prob. (1-p)? This will not have the negative correlation properties!

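Merging two Bases
Base Exchange Theorem: $B' - i + j$ and $B'' - j + i$ are both bases (for suitable $i \in B' \setminus B''$ and $j \in B'' \setminus B'$)
[Figure: Venn diagram of B' and B'' with the exchanged elements i and j]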

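Merging two Bases
[Figure: one swap step on B' (weight p) and B'' (weight 1-p), with $i \in B' \setminus B''$ and $j \in B'' \setminus B'$]
• With prob. p: i replaces j in B'', so both bases contain i
• With prob. 1-p: j replaces i in B', so both bases contain j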
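Putting the merge step and the outer loop together gives the following sketch of randomized swap rounding. It is a minimal illustration under stated assumptions, not the authors' implementation: the matroid is accessed through a hypothetical base oracle `is_base`, the exchange partner j is found by linear search (it exists for every i by the symmetric exchange property), and swap steps are repeated until the two bases coincide.

```python
import random

def merge_bases(B1, w1, B2, w2, is_base, rng=random):
    """Randomly merge two bases with weights w1, w2 into a single base."""
    B1, B2 = set(B1), set(B2)
    while B1 != B2:
        i = min(B1 - B2)
        # A partner j with both exchanged sets being bases always exists.
        j = next(j for j in sorted(B2 - B1)
                 if is_base((B1 - {i}) | {j}) and is_base((B2 - {j}) | {i}))
        if rng.random() < w1 / (w1 + w2):
            B2 = (B2 - {j}) | {i}   # B2 moves towards B1
        else:
            B1 = (B1 - {i}) | {j}   # B1 moves towards B2
    return B1

def swap_round(bases, weights, is_base, rng=random):
    """Round x = sum_k weights[k] * bases[k] to a single random base."""
    C, w = set(bases[0]), weights[0]
    for B, b in zip(bases[1:], weights[1:]):
        C = merge_bases(C, w, B, b, is_base, rng)
        w += b
    return C

# Toy example: uniform matroid of rank 2 on {0, 1, 2, 3}.
is_base = lambda S: len(S) == 2
bases = [{0, 1}, {1, 2}, {2, 3}]
weights = [0.5, 0.3, 0.2]          # x = (0.5, 0.8, 0.5, 0.2)
print(swap_round(bases, weights, is_base, random.Random(0)))
```

Each swap step shrinks the symmetric difference of the two bases by two elements, so a merge finishes after at most rank-many steps.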

Merging Spanning Trees
[Figure: two spanning trees with weights 0.3 and 0.6; the two outcomes of a swap step occur with probabilities 0.6/(0.3+0.6) and 0.3/(0.3+0.6)]

Swap Rounding for Matroids
Theorem: Randomized-Swap-Rounding with $x \in B(M)$ outputs a random base B such that
• $\Pr[i \in B] = x_i$ for each $i \in N$
• the variables $X_i$ (indicator for $i \in B$) are negatively correlated
Negative correlation gives concentration bounds for linear functions of the $X_i$'s
Additional properties for submodular functions:
• $E[f(B)] \ge F(x)$ where F is the multilinear extension of f
• $\Pr[f(B) < (1-\delta) F(x)] \le \exp(-F(x)\,\delta^2/8)$ (concentration for the lower tail of submodular functions)

Several Applications
Can handle a matroid constraint plus packing constraints: $x \in B(M)$ and $Ax \le b$
• (1-1/e) approximation for submodular functions subject to a matroid plus O(1) knapsack/packing constraints (or many “loose” packing constraints)
• Simpler rounding and proof for “thin” spanning trees in the ATSP application ([Asadpour et al.’10])
• ...

Proof idea for Negative Correlation
Process is a vector-valued martingale:
• each iteration merges two bases
• merging bases involves swapping elements in each step
In each step only two elements i and j are involved
$X_i, X_j$ before the swap step and $X'_i, X'_j$ after the swap step:
• $E[X'_i \mid X_i, X_j] = X_i$ and $E[X'_j \mid X_i, X_j] = X_j$
• $X'_i + X'_j = X_i + X_j$
$$
\begin{aligned}
E[X'_i X'_j \mid X_i, X_j] &= \tfrac{1}{4} E[(X'_i + X'_j)^2 \mid X_i, X_j] - \tfrac{1}{4} E[(X'_i - X'_j)^2 \mid X_i, X_j] \\
&= \tfrac{1}{4} (X_i + X_j)^2 - \tfrac{1}{4} E[(X'_i - X'_j)^2 \mid X_i, X_j] \\
&\le \tfrac{1}{4} (X_i + X_j)^2 - \tfrac{1}{4} (X_i - X_j)^2 \\
&= X_i X_j
\end{aligned}
$$
The inequality uses $E[Z^2] \ge (E[Z])^2$ with $Z = X'_i - X'_j$, whose conditional expectation is $X_i - X_j$.
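The two martingale properties and the product inequality can be verified exactly for a single swap step. The parametrization below is an assumption based on the merge procedure: the two bases carry weights `beta1` and `beta2`, with i in the first base only and j in the second base only, so one outcome moves weight `beta2` from j to i and the other moves weight `beta1` from i to j. All names are hypothetical.

```python
from fractions import Fraction

def swap_step_outcomes(xi, xj, beta1, beta2):
    """Return the two outcomes (probability, x_i', x_j') of one swap step."""
    p = beta1 / (beta1 + beta2)
    return [(p, xi + beta2, xj - beta2),       # second base swaps j for i
            (1 - p, xi - beta1, xj + beta1)]   # first base swaps i for j

def moments(outcomes):
    """Return E[x_i'], E[x_j'] and E[x_i' * x_j']."""
    e_i = sum(p * a for p, a, _ in outcomes)
    e_j = sum(p * b for p, _, b in outcomes)
    e_ij = sum(p * a * b for p, a, b in outcomes)
    return e_i, e_j, e_ij

xi, xj = Fraction(1, 2), Fraction(7, 10)
beta1, beta2 = Fraction(1, 5), Fraction(3, 10)
e_i, e_j, e_ij = moments(swap_step_outcomes(xi, xj, beta1, beta2))
print(e_i == xi, e_j == xj, e_ij, xi * xj)
```

Under this parametrization the product decreases by exactly `beta1 * beta2` in expectation, which is consistent with the inequality on the slide.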

Beyond matroids?
Question: Can we obtain negative correlation for other combinatorial structures/polytopes?
Answer: No. Negative correlation implies the polytope is “essentially” a matroid base polytope

Other Comments
• Swap rounding advantage: identifies the exchange property as the key
• The idea generalizes/inspires work for other structures such as matroid intersection and b-matchings, with some restrictions
• Lower tail for submodular functions uses martingale analysis (does not follow from negative correlation)
• Negative correlation is not needed for concentration

Do we need negative correlation for concentration?
No.
• Lower tail for submodular functions is shown via the martingale method
• Can also show concentration for linear functions in the matroid intersection polytope and the non-bipartite matching polytope (at the loss of a bit in expectation)

Example: Rounding in bipartite-matching polytope
• $x_e = 1/2$ on each edge
• Can we round x to a matching?
• If we want to preserve the expectation of x, the only choice is to pick one of the two perfect matchings, each with prob. 1/2
• Large positive correlation!
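The positive correlation can be made concrete on the smallest instance of this kind. The sketch below assumes the graph is a 4-cycle (the figure did not survive, so this is a guess at a representative example) and enumerates its perfect matchings; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def perfect_matchings(edges, n):
    """All edge subsets of size n/2 that cover every one of the n vertices."""
    return [M for M in combinations(edges, n // 2)
            if len({v for e in M for v in e}) == n]

def pair_moment(dist, e, f):
    """E[X_e * X_f] under a distribution over matchings."""
    return sum(p for M, p in dist.items() if e in M and f in M)

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]      # assumed 4-cycle
pms = perfect_matchings(edges, 4)
dist = {M: Fraction(1, 2) for M in pms}       # the only way to get x_e = 1/2 everywhere

marginals = {e: sum(p for M, p in dist.items() if e in M) for e in edges}
print(pms)
print(marginals)
print(pair_moment(dist, (0, 1), (2, 3)), "vs", marginals[(0, 1)] * marginals[(2, 3)])
```

Opposite edges always appear together, so their joint probability is 1/2 rather than at most 1/4, which rules out negative correlation for this pair.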

Informal Statements
For any point x in the bipartite matching polytope:
• Can round x to a matching preserving expectation, and negative correlation holds for edge variables incident to any vertex [Srinivasan’99]
• Can round x to a matching x' s.t. $E[x'] = (1-\gamma)\, x$, and concentration holds for any linear function of x (the exponent in the tail bound depends on $\gamma$) [CVZ]
• The above results generalize to matroid intersection and non-bipartite matchings [CVZ]

Questions?

Thanks!

Submodular Functions
• Non-negative submodular set functions: $f(A) \ge 0$ for all A
• Monotone submodular set functions: $f(\emptyset) = 0$ and $f(A) \le f(B)$ for all $A \subseteq B$
• Symmetric submodular set functions: $f(A) = f(N \setminus A)$ for all A

Multilinear Extension of f
[CCPV’07], inspired by [Ageev-Sviridenko]
For $f: 2^N \to \mathbb{R}_+$ define $F: [0,1]^N \to \mathbb{R}_+$ as follows. For $x = (x_1, x_2, \ldots, x_n) \in [0,1]^N$, let R be a random set that contains each i independently with probability $x_i$. Then
$$F(x) = E[f(R)] = \sum_{S \subseteq N} f(S)\, p_x(S) = \sum_{S \subseteq N} f(S) \prod_{i \in S} x_i \prod_{i \in N \setminus S} (1 - x_i)$$
F is smooth submodular ([Vondrak’08]):
• $\partial F / \partial x_i \ge 0$ for all i (monotonicity)
• $\partial^2 F / \partial x_i \partial x_j \le 0$ for all i, j (submodularity)
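For a small ground set, F can be evaluated directly from the sum over all subsets. The sketch below does exactly that (exponential time, so only for illustration); the function name and the coverage function used as the example are assumptions, not taken from the talk.

```python
from itertools import combinations
from math import prod

def multilinear_extension(f, ground, x):
    """F(x) = sum over S of f(S) * prod_{i in S} x_i * prod_{i not in S} (1 - x_i)."""
    ground = list(ground)
    total = 0.0
    for r in range(len(ground) + 1):
        for S in combinations(ground, r):
            S = frozenset(S)
            p = prod(x[i] if i in S else 1 - x[i] for i in ground)
            total += f(S) * p
    return total

# Hypothetical coverage function: f(S) = number of items covered by the chosen sets.
covers = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}}
f = lambda S: len(set().union(*(covers[i] for i in S)))

x = {0: 0.5, 1: 0.5, 2: 0.5}
print(multilinear_extension(f, covers, x))
```

At an integral point (the indicator vector of a set S) the sum collapses to the single term f(S), so F agrees with f on sets.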

Optimizing F(x) [Vondrak’08]
Theorem: For any down-monotone polytope $P \subseteq [0,1]^n$, $\max F(x)$ s.t. $x \in P$ can be optimized to within a (1-1/e) approximation if we can do linear optimization over P
Algorithm: Continuous-Greedy
