Discrepancy and SDPs Nikhil Bansal (TU Eindhoven)
Outline • Discrepancy: definitions and applications • Basic results: upper/lower bounds • Partial Coloring method (non-constructive) • SDPs: basic method • Algorithmic six std. deviations • Lovett-Meka result • Lower bounds via SDP duality (Matousek)
Material. Classic: Geometric Discrepancy by J. Matousek. Papers: • Bansal. Constructive algs. for disc. minimization, FOCS 2010 • Matousek. The determinant lower bound is almost tight. arXiv’11 • Lovett, Meka. Disc. minimization by walking on edges. arXiv’12 • Other related recent works. Survey (main ideas): Bansal. Semidefinite opt. and disc. theory.
Discrepancy: What is it? Study of gaps in approximating the continuous by the discrete. Original motivation: numerical integration / sampling. How well can you approximate a region by discrete points? [Figure: sample points on the interval from 0 to n, with the estimate of an integral and its error.] Discrepancy: max over intervals I of |(# points in I) − (length of I)|.
Discrepancy: What is it? Problem: How uniformly can you distribute n points in an n^{1/2} × n^{1/2} grid? “Uniform”: for every axis-parallel rectangle R, |(# points in R) − (Area of R)| should be low. Discrepancy: max over rectangles R of |(# points in R) − (Area of R)|.
Distributing points in a grid. Problem: How uniformly can you distribute points in a grid? “Uniform”: for every axis-parallel rectangle R, |(# points in R) − (Area of R)| should be low. [Figure: three layouts of n = 64 points.] Uniform: n^{1/2} discrepancy. Random: n^{1/2} (log log n)^{1/2} discrepancy. Van der Corput set: O(log n) discrepancy!
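The comparison on this slide can be tried numerically. The following is a minimal sketch, not the construction from the talk: it builds the two-dimensional Van der Corput (Hammersley-type) point set from the base-2 radical inverse, and compares it with a plain lattice layout. The helper names (`radical_inverse`, `van_der_corput_set`, `lattice_set`, `corner_discrepancy`) are hypothetical, and as a simplifying assumption only rectangles anchored at the origin are examined, so the value returned is a lower estimate of the rectangle discrepancy.

```python
def radical_inverse(i, base=2):
    """Mirror the base-`base` digits of i about the radix point (3 = 11b -> 0.11b)."""
    inv, scale = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        inv += digit * scale
        scale /= base
    return inv


def van_der_corput_set(n, side):
    """n points (side*i/n, side*phi(i)) in a side x side square (density 1 if side^2 = n)."""
    return [(side * i / n, side * radical_inverse(i)) for i in range(n)]


def lattice_set(k):
    """k*k points on the integer lattice {0..k-1}^2 (the 'uniform' layout)."""
    return [(float(a), float(b)) for a in range(k) for b in range(k)]


def corner_discrepancy(points, side):
    """Max of |#points in R - area(R)| over origin-anchored rectangles R.

    Assumption: only rectangles [0,a] x [0,b] whose corner (a, b) uses point
    coordinates (or the full side) are tried, with open and closed boundaries.
    """
    xs = sorted({p[0] for p in points} | {side})
    ys = sorted({p[1] for p in points} | {side})
    worst = 0.0
    for a in xs:
        for b in ys:
            closed = sum(1 for (x, y) in points if x <= a and y <= b)
            opened = sum(1 for (x, y) in points if x < a and y < b)
            worst = max(worst, abs(closed - a * b), abs(opened - a * b))
    return worst


if __name__ == "__main__":
    n, side = 64, 8.0
    print("lattice        :", corner_discrepancy(lattice_set(8), side))
    print("Van der Corput :", corner_discrepancy(van_der_corput_set(n, side), side))
```

For n = 64 the lattice already loses about n^{1/2} on a thin rectangle that covers one full row of points, while the Van der Corput set stays well below that.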
Quasi-Monte Carlo Methods. With N random samples: error ∝ 1/N^{1/2}. Quasi-Monte Carlo methods: error ∝ Disc/N. Can discrepancy be O(1) for the 2-d grid? No: Ω(log n) [Schmidt’77]. d dimensions: O(log^{d−1} n) [Halton-Hammersley’60], Ω(log^{(d−1)/2} n) [Roth’64], Ω(log^{(d−1)/2+η} n) [Bilyk, Lacey, Vagharshakyan’08].
Discrepancy: Example 2. Input: n points placed arbitrarily in a grid. Color them red/blue such that each rectangle is colored as evenly as possible. Discrepancy: max over rectangles R of |# red in R − # blue in R|. Continuous: color each element 1/2 red and 1/2 blue (0 discrepancy). Discrete: random has about O(n^{1/2} log^{1/2} n); can achieve O(log^{2.5} n).
Discrepancy: Example 2. Input: n points placed arbitrarily in a grid. Color them red/blue such that each rectangle is colored as evenly as possible. Discrepancy: max over rectangles R of |# red in R − # blue in R|. Discrete: can achieve O(log^{2.5} n). Exercise: O(log^4 n). Optional: O(log^{2.5} n). Why do we care?
Combinatorial Discrepancy. Universe: U = [1,…,n]. Subsets: S_1, S_2, …, S_m. [Figure: overlapping sets S_1, S_2, S_3, S_4.] Color elements red/blue so each set is colored as evenly as possible. Find χ: [n] → {−1,+1} to minimize |χ(S)|_∞ = max_S |Σ_{i∈S} χ(i)|. If A is the incidence matrix: Disc(A) = min_{x ∈ {−1,+1}^n} ‖Ax‖_∞.
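For tiny instances the definition can be evaluated directly. The sketch below is a brute-force illustration only (exponential in n), with hypothetical helper names `coloring_discrepancy` and `disc`; elements are numbered from 0 for convenience.

```python
from itertools import product


def coloring_discrepancy(sets, coloring):
    """max over sets S of |sum_{i in S} coloring[i]| for a fixed +-1 coloring."""
    return max(abs(sum(coloring[i] for i in s)) for s in sets)


def disc(n, sets):
    """Minimum discrepancy over all 2^n colorings of elements 0..n-1."""
    return min(coloring_discrepancy(sets, x) for x in product((-1, 1), repeat=n))


if __name__ == "__main__":
    # A path: alternate the colors and every set is perfectly balanced.
    print(disc(3, [{0, 1}, {1, 2}]))
    # A triangle: two of the three elements must share a color.
    print(disc(3, [{0, 1}, {1, 2}, {0, 2}]))
```

The triangle shows the basic obstruction: whichever way three elements are 2-colored, some pair is monochromatic, so one of the three sets has imbalance 2.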
Applications CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine learning, Complexity, Pseudo-Randomness, … Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …
Hereditary Discrepancy. Discrepancy is a useful measure of the complexity of a set system, but it is not so robust. [Figure: elements 1, 2, …, n with copies 1', 2', …, n', and sets S_1, S_2, … with copies S'_1, S'_2, ….] Hereditary discrepancy: herdisc(U,S) = max_{U' ⊆ U} disc(U', S|_{U'}). This is a robust version of discrepancy (usually the same as discrepancy).
Rounding. Lovasz-Spencer-Vesztergombi’86: Given any matrix A and x ∈ R^n, can round x to an integer vector x' ∈ Z^n s.t. ‖Ax − Ax'‖_∞ < herdisc(A). Proof: round the bits of x one by one. Remark: A is TU (totally unimodular) iff it is an integer matrix with herdisc(A) = 1 (Ghouila-Houri test for TU matrices).
Rounding. The LSV’86 result guarantees existence. How to find the rounding efficiently? Nothing known until recently. Thm [B’10]: can efficiently round x to an integer vector x' so that ‖Ax − Ax'‖_∞ ≤ O(polylog(m, n)) · herdisc(A).
More rounding approaches. Bin Packing [Eisenbrand, Palvolgyi, Rothvoss’11]: Is OPT ≤ LP + O(1)? Yes, for constant item sizes, if the k-permutation conjecture is true. (Recently, Newman-Nikolov’11 disproved the k-permutation conjecture.) Technique refined further by Rothvoss’12 (entropy rounding method).
Dynamic Data Structures. N points in a 2-d region, with weights updated over time. Query: given an axis-parallel rectangle R, determine the total weight on points in R. Preprocess for: • Low query time • Low update time (upon weight change)
Example. Line: interval queries. Trivial: Query = O(n), Update = 1. Store every interval sum: Query = 1, Update = O(n^2). Prefix sums: Query = 2, Update = O(n). Balanced tree over the line: Query = O(log n), Update = O(log n). Apply recursively for 2-d.
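The O(log n) / O(log n) trade-off for interval queries on a line can be realized in several ways. The sketch below uses a Fenwick (binary indexed) tree, which is one standard choice and not necessarily the structure the talk has in mind; the class and method names are illustrative.

```python
class FenwickTree:
    """Point updates and interval-sum queries, both in O(log n) time."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed internal array

    def update(self, i, delta):
        """Add `delta` to the weight of point i (0-indexed)."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node whose range covers i

    def prefix_sum(self, count):
        """Total weight of points 0 .. count-1."""
        total = 0
        while count > 0:
            total += self.tree[count]
            count -= count & -count  # strip the lowest set bit
        return total

    def range_sum(self, lo, hi):
        """Total weight of points lo .. hi (inclusive)."""
        return self.prefix_sum(hi + 1) - self.prefix_sum(lo)


if __name__ == "__main__":
    ft = FenwickTree(8)
    for i, w in enumerate([5, 3, 7, 9, 6, 4, 1, 2]):
        ft.update(i, w)
    print(ft.range_sum(2, 5))  # weights 7 + 9 + 6 + 4
    ft.update(3, -9)           # point 3 drops to weight 0
    print(ft.range_sum(2, 5))
```

Each stored node aggregates a dyadic block of points, so a query touches O(log n) nodes and an update changes O(log n) nodes. This is the row-sparse / column-sparse picture in miniature.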
What about other queries? Circles, arbitrary rectangles, aligned triangles. Turns out these are provably harder [Larsen-Green’11]. Reason: the set system S of query objects + points has large discrepancy.
Idea. Any data structure is maintaining D. A good data structure implicitly computes D = AP, where A is row sparse (low query time) and P is column sparse (low update time). [Figure: weights w on the points pass through P (precompute) and then A (aggregator) to answer a query.]
Outline • Discrepancy: definitions and applications • Basic results: upper/lower bounds • Partial Coloring method (non-constructive) • SDPs: basic method • Algorithmic six std. deviations • Lovett-Meka result • Lower bounds via SDP duality (Matousek)
Basic Results What is the discrepancy of a general system on m sets?
Best Known Algorithm. Random: color each element i independently as x(i) = +1 or −1 with probability 1/2 each. Thm: Discrepancy = O((n log m)^{1/2}). Pf: For each set, expect O(n^{1/2}) discrepancy. Standard tail bounds: Pr[|Σ_{i∈S} x(i)| ≥ c n^{1/2}] ≈ e^{−c²}. Union bound + choose c ≈ (log m)^{1/2}. Analysis tight: random actually incurs Ω((n log m)^{1/2}). Henceforth, focus on the m = n case.
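The random-coloring bound is easy to check empirically. The sketch below is only an illustration: the instance generator, the seed and the failure probability are arbitrary choices, and the threshold uses the Hoeffding form 2·exp(−t²/(2n)) of the tail bound (an assumption about constants; the slide states the tail only up to constants).

```python
import math
import random


def random_sets(n, m, rng):
    """m random subsets of {0..n-1}, each element kept with probability 1/2."""
    return [[i for i in range(n) if rng.random() < 0.5] for _ in range(m)]


def random_coloring_discrepancy(sets, n, rng):
    """Discrepancy of one uniformly random +-1 coloring."""
    x = [rng.choice((-1, 1)) for _ in range(n)]
    return max(abs(sum(x[i] for i in s)) for s in sets)


def union_bound_threshold(n, m, failure_prob=0.001):
    """t with m * 2*exp(-t^2 / (2n)) <= failure_prob (Hoeffding + union bound)."""
    return math.sqrt(2 * n * math.log(2 * m / failure_prob))


if __name__ == "__main__":
    rng = random.Random(0)
    n = m = 200
    sets = random_sets(n, m, rng)
    print("random coloring :", random_coloring_discrepancy(sets, n, rng))
    print("union bound     :", union_bound_threshold(n, m))
```

The threshold grows like (n log m)^{1/2}, matching the theorem; a single random coloring exceeds it only with the chosen small failure probability.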
Better Colorings Exist! [Spencer 85] (Six standard deviations suffice): there always exists a coloring with discrepancy ≤ 6 n^{1/2}. (In general, for arbitrary m, discrepancy = O(n^{1/2} log^{1/2}(m/n)).) Tight: for m = n, cannot beat 0.5 n^{1/2} (Hadamard matrix). Will explore further in exercises. For an m × n matrix A: disc(A) ≥ (λ n / m)^{1/2}, where λ is the least eigenvalue of A^T A.
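The Hadamard lower bound can be verified by brute force for a small order. The sketch below uses the Sylvester construction (one standard way to obtain a Hadamard matrix, chosen here for convenience) and checks the ±1 matrix H directly: since the rows are orthogonal, ‖Hx‖_2² = n² for every ±1 vector x, so some entry of Hx has absolute value at least n^{1/2}. The slide's 0.5 n^{1/2} refers to the corresponding 0/1 set system; that conversion is not carried out here.

```python
import math
from itertools import product


def sylvester_hadamard(k):
    """Return the 2^k x 2^k Sylvester Hadamard matrix as lists of +-1."""
    h = [[1]]
    for _ in range(k):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def apply(h, x):
    """Matrix-vector product H x."""
    return [sum(a * b for a, b in zip(row, x)) for row in h]


def min_max_entry(h):
    """min over +-1 vectors x of max_i |(Hx)_i| (brute force, small n only)."""
    n = len(h)
    return min(max(abs(v) for v in apply(h, x)) for x in product((-1, 1), repeat=n))


if __name__ == "__main__":
    h = sylvester_hadamard(3)  # order n = 8
    print(min_max_entry(h), ">=", math.sqrt(len(h)))
```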
Better Colorings Exist! [Spencer 85] (Six standard deviations suffice): there always exists a coloring with discrepancy ≤ 6 n^{1/2}. (In general, for arbitrary m, discrepancy = O(n^{1/2} log^{1/2}(m/n)).) Inherently non-constructive proof (pigeonhole principle on an exponentially large universe). Challenge: can we find it algorithmically? Certain algorithms do not work [Spencer]. Conjecture [Alon-Spencer]: may not be possible.
Beck Fiala Thm. U = [1,…,n]. Sets: S_1, S_2, …, S_m. [Figure: overlapping sets S_1, S_2, S_3, S_4.] Suppose each element lies in at most t sets (t << n). [Beck Fiala’81]: Discrepancy ≤ 2t − 1 (elegant linear algebraic argument, algorithmic result; note: random does not work). Beck Fiala Conjecture: O(t^{1/2}) discrepancy possible. Other results (non-constructive): O(t^{1/2} log t log n) [Beck], O(t^{1/2} log n) [Srinivasan], O(t^{1/2} log^{1/2} n) [Banaszczyk].
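The linear-algebraic argument behind the 2t − 1 bound can be sketched as follows. This is a minimal floating-point illustration of the standard proof idea, not code from the talk: keep a fractional coloring in [−1,1]^n, call a set "active" while it has more than t unfixed elements, and move along a null vector of the active constraints until some coordinate hits ±1. The function name `beck_fiala`, the tolerance, and the use of an SVD to find a null vector are all assumptions of this sketch.

```python
import numpy as np


def beck_fiala(A, t, tol=1e-9):
    """Color the columns of a 0/1 matrix A (each column has at most t ones)."""
    A = np.asarray(A, dtype=float)
    x = np.zeros(A.shape[1])
    while True:
        floating = np.flatnonzero(np.abs(x) < 1 - tol)
        if floating.size == 0:
            break
        # Active sets still contain more than t floating elements.  Counting
        # incidences gives (#active) * (t + 1) <= t * (#floating), so there are
        # strictly fewer active sets than floating variables.
        active = np.flatnonzero(A[:, floating].sum(axis=1) > t)
        if active.size == 0:
            y = np.zeros(floating.size)
            y[0] = 1.0
        else:
            B = A[np.ix_(active, floating)]
            # With fewer rows than columns, the last right singular vector
            # lies in the null space of B.
            _, _, vt = np.linalg.svd(B, full_matrices=True)
            y = vt[-1]
        # Largest step along y that keeps every floating coordinate in [-1, 1].
        nz = np.abs(y) > tol
        steps = (np.sign(y[nz]) - x[floating][nz]) / y[nz]
        x[floating] += steps.min() * y
        # Freeze coordinates that reached +-1.
        done = np.abs(x) >= 1 - tol
        x[done] = np.sign(x[done])
    return np.rint(x).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, t = 30, 40, 3
    A = np.zeros((m, n), dtype=int)
    for j in range(n):
        A[rng.choice(m, size=t, replace=False), j] = 1
    x = beck_fiala(A, t)
    print("discrepancy:", np.abs(A @ x).max(), "bound:", 2 * t - 1)
```

An active set keeps sum 0 throughout; once it drops to at most t floating elements, each of those can still change by less than 2, which gives the 2t − 1 bound.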
Approximating Discrepancy. Question: if a set system has low discrepancy (say << n^{1/2}), can we find a good discrepancy coloring? [Charikar, Newman, Nikolov’11]: even 0 vs. Ω(n^{1/2}) is NP-hard. [Figure: elements 1, 2, …, n with copies 1', 2', …, n', and sets S_1, S_2, … with copies S'_1, S'_2, ….] (Matousek): What if the system has low hereditary discrepancy? herdisc(U,S) = max_{U' ⊆ U} disc(U', S|_{U'}). Useful for the rounding application.
Two Results. Thm 1: For any set system, can find a coloring with discrepancy ≤ O(polylog(m, n)) · hereditary discrepancy. Thm 2: Can get Spencer’s bound constructively, that is, O(n^{1/2}) discrepancy for m = n sets. Other problems: constructive bounds (matching current best) for the k-permutation problem [Spencer, Srinivasan, Tetali], geometric problems, the Beck Fiala setting (Srinivasan’s bound), …
Relaxations: LPs and SDPs. Not clear how to use. Linear program is useless: can color each element 1/2 red and 1/2 blue, and the discrepancy of each set = 0! SDPs (LP on v_i · v_j; cannot control the dimension of the v's): |Σ_{i∈S} v_i|² ≤ n ∀S, |v_i|² = 1. Intended solution: v_i = (+1,0,…,0) or (−1,0,…,0). Trivially feasible: v_i = e_i (all v_i's orthogonal). Yet, SDPs will be a major tool.
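The claim that the basic SDP is trivially feasible can be checked in a few lines. The sketch below does not solve an SDP; it only tests the two constraints on a given set of vectors. The helper name `sdp_feasible` and the random instance are hypothetical.

```python
import numpy as np


def sdp_feasible(V, sets, bound, eps=1e-9):
    """Check |v_i|^2 = 1 for all i and |sum_{i in S} v_i|^2 <= bound for all S.

    V is an array whose i-th row is the vector v_i.
    """
    unit = np.allclose((V ** 2).sum(axis=1), 1.0)
    sums_ok = all(np.sum(V[sorted(s)], axis=0) @ np.sum(V[sorted(s)], axis=0) <= bound + eps
                  for s in sets)
    return bool(unit and sums_ok)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 9
    sets = [set(np.flatnonzero(rng.random(n) < 0.5).tolist()) | {0} for _ in range(n)]
    sets.append(set(range(n)))

    # Orthogonal vectors v_i = e_i: |sum_{i in S} e_i|^2 = |S| <= n, always feasible.
    print(sdp_feasible(np.eye(n), sets, bound=n))

    # An integral solution v_i = (x_i, 0, ..., 0) must really have low discrepancy.
    all_plus = np.zeros((n, n))
    all_plus[:, 0] = 1.0
    print(sdp_feasible(all_plus, sets, bound=n))  # the full set gives n^2 > n
```

Because orthogonal vectors always satisfy the constraints, feasibility of this SDP by itself says nothing about the set system; the information has to come from how the SDP is used.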
Punch line. SDP very helpful if “tighter” bounds are needed for some sets: |Σ_{i∈S} v_i|² ≤ 2n, |Σ_{i∈S'} v_i|² ≤ n/log n (tighter bound for S'), |v_i|² ≤ 1. Not a priori clear why one can do this: entropy method. The algorithm will construct the coloring over time and use several SDPs in the process.
A Question. [Figure: number line from −n to n.]
An Improvement Can be improved to For a random {-1,1} coloring s, with prob. There are {-1,1} colorings s, with
Algorithmically ? Easy: 1/poly(n) Answer: Pick any poly(n) colorings. [Karmarkar-Karp’81]: Huge gap: Major open question Remark: {-1,+1} not enough. Really need to allow 0 also. E.g
Yet another enhancement There is a {-1,0,1} coloring with at least n/2 {-1,1}’s s.t. Split [-n,n] into buckets of size At least sums fall in same bucket Claim: Some two s’ and s’’ lie in same bucket and differ in at least n/2 coordinates. Again consider s = (s’-s’’)/2 s’ = (1,-1, 1 , …, 1,-1,-1) s’’ = (-1,-1,-1, …, 1,1, 1)
Proof of Claim [Kleitman’66] Isoperimetry for cube. Hamming ball B(v,r) has the smallest diameter for a given number of vertices. |B(v,n/4)| < i.e. in any set of {-1,1} vectors, some two are at Hamming distance ≥ n/2.
Spencer’s proof. Thm: For any set system on n sets, Disc(S) = O(n^{1/2}).
Spencer’s O(n^{1/2}) result. Partial Coloring Lemma: For any system with m sets, there exists a coloring on ≥ n/2 elements with discrepancy O(n^{1/2} log^{1/2}(2m/n)). [For m = n, disc = O(n^{1/2}).] Algorithm for total coloring: repeatedly apply the partial coloring lemma. Total discrepancy: O(n^{1/2} log^{1/2} 2) [Phase 1] + O((n/2)^{1/2} log^{1/2} 4) [Phase 2] + O((n/4)^{1/2} log^{1/2} 8) [Phase 3] + … = O(n^{1/2}). Let us prove the lemma for m = n.
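The phase-by-phase sum is a geometrically decaying series, which a short numerical check makes concrete. In the sketch below the hidden constant of the lemma is set to 1 and logs are taken base 2 (both are assumptions made only for illustration), so phase k contributes (n/2^{k−1})^{1/2} · k^{1/2}.

```python
import math


def total_discrepancy_bound(n):
    """Sum over phases k = 1, 2, ... of sqrt(n / 2^(k-1)) * sqrt(log2(2^k))."""
    total, remaining, phase = 0.0, float(n), 1
    while remaining >= 1:
        # 'remaining' uncolored elements, still n sets: log(2m/remaining) = phase.
        total += math.sqrt(remaining * phase)
        remaining /= 2
        phase += 1
    return total


if __name__ == "__main__":
    for n in (2 ** 10, 2 ** 20, 2 ** 30):
        print(n, total_discrepancy_bound(n) / math.sqrt(n))
```

The ratio to n^{1/2} stays bounded by a constant as n grows, because the factor 2^{−k/2} shrinks much faster than k^{1/2} grows.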
Proving Partial Coloring Lemma. Call two colorings X1 and X2 “similar” for set S if |X1(S) − X2(S)| ≤ 20 n^{1/2}. Key Lemma: There exist k = 2^{4n/5} colorings X1, …, Xk such that every two Xi, Xj are similar for every set S_1, …, S_n. Some X1, X2 among them differ on ≥ n/2 positions. Consider X = (X1 − X2)/2, e.g. X1 = (1,−1,1,…,1,−1,−1), X2 = (−1,−1,−1,…,1,1,1), X = (1,0,1,…,0,−1,−1). Pf: X(S) = (X1(S) − X2(S))/2 ∈ [−10 n^{1/2}, 10 n^{1/2}].
Proving Partial Coloring Lemma. [Figure: the real line split into buckets of width 20 n^{1/2}, with boundaries −30 n^{1/2}, −10 n^{1/2}, 10 n^{1/2}, 30 n^{1/2} and bucket labels −2, −1, 0, 1, 2.] Pf: Associate with each coloring X its signature (b_1, b_2, …, b_n), where b_i = bucket in which X(S_i) lies. Wish to show: there exist 2^{4n/5} colorings with the same signature. Note: the number of possible signatures far exceeds the number of colorings, so naïve counting does not work. Idea: not all signatures are equally likely. Entropy.
Entropy. For a discrete random variable X: H(X) = Σ_x Pr[X = x] log(1/Pr[X = x]). 1. Uniform distribution on k points: H(X) = log k. 2. If X is distributed on k points, H(X) ≤ log(k). If H(X) ≤ k, then some point has probability ≥ 2^{−k}. 3. Subadditivity: H(X,Y) ≤ H(X) + H(Y).
Proving Partial Coloring Lemma. [Figure: buckets with boundaries −30 n^{1/2}, −10 n^{1/2}, 10 n^{1/2}, 30 n^{1/2} and labels −2, −1, 0, 1, 2.] Pf: Associate with each coloring X its signature (b_1, b_2, …, b_n), where b_i = bucket in which X(S_i) lies. Wish to show: there exist 2^{4n/5} colorings with the same signature. Choose X randomly: this induces a distribution on signatures. Entropy(signature) ≤ n/5 implies some signature has prob. ≥ 2^{−n/5}. Entropy(signature) ≤ Σ_i Entropy(b_i) [subadditivity of entropy]. b_i = 0 w.p. ≈ 1 − 2e^{−50}, = 1 w.p. ≈ e^{−50}, = 2 w.p. ≈ e^{−450}, …, so Ent(b_i) ≤ 1/5.
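The entropy estimate for a single bucket index can be checked numerically. The sketch below plugs in the approximate probabilities quoted on the slide (e^{−50} for buckets ±1, e^{−450} for buckets ±2, the rest on bucket 0) and ignores the even smaller buckets further out, which is an assumption of this illustration. The function name `entropy` is hypothetical; entropy is measured in bits.

```python
import math


def entropy(probs):
    """Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


if __name__ == "__main__":
    # Uniform on k = 4 points has entropy log2(4) = 2 bits.
    print(entropy([0.25] * 4))

    # Bucket index b_i of one set under a random coloring (slide's estimates).
    p1, p2 = math.exp(-50), math.exp(-450)
    bucket = [1 - 2 * p1 - 2 * p2, p1, p1, p2, p2]
    print(entropy(bucket), "<= 1/5")
```

With n sets, subadditivity then bounds the entropy of the whole signature by n/5, so some signature has probability at least 2^{−n/5}, i.e. it is shared by at least 2^{4n/5} colorings.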
A useful generalization. Partial coloring with non-uniform discrepancy Δ_S for set S. For each set S, consider the “bucketing” with boundaries …, −3Δ_S, −Δ_S, Δ_S, 3Δ_S, 5Δ_S, … and bucket labels …, −1, 0, 1, 2, …. Suffices to have Σ_S Ent(b_S) ≤ n/5. Or, if Δ_S = λ_S |S|^{1/2}, then Σ_S g(λ_S) ≤ n/5, where g(λ) ≈ e^{−λ²/2} for λ > 1 and g(λ) ≈ ln(1/λ) for λ < 1. Example: a bucket of size n^{1/2}/100 has penalty ≈ ln(100).
General Partial Coloring. Thm: There is a partial coloring with discrepancy Δ_S for set S, provided Σ_S g(Δ_S/|S|^{1/2}) ≤ n/5, where g(λ) ≈ e^{−λ²/2} for λ > 1 and g(λ) ≈ ln(1/λ) for λ < 1. Very flexible. In Spencer’s setting, can require some sets to have discrepancy 0. Exercise: Prove disc. partial coloring for Beck Fiala.
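The entropy budget in the theorem can be explored with a small calculator. The sketch below takes the slide's two asymptotic forms of g at face value, with all hidden constants set to 1 (a rough assumption, so the numbers are indicative only); `g` and `budget_ok` are hypothetical helper names.

```python
import math


def g(lam):
    """Rough entropy penalty of a set with discrepancy lam * |S|^(1/2)."""
    if lam >= 1:
        return math.exp(-lam ** 2 / 2)
    return math.log(1.0 / lam)


def budget_ok(deltas, sizes, n):
    """Check sum_S g(Delta_S / |S|^(1/2)) <= n / 5."""
    total = sum(g(d / math.sqrt(s)) for d, s in zip(deltas, sizes))
    return total <= n / 5


if __name__ == "__main__":
    n = 100
    sizes = [n] * n  # n sets, each of size at most n

    # Spencer's setting: Delta = 2 n^(1/2) for every set.
    print(budget_ok([2 * math.sqrt(n)] * n, sizes, n))

    # Asking every set for Delta = n^(1/2) / 100 costs about ln(100) per set.
    print(g(0.01), budget_ok([math.sqrt(n) / 100] * n, sizes, n))
```

The budget view explains the flexibility: a few sets can be given very small Δ_S (large penalty each) as long as the remaining sets are loose enough to keep the total under n/5.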
Recap. Partial Coloring: Δ_S ≈ 10 n^{1/2} gives low entropy ≤ n/5 ⇒ 2^{4n/5} colorings exist with the same signature ⇒ some X1, X2 with large Hamming distance. (X1 − X2)/2 gives the desired partial coloring. Trouble: 2^{4n/5}/2^n is an exponentially small fraction. If only we could find the partial coloring efficiently…
Outline • Discrepancy: definitions and applications • Basic results: upper/lower bounds • Partial Coloring method (non-constructive) • SDPs: basic method • Algorithmic six std. deviations • Lovett-Meka result • Lower bounds via SDP duality (Matousek)
Algorithms. Thm (B’10): For any matrix A, there is a polynomial time algorithm to find a −1,+1 coloring x s.t. ‖Ax‖_∞ = O(polylog(m, n)) · Herdisc(A). Corollary: rounding with error = O(polylog(m, n)) · Herdisc(A).
Algorithm (at high level). Cube: {−1,+1}^n. Each dimension: an element. Each vertex: a coloring. [Figure: a walk inside the cube from “start” to “finish” at a vertex.] Algorithm: “sticky” random walk. Each step is generated by rounding a suitable SDP. Moves in various dimensions are correlated, e.g. the time-t updates of two elements sum to ≈ 0. Analysis: few steps to reach a vertex (walk has high variance); Disc(S_i) does a random walk (with low variance).