Discrepancy and SDPs

Discrepancy and SDPs. Nikhil Bansal (TU Eindhoven). Outline: discrepancy (definitions and applications); basic results (upper/lower bounds); partial coloring method (non-constructive); SDPs (basic method); algorithmic Spencer’s result; Lovett-Meka result; lower bounds via SDP duality (Matousek).


Presentation Transcript


  1. Discrepancy and SDPs Nikhil Bansal (TU Eindhoven)

  2. Outline. Discrepancy: definitions and applications. Basic results: upper/lower bounds. Partial coloring method (non-constructive). SDPs: basic method. Algorithmic Spencer’s result. Lovett-Meka result. Lower bounds via SDP duality (Matousek).

  3. Material. Classic: Geometric Discrepancy by J. Matousek. Papers: Bansal, Constructive algorithms for discrepancy minimization, FOCS 2010; Matousek, The determinant lower bound is almost tight; Lovett and Meka, Discrepancy minimization by walking on the edges. Survey with fewer technical details: Bansal. …

  4. Discrepancy: What is it? The study of gaps in approximating the continuous by the discrete. Original motivation: numerical integration and sampling. Problem: how well can you approximate a region by discrete points? Discrepancy: $\max_I |(\#\text{ points in } I) - (\text{length of } I)|$, the maximum taken over intervals $I$.

  5. Discrepancy: What is it? The study of gaps in approximating the continuous by the discrete. Problem: how uniformly can you distribute points in a grid? “Uniform”: for every axis-parallel rectangle $R$, $|(\#\text{ points in } R) - (\text{area of } R)|$ should be low. Discrepancy: $\max_R |(\#\text{ points in } R) - (\text{area of } R)|$, the maximum taken over rectangles $R$. (Figure: an $n^{1/2} \times n^{1/2}$ grid.)

  6. Distributing points in a grid. Problem: how uniformly can you distribute points in a grid? “Uniform”: for every axis-parallel rectangle $R$, $|(\#\text{ points in } R) - (\text{area of } R)|$ should be low. (Figure: three placements of $n = 64$ points.) Uniform: $n^{1/2}$ discrepancy. Random: $n^{1/2} (\log\log n)^{1/2}$ discrepancy. Van der Corput set: $O(\log n)$ discrepancy!
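
The Van der Corput construction mentioned on this slide can be sketched in a few lines. This is an illustrative sketch, not the speaker's code: the function names are hypothetical, the point set is the standard 2-d (Hammersley-style) version $(i/n, \phi(i))$ with $\phi$ the base-2 radical inverse, and the discrepancy is measured only over rectangles anchored at the origin with corners at sample coordinates, which is a simplification of "all axis-parallel rectangles".

```python
def van_der_corput(i: int, base: int = 2) -> float:
    """Radical inverse of i: mirror the base-`base` digits of i about the point."""
    value, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        value += digit / denom
    return value


def hammersley_points(n: int) -> list[tuple[float, float]]:
    """n points (i/n, van_der_corput(i)) in the unit square."""
    return [(i / n, van_der_corput(i)) for i in range(n)]


def anchored_discrepancy(points) -> float:
    """Max over tested corners (a, b) of |#points in [0,a) x [0,b) - n*a*b|.

    Assumption: only anchored boxes with corners at sample coordinates
    (plus 1.0) are tested, so this is a lower estimate of the full
    rectangle discrepancy.
    """
    n = len(points)
    xs = sorted({p[0] for p in points} | {1.0})
    ys = sorted({p[1] for p in points} | {1.0})
    worst = 0.0
    for a in xs:
        for b in ys:
            count = sum(1 for (x, y) in points if x < a and y < b)
            worst = max(worst, abs(count - n * a * b))
    return worst


points = hammersley_points(64)
print(anchored_discrepancy(points))  # stays logarithmic in n for n a power of 2
```

For $n = 2^m$ every dyadic box of area $1/n$ contains exactly one of these points, which is what keeps the anchored discrepancy at most $m + 1$.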

  7. Quasi-Monte Carlo Methods. With $n$ random samples: error $\propto 1/\sqrt{n}$. Quasi-Monte Carlo methods: error $\propto \mathrm{Disc}/n$. Can discrepancy be $O(1)$ for the 2-d grid? No: $\Omega(\log n)$ [Schmidt …]. In $d$ dimensions: $O(\log^{d-1} n)$ [Halton-Hammersley]; $\Omega(\log^{(d-1)/2} n)$ [Roth]; $\Omega(\log^{(d-1)/2 + \eta} n)$ [Bilyk, Lacey, Vagharshakyan ’08].

  8. Discrepancy: Example 2. Input: $n$ points placed arbitrarily in a grid. Color them red/blue such that each rectangle is colored as evenly as possible. Discrepancy: $\max_R |(\#\text{ red in } R) - (\#\text{ blue in } R)|$, the maximum taken over rectangles $R$. Continuous: color each element 1/2 red and 1/2 blue (0 discrepancy). Discrete: a random coloring has about $O(n^{1/2} \log^{1/2} n)$. Can achieve $O(\log^{2.5} n)$.

  9. Combinatorial Discrepancy. Universe: $U = [1,…,n]$. Subsets: $S_1, S_2, …, S_m$. (Figure: overlapping sets $S_1$, $S_2$, $S_3$, $S_4$.) Color elements red/blue so each set is colored as evenly as possible. Find $\chi: [n] \to \{-1,+1\}$ to minimize $|\chi(S)|_\infty = \max_S |\sum_{i \in S} \chi(i)|$. If $A$ is the $m \times n$ incidence matrix, $\mathrm{Disc}(A) = \min_{x \in \{-1,1\}^n} |Ax|_\infty$.
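
The definition $\mathrm{Disc}(A) = \min_x |Ax|_\infty$ can be checked directly on tiny instances. The following is a minimal exhaustive-search sketch (exponential in $n$, for illustration only); the function name and the example matrix are assumptions, not from the talk.

```python
from itertools import product


def discrepancy(A):
    """min over x in {-1,+1}^n of max_j |sum_i A[j][i] * x[i]| (brute force)."""
    n = len(A[0])
    best = None
    for x in product((-1, 1), repeat=n):
        worst = max(abs(sum(a * xi for a, xi in zip(row, x))) for row in A)
        if best is None or worst < best:
            best = worst
    return best


# Hypothetical example: the three 2-element subsets of {1, 2, 3}.
# Two of the three elements must share a color, so some set sums to +-2.
triangle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
print(discrepancy(triangle))
```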

  10. Applications CS: Computational Geometry, Comb. Optimization, Monte-Carlo simulation, Machine learning, Complexity, Pseudo-Randomness, … Math: Dynamical Systems, Combinatorics, Mathematical Finance, Number Theory, Ramsey Theory, Algebra, Measure Theory, …

  11. Hereditary Discrepancy

  12. Rounding. Lovasz-Spencer-Vesztergombi ’86: given any matrix $A$ and $x \in R^n$, one can round $x$ to $\tilde{x} \in Z^n$ such that $|Ax - A\tilde{x}|_\infty < \mathrm{Herdisc}(A)$. Proof: round the bits one by one.
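
"Round the bits one by one" can be sketched as follows, under stated assumptions: $x \in [0,1]^n$ is given with $k$ binary digits as integer numerators over $2^k$, and the low-discrepancy coloring at each step is found by brute force (the existence argument does not say how to find it). At bit $j$, the coordinates whose $j$-th bit is set are colored $\pm 1$ and shifted by $\pm 2^j$, which clears that bit. All names here are hypothetical.

```python
from fractions import Fraction
from itertools import product


def best_coloring(A, active):
    """Brute-force +-1 coloring of the columns in `active` minimizing max row sum."""
    best_val, best_signs = None, None
    for signs in product((-1, 1), repeat=len(active)):
        val = max(abs(sum(row[i] * s for i, s in zip(active, signs))) for row in A)
        if best_val is None or val < best_val:
            best_val, best_signs = val, signs
    return best_val, dict(zip(active, best_signs))


def round_bit_by_bit(A, numerators, k):
    """Round x = numerators / 2**k (entries in [0, 1]) to a 0/1 vector.

    Returns (rounded, bound), where bound = sum_j 2^(j-k) * disc_j is an
    upper bound on |Ax - A x_rounded|_inf by the triangle inequality.
    """
    X = list(numerators)
    bound = Fraction(0)
    for j in range(k):
        active = [i for i, v in enumerate(X) if (v >> j) & 1]
        if not active:
            continue
        disc, chi = best_coloring(A, active)
        for i in active:
            X[i] += chi[i] * (1 << j)  # clears bit j, may carry upward
        bound += Fraction(disc, 1 << (k - j))
    return [v >> k for v in X], bound


A = [[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 1, 1]]
x_num, k = [4, 4, 2, 6], 3  # x = (1/2, 1/2, 1/4, 3/4)
rounded, bound = round_bit_by_bit(A, x_num, k)
error = max(
    abs(sum(a * (Fraction(v, 1 << k) - r) for a, v, r in zip(row, x_num, rounded)))
    for row in A
)
print(rounded, error, bound)
```

Each step uses the discrepancy of a column-restricted matrix, so every `disc` is at most $\mathrm{Herdisc}(A)$, and the weights $2^{j-k}$ sum to less than 1; that is where the strict inequality on the slide comes from.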

  13. Can we find it efficiently? Nothing known until recently. Thm [B’10]: can efficiently round so that error $\leq O(\sqrt{\log m \log n}) \cdot \mathrm{Herdisc}(A)$.

  14. More rounding approaches. Bin packing. Refined further by Rothvoss (entropy rounding method).

  15. Dynamic Data Structures N points in a 2-d region. Weights update over time. Query: Given an axis-parallel rectangle R, determine the total weight on points in R. Preprocess: • Low query time • Low update time (upon weight change)

  16. Example. Line, four trade-offs: (a) Query = $O(n)$, Update = 1; (b) Query = 1, Update = $O(n^2)$; (c) Query = 2, Update = $O(n)$; (d) Query = $O(\log n)$, Update = $O(\log n)$. Recursively, can get the same for 2-d.
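
One standard way to reach the last trade-off on a line (logarithmic query and update) is a binary indexed (Fenwick) tree. The slide does not name the data structure, so this choice is an assumption made for illustration.

```python
class FenwickTree:
    """Weights on positions 0..n-1; point update and interval sum in O(log n)."""

    def __init__(self, n: int):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed internally

    def update(self, i: int, delta) -> None:
        """Add `delta` to the weight at position i."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # move to the next node covering position i

    def prefix_sum(self, i: int):
        """Total weight on positions 0..i-1."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total

    def range_sum(self, lo: int, hi: int):
        """Total weight on the half-open interval [lo, hi)."""
        return self.prefix_sum(hi) - self.prefix_sum(lo)


ft = FenwickTree(8)
for pos, w in enumerate([5, 1, 4, 2, 7, 3, 6, 8]):
    ft.update(pos, w)
print(ft.range_sum(2, 6))  # 4 + 2 + 7 + 3 = 16
```

In the language of the later "Sketch of idea" slide, each query reads $O(\log n)$ stored sums and each update touches $O(\log n)$ of them.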

  17. What about other objects? Queries: circles, arbitrary rectangles, aligned triangles. Turns out $t_q t_u \geq n^{1/2}/\log^2 n$. K. Green Larsen: $t_q t_u \geq \mathrm{disc}(S)^2/\log^2 n$.

  18. Sketch of idea. A good data structure implies a factorization $D = A P$, where $A$ is row-sparse (low query time) and $P$ is column-sparse (low update time).

  19. Outline again

  20. Basic Results

  21. Best Known Algorithm. Random: color each element $i$ independently as $x(i) = +1$ or $-1$ with probability ½ each. Thm: discrepancy $= O((n \log n)^{1/2})$. Pf: For each set, expect $O(n^{1/2})$ discrepancy. Standard tail bounds: $\Pr[\,|\sum_{i \in S} x(i)| \geq c\, n^{1/2}\,] \approx e^{-c^2}$. Union bound, choosing $c \approx (\log n)^{1/2}$. Analysis is tight: random actually incurs $\Omega((n \log n)^{1/2})$.
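
The random-coloring argument can be turned into a short Las Vegas sketch. The constants are an assumption chosen to make the union bound explicit: by Hoeffding's inequality, $\Pr[|\sum_{i\in S} x(i)| \geq t] \leq 2e^{-t^2/(2n)}$, so with $t = \sqrt{2n \ln(4m)}$ all $m$ sets are within $t$ with probability at least ½, and a few retries suffice. Function and parameter names are hypothetical.

```python
import math
import random


def random_coloring(sets, n, rng, max_tries=1000):
    """Sample +-1 colorings until every set has |sum| <= sqrt(2 n ln(4m)).

    Each try succeeds with probability >= 1/2 (Hoeffding + union bound).
    Returns (coloring, achieved discrepancy, bound).
    """
    m = len(sets)
    bound = math.sqrt(2 * n * math.log(4 * m))
    for _ in range(max_tries):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        disc = max(abs(sum(x[i] for i in s)) for s in sets)
        if disc <= bound:
            return x, disc, bound
    raise RuntimeError("no coloring within the bound found (extremely unlikely)")


rng = random.Random(0)
n = 50
sets = [rng.sample(range(n), 25) for _ in range(50)]  # hypothetical instance
x, disc, bound = random_coloring(sets, n, rng)
print(disc, round(bound, 2))
```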

  22. Better Colorings Exist! [Spencer 85] (Six standard deviations suffice): there always exists a coloring with discrepancy $\leq 6 n^{1/2}$. (In general, for arbitrary $m$, discrepancy $= O(n^{1/2} \log^{1/2}(m/n))$.) Tight: for $m = n$, cannot beat $0.5\, n^{1/2}$ (Hadamard matrix, “orthogonal” sets). Inherently non-constructive proof (pigeonhole principle on an exponentially large universe). Challenge: can we find it algorithmically? Certain algorithms do not work [Spencer]. Conjecture [Alon-Spencer]: may not be possible.
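
The Hadamard lower bound can be checked numerically on a small case. This sketch works with the $\pm 1$ matrix $H$ itself rather than the 0/1 set system derived from it (a simplification): since the rows are orthogonal, $|Hx|_2^2 = n^2$ for every $x \in \{-1,1\}^n$, so some row has $|(Hx)_j| \geq n^{1/2}$. The Sylvester construction and the function names are assumptions for illustration.

```python
import math
from itertools import product


def sylvester_hadamard(k: int):
    """2^k x 2^k Hadamard matrix via the Sylvester doubling construction."""
    H = [[1]]
    for _ in range(k):
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def min_inf_norm(H):
    """min over x in {-1,+1}^n of |Hx|_inf, by exhaustive search."""
    n = len(H)
    return min(
        max(abs(sum(h * x for h, x in zip(row, xs))) for row in H)
        for xs in product((-1, 1), repeat=n)
    )


H = sylvester_hadamard(3)  # n = 8
print(min_inf_norm(H), math.sqrt(len(H)))
```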

  23. Beck-Fiala Thm. $U = [1,…,n]$. Sets: $S_1, S_2, …, S_m$. (Figure: overlapping sets $S_1$, $S_2$, $S_3$, $S_4$.) Suppose each element lies in at most $t$ sets ($t \ll n$). [Beck-Fiala ’81]: discrepancy $\leq 2t - 1$ (elegant linear algebraic argument, algorithmic result). Beck-Fiala Conjecture: $O(t^{1/2})$ discrepancy possible. Other results: $O(t^{1/2} \log t \log n)$ [Beck]; $O(t^{1/2} \log n)$ [Srinivasan]; $O(t^{1/2} \log^{1/2} n)$ [Banaszczyk]. Non-constructive.

  24. Approximating Discrepancy. Question: if a set system has low discrepancy (say $\ll n^{1/2}$), can we find a good discrepancy coloring? [Charikar, Newman, Nikolov 11]: even 0 vs. $O(n^{1/2})$ is NP-hard. (Matousek): what if the system has low hereditary discrepancy? $\mathrm{herdisc}(U,S) = \max_{U' \subseteq U} \mathrm{disc}(U', S|_{U'})$. Robust measure of discrepancy (often the same as discrepancy). Widely used: TU set systems, geometry, … (Figure: elements 1, 2, …, n with copies 1’, 2’, …, n’, and sets $S_1, S_2, …$ with copies $S'_1, S'_2, …$.)
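
Why discrepancy alone is not robust, and hereditary discrepancy is, can be seen on a toy instance in the spirit of the slide's figure. The instance below is a hypothetical example: the three pairs on $\{1,2,3\}$, with every element duplicated and each set extended by the copies. Coloring each copy opposite to its original gives discrepancy 0, yet restricting to the originals recovers discrepancy 2. Both quantities are computed by brute force.

```python
from itertools import combinations, product


def disc(A, cols):
    """Discrepancy of A restricted to the columns in `cols` (brute force)."""
    if not cols:
        return 0
    return min(
        max(abs(sum(row[c] * s for c, s in zip(cols, signs))) for row in A)
        for signs in product((-1, 1), repeat=len(cols))
    )


def herdisc(A):
    """Max of disc over all column subsets (doubly exponential; tiny n only)."""
    n = len(A[0])
    return max(
        disc(A, cols) for r in range(n + 1) for cols in combinations(range(n), r)
    )


# Columns: 1, 2, 3, 1', 2', 3' (copies sit in the same sets as originals).
doubled = [
    [1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
]
print(disc(doubled, tuple(range(6))), herdisc(doubled))
```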

  25. Our Results. Thm 1: Can get Spencer’s bound constructively, that is, $O(n^{1/2})$ discrepancy for $m = n$ sets. Thm 2: If each element lies in at most $t$ sets, get a bound of $O(t^{1/2} \log n)$ constructively (Srinivasan’s bound). Thm 3: For any set system, can find a coloring with discrepancy $\leq O(\log(mn)) \cdot$ hereditary discrepancy. Other problems: constructive bounds (matching current best) for the k-permutation problem [Spencer, Srinivasan, Tetali], geometric problems, …

  26. Relaxations: LPs and SDPs. Not clear how to use. The linear program is useless: can color each element ½ red and ½ blue, so the discrepancy of each set is 0! SDPs (an LP on the inner products $v_i \cdot v_j$; cannot control the dimension of the $v$’s): $|\sum_{i \in S} v_i|^2 \leq n$ for all $S$, and $|v_i|^2 = 1$. Intended solution: $v_i = (+1,0,…,0)$ or $(-1,0,…,0)$. Trivially feasible: $v_i = e_i$ (all $v_i$’s orthogonal). Yet, SDPs will be a major tool.
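
As a one-line check of the "trivially feasible" remark, the standard basis vectors satisfy both SDP constraints because orthogonality makes the squared norm of a sum equal to the number of summands:

```latex
\Bigl|\sum_{i \in S} e_i\Bigr|^2
  = \sum_{i \in S} |e_i|^2 + \sum_{\substack{i, j \in S \\ i \neq j}} e_i \cdot e_j
  = |S| + 0
  \le n
  \quad \text{for all } S,
\qquad |e_i|^2 = 1 .
```

So a bound of $n$ on the squared norm (that is, $n^{1/2}$ on the discrepancy) gives no information on its own.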

  27. Punch line. The SDP is very helpful if “tighter” bounds are needed for some sets: $|\sum_{i \in S} v_i|^2 \leq 2n$; $|\sum_{i \in S'} v_i|^2 \leq n/\log n$ (tighter bound for $S'$); $|v_i|^2 \leq 1$. Not a priori clear why one can do this: entropy method. The algorithm will construct the coloring over time and use several SDPs in the process.

  28. Talk Outline. Introduction. The method: low hereditary discrepancy -> good coloring. Additional ideas: Spencer’s $O(n^{1/2})$ bound.

  29. Partial Coloring Method

  30. A Question. (Figure: a number line from $-n$ to $n$.)

  31. Slight improvement. Can be improved to $O(\sqrt{n})/2^n$. If you pick a random {-1,1} coloring $s$, then with probability, say, $\geq$ ½, $|a \cdot s| \leq c \sqrt{n}$. So there are at least $2^{n-1}$ colorings $s$ with $|a \cdot s| \leq c \sqrt{n}$.
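
The pigeonhole step behind these slides can be run on a small instance. This sketch assumes the setting is: given $a \in [-1,1]^n$, find a nonzero $s \in \{-1,0,1\}^n$ with $|a \cdot s|$ tiny (the statement of the question itself did not survive in the transcript). It uses the simpler range $[-n, n]$, which gives the bound $n/(2^n - 1)$ rather than the improved $O(\sqrt{n})/2^n$: the $2^n$ sums $a \cdot s'$ lie in an interval of length $2n$, so two of them are within $2n/(2^n - 1)$, and $s = (s' - s'')/2$ halves that gap.

```python
import random
from itertools import product


def small_partial_coloring(a):
    """Return (s, value): nonzero s in {-1,0,1}^n and value = a.s >= 0.

    Brute force over all 2^n sign vectors; picks the closest pair of sums.
    """
    n = len(a)
    sums = sorted(
        (sum(ai * si for ai, si in zip(a, s)), s)
        for s in product((-1, 1), repeat=n)
    )
    gap, lo, hi = min(
        (sums[k + 1][0] - sums[k][0], sums[k][1], sums[k + 1][1])
        for k in range(len(sums) - 1)
    )
    s = [(u - v) // 2 for u, v in zip(hi, lo)]  # entries in {-1, 0, 1}
    return s, gap / 2


rng = random.Random(0)
a = [rng.uniform(-1, 1) for _ in range(10)]  # hypothetical input
s, value = small_partial_coloring(a)
print(s, value, 10 / (2**10 - 1))
```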

  32. Algorithmically. Easy: $1/\mathrm{poly}(n)$ (How?) Answer: pick any $\mathrm{poly}(n)$ colorings. [Karmarkar-Karp ’81]: $\approx 1/n^{\log n}$. Huge gap: major open question. Remark: {-1,+1} is not enough; really need the color 0 also. E.g. $a_1 = 1$, $a_2 = … = a_n = 1/(2n)$.
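
The heuristic usually associated with the Karmarkar-Karp citation is largest differencing: repeatedly replace the two largest numbers by their difference, which commits them to opposite sides of a $\{-1,+1\}$ split. The slide gives only the bound, so presenting this particular procedure is an assumption; the sketch returns the final difference $|a \cdot s|$ and does not reconstruct $s$.

```python
import heapq


def largest_differencing(values):
    """Karmarkar-Karp style differencing on non-negative numbers.

    Returns |a . s| for the +-1 split s implied by the differencing steps.
    """
    heap = [-v for v in values]  # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        largest = -heapq.heappop(heap)
        second = -heapq.heappop(heap)
        heapq.heappush(heap, -(largest - second))
    return -heap[0] if heap else 0


print(largest_differencing([8, 7, 6, 5, 4]))  # differencing gives 2; the optimum is 0
```

The example shows the heuristic is not optimal: $\{8, 7\}$ versus $\{6, 5, 4\}$ balances exactly.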

  33. Yet another enhancement. There is a {-1,0,1} coloring $s$ with at least $n/2$ entries in {-1,1} such that $|\sum_i a_i s_i| \leq n/2^{n/5}$. Make buckets of size $2n/2^{n/5}$. At least $2^{4n/5}$ sums fall in the same bucket. Claim: some two colorings $s'$ and $s''$ in that bucket differ in at least $n/2$ coordinates. Again consider $s = (s' - s'')/2$.

  34. Proof of Claim. Claim: any set of $2^{4n/5}$ vertices of the Boolean cube has two vertices that differ in at least $n/2$ coordinates. [Kleitman ’66] Isoperimetry for the cube: the Hamming ball $B(v,r)$ has the smallest diameter for a given number of vertices. $|B(v,n/4)| < 2^{4n/5}$.

  35. Spencer’s proof

  36. Our Approach

  37. Algorithm (at high level). Cube: $\{-1,+1\}^n$. Each dimension: an element. Each vertex: a coloring. (Figure: a path through the cube from start to finish.) Algorithm: “sticky” random walk. Each step is generated by rounding a suitable SDP. Moves in various dimensions are correlated, e.g. the step-$t$ increments satisfy $\eta^t_1 + \eta^t_2 \approx 0$. Analysis: few steps to reach a vertex (walk has high variance); $\mathrm{Disc}(S_i)$ does a random walk (with low variance).

  38. An SDP. Hereditary disc. $\lambda$ $\Rightarrow$ the following SDP is feasible. SDP: low discrepancy, $|\sum_{i \in S_j} v_i|^2 \leq \lambda^2$; $|v_i|^2 = 1$. Obtain $v_i \in R^n$. Rounding: pick a random Gaussian $g = (g_1, g_2, …, g_n)$, where each coordinate $g_i$ is iid $N(0,1)$. For each $i$, consider $\eta_i = g \cdot v_i$.

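  39. Properties of Rounding. Lemma: If $g \in R^n$ is a random Gaussian, then for any $v \in R^n$, $g \cdot v$ is distributed as $N(0, |v|^2)$. Pf: $N(0,a^2) + N(0,b^2) = N(0,a^2+b^2)$, so $g \cdot v = \sum_i v(i) g_i \sim N(0, \sum_i v(i)^2)$. Recall: $\eta_i = g \cdot v_i$, with SDP constraints $|v_i|^2 = 1$ and $|\sum_{i \in S} v_i|^2 \leq \lambda^2$. • Each $\eta_i \sim N(0,1)$. • For each set $S$, $\sum_{i \in S} \eta_i = g \cdot (\sum_{i \in S} v_i) \sim N(0, \leq \lambda^2)$ (std deviation $\leq \lambda$). The $\eta$’s mimic a low discrepancy coloring (but are not {-1,+1}).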
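
The lemma is easy to sanity-check empirically. The following sketch samples $g \cdot v$ for a fixed vector and compares the sample variance with $|v|^2$; the vector, seed, and sample size are arbitrary illustrative choices.

```python
import random


def projection_samples(v, trials, rng):
    """Draw `trials` samples of g . v with g a fresh iid N(0,1) vector each time."""
    return [sum(rng.gauss(0.0, 1.0) * vi for vi in v) for _ in range(trials)]


def sample_variance(samples):
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)


rng = random.Random(1)
v = [3.0, 4.0]  # |v|^2 = 25
samples = projection_samples(v, 20000, rng)
print(sample_variance(samples))  # should be close to 25
```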

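  40. Algorithm Overview. Construct the coloring iteratively. (Figure: the color of one element plotted against time, between -1 and +1.) Initially: start with coloring $x_0 = (0,0,0,…,0)$ at $t = 0$. At time $t$: update the coloring as $x_t = x_{t-1} + \gamma (\eta^t_1, …, \eta^t_n)$ ($\gamma$ tiny: $1/n$ suffices), so $x_t(i) = \gamma (\eta^1_i + \eta^2_i + … + \eta^t_i)$. Color of element $i$: does a random walk over time with step size $\approx \gamma N(0,1)$; $x(i)$ is fixed if it reaches -1 or +1. Set $S$: $x_t(S) = \sum_{i \in S} x_t(i)$ does a random walk with step $\gamma N(0, \leq \lambda^2)$.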
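
The mechanics of the sticky walk (tiny Gaussian steps, coordinates frozen once they hit $\pm 1$) can be sketched without an SDP solver. Loud assumption: this sketch substitutes the trivially feasible vectors $v_i = e_i$, so the increments $\eta_i = g_i$ are independent and no discrepancy guarantee is obtained; in the algorithm described in the talk, the $v_i$ come from an SDP re-solved on the unfixed variables. Names and parameters are hypothetical.

```python
import random


def sticky_walk(n, gamma, rng, max_steps=100_000):
    """Walk x_t = x_{t-1} + gamma * eta_t per coordinate; freeze at -1 or +1.

    Assumption: eta_t(i) are independent N(0,1) (the v_i = e_i stand-in),
    not the correlated increments produced by rounding an SDP solution.
    Returns (final coloring, number of steps taken).
    """
    x = [0.0] * n
    alive = set(range(n))
    steps = 0
    while alive and steps < max_steps:
        for i in list(alive):
            x[i] += gamma * rng.gauss(0.0, 1.0)
            if abs(x[i]) >= 1.0:
                x[i] = 1.0 if x[i] > 0 else -1.0  # sticky: never moves again
                alive.discard(i)
        steps += 1
    return x, steps


rng = random.Random(2)
coloring, steps = sticky_walk(n=10, gamma=0.1, rng=rng)
print(coloring, steps)  # steps is typically on the order of 1/gamma^2
```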

  41. Analysis (step scale $\gamma$, hereditary discrepancy $\lambda$). Consider time $T = O(1/\gamma^2)$. Claim 1: With probability ½, at least $n/2$ elements reach -1 or +1. Pf: Each element does a random walk with step size $\approx \gamma$. Recall: a random walk with step 1 is $\approx O(t^{1/2})$ away after $t$ steps. A trouble: the updates of different elements are correlated. Consider the basic walk $x(t+1) = x(t) \pm 1$, each with probability ½. Define the energy $\Phi(t) = x(t)^2$. Then $E[\Phi(t+1)] = \tfrac12 (x(t)+1)^2 + \tfrac12 (x(t)-1)^2 = x(t)^2 + 1 = \Phi(t) + 1$. Expected energy $= n$ at $t = n$. Claim 2: Each set has $O(\lambda)$ discrepancy in expectation. Pf: For each $S$, $x_t(S)$ does a random walk with step size $\approx \gamma\lambda$.

  42. Analysis (step scale $\gamma$, hereditary discrepancy $\lambda$). Consider time $T = O(1/\gamma^2)$. Claim 1: With probability ½, at least $n/2$ variables reach -1 or +1. $\Rightarrow$ Everything is colored in $O(\log n)$ rounds. Claim 2: Each set has $O(\lambda)$ discrepancy in expectation per round. $\Rightarrow$ Expected discrepancy of a set at the end $= O(\lambda \log n)$. Thm: Obtain a coloring with discrepancy $O(\lambda \log(mn))$. Pf: By Chernoff, the probability that $\mathrm{disc}(S) \geq 2 \cdot \text{Expectation} + O(\lambda \log m) = O(\lambda \log(mn))$ is tiny ($\mathrm{poly}(1/m)$).

  43. Recap At each step of walk, formulate SDP on unfixed variables. Use some (existential) property to argue SDP is feasible Rounding SDP solution -> Step of walk Properties of walk: High Variance -> Quick convergence Low variance for discrepancy on sets -> Low discrepancy

  44. Refinements. Spencer’s six std deviations result. Goal: obtain $O(n^{1/2})$ discrepancy for any set system on $m = O(n)$ sets. A random coloring has $n^{1/2} (\log n)^{1/2}$ discrepancy. The previous approach seems useless: the expected discrepancy for a set is $O(n^{1/2})$, but some random walks will deviate by up to a $(\log n)^{1/2}$ factor. Need an additional idea to prevent this.

  45. Spencer’s $O(n^{1/2})$ result. Partial Coloring Lemma: for any system with $m$ sets, there exists a coloring on $\geq n/2$ elements with discrepancy $O(n^{1/2} \log^{1/2}(2m/n))$ [for $m = n$, disc $= O(n^{1/2})$]. Algorithm for total coloring: repeatedly apply the partial coloring lemma. Total discrepancy: $O(n^{1/2} \log^{1/2} 2)$ [Phase 1] $+\ O((n/2)^{1/2} \log^{1/2} 4)$ [Phase 2] $+\ O((n/4)^{1/2} \log^{1/2} 8)$ [Phase 3] $+ … = O(n^{1/2})$.
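
The phase sum telescopes into a convergent series. Writing phase $k+1$ as acting on $n/2^k$ remaining elements with $m = n$ sets (so $2m/(n/2^k) = 2^{k+1}$), and taking logarithms base 2 for convenience:

```latex
\sum_{k \ge 0} O\!\left( \Bigl(\frac{n}{2^k}\Bigr)^{1/2} \log^{1/2}\!\bigl(2^{k+1}\bigr) \right)
  = O\bigl(n^{1/2}\bigr) \sum_{k \ge 0} \sqrt{\frac{k+1}{2^k}}
  = O\bigl(n^{1/2}\bigr),
```

since the ratio of consecutive terms of $\sum_k \sqrt{(k+1)/2^k}$ tends to $1/\sqrt{2} < 1$, so the series converges to a constant.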

  46. Proving Partial Coloring Lemma. Beautiful counting argument (entropy method + pigeonhole). Idea: too many colorings ($2^n$), but few “discrepancy profiles”. Key Lemma: There exist $k = 2^{4n/5}$ colorings $X_1, …, X_k$ such that every two $X_i, X_j$ are “similar” for every set $S_1, …, S_n$. Some $X_1, X_2$ differ on $\geq n/2$ positions. Consider $X = (X_1 - X_2)/2$. Example: $X_1 = (1,-1,1,…,1,-1,-1)$, $X_2 = (-1,-1,-1,…,1,1,1)$, $X = (1,0,1,…,0,-1,-1)$. Pf: $X(S) = (X_1(S) - X_2(S))/2 \in [-10 n^{1/2}, 10 n^{1/2}]$.
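
The half-difference step is small enough to spell out. This sketch uses the six visible entries of the slide's example vectors (the elided middle is simply dropped, which is an assumption for illustration) and checks the identity $X(S) = (X_1(S) - X_2(S))/2$ on a hypothetical set $S$.

```python
def half_difference(x1, x2):
    """Partial coloring (x1 - x2) / 2: 0 where they agree, x1's color where they differ."""
    return [(a - b) // 2 for a, b in zip(x1, x2)]


def set_sum(x, S):
    """x(S): sum of the colors of the elements (0-indexed) in S."""
    return sum(x[i] for i in S)


X1 = [1, -1, 1, 1, -1, -1]
X2 = [-1, -1, -1, 1, 1, 1]
X = half_difference(X1, X2)
S = [0, 2, 4]  # hypothetical set
print(X, set_sum(X, S))
```

If $X_1$ and $X_2$ have nearly the same sum on every set, the partial coloring $X$ has small discrepancy on every set, and it is nonzero exactly where the two colorings differ.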

  47. A useful generalization. There exists a partial coloring with a non-uniform discrepancy bound $\Delta_S$ for each set $S$, even if $\Delta_S$ is of order $n^{1/2}$ only in some average sense.

  48. An SDP. Suppose there exists a partial coloring $X$: 1. on $\geq n/2$ elements; 2. each set $S$ has $|X(S)| \leq \Delta_S$. SDP: low discrepancy, $|\sum_{i \in S_j} v_i|^2 \leq \Delta_{S_j}^2$; many colors, $\sum_i |v_i|^2 \geq n/2$; $|v_i|^2 \leq 1$. Obtain $v_i \in R^n$. Pick a random Gaussian $g = (g_1, g_2, …, g_n)$, where each coordinate $g_i$ is iid $N(0,1)$. For each $i$, consider $\eta_i = g \cdot v_i$.

  49. Algorithm. Initially write the SDP with $\Delta_S = c\, n^{1/2}$. Each set $S$ does a random walk and expects to reach discrepancy of $O(\Delta_S) = O(n^{1/2})$. Some sets will become problematic: reduce their $\Delta_S$ on the fly. Not many problematic sets, and the entropy penalty is low. (Figure: a discrepancy axis marked 0, $20 n^{1/2}$, $30 n^{1/2}$, $35 n^{1/2}$, …, with zones labeled Danger 1, Danger 2, Danger 3, ….)

  50. Concluding Remarks. Construct the coloring over time by solving a sequence of SDPs (guided by existence results). Works quite generally. Can be derandomized [Bansal-Spencer] (use the entropy method itself for derandomizing, plus the usual techniques). E.g. deterministic six standard deviations can be viewed as a way to derandomize something stronger than Chernoff bounds.
