Solving SDD Linear Systems in Nearly m log^{1/2} n Time Richard Peng M.I.T. A&C Seminar, Sep 12, 2014
Outline • The Problem • Approach 1 • Approach 2 • Combining these
Large Graphs Images Roads Social networks Meshes • Algorithmic challenges: • How to store? • How to analyze? • How to optimize?
Laplacian paradigm Graph ↔ electrical network ↔ linear systems (running example: a triangle with resistances 1Ω, 1Ω, 2Ω)
Graph Laplacian • Row/column ↔ vertex • Off-diagonal entry ↔ −weight of the edge • Diagonal entry ↔ weighted degree • Properties: • Symmetric • Row/column sums = 0 • Non-positive off-diagonals Electrical network / weighted graph (edges 1Ω, 1Ω, 2Ω)
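The construction above is mechanical enough to sketch in code. A minimal example (assuming numpy; the `laplacian` helper and edge list are illustrative, with edge weight = conductance, using the running example's weights 1, 1, 2):

```python
import numpy as np

def laplacian(n, edges):
    """Build a graph Laplacian from (u, v, weight) triples (weight = conductance)."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w              # diagonal: weighted degree
        L[v, v] += w
        L[u, v] -= w              # off-diagonal: -weight
        L[v, u] -= w
    return L

# Triangle from the running example, weights 1, 1, 2
L = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])
assert np.allclose(L, L.T)                     # symmetric
assert np.allclose(L.sum(axis=1), 0)           # row/column sums = 0
assert (L - np.diag(np.diag(L)) <= 0).all()    # non-positive off-diagonals
```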
Core Routine Lx = b Input: graph Laplacian L, vector b Output: vector x s.t. Lx ≈ b • To measure performance: • n: dimension, # of vertices • m: nnz(L), O(# of edges)
Applications Elliptic PDEs Directly related: SDD, M matrices Lx=b Few iterations: eigenvectors, heat kernels Clustering Inference Image processing Many iterations: Combinatorial graph problems Random trees Flow / matching
General Linear System Solves Ax = b • Oldest studied algorithmic problem • [Liu 179] … [Newton `1710] [Gauss 1810]: O(n^3) • [HS `52]: conjugate gradient, O(nm)* • [Strassen `69]: O(n^2.8) • [Coppersmith-Winograd `90]: O(n^2.3755) • [Stothers `10]: O(n^2.3737) • [Vassilevska Williams `11]: O(n^2.3727)
Combinatorial Preconditioning Lx = b • Use the connection to graphs to design numerical solvers • [Vaidya `91]: O(m^{7/4}) • [Boman-Hendrickson `01]: O(mn) • [Spielman-Teng `03]: O(m^1.31) • [Spielman-Teng `04]: O(m log^c n) • Subsequent improvements: [Elkin-Emek-Spielman-Teng `05], [Andersen-Chung-Lang `06], [Koutis-Miller `07], [Spielman-Srivastava `08], [Abraham-Bartal-Neiman `08], [Andersen-Perez `09], [Batson-Spielman-Srivastava `09], [Kolla-Makarychev-Saberi-Teng `10], [Koutis-Miller-P `10, `11], [Orecchia-Vishnoi `11], [Abraham-Neiman `12], [Oveis Gharan-Trevisan `12], …
Zeno’s Dichotomy Paradox O(m log^c n) Fundamental theorem of Laplacian solvers: each improvement decreases c by a factor between 2 and 3 • 2004: c = 70 • 2006: 32 • 2009: 15 • 2010: 6 • 2010: 2 • 2011: 1 • 2014: 1/2 • OPT: 0? [Miller]: instance of a speedup theorem?
Outline • The Problem • Approach 1 • Approach 2 • Combining these
What is a Solve? x: voltages at vertices. What is b = Lx? Flows on edges: at the vertex with neighbors of weight 1 and 2, 3x_3 − x_2 − 2x_1 = (x_3 − x_2) + 2(x_3 − x_1). Ohm’s law: current = voltage difference × conductance. b: residue of electrical current.
What is a Solve? Lx = b: find voltages x whose flow meets demand b. Intermediate object: flows on edges, f. [Kirchhoff 1845, Doyle-Snell `84]: the electrical flow f minimizes energy dissipation, Energy(f) = Σ_e r(e) f(e)^2.
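The energy-minimization claim can be checked numerically on the running example. A hedged sketch (numpy's pseudoinverse stands in for a real solver; the demand vector and the alternative flow are illustrative choices):

```python
import numpy as np

edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]   # (u, v, conductance)
L = np.zeros((3, 3))
for u, v, w in edges:
    L[u, u] += w; L[v, v] += w
    L[u, v] -= w; L[v, u] -= w

b = np.array([1.0, 0.0, -1.0])       # inject 1 unit at vertex 0, extract at 2
x = np.linalg.pinv(L) @ b            # voltages (L is singular, so use pinv)

# Electrical flow induced by the voltages, and its energy Σ_e r(e) f(e)^2
f = np.array([w * (x[u] - x[v]) for u, v, w in edges])
energy_elec = sum(fe**2 / w for fe, (_, _, w) in zip(f, edges))

# Any other flow meeting demand b dissipates at least as much energy,
# e.g. routing the whole unit along the path 0 -> 1 -> 2:
f_path = [1.0, 1.0, 0.0]
energy_path = sum(fe**2 / w for fe, (_, _, w) in zip(f_path, edges))
assert energy_elec <= energy_path
```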
What Makes Solves Hard Densely connected graphs, need: • sampling • error reduction Long paths, need: • Divide-and-conquer • Data structures Solvers need to combine both
Kelner-Orecchia-Sidford-Zhu `13: Electrical Flow Solver Start with some flow f meeting demand b. Push flows along cycles in the graph to reduce energy. [KOSZ `13]: the energy of f approaches the optimum. Algorithmic challenges: How many cycles? Ω(m). How big is each cycle? Ω(n).
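One cycle update can be made concrete. A sketch on a toy instance (helper names are ours, not from [KOSZ `13]): pushing Δ = −(Σ_{e∈C} r(e)f(e)) / (Σ_{e∈C} r(e)) around a cycle C minimizes energy over that cycle, and conservation is preserved because a cycle has no net source or sink.

```python
def energy(f, r):
    return sum(re * fe * fe for fe, re in zip(f, r))

def cycle_update(f, r, cycle):
    """cycle: list of (edge index, orientation ±1) around the cycle."""
    drop = sum(s * r[i] * f[i] for i, s in cycle)      # Σ r·f along the cycle
    delta = -drop / sum(r[i] for i, _ in cycle)        # optimal push amount
    for i, s in cycle:
        f[i] += s * delta
    return f

# Triangle with edges 0:(a,b), 1:(b,c), 2:(a,c); resistances r
r = [1.0, 1.0, 0.5]
f = [1.0, 1.0, 0.0]                                   # route everything via a->b->c
before = energy(f, r)
f = cycle_update(f, r, [(0, +1), (1, +1), (2, -1)])   # traverse a->b->c->a
after = energy(f, r)
assert after < before
```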
How To Find Cycles? Cycle = tree + off-tree edge: • Pick a tree • Sample off-tree edges
How to Sample Edges [KMP `10, `11, KOSZ `13]: sample edges w.p. proportional to stretch • Stretch = (length of the tree path between the edge’s endpoints) / (length of the edge) • Unweighted graph: stretch = length of the tree path (figure examples: stretch = 4, stretch = 3) Key quantity: total stretch, S [KOSZ `13]: O(m + S) cycle updates halve the error in expectation
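Stretch and the induced sampling distribution are easy to compute on a toy tree. A sketch (unweighted graph; the parent/depth representation and the `tree_path_len` helper are illustrative):

```python
import random

def tree_path_len(parent, depth, u, v):
    """Length of the unique tree path between u and v (walk up to the LCA)."""
    d = 0
    while depth[u] > depth[v]:
        u, d = parent[u], d + 1
    while depth[v] > depth[u]:
        v, d = parent[v], d + 1
    while u != v:
        u, v, d = parent[u], parent[v], d + 2
    return d

# Path tree 0-1-2-3 plus off-tree edges (0,2) and (0,3)
parent = {0: 0, 1: 0, 2: 1, 3: 2}
depth = {0: 0, 1: 1, 2: 2, 3: 3}
off_tree = [(0, 2), (0, 3)]

stretch = [tree_path_len(parent, depth, u, v) for u, v in off_tree]
S = sum(stretch)                                   # total stretch
e = random.choices(off_tree, weights=stretch)[0]   # sampled cycle = e + tree path
```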
What’s a Low Stretch Tree? n^{1/2}-by-n^{1/2} unit weighted mesh • Candidate 1, ‘haircomb’ (e.g. shortest path tree, max weight spanning tree): most edges have stretch(e) = O(1), but the edges crossing between teeth have stretch Ω(n^{1/2}), so total stretch = Ω(n^{3/2}) • Candidate 2, recursive C ‘fractal’: the top-level cut edges have stretch(e) = Ω(n^{1/2}), but there are only n^{1/2} such edges, so their contribution is O(n); with O(log n) such layers due to the fractal, total = O(n log n)
Low Stretch Trees [KOSZ `13]: embeddable trees are ok! Bartal / FRT trees have Steiner vertices. [CKMPX `14]: construction in O(m log log n) time; will assume S = O(m log n) (actual bounds are more intricate)
[KOSZ `13] in a nutshell: O(m + S) = O(m log n) cycle updates, each O(log n) using data structures, for a total of O(m log^2 n)
Outline • The Problem • Approach 1 • Approach 2 • Combining these
Numerical View of Solvers Iterative methods: given H similar to G, solve L_G x = b by solving several L_H y = r. Similar: L_G ≼ L_H ≼ k·L_G, where ≼ is the Loewner ordering: A ≼ B iff x^T A x ≤ x^T B x for all x. Chebyshev iteration: if L_G ≼ L_H ≼ k·L_G, can halve the error in L_G x = b by solving O(k^{1/2}) problems in H to O(1/poly(k)) accuracy
Numerical Views of Solvers Chebyshev iteration (ignoring errors): if L_G ≼ L_H ≼ k·L_G, can solve a problem in G by solving O(k^{1/2}) problems in H • Preconditioner construction view: given G, find H that’s easier to solve such that L_G ≼ L_H ≼ k·L_G • Perturbation stability view: can adjust edge weights by a factor in [1, k], then need to solve O(k^{1/2}) problems on the resulting graph
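The preconditioner view can be sketched numerically. An illustrative toy (a dense pseudoinverse stands in for a fast L_H solver; Chebyshev acceleration, which would cut the iteration count to O(k^{1/2}), is replaced by plain iterative refinement for brevity):

```python
import numpy as np

def lap(n, edges):
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

G = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]
H = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 6.0)]   # scale one edge up: L_G ≼ L_H ≼ 3·L_G
LG, LH = lap(3, G), lap(3, H)
LH_pinv = np.linalg.pinv(LH)                  # stand-in for a fast solver for H

b = np.array([1.0, 0.0, -1.0])
x = np.zeros(3)
for _ in range(60):                           # refinement: x <- x + L_H^+ (b - L_G x)
    x = x + LH_pinv @ (b - LG @ x)

assert np.allclose(LG @ x, b, atol=1e-8)      # converges since L_G ≼ L_H ≼ 3·L_G
```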
Generating H [KOSZ `13]: sampling 1 edge reduces the error by a factor of 1 − 1/(m + S); O(m + S) samples halve the error. Matrix Chernoff: O(S log n) samples give L_G ≼ 2L_H ≼ 3L_G, so we can halve the error in L_G x = b via O(1) solves in H
How does this Help? Going from G to H gives m’ = O(S log n) off-tree edges; with S = O(m log n), m’ > m — no savings yet. • [KMP `10]: take the perturbation stability view, scale up the tree by some factor k • factor k distortion → O(k^{1/2}) iterations • S’ = S/k: total stretch decreases by a factor of k
Size Reduction • Scale up the tree by a factor of k so S’ = S/k • Sample m’ = O(S’ log n) = O(S log n / k) off-tree edges • n − 1 tree edges + m’ off-tree edges • Repeatedly remove degree 1 and 2 vertices • New size: O(S log n / k) T(m, S) = O(k^{1/2})(m + T(S log n / k, S/k)) (can show: errors don’t accumulate in the recursion)
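The "remove degree 1 and 2 vertices" step is simple graph surgery. A sketch under stated assumptions (resistances on edges, a hypothetical `terminals` set marking vertices that must survive; a real implementation also back-substitutes the eliminated voltages):

```python
from collections import defaultdict

def reduce_graph(edges, terminals):
    """edges: {(u, v): resistance} with u < v; terminal vertices are kept."""
    changed = True
    while changed:
        changed = False
        adj = defaultdict(list)
        for e in edges:
            adj[e[0]].append(e)
            adj[e[1]].append(e)
        for x, inc in adj.items():
            if x in terminals:
                continue
            if len(inc) == 1:                  # degree 1: eliminate the vertex
                del edges[inc[0]]
            elif len(inc) == 2:                # degree 2: series merge r1 + r2
                e1, e2 = inc
                r = edges.pop(e1) + edges.pop(e2)
                u = e1[0] if e1[0] != x else e1[1]
                v = e2[0] if e2[0] != x else e2[1]
                key = (min(u, v), max(u, v))
                if key in edges:               # resulting parallel edges combine
                    edges[key] = 1.0 / (1.0 / edges[key] + 1.0 / r)
                else:
                    edges[key] = r
            else:
                continue
            changed = True
            break                              # degrees changed; rebuild adjacency
    return edges

# Tree path 0-1-2-3 (resistance 1 each) plus off-tree edge (0, 3)
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (0, 3): 1.0}
reduced = reduce_graph(edges, terminals={0, 3})
assert abs(reduced[(0, 3)] - 0.75) < 1e-12 and len(reduced) == 1
```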
Two Term Recurrence? W-cycle algorithm: T(m, S) = O(k^{1/2})(m + T(S log n / k, S/k)). These are really the same parameter! Key invariant from [KMP `11], ‘spine-heavy’: S = m/O(log n)
Time on Spine Heavy Graphs T(m) = O(k^{1/2})(m + T(m/k)) = O(m) Low stretch spanning tree: S_ini = m log n. Initial scaling by k = O(log^2 n) → O(log n) overhead. More important runtime parameter: how small does S need to be to get an O(m) runtime?
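The recurrence can be checked numerically with hypothetical constants: T(m) = k^{1/2}(m + T(m/k)) unrolls to a geometric series with ratio k^{-1/2}, so T(m)/m converges to a constant.

```python
def T(m, k):
    """Hypothetical cost recurrence T(m) = k^(1/2) * (m + T(m/k))."""
    if m <= 1:
        return 1.0
    return k**0.5 * (m + T(m / k, k))

k = 16
# Limit of T(m)/m is k^(1/2) / (1 - k^(-1/2)) = 16/3 for k = 16
ratios = [T(10.0**p, k) / 10.0**p for p in (3, 6, 9)]
assert max(ratios) - min(ratios) < 0.1   # ratio stabilizes: T(m) = O(m)
```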
Numerical Solver *All updates are matrix-vector multiplications. Byproduct of this view: solving to O(log^c n) error is ‘easy’; expected convergence becomes w.h.p. via checking, with O((log log n)^c) overhead
Commonality: Randomness Reason: need to handle complete graph, easiest expander constructions are randomized Stretch is the ‘right’ parameter when we use trees due to the Sherman-Morrison formula
Numerical vs. Combinatorial Numerical: • + sublinear sample count • + recursive, O(1) per update • − L_G ≼ L_H ≼ k·L_G costs O(log n) overhead • + overhead (S/m)^{1/2} • − recursive error propagation Combinatorial: • − Ω(m) samples • − data structure, O(log n) per update • + adaptive convergence • − linear dependence on S • + simple error propagation
Combine these? Can the adaptive convergence guarantees work with recursive Chebyshev? Consequence: T(m, S) = O(k^{1/2})(m + T(S/k, S/k))
Outline • The Problem • Approach 1 • Approach 2 • Combining these
Direct Modification Use Chebyshev iteration outside of [KOSZ `13]? O(k^{1/2}) iterations → O(m log^{3/2} n), within (log log n)^c of [LS `13], which also uses Chebyshev-like methods. Our analyses do make this rigorous
Challenges • 1: Multiple updates, instead of a few • 2: Flows vs. voltages • 3: Remove data structure overhead
Batched Cycle Updates • Each cycle update lowers the flow energy • Applying them one after another is still a valid flow update • Updating them together can only be better! But this only gives flows, and Chebyshev iteration works with voltages
Multi-Edge Sampling The expected error reduction from [KOSZ `13] extends to multiple edges, in voltage space • Major modifications: • voltages instead of flows • sublinear # of samples: O(m + S) → O(S); the O(m) term goes into the iterative method • only prove this for iterative refinement: reduce to black-box Chebyshev — simpler analysis, at the cost of (log log n)^c
What Happened to the Data Structure? Path updates/queries in a tree in O(log n) time • Amortized into: • Vertex elimination • Recursion
Getting Rid of Log Recursive Chebyshev: T(m) = k^{1/2}(m + T(m/k)) • Call structure is ‘bottom light’: total size of the bottom level calls is o(m) • Cost is dominated by the top level size, O(m), instead of being the same at all O(log n) levels
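The ‘bottom light’ claim is quick to verify with illustrative constants: level i of the W-cycle call tree holds k^{i/2} calls of size m/k^i, so the total size per level decays geometrically and the top level dominates.

```python
def level_sizes(m, k, min_size=1.0):
    """Total problem size at each recursion level of the W-cycle."""
    sizes, calls, size = [], 1.0, float(m)
    while size >= min_size:
        sizes.append(calls * size)   # k^(i/2) calls of size m / k^i
        calls *= k ** 0.5            # each call spawns k^(1/2) subcalls
        size /= k
    return sizes

sizes = level_sizes(10**6, k=16)
assert all(sizes[i + 1] == sizes[i] / 4 for i in range(len(sizes) - 1))
assert sum(sizes) < 2 * sizes[0]     # top level dominates: total work O(m)
```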
m log^{1/2} n • Expected convergence as shown in [KOSZ `13]: • is a general phenomenon • interacts well with numerical algorithms in recursive settings
Hidden Under the Rug • Error propagation in the recursion • Vector-based analysis of Chebyshev iteration (instead of operators) • Numerical stability analysis • Construction of better trees by discounting higher stretch
Open Problems • loglogn factors: how many are real? • Extend this analysis? • Numerical stability: double precision ok? • O(m) time algorithm? (maybe with O(mlogn) pre-processing?) • Other notions of stretch? • Beyond Laplacians • Code packages 2.0.
Thank You! Questions?