290 likes | 519 Views
Support-Graph Preconditioning. John R. Gilbert MIT and U. C. Santa Barbara coauthors: Marshall Bern, Bruce Hendrickson, Nhat Nguyen, Sivan Toledo authors of other work surveyed: Erik Boman, Doron Chen, Keith Gremban, Bruce Hendrickson, Gary Miller, Sivan Toledo, Pravin Vaidya, Marco Zagha.
E N D
Support-Graph Preconditioning John R. Gilbert MIT and U. C. Santa Barbara coauthors: Marshall Bern, Bruce Hendrickson, Nhat Nguyen, Sivan Toledo authors of other work surveyed: Erik Boman, Doron Chen, Keith Gremban, Bruce Hendrickson, Gary Miller, Sivan Toledo, Pravin Vaidya, Marco Zagha
Systems of Linear Equations: Ax = b • A is large & sparse • say n = 105 to 108, # nonzeros = O(n) • Physical setting sometimes implies G(A) has separators • O(n1/2) in 2D, O(n2/3) in 3D • Here: A is symmetric and positive (semi)definite • Direct methods: A=LU • Iterative methods:y(k+1) = Ay(k)
3 7 1 3 7 1 6 8 6 8 4 10 4 10 9 2 9 2 5 5 Graphs and Sparse Matrices[Parter, … ] Fill:new nonzeros in factor Symmetric Gaussian elimination: for j = 1 to n add edges between j’s higher-numbered neighbors G+(A)[chordal] G(A)
Fill-reducing matrix permutations • Theory: approx optimal separators => approx optimal fill and flop count • Orderings: nested dissection, minimum degree, hybrids • Graph partitioning: spectral, geometric, multilevel
Conjugate gradient iteration x0 = 0, r0 = b, p0 = r0 for k = 1, 2, 3, . . . αk = (rTk-1rk-1) / (pTk-1Apk-1) step length xk = xk-1 + αk pk-1 approx solution rk = rk-1 – αk Apk-1 residual βk = (rTk rk) / (rTk-1rk-1) improvement pk = rk + βk pk-1 search direction • One matrix-vector multiplication per iteration • Two vector dot products per iteration • Four n-vectors of working storage
Conjugate gradient: Convergence • In exact arithmetic, CG converges in n steps (completely unrealistic!!) • Accuracy after k steps of CG is related to: • consider polynomials of degree k that are equal to 1 at 0. • how small can such a polynomial be at all the eigenvalues of A? • Thus, eigenvalues close together are good. • Condition number:κ(A) = ||A||2 ||A-1||2 = λmax(A) / λmin(A) • Residual is reduced by a constant factor by O(κ1/2(A)) iterations of CG.
Preconditioners • Suppose you had a matrix B such that: (1) condition number κ(B-1A) is small (2) By = z is easy to solve • Then you could solve (B-1A)x = B-1b instead of Ax = b(actually (B-1/2AB-1/2) B1/2 x = B-1/2 b, but never mind) • B = A is great for (1), not for (2) • B = I is great for (2), not for (1)
Support Graph Preconditioning +: New analytic tools, some new preconditioners +: Can use existing direct-methods software -: Current theory and techniques limited • Define a preconditioner B for matrix A • Explicitly compute the factorization B = LU • Choose nonzero structure of B to make factoring cheap (using combinatorial tools from direct methods) • Prove bounds on condition number using both algebraic and combinatorial tools
Spanning Tree Preconditioner [Vaidya] • A is symmetric positive definite with negative off-diagonal nzs • B is a maximum-weight spanning tree for A (with diagonal modified to preserve row sums) • factor B in O(n) space and O(n) time • applying the preconditioner costs O(n) time per iteration G(A) G(B)
Spanning Tree Preconditioner [Vaidya] • support each edge of A by a path in B • dilation(A edge) = length of supporting path in B • congestion(B edge) = # of supported A edges • p = max congestion, q = max dilation • condition number κ(B-1A) bounded by p·q (at most O(n2)) G(A) G(B)
Spanning Tree Preconditioner [Vaidya] • can improve congestion and dilation by adding a few strategically chosen edges to B • cost of factor+solve is O(n1.75), or O(n1.2) if A is planar • in experiments by Chen & Toledo, often better than drop-tolerance MIC for 2D problems, but not for 3D. G(A) G(B)
Support Graphs [after Gremban/Miller/Zagha] Intuition from resistive networks:How much must you amplify B to provide the conductivity of A? • The support of B for A is σ(A, B) = min{τ : xT(tB– A)x 0 for all x, all t τ} • In the SPD case, σ(A, B) = max{λ : Ax = λBx} = λmax(A, B) • Theorem:If A, B are SPD then κ(B-1A) = σ(A, B) · σ(B, A)
Splitting and Congestion/Dilation Lemmas • Split A = A1+ A2 + ··· + Ak and B = B1+ B2 + ··· + Bk • Ai and Bi are positive semidefinite • Typically they correspond to pieces of the graphs of A and B (edge, path, small subgraph) • Lemma: σ(A, B) maxi {σ(Ai , Bi)} • Lemma: σ(edge, path) (worst weight ratio) · (path length) • In the MST case: • Ai is an edge and Bi is a path, to give σ(A, B) p·q • Bi is an edge and Ai is the same edge, to give σ(B, A) 1
B has positive (dotted) edges that cancel fill B has same row sums as A Strategy: Use the negative edges of B to support both the negative edges of A and the positive edges of B. -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 B A .5 .5 .5 .5 .5 .5 .5 .5 .5 A = 2D model Poisson problem B = MIC preconditioner for A Support-graph analysis of modified incomplete Cholesky
Every dotted (positive) edge in B is supported by two paths in B Each solid edge of B supports one or two dotted edges Tune fractions to support each dotted edge exactly 1/(2n – 2) of each solid edge is left over to support an edge of A Supporting positive edges of B
Analysis of MIC: Summary • Each edge of A is supported by the leftover 1/(2n – 2) fraction of the same edge of B. • Therefore σ(A, B) 2n – 2 • Easy to show σ(B, A) 1 • For this 2D model problem, condition number is O(n1/2) • Similar argument in 3D gives condition number O(n1/3) or O(n2/3) (depending on boundary conditions)
Support-graph analysis for better preconditioners? • For model problems, the preconditioners analyzed so far are not as efficient as multigrid, domain decomposition, etc. • Gremban/Miller: a hierarchical support-graph preconditioner, but condition number still not polylogarithmic. • We analyze a multilevel-diagonal-scaling-like preconditioner in 1D . . . • . . . but we haven’t proved tight bounds in higher dimensions.
-1 -1 -2 B A -1 -1 -1 -1 -1 -1 Hierarchical preconditioner (1D model problem) • Good support in both directions: σ(A, B) = σ(B, A) = O(1) • But, B is a mesh => expensive to factor and to apply
.5 .5 -1 -1 -2 B A -1 -1 -1 -1 -1 -1 Hierarchical preconditioner continued • Drop fill in factor of B (i.e. add positive edges) => factoring and preconditioning are cheap • But B cannot support both A and its own positive edges;σ(A, B) is infinite
-.5 -.5 -1 .5 .5 -1 -1 -2 B A -1 -1 -1 -1 -1 -1 Hierarchical preconditioner continued • Solution: add a coarse mesh to support positive edges • Now σ(A, B) log2(n+1) • Elimination/splitting analysis gives σ(B, A) = 1 • Therefore condition number = O(log n)
Generalization to higher dimensions? • Idea: mimic the 1D construction • Generate a coarsening hierarchy with overlapping subdomains • Use σ(B, A) = 1 as a constraint to choose weights • Defines a preconditioner for regular or irregular problems • Sublinear bounds on σ(A, B) for some 2D model problems. • (Very preliminary) experiments show slow growth in κ(B-1A)
Algebraic framework • The support of B for A is σ(A, B) = min{τ : xT(tB– A)x 0 for all x, all t τ} • In the SPD case, σ(A, B) = max{λ : Ax = λBx} = λmax(A, B) • If A, B are SPD then κ(B-1A) = σ(A, B) · σ(B, A) • [Boman/Hendrickson] If V·W=U, then σ(U·UT, V·VT) ||W||22
Algebraic framework [Boman/Hendrickson] Lemma: If V·W=U, then σ(U·UT, V·VT) ||W||22 Proof: • take t ||W||22 = λmax(W·WT) = max x0 { xTW·WTx / xTx } • then xT (tI - W·WT) x 0 for all x • letting x = VTy gives yT (tV·VT - U·UT) y 0 for all y • recall σ(A, B) = min{τ : xT(tB– A)x 0 for all x, all t τ} • thus σ(U·UT, V·VT) t
[ ] a2 +b2-a2 -b2 -a2 a2 +c2 -c2-b2 -c2 b2 +c2 [ ] a2 +b2-a2 -b2 -a2 a2 -b2 b2 B =VVT A =UUT ] [ ] ] [ [ 1-c/a1 c/b/b ab-a c-b ab-a c-b-c = x U V W -a2 -c2 -a2 -b2 -b2 σ(A, B) ||W||22 ||W||x ||W||1 = (max row sum) x (max col sum) (max congestion) x (max dilation)
Open problems I • Other subgraph constructions for better bounds on||W||22? • For example [Boman], ||W||22 ||W||F2= sum(wij2) = sum of (weighted) dilations, and [Alon, Karp, Peleg, West]show there exists a spanning tree with average weighted dilation exp(O((log n loglog n)1/2)) = o(n ); this gives condition number O(n1+) and solution time O(n1.5+), compared to Vaidya O(n1.75) with augmented spanning tree • Is there a construction that minimizes ||W||22directly?
Open problems II • Make spanning tree methods more effective in 3D? • Vaidya gives O(n1.75) in general, O(n1.2) in 2D • Issue: 2D uses bounded excluded minors, not just separators • Spanning tree methods for more general matrices? • All SPD matrices? ([Boman, Chen, Hendrickson, Toledo]: different matroid for all diagonally dominant SPD matrices) • Finite element problems? ([Boman]: Element-by-element preconditioner for bilinear quadrilateral elements) • Analyze a multilevel method in general?
n1/2 n1/3 Complexity of linear solvers Time to solve model problem (Poisson’s equation) on regular mesh
References • M. Bern, J. Gilbert, B. Hendrickson, N. Nguyen, S. Toledo. Support-graph preconditioners. (Submitted for publication, 2001.) ftp://parcftp.xerox.com/gilbert/support-graph.ps • K. Gremban, G. Miller, M. Zagha. Performance evaluation of a parallel preconditioner. IPPS 1995. • D. Chen, S. Toledo. Implementation and evaluation of Vaidya’s preconditioners. Preconditioning 2001. (Submitted for publication, 2001.) • E. Boman, B. Hendrickson. Support theory for preconditioning. (Submitted for publication, 2001.) • E. Boman, D. Chen, B. Hendrickson, S. Toledo. Maximum-weight-basis preconditioners. (Submitted for publication, 2001.) • Bruce Hendrickson’s support theory web page: http://www.cs.sandia.gov/~bahendr/support.html