Much Faster Algorithms for Matrix Scaling and Balancing

Based on two papers:
• Much Faster Algorithms for Matrix Scaling (Zeyuan Allen-Zhu, Yuanzhi Li, Rafael Oliveira, Avi Wigderson)
• Matrix Scaling and Balancing via Box-Constrained Newton's Method and Interior Point Methods (Michael Cohen, Aleksander Mądry, Dimitris Tsipras, Adrian Vladu)
Matrix Scaling
• Given a nonnegative matrix A and positive target vectors r, c, find positive diagonal matrices X, Y such that M = XAY satisfies M1 = r and Mᵀ1 = c.

Matrix Balancing
• Given a nonnegative matrix A, find a positive diagonal matrix X such that M = XAX⁻¹ satisfies M1 = Mᵀ1.

(The slide illustrates both with a small 2×2 example matrix.)
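As a concrete illustration of the scaling problem, here is a minimal sketch of the classical Sinkhorn iteration (referenced later in the talk as an instance of coordinate descent): alternately rescale rows and columns until the row and column sums match the targets r and c. This is an illustrative sketch rather than the algorithm from either paper; the use of numpy and the tolerance/iteration parameters are assumptions.

```python
import numpy as np

def sinkhorn_scale(A, r, c, tol=1e-9, max_iter=10_000):
    """Find positive vectors x, y so that M = diag(x) @ A @ diag(y)
    has row sums r and column sums c (Sinkhorn-style alternation)."""
    x = np.ones(A.shape[0])
    y = np.ones(A.shape[1])
    for _ in range(max_iter):
        x = r / (A @ y)                     # fix the row sums
        y = c / (A.T @ x)                   # fix the column sums
        M = (x[:, None] * A) * y[None, :]
        if np.abs(M.sum(1) - r).max() + np.abs(M.sum(0) - c).max() < tol:
            break
    return x, y

# Example: scale a small 2x2 matrix to be doubly stochastic (r = c = all-ones).
A = np.array([[0.5, 1.0], [1.0, 0.5]])
x, y = sinkhorn_scale(A, np.ones(2), np.ones(2))
M = (x[:, None] * A) * y[None, :]
print(M.sum(axis=1), M.sum(axis=0))         # both approximately [1, 1]
```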
Why Care?
• Preconditioning linear systems: instead of A z = b, solve (XAY)(Y⁻¹z) = Xb.
• Approximating the permanent of nonnegative matrices: Per(A) = Per(XAY) / (Per(X) Per(Y)), and if XAY is doubly stochastic then exp(−n) ≤ Per(XAY) ≤ 1.
• Detecting perfect matchings: if A is the adjacency matrix of a bipartite graph, then a perfect matching exists ⇔ Per(A) ≠ 0.
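The permanent identity holds because every term of the permanent picks exactly one entry from each row and each column, so the diagonal scalings factor out:

Per(XAY) = Σ_σ Π_i x_i A_{iσ(i)} y_{σ(i)} = (Π_i x_i)(Π_j y_j) Per(A) = Per(X) Per(Y) Per(A).

So computing Per(XAY) up to the easily computed factor Per(X) Per(Y) is as good as computing Per(A).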
Why Care?
• Intensively studied in the scientific computing literature: [Wilkinson '59], [Osborne '60], [Sinkhorn '64], [Parlett, Reinsch '69], [Kalantari, Khachiyan '15], [Schulman, Sinclair '15], …
• Matrix balancing routines are implemented in MATLAB and R.
• Generalizations (operator scaling) are related to Polynomial Identity Testing: [Gurvits '04], [Garg, Gurvits, Oliveira, Wigderson '17], …
Generalized Matrix Balancing via Convex Optimization
• Captures the problem's difficulty; solves matrix scaling via a simple reduction.
• Write M = exp(X) A exp(−X), with row sums r_M = M1 and column sums c_M = Mᵀ1.
• Goal: r_M − c_M = d (ordinary balancing is the case d = 0).
• f(x) = Σ_ij A_ij exp(x_i − x_j) − Σ_i d_i x_i is a nice convex function with ∇f(x) = r_M − c_M − d.
Equivalent Nonlinear Flow Problem
• Ohm's Law (electrical flow): f_uv = A_uv (x_u − x_v).
• "Nonlinear Ohm's Law": f_uv = A_uv exp(x_u − x_v).
• Balancing asks for vertex potentials x under which this nonlinear flow is conserved at every vertex (flow in = flow out), i.e. ∇f(x) = 0.
• (Slide figure: a small example network; edge weights play the role of capacitances.)
Generalized Matrix Balancing via Convex Optimization
• Captures the difficulty of both problems; solves matrix scaling via a simple reduction.
• M = exp(X) A exp(−X), r_M = M1, c_M = Mᵀ1.
• Exact goal: r_M − c_M = d; relaxed goal: |r_M − c_M − d| ≤ ε.
• f(x) is the same nice convex function as before, with ∇f(x) = r_M − c_M − d.
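To make the objective concrete, here is a small numpy sketch (an illustration, not code from either paper; the function name is hypothetical) that forms M = exp(X) A exp(−X) and evaluates f and its gradient, checked against a finite-difference approximation.

```python
import numpy as np

def balancing_objective(A, x, d):
    """f(x) = sum_ij A_ij exp(x_i - x_j) - sum_i d_i x_i and its gradient."""
    M = A * np.exp(x[:, None] - x[None, :])   # M = exp(X) A exp(-X)
    r_M, c_M = M.sum(axis=1), M.sum(axis=0)   # row and column sums
    f = M.sum() - d @ x
    grad = r_M - c_M - d                      # gradient of f
    return f, grad

# Sanity check of the gradient on a random instance.
rng = np.random.default_rng(0)
A, x, d = rng.random((4, 4)), rng.standard_normal(4), np.zeros(4)
f, g = balancing_objective(A, x, d)
eps = 1e-6
g_fd = np.array([(balancing_objective(A, x + eps * np.eye(4)[i], d)[0] - f) / eps
                 for i in range(4)])
print(np.allclose(g, g_fd, atol=1e-4))        # True
```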
Generalized Matrix Balancing via Convex Optimization
• f(x) is a nice convex function with ∇f(x) = r_M − c_M − d.
• General convex optimization framework: expand f(x + Δ) = f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ + …, and repeatedly move to x + Δ where Δ = argmin_{|Δ| ≤ c} of a truncation of this expansion.
• First order methods keep only the gradient term; for matrix balancing, [Ostrovsky, Rabani, Yousefi '17] obtain O(m + nε⁻²).
• Second order methods keep the Hessian term as well; [Kalantari, Khachiyan, Shokoufandeh '97] obtain Õ(n⁴ log ε⁻¹).
• Sinkhorn/Osborne iterations are instantiations of this framework (coordinate descent).
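To illustrate the coordinate-descent view, here is a minimal Osborne-style balancing sketch (again only an illustration, not the papers' algorithm; numpy and the sweep count are assumptions). Each step picks a coordinate x_i and sets it so that row i and column i of M = exp(X) A exp(−X) have equal sums, which is exactly an exact minimization of f over that single coordinate.

```python
import numpy as np

def osborne_balance(A, num_sweeps=100):
    """Coordinate descent for balancing: repeatedly pick i and set x_i so that
    the i-th row sum equals the i-th column sum of M = exp(X) A exp(-X)."""
    n = A.shape[0]
    x = np.zeros(n)
    off = A - np.diag(np.diag(A))            # diagonal entries do not affect balance
    for _ in range(num_sweeps):
        for i in range(n):
            out_i = off[i, :] @ np.exp(-x)   # row i sum of M, without the exp(x_i) factor
            in_i = off[:, i] @ np.exp(x)     # column i sum of M, without the exp(-x_i) factor
            if out_i > 0 and in_i > 0:
                x[i] = 0.5 * np.log(in_i / out_i)   # exact minimization over coordinate i
    return x

A = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0],
              [0.25, 1.0, 0.0]])
x = osborne_balance(A)
M = A * np.exp(x[:, None] - x[None, :])
print(np.round(M.sum(axis=1) - M.sum(axis=0), 6))   # approximately zero
```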
Our Results
• [AZLOW '17]:
  - First order: Accelerated Gradient Descent, O(m n^{1/3} ε^{−2/3})
  - Second order: Box-Constrained Newton Method, Õ((m + n^{4/3}) log κ(X*))
• [CMTV '17]:
  - Second order: Interior Point Method, Õ(m^{3/2} log ε⁻¹)
  - Second order: Box-Constrained Newton Method, Õ(m log κ(X*))
• The Box-Constrained Newton Method is a new second-order framework (essentially identical in both papers).
• κ(X*) = condition number of the matrix that yields perfect balancing.
Generalized Matrix Balancing via Convex Optimization
• Can we use second order information to obtain a good solution in few iterations?
• Recall M = exp(X) A exp(−X), r_M = M1, c_M = Mᵀ1, and ∇f(x) = r_M − c_M − d.
• The Hessian is H_x = diag(r_M + c_M) − (M + Mᵀ).
• f(x + Δ) ≈ f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ, valid whenever the Hessian does not change too much along the line between x and x + Δ.
• The Hessian is a graph Laplacian, so H_x⁻¹ b can be computed in Õ(m) time [Spielman-Teng '08, …].
• If |Δ|∞ ≤ 1 then H_x ≈_{O(1)} H_{x+Δ}.
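Below is a small sketch of forming this Hessian and taking plain Newton-type steps for balancing (d = 0). It is only an illustration: np.linalg.pinv is a dense stand-in for the Õ(m) Laplacian solvers cited above (and also handles the all-ones null space of the Laplacian), and the instance size and iteration count are arbitrary choices.

```python
import numpy as np

def hessian_and_gradient(A, x):
    """H_x = diag(r_M + c_M) - (M + M^T) and grad = r_M - c_M for M = exp(X) A exp(-X)."""
    M = A * np.exp(x[:, None] - x[None, :])
    r_M, c_M = M.sum(axis=1), M.sum(axis=0)
    return np.diag(r_M + c_M) - (M + M.T), r_M - c_M

rng = np.random.default_rng(1)
A = rng.random((5, 5))
x = np.zeros(5)
for _ in range(20):
    H, g = hessian_and_gradient(A, x)
    x = x - np.linalg.pinv(H) @ g            # Newton step; pinv stands in for a fast Laplacian solver
print(np.round(hessian_and_gradient(A, x)[1], 8))   # gradient r_M - c_M, near zero
```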
Box-Constrained Newton's Method
f(x + Δ) ≈ f(x) + ∇f(x)ᵀΔ + ½ ΔᵀH_xΔ
Key idea:
• If |Δ|∞ ≤ 1 then H_x ≈_{O(1)} H_{x+Δ}.
• Suppose we can exactly minimize the second order approximation over |Δ|∞ ≤ 1.
• Goal: show that moving to the minimizer inside the box makes a lot of progress:
• f(x) − f(x+Δ) ≥ (1/10)·(f(x) − f(x+Δ*)), where Δ is the minimizer of the quadratic approximation over the L∞ region and Δ* is the minimizer of f over the same region.
R∞ = max_{x : f(x) ≤ f(x₀)} |x − x*|∞
Box-Constrained Newton's Method
• f(x) − f(x+Δ*) ≥ f(x) − f(x̂), where x̂ is the point one L∞ unit along the segment from x towards x* (Δ* is the best point in the box, and x̂ lies in the box).
• f(x) − f(x̂) ≥ (f(x) − f(x*)) / |x − x*|∞ by convexity.
• |x − x*|∞ has absolute upper bound R∞.
• So each step removes an Ω(1/R∞) fraction of the optimality gap, and we get arbitrarily close to x* in Õ(R∞) iterations.
• Takeaway: Õ(R∞) box-constrained quadratic minimizations suffice.
• However, it is unclear how to minimize the quadratic over |Δ|∞ ≤ 1 exactly and fast.
• Instead, relax the L∞ constraint by a factor of k and outsource the subproblem to a "k-oracle".
• This gives Õ(k·R∞) box-constrained quadratic minimizations in total (an illustrative sketch of the subproblem follows).
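To show what the subproblem asks for, here is a sketch that approximately minimizes the quadratic model gᵀΔ + ½ ΔᵀHΔ over the box |Δ|∞ ≤ 1 by projected gradient descent. This is only a slow, simple stand-in for the fast k-oracles described next; the step-size rule and iteration count are assumptions of the sketch.

```python
import numpy as np

def box_quadratic_step(H, g, radius=1.0, iters=500):
    """Approximately minimize g^T D + 0.5 * D^T H D over |D|_inf <= radius
    via projected gradient descent (a toy stand-in for a k-oracle)."""
    L = np.abs(H).sum(axis=1).max() + 1e-12   # upper bound on the largest eigenvalue of H
    D = np.zeros_like(g)
    for _ in range(iters):
        D = D - (g + H @ D) / L               # gradient step on the quadratic model
        D = np.clip(D, -radius, radius)       # project back onto the L_inf box
    return D
```

Combined with the Hessian sketch above, repeating x ← x + box_quadratic_step(H_x, ∇f(x)) is a toy version of the box-constrained Newton iteration.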
k-oracle
• Input: a graph Laplacian L and a vector b.
• Ideally: output the exact minimizer of the quadratic bᵀΔ + ½ ΔᵀLΔ over |Δ|∞ ≤ 1. Instead: output a Δ with |Δ|∞ ≤ k whose quadratic value is comparably good.
• [AZLOW '17]: based on the approximate max flow algorithm [CKMST '11], runs in Õ(m + n^{4/3}).
• [CMTV '17]: based on the Laplacian solver [LPS '15], runs in Õ(m).
Conclusions and Future Outlook
• Nearly-linear time algorithms for matrix scaling and balancing.
• A new framework for second order optimization: it uses Hessian smoothness while avoiding self-concordance.
• Can any of these ideas be used for faster interior point methods?
• The dependence on the condition number, log κ(X*), comes from the R∞ bound; for detecting perfect matchings, R∞ = Θ(n).
• Is there a way to improve this dependence, e.g. to (log κ(X*))^{1/2}?
• We saw an extension of Laplacian solving; what else is there? Better primitives for convex optimization?