Murat Manguoğlu*, Mehmet Koyutürk**, Ananth Grama* and Ahmed Sameh* (*Purdue University, **Case Western Reserve University). Weighted Matrix Reordering and Parallel Banded Preconditioners for Nonsymmetric Linear Systems. Support: DARPA, NSF, Intel, NCSA.
A computational loop • Integration • Newton Iteration • Linear system solvers [diagram; loop labels lost in extraction]
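One way to read the loop on this slide is as three nested levels: a time integrator, a Newton iteration inside each time step, and a linear solve inside each Newton iteration. The sketch below is only an illustration of that nesting. Backward Euler, the dense `np.linalg.solve` call, the function name `backward_euler`, and the 2x2 test problem are all assumptions made for the example, not details from the slides.

```python
import numpy as np


def backward_euler(f, jac, y0, dt, steps, newton_tol=1e-10, max_newton=20):
    """Integrate y' = f(y) with backward Euler; return (y, number of linear solves)."""
    y = np.asarray(y0, dtype=float).copy()
    identity = np.eye(y.size)
    linear_solves = 0
    for _ in range(steps):                       # outer loop: time integration
        y_prev = y.copy()
        for _ in range(max_newton):              # middle loop: Newton iteration
            residual = y - y_prev - dt * f(y)
            if np.linalg.norm(residual) <= newton_tol:
                break
            newton_matrix = identity - dt * jac(y)
            # Inner level: one linear system per Newton iteration.
            y = y - np.linalg.solve(newton_matrix, residual)
            linear_solves += 1
    return y, linear_solves


# Hypothetical 2x2 test problem: y' = A y - y**3 (elementwise cube).
A = np.array([[-2.0, 1.0], [0.0, -3.0]])
f = lambda y: A @ y - y**3
jac = lambda y: A - np.diag(3.0 * y**2)
y_end, n_solves = backward_euler(f, jac, [1.0, 0.5], dt=0.1, steps=10)
```

Even in this tiny example the number of linear solves is a multiple of the number of time steps, which is why the cost of the linear solver dominates such loops.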
Motivation • New architectures increasingly rely on parallelism • Concurrency and localization play an important role • Algorithms for such platforms must account for concurrency and memory references
Implications: General Sparse Solvers • Maximal use of dense kernels • Development of methods that optimize concurrency • A banded matrix is a natural candidate as a preconditioner
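As a rough illustration of a banded matrix used as a preconditioner, the sketch below extracts the band |i - j| <= k of a sparse matrix, stores it in LAPACK band format, and applies it inside SciPy's GMRES. The helper name `banded_preconditioner`, the test matrix, and the choice k = 1 are assumptions for the example; the slides' own parallel banded solver is not reproduced here.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import solve_banded
from scipy.sparse.linalg import LinearOperator, gmres


def banded_preconditioner(A, k):
    """Return an operator that solves with the band |i - j| <= k of A."""
    A = sp.csr_matrix(A)
    n = A.shape[0]
    # Band storage expected by solve_banded: ab[k + i - j, j] = A[i, j].
    ab = np.zeros((2 * k + 1, n))
    for d in range(-k, k + 1):
        diag = A.diagonal(d)
        if d >= 0:
            ab[k - d, d:] = diag
        else:
            ab[k - d, :n + d] = diag
    # solve_banded refactorizes on every call; a production code would factor once.
    return LinearOperator((n, n), matvec=lambda r: solve_banded((k, k), ab, r),
                          dtype=float)


# Hypothetical nonsymmetric test matrix: a dominant tridiagonal part plus
# two weak outer diagonals that the preconditioner ignores.
n = 100
A = sp.diags([-1.0, 4.0, -2.0, 0.05, 0.05], [-1, 0, 1, 7, -9],
             shape=(n, n), format="csr")
b = np.ones(n)
x, info = gmres(A, b, M=banded_preconditioner(A, 1))
```

The banded solve maps onto dense LAPACK kernels, which is the property the slide points to when it asks for maximal use of dense kernels.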
Preprocessing to Obtain the Preconditioner (BiCGStab/GMRES is used as the iterative solver) • ILUPACK: Multilevel ILU [Bollhöfer] • http://www.math.tu-berlin.de/ilupack/ • ILUT: Incomplete LU Factorization from SPARSKIT [Saad] • http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html • ILUT-I: Improved ILUT [Benzi et al.] • reorder using HSL-MC64 to maximize the product of the diagonal entries and scale the matrix • apply symmetric RCM reordering • compute the incomplete factorization via ILUT
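The ILUT-I steps can be approximated with SciPy, with two caveats that are assumptions of this sketch: SciPy has no equivalent of HSL-MC64, so the diagonal-maximizing permutation and scaling are skipped, and `spilu` is SuperLU's threshold-based incomplete LU rather than SPARSKIT's ILUT. The function name `rcm_ilu_bicgstab`, the parameter values, and the test matrix are hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.sparse.linalg import LinearOperator, bicgstab, spilu


def rcm_ilu_bicgstab(A, b, drop_tol=1e-3, fill_factor=10.0):
    """Solve A x = b using RCM reordering, an incomplete LU, and BiCGStab."""
    A = sp.csr_matrix(A)
    # symmetric_mode=False makes SciPy work on the pattern of A + A^T.
    perm = reverse_cuthill_mckee(A, symmetric_mode=False)
    A_perm = A[perm, :][:, perm].tocsc()
    b_perm = b[perm]
    # permc_spec="NATURAL" keeps the RCM column order inside SuperLU.
    ilu = spilu(A_perm, drop_tol=drop_tol, fill_factor=fill_factor,
                permc_spec="NATURAL")
    M = LinearOperator(A_perm.shape, matvec=ilu.solve)
    y, info = bicgstab(A_perm, b_perm, M=M)
    # Undo the symmetric permutation: y[j] corresponds to x[perm[j]].
    x = np.empty_like(y)
    x[perm] = y
    return x, info


# Hypothetical test: a diagonally dominant nonsymmetric matrix, scrambled
# so that the RCM step has something to do.
n = 200
T = sp.diags([-1.0, 4.0, -2.0, 0.5], [-1, 0, 1, 17], shape=(n, n), format="csr")
p = np.random.default_rng(0).permutation(n)
A = T[p, :][:, p]
b = np.ones(n)
x, info = rcm_ilu_bicgstab(A, b)
```

A nonzero `info` from `bicgstab` signals that the solver did not converge, which is the kind of failure a reliability comparison between preconditioners would track.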
WSO: Our proposed method • reorder using HSL-MC64 to make the diagonal zero-free • reorder |A| + |A^T| using HSL-MC73 to place larger elements closer to the main diagonal • extract a banded preconditioner such that 99.9% of the weight is inside the band • factorize the banded preconditioner
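The middle two steps of WSO can be sketched on a small dense matrix. This is a stand-in, not the HSL routines: the reordering below sorts unknowns by the Fiedler vector of the weighted graph of |A| + |A^T| computed with a dense eigensolver, and "weight" is assumed to mean the sum of absolute values of the entries. The function names and the test matrix are hypothetical, and the MC64 step and the banded factorization are not shown.

```python
import numpy as np


def fiedler_permutation(A):
    """Order unknowns by the Fiedler vector of the weighted graph of |A| + |A^T|."""
    W = np.abs(A) + np.abs(A.T)
    np.fill_diagonal(W, 0.0)
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler
    # vector when the graph is connected (an assumption of this sketch).
    _, vecs = np.linalg.eigh(laplacian)
    return np.argsort(vecs[:, 1])


def bandwidth_for_weight(A, fraction=0.999):
    """Smallest k such that the band |i - j| <= k holds `fraction` of sum |a_ij|."""
    absA = np.abs(A)
    total = absA.sum()
    covered = np.trace(absA)
    k = 0
    while covered < fraction * total and k < A.shape[0] - 1:
        k += 1
        covered += np.trace(absA, offset=k) + np.trace(absA, offset=-k)
    return k


def extract_band(A, k):
    """Return a copy of A with every entry outside |i - j| <= k set to zero."""
    i, j = np.indices(A.shape)
    return np.where(np.abs(i - j) <= k, A, 0.0)


# Hypothetical test: a nonsymmetric tridiagonal matrix whose unknowns are
# shuffled (even indices first, then odd) to hide the band.
n = 30
T = (4.0 * np.eye(n)
     + np.diag(-1.0 - 0.1 * np.arange(n - 1), 1)
     + np.diag(-2.0 * np.ones(n - 1), -1))
shuffle = np.concatenate([np.arange(0, n, 2), np.arange(1, n, 2)])
B = T[np.ix_(shuffle, shuffle)]
k_before = bandwidth_for_weight(B)

q = fiedler_permutation(B)
C = B[np.ix_(q, q)]
k_after = bandwidth_for_weight(C)
M = extract_band(C, k_after)
```

On this example the shuffled matrix needs a wide band to capture 99.9% of the weight, while the reordered matrix needs only the tridiagonal part, so the banded preconditioner `M` is much cheaper to factorize.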
Comparison to ILUPACK AMF/PQ preconditioners on a uniprocessor [of SGI Altix] Outer Iterative Solver: unrestarted GMRES ILUPACK Parameters: droptol: 1e-1, bound for inv(L), inv(U): 10, elbow space: 100
Comparison to ILUT and Improved-ILUT Preconditioners on a uniprocessor [of Clovertown] Outer Iterative Solver: BiCGStab
WSO: Factorization + Solve Time Scalability • Speed improvement over uniprocessor timing on SGI Altix
Reordering and Solve Times of 3 Different Systems on a Uniprocessor
Reservoir Simulation (SPE10 benchmarks) • Problem #1: N = 2,244,000 • Problem #2: N = 2,462,265 • “banded systems” → simple/no reordering to extract a central band as a preconditioner • Results on an SGI Altix
Reservoir Simulation #1 [plot; data labels: 33, 21, 11, 5, 2.5, 1.4] • Algebraic Multigrid time: 31.4 seconds (AMD dual core)
Reservoir Simulation #2 [plot; data labels: 58, 29, 14, 7, 3, 2, 1]
Summary and Future Work • Weighted reordering is an effective method for obtaining a banded preconditioner • Overall, the method we propose is both reliable and scalable • Spectral reordering is relatively inexpensive for extracting banded preconditioners when solving several systems with “roughly the same” coefficient matrix • Parallel weighted reordering schemes need to be developed