New Features in ML. 2004 Trilinos Users Group Meeting, November 2-4, 2004. Jonathan Hu, Ray Tuminaro, Marzio Sala, Michael Gee, Haim Waisman.
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
Overview
• Multigrid Options
• ParMETIS
• Zoltan
• Repartitioning
• Analysis Tools
• GGB method
• Memory usage
• Visualization
• Documentation
Traditional Coarsening
• Coarsening rate fixed: h/H ≈ 3^n in an n-d problem
• What can go wrong?
  • AMG complexity goes up: ∑_j nnz(A^(j)) / nnz(A^(1))
  • Result: more time per iteration
  • In parallel, each coarse grid has a latency penalty
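The complexity measure on this slide is just the total number of nonzeros over all levels divided by the nonzeros of the fine-level operator. A minimal sketch, assuming level 1 is the finest level; the nonzero counts below are made up for illustration and do not come from any ML run:

```python
def operator_complexity(nnz_per_level):
    """Return sum_j nnz(A^(j)) / nnz(A^(1)); entry 0 is the finest level."""
    if not nnz_per_level or nnz_per_level[0] <= 0:
        raise ValueError("need a positive nonzero count for the fine level")
    return sum(nnz_per_level) / nnz_per_level[0]


# Hypothetical nonzero counts for a three-level hierarchy, fine to coarse.
levels = [1000, 400, 150]
complexity = operator_complexity(levels)
print(f"operator complexity = {complexity:.2f}")
```

A value close to 1 means the coarse operators add little work per iteration; the value grows when coarse operators stay large or become denser.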
Aggressive Coarsening
• Idea: use graph partitioner to make larger aggregates
  • METIS / ParMETIS
• Coarsening rate: user-determined
• Fewer levels: mitigates coarse grid latency
• Smaller + fewer coarse grids → lower complexity
• Convergence rate could suffer
• Configure options: --with-ml_metis, --with-ml_parmetis3x
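To see why a larger coarsening rate means fewer levels, the sketch below counts levels for a given fine-grid size. It is a back-of-the-envelope model, not ML code: the rate of 200 and the coarsest-grid limit of 1000 unknowns are assumed values chosen for illustration, while 27 = 3^3 is the traditional 3-d rate and 13M is the problem size quoted for the MPSalsa run.

```python
def count_levels(n_fine, rate, max_coarse_size):
    """Count levels when each coarser level has `rate` times fewer unknowns."""
    if rate <= 1:
        raise ValueError("coarsening rate must exceed 1")
    levels, n = 1, n_fine
    while n > max_coarse_size:
        n = max(1, n // rate)  # unknowns on the next coarser level
        levels += 1
    return levels


n_fine = 13_000_000
for rate in (27, 200):  # 27: traditional 3-d rate; 200: assumed aggressive rate
    print(rate, count_levels(n_fine, rate, max_coarse_size=1000))
```

Under these assumptions the traditional rate needs four levels and the aggressive rate three, so one coarse-grid latency penalty per cycle is avoided.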
App: MPSalsa Airport Simulation
• 3D transient LES (13M DOFs / 1K node Cplant)
• Aggressive coarsening
Coarsening with Zoltan
• Main idea
  • App provides coordinates on fine level (only)
  • Call to Zoltan for coarsening (RCB algorithm)
  • ML internally creates coordinates for coarser levels
    • Centers of mass
• Status: still in testing phase
• Configure option: --with-ml_zoltan
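The two steps on this slide, coordinate-based partitioning followed by centers of mass as coarse coordinates, can be sketched in a few lines. This is a plain-Python stand-in for recursive coordinate bisection (RCB), not a call into Zoltan; the function names, the `max_size` parameter, and the sample grid are all hypothetical.

```python
def rcb(points, ids, max_size):
    """Recursive coordinate bisection: split along the longest axis at the median."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if len(ids) <= max_size:
        return [ids]
    dim = len(points[ids[0]])
    # Pick the coordinate direction with the largest extent.
    extents = [max(points[i][d] for i in ids) - min(points[i][d] for i in ids)
               for d in range(dim)]
    axis = extents.index(max(extents))
    ordered = sorted(ids, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return rcb(points, ordered[:mid], max_size) + rcb(points, ordered[mid:], max_size)


def centers_of_mass(points, aggregates):
    """One coarse coordinate per aggregate: the mean of its fine-level points."""
    dim = len(points[0])
    return [tuple(sum(points[i][d] for i in agg) / len(agg) for d in range(dim))
            for agg in aggregates]


# Fine-level coordinates: a 4 x 2 grid of points (illustrative data).
fine = [(float(x), float(y)) for y in range(2) for x in range(4)]
aggregates = rcb(fine, list(range(len(fine))), max_size=4)
coarse = centers_of_mass(fine, aggregates)
print(aggregates)
print(coarse)
```

The coarse coordinates can be fed back into `rcb` to coarsen the next level, which is why the application only has to supply fine-level coordinates.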
Repartitioning to Improve Parallel Performance
[Figure: operator A redistributed by permutation P among Proc. 1, Proc. 2, Proc. 3]
• Load balances operators in multigrid hierarchy
• Motivation
  • App load balancing may be non-optimal for linear solver
  • App may take large % of memory (e.g., multiphysics)
  • Linear solver gets remaining memory
  • Result: low parallel efficiency
  • Coarsening rate may slow as the number of unknowns per processor becomes small
• Main idea
  • Determine “good” partitioning with ParMETIS
  • Construct permutation matrix P based on partitioning
  • Apply to multigrid coarse grid operators
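A small dense example may help make the permutation step concrete. The sketch assumes the operator is permuted symmetrically as P A P^T, which the slide does not spell out, and it uses a hand-picked row-to-processor assignment in place of a ParMETIS result; the function names are hypothetical.

```python
def permutation_from_partition(owner):
    """Order rows so that rows assigned to the same processor are contiguous.

    owner[i] is the new processor of row i; perm[k] is the old index of new row k.
    """
    return sorted(range(len(owner)), key=lambda i: owner[i])


def permute_operator(a, perm):
    """Return P A P^T for the permutation P defined by (P x)[k] = x[perm[k]]."""
    return [[a[i][j] for j in perm] for i in perm]


# 4 x 4 tridiagonal operator. The assignment below stands in for a partitioner
# result: rows 0 and 2 go to processor 1, rows 1 and 3 to processor 0.
a = [[2, -1, 0, 0],
     [-1, 2, -1, 0],
     [0, -1, 2, -1],
     [0, 0, -1, 2]]
owner = [1, 0, 1, 0]
perm = permutation_from_partition(owner)
a_new = permute_operator(a, perm)
for row in a_new:
    print(row)
```

The permuted operator has the same entries as the original, only renumbered, so the solve is mathematically unchanged while the data layout follows the new partitioning.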
Repartitioning applied to Zpinch simulation
• Before repartitioning on Janus…
  • 210+ processor simulations failed
  • App-supplied linear system already imbalanced
Adaptive AMG
• GGB
  • Find modes not captured by MG
  • Adaptive filter
  • Extra coarse grid
[Figure: diagram with labels GMRES \ QMR, MG, GGB; plot comparing GMRES(20) + GGB/ML with GMRES(150) + ML]
Analysis / Profiling Tools
• Aggregate visualization
  • Assess aggregate quality
  • User provides fine-level coordinates
  • Centers of mass (CoM) used as coordinates on coarser levels
  • Stats calculated on avg size, diameters
  • Currently using 3rd party package, OpenDX
• Error visualization
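The aggregate statistics mentioned above (average size, diameters) are easy to compute once each fine-level point carries an aggregate id. The following is a sketch of that kind of calculation, not ML's own routine; the diameter is taken to be the largest pairwise distance inside an aggregate, which is an assumption, and the sample data is invented.

```python
import math
from collections import defaultdict


def aggregate_stats(coords, agg_id):
    """Return average aggregate size and the diameter of each aggregate."""
    if not coords or len(coords) != len(agg_id):
        raise ValueError("need one aggregate id per coordinate")
    groups = defaultdict(list)
    for point, a in zip(coords, agg_id):
        groups[a].append(point)
    sizes = {a: len(pts) for a, pts in groups.items()}
    # Diameter: largest distance between any two points of the same aggregate.
    diameters = {a: max(math.dist(p, q) for p in pts for q in pts)
                 for a, pts in groups.items()}
    return {"avg_size": sum(sizes.values()) / len(sizes),
            "sizes": sizes,
            "diameters": diameters}


# Five fine-level points in two aggregates (illustrative data).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 0.0), (4.0, 0.0)]
agg_id = [0, 0, 0, 1, 1]
stats = aggregate_stats(coords, agg_id)
print(stats["avg_size"], stats["diameters"])
```

Aggregates whose diameter is much larger than their neighbours' are the ones worth inspecting in the visualization.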
Analysis/Profiling Tools (cont’d)
• Matrix performance
  • Matrix statistics
  • Eigen analysis
• Detailed operator profiling
  • Apply & communication time
• Functions: MultilevelPreconditioner::AnalyzeMatrixCheap(), ML_Operator_Profile()
• Internal memory profiling
  • Lightweight
  • Highwater mark, largest free block
  • Postprocessing for plotting
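The high-water mark idea can be illustrated with a toy tracker that records the current and peak number of bytes as named blocks are allocated and freed. This is only a sketch of the concept: the class, its methods, and the byte counts are hypothetical and are not ML's internal profiler, and the largest-free-block statistic is not modelled.

```python
class MemoryTracker:
    """Track current usage and the high-water mark of named allocations."""

    def __init__(self):
        self.current = 0
        self.high_water = 0
        self._blocks = {}

    def allocate(self, name, nbytes):
        if name in self._blocks:
            raise KeyError(f"block {name!r} is already allocated")
        self._blocks[name] = nbytes
        self.current += nbytes
        # The high-water mark only ever moves up.
        self.high_water = max(self.high_water, self.current)

    def free(self, name):
        self.current -= self._blocks.pop(name)


tracker = MemoryTracker()
tracker.allocate("fine operator", 800)
tracker.allocate("temporary prolongator", 300)
tracker.free("temporary prolongator")
tracker.allocate("coarse operator", 200)
print(tracker.current, tracker.high_water)
```

Here the peak is reached while the temporary block is alive, which is the kind of transient that a final-usage figure alone would hide.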
Updated Documentation
• ML User’s Guide, version 3.0
  • Configure & build information
  • MultilevelPreconditioner() class intro
  • Exhaustive options list
• ML Developer’s Guide
  • Configuration, building, testing details
  • Suggested practices
  • Intro to tools on software.sandia.gov
• Updated web pages
  • Now built automatically each night
  • Incorporates doxygen comments
  • http://software.sandia.gov/trilinos/packages/ml