Parallel Algorithm Oriented Mesh Datastructure. Jean-François Remacle, Joe E. Flaherty and Mark S. Shephard Rensselaer Polytechnic Institute remacle@scorec.rpi.edu. Outline Basics of Mesh Representation Parallel Extensions Software issues Examples http://www.scorec.rpi.edu/AOMD.
AOMD-PAOMD • AOMD and PAOMD deal with meshes • A core for basic mesh representation (topological adjacencies) • Advanced design patterns : Iterator, Observer, Visitor... • Some extensions • Parallel services : message passing, load balancing, partitioning • Meshing Toolbox : quality measures, mesh modifications, cavity mesher,… • Calculus : coordinate systems, integration • AOMD supports • Geometry based analysis design, classification • Parasolid, ProE, STL (not open source yet) • Hybrid meshes • Hexes, Tets, Quads … • Non conforming meshes (hanging nodes, AMR) • Curved meshes (Bézier) • AOMD is an Open Source Project
Motivations • Advanced analysis techniques • Are automated with automatic mesh generation from geometric models • Employ variable order elements based on bases other than Lagrange • Use various weak forms that require alternative mesh relationships • Adapt the mesh as the simulation proceeds • To meet these needs the mesh data structure must • Understand the relationship of mesh entities with the geometric model - ensure mesh validity and associate physical parameters with the mesh • Understand the interactions of various mesh entities - different relationships are used during automatic mesh generation and analysis • Support assignment of independent geometry to the entities in the mesh - must control geometric approximation with higher order methods • Be able to associate dofs with various mesh entities - provides flexibility to support variable p-order and different collections of dofs • Effectively maintain relationships during modification - needed in mesh generation, mesh adaptation and mesh modification for evolving geometry
Automated Adaptive Analysis • [Diagram of software components: MEGA, RPM, Trellis, AOMD-MeshSim, AOMD, FEM, PUM]
Topological Mesh Data Structure • Classic node point coordinates / element connectivities do not meet this need • Mesh representations based on topological entities and their adjacencies fill the need: • Can be proven to be complete and unique - can effectively support all relationships and associations needed by any mesh generation or analysis procedure • Provide a shape independent abstraction for associating geometry • Effectively support the linkage to the geometric model since systems maintain a model topology • Various approaches to support mesh topology have been taken • A new approach is considered here • [Figure: curved mesh for p-version analysis]
Modern Geometric Modeling Systems • Employ non-manifold boundary representation • Provide access to model and its geometry through a geometric modeling kernel driven by topological entities • Simulation processes (mesh generation, p-version analysis...) can directly interact with the modeler
Basics of mesh representation • A geometrical domain $G$ • Is the highest level representation of the domain • Is composed of geometrical entities $G_i^d$ • A mesh $M$ • Is a discrete representation of $G$ • Is composed of mesh entities $M_i^d$, $i = 1, \ldots, N_d(M)$, together with their adjacencies $M_i^d\{M^q\}$ • Mesh entities $M_i^d$ • 4 different topological kinds • Vertices ($d=0$), edges ($d=1$), faces ($d=2$) and regions ($d=3$) • The unique association of a mesh entity, $M_i^{d_i}$, to a geometric model entity, $G_j^{d_j}$, where $d_i \le d_j$, is denoted by • $M_i^{d_i} \sqsubset G_j^{d_j}$
Adjacency sets • Examples of complete adjacency sets • [Figure labels: circular, one-level; upward adjacencies, downward adjacencies; face to edge; incomplete, complete]
Higher order adjacencies • First order adjacencies • Direct access, unit cost • Full representation : only way to have all adjacencies as first order • Higher order adjacency sets • Regions know faces : $M_i^3\{M^2\}$ • Faces know edges : $M_i^2\{M^1\}$ • Edges know vertices : $M_i^1\{M^0\}$ • Third order adjacency sets : $M_i^3\{M^2\}\{M^1\}\{M^0\}$
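As a minimal sketch of the idea on this slide, the C++ fragment below stores only one-level downward adjacencies for a single triangle and recovers the face-to-vertex relation by chaining them. The types `Entity` and `TriangleMesh` and the function `collectVertices` are hypothetical names chosen for illustration; they are not the AOMD classes.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical entity type for illustration; not the AOMD class.
struct Entity {
    int dim;                          // 0 = vertex, 1 = edge, 2 = face, 3 = region
    int id;                           // label, unique within one dimension
    std::vector<const Entity*> down;  // stored first-order adjacency: dimension dim - 1
};

// Higher-order adjacency: reach the vertices of e by chaining the stored
// one-level downward adjacencies, e.g. M^2{M^1}{M^0} for a face.
void collectVertices(const Entity& e, std::set<int>& out) {
    if (e.dim == 0) { out.insert(e.id); return; }
    for (const Entity* d : e.down) collectVertices(*d, out);
}

// One triangle stored with one-level downward adjacencies only.
struct TriangleMesh {
    std::vector<Entity> vertices, edges;
    Entity face{2, 0, {}};

    TriangleMesh() {
        for (int i = 0; i < 3; ++i) vertices.push_back({0, i, {}});
        for (int i = 0; i < 3; ++i)  // edge i joins vertex i and vertex (i + 1) % 3
            edges.push_back({1, i, {&vertices[i], &vertices[(i + 1) % 3]}});
        for (int i = 0; i < 3; ++i) face.down.push_back(&edges[i]);
    }
    // The stored pointers refer to this object, so copying is disabled.
    TriangleMesh(const TriangleMesh&) = delete;
    TriangleMesh& operator=(const TriangleMesh&) = delete;
};
```

The face never stores its vertices directly. Each query pays for the traversal, which is the trade-off against a full representation where every adjacency is first order.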
Functionally complete representation • Minimum information • Mesh entities classified on model entities of equal dimension must be present • All vertices, all regions, all edges classified on model edges and all faces classified on model faces • This minimum is sufficient, not necessary, but this choice allows the representation to be completed without geometrical checks
Basics of the Algorithm Oriented Mesh Database • Mesh entity description • A mesh entity is described by a set of lower dimension entities : $M_i^d\{M^q\}$, $d > q$ • All vertices are always required • Vertices are atomic mesh entities and must be differentiated (e.g., using $iD(M_i^0)$) • Mesh entities comparison • Two entities are equal if their sets of vertices are equal • This allows mesh entities to be compared (<, >, =) • Not absolutely general but key to practical implementation
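The comparison rule above can be sketched as follows, assuming an entity is reduced to the sorted list of its vertex iDs. `VertexSet` is a hypothetical stand-in introduced here, not an AOMD type, and the lexicographic ordering is one possible choice for `<`.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mesh entity: only its vertex iDs are kept.
struct VertexSet {
    std::vector<int> ids;  // vertex iDs, sorted in the constructor

    explicit VertexSet(std::vector<int> v) : ids(std::move(v)) {
        std::sort(ids.begin(), ids.end());  // make the key independent of vertex order
    }
};

// Two entities are equal when their vertex sets are equal.
bool operator==(const VertexSet& a, const VertexSet& b) { return a.ids == b.ids; }

// Lexicographic order on the sorted iDs (an assumed choice of ordering).
bool operator<(const VertexSet& a, const VertexSet& b) { return a.ids < b.ids; }
```

Because the key depends only on vertices, two distinct entities that share the same vertices would compare equal, which is the sense in which the rule is not absolutely general.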
Downward adjacencies ordering : templates • Entities described using their boundaries • Unique description, i.e. non ambiguous shape • Weaker hypothesis • Need for ordering, templates • Used for computing uses • $T_{ev}$ and $T_{fe}$ • Invert templates • [Figure: same vertices but different entities]
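To make the role of templates concrete, here is a sketch of $T_{ev}$ and $T_{fe}$ for a tetrahedron. The local numbering of vertices, edges and faces is an assumption made for this example and is not taken from AOMD; `faceVertices` is a hypothetical helper that composes the two templates.

```cpp
#include <array>
#include <cassert>
#include <set>

// Assumed local numbering for a tetrahedron with vertices 0..3 (illustrative only).
// T_ev: the two local vertices of each of the 6 edges.
constexpr std::array<std::array<int, 2>, 6> T_ev = {{
    {{0, 1}}, {{1, 2}}, {{2, 0}}, {{0, 3}}, {{1, 3}}, {{2, 3}}}};

// T_fe: the three local edges of each of the 4 faces.
constexpr std::array<std::array<int, 3>, 4> T_fe = {{
    {{0, 1, 2}}, {{0, 4, 3}}, {{1, 5, 4}}, {{2, 5, 3}}}};

// Vertices of local face f, obtained by composing T_fe with T_ev.
std::set<int> faceVertices(int f) {
    assert(f >= 0 && f < 4);
    std::set<int> out;
    for (int e : T_fe[f])
        for (int v : T_ev[e]) out.insert(v);
    return out;
}
```

A consistent pair of templates yields exactly three vertices per face, which gives a cheap sanity check on any chosen ordering.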
Mesh Entity iD • Need for search in AOMD • Add, search and remove operations are crucial • Comparing entities is always possible but ... • std::set<mEntity*, lessThanEntity> : logarithmic behavior is not acceptable • Hash tables • Elements in a hash table are not sorted • Complexity : worst case is linear, average is constant • std::hash_set<mEntity*, hashEntity, equalEntity> • A hash function is needed, deterministic and stateless : a mesh entity iD • $M_1 = M_2 \Rightarrow iD(M_1) = iD(M_2)$ : true • $iD(M_1) = iD(M_2) \Rightarrow M_1 = M_2$ : false • iD is a function of the vertices, so that it is independent of the representation • $iD(M_1) = iD(M_2)$ with $M_1 \neq M_2$ should not happen too often : the efficiency of the hash table depends on it, because equalEntity has to be used in this case
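A minimal sketch of such a hash container is given below, using `std::unordered_set`, the standard C++11 hash container, in place of `hash_set`. The type `MeshEntity` and the functors `HashEntity` and `EqualEntity` are hypothetical stand-ins for `mEntity`, `hashEntity` and `equalEntity`, and the sum of vertex iDs is only an illustrative choice of iD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for mEntity: an entity known by its vertex iDs.
struct MeshEntity {
    std::vector<int> vertexIds;  // kept sorted so the key ignores vertex order

    explicit MeshEntity(std::vector<int> ids) : vertexIds(std::move(ids)) {
        std::sort(vertexIds.begin(), vertexIds.end());
    }
};

// iD as a function of the vertices only. The sum is an illustrative choice:
// cheap, deterministic and stateless, but collisions are possible.
struct HashEntity {
    std::size_t operator()(const MeshEntity* e) const {
        std::size_t id = 0;
        for (int v : e->vertexIds) id += static_cast<std::size_t>(v);
        return id;
    }
};

// Resolves collisions: equal iDs do not imply equal entities.
struct EqualEntity {
    bool operator()(const MeshEntity* a, const MeshEntity* b) const {
        return a->vertexIds == b->vertexIds;
    }
};

using EntitySet = std::unordered_set<const MeshEntity*, HashEntity, EqualEntity>;
```

With this iD, the edges (1, 4) and (2, 3) collide, and only the equality functor keeps them apart. The fewer such collisions, the closer lookups stay to constant time.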
Choice of the iD • Efficiency • $N_e$ is the number of elements • $N_k$ is the number of keys
Higher order finite elements • Counting of degrees of freedom (Szabo basis) • Stokes problem, tet mesh • Number of dofs, use previous statistics
Basics of the Parallel AOMD • Basics of parallel AOMD • Partition boundaries treated like model boundaries • Equal order mesh entities must exist on partition boundaries (partition faces, edges and vertices) • Mesh vertices must have a unique global label • On processor : serial AOMD • Implementation aspects • Simplicity, no master, no owner • Round of communication standardized, no MPI calls visible, messages automatically packed
Parallel AOMD - Mesh Adaptation • Target is transient applications with thousands of mesh adaptation steps • Want fast and simple adaptation • Need efficient interprocessor communications • Mesh Refinement • Apply templates • Include support of non-conforming meshes • Refined entities with remote copies must be split on all partitions • Round of communication needed to ensure unique vertex IDs • Mesh Coarsening (will be the same for local mesh modifications) • Collect all mesh entities involved onto one partition • Carry out operation using serial operators on processor
Dynamic Load Balancing and Mesh Migration • Need dynamic load balancing after mesh adaptation • Procedures build on balancing procedures in Zoltan (from Sandia) • PAOMD used to provide Zoltan needed entities and connections • Load balancing procedure indicates which mesh entities are to be migrated to which processor • PAOMD only migrates the minimum set, unless the user specifically asks to migrate other entities • [Figure panels: classification after load balancing and before migration; configuration after migration]
Mesh Migration • Steps in process • Collect the mesh entities to be migrated to another partition • Determine needed higher order mesh entities to be migrated (use AOMD to determine minimal set needed) • Collect entities and any user attached data • Perform communications to send entities and update links (following methods of the Rensselaer Partition Model (RPM)) • Message passing • At PAOMD operator level it appears messages are sent one at a time • This would lead to unacceptable communication costs • Message packing used - AUTOPACK (from Argonne) • Automatically controls message packing process • Includes information and tools to optimize message size for network architecture used • Communications costs • In the examples that follow, communications were on the order of 1% of the total costs
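The effect of message packing can be illustrated with a small sketch. The class `MessagePacker` below is hypothetical and does not reproduce the AUTOPACK API: it only shows how many small messages posted one at a time can be appended to one buffer per destination, so that the transport sees a single send per destination. The transport itself is left as a caller-supplied function.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Illustrative packer, not the AUTOPACK API: small messages are appended to one
// buffer per destination and handed to the transport in a single send each.
class MessagePacker {
public:
    // Queue one small message for processor dest (nothing is sent yet).
    void post(int dest, const std::string& payload) {
        std::vector<char>& buf = buffers_[dest];
        const std::size_t n = payload.size();
        const char* len = reinterpret_cast<const char*>(&n);
        buf.insert(buf.end(), len, len + sizeof n);  // length prefix
        buf.insert(buf.end(), payload.begin(), payload.end());
        ++posted_;
    }

    // Hand every packed buffer to send(dest, data, size); returns the number of sends.
    template <class Send>
    std::size_t flush(Send send) {
        std::size_t sends = 0;
        for (auto& kv : buffers_) {
            send(kv.first, kv.second.data(), kv.second.size());
            ++sends;
        }
        buffers_.clear();
        posted_ = 0;
        return sends;
    }

    std::size_t posted() const { return posted_; }

private:
    std::map<int, std::vector<char>> buffers_;  // one buffer per destination
    std::size_t posted_ = 0;                    // messages queued since last flush
};
```

Three messages posted to two destinations result in two sends. A real packer would also bound the buffer size to suit the network, which this sketch does not attempt.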
Implementation issues • Orientation of entities computed on the fly • Language • C++ and generic programming • STL, significant new feature of the language • Programming with concepts • C++ and OO programming • Generic and OO are complementary • Trade-off between efficiency and flexibility? • We believe not • Templates, functors… generic programming is efficient • Classical example, quick sort • std::sort is 4 times faster (with VC6) than C qsort • Parallel • Autopack, automatic message packing • Zoltan, dynamic load balancing and partitioning • STL, associative containers, algorithms
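The quick sort comparison can be sketched as below. The two helpers `sortWithQsort` and `sortWithStdSort` are names introduced for this example. The sketch only shows where the difference comes from (a comparison called through a function pointer versus one resolved at compile time); it does not measure anything, and the factor quoted on the slide is specific to VC6.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Comparison callback for qsort: invoked through a function pointer on void*.
int compareInts(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

// C-style sort: type information is erased, every comparison is an indirect call.
std::vector<int> sortWithQsort(std::vector<int> v) {
    if (!v.empty()) std::qsort(v.data(), v.size(), sizeof(int), compareInts);
    return v;
}

// Generic sort: the comparison is known at compile time and can be inlined.
std::vector<int> sortWithStdSort(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Both functions return the same sorted sequence, so the generic version gives up no flexibility to gain its speed.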
Mesh refinement • class AOMD_RefCallback • {public : • virtual int operator () (const meshEntity *) const = 0; • virtual void callback (std::list<meshEntity *> &before, • std::list<meshEntity *> &after • ) const = 0; • }; • Conformal or not (hanging nodes or mixed meshes) • Typically • class myAOMD_RefCallback : public AOMD_RefCallback; • AOMD::RefUnref(theMesh, myAOMD_RefCallback);
Communications • class AOMD_RoundOfComm • {public : • virtual char * sendBuffer (const meshEntity *, • int dest_proc, • size_t &sizebuf) const = 0; • virtual char * recvBuffer (const meshEntity *, • int src_proc, • size_t sizebuf) const = 0; • }; • Messages are packed (autopack) • Typically • class myAOMD_RoundOfComm : public AOMD_RoundOfComm; • AOMD::roundOfComm(theMesh, myAOMD_RoundOfComm);
Load balancing • class AOMD_LBCallback • {public : • virtual char * sendBuffer (const meshEntity *, • int dest_proc, • size_t &sizebuf) const = 0; • virtual char * recvBuffer (const meshEntity *, • int src_proc, • size_t sizebuf) const = 0; • }; • Messages are packed (autopack) • Typically • class myAOMD_LBCallback : public AOMD_LBCallback; • AOMD::LB(theMesh, myAOMD_LBCallback);
Demonstration of Load Balancing • Refined to moving level function - non-conforming triangular mesh
http://www.scorec.rpi.edu/DG • Design specifications • Solve any conservation law, in any dimension, in parallel and adaptively, using any system of curvilinear coordinates, any discretization, any spatial basis and any time stepping scheme • class ConservationLaw : fluxes, numerical fluxes fn and g, right hand side r, equation of state, initial and boundary conditions : the physics • class Integrator : specializations for geometrical elements • class Metric : Euclidean, axisymmetric • class FunctionSpace : orthogonal or not • class Solver : forward Euler, Runge-Kutta, matrix free GMRES • Examples • Navier-Stokes, Euler, Burgers, Maxwell-Boltzmann...
2-D Examples • 2D Rayleigh Taylor • Red (heavy), blue (light) • On left, 4 levels of refinement; on right, 2 levels of refinement • Faster growth with more refinement, as expected
2-D Animation of Instability • Linear DG elements, 30,000 to 800,000 dof • Atwood number, A = 1/3 • 10 Fourier modes in “random” distribution • Time for the bubble to reach the top of the window (y = 0.5) : 5 sec • This calculation: alpha = 0.06 • Experiments: alpha = 0.058 - 0.065 • Theory (Glimm, et al): alpha 0.045 = 0.06
Refined 3-D Meshes for Rayleigh Taylor Instability • Non-conforming hexahedron mesh • [Figure labels: light fluid, heavy fluid; panels at 24, 104 and 72 steps of refinement]
Rayleigh Taylor Instability • 128 processors of Blue Horizon • $10^8$ dofs • 64 processors of the PSC alpha cluster • $1 \times 10^6$ to $2.0 \times 10^7$ dofs
Conclusions • PAOMD advantages • Quite small piece of software, documented • Focused, mesh management only • Asks for minimum user knowledge about parallel issues • Efficient implementation • Future work • Terascale computers, more than 1000 processors (in progress, ASCI & SciDAC projects) • Anisotropic mesh refinement (in progress, with X. Li) • TSTT Mesh component • Hardware heterogeneity, machine and network models have to be added in partitioners (in progress) • Modification of the design, storing less