HPC2 Activities • NYS High Performance Computation Consortium (HPC2), funded by NYSTAR at $1M/year for 3 years • Goal is to support NY State users in applying HPC technologies to: • Research and discovery • Product development • Improved engineering and manufacturing processes • The HPC2 is a distributed activity; participants are Rensselaer, Stony Brook/Brookhaven, SUNY Buffalo, and NYSERNet
NY State Industrial Partners • Xerox • Corning • ITT Fluid Technologies: Goulds Pumps • Global Foundries
Modeling Two-phase Flows Objectives • Demonstrate end-to-end solution of two-phase flow problems. • Couple with a structural mechanics boundary condition. • Provide an interfaced, efficient, and reliable software suite for guiding design. Tools • Simmetrix SimAppS graphical interface – mesh generation and problem definition • PHASTA – two-phase level-set flow solver • phParAdapt – solution transfer and mesh adaptation driver • Kitware ParaView – visualization Systems • CCNI BG/L, CCNI Opteron cluster
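For readers unfamiliar with the level-set approach PHASTA uses to track the liquid-gas interface, the following is a minimal sketch of the core idea, not PHASTA's implementation: the interface is the zero isocontour of a signed distance field, and phase properties are blended across it with a smeared Heaviside function.

```python
# Minimal level-set sketch (illustration only; not PHASTA's implementation).
# The interface between the two phases is the zero isocontour of a signed
# distance field phi: phi < 0 inside the bubble, phi > 0 outside.
import numpy as np

# Uniform grid over a unit cube (coarse, for illustration).
n = 32
x, y, z = np.meshgrid(*(np.linspace(0.0, 1.0, n),) * 3, indexing="ij")

# Signed distance to a spherical gas "bubble" of radius 0.2 in the cube.
center, radius = np.array([0.5, 0.5, 0.5]), 0.2
phi = np.sqrt((x - center[0])**2 + (y - center[1])**2
              + (z - center[2])**2) - radius

# Phase properties are blended across the interface with a smeared Heaviside
# function of half-width eps (a few grid cells), as in level-set flow solvers.
eps = 2.0 / n
H = np.clip(0.5 * (1.0 + phi / eps + np.sin(np.pi * phi / eps) / np.pi),
            0.0, 1.0)
rho_liquid, rho_gas = 1000.0, 1.2
rho = rho_gas + (rho_liquid - rho_gas) * H  # H=1 outside bubble -> liquid

print(f"cells inside bubble: {(phi < 0).sum()}, density range: "
      f"{rho.min():.1f} .. {rho.max():.1f}")
```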
Modeling Two-phase Flows: 3D Example Simulation • [Animation placeholder] • Fluid ejected into air. Ran on 4,000 CCNI BG/L cores.
Two-phase Automated Mesh Adaptation • Six iterations of mesh adaptation on a two-phase simulation. • Ran autonomously on 128 cores of the CCNI Opteron cluster for approximately 4 hours
Modeling Two-phase Flows: Software Support for Fluid-Structure Interactions • Initial work interfaces the simulations through serial file formats for displacement and pressure data. • The structural mechanics simulation runs in serial; the PHASTA simulation runs in parallel. • Distribute serial displacement data to the partitioned PHASTA mesh. • Aggregate partitioned PHASTA nodal pressure data into a serial input file. • Modifications to the automated mesh adaptation Perl script. • Structural mechanics mesh of the input face; PHASTA partitioned mesh of the input face
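The scatter/gather steps above can be sketched with mpi4py; the global-ID map and array names here are illustrative assumptions, not PHASTA's actual interface.

```python
# Sketch of the serial<->parallel data exchange described above, using mpi4py.
# Assumes each part knows the global IDs of the nodes it owns
# (`local_to_global`); the ID map and arrays are illustrative only.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nparts = comm.Get_rank(), comm.Get_size()

n_global = 12  # total nodes in the serial structural mechanics mesh
# Illustrative partition: node i lives on part i % nparts.
local_to_global = np.array([i for i in range(n_global) if i % nparts == rank])

# --- Distribute serial displacement data to the partitioned mesh ---
# Rank 0 would read the serial displacement file; here we fabricate it.
displacements = np.arange(n_global, dtype=float) * 0.1 if rank == 0 else None
displacements = comm.bcast(displacements, root=0)   # broadcast serial data
local_disp = displacements[local_to_global]         # each part keeps its nodes

# --- Aggregate partitioned nodal pressures to a serial array on rank 0 ---
local_pressure = 101325.0 + local_disp              # stand-in solver output
pairs = comm.gather((local_to_global, local_pressure), root=0)
if rank == 0:
    serial_pressure = np.empty(n_global)
    for gids, vals in pairs:
        serial_pressure[gids] = vals                # place by global node ID
    print("serial pressure vector:", serial_pressure)  # would go to a file
```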
Modeling Free Surface Flows Objectives • Demonstrate the capability of available computational tools/resources for parallel simulation of highly viscous sheet flows. • Solve a model sheet flow problem relevant to the actual process/geometry. • Develop and define processes for high-fidelity twin-screw extruder parallel CFD simulation. Investigated Tools (to date) • ACUSIM AcuConsole and AcuSolve, Simmetrix MeshSim, Kitware ParaView Systems • CCNI Opteron cluster
Parallel 3D Sheet Flow Simulation • High aspect ratio sheet • Aspect ratio: 500:1 • Element count: 1.85 million • 7 minutes on 512 cores • 300 minutes on 8 cores
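These timings imply a strong-scaling efficiency that is easy to check; a quick calculation using only the numbers on this slide:

```python
# Strong-scaling efficiency implied by the sheet-flow timings above.
t_slow, p_slow = 300.0, 8      # 300 minutes on 8 cores
t_fast, p_fast = 7.0, 512      # 7 minutes on 512 cores

speedup = t_slow / t_fast                    # ~42.9x faster wall clock
ideal = p_fast / p_slow                      # 64x more cores
print(f"efficiency: {speedup / ideal:.0%}")  # ~67% of ideal scaling
```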
Screw Extruder: Simulation-Based Design Tools • Mesh generation in the Simmetrix SimAppS graphical interface. • Gaps that are ~1/180 of the large feature dimension. • Conceptual rendering of a single screw extruder assembly* • Single screw extruder CAD** • * http://en.wikipedia.org/wiki/Plastics_extrusion • ** https://sites.google.com/site/oscarsalazarcespedescaddesign/project03
Modeling Pump Flows Objectives • Apply HPC systems and software to set up and run 3D pump flow simulations in hours instead of days. • Provide automated mesh generation for fluid geometries with rotating components. Tools • ACUSIM Suite, PHASTA, ANSYS CFX, FMDB, Simmetrix MeshSim, Kitware ParaView Systems • CCNI Opteron cluster
Modeling Pump Flows: Graphical Interfaces • AcuConsole interface • Problem definition, mesh generation, runtime monitoring, and data visualization
Mesh Generation Tools • Simmetrix provided a customized mesh generation and problem definition GUI after iterating with the industrial partner. • Supports automated identification of pump geometric model features and application of attributes • Problem definition with support for exporting data to multiple CFD analysis tools. • Reduced mesh generation time frees engineers to focus on the simulation and design optimizations that improve products
Scientific Computation Research Center • Goal: Develop simulation technologies that allow practitioners to evaluate systems of interest. • To meet this goal we: • Develop adaptive methods for reliable simulations • Develop methods to do all computation on massively parallel computers • Develop multiscale computational methods • Develop interoperable technologies that speed simulation system development • Partner on the construction of simulation systems for specific applications in multiple areas
SCOREC Software Components • Software available (http://www.scorec.rpi.edu/software.php) • Some tools are not yet linked – email shephard@scorec.rpi.edu with any questions • Simulation Model and Data Management • Geometric model interface to interrogate CAD models • Parallel mesh topological representation • Representation of tensor fields • Relationship manager • Parallel Control • Neighborhood-aware message packing • Iterative mesh partition improvement with multiple criteria • Processor mesh entity reordering to improve cache performance (see the sketch below)
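As referenced above, here is a minimal sketch of the cache-reordering idea using a breadth-first (Cuthill-McKee-style) traversal of a vertex adjacency graph; the graph and the ordering heuristic are illustrative, not the SCOREC implementation.

```python
# Sketch of reordering mesh vertices for cache locality via a breadth-first
# (Cuthill-McKee-style) traversal; the adjacency graph is illustrative.
from collections import deque

# Vertex adjacency of a small mesh: vertex -> neighboring vertices.
adjacency = {0: [3, 5], 1: [4], 2: [5], 3: [0, 4], 4: [1, 3, 5], 5: [0, 2, 4]}

def reorder(adj):
    """Return vertices in BFS order from a lowest-degree vertex, so
    neighbors (which share cache lines in solver loops) end up adjacent."""
    start = min(adj, key=lambda v: len(adj[v]))
    order, seen, queue = [], {start}, deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        # Visit lower-degree neighbors first, as in Cuthill-McKee.
        for w in sorted(adj[v], key=lambda u: len(adj[u])):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

print("new vertex order:", reorder(adjacency))  # [1, 4, 3, 5, 0, 2]
```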
SCOREC Software Components (Continued) • Adaptive Meshing • Adaptive mesh modification • Mesh curving • Adaptive Control • Support for executing parallel adaptive unstructured mesh flow simulations with PHASTA • Adaptive multimodel simulation infrastructure • Analysis • Parallel Hierarchic Adaptive Stabilized Transient Analysis software for compressible or incompressible, laminar or turbulent, steady or unsteady flows on 3D unstructured meshes (with U. Colorado) • Parallel hierarchic multiscale modeling of soft tissues
Interoperable Technologies for Advanced Petascale Simulations (ITAPS) • [Diagram: Petascale integrated tools (AMR front tracking, shape optimization, solution adaptive loop, solution transfer, petascale mesh generation) build on component tools (front tracking, smoothing, mesh adapt, swapping, interpolation kernels, dynamic services, geometry/mesh services), which are unified by common interfaces (mesh, geometry, relations, field).]
PHASTA Scalability (Jansen, Shephard, Sahni, Zhou) • Excellent strong scaling • Implicit time integration • Employs the partitioned mesh for system formulation and solution • A specific number of ALL-REDUCE communications is also required • 105M-vertex mesh (CCNI Blue Gene/L) • 1-billion-element anisotropic mesh on Intrepid Blue Gene/P
Strong Scaling – 5B Mesh up to 288k Cores • AAA 5B elements: full-system scale on JUGENE (IBM BG/P system) • Without ParMA partition improvement the strong-scaling factor is 0.88 (time is 70.5 secs). • Can yield 43 CPU-years of savings for production runs!
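To show how such numbers are computed: the strong-scaling factor is achieved speedup over ideal speedup, and CPU-year savings are the per-run time reduction times the core count. Only the 70.5 s / 288k-core point comes from the slide; the baseline run, the improvement, and the run count below are illustrative assumptions chosen to reproduce the quoted figures.

```python
# How the quoted strong-scaling factor and CPU-year savings are computed.
# Only the 70.5 s / 288k-core point is from the slide; other numbers are
# illustrative assumptions chosen to reproduce the quoted figures.
SECONDS_PER_YEAR = 365 * 24 * 3600

def scaling_factor(t_base, p_base, t, p):
    """Strong-scaling factor: achieved speedup divided by ideal speedup."""
    return (t_base / t) / (p / p_base)

p, t = 288_000, 70.5                    # reported: without ParMA
p_base, t_base = 16_384, 1090.6         # assumed baseline run
print(f"factor: {scaling_factor(t_base, p_base, t, p):.2f}")    # ~0.88

# Savings: shaving dt seconds off each of n production solves on p cores.
dt, n = 8.5, 554                        # assumed improvement and run count
print(f"cpu-years saved: {dt * n * p / SECONDS_PER_YEAR:.1f}")  # ~43
```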
Parallel Adaptive Analysis • Requires functional support for • Mesh distribution • Mesh level inter-processor communications • Parallel mesh modification • Dynamic load balancing • Have parallel implementations for each – focusing on increasing scalability
Parallel Mesh Adaptation to 2.2 Billion Elements • Mesh size field of air bubbles distributing in a tube (segment of the model – 64 bubbles total) • Initial mesh: uniform, 17 million mesh regions • Adapted mesh: 160 air bubbles, 2.2 billion mesh regions • Multiple predictive load balance steps used to make the adaptation possible • Larger meshes are possible (memory was not exhausted) • Initial and adapted mesh (zoom of a bubble), colored by magnitude of the mesh size field
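The predictive load balancing mentioned above can be sketched as follows: estimate each part's post-refinement element count from the mesh size field and rebalance on the predicted weights before refining, so no part exhausts memory during adaptation. The numbers and the threshold are illustrative; real runs use a parallel partitioner.

```python
# Sketch of predictive load balancing before mesh adaptation: estimate each
# part's post-refinement weight from the size field, then rebalance on the
# *predicted* weights so no part blows up during refinement. All numbers
# are illustrative; real runs use a graph/geometric partitioner.

# Per-part current element counts and mean refinement factor predicted from
# the mesh size field (desired vs current edge length, cubed in 3D).
parts = {0: (1_000_000, 1.1), 1: (1_000_000, 8.0), 2: (1_000_000, 1.2)}

predicted = {p: int(n * f) for p, (n, f) in parts.items()}
avg = sum(predicted.values()) / len(predicted)

# Parts predicted to exceed the average must shed load *before* adapting.
for p, w in sorted(predicted.items()):
    action = "shed elements" if w > 1.1 * avg else "ok"
    print(f"part {p}: predicted {w:>9,d} elements -> {action}")
```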
Initial Scaling Studies of Parallel MeshAdapt • Strong-scaling test of uniform refinement on Ranger: 4.3M to 2.2B elements • Nonuniform field-driven refinement (with mesh optimization) on Ranger: 4.2M to 730M elements (time for dynamic load balancing not included) • Nonuniform field-driven refinement (with mesh optimization operations) on Blue Gene/P: 4.2M to 730M elements (time for dynamic load balancing not included)
Adaptive Loop Construction • Tightly coupled • Adv: Computationally efficient • Disadv: More complex code development • Example: Explicit solution of cannon blasts (snapshots at t=0.0, t=2e-4, t=5e-4) • Loosely coupled • Adv: Ability to use existing analysis codes • Disadv: Overhead of multiple structures and data conversion • Example: Implicit high-order active flow control modeling
File-Free Parallel-Adaptive Loop • Adaptive Loop Driver – C++ • Coordinates API calls to execute the solve-adapt loop • phSolver – Fortran 90 • Flow solver scalable to 288k cores of BG/P, Field API • phParAdapt – C++ • Invokes parallel mesh adaptation • SCOREC FMDB and MeshAdapt, Simmetrix MeshSim and MeshSimAdapt • [Diagram: the Adaptive Loop Driver exchanges control and field data with phSolver and phParAdapt through Field APIs, backed by compact mesh and solution data (mesh database, solution fields)]
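The control flow the driver implements can be sketched as below, in Python for brevity (the real driver is C++ exchanging in-memory data through Field APIs). All function bodies are stand-in stubs, not the actual APIs.

```python
# Control flow of a file-free solve-adapt loop (the real driver is C++
# coordinating phSolver and phParAdapt via in-memory Field APIs; these
# stub functions are stand-ins, not the actual interfaces).

def solve_steps(mesh, fields, n):              # phSolver role (stub)
    return [f * 0.9 for f in fields]           # pretend the residual decays

def estimate_error(mesh, fields):              # error indicator -> size field
    return [abs(f) for f in fields]

def adapt_mesh(mesh, size_field):              # phParAdapt role (stub)
    return mesh + 1                            # pretend the mesh was refined

def transfer_fields(old_mesh, new_mesh, fields):
    return list(fields)                        # solution transfer (stub)

def adaptive_loop(mesh, fields, n_cycles=3, tol=1e-3):
    for cycle in range(n_cycles):
        fields = solve_steps(mesh, fields, 100)      # 1. solve a batch of steps
        size_field = estimate_error(mesh, fields)    # 2. build size field
        if max(size_field) < tol:
            break                                    # resolved: stop adapting
        new_mesh = adapt_mesh(mesh, size_field)      # 3. adapt in parallel
        fields = transfer_fields(mesh, new_mesh, fields)  # 4. in-memory xfer
        mesh = new_mesh
    return mesh, fields

print(adaptive_loop(mesh=0, fields=[1.0, 0.5]))
```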
Mesh Curving for COMPASS Analyses • Mesh curving applied to 8-cavity cryomodule simulations • 2.97 million curved regions • 1,583 invalid elements corrected – leads to a stable simulation that executes 30% faster • Mesh close-up before and after correcting invalid mesh regions (marked in yellow)
Moving Mesh Adaptation • FETD for short-range wakefield calculations • Adaptively refined meshes have 1 to 1.5 million curved regions • A uniformly refined mesh using the small mesh size has 6 million curved regions • Electric fields on the three refined curved meshes
Patient-Specific Vascular Surgical Planning • Initial mesh has 7.1 million regions • Initial mesh is isotropic outside the boundary layer • The adapted mesh: 42.8 million regions (7.1M -> 10.8M -> 21.2M -> 33.0M -> 42.8M) • Boundary-layer-based mesh adaptation • Mesh is anisotropic
Multiscale Simulations for Collagen Indentation • Multiscale simulation linking a microscale network model to a macroscale finite element continuum model. • Collaborating with experimentalists at the University of Minnesota • Macroscale model; microscale model
Concurrent Multiscale: Atomistic-to-Continuum • Nano-indentation of a thin film. Concurrent model configuration at the 60th load step (3 Å indentation displacement). Colors represent the sub-domains in which various models are used. • Nano-void subjected to hydrostatic tension. Finite element discretization of the problem domain and dislocation structures.
Mechanics of Damage Nucleation in Devices • [Diagram: fab-aware high-performance chip design spans a size scale from atoms/carriers through devices to circuits, and a development axis from design through manufacture to use/performance. Component efforts: first-principles CMOS modeling, super-resolution lithography tools, reactive ion etching, device simulation, variation-aware circuit design, simulation automation components, parallel computing methods.]
First-Principles Modeling for Nanoelectronic CMOS (Nayak) • As Si CMOS devices shrink, nanoelectronic effects emerge. • Fermi-function-based analysis gives way to quantum energy-level analysis. • The Poisson and Schrödinger equations are reconciled iteratively, allowing for current predictions (see the sketch below). • Carrier dynamics respond to strain in increasingly complex ways, from mobility changes to tunneling effects. • New functionalities might be exploited: • Single-electron transistors • Graphene semiconductors • Carbon nanotube conductors • Spintronics – encoding information in the charge carrier's spin • Input to the circuit level from atomic-level physics • [Diagram: iterative coupling at Fermi level E – Poisson(n) → U, Schrödinger(U) → n]
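The iterative Poisson-Schrödinger reconciliation referenced above is, in essence, a self-consistent field loop; a minimal 1D finite-difference sketch with illustrative units and coupling (not the device simulator):

```python
# Minimal 1D self-consistent Poisson-Schrodinger loop (illustrative units,
# parameters, and coupling; a sketch of the iteration, not the device code).
import numpy as np

n, L = 200, 1.0
dx = L / (n + 1)
lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
       + np.diag(np.ones(n - 1), -1)) / dx**2

U = np.zeros(n)                       # initial electrostatic potential
for it in range(50):
    # Schrodinger: H = -1/2 d2/dx2 + U (hbar = m = 1), hard-wall box.
    H = -0.5 * lap + np.diag(U)
    E, psi = np.linalg.eigh(H)
    # Occupy the two lowest states; density normalized on the grid.
    rho = sum(psi[:, k]**2 for k in range(2)) / dx

    # Poisson: d2U/dx2 = -c * rho (U = 0 at the walls), weak coupling c.
    U_new = np.linalg.solve(lap, -0.01 * rho)
    if np.max(np.abs(U_new - U)) < 1e-8:
        break
    U = 0.5 * U + 0.5 * U_new         # mix old/new potentials for stability

print(f"converged in {it + 1} iterations, ground-state energy {E[0]:.3f}")
```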
Super-Resolution Lithography Analysis (Oberai) • Motivation: • Reducing feature size has made modeling the underlying physics critical. • In projective lithography, simple biases are not adequate • In holographic lithography, near-field phenomena are predominant • The modeling approach must be based on Maxwell's equations • Goal: • Develop unified computational algorithms for the design and analysis of super-resolution lithographic processes that model the underlying physics with high fidelity • Projective lithography; holographic lithography
Virtual Nanofabrication: Reactive-Ion Etching Simulation (Bloomfield) • To handle SRAM-scale systems, we expect much larger computational systems, e.g., 10^5–10^6 surface elements. • Transport tracking scales O(n²) with the number of surface elements n. • Parallelizes well – every view factor can be computed completely independently of every other view factor, giving almost linear speedup. • The computational complexity of the chemistry solver depends on the particular chemical mechanisms associated with the etch recipe; these also tend to be O(n²). • Cut-away view of a reactive ion etch simulation of an aspect ratio 1.4 via into a dielectric substrate with 7% porosity and complete selectivity with respect to the underlying etch stop. A generic ion-radical etch model was used; ~10^3 surface elements. [Bloomfield et al., SISPAD 2003, IEEE.]
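A minimal sketch of the O(n²) view-factor computation described above, using the differential-area formula with no occlusion test; the geometry is random for illustration. Each row of the matrix is independent, which is why the computation parallelizes almost linearly.

```python
# Sketch of the O(n^2) patch-to-patch view-factor computation (illustrative:
# differential-area formula, random geometry, no occlusion test). Each (i, j)
# entry is independent, which is why it parallelizes so well.
import numpy as np

rng = np.random.default_rng(0)
n = 400                                   # surface elements (10^5-10^6 in practice)
centers = rng.random((n, 3))              # patch centroids
normals = rng.normal(size=(n, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
areas = np.full(n, 1.0 / n)

F = np.zeros((n, n))
for i in range(n):                        # O(n^2) pair loop; rows are
    r = centers - centers[i]              # independent -> trivially parallel
    d2 = (r**2).sum(axis=1)
    d2[i] = np.inf                        # no self-interaction
    cos_i = (r @ normals[i]) / np.sqrt(d2)
    cos_j = -(r * normals).sum(axis=1) / np.sqrt(d2)
    # Differential view factor; only mutually visible orientations count.
    F[i] = np.clip(cos_i, 0, None) * np.clip(cos_j, 0, None) * areas / (np.pi * d2)

print(f"built {n}x{n} view-factor matrix; mean row sum {F.sum(axis=1).mean():.4f}")
```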
Stress-Induced Dislocation Formation in Silicon Devices (Picu) • At 90 nm and below, devices have come to rely on the increased carrier mobility produced by strained silicon. • As devices scale down, the relative importance of scattering centers increases. • Can we have our cake and eat it too? How much strain can be built into a given device before processing variations and thermo-mechanical loads during use cause critical dislocation shedding? • Continuum FEM calculations automatically identify critical high-stress regions. • A local atomistic problem is constructed and an MD simulation is run, looking for criticality. Results feed back to the continuum.
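The two-scale workflow described on this slide can be summarized schematically; the stress values, threshold, and stub functions below are illustrative, not results from the actual FEM/MD codes.

```python
# Schematic of the continuum<->atomistic feedback loop described above
# (stub physics; real runs use FEM stress fields and MD nucleation checks).

CRITICAL_STRESS = 2.0e9   # Pa; illustrative dislocation-nucleation threshold

def fem_stress(regions):                  # continuum FEM pass (stub)
    return dict(regions)                  # would return computed stresses

def md_check(region, stress):             # local MD criticality test (stub)
    return stress > CRITICAL_STRESS       # "does a dislocation nucleate?"

regions = {"channel": 2.4e9, "gate_edge": 1.1e9, "contact": 2.1e9}
hotspots = {r: s for r, s in fem_stress(regions).items()
            if s > 0.8 * CRITICAL_STRESS}
for r, s in hotspots.items():             # run MD only where FEM flags risk
    print(f"{r}: stress {s:.2e} Pa ->",
          "dislocation nucleation" if md_check(r, s) else "stable")
```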
Advanced Meshing Tools for Nanoelectronic Design (Shephard) • Advanced meshing tools and expertise exist at RPI and an associated spin-off • Leverage these tools to support CCNI projects such as advanced device modeling. • Local refinement and adaptivity can help carry the computational resources further: "more bang for the buck."