
What did Flash do? (Flash2 I/O and Visualization)



Presentation Transcript


  1. What did Flash do? (Flash2 I/O and Visualization) By Brad Gallagher

  2. Checkpoint Files in Flash2

  Checkpoint files contain:
  1. Tree data
  2. Grid data
  3. Physical variables data (Paramesh unk)

  Tree data: mostly Paramesh data structures, plus some data structures derived by Flash.
  Grid data: Paramesh data structures that define the AMR grid; the tree data is used to properly interpret the grid data.
  Physical variable data (unknowns): a 5d array containing the solution data; only the interior cells are written. There can be up to 24 variables per nxb by nyb by nzb by total_blocks.
  Stored unk: unk(nvar, nxb, nyb, nzb, total_blocks)

  3. Checkpoint Files in Flash2, cont. • Header data: • Misc header data: setup call, build time, geometry name, Flash modules used, etc. (many more) • Simulation parameters (total_blocks, simulation time, dt, redshift, nxb, nyb, nzb, number of steps) • Timer data: • Fully nested timer data for various modules and subroutines • Particle data: • Full snapshot of particle tracer data

  4. Restarting a Flash simulation from a checkpoint • How to do it: • Set “restart” to true in flash.par • Provide the correct checkpoint number (cpnumber) to start from in flash.par • Be sure to provide what will be the right subsequent plotfile number (ptnumber) • What it does: • First the simulation parameters are read • total_blocks from the simulation parameters is used, along with the number of processors, to calculate the approximate number of blocks to put on each processor • Global offsets are recalculated: the offset of processor i+1 is the offset of processor i plus the number of blocks on processor i, so each processor’s offset equals the number of blocks to the left of it • Once all the data is read, the gid array (global block ids) is used to rebuild the tree structure (Paramesh’s neigh, child and parent); then guardcells can be exchanged and you are ready to take another timestep.

  5. Flash2 Plotfiles • Plotfiles: • Contain the same tree and grid data as checkpoints • Contain only a subset of the unknown data (as defined by flash.par) • Reals are stored in single precision (both unknowns and real grid data) • Data can be cell-centered or corner-interpolated • Plotfiles are dumped based on the tplot parameter as specified in flash.par • In general they are dumped more frequently than checkpoints

  6. Particle Plot Files in Flash2 • Output like plotfiles, if the time elapsed since the last output > ptplot • Can also be output every timestep with a filter: a subset of particles is selected for dumping at each timestep based on a user definition • Flexible output style control • A single file, a new file per dump, or a new file after N particles have been dumped • Particle data can be written as a structure of mixed types or as arrays of the same type • The underlying binary I/O libraries restrict the above (pnetcdf must write arrays of the same type) • Runtime parameters that control particle plotfile output • pptype : LOGICAL : if true, output to a new file for each particle plot call; if false, write to the same file • nppart : INTEGER : if pptype is false, still create a new file after nppart particles have been dumped • pfilter : LOGICAL : if true, output a particle plotfile every timestep (user-defined filter in place); if false, dump based on ptplot • ptplot : REAL : time elapsed between particle plotfile dumps • pstruct : LOGICAL : if true, dump particle data as a mixed-data-type structure; if false, split the structure into arrays of the same type
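The runtime parameters above are set together in flash.par; a hedged example combining them (the values are illustrative, not recommendations):

```
# particle plotfile control (illustrative values)
pptype  = .false.   # keep writing to the same particle plotfile...
nppart  = 100000    # ...but start a new file after 100000 particles
pfilter = .false.   # dump based on ptplot, not every timestep
ptplot  = 0.001     # time elapsed between particle plotfile dumps
pstruct = .true.    # mixed-type structure (not usable with pnetcdf)
```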

  7. Details cont. (Tree Data)

  Tree data: an oct-tree in 3d, a quad-tree in 2d.

  lrefine: the level of refinement of a block. Dimension: (total_blocks)
  nodetype: describes the relationship of a block to others. 1: block is a leaf node (no children). 2: block is a parent with at least one leaf child. 3: block has children but no leaf children. Dimension: (total_blocks)
  gid: global block id array, derived by Flash using child, nfaces, neigh and parent. Used upon restart to rebuild the parent data structure, which is needed to do the guardcell exchange after restart. Dimension: (nfaces+1+nchild, total_blocks)

  8. Details cont. (Grid Data)

  coordinate data: the xyz coordinates of the center of a block. Dimension: (ndim, total_blocks)
  size data: the xyz extents of a block. Dimension: (ndim, total_blocks)
  bounding box data: the upper and lower edge of a block along the jth coordinate axis. Dimension: (2, ndim, total_blocks)

  9. Flash2 flash.par (sedov explosion)

  # Runtime parameters for the Sedov explosion problem.
  p_ambient   = 1.E-5
  rho_ambient = 1.
  exp_energy  = 1.
  r_init      = 0.013671875
  xctr = 0.5
  yctr = 0.5
  zctr = 0.5
  gamma = 1.4
  geometry = "cartesian"
  xmin = 0.
  xmax = 1.
  ymin = 0.
  ymax = 1.
  zmin = 0.
  zmax = 1.
  xl_boundary_type = "outflow"
  xr_boundary_type = "outflow"
  yl_boundary_type = "outflow"
  yr_boundary_type = "outflow"
  zl_boundary_type = "outflow"
  zr_boundary_type = "outflow"
  cfl = 0.8
  lrefine_max = 6
  basenm = "sedov_2d_6lev_"
  restart = .false.
  trstrt = 0.01
  tplot = 0.001
  nend = 10000
  tmax = 0.05
  run_comment = "2D Sedov explosion, from t=0 with r_init = 3.5dx_min"
  log_file = "sedov_2d_6lev.log"
  eint_switch = 1.e-4

  10. Output Flow in Flash2

  11. Serial vs. Parallel I/O in Flash2 • Serial I/O: the master process gathers data from all others in the communicator and dumps the data. Supported in HDF5. • Parallel I/O: each process writes its own data in a fully non-contiguous manner using global offsets. Global offsets are calculated by gathering the block counts from all processes; each process then determines the number of blocks “to the left” of itself. The same is done for particles. Supported in HDF5 and Pnetcdf.

  12. How does Flash do parallel I/O really? • Paramesh data structures are local to a processor • For example, the lrefine data structure has a dimension of (local_blocks) • Using a global offset that is local to each processor, along with the local block count on each processor, the full lrefine(total_blocks) data can be written to a file in parallel, non-contiguously • Real example (writing the level of refinement in parallel):

  void FTOC(h5_write_lrefine)(hid_t* file_identifier, int lrefine[],
                              int* local_blocks, int* total_blocks,
                              int* global_offset)
  {
    int rank;
    hsize_t dimens_1d;
    hssize_t start_1d;
    hsize_t stride_1d, count_1d;
    hid_t dataspace, dataset, memspace;
    herr_t status;

    /* file dataspace spans all blocks on all processors */
    rank = 1;
    dimens_1d = *total_blocks;
    dataspace = H5Screate_simple(rank, &dimens_1d, NULL);
    dataset = H5Dcreate(*file_identifier, "refine level",
                        H5T_NATIVE_INT, dataspace, H5P_DEFAULT);

    /* each processor selects the slab starting at its global offset */
    start_1d = (hssize_t) (*global_offset);
    stride_1d = 1;
    count_1d = (hsize_t) (*local_blocks);
    status = H5Sselect_hyperslab(dataspace, H5S_SELECT_SET, &start_1d,
                                 &stride_1d, &count_1d, NULL);

    /* memory dataspace covers only the local blocks */
    dimens_1d = *local_blocks;
    memspace = H5Screate_simple(rank, &dimens_1d, NULL);
    status = H5Dwrite(dataset, H5T_NATIVE_INT, memspace, dataspace,
                      H5P_DEFAULT, lrefine);

    H5Sclose(memspace);
    H5Dclose(dataset);
    H5Sclose(dataspace);
  }

  Called from Fortran as:

  call h5_write_lrefine(file_id, lrefine, lnblocks, tot_blocks, global_offset)

  13. Parallel I/O: putting it all together

  14. HDF5 vs. Parallel NetCDF • Parallel-NetCDF is more lightweight and more tightly coupled to MPI-IO. • Writing attribute/header data in Parallel-NetCDF is done in one call, while in HDF5 multiple single-processor writes must occur to complete these operations. • There is some good indication that Parallel-NetCDF is faster.

  15. Visualizing Flash Data • FILDR3 (2d/3d slice tool) • Uses IDL • Can make 2d images and 3d slices of Flash data • Has support for particles • Can give 1d slices of 2d data • Can show the AMR grid • Both HDF5 and pnetcdf currently supported • Flashviz • Developed by Mike Papka and Randy Hudson (MCS, Argonne) • Renders 3d isosurfaces of Flash data • Fully parallel (both reading the file and the geometry calculations are done in parallel) • Parallelism makes it possible to look at very large datasets interactively • Supports rendering multiple isosurfaces of multiple variables, or of the same variable • Can show the AMR grid in 3d • Has xyz cut-plane features so that the entire range of data for a variable can be projected on a plane • Future support for parallel rendering • Currently only supports the HDF5 format

  16. Example of 3d slice taken by FILDR3

  17. Example of an isosurface (single variable)

  18. Another example with two variables

  19. Another example/ isosurface with cut plane

  20. Isosurface with Grid

  21. Issues with Cell-centered vs. Corner Interpolated Data

  22. Issues with cell-centered vs. corner interpolated data
