Access Patterns, Metadata, and Performance. Alok Choudhary and Wei-Keng Liao, Department of ECE, Northwestern University. Collaboration with ANL. SDM kickoff meeting, July 10-11, 2001.
Access Patterns, Metadata, and Performance
Alok Choudhary and Wei-Keng Liao
Department of ECE, Northwestern University
Collaboration with ANL
SDM kickoff meeting, July 10-11, 2001
choudhar@ece.nwu.edu

Virtuous Cycle
[Cycle diagram; stage labels as shown on the slide]
• Simulation (execute app, generate data)
• Problem setup (mesh, domain decomposition)
• Manage, visualize, analyze
• Measure results, learn, archive

Data Access Sequence Dependency
• Temporal dependency
  • Access the same data set at different time stamps
• Spatial dependency
  • Access different data sets at the same time stamp
• Resolution dependency
  • Access the same data set at different resolutions
• The sequence is useful for improving I/O performance, e.g., pre-fetch, pre-stage, storage continuity

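The three dependencies above can be turned into a simple pre-fetch predictor. The following is a minimal sketch, not the system described in the slides: the `Access` record, the `classify` and `predict_next` functions, and the ordered `datasets` list are all hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Optional


class Access(NamedTuple):
    """One observed access (hypothetical record layout)."""
    dataset: str      # name of the data set, e.g. "temperature"
    timestep: int     # time stamp of the dump
    resolution: int   # resolution level of the data


def classify(prev: Access, last: Access) -> str:
    """Label the dependency between two consecutive accesses."""
    same_set = prev.dataset == last.dataset
    same_time = prev.timestep == last.timestep
    same_res = prev.resolution == last.resolution
    if same_set and same_res and not same_time:
        return "temporal"      # same data set, different time stamp
    if same_time and same_res and not same_set:
        return "spatial"       # different data sets, same time stamp
    if same_set and same_time and not same_res:
        return "resolution"    # same data set, different resolution
    return "unknown"


def predict_next(prev: Access, last: Access,
                 datasets: List[str]) -> Optional[Access]:
    """Guess the next access so it can be pre-fetched; None if no guess."""
    kind = classify(prev, last)
    if kind == "temporal":
        stride = last.timestep - prev.timestep
        return last._replace(timestep=last.timestep + stride)
    if kind == "resolution":
        stride = last.resolution - prev.resolution
        return last._replace(resolution=last.resolution + stride)
    if kind == "spatial":
        # Assumes data sets are visited in the order given by `datasets`.
        i = datasets.index(last.dataset)
        if i + 1 < len(datasets):
            return last._replace(dataset=datasets[i + 1])
    return None
```

A caller would feed the last two accesses to `predict_next` and, if the result is not `None`, issue a pre-fetch or pre-stage request for it before the application asks.
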
Spatial Data Access Patterns
• Parallel partition patterns
  • Regular, irregular
  • Static, dynamic during simulation
• Access sequence
  • Spatial, temporal, resolution
• Access frequency
  • Once only, multiple times (overwrite for restart)
• Access amount
  • Large, medium, small chunks

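As an illustration of a regular, static partition pattern, the sketch below computes which sub-block of a global array each process owns under a block decomposition. The function names and the process-grid convention are assumptions made for this example; the slides do not specify a particular decomposition.

```python
from typing import Sequence, Tuple


def block_range(n: int, parts: int, index: int) -> Tuple[int, int]:
    """Return (start, count) of block `index` when n items are split into `parts` blocks.

    The first n % parts blocks receive one extra item, so block sizes
    differ by at most one.
    """
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    count = base + (1 if index < extra else 0)
    return start, count


def block_partition(shape: Sequence[int], grid: Sequence[int],
                    coords: Sequence[int]) -> Tuple[tuple, tuple]:
    """Return (starts, counts) of the sub-array owned by the process at `coords`.

    `shape` is the global array shape, `grid` the number of processes
    along each dimension, `coords` the process position in that grid.
    """
    starts, counts = [], []
    for n, parts, c in zip(shape, grid, coords):
        s, k = block_range(n, parts, c)
        starts.append(s)
        counts.append(k)
    return tuple(starts), tuple(counts)
```

The `(starts, counts)` pair is the kind of partition description that could be stored as metadata and later used to build a file view for parallel I/O.
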
Access Patterns for Visualization/Analysis
• Generated from real data during simulation or in a post-simulation process
• Smaller in size than the real data
  • Type conversion, e.g., float to unsigned char
  • Reduce/increase resolution
  • Projection from 3D to 2D
• Three types of data generation and display sequences

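The three reductions listed above can be sketched in a few lines. This is an illustrative example only: linear min-max scaling, strided sub-sampling, and maximum projection along the first axis are assumptions chosen for simplicity, not the methods used by the applications in the slides.

```python
from typing import List, Sequence


def to_uchar(values: Sequence[float]) -> bytes:
    """Type conversion: scale floats linearly onto 0..255 (one byte each)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0   # avoid division by zero for constant data
    return bytes(int(round(255 * (v - lo) / span)) for v in values)


def reduce_resolution(grid: List[List[float]], k: int) -> List[List[float]]:
    """Resolution reduction: keep every k-th sample in both dimensions."""
    return [row[::k] for row in grid[::k]]


def project_max(volume: List[List[List[float]]]) -> List[List[float]]:
    """3D-to-2D projection: maximum over the first axis of volume[z][y][x]."""
    return [[max(column) for column in zip(*rows)] for rows in zip(*volume)]
```

Each step shrinks the data: four-byte floats become single bytes, sub-sampling by `k` keeps roughly 1/k² of a 2D grid, and projection removes one whole dimension.
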
Architecture
[Architecture diagram. Components: User Applications (Simulation, Data Analysis, Visualization); MDMS (metadata: access pattern, history; inputs: system metadata, performance); Storage Systems (I/O interface); MPI-IO (other interfaces). Labeled flows: query input; metadata, hints, directives; associations; data; OIDs and parameters for I/O; I/O function (best I/O for these parameters); schedule, prefetch, cache hints (collective I/O).]

Approach
• Manage metadata using an OR-DBMS
  • Collect and organize metadata in relational tables
  • Design a metadata query interface using SQL
• Access to HSS
  • Obtain current storage layout, configuration
  • Native I/O interfaces or MPI-IO
• I/O optimization
  • Determine optimal I/O calls
  • Overlap I/O with computation, communication, and other I/O
  • Pre-fetch, pre-stage, migrate, purge in HSS
  • Sub-filing for large files, file containers for small files

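To make the idea of metadata in relational tables with an SQL query interface concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the OR-DBMS. The table names, column names, and sample row are all hypothetical; the slides do not give the actual schema.

```python
import sqlite3


def build_catalog() -> sqlite3.Connection:
    """Create an in-memory metadata catalog with one sample data set."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE run (
            run_id      INTEGER PRIMARY KEY,
            application TEXT,
            started     TEXT
        );
        CREATE TABLE dataset (
            dataset_id        INTEGER PRIMARY KEY,
            run_id            INTEGER REFERENCES run(run_id),
            name              TEXT,
            data_type         TEXT,
            partition_pattern TEXT,
            file_path         TEXT
        );
        CREATE TABLE access_pattern (
            dataset_id INTEGER REFERENCES dataset(dataset_id),
            io_mode    TEXT,
            collective INTEGER   -- 1 = collective I/O, 0 = independent
        );
    """)
    # Sample rows; every value here is made up for the example.
    db.execute("INSERT INTO run VALUES (1, 'demo_app', '2001-07-10')")
    db.execute("INSERT INTO dataset VALUES "
               "(1, 1, 'temperature', 'float', 'regular', '/data/demo/temperature.dat')")
    db.execute("INSERT INTO access_pattern VALUES (1, 'write', 1)")
    db.commit()
    return db


def io_hints(db: sqlite3.Connection, application: str, name: str):
    """Look up where a data set lives and how it should be accessed."""
    return db.execute(
        """
        SELECT d.file_path, d.partition_pattern, a.io_mode, a.collective
        FROM dataset AS d
        JOIN run AS r ON r.run_id = d.run_id
        JOIN access_pattern AS a ON a.dataset_id = d.dataset_id
        WHERE r.application = ? AND d.name = ?
        """,
        (application, name),
    ).fetchone()
```

An application would call something like `io_hints(db, "demo_app", "temperature")` before opening the file and use the returned row to pick its I/O calls.
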
Metadata
• Application Level
  • Algorithms, compiling, execution environments
  • Time stamps, parameters, result summary
• Programming Level
  • Data types, structures, association of datasets, partition patterns
• Storage System Level
  • File locations, file structure, I/O modes, host names, device types, path names, storage hierarchy
• Performance Level
  • I/O bandwidth of HSS for local and remote access
  • Data access sequence, frequency, other access hints
  • Collective or non-collective I/O

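One way performance-level metadata could be used is to estimate transfer time from recorded bandwidth and pick the faster source. The sketch below assumes a simple cost model (fixed latency plus size divided by bandwidth); the `StorageEntry` record, its fields, and the example figures are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StorageEntry:
    """Performance metadata for one storage location (hypothetical layout)."""
    location: str
    bandwidth_mb_s: float    # measured I/O bandwidth in MB/s
    latency_s: float = 0.0   # fixed start-up cost, e.g. staging or mount time


def estimated_seconds(entry: StorageEntry, size_mb: float) -> float:
    """Estimated time to move size_mb megabytes: latency + size / bandwidth."""
    return entry.latency_s + size_mb / entry.bandwidth_mb_s


def fastest_source(entries: Iterable[StorageEntry], size_mb: float) -> StorageEntry:
    """Pick the location with the smallest estimated transfer time."""
    return min(entries, key=lambda e: estimated_seconds(e, size_mb))
```

With a local disk at 100 MB/s and no latency and a remote store at 200 MB/s with 30 s of latency, a 1000 MB transfer is estimated at 10 s locally and 35 s remotely, while a 10000 MB transfer is estimated at 100 s locally and 80 s remotely, so the choice depends on the access amount.
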
Applications
• Astro3D: studies the highly turbulent convective layers of late-type stars
  • Write only
  • Regular partition on all data sets
• ENZO: simulates the formation of a cluster of galaxies consisting of gas and stars
  • Both read and write
  • Both regular and irregular partitions
  • Adaptive Mesh Refinement, dynamic load balancing
• Common features
  • Checkpoint / restart
  • Post-simulation data analysis
  • Visualizing the process of the computation in the form of a movie

Interface

Run Application

Dataset and Access Pattern Table

Data Analysis

Integrating Analysis
[Cycle diagram; stage labels as shown on the slide]
• Simulation (execute app, generate data)
• On-line analysis and mining
• Problem setup (mesh, domain decomposition)
• Manage, visualize, analyze
• Measure results, learn, archive