Using Kepler to Perform Parameter Studies in Subsurface Sciences
Jared Chase, Scientific Data Management
CET All Hands Meeting, 11/28/2007
http://subsurface.pnl.gov
Project Descriptions/Goals
• CHIPPS (Tim Scheibe, Environmental Technology)
  Project Goal: Develop an integrated multiscale modeling framework capable of directly linking different process models at the continuum, pore, and sub-pore scales.
• SALSSA (Karen Schuchardt, Applied Computer Science)
  Project Goal: Develop a process integration framework that combines and extends leading-edge technologies for process automation, data and metadata management, and large-scale data visualization.
• GWACCAMOLE (Bruce Palmer, High Performance Computing)
  Project Goal: Apply a component-based framework to the development of a new hybrid model for subsurface simulations that combines different physical models into a single coherent simulation.
Different Scale Models
[Figure: example model domains at the continuum, pore, and sub-pore scales]
Calcite Precipitation Problem
• Interested in understanding (and ultimately controlling) the distribution of solid minerals that form from the reaction of two dissolved chemicals (solutes).
• This study will allow us to understand the impact of either high- or low-permeability inclusions along a mixing pathway on the effectiveness of mixing.
• The results of modeling studies such as this will be used to design mesoscale laboratory experiments to validate our conclusions, which will in turn be used to design field-scale pilot and full-scale implementation strategies.
Project SALSSA's Goals and Requirements
Create a system that:
• Automates and integrates research processes.
• Provides records for verifiability.
• Shares and documents data, results, tools, and, ideally, processes.
• Can be used by all types of users, from model developers to experimentalists.
• Has longevity, so scientists can modify the system to suit their needs.
Continuum Workflow
[Flowchart; the numbered boxes from the slide, flattened into a list:]
1. Mathematical model definition: fixed and uncertain parameters
2. Initial grid generation
3. Grid parameter specification
4. Run numerical model
5. Output visualization
6. Output analysis
7. Qualitative/quantitative comparisons (comparative analysis with results of previous runs and/or observational data)
8. Horizontal flow simulation
9. Grid refinement ("No – refine grid" branch)
10. Parameter modification ("No – modify parameter(s)" branch)
11. Mathematical model definition ("No – modify model" branch)
12. Data preparation and management / numerical model configuration
13. Simulation data management (I/O documentation and storage)
14. Summary graphics
The "Done?" decision either loops back through steps 9–11 and reruns the model, or, on "Yes", produces the summary graphics (14) and stops.
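A minimal, self-contained sketch of the "Done?" loop in the flowchart above. The model here is a deliberately toy stand-in (it has nothing to do with STOMP); it only illustrates the control flow: run, analyze, and either stop or adjust a parameter / refine the grid and rerun.

```python
def run_model(grid_size, parameter):
    # Toy stand-in for step 4: finer grids and better parameters give smaller error.
    return abs(parameter - 3.0) + 1.0 / grid_size

grid_size, parameter = 10, 1.0
for iteration in range(20):
    error = run_model(grid_size, parameter)        # step 4: run numerical model
    print(f"iter {iteration}: grid={grid_size} param={parameter:.2f} err={error:.3f}")
    if error < 0.05:                               # "Done? Yes" -> summary graphics, stop
        break
    if abs(parameter - 3.0) > 0.1:                 # "No - modify parameter(s)" branch
        parameter += 0.5
    else:                                          # "No - refine grid" branch
        grid_size *= 2
```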
Calcite Precipitation Use Case
1. Create a STOMP study, and first run a single job with both porosity and permeability at their initial values to serve as the base case.
   Stomp Run: [Permeability (init) = Porosity (init)]
2. Run a set of jobs in which the fine sand (material 2) becomes progressively less permeable (decrease the value by factors of 10, 100, and 1000), keeping porosity the same as in case #1.
   Stomp Run: [Permeability = Permeability (init) * 0.1]
   Stomp Run: [Permeability = Permeability (init) * 0.01]
   Stomp Run: [Permeability = Permeability (init) * 0.001]
3. Starting with the settings from #1, increase permeability by factors of 10, 100, and 1000, holding porosity the same.
   Stomp Run: [Permeability = Permeability (init) * 10]
   Stomp Run: [Permeability = Permeability (init) * 100]
   Stomp Run: [Permeability = Permeability (init) * 1000]
4. Starting with the settings from #1, keep permeability the same but decrease porosity by 0.05 for a couple of iterations. Again, this applies to the fine sand.
   Stomp Run: [Porosity = Porosity (init) * 0.05]
   Stomp Run: [Porosity = Porosity (init) * 0.10]
   Stomp Run: [Porosity = Porosity (init) * 0.15]
5. Take the result where we decreased permeability by 10 and use it to create a new study.
   (Note from the slide: "It's not clear to me why you would start a new study. Maybe it's just an artificial notion of making a new study? We could also use the case of switching to a finer grid as the cause for a new study if you think that's less artificial.")
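A minimal, self-contained sketch of the run matrix described above. The dictionary keys and the base values are invented for illustration, not taken from an actual STOMP input file; and for step 4 it follows the prose ("decrease porosity by 0.05") rather than the multiplicative run labels, which appear inconsistent with it.

```python
base = {"permeability": 1.0e-12, "porosity": 0.35}  # hypothetical base case (run 1)

runs = [dict(base)]  # run 1: permeability and porosity at initial values

# Run set 2: fine sand (material 2) progressively less permeable
runs += [{**base, "permeability": base["permeability"] * f}
         for f in (0.1, 0.01, 0.001)]

# Run set 3: permeability increased by factors of 10, 100, 1000; porosity held fixed
runs += [{**base, "permeability": base["permeability"] * f}
         for f in (10, 100, 1000)]

# Run set 4: permeability held fixed; porosity decreased in 0.05 steps
runs += [{**base, "porosity": base["porosity"] - d}
         for d in (0.05, 0.10, 0.15)]

for i, params in enumerate(runs, start=1):
    print(f"run {i}: {params}")
```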
SALSSA Components and Architecture
[Architecture diagram, flattened here into its labeled components:]
• Data Services: provenance and data management (RDF, file transfer)
  • Content management via Alfresco
  • User services
  • Pluggable metadata extraction
  • Provenance in Sesame RDF store
• Organizer
  • Central organizing tool
  • Long-term interactive workflows
  • Data organization & access
• Automated Workflow
  • Job execution
  • Parameter studies
  • Job monitoring
  • Data archiving
  • Analysis "jobs"
• Analysis & Visualization
  • Multiple viz tools (Tecplot, VisIt, ...)
  • Parallel visualization
  • Hybrid visualizations
  • Data analysis
The components are connected through the provenance store, content store, and archive, with translation/analysis workflows, update messages, and provenance recording (RDF) flowing between them; user tools and editors sit on top.
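A hedged sketch of recording the kind of provenance triples the provenance store above might hold. The slide's store is Sesame (a Java RDF framework); this uses Python's rdflib purely for illustration, and the namespace, URIs, and predicate names are all invented.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

PROV = Namespace("http://example.org/salssa/provenance#")   # hypothetical namespace

g = Graph()
run = URIRef("http://example.org/salssa/runs/stomp-run-1")  # hypothetical run URI

# Record what the run was, what it consumed, what it produced, and where it ran.
g.add((run, RDF.type, PROV.SimulationRun))
g.add((run, PROV.usedInputFile, Literal("stomp1.in")))
g.add((run, PROV.producedOutput, Literal("plot.1")))
g.add((run, PROV.executedOn, Literal("mpp2")))

print(g.serialize(format="turtle"))
```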
Applying Kepler to Subsurface Research Workflow: Using Kepler as an End User Tool
• Approach
  • End users are able to add components and tools.
  • End users can manage their own processes using Kepler.
  • End users would create their own workflows using pre-made, higher-level actor abstractions.
• Conclusion: Kepler/Vergil is NOT suitable for end users, and most of this pertains to "workflow designers" as well. Problem areas:
  • Complex types
  • Type checking
  • Recording of provenance
  • Animation
  • Creating actors
  • Managing technology
  • Multiple instances of Kepler
  • Robustness
Applying Kepler to Subsurface Research Workflow: Using Kepler for Job Execution
• Execute parameter studies and sensitivity studies
  • Launch and monitor multiple jobs using various queuing systems: SGE, LSF (mpp2), fork (a sketch of such a launcher abstraction follows this slide).
  • Monitor each job within the workflow.
  • Notify other tools of job state.
  • Move input/output files.
• Workflow provenance capture
  • Working to define an API specifically for provenance capture.
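A minimal sketch of the kind of job-launching abstraction described above: one interface, multiple queuing-system back ends (SGE, local fork). The class names and details are assumptions for illustration; they are not the actual SALSSA or Kepler components.

```python
import subprocess

class JobLauncher:
    """Submit a job script and return an identifier usable for monitoring."""
    def submit(self, script_path: str) -> str:
        raise NotImplementedError

class SgeLauncher(JobLauncher):
    def submit(self, script_path: str) -> str:
        out = subprocess.run(["qsub", script_path],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()        # qsub's submission message includes the job id

class ForkLauncher(JobLauncher):
    def submit(self, script_path: str) -> str:
        proc = subprocess.Popen(["sh", script_path])  # run locally
        return str(proc.pid)             # use the PID as the "job id"

def launch_all(launcher: JobLauncher, scripts: list[str]) -> list[str]:
    """Launch every script in a parameter study through one back end."""
    return [launcher.submit(s) for s in scripts]
```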
Issues To Address In Kepler
• Performance
  • Only one instance of Kepler can run on a client machine at a time.
  • Kepler takes up a lot of memory; there could be a mechanism for packaging just the parts you use.
  • Kepler takes a long time to start up.
• Building Workflows
  • No simple plug-in model (à la Spring): a mechanism is needed to reuse/extend existing code instead of writing new custom classes (i.e., a framework for connecting existing components rather than a framework for developing components).
  • Better documentation for actors, so that the end user is not required to read code to understand components and know which ones can be hooked up.
  • Components are at too low a level; high-level components are needed for job launching, monitoring, and file movement.
  • Support for parameter studies, including a component for load balancing across machines (a round-robin sketch follows this list).
  • A system built for extensibility to complex and semantic data types.
  • A set of actors for easy iteration and parameter studies.
  • More control within execution domains (e.g., using PN Directors inside composite actors when a PN Director is used in the parent workflow: http://www.mail-archive.com/ptolemy-hackers@doppler.eecs.berkeley.edu/msg00381.html)
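A hedged sketch of the "load balancing across machines" component the list asks for: a simple round-robin assignment of parameter-study runs to a fixed pool of machines. The machine names are hypothetical, and real scheduling would account for load and queue depth.

```python
from itertools import cycle

def balance(runs: list[str], machines: list[str]) -> dict[str, list[str]]:
    """Assign each run to a machine in round-robin order."""
    assignment: dict[str, list[str]] = {m: [] for m in machines}
    for run, machine in zip(runs, cycle(machines)):
        assignment[machine].append(run)
    return assignment

# Example: seven STOMP runs spread over three hosts
print(balance([f"stomp-run-{i}" for i in range(1, 8)],
              ["node01", "node02", "node03"]))
```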
Organizer
[Figure: the Organizer tool]
Demo
• Workflow parameters:
  • numInstances: the number of jobs that the workflow will execute.
  • InputData: input data for each of the jobs.
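A toy sketch of how these two workflow parameters might drive the demo. The names follow the slide; the launch function is a hypothetical stand-in for the job-launching actor.

```python
numInstances = 3
InputData = ["case_a.in", "case_b.in", "case_c.in"]

def launch(input_file):          # hypothetical stand-in for the job-launch actor
    print(f"launching job with {input_file}")

# One job per instance, each fed its own input file.
for i in range(numInstances):
    launch(InputData[i])
```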
Next Generation
The user works within a "Study", where a Study can be represented as a graph of processes and data inputs/outputs. Some processes are triggered by the user; others appear as by-products of user actions.
[Diagram: a study graph]
1. Baseline computation: Setup Stomp → Stomp.in (parameters) → Launch Job → outputs → Some Analysis → graphics
2. Vary permeability in material 2: branch from the baseline; Setup Stomp → Stomp1.in, Stomp2.in, ... → Launch Jobs → outputs → more data, more Analysis → graphics
3. Vary other parameters...
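A small sketch of the "Study as a graph" idea on this slide: nodes are processes (e.g. "Setup Stomp", "Launch Job") or data (e.g. "Stomp1.in"), and edges record which data each process consumes and produces. The node names mirror the slide; the class itself is an illustration, not SALSSA code.

```python
from collections import defaultdict

class StudyGraph:
    def __init__(self):
        self.inputs = defaultdict(list)   # process -> data it consumes
        self.outputs = defaultdict(list)  # process -> data it produces

    def add_process(self, name, consumes=(), produces=()):
        self.inputs[name].extend(consumes)
        self.outputs[name].extend(produces)

    def lineage(self, data):
        """Which processes produced this piece of data?"""
        return [p for p, outs in self.outputs.items() if data in outs]

study = StudyGraph()
study.add_process("Setup Stomp", produces=["Stomp1.in"])
study.add_process("Launch Job", consumes=["Stomp1.in"], produces=["outputs"])
study.add_process("Some Analysis", consumes=["outputs"], produces=["graphics"])
print(study.lineage("Stomp1.in"))  # -> ['Setup Stomp']
```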
Future SALSSA Work
• Deploy current tools to INEL to support experimental work on the calcite precipitation problem.
• Apply the current parameter study workflow to the CCA-based SPH code under development by Bruce Palmer.
• Integrate SciDAC visualization and analysis capabilities.
• Work with the SDM center to develop higher-level Kepler components: job launching, file movement, real-time monitoring.
• Build a workflow environment that combines interactive and automated workflow in one environment with appropriate user abstractions:
  • Connect all steps into a meta-workflow through provenance.
  • User control over details of the view.
  • Different views of data lineage and processing steps.
• Extend the STOMP UI wrapper to support more input options.
• Support the hybrid model and the additional processing required for setup, execution, and analysis.
Acknowledgment
Funding for this research is provided by the U.S. Department of Energy through the following programs:
• Office of Science, Biological and Environmental Research and Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program.
• Office of Science, Biological and Environmental Research, Environmental Remediation Sciences Program (ERSP).