Recent Enhancements to Quality Assurance and Case Management within the Emissions Modeling Framework
Alison Eyth, R. Partheepan, Q. He
Carolina Environmental Program, University of North Carolina at Chapel Hill
Marc Houyoux
Emissions Inventory and Analysis Group, U.S. EPA OAQPS
OAQPS EMF Goals
• Improve timeliness and quality of data used in emissions modeling
• Provide transparency and tracking of
  • Data (by using versions and metadata)
  • Quality assurance steps on the data
  • Usage of data for emissions modeling applications
• Create tools that can be used by EPA and others
• Support criteria and toxics modeling
EMF Components
• Data Management with Versioning
• Quality Assurance
  • Tracking and automating QA procedures
• Case Management
  • Running SMOKE and other programs
• Control Strategy Development
• Problem Tracking System
• Surrogate and Speciation Tools
EMF Project Timeline
• October, 2004: Design process began
• June, 2005: Implementation began
  • Client-server Java-based system
• Spring 2006: Data Management and Quality Assurance Tracking deployed
• September, 2006 version included:
  • Running SQL Quality Assurance Steps
  • First version of Case Management
  • First version of Strategy Development
EMF Architecture at EPA
[Architecture diagram. Components: Clients; 4-CPU Application & Database Server (Data Management, Case Management, Quality Assurance, Strategy Devel.); Shared Disk (SMOKE input files); Compute Cluster (SMOKE). Connection labels: "imports & exports data", "starts and tracks runs".]
Case Management
• A Case stores information about SMOKE [and other model] runs
  • Summary attributes (i.e., metadata)
  • Inputs to programs
  • Program/Model Parameters
  • Programs to run
  • Outputs from programs
  • History and results of the runs
• Should have all information needed to run SMOKE programs and track results
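The contents of a Case listed above can be pictured as a simple record. The following Python sketch is only an illustration: every class, field, and value name in it is hypothetical and not taken from the EMF code base (which is Java).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CaseInput:
    """One input to a program: a Dataset pinned to a specific version."""
    env_var: str        # hypothetical environment variable name
    dataset_name: str
    version: int


@dataclass
class Case:
    """Hypothetical container mirroring the items listed on the slide."""
    name: str
    summary: Dict[str, str] = field(default_factory=dict)     # metadata
    inputs: List[CaseInput] = field(default_factory=list)
    parameters: Dict[str, str] = field(default_factory=dict)
    programs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Sections that are still empty and would be needed before a run."""
        needed = {
            "inputs": self.inputs,
            "parameters": self.parameters,
            "programs": self.programs,
        }
        return [name for name, value in needed.items() if not value]


# Illustrative values only.
case = Case(name="example_case", summary={"abbreviation": "ex"})
case.inputs.append(CaseInput("INPUT_INVENTORY", "NC 2002 NEI Point Inv.", version=1))
missing = case.missing_parts()
print(missing)
```

A check such as `missing_parts()` is one way to express the slide's last point, that a Case should carry everything needed to run the programs and track the results.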
Case Manager
• Cases are created, edited, copied, and removed from within the Case Manager
• Summary attributes assist with selecting a Case
Case Editor – Parameters Tab (FY07)
• Parameters table columns: Parameter Name, Sector, Program, Envt. Var., Type, Required?
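Because each parameter row carries an environment variable, a sector, and a program, one plausible use of the table is to assemble the environment for a given program run. The sketch below assumes this usage; the class, function, and variable names are hypothetical, as is the convention that a blank sector or program means "applies to all".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Parameter:
    name: str
    env_var: str
    value: Optional[str]
    required: bool
    sector: str = ""            # "" = applies to every sector (assumption)
    program: str = ""           # "" = applies to every program (assumption)
    value_type: str = "String"  # the "Type" column; unused in this sketch


def environment_for(params: List[Parameter], sector: str, program: str) -> Dict[str, str]:
    """Collect the environment variables that apply to one sector/program pair."""
    env: Dict[str, str] = {}
    for p in params:
        if p.sector not in ("", sector) or p.program not in ("", program):
            continue  # parameter belongs to a different sector or program
        if p.value is None:
            if p.required:
                raise ValueError(f"required parameter {p.name!r} has no value")
            continue  # optional and unset: leave it out of the environment
        env[p.env_var] = p.value
    return env


# Illustrative rows only; names and values are made up.
params = [
    Parameter("Base year", "BASE_YEAR", "2002", required=True),
    Parameter("Speciation profile", "SPC_PROFILE", "profile_a",
              required=False, sector="point", program="Spcmat"),
    Parameter("Debug flag", "DEBUG", None, required=False),
]
point_env = environment_for(params, sector="point", program="Spcmat")
area_env = environment_for(params, sector="area", program="Spcmat")
print(point_env)
print(area_env)
```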
Case Editor – Inputs Tab
• Specifies input Datasets and versions
Summary of FY06 Features
• Case Manager can create, edit, copy, and delete Cases
• Summary information (metadata) can be specified for a Case
• Inputs to a Case can be specified, including choosing specific versions of Datasets to use in the Case
• Specified versions of input Datasets can be exported for use by SMOKE
Planned FY07 Case Management Enhancements
• Finish the Parameters, Programs, Outputs, and History tabs of the Case Editor
• Support writing scripts to run SMOKE programs on compute server
• Manage runs of SMOKE on compute server
• Add problem tracking for Cases
• Implement user/group/world permissions
• Register outputs of Cases as Datasets
Goals for Quality Assurance in EMF
• Support QA of Datasets prior to their use in SMOKE
• Integrate with EMF data management
• Specify a list of QA Steps to be performed on each dataset type (i.e., type of data)
• Track the QA steps and their results for multiple versions of Datasets
• Track information about the progress of the steps: status, who, when, etc.
• Automate (speed up) the QA process
Tracking QA Steps in EMF
• First, set up “QA Step Templates” for EMF Dataset Types
• Create “QA Steps” using the Templates by copying into the Dataset properties
• Add any ad-hoc QA Steps (not from templates) to Dataset properties
• Record results of the steps for each version of a Dataset
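The four steps above can be sketched as follows. This is a minimal illustration in Python, not the EMF implementation: the class names, the status strings, and the user name and date in the example are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QAStepTemplate:
    """Defined once per Dataset Type."""
    name: str
    required: bool
    query: str = ""


@dataclass
class QAStep:
    """A copy of a template (or an ad-hoc step) tied to one Dataset version."""
    name: str
    required: bool
    query: str
    dataset: str
    version: int
    status: str = "Not Started"   # hypothetical status value
    who: str = ""
    when: str = ""
    comment: str = ""


def steps_from_templates(templates: List[QAStepTemplate], dataset: str, version: int) -> List[QAStep]:
    """Copy each template into a fresh QA Step for the given Dataset version."""
    return [QAStep(t.name, t.required, t.query, dataset, version) for t in templates]


def record_result(step: QAStep, status: str, who: str, when: str, comment: str = "") -> None:
    """Store the tracking information for one step."""
    step.status, step.who, step.when, step.comment = status, who, when, comment


templates = [
    QAStepTemplate("Count records", True, "SELECT COUNT(*) FROM $TABLE[1]"),
    QAStepTemplate("Totals by pollutant", False),
]
dataset = "NC 2002 NEI Point Inv."

steps_v0 = steps_from_templates(templates, dataset, version=0)
steps_v0.append(QAStep("Spot-check largest sources", False, "", dataset, 0))  # ad-hoc step
record_result(steps_v0[0], "Complete", who="analyst1", when="2006-09-15",
              comment="record count matches source file")

# A new version of the Dataset gets its own, independent copies of the steps.
steps_v1 = steps_from_templates(templates, dataset, version=1)
print([(s.name, s.version, s.status) for s in steps_v0 + steps_v1])
```

Keeping a separate copy of each step per version is what lets results be tracked independently as a Dataset is revised.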
Dataset Type Manager
Set Up a QA Step Template
• Enter a SQL query; the $TABLE[#] syntax allows the query to be generic across multiple datasets
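The slide does not spell out how $TABLE[#] is resolved. The sketch below assumes that # is a 1-based index into the list of database tables backing a dataset, and shows how such a placeholder could be expanded before the query is run. The function name and table name are hypothetical.

```python
import re
from typing import Sequence

# Matches placeholders such as $TABLE[1], $TABLE[2], ...
TABLE_REF = re.compile(r"\$TABLE\[(\d+)\]")


def expand_table_refs(query: str, tables: Sequence[str]) -> str:
    """Replace each $TABLE[n] with the n-th table name (1-based, by assumption)."""
    def replace(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        if not 1 <= index <= len(tables):
            raise IndexError(f"$TABLE[{index}] has no matching table")
        return tables[index - 1]

    return TABLE_REF.sub(replace, query)


template_query = "SELECT COUNT(*) FROM $TABLE[1]"
expanded = expand_table_refs(template_query, ["nc_2002_point_inv"])
print(expanded)  # SELECT COUNT(*) FROM nc_2002_point_inv
```

Because the template query never names a concrete table, the same text can be attached to a Dataset Type and reused for every dataset of that type.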
QA Step Templates vs. QA Steps
• A Dataset Type (e.g., ORL Point Inventory) has a list of point-specific QA Step Templates
• A Dataset (e.g., NC 2002 NEI Point Inv.) has a list of QA Steps with result, who, when, comment
• Get list of templates, then copy templates to QA Steps
Summary of QA Steps for all Versions of a Dataset
• “Add from Template” adds steps from the dataset type; “Add Custom” is for ad-hoc steps
Summary of FY06 QA Features
• Can define required and optional steps for each type of dataset (codifies the QA process)
• For a particular dataset, steps can be quickly copied from templates, or custom steps can be added
• Tracking is performed for each step: who did it, when, status, comment
• SQL steps can be run and results exported
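Running a SQL step and exporting its results might look roughly like the sketch below. EMF itself uses PostgreSQL; Python's built-in sqlite3 module stands in here so the example is self-contained. The table layout, the data, and the function names are made up for illustration.

```python
import csv
import io
import sqlite3

# In-memory stand-in for a dataset table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (fips TEXT, pollutant TEXT, emissions REAL)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("37001", "NOX", 10.0), ("37001", "NOX", 5.0), ("37003", "SO2", 2.5)],
)


def run_step(connection, query):
    """Run one SQL QA step and return (column names, rows)."""
    cursor = connection.execute(query)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def export_csv(header, rows):
    """Serialize the step results as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = run_step(
    conn,
    "SELECT pollutant, SUM(emissions) AS total "
    "FROM inventory GROUP BY pollutant ORDER BY pollutant",
)
report = export_csv(header, rows)
print(report)
```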
FY07 Quality Assurance Plans
• View results of steps
• Analyze results of steps using Analysis Engine (e.g., create plots)
• Enhance the SQL syntax to support referencing other steps and Datasets
• Support comparison of the results of two similar QA steps (e.g., compare old and new totals)
• Support running more types of steps
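The planned comparison of two similar QA steps is not described in detail. As one possible reading, the sketch below compares old and new totals keyed by pollutant and reports entries that were added, removed, or changed by more than a tolerance. The function name, the tolerance, and the numbers are assumptions.

```python
from typing import Dict, List, Tuple


def compare_totals(old: Dict[str, float], new: Dict[str, float],
                   tolerance_pct: float = 1.0) -> List[Tuple[str, str]]:
    """List keys that were added, removed, or changed by more than tolerance_pct."""
    report: List[Tuple[str, str]] = []
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before is None:
            report.append((key, "added"))
        elif after is None:
            report.append((key, "removed"))
        elif before == 0:
            if after != 0:
                report.append((key, "changed from zero"))
        else:
            pct = 100.0 * (after - before) / before
            if abs(pct) > tolerance_pct:
                report.append((key, f"{pct:+.1f}%"))
    return report


# Illustrative totals only.
old_totals = {"NOX": 100.0, "SO2": 50.0, "VOC": 20.0}
new_totals = {"NOX": 110.0, "SO2": 50.0, "CO": 5.0}
differences = compare_totals(old_totals, new_totals)
print(differences)  # [('CO', 'added'), ('NOX', '+10.0%'), ('VOC', 'removed')]
```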
EMF Software Requirements
• Java 1.4 or 1.5
• PostgreSQL 8.1
• Apache Tomcat
• Tested on Linux and Windows
• Should run on other operating systems that support Java and other required software
• Deployment configuration is flexible: runs on a single computer or several
Availability of EMF
• Public EMF release is not yet funded, but direct arrangements can be made
• Source code can be downloaded from SourceForge
• May hold training class at 2007 Emissions Inventory Conference
Case Editor – Programs Tab (FY07)
• Shows: Sector, Program Name, Program Version, Arguments, Run Order, Whether to Run?, Run Status, and Path
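The Run Order and Whether to Run? columns suggest that the programs in a Case form an ordered queue. The sketch below assumes that interpretation: it keeps only the programs flagged to run and sorts them by run order. The class, the status string, and the sector name are hypothetical; Smkinven, Grdmat, and Spcmat are SMOKE programs used here only as sample names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProgramEntry:
    sector: str
    name: str
    run_order: int
    run: bool                   # the "Whether to Run?" column
    arguments: str = ""
    status: str = "Not Run"     # hypothetical status value


def run_queue(entries: List[ProgramEntry]) -> List[ProgramEntry]:
    """Programs flagged to run, in ascending run order."""
    return sorted((e for e in entries if e.run), key=lambda e: e.run_order)


programs = [
    ProgramEntry("point", "Spcmat", run_order=3, run=True),
    ProgramEntry("point", "Smkinven", run_order=1, run=True),
    ProgramEntry("point", "Grdmat", run_order=2, run=False),
]
queue_names = [entry.name for entry in run_queue(programs)]
print(queue_names)  # ['Smkinven', 'Spcmat']
```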
Case Editor – Outputs Tab (FY07)
• Shows: Output Name, Sector, Program, Dataset Name, Dataset Type, Environment Variable, whether it is Required or Available, and whether it should be Registered in the EMF as a Dataset