Translation Roundtable. 2010 RCRAInfo National Users Conference. Dwane Young / Joe Carioti EPA – Headquarters Office of Resource Conservation and Recovery. Purpose of this Session.
Translation Roundtable 2010 RCRAInfo National Users Conference
Dwane Young / Joe Carioti, EPA – Headquarters, Office of Resource Conservation and Recovery
Purpose of this Session • To have an open discussion on the different methods of translation (flat file and XML), answer questions regarding both methods, and discuss any other issues regarding translation • To provide an opportunity for users to discuss tips and tricks for translation and share experiences with one another
Definition of Terms • Translator Guide: Document developed by the RCRAInfo team that describes the process and methods for translation. It also lists all of the relevant data checks that RCRAInfo performs on any submission. This is probably one of the most important documents for any translator to become familiar with. • Data Exchange Template (DET): Document that describes the XML data elements and their respective order in the XML schema. • Error Report: Reports the status of any translation (either XML or flat-file). The error report is a key part of any translation: it tells you whether or not your data loaded and, if not, why not. • Flat-file: A type of translation that uses a series of fixed-width text files that mimic the RCRAInfo data structure.
Definition of Terms (cont.) • XML (eXtensible Markup Language): A method of translation that uses defined data elements that conform to a defined schema. This method of translation is designed to be self-describing and to enable computer-to-computer (automated) translation. • XML Schema: Defines the order of, and relationship between, the XML data elements. An agreed-upon XML format for communicating information. • Node: A computer on the Exchange Network that can communicate with other computers on the Exchange Network using an agreed-upon set of protocols that enable computer-to-computer interaction. • Flow Configuration Document (FCD): A document that defines the methods of interacting with an Exchange Network flow.
Region 9 Translators • Arizona: V4 translating Handler through Node; V5 node translation scheduled for 2010 (Handler, Permits, Corrective Action, CME, GIS) • California: Cal/EPA (CUPA), CME flat files; DTSC, Permits, Corrective Action, CME flat files • Region 9: Cleanup projects via flat files (Handler, Permits, Corrective Action, CME, FA)
Region 9’s Role • Setting up Accounts • Dissemination of Information • Translation Guidance
Translation Guidance • Dissemination of Information • Translator Guide, Structure Charts and DEDs • IPT and National User Calls – ListServ • Permit Sequence Numbers - GPRA • Status Reports • Interpretation of error codes • Quality Checks • BARRT • Custom Reports
Issue Reconciliation • 2 Organizations vying for State IOR • California - CUPA and DTSC • Agency codes: S versus L • Sequence number ranges • CAFO Sequence numbers
Automation • NEIEN Grants • Infrastructure: Hardware, Database • Contract support • Great when it works! But… • RCRAInfo changes • State lookup codes and sequence numbers • Understanding the data from both State system and RCRAInfo • Need programmatic and IT support
Benefits • States maintain ownership of data • Expanded Dataset • State codes • Electronic reports • Broader audience
Steps for Flat File Translation • Read and study the Flat File Translator Guide – be aware that each new version of RCRAInfo will change the file structures • Write scripts to translate the data in your State database into tables laid out in the Translator Guide format and then query the tables into flat files (or go directly into flat files) • Combine the flat files into a ZIP file as specified in the most recent Translator Guide • Log into RCRAInfo and follow the instructions to submit the flat files. • Iteratively retrieve the error messages from RCRAInfo and tweak the flat files until the translation is successful. • Use the data correction logic to build edits into your State system to prevent errors in future translations.
Create your Flat Files (cont.) • A flat file is just structured text. Virtually any database can produce flat files, with no add-ins or expensive contractors needed.
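The "write scripts, then ZIP" steps can be sketched in a few lines of Python. This is a minimal illustration, not the RCRAInfo format: the field names, widths, file name, and sample IDs below are all hypothetical, and the real layouts and ZIP contents must come from the current Translator Guide.

```python
import io
import zipfile

# Hypothetical layout: (field name, width). Real names and widths
# come from the Flat File Translator Guide for your RCRAInfo version.
HANDLER_LAYOUT = [("handler_id", 12), ("activity_location", 2), ("handler_name", 20)]


def to_fixed_width(record, layout):
    """Render one record as a fixed-width line.

    Each value is truncated to the field width, then padded with spaces,
    so every line has the same length. Missing fields become blanks.
    """
    parts = []
    for name, width in layout:
        value = str(record.get(name, ""))
        parts.append(value[:width].ljust(width))
    return "".join(parts)


def build_zip(files):
    """Bundle {file name: list of lines} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_name, lines in files.items():
            archive.writestr(file_name, "\n".join(lines) + "\n")
    return buffer.getvalue()


# Made-up sample rows standing in for a query against a State database.
records = [
    {"handler_id": "CAD000000001", "activity_location": "CA", "handler_name": "Example Facility"},
    {"handler_id": "AZD000000002", "activity_location": "AZ"},
]
lines = [to_fixed_width(r, HANDLER_LAYOUT) for r in records]
zip_bytes = build_zip({"handler.txt": lines})
```

Keeping the layout in a table like `HANDLER_LAYOUT` means that a file-structure change in a new RCRAInfo version would, under these assumptions, touch only that table rather than the formatting code.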
Uploading Data • Data is loading into the staging tables when …
Checking Data • Status Report – Click “Run Report”
Checking Errors • Error messages
Correcting Errors • Go to the record number (row in the named flat file) indicated in the error report and either manually tweak the data in a text editor such as Notepad, or rework your SQL to correct multiple errors. • Research the data in question and correct the base data and/or build edits into your State data system to ensure the data is correct in the future. • Re-submit the corrected flat files to RCRAInfo in a rebuilt ZIP file. • Iteratively retrieve the error messages from RCRAInfo and tweak the flat files until the translation is successful. • Run BARRT reports to ensure that the data was successfully translated.
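When an error report lists many record numbers, pulling up the offending rows by script can be quicker than scrolling in a text editor. The sketch below assumes each error has already been reduced to a (file name, record number, message) tuple and that record numbers are 1-based; both are assumptions for illustration, not the actual RCRAInfo error report format.

```python
def rows_with_errors(flat_files, errors):
    """Pair each error with the flat-file row it points at.

    flat_files: {file name: list of lines}
    errors: iterable of (file name, record number, message) tuples,
            with 1-based record numbers (an assumption of this sketch).
    """
    located = []
    for file_name, record_number, message in errors:
        lines = flat_files.get(file_name, [])
        if 1 <= record_number <= len(lines):
            row = lines[record_number - 1]
        else:
            row = None  # the report points outside the file we have
        located.append((file_name, record_number, message, row))
    return located


# Made-up data: two rows, the second with a blank two-character field.
flat_files = {"handler.txt": ["CAD000000001CA", "AZD000000002  "]}
errors = [("handler.txt", 2, "hypothetical message: activity location is blank")]
result = rows_with_errors(flat_files, errors)
```

Each entry in `result` carries the original row alongside the message, which makes it easier to decide whether to fix a single record by hand or to rework the SQL that produced the whole file.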
Keeping up with Structure Changes • Read and study the Flat File Translator Guide – be aware that each new version of RCRAInfo will change the file structures. • Rewrite (tweak) the scripts to translate the data in your State database into tables laid out in the new Translator Guide format and then query the tables into flat files (complexity depends on the changes to the Translator Guide). • Test your changes against the RCRAInfo load edits. • Iteratively retrieve the error messages from RCRAInfo and tweak the flat files until the translation is successful. • Run BARRT reports to ensure that the data was successfully translated.
Dwane Young / Joe Carioti, EPA – Headquarters, Office of Resource Conservation and Recovery
Wrap-up • Pros/Cons of Flat File Translation vs. XML Translation • What are differences in the translation process? • Questions for discussion
Flat Files • Pros: easy to produce; matches the RCRAInfo structure; smaller file size • Cons: difficult to automate the data flow with flat files; can only be used for submitting data to RCRAInfo; have to repeat ‘Primary Key’ information in each flat file; have to include data elements that you don’t use; have to be able to mimic the RCRAInfo data structure
XML Files • Pros: designed for automation; not necessarily impacted by changes to the RCRAInfo data structure; don’t have to include all of the data elements (leave out what you don’t use); only have to include ‘Primary Key’ information once, and it is inherited by all of the child tables; data are self-describing • Cons: can be difficult to produce (you will likely need a programmer); files tend to be larger in size; can be an inefficient way to store information
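Two of the XML advantages above, stating the key once on the parent and leaving out unused elements, can be illustrated with Python's standard `xml.etree.ElementTree` module. The element names (`Handler`, `HandlerID`, `Permit`, `SequenceNumber`, `Notes`) and the sample data are hypothetical; the real names and their required order are defined by the Data Exchange Template and the XML schema.

```python
import xml.etree.ElementTree as ET


def handler_to_xml(handler):
    """Build one handler element; the key appears once, on the parent."""
    root = ET.Element("Handler")
    ET.SubElement(root, "HandlerID").text = handler["handler_id"]
    for permit in handler.get("permits", []):
        # Child records do not repeat the handler ID; nesting carries it.
        child = ET.SubElement(root, "Permit")
        ET.SubElement(child, "SequenceNumber").text = str(permit["sequence_number"])
        # Optional element: written only when the State system has a value.
        if permit.get("notes"):
            ET.SubElement(child, "Notes").text = permit["notes"]
    return root


# Made-up sample data for illustration.
handler = {
    "handler_id": "CAD000000001",
    "permits": [
        {"sequence_number": 1, "notes": "renewal"},
        {"sequence_number": 2},
    ],
}
xml_text = ET.tostring(handler_to_xml(handler), encoding="unicode")
```

In the flat-file equivalent, the handler ID would have to be repeated on every permit row, and the unused notes field would still occupy its fixed-width columns as blanks.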
Process Comparison • Flat File: data are submitted directly via the RCRAInfo User Interface; data are loaded to the RCRAInfo Staging Tables; RCRAInfo performs edit checks; error reports can be viewed in RCRAInfo • XML: data are submitted via a state node/node client through the Central Data Exchange (CDX); CDX performs a ‘Schema Validation’ to ensure that the file is in the correct format; data are loaded to the RCRAInfo Staging Tables; RCRAInfo performs edit checks; error reports can be viewed in RCRAInfo, downloaded via CDX, or automatically emailed to a list of recipients
Discussion Question 1 Are there States that are currently doing double data entry?
Discussion Question 2 What can EPA do to help States move away from double data entry?
Discussion Question 3 What are your thoughts on Transactional vs. Partial vs. Full translation?
Discussion Question 4 How can we better coordinate with our respective IT staff?
Discussion Question 5 Is there a value to moving to a computer-to-computer model? What are the hurdles to achieving this?