490 likes | 606 Views
A SUCCESSFUL SYSTEM FOR MANAGING WATER QUALITY AND BIOLOGICAL MONTIORING DATA USING MS ACCESS; CT’S EXPERIENCE. AKA: Database-Smatabase: We don’t need no stinkin’ database. Mike Beauchene CT DEP Shadow IT Division. Where this talk will go…. Pros/cons of different data management systems
E N D
A SUCCESSFUL SYSTEM FOR MANAGING WATER QUALITY AND BIOLOGICAL MONTIORING DATA USING MS ACCESS; CT’S EXPERIENCE AKA: Database-Smatabase: We don’t need no stinkin’ database Mike Beauchene CT DEP Shadow IT Division
Where this talk will go… • Pros/cons of different data management systems • The nuts and bolts of a relational database • CT’s ambient water quality data management system
This presentation is… …to encourage you to develop a relational database so you can better organize, store, maintain, and use your water quality data. This presentation is not … …an endorsement for any particular commercially available Data management product. …an infomercial for a product that can be purchased on your credit card at the poster session for 2 easy payments of $19.95
By a show of hands…. • Who collects samples and then waits patiently for the lab to send the results ? • Do you manage these results ? • In an electronic format ? • Do you enforce referential integrity? • Do you “pivot tables” ?
A Basic Data Management System Stuff Goes IN Stuff Comes OUT
Institutional Knowledge Impossible to summarize Difficult to assimilate Easily lost due to Early Retirement Lotto Greener Pastures “-” No IT support required Usually abundant Operates on coffee not oil “+”
“+” No IT support required Final report may look nice Works well with “Institutional Knowledge” Impossible to summarize Difficult to assimilate Very Dusty The photocopiers are always broken “-” HARD COPY
Electronic Files(spreadsheets, documents, etc.) Easy to use Easy to distribute Can make a report look good “+” False sense of security Easy to shuffle your data Difficult to summarize Difficult to assimilate Impossible to ask “?” “-”
Relational Database Easy to share data Stores lots of metadata Answers complicated “?” Keeps the data safe and secure Never loses or shuffles results Links to mapping software Allows you to sleep at night Helps you look really good Stays with the agency when staff does not “+” There really aren’t any but.. Moderate learning curve Get what you ask for Still need to know your data “-”
WHAT ARE THE NUTS? • Tables • Place holders for information • Organize the information by similarity • Store the information • Queries • make demands upon the tables • manipulate data into ratios, indices, calculations • Add, update and delete records in a table(s)
WHAT ARE THE NUTS? • Forms • User friendly version of a table(s) • Can be a more convient way to enter data • Main form sub form • Can use features to help data entry • Pick lists • Reports • User friendly version of a query • Print out of data • Make labels for sample containers • Send data to the public
WHAT ARE THE BOLTS? • Referential Integrity • Rules that allow your tables to play nice together • Primary Key • A field(s) that makes each row unique • USE A NON-INTELLIGENT CODE
WHAT ARE THE BOLTS? • Input Mask / Validation Rule • Templates for data entry • Dates/times • Appropriate values (between 1.0-14.0) • Cascading Updates & Deletes • Global changes to a dataset • Change a name, sample number, station name • Remove an entire set of data for a sample • .
HOW DO THE NUTS AND BOLTS GO TOGETHER? • “Raw Data” = “Result” • Dissolved oxygen = 8.5 ppm • Pteronarycs spp. = 12 individuals • Fragilaria leptostauron = 5 cells • Instantaneous discharge = 152 cfs • “Metadata” = “Attributes or info to describe a result”
MORE ON METADATA You can never have too much!!! • Provides info to a secondary data user • Establishes data quality • Used in queries • Manipulate data • Restrict or define data limits • Describe data • Jogs your memory when some asks: • Where • When • What • Why
“We do not care as much about the accuracy of a result contained within as we do about not having enough information about the result…. …the metadata allows the secondary user to make the appropriate decision as to whether or not the data will be meaningful for their application.” • Bob King and Lee Manning • STORET Architects and founders.
Then (Pre 1998) • NO AGENCY IT SUPPORT OR VISION (only Ernie’s) • Existed as • Institutional knowledge • Hard copy • Lotus/SAS/word perfect format • STORET as an option?
Between Then & Now NOW Then • NO AGENCY IT SUPPORT OR VISION (only Ernie’s) • The relational ambient monitoring database started in July of 1998 using MS Access 2.0 • It was based upon the STORET model • It would function as our day-to-day working database with periodic uploads to STORET • Staff begin to create innovative nick-names for the DBA (me).
NOW (2006) • Our Agency IT calls us “SHADOW IT” • WE Have a data management policy • reduce reliance on all other data mgt systems • MS Access 2000 • Front end for staff • Data input forms • Generic buttons for query options • STORET has… • Monitoring stations • Beach monitoring data • Lots more to do • The DBA (me) has been removed from staff Christmas card lists
CT’s Data Management System Stuff Goes IN Trip Info Site Info Sample Info Overdue results Stuff Comes OUT Summary Calculations Results Project $$ Raw Data WQS Exceedances QA
Trip Info Sample Info Results CT’s Relationships Stuff Going IN Site Info
CT’s Relational Database Is.. Just Like A Pizza!!!!
Trip Info Site Info Sample Info Results Hierarchal Relationships
Data Management In CT- Now Our database has…. • Station information and lat. & long. (1800 sites) • Physical/Chemical (175,000 data points) • Macroinvertebrate (32,000 names & counts) • Fish (267 samples 12,000 records) • HOBO water temp. (lots and lots) • Lots of other stuff Our system…. • Is an electronic log book of all samples collected • Is linked to ADB for 305(b) assessment updates • Is linked to ArcGIS/ArcView for mapping • Can be linked to SIM for uploads to STORET • Needs IT support to go become a real data management system
IN Out
IS NOT A Relational Database A SERIES OF WORKSHEETS IN A SPREADSHEET OR A SERIES OF SPREADSHEETS ORGANIZED IN A FOLDER
On and on to column AAZZ Down and down to row 63,999
START SIMPLE !!! TABLE # • Trips (date, who, why, what) • Sites (id, location, drainage, lat & long) • Samples (lab number, field methods, gear,) • Results (lab number, value, unit, method) 4 1 2 3
Define your KEY FIELDS • The combination of which are will be unique for that record. USE NON-INTELLIGENT CODING AMAP! • Develop strong RELATIONSHIPS • Enforce REFERENTIAL INTEGRITY • Encourage CASCADING UPDATES • Use validation rules and input masks • Restricts entry to appropriate values
Build based on your needs !!! • Lookup Tables (use in pick lists and queries) • Staff info • Method info • Equipment specs • Ecological attribute stuff
USE YOUR DATA !!! • Queries and reports • Quality control/Quality assurance (DQO’s) • Summary Reports • Water quality assessments • Taxonomic distributions • TMDL development and implementation • Find where or where not to go fishing • Share with others • Budget review and or planning • Staff performance evaluation
TAKE BACK TO WORK MESSAGES • TAKE THE LEAP!!!! IT IS EASIER THAN ONE WOULD THINK • YOU WILL BE SURPRISED AT HOW MANY INCONSISTENCIES YOU ACTUALLY FIND • YOU WON’T BE ABLE TO LIVE WITHOUT ONE • EVERYTHING IN THE WORLD TURNS INTO EITHER A “1” OR A “0”
The Last Word • STORET: http://www.epa.gov/storet/ • National Data Standards: http://wi.water.usgs.gov/methods/tools/wqde/index.htm