2014 MO SW Review Database and Scripts Review Summary of Findings Bubba Brossett, Erich Kessler, and Wade Walker June 2014
Database Issues • The NWIS database is now our primary publishing medium. • Directly tied to NWISWeb, which has far surpassed the ADR in usage (see graph in a few slides…). • We need to treat it as such! • Clearly differentiate between fully QA/QC’d data vetted for publication (official record, historical data: primary data) and “other” ancillary data (temporary, operational, special project, etc.: working data). • Publish on NWISWeb only the data meant to be published there. • Use data-aging appropriately. • Review stations’ NWISWeb content routinely, just like the ADR. • Follow guidance in manuals and tech memos, and use common sense to keep things as straightforward as possible for users and maintainers 50+ years from now.
Operational Issues • Part of the review delves into operational issues beyond database/NWISWeb setup and display. • Use “big-picture” script output to identify or semi-quantitatively assess WSC/field-office-wide problems. • Sometimes in support of or in addition to on-site review team findings. • Script output isn’t conclusive, but can help affirm or quantify the review team’s findings: measurement and check measurement frequency, shiftcheck analysis. • Sometimes output is pretty definitive and lets us look at things a review team wouldn’t have time to delve into: • Ratcheck, measurement methods, and meas_latency • There can still be special circumstances or related issues scripts can’t account for.
Erroneous NWISWeb Data • Any DV statistic stored in the database should be QC’d • Stats stored are determined by processor setup in DD… • Any DV statistic published on NWISWeb or in the Water Data Report MUST be QC’d and vetted for publication! • Any historical UV’s published on NWISWeb also must be vetted for publication (our most popular data product)! • NWISWeb receives over 1 million hits a day on average – pay attention to what you’re publishing there!
Erroneous Historical Values on NWISWeb • No erroneous discharge values • 4 erroneous gage height values (bad or questionable) • No erroneous precipitation values • No erroneous sediment values • There is definitely a possibility of errors not found by the reviewer (it wouldn’t be a bad idea for someone more familiar with the stations to check the URL’s in the write-up routinely if you aren’t already…). • If errors are found, remove them from NWISWeb right away and clean them out of the database. • And don’t just clean up the obviously bad data – verify the entire dataset, since bad data likely indicate an entire lack of QA/QC.
Provisional Historical Daily Data • Provisional status is only meant for newly collected real-time (operational) data we haven’t yet had time to fully QA/QC, review, and approve. • Provisional Historical Data likely fall into 2 categories • Non-QC’d data never meant for publication • QC’d data that for some reason never got set to final or approved and have been open to accidental changes • USGS is not in the business of publishing questionable or downright bad data (and NWISWeb was recently changed to reflect this – only 3 years of provisional data are now allowed…) …and when is that going to be??
Provisional Historical Data • 29 DD’s and 45 water years with unapproved mean daily discharge in MO database. • 11 DD’s and 94 water years with unapproved mean daily sediment data. • Below average amount for a database this size. • Some of it may legitimately be awaiting approval, which is fine; get things worked, reviewed, and approved ASAP... • Still a fair amount of older records, so clean those up ASAP or get them revised ASAP if that’s the case… • Unapproved discharge is down from 120 DD’s/1059 water years of Q, but there is plenty of work left! • Also a lot of other parameters you’ll probably want to address eventually…
Missing NWISWeb DV’s • Daily values stored in primary DD’s but not published on NWISWeb • 45 discharge DD statistics • 28 precipitation DD statistics • 3 sediment DD statistics • Some stats likely not vetted for publication and should be removed from the primary DD (deleted or moved to working DD…) – some may just be write-protected “blanks” which can be ignored if encountered. • Others likely just slipped through the cracks and should be published. • Ensure a normal process is in place to add these for new sites… • Work through list and publish or remove stats as appropriate (make sure data look OK prior to publishing…).
Historical Flood Measurement Entry • NWISWeb URL useful for looking at measurement coverage: http://waterdata.usgs.gov/nwis/dv?district_cd=29&cb_00060=on&cb_99060=on&format=gif_meas&begin_date=1890-10-01&end_date=2011-09-30&result_md=1&result_md_minutes=43200&Access=0
Historical Flood Measurement Entry • Historical flood measurements can be useful for analyzing/developing ratings in GRSAT and can also be useful to NWISWeb data users. • About 98% of currently operated MO stations appeared to have at least one historical flood measurement entered. • Good job! • An additional 7% of stations with some entry did not appear to have a significant number of measurements entered throughout the entire period of record at the stations. • Very sparse entry or 12+ year gaps in entry during periods of apparent data collection (assuming at least one high-flow measurement would be made in that time…) • There may be reasons they’re missing at some stations, and they may not be overly useful for rating analysis at some stations, but they are still beneficial for data users and may still be of some minor use for rating analysis. • Recognize that some sites may not really have many high-flow measurements, although you might enter old indirects if they’re available… • A new tool to assist with this is available from the OSW_Scripts webpage if it’s not being used already (link in write-up) – a much easier process than it used to be. • Measurements in poorly defined portions of the rating, unusual low-flow measurements, measurements in control transition zones, and measurements during periods when the gage wasn’t in operation may also be beneficial beyond just flood measurements...
NWISWeb/ADR period of record discrepancies • 2 minor discrepancies found. • Good job, but it probably wouldn’t be a bad idea to add a one-time check of this during manuscript prep/review (before the ADR as we know it potentially goes away…).
Ratcheck • Ratcheck designed to find significant measurement coverage problems with the extremes of flows. • Looks at DV’s not UV’s so accounts for flashiness and looks for 15 days of flows significantly higher than the highest measurement (twice as high) before flagging a station to ensure opportunity to make a higher measurement was there and the record would be affected (pretty forgiving…) • Can still be problems with instantaneous peaks that ratcheck won’t report on, but may not be much opportunity to catch those. • A flag = no high flow measurement that year to cover that year’s high flow • B flag = no high flow measurement in last 5 years to cover that year’s high flow • C flag = no low flow measurement that year to cover that year’s low flow.
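The flag logic described on this slide can be sketched roughly as follows. This is not the actual ratcheck script, only a minimal illustration under stated assumptions: daily values and measured discharges are grouped by water year in plain dictionaries, the "last 5 years" window includes the current year, and the low-flow (C) criterion, which the slide does not spell out, is assumed to mirror the high-flow one (15 days below half the lowest measurement). The function and constant names are hypothetical.

```python
from typing import Dict, List, Set

MIN_DAYS = 15        # days of uncovered flow needed before flagging (from the slide)
HIGH_FACTOR = 2.0    # "twice as high" as the highest measurement (from the slide)
LOW_FACTOR = 0.5     # ASSUMPTION: mirror-image low-flow criterion, not in the source
LOOKBACK_YEARS = 5   # B-flag window; ASSUMED to include the current water year


def ratcheck_flags(daily_flows: Dict[int, List[float]],
                   measurements: Dict[int, List[float]]) -> Dict[int, Set[str]]:
    """Return the set of A/B/C flags for each water year of daily values."""
    flags: Dict[int, Set[str]] = {}
    for year, flows in daily_flows.items():
        this_year = measurements.get(year, [])
        recent = [q
                  for y in range(year - LOOKBACK_YEARS + 1, year + 1)
                  for q in measurements.get(y, [])]

        # With no measurements at all, any positive flow counts as uncovered.
        high_limit_year = HIGH_FACTOR * max(this_year, default=0.0)
        high_limit_recent = HIGH_FACTOR * max(recent, default=0.0)
        low_limit_year = LOW_FACTOR * min(this_year, default=float("inf"))

        year_flags: Set[str] = set()
        if sum(q > high_limit_year for q in flows) >= MIN_DAYS:
            year_flags.add("A")   # high flows not covered by this year's measurements
        if sum(q > high_limit_recent for q in flows) >= MIN_DAYS:
            year_flags.add("B")   # high flows not covered in the last 5 years
        if sum(q < low_limit_year for q in flows) >= MIN_DAYS:
            year_flags.add("C")   # low flows not covered by this year's measurements
        flags[year] = year_flags
    return flags
```

Requiring 15 daily values above the limit, rather than a single peak, is what makes the check forgiving of flashy sites: a brief spike that left no realistic chance to make a measurement never raises a flag.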
Ratcheck • Yearly coverage of high flows shows some room for improvement in the WSC. • The lower B-flag percentages show that usually the same stations don’t go without measurements year after year and that significant flow events are usually covered by measurements, which is good. • There are no concerns with low-flow measurements.
Ratcheck • Also finds potential measurement coverage “problem” stations that have received a large number of a particular flag over the period. May indicate systematic problems getting measurements at these stations that should be addressed if possible (if station is currently operated and operated by you…).
Ratcheck • Also provides KML files of stations with flags over the period for a time-series spatial display of stations with flags. May help identify problem areas and be useful for flood planning, etc.
Meas_summary • Meas_summary is designed to summarize measurement frequency, check measurement frequency (approximate...), and measurement type over a given period of time. • Looked at 1998 through 2014 water years for the entire WSC and each office. • Overall measurement frequencies have been stable over the period and are right above the national average (averaging 9-10 meas./year of flow in recent years; national average is about 8-9). • Check measurement frequency appears to vary (1-4%) but is increasing. It is possible that not all of these are check measurements (the utility just counts measurements performed on the same day, so check measurements, step-backwaters entered into the DB on the same day, etc. will bias this high). • Individual record reviews should give us a better idea of whether they’re being performed as often as warranted.
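The same-day counting that makes the check measurement figure approximate can be illustrated with a short sketch. This is not the Meas_summary utility itself; the input layout (a list of station/date pairs) and the function name are assumptions made for illustration.

```python
from collections import Counter
from datetime import date
from typing import List, Tuple


def same_day_fraction(measurements: List[Tuple[str, date]]) -> float:
    """Fraction of measurements that are 'extra' same-day measurements.

    Every measurement beyond the first at a given station on a given day is
    counted as a potential check measurement, so anything else entered on the
    same day (e.g. a step-backwater) inflates the result.
    """
    per_station_day = Counter(measurements)
    extras = sum(n - 1 for n in per_station_day.values() if n > 1)
    total = len(measurements)
    return extras / total if total else 0.0
```

Because the count keys only on station and date, it cannot tell a true check measurement from any other same-day entry, which is why individual record reviews are still needed.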
Meas_summary • Great job of implementing hydroacoustic instrumentation. • Hydroacoustic measurement percentages have shown an increase over last 9 years. • It is recognized that hydroacoustic techniques aren’t always the best option, but should provide increased accuracy, better documentation of measurement conditions, potentially valuable metadata, and increased efficiency where they can be used (after up-front investments in equipment, training, and figuring out deployments...).
Summary • Room for improvement: • Address remaining amount of unpublished historical DV’s in primary DD’s. • Increase frequency of check measurements. • Data not on NWISWeb marked primary should be checked for normal publication quality, and either displayed or removed from the primary DD (deleted or moved to a working DD). • Measurement coverage for range of flows could be improved for high flows. • Good findings: • Hydroacoustic technologies continue to grow and improve. • Historical measurements have been entered at most stations. • Very few erroneous data values found. • No major findings. Great job!
DD Thresholds • Used to screen incoming real-time data and prevent display of erroneous data to the public. • 100% of gage height DD’s displayed to the public on NWISWeb have critical thresholds (very high and very low). • Overall excellent job with gage height (but still get them set at the remaining stations…) • Previous review noted 95% of DD’s had thresholds, so improvement has been made. • 100% of precipitation DD’s displayed to the public have critical thresholds. • Improvement from last review, where only 64% had thresholds. • Overall – excellent job for the number of sites. • Adhere to OSW Tech Memo 2014.03: Adoption and display of gage operational limits. Memo requires changes by 03.01.2014. • 0% of gage height DD’s met memo requirement.
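As a rough illustration of what a very-low/very-high critical threshold pair does to incoming real-time values, consider the sketch below. It is not how NWIS implements thresholds; the function name and the choice to represent a withheld value as None are assumptions.

```python
from typing import List, Optional


def screen_values(values: List[float],
                  very_low: float,
                  very_high: float) -> List[Optional[float]]:
    """Withhold (None) any value outside the critical threshold range."""
    return [v if very_low <= v <= very_high else None for v in values]


# Hypothetical gage heights in feet with thresholds of 0 and 30 ft.
displayed = screen_values([2.1, 2.2, 99.9, -5.0, 2.3], very_low=0.0, very_high=30.0)
```

Values that fail the screen are withheld rather than corrected, so someone still has to review the station and decide what the record should show.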
Real-time Precip. QA problems • Current real-time data were checked to ensure data are being screened and troubleshot promptly • Real-time QA/QC is critical for precipitation data that may be used by emergency personnel • Data should be screened daily and potential problems troubleshot promptly, even at temporary stations. • Twelve of the 140+ sites had problems! • Several had erroneous spikes, and half a dozen haven’t recorded any precipitation in several months.
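The two kinds of problems noted on this slide (erroneous spikes and gages that have recorded nothing for months) lend themselves to a simple daily screen. The sketch below is only an illustration: the spike limit and dry-period length are hypothetical defaults, not values from the review, and should be set per site and climate.

```python
from typing import List


def precip_qa(daily_totals: List[float],
              spike_limit: float = 10.0,   # ASSUMED implausible daily total, inches
              dry_days: int = 90) -> List[str]:  # ASSUMED "several months"
    """Return a list of potential problems for one station's daily totals."""
    problems: List[str] = []
    if any(p > spike_limit for p in daily_totals):
        problems.append("spike")
    recent = daily_totals[-dry_days:]
    if len(recent) == dry_days and all(p == 0 for p in recent):
        problems.append("no recent precipitation")
    return problems
```

A long dry run is not proof of a failed gage, so a flagged station is a prompt to compare against nearby gages rather than a verdict.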
Compliance with precip. memo • 8 of 143 stations appeared to be “permanent” data stations lacking a “temporary” data qualification on NWISWeb. • None were set up correctly and no progress has been made since the previous review. • See Table 5 within the report. • The 135 remaining stations were considered temporary, and 121 were set up correctly. • The remaining 14 stations were displaying data beyond the 120-day period and were storing data in a primary DD instead of a working DD. • Further suggest adding permanent “temporary data” or “permanent data” DD descriptions to help keep temporary and permanent data from being mixed within the same DD – the data need to be kept separate if both types end up being collected at some point. • Problem stations will be provided in the appendix of the report. Determine if a site is permanent or temporary and set up accordingly. Defer to the on-site team to ensure that sites are serviced accordingly.
Peak Flow File Cleanup • Cleanup work specified in OSW Tech Memo 2009.01 was requested to have been completed by June 30, 2009 (but is really a continuous process…). • As of last progress report, approximately 73% of peak-specific problems and 95% of statistical anomalies from regression of gage height had been addressed. • Some progress since the last review. • Issues shouldn’t take much time to address. • Work that has been completed looked okay, but make sure documentation is being adequately archived. No comments could be found in pkentry, so I’m assuming cleanup effort is documented in-house. • Primarily, address: • Should run the scripts annually to check for entry errors. • Checks are not perfect or inclusive of all errors, so there may still be issues… but overall the WSC appears to be done with the bulk of the cleanup effort.
Meas_latency • Meas_latency designed to summarize measurement latency over a given period of time (amount of time it takes to enter new measurements into our data-processing system…). • Looked at 2009 through 2014 water years for entire WSC. • Overall measurement latencies are staying steady, and numbers are a little above the national average (slower entry times). • Taking around a week or two to get most of the measurements into the database...
Meas_latency • MO enters 75% of their measurements slower than most of the WSC’s in the nation. • Fair for a WSC of this size; numbers have stayed steady since 2010... • Emphasizing that measurement entry is a routine part of field trips can help lower latency time. • Ensuring personnel have routine access to data connections on remote field trips can also help if they’re not in the office at least one day a week to quickly load measurements…
Meas_latency • Difference between quickest 25% and slowest 25% of entry (IQR) is fairly average, indicating a little inconsistency in how quickly measurements get entered between personnel and/or stations (has been decreasing since 2009). • Lee’s Summit is faster than the other two field offices. • Should represent more important shift/DC latencies… is it taking 30+ days to get web discharges up-to-date?
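For readers unfamiliar with the quartile and IQR figures quoted on these slides, the sketch below shows one way such a latency summary could be computed. It is not the Meas_latency script; the input layout (pairs of measurement date and database entry date) and the function name are assumptions, and the quartile method is simply Python's default.

```python
from datetime import date
from statistics import quantiles
from typing import Dict, List, Tuple


def latency_summary(pairs: List[Tuple[date, date]]) -> Dict[str, float]:
    """Summarize entry latency in days; each pair is (measured on, entered on).

    Needs at least two measurements to compute quartiles.
    """
    latencies = [(entered - measured).days for measured, entered in pairs]
    q1, median, q3 = quantiles(latencies, n=4)  # default 'exclusive' method
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}
```

A small IQR means most measurements are entered on a similar schedule; a large one points to differences between personnel or stations, which is how the slide reads it.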
Meas_latency • 5 measurements (0.70%) are being displayed on MO’s NWISWeb that are not within error limits of instantaneous discharge values (National average is 0.80%). • All 5 were made within the last month. Discharges on NWISWeb should be adjusted as soon as possible. • This is a new check: briefly mentioned in the report.
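The comparison behind this check can be sketched as a simple percent-difference test between each measured discharge and the instantaneous discharge displayed at the same time. The slide does not state the error limits used, so the tolerance below is a hypothetical parameter, as are the function names and the input layout.

```python
from typing import List, Tuple


def within_limits(measured_q: float, web_q: float, tolerance_pct: float) -> bool:
    """True if the displayed discharge is within tolerance_pct of the measurement."""
    return abs(web_q - measured_q) <= (tolerance_pct / 100.0) * measured_q


def fraction_outside(pairs: List[Tuple[float, float]], tolerance_pct: float) -> float:
    """Fraction of (measured, displayed) pairs outside the tolerance."""
    if not pairs:
        return 0.0
    outside = sum(not within_limits(m, w, tolerance_pct) for m, w in pairs)
    return outside / len(pairs)
```

In practice the tolerance would likely depend on the measurement's quality rating, which this sketch leaves to the caller.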
Meas_latency • Usage reports indicate the MO WSC is doing an excellent job with electronic data entry. • It appears they have been using the technology for some time and have fully adopted it. Most other WSCs have fully adopted the technology, and the national average is now around 85%. • Electronic usage could help decrease the time it takes to get data entered.
Shiftcheck output • MO had a relatively low percentage of these shift types given the hydraulic conditions. • Lots to look at, and with this limited analysis it was tough to draw any definitive conclusions. • Some concern in the last review whether half-houses were simply being used instead of correct shifts for the hydraulic conditions. The only problem I found was related to backwater from a beaver dam, where subsequent flows weren’t estimated (they should be under variable backwater conditions). • Checks are limited; defer to the review team on whether shift analysis documentation and application is adequate… Should there be more of these types of shifts given the hydraulic conditions? • Use of these shifts should be justified in the station analysis using sound hydraulics.
Shiftcheck output cont’d This is what we’re looking for: • Should ideally be some rhyme and reason behind shift curve development – • What change is causing the shift? • How would that change affect the rating above/below the measurement? • Can you determine the approximate gage height a changed feature becomes submerged or comes out of water to determine where shift might break towards or away from base rating, etc...? • Keep transitions between different controls in mind (and try to determine the approximate flow/gage height where that transition occurs as that transition would be key for shift and rating development and can change along with the control changes). • Treat the cause (control change) not the symptoms (measurement shifts). • Define your shift curve based on the likely control change and hydraulic conditions at the site, then use the measurement shifts to set the actual values of the shift curve used. • Don’t define shift curves solely by measurement shifts (but use care not to overcorrect the rating too far beyond what your measurements are showing...)! • Document why you’re shaping the curves as you are (central to this is the likely cause of the shift and how that cause would affect the stage-discharge relationship at higher and lower flows) and why you’re timing the shifts as you are (when did the change occur?). • Most other information can be obtained from the database – no need to relist it in the analysis – just asking for discrepancies there...
Summary • Major things to do: • Comply with the precipitation memo. • Improve real-time QA/QC of precipitation data on NWISWeb. • Comply with OSW Tech Memo 2014.03 on gage operational limits. • Minor things to do: • Finish the little remaining peak flow file cleanup, although most of the work has been completed. Again, this is an ongoing task. • Ensure real-time discharges are accurate and are within percentages of recent measurements. • Definitely some good things: • Low percentages of problematic shifts. On-site team should check for adequate documentation. • Excellent usage of electronic data collection.