CUAHSI Hydrologic Information System: web services and related technologies. WaterML & Web Services team: David Valentine Tom Whitenack Tim Whiteaker Matt Rodriguez. HIS PIs: David Maidment Ilya Zaslavsky David Tarboton Michael Piasecki Jon Goodall
CUAHSI Hydrologic Information System: web services and related technologies. WaterML & Web Services team: David Valentine, Tom Whitenack, Tim Whiteaker, Matt Rodriguez. HIS PIs: David Maidment, Ilya Zaslavsky, David Tarboton, Michael Piasecki, Jon Goodall, and Rick Hooper (CUAHSI). www.cuahsi.org/his/
Hydrologic Information System Service Oriented Architecture Test bed HIS Servers HIS Lite Servers Central HIS servers External data providers Global search (Hydroseek) Deployment to test beds Customizable web interface (DASH) Other popular online clients HTML - XML Desktop clients Data publishing WaterOneFlow Web Services, WaterML WSDL - SOAP Ontology ETL services Controlled vocabularies Metadata catalogs ArcGIS ODM DataLoader Matlab ODMTools IDL Streaming Data Loading MapWindow Excel Ontology tagging (Hydrotagger) Programming (Fortran, C, VB) Server config tools Modeling (OpenMI) WSDL and ODM registration
GetSites GetSiteInfo GetVariables GetVariableInfo GetValues Hydrologic Information Server WaterOneFlow services DASH – data access system for hydrology ArcGIS Server Geospatial Data Observations Data & Catalogs Microsoft SQL Server Relational Database
6 5 4 2 3 1 WORKGROUP HIS SERVER ORGANIZATION STEPS FOR REGISTERING OBSERVATION DATA DASH Web Application Web Configuration file Stores information about registered networks MXD Stores information about layers Layer info, symbology, etc. WSDLs, web service URLs Connection strings Spatial store WOF services NWIS-IID points NWIS-IID WS USGS SQL Server NWIS-DV points NWIS-DV WS NWIS-IID NCDC ASOS points ASOS WS NWIS-DV STORET points STORET WS ASOS EPA TCEQ points TCEQ WS STORET BearRiver points BearRiver WS TCEQ TCEQ . . . . . . More WS from ODM-WS template More synced layers BearRiver My new points My new WS . . . More databases Background layers (can be in the same or separate spatial store) Geodatabase or collection of shapefiles or both Web services from a common template My new ODM ODMs and catalogs. All instances exposed as ODM (i.e. have standard ODM tables or views: Sites, Variables, SeriesCatalog, etc.) ODMDataLoader
New network registration steps
1. Using the ODM DataLoader or another tool, load your data into a blank ODM instance (this will create all ODM tables that HIS relies on).
2. Copy the Web Services template to a new folder, edit the template web.config file to point to the new ODM, and test to make sure the new service works as expected.
3. Create a point layer (a feature class in GDB, or a shapefile) from the new ODM’s Sites table using the GetSitesTool.
4. Add the point layer to the MXD document; specify symbology, scale-dependent rendering, etc.
5. Add information about the new ODM, the associated web service, and the associated point layer to the HIS configuration file (see the first slide for the exact content).
6. Restart the HIS service.
7. Register and test the new service at the HIS Central: http://water.sdsc.edu/centralhis/
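The point-layer step above (a Sites table turned into point features) can be sketched roughly as follows. This is not the GetSitesTool itself: it is a minimal stand-in that assumes an ODM-style Sites table with SiteCode, SiteName, Latitude and Longitude columns, uses an in-memory SQLite database in place of SQL Server, and writes GeoJSON rather than a feature class or shapefile. The site codes and coordinates are made up for illustration.

```python
import sqlite3


def sites_to_geojson(conn):
    """Read ODM-style Sites rows and return a GeoJSON FeatureCollection dict."""
    rows = conn.execute(
        "SELECT SiteCode, SiteName, Latitude, Longitude "
        "FROM Sites ORDER BY SiteCode"
    ).fetchall()
    features = [
        {
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"SiteCode": code, "SiteName": name},
        }
        for code, name, lat, lon in rows
    ]
    return {"type": "FeatureCollection", "features": features}


def demo_connection():
    """Build a throwaway database with two made-up sites (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Sites "
        "(SiteCode TEXT, SiteName TEXT, Latitude REAL, Longitude REAL)"
    )
    conn.executemany(
        "INSERT INTO Sites VALUES (?, ?, ?, ?)",
        [
            ("S1", "Example site one", 41.7, -111.8),
            ("S2", "Example site two", 41.9, -111.6),
        ],
    )
    return conn
```

Calling `sites_to_geojson(demo_connection())` yields one point feature per site; a real deployment would point the query at the new ODM instance instead.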
Against the NIH Syndrome
2006:
• CUAHSI HIS web services are discussed on the BASINS mailing list as a new way to access hydrologic data. The list is mostly used by hydrologists and developers outside academia;
• NCDC develops ASOS web services following WaterML;
2007:
• MOU with USGS; USGS is developing WaterML-compliant GetValues service;
• GLEON uses an early version of ODM to develop their own database schema (VEGA)
• Phoenix LTER is developing ODM (in MySQL) and WaterML web services (in Java)
• A Google Earth-based client for CUAHSI web services is developed at CSIRO, Australia
• Deployment to 11 hydrologic observatory test beds, + CBEO (CEOP project)
2008:
• KISTERS develops WaterML-compliant web services over their database, for a client
• MapWindow open source GIS develops WaterOneFlow parsers
• Florida, Texas and Idaho use ODM and WaterOneFlow web services to provide access to state data repositories; New Jersey is considering the same.
• Another CEOP project, at UC-Davis, is implementing ODM (in Postgres) and web services (in Java) – just learned last week…
WaterML design principles
• Driven largely by hydrologists; the goal is to capture the semantics of hydrologic observations discovery and retrieval
• Relies to a large extent on the information model as in ODM (Observations Data Model), and terms are aligned as much as possible
• Several community reviews since 2005
• Driven by data served by USGS NWIS, EPA STORET, multiple individual PI-collected observations
• Is no more than an exchange schema for CUAHSI web services
• Lowest barrier to adoption by hydrologists
• A fairly simple and rigid schema tuned to the current implementation
• Conformance with OGC specs not in the initial scope – but working with OGC on this
WaterOneFlow: a set of query functions that return data in WaterML. Data sources: NWIS Daily Values (discharge), NWIS Ground Water, NWIS Unit Values (real time), NWIS Instantaneous Irregular Data, EPA STORET, NCDC ASOS, DAYMET, MODIS, NAM12K, USDA SNOTEL, ODM (multiple sites)
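As a rough illustration of what calling one of these query functions over SOAP involves, the sketch below builds a request envelope for the five WaterOneFlow methods named in these slides. The service namespace and the parameter names (location, variable, startDate, endDate) are assumptions, as are the example codes; a real client would take them from the service WSDL. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
WOF_NS = "http://www.cuahsi.org/his/1.0/ws/"  # assumed service namespace

# Method names as listed on the Hydrologic Information Server slide.
METHODS = {"GetSites", "GetSiteInfo", "GetVariables", "GetVariableInfo", "GetValues"}


def build_request(method, **params):
    """Return a SOAP envelope (as a string) for one WaterOneFlow method call."""
    if method not in METHODS:
        raise ValueError(f"unknown WaterOneFlow method: {method}")
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{WOF_NS}}}{method}")
    # Each keyword argument becomes one child element of the method element.
    for name, value in params.items():
        ET.SubElement(call, f"{{{WOF_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Parameter names and codes here are hypothetical.
    print(build_request(
        "GetValues",
        location="EXAMPLE:S1",
        variable="EXAMPLE:V1",
        startDate="2007-01-01",
        endDate="2007-01-31",
    ))
```

The resulting string would be POSTed to the service endpoint by whatever HTTP client the application already uses.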
WaterML key elements
Response types: SiteInfo (GetSiteInfo), Variables (GetVariableInfo), TimeSeries (GetValues)
Key elements: site, sourceInfo, seriesCatalog, variable, timeSeries, values, queryInfo
Structure of responses
sitesResponse: queryInfo (criteria, queryURL); site (1 to many): siteInfo, seriesCatalog (1 to many); seriesCatalog contains many series, each with variable and variableTimeInterval
variablesResponse: variables, containing 1 to many variable elements
timeSeriesResponse: queryInfo (criteria, queryURL); timeSeries: sourceInfo, variable, values
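To make the sitesResponse nesting concrete, here is a small sketch that walks a hand-written, simplified response and lists the series available at each site. The sample XML is an assumption for illustration: it omits namespaces and most child elements, and the codes and counts are made up. Real WaterML documents are namespaced and richer.

```python
import xml.etree.ElementTree as ET

# Simplified, un-namespaced stand-in for a sitesResponse document.
SAMPLE_SITES_RESPONSE = """<sitesResponse>
  <queryInfo><criteria/></queryInfo>
  <site>
    <siteInfo><siteName>Example site one</siteName><siteCode>S1</siteCode></siteInfo>
    <seriesCatalog>
      <series>
        <variable><variableCode>V1</variableCode></variable>
        <valueCount>120</valueCount>
      </series>
      <series>
        <variable><variableCode>V2</variableCode></variable>
        <valueCount>45</valueCount>
      </series>
    </seriesCatalog>
  </site>
</sitesResponse>"""


def summarize_sites(xml_text):
    """Map each site code to {variable code: value count} from its series."""
    root = ET.fromstring(xml_text)
    summary = {}
    for site in root.findall("site"):
        code = site.findtext("siteInfo/siteCode")
        summary[code] = {
            series.findtext("variable/variableCode"): int(series.findtext("valueCount"))
            for series in site.findall("seriesCatalog/series")
        }
    return summary
```

The same walk (response, then site, then seriesCatalog, then series) is what a client uses to discover which variables a site offers before requesting values.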
More information about WaterML: next 20 slides (we may skip them), or check the specification online at http://www.opengeospatial.org/standards/dp
Elements Defining Spatial Location SourceInfoType for observation sites for continuous surfaces SiteInfoType DatasetInfoType (other site information) child elements (other dataset information) GeogLocationType GeogLocationType LatLonPointType LatLonPointType LatLonBoxType
SiteInfoResponseType • Namespaces • queryInfo • site Network Sites Variables
queryInfo example • Parameters sent to service (user parameters) • URLs called (if external resource) (query URL)
siteInfo • Name • Site Code • Location
geoLocation • geogLocation – geographic coordinates • LatLon point • LatLon box • localSiteXY – projected coordinates
series • variable – what is measured • valueCount – how many measurements • variableTimeInterval – when is it measured TimePeriodType
variable • variableCode – global identifier • variableName • units Sites Variables Values TimePeriodType
Compare with… variableTimeInterval • TimePeriodType – date range (including “last n days”) • TimeInstantType – single measurement
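One way to picture the difference between the two interval types is to resolve each into a concrete begin/end pair. The sketch below is only an illustration: the class and field names (TimePeriod, TimeInstant, last_n_days) are hypothetical stand-ins and do not reproduce the WaterML schema definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple, Union


@dataclass
class TimePeriod:
    """A date range, given either explicitly or as 'last n days'."""
    begin: Optional[datetime] = None
    end: Optional[datetime] = None
    last_n_days: Optional[int] = None  # hypothetical field name


@dataclass
class TimeInstant:
    """A single measurement time."""
    at: datetime


def resolve(interval: Union[TimePeriod, TimeInstant],
            now: datetime) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return (begin, end) for either interval type, relative to 'now'."""
    if isinstance(interval, TimeInstant):
        # A single measurement begins and ends at the same moment.
        return interval.at, interval.at
    if interval.last_n_days is not None:
        # A rolling window is anchored to the time of the query.
        return now - timedelta(days=interval.last_n_days), now
    return interval.begin, interval.end
```

A rolling "last n days" period therefore describes a different range each time it is evaluated, which matters for catalogs that cache series extents.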
SiteInfo response (annotated example): queryInfo; site: name, code, location; seriesCatalog; series: variables, how many, when (TimePeriodType)
VariablesResponseType • variable – same as in series element • Code, name, units Sites Variables Values
TimeSeriesResponseType • queryInfo • timeSeries • sourceInfo – “where” • variable – “what” • values Sites Variables Values
sourceInfo • SiteInfoType • Same as siteInfo element • code, name, location • DataSetInfoType • For data continuous in space • LatLonPointType • LatLonBoxType
values • Each time series value recorded in value element • Timestamp, plus metadata for the value, recorded in element’s attributes qualifier ISO Time value
value metadata examples • qualifiers • censorCode (lt, gt, nc) • qualityControlLevel (Raw, QC’d, etc.) • methodID • offset • offsetValue • offsetUnitsAbbreviation • offsetDescription • offsetUnitsCode
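Reading values together with their per-value metadata might look like the following sketch. The sample fragment is hand-written for illustration: the metadata attribute names follow the list above, while the timestamp attribute name (dateTime), the lack of namespaces, and the numbers themselves are assumptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hand-written, simplified fragment; not taken from a real service response.
SAMPLE_VALUES = """<values>
  <value dateTime="2007-09-01T00:00:00" censorCode="nc" qualifiers="P" methodID="1">12.5</value>
  <value dateTime="2007-09-02T00:00:00" censorCode="lt" qualityControlLevel="Raw">0.1</value>
</values>"""


def parse_values(xml_text):
    """Return one record per value element: timestamp, number, and metadata."""
    records = []
    for element in ET.fromstring(xml_text).iter("value"):
        metadata = dict(element.attrib)
        # The timestamp is pulled out; every other attribute stays as metadata.
        timestamp = datetime.fromisoformat(metadata.pop("dateTime"))
        records.append(
            {"time": timestamp, "value": float(element.text), "meta": metadata}
        )
    return records
```

Keeping the metadata as a plain dictionary lets a client filter on censorCode or qualityControlLevel without needing to know every possible attribute in advance.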
TimeSeries response queryInfo location variable values
OGC Harmonization Best Practices WaterML text includes steps for harmonizing with GML/O&M Align spatial feature descriptions (e.g. using gml:Point, gml:Envelope) Align service signatures (getCapabilities) Align terminology with O&M
More on OGC Interactions • WaterML is published as OGC Discussion paper • We are working with OGC O&M (Observations & Measurements) authors to reconcile WaterML with OGC specs: • WOML proposal (Water Observations ML) • Think of it as WFS + O&M type service (with GetSiteInfo proxied as OGC’s GetFeatureInfo request) • The plan is to provide WOML-compliant services alongside WaterML 1.1 • There is an international group that Dave Valentine is coordinating, focused on developing a standard water information exchange schema (started at WaterML workshop in Canberra in September’07) • As we are going with OGC through this harmonization, we shall be able to assist WQX with similar mappings (more below on mapping between WaterML and WQX)
Stations are also available as:
• ArcGIS Server services:
• http://river.sdsc.edu/arcgis/services, the Networks service (NWIS DailyValues, NWIS_IID, EPA STORET, and USDA SNOTEL). Attributes: Station ID, Station name, Organization, station type (and secondary type), lat/lon, state, county, Elevation (for EPA); SiteCode, SiteName, lat/lon, elevation, state (for USGS)
• http://river.sdsc.edu/arcgis/services --> CBEO (Chesapeake)
• http://river.sdsc.edu/arcgis/services --> UCSD (several networks)
• http://his02.usu.edu/arcgis/services --> Networks (Utah)
• http://ees-his06.ad.ufl.edu/arcgis/services --> networks (Florida, Santa Fe basin)
• http://ccbay.tamucc.edu/arcgis/services --> networks (Corpus Christi)
• http://his03.geol.umt.edu/arcgis/services --> networks (Montana)
• http://his.safl.umn.edu/arcgis/services --> networks (Minnesota)
• http://his08.iihr.uiowa.edu/arcgis/services --> networks (Iowa)
More listed at http://www.watersnet.org/wtbs/dash-sites.html
• Will also be served as OGC’s WMS and WFS services (we did a demo last year, with IGCC and other atmospheric data, served via WMS; we have also mapped WFS data into WaterML for Australia)
WaterML, and USGS Values • SDSC hosts a database catalog of USGS sites and series information (last update 09/2007 – courtesy David Briar) • GetValues method (Beta) now hosted at USGS • Follows the CUAHSI Webservices, and returns WaterML TimeSeriesResponse • Also need to fix null values handling; more flexible datetime handling; multiple value blocks • Our service now proxies the USGS service instead of scraping the web site (testing should be complete as we speak) • More services to be developed (Real Time is next) • Possibly routing Real Time data via a streaming data server (RBNB DataTurbine, http://dataturbine.org) • Also, a station matching service (discussing with Sandy Williamson)
EPA Web Services, and WaterML • EPA now provides web services http://www.epa.gov/storet/web_services.html • The web services use WQX, an implementation of the Environmental Sampling, Analysis and Results data standard. • Using the EPA web services instead of scraping gives more than an order-of-magnitude speedup. • Need to do final testing before deployment • Need a catalog update (so far, we used a scraped one) • Ideally, would need a DB dump (so far, we have 45 million points scraped but this is old … I will show an OLAP datacube for STORET later) • Issues: WQX is based on the EPA data model (org-(Analysis-Location-(Result))) whereas WaterML is time-series oriented (Site-Variable-(Result)). • We are working on mapping WQX into WaterML • Q from audience: “can we use HIS to submit WQX data to STORET?”
Mapping WQX results to WaterML TimeSeries (call to StationWebService)
Each WQX activity produces one (or more) WaterML DataValue.
WaterML: • TimeSeries • Site • SiteInfo • Variable • VariableName • Units • Values • DataValue • DateTime • Value • Qualifiers • Method • DataValue • DateTime • Value • Qualifiers • Method • Qualifier • Methods
WQX: • Organization • Activity • ActivityDescription • ActivityStartDate • Details • MonitoringLocation • StationID and Name only • Result • Result Details • CharacteristicName (variable) • [ResultMeasureValue,Unit] ResultMeasure (DataValue) • Qualifier • BiologicalResultDescription • Details • ResultLabInformation • AnalysisStartDate • ResultAnalyticalMethod • Activity • ActivityDescription • ActivityStartDate
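The regrouping behind this mapping, from activity-oriented WQX records to site/variable time series, can be sketched as below. It is a simplified illustration and not the mapping the team implemented: the input is assumed to be already parsed into dictionaries whose keys borrow the WQX element names from this slide, and the "Results" key and sample data are hypothetical.

```python
from collections import defaultdict


def wqx_to_timeseries(activities):
    """Regroup activity-oriented records into (site, variable, unit) series."""
    series = defaultdict(list)
    for activity in activities:
        site = activity["MonitoringLocation"]["StationID"]
        when = activity["ActivityStartDate"]
        # Each result of an activity becomes one data value in some series.
        for result in activity["Results"]:
            key = (site, result["CharacteristicName"], result["Unit"])
            series[key].append((when, result["ResultMeasureValue"]))
    # ISO-formatted date strings sort chronologically as plain strings.
    return {key: sorted(values) for key, values in series.items()}


# Made-up sample: two activities at one station.
SAMPLE_ACTIVITIES = [
    {
        "MonitoringLocation": {"StationID": "ST-1"},
        "ActivityStartDate": "2007-06-02",
        "Results": [
            {"CharacteristicName": "Nitrate", "ResultMeasureValue": 0.8, "Unit": "mg/l"},
        ],
    },
    {
        "MonitoringLocation": {"StationID": "ST-1"},
        "ActivityStartDate": "2007-06-01",
        "Results": [
            {"CharacteristicName": "Nitrate", "ResultMeasureValue": 0.5, "Unit": "mg/l"},
            {"CharacteristicName": "pH", "ResultMeasureValue": 7.2, "Unit": "std units"},
        ],
    },
]
```

Keying each series on the unit as well as the variable avoids silently mixing results reported in different units, which is one of the practical hazards of this kind of mapping.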
NCDC services, and WaterML • Rich Baldwin developed REST services over ASOS, following WaterML schema • In Java, need an auth token • We wrap them as WaterOneFlow services (need to revisit this) • Need to update the catalog (so far just ASOS)
Status * We also have a merged NWIS DV and STORET catalog datacube
US Map of USGS Observations Alaska Puerto Rico Hawaii Antarctica
Different types of nutrients by decade: Available Data Total
STORET Datacube The EPA STORET datacube contains 273K sites and 2.7M series. 93% of the series are water quality data. About 60% of the water quality records are short-term measurements (one year or less in duration). The starting decades of the longer series are shown to the right. Number of years of record by start decade Florida is the source of about 25% of the total records. Demo
Beyond Syntactic Uniformity: Semantic Mediator (Michael Piasecki, Drexel University) What we are doing now …: GetValues calls to each source (NAWQA, NWIS, NARR, ODM); generic request: GetValues
Hydroseek http://www.hydroseek.net Supports search by location and type of data across multiple observation networks including NWIS, STORET, and university data
CUAHSI HIS as a mediator across multiple agency and PI data • Keeps identifiers for sites, variables, etc. across observation networks • Manages and publishes controlled vocabularies, and provides vocabulary/ontology management and update tools • Provides common structural definitions for data interchange • Provides a sample protocol implementation • Governance framework: a consortium of universities, MOUs with federal agencies, collaboration with key commercial partners, led by renowned hydrologists, and NSF support for core development and test beds
SDSC History & Overview
• Founded in 1985, as one of the five original supercomputer centers, funded by the National Science Foundation; one of the first 5 nodes on NSFnet (1986)
• Became an organized research unit of UCSD in 1996
• 350 employees
• 37+ teraflops of computing capacity; 25+ petabytes of storage capacity
• High bandwidth connectivity; San Diego Network Access Point (SD-NAP)
• 20+ year track record of applied & applications-oriented R&D in the service of science & society
• Strong R&D through NSF grants covering astronomy, biomedicine, clinical trials, computational chemistry, earth sciences, ecology, geophysics, hydrology, molecular biology, neuroscience, public safety
Photos: SDSC building on UCSD campus; IBM Datastar supercomputer