MI3 Station History Information Management System: Design, Details and Directions. Jeff Arnfield, Station History Program Manager, National Climatic Data Center, Asheville, NC, National Oceanic and Atmospheric Administration
MI3: Presentation Roadmap • Background • System Overview and Walkthrough • Enhancements Underway • Challenges • Questions
Metadata: Our Big Picture Observing Systems Satellite Granule Station Histories Datasets Standards Inventories
MI3: Goals • Integrate, enhance & increase access • Initial focus: manage station histories • Widely accessible • Support data ingest and access needs • Accommodate NOAA and non-NOAA stations • Contain wide variety of station details • Handle new observing systems, programs and phenomena without recoding • Track information sources, log all changes • Integrate with inventories, other details
First Step: Document Imaging • Much station info available only on paper • Reviewed, collated and imaged station info documents • 500 different forms, 50 commonly used • 450,000 documents • 750,000 pages • 37,000 stations • Images available on the web using WSSRD • Web Store Search Retrieve Display • Commercial service of Information Manufacturing Corporation • Security via individual user accounts, controlled by NCDC • Used by NOAA, state climatologists, others • Privacy concerns limit access • Content will be incorporated into database
Starting State: Standalone Station Sets • Variety of station repositories • usually designed for specific task, project, system • systems, spreadsheets, ASCII files • Both formal and ad hoc, updated and static • variety of information sources • variable freshness, accuracy, detail • some lacked historical values • Varying accessibility and usability • Updates did not propagate to all systems • Systems contained conflicting values • Seldom integrated with related information
The nature of the subject matter • Subject matter is challenging • Each station has many, many details • Each detail may vary independently • Each detail has its own period of validity • Variations in station management and identification practices • No widespread standard for handling station information • Station practices are challenging • Stations may participate in multiple networks and programs • Stations may be known by different names and identifiers • There may be multiple information sources for a given station • Few networks provided an automated station information feed • Frequent historical information backfill and correction
MI3: Design Inputs • Existing NCDC systems • Current and projected production requirements • Existing and projected data holdings and sources • Available station metadata sources • US CRN metadata requirements team output • NCDC subject matter expert workshops • Projects with NOAA, national, international partners
MI3: Database Design Options • Easy to find latest value • Easier to query • May improve performance • Simpler to develop • Good for batch submission • Synchronized entries in all tables when anything changes • Greater redundancy • More difficult to detect changes • Changes may affect many versions • Single change could require new version • Independent atomic values with begin and end dates • Fewer records • Minimizes redundancy • Changes easy to detect • Single update is logically propagated • Potential performance impact • Date management more complex • Development more difficult
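The second option on this slide, independent atomic values with begin and end dates, can be illustrated with a minimal Python sketch. It is not the MI3 schema: the `Fact` class, its field names, the sample station and the choice of an exclusive end date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    """One atomic station detail with its own period of validity (hypothetical layout)."""
    station_id: str
    attribute: str
    value: str
    begin: date
    end: Optional[date] = None  # assumption: end is exclusive, None means still valid


def value_on(facts: List[Fact], station_id: str, attribute: str, when: date) -> Optional[str]:
    """Return the value of one attribute as of a given date, or None if unknown."""
    for f in facts:
        if (f.station_id == station_id and f.attribute == attribute
                and f.begin <= when and (f.end is None or when < f.end)):
            return f.value
    return None


# Made-up sample data: the name changes once, the elevation never changes.
facts = [
    Fact("S1", "name", "OLD NAME", date(1950, 1, 1), date(1980, 1, 1)),
    Fact("S1", "name", "NEW NAME", date(1980, 1, 1)),
    Fact("S1", "elevation_m", "650", date(1950, 1, 1)),
]

print(value_on(facts, "S1", "name", date(1975, 6, 1)))
print(value_on(facts, "S1", "name", date(1980, 1, 1)))
```

The name change adds one record and closes one; the elevation record is untouched, which is the "fewer records, minimal redundancy" property listed above. The cost is that every lookup has to reason about dates.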
MI3: Development Environment • Web-based user interface • ColdFusion • JavaScript • Oracle stored procedures • Oracle database • Separate end-user query instance minimizes impact on production, increases availability
MI3: System Organization • Information grouped into subject areas • Identity • Updates • Location • Involved Parties • Data Programs • Datasets / Products • Equipment • Phenomena / Observing Practices • Location Map • Remarks • Administrative Options
MI3: Capabilities and Features • Flexible search options • One interface for query and maintenance (privilege-based) • Tabbed interface by subject area, easily expanded • Overview grids showing all time periods • Drill-down capability to an integrated form view • Further drill-down to individual “fact” level • Generates critical production reports • Date management functions simplify views/reports • Direct query, CGI access from other systems
MI3: System Security • Initial security at the subject area tab-level • Can configure for no access, read, read/write • At database level, it’s either read or read/write • Security enhancements underway • Control update privileges at the station group and data program level • Provides more granular database security • Users can maintain only “their” stations
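The two security layers on this slide (tab-level access now, station-group control as an enhancement) could be combined roughly as in the sketch below. The `Access` levels mirror the three settings named on the slide; the profile layout, the function names and the sample user are hypothetical, and data-program-level control is left out for brevity.

```python
from enum import IntEnum


class Access(IntEnum):
    """The three tab-level settings named on the slide."""
    NONE = 0
    READ = 1
    READ_WRITE = 2


# Hypothetical user profile: per-tab access plus the station groups the user maintains.
profile = {
    "tabs": {"Identity": Access.READ_WRITE, "Equipment": Access.READ},
    "station_groups": {"CRN"},
}


def can_view(profile: dict, tab: str) -> bool:
    """A tab is visible when the user has at least read access to it."""
    return profile["tabs"].get(tab, Access.NONE) >= Access.READ


def can_update(profile: dict, tab: str, station_group: str) -> bool:
    """Updates need read/write on the tab AND membership in the station's group."""
    tab_ok = profile["tabs"].get(tab, Access.NONE) >= Access.READ_WRITE
    return tab_ok and station_group in profile["station_groups"]


print(can_update(profile, "Identity", "CRN"))
print(can_update(profile, "Identity", "COOP"))
```

Under these assumptions a user keeps read access across tabs but can maintain only "their" stations.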
MI3: Content • Initial content ported from legacy system • Many issues discovered and corrected • More than 33,300 stations, including: • 27,250 Cooperative stations (11,740 currently open) • 886 ASOS • 76 CRN stations • 160 RADAR sites • 534 AWOS sites (others in process) • About 5,200 other stations (mostly surface, includes historical) • Associations with 17 different datasets
MI3: Home Page http://mi3.ncdc.noaa.gov
MI3: External Access via CGI • MI3 station details window can be instantiated by any system using a CGI call to open a web browser • MI3 opens a station details window as GUEST user • If multiple stations have used that ID (it happens), a list is presented so user can select the station(s) of interest • http://arachne.ncdc.noaa.gov/mi3qry/displaystation.cfm?idtypeabbr=ICAO&idvalue=KAVL • idtypeabbr is an abbreviation for the ID type • idvalue is a valid station ID of the specified type
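A calling system could assemble this URL as in the following Python sketch. It assumes the address on the slide is one unbroken string and that `idtypeabbr` and `idvalue` are the only parameters needed; the helper name `station_url` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide; it may no longer be reachable.
BASE_URL = "http://arachne.ncdc.noaa.gov/mi3qry/displaystation.cfm"


def station_url(id_type: str, id_value: str) -> str:
    """Build the station-details URL for one ID type / ID value pair."""
    query = urlencode({"idtypeabbr": id_type, "idvalue": id_value})
    return f"{BASE_URL}?{query}"


print(station_url("ICAO", "KAVL"))
# → http://arachne.ncdc.noaa.gov/mi3qry/displaystation.cfm?idtypeabbr=ICAO&idvalue=KAVL
```

Using `urlencode` rather than string concatenation keeps the call safe if an ID value ever contains characters that need escaping.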
MI3: Station ID Types • Abbreviations for ID types currently supported: • ICAO - International Civil Aviation Organization: 4 alphanumerics • WBAN - Weather Bureau Army Navy: 5 digits • COOP - NWS Cooperative ID: 6 digits • FAA - Federal Aviation Administration Call Sign: 3 characters • WMO - World Meteorological Organization Index Number: 5 digits • NWSLI - NWS Location Identifier: 3-5 alphanumerics; inconsistent for older stations • GOES - Geostationary Operational Environmental Satellite: format varies; used for GOES data transmission, currently entered only for CRN stations • CRN - Internal CRN network station ID: 4 digits • NCDCSTNID - NCDC Station ID: 8 digits; internal management • List is table driven, new types easily added
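The formats listed on this slide lend themselves to a table-driven check, sketched below in Python. The patterns are a literal reading of the slide ("5 digits", "3-5 alphanumerics" and so on), not MI3's actual validation rules; the function name and the sample values are made up.

```python
import re
from typing import Dict, Optional

# One pattern per ID type, taken from the slide; None means "format varies".
ID_FORMATS: Dict[str, Optional[str]] = {
    "ICAO": r"[A-Za-z0-9]{4}",
    "WBAN": r"[0-9]{5}",
    "COOP": r"[0-9]{6}",
    "FAA": r".{3}",
    "WMO": r"[0-9]{5}",
    "NWSLI": r"[A-Za-z0-9]{3,5}",
    "GOES": None,
    "CRN": r"[0-9]{4}",
    "NCDCSTNID": r"[0-9]{8}",
}


def looks_valid(id_type: str, value: str) -> bool:
    """Check a value against the documented shape of its ID type."""
    if id_type not in ID_FORMATS:
        raise KeyError(f"unknown ID type: {id_type}")
    pattern = ID_FORMATS[id_type]
    if pattern is None:
        return bool(value)  # no fixed format: accept any non-empty value
    return re.fullmatch(pattern, value) is not None


print(looks_valid("ICAO", "KAVL"))
print(looks_valid("WBAN", "1234"))
```

Because the rules live in a dictionary, supporting a new ID type means adding one entry, in the same spirit as the table-driven list on the slide. A shape check cannot confirm that a station with that ID exists.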
MI3: Enhancements Underway • Enhanced security – by station group, data program • Data inventories • Will replace a current system, providing added functionality • Different levels of granularity • Stations and geographic areas • Presentation options selectable by user • Digital image management • Simple map of query results • Print / export search results • Display and maintain related remarks by subject tab • Revised, expanded documentation
MI3: Planned Enhancements • GIS interface • Additional station reports and views • Advanced station management utilities • Generic auto-ingest for station information • Data collection and QC workflow interface for other networks • Links to external information • Collection-level FGDC metadata • Network reference information • Station history document images
MI3: Content Expansion Underway • Research, correction of existing stations • Automated update of WMO stations • Update geographic details via GIS • Add details to operational networks • Data inventories
MI3: Content Plans • Other NERON stations • Add historical details to current stations • Upper Air Stations • Air Force Master Station Catalog • Other NOAA stations • Other national, international stations
MI3: Adding A New Station Group • Identify information source(s) • Identify affected systems and processes • Quantify volume of information • Station count • Information volume per station • Map information to MI3, identify gaps • Identify overlap with current stations • Define QC requirements / develop QC processes • Develop ingest processes • Manual • Automated (if warranted/feasible) • Define QA/QC and data entry rules, workflow and personnel • Develop critical reports & extracts • For operational networks • Ensure ongoing refresh from source agency(ies) • Identify anticipated update frequency & volume
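The "identify overlap with current stations" step can be approximated by matching on shared identifiers, as in this Python sketch. It assumes each station is described by a set of (ID type, ID value) pairs; the data layout, function name and sample stations are hypothetical, and a real comparison would also weigh names, locations and dates.

```python
from typing import Dict, Set, Tuple

IdPair = Tuple[str, str]  # (ID type abbreviation, ID value)


def find_overlap(existing: Dict[str, Set[IdPair]],
                 incoming: Dict[str, Set[IdPair]]) -> Dict[str, Set[str]]:
    """Map each incoming station to the existing stations that share an identifier."""
    # Index existing stations by every identifier they carry.
    index: Dict[IdPair, Set[str]] = {}
    for key, ids in existing.items():
        for pair in ids:
            index.setdefault(pair, set()).add(key)

    overlap: Dict[str, Set[str]] = {}
    for key, ids in incoming.items():
        matches: Set[str] = set()
        for pair in ids:
            matches |= index.get(pair, set())
        if matches:
            overlap[key] = matches
    return overlap


# Made-up sample data.
existing = {"E1": {("WBAN", "11111"), ("ICAO", "KAAA")}, "E2": {("COOP", "222222")}}
incoming = {"N1": {("ICAO", "KAAA")}, "N2": {("WMO", "33333")}}

print(find_overlap(existing, incoming))
```

An incoming station can match more than one existing station, because identifiers are sometimes reused, so the result holds a set of candidates for review rather than a single answer.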
Station History Management Issues • Dealing with multiple networks • Tracking changes • Date management • Logical period of validity for each detail • Constructing coherent views of the data • Maintaining logical integrity during update • Station ID management through the years
MI3: Dealing with dates • Each station may have many, many details • Details may vary independently • Each detail has its own period of validity • Relational design spreads details across many tables • Some tables may contain no detail for a station for a given time period
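One way to build a coherent view from details that change independently is to cut the timeline at every begin or end date and take a snapshot of each resulting interval. The Python sketch below does this for one station. The tuple layout, the exclusive end date and the sample values are assumptions for illustration, not MI3's date management functions.

```python
from datetime import date
from typing import Any, Dict, List, Optional, Tuple

# (attribute, value, begin, end); assumption: end is exclusive, None means open-ended.
Detail = Tuple[str, Any, date, Optional[date]]

# Made-up details for one station; elevation is unknown before 1960.
details: List[Detail] = [
    ("name", "ALPHA", date(1950, 1, 1), date(1980, 1, 1)),
    ("name", "BETA", date(1980, 1, 1), None),
    ("elevation_m", 650, date(1960, 1, 1), None),
]


def timeline(details: List[Detail]) -> List[Tuple[date, Optional[date], Dict[str, Any]]]:
    """Split time at every change point and list the details valid in each interval."""
    points = sorted({d[2] for d in details} | {d[3] for d in details if d[3] is not None})
    rows = []
    for i, start in enumerate(points):
        stop = points[i + 1] if i + 1 < len(points) else None
        snapshot = {attr: val for attr, val, begin, end in details
                    if begin <= start and (end is None or start < end)}
        if snapshot:  # skip intervals in which nothing is valid
            rows.append((start, stop, snapshot))
    return rows


for start, stop, snapshot in timeline(details):
    print(start, stop, snapshot)
```

The first interval has no elevation at all, which corresponds to the case above where a table holds no detail for a station during a given period; a view or report has to tolerate such gaps.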