Clearinghouse WG Telecon October 7, 2003 (updated after call)
Agenda • Introductions • Status • Gateways, Registration, Service testing • Software (Isite, others) • Geospatial One-Stop Portal • Overview • Integration with Clearinghouse
Attendees USGS Denver BLM Minnesota Wisconsin Oak Ridge NL ATS Inc Dept Homeland Security USDA FSA Intergraph Cornell Univ Wyoming Washington Gt. Lakes Commission Army Corps New Mexico SARIS James Madison Univ (VA) NASA GCMD FGDC Staff Illinois NRCS GeoConnections ESRI
Gateways • Operating Blue Angel Gateway 3.6.2 at all six locations • Updated service templates and a javascript OGC Web Map Viewer (multiviewer) included in the package • Gateways should be at same revision • No migration to MetaStar 6 planned
Gateway Actions • FGDC and EDC have updated their Gateway instances as per recent guidance • Be sure Gateway admins are getting the notices • Find status of NOAA-CSC, ESRI, and Alaska GDC Gateways
Status checks on server quality • Existing Clearinghouse Status page is being renovated to provide more information for providers and administrators • New page will be set up on the new Linux host at EDC (server consolidation) • May add more checks to help in assessing quality for One-Stop use
Status page screenshot, callouts 1 to 5 • Quick Status Line: X means successful; 0 means 0 records returned; Timed Out means the connection was broken after 60 seconds; Error means an error occurred • Test that returned error • Text of error message (if any) • Information currently stored in the database for testing • Contact information for node administrator
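The Quick Status Line rules above can be sketched as a small classifier. This is an illustration only, not the status page's actual code: the function name, its parameters, and the order in which the conditions are checked are assumptions; only the four symbols and the 60-second limit come from the slide.

```python
from typing import Optional

TIMEOUT_SECONDS = 60  # from the slide: the connection is broken after 60 seconds


def quick_status(records: Optional[int], elapsed: float,
                 error: Optional[str] = None) -> str:
    """Map one server test result to a Quick Status Line symbol.

    records: number of records returned (None if no answer was received)
    elapsed: seconds spent waiting for the server
    error:   error message text, if any
    The order of the checks below is an assumption.
    """
    if error is not None:
        return "Error"
    if elapsed >= TIMEOUT_SECONDS:
        return "Timed Out"
    if not records:
        return "0"
    return "X"  # on this status page, X marks a successful test
```

For example, `quick_status(12, 1.5)` gives `"X"` and `quick_status(None, 60.0)` gives `"Timed Out"`.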
Isite & Isearch Activities October 2003 Archie Warnock warnock@awcubed.com
Summary of Recent Activities • Numerous bug fixes and several new features, including new Z39.50 configuration options • Release available this week at ftp.awcubed.com, FGDC to follow • Binaries for Linux, Solaris, Windows (Cygwin) • Updated documentation (now and forever)
Isearch2 Release 2.0.5 • Spatial Ranking now implemented and undergoing testing • CGI and Z39.50 demos will be available shortly at www.awcubed.com/FGDCsearch.html • Official release this week • Binaries for Linux, Solaris, Windows/Cygwin • Isearch-cgi now includes full support for search engine capabilities • numeric, temporal & spatial searching, remote and virtual collections
Coming Next • New GUI administration program for configuring and running Isearch2 and Isite2 • Both Windows and Unix • Based on FLTK GUI toolkit • Investigation into how to improve harvesting capabilities – OAI, RSS • CGI interface to support CQL, SRW
Isite Admin Console Redesign • Similar to Internet Information Server (IIS) control interface • Nodes established through wizard interface • Nodes are the most visible component that actions are performed against • Actions or Right-Click to perform: • Re-index • Node Properties • Server tests (full text, field and spatial)
Other Software Status • Intergraph SMMS/GeoConnect • Capability to mass export XML files • GeoConnect works with Oracle, SQL-Server • Scripts to store metadata will be posted to website • ESRI ArcIMS Metadata Server • Will be modified to support harvesting by the OAI-PMH (Library) protocol
ISO 19115 and 19139 Status • 19115 is now a published standard • 19139 is a draft ISO Technical Specification to publish an XML Schema for 19115 – soon to be Committee Draft • Will allow a conversion engine to convert FGDC metadata to ISO form with some configuration support to fill-in missing source or destination info
Geospatial One-Stop (GOS) • Module 1: Develop logical data models and encoding for Framework data exchange • Module 2: Prepare and publish metadata on existing (primarily) Framework data holdings • Module 3: Prepare and publish metadata for planned data acquisition • Module 4: Set up authoritative and reliable Web services for maps and data • Module 5: Construct an easier-to-use search and access Portal for geospatial data and services
NSDI: a Foundation for GOS • Geospatial One-Stop (GOS) implements the principles of the NSDI spelled out in Executive Order 12906 (recently revised) • GOS builds on the resources already available in the NSDI – FGDC and endorsed external Framework Standards, NSDI Clearinghouse Network • Operated as project coordinated through FGDC Secretariat with many agency partners • NSDI and GOS activities provide no significant new funding
What's different about One-Stop? • GOS has a timeline for implementation of the various modules • Standards are being developed with multi-sectoral stakeholders as national (ANSI) standards, not FGDC ones • Visibility of the initiative as one of 24 e-gov initiatives from the Administration • Goals include measures of costs and savings through cost-sharing in data acquisition, processing, and service of geospatial data
What else is different? • More explicit focus on including State and Local government as primary stakeholders • Decisions in GOS are made by a governmental (non-FGDC) Board of Directors who advise the Executive Director • GOS as an initiative is intended to sunset when all Modules are operational (2005+) • Management partners from DOI and OMB are watching performance and capabilities (and budget effects) very closely
GOS Expectations of Portal • Search for geospatial data is 'easy-to-use' • Metadata are harvested from remote collections into a single metadatabase for search efficiency • Browsing for geospatial data is made possible through common ISO Categories and resource type assignments • Descriptions of map and data services are also indexed • Data descriptions are linked to an online map viewer where map services exist
Portal at geodata.gov • First release of a GOS portal is online with a few metadata collections plus individual files uploaded as XML or via form input • Developed and temporarily hosted by ESRI through a DOI contract and managed through BLM, Denver • OGC Web Map Services (WMS) and ArcIMS supported in viewer • Data, map services, static maps, and applications are among the resource types indexed
Current Publication Process • User enters FGDC metadata into a form at geodata.gov, where they are stored and indexed • User uploads FGDC metadata as XML into the metadata store at geodata.gov • Admin staff negotiates with existing Z39.50 or ArcIMS metadata collections to harvest metadata • No current facility to nominate the existence of a metadata collection (Z39.50) to harvest
Current Process • BLM validates metadata, inserts special GOS-specific tags for: • Provider type • Resource type • Data extent • Validation and acceptance process is quite tedious • Metadata that are uploaded by XML or form are mixed in with harvested metadata
Observations • FGDC Metadata reviewed by GOS Team have many problems: • Metadata is not being parsed by 'mp' and fixed • Online_linkage not holding a valid URL • Dates violate the strict YYYYMMDD format • Some mandatory fields are populated with instructional content (e.g. “This field is required”)
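A rough pre-check for the problems listed above might look like the following. It is a sketch, not a replacement for running 'mp': the record is a plain dictionary with hypothetical keys (`date`, `online_linkage`, `title`, `abstract`), and the placeholder phrases and accepted URL schemes are assumptions chosen for illustration.

```python
from datetime import datetime
from urllib.parse import urlparse

# Assumed examples of instructional text left in mandatory fields.
PLACEHOLDER_PHRASES = ("this field is required", "required field")


def valid_date(value: str) -> bool:
    """True only for a real calendar date written strictly as YYYYMMDD."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def valid_url(value: str) -> bool:
    """True if the value has an http, https, or ftp scheme and a host."""
    try:
        parts = urlparse(value.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)


def looks_like_placeholder(value: str) -> bool:
    """True if the text contains one of the assumed instructional phrases."""
    text = value.strip().lower()
    return any(phrase in text for phrase in PLACEHOLDER_PHRASES)


def check_record(record: dict) -> list:
    """Return a list of problem descriptions for one metadata record."""
    problems = []
    if not valid_date(record.get("date", "")):
        problems.append("date is not YYYYMMDD")
    if not valid_url(record.get("online_linkage", "")):
        problems.append("online_linkage is not a valid URL")
    for field in ("title", "abstract"):
        value = record.get(field, "")
        if not value.strip() or looks_like_placeholder(value):
            problems.append(f"{field} is empty or instructional text")
    return problems
```

An empty list from `check_record` means none of these particular checks failed; it says nothing about the rest of the record.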
Additional info needed by Portal • GOS Portal requires tagging of metadata content using the ISO 19115 Category values for browse and search assistance • Records must also be tagged as to the provider type (Federal, State, Local, Commercial, Academic, etc) • Also needed: • Information on the geographic scope (Global, National, Regional, Local) • Resource type (map graphic, map service, data, ref) – need feedback on an approach
Suggested Resolution • Metadata collections will be registered with the Portal along with: • Related service information pulled from Clearinghouse Registry, where available • Reference to an ISO Category mapping table to translate Theme Keyword values to Category equivalents – or add ISO Category as a controlled vocabulary in existing metadata • Classification as to Provider type
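The mapping table mentioned above could be as simple as a keyword lookup. In this sketch the category codes are the ISO 19115 topic category values, but the keyword entries are invented examples; each provider would supply its own table for its own Theme Keyword vocabulary.

```python
# ISO 19115 topic category codes (MD_TopicCategoryCode).
ISO_TOPIC_CATEGORIES = {
    "farming", "biota", "boundaries", "climatologyMeteorologyAtmosphere",
    "economy", "elevation", "environment", "geoscientificInformation",
    "health", "imageryBaseMapsEarthCover", "intelligenceMilitary",
    "inlandWaters", "location", "oceans", "planningCadastre", "society",
    "structure", "transportation", "utilitiesCommunication",
}

# Hypothetical provider table: local Theme Keyword -> ISO category.
KEYWORD_TO_CATEGORY = {
    "roads": "transportation",
    "hydrography": "inlandWaters",
    "dem": "elevation",
    "orthophoto": "imageryBaseMapsEarthCover",
    "parcels": "planningCadastre",
}


def map_keywords(theme_keywords):
    """Translate Theme Keywords to ISO categories.

    Returns (categories, unmapped): categories in first-seen order without
    duplicates, and the keywords that have no entry in the table.
    """
    categories, unmapped = [], []
    for keyword in theme_keywords:
        category = KEYWORD_TO_CATEGORY.get(keyword.strip().lower())
        if category is None:
            unmapped.append(keyword)
        elif category not in categories:
            categories.append(category)
    return categories, unmapped
```

Reporting the unmapped keywords back to the provider is one way to keep the table growing as new keywords appear.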
Automated Tagging • The following tag values will be derived from the FGDC metadata and stored in the metadata at geodata.gov: • Geographic extent (Global, National, Regional, Local) will be determined by the area referenced in the Bounding Coordinates • Resource type will be determined heuristically from the Geospatial Data Presentation Form and clues in the Online_Linkage and Resource_Linkage URLs • Specific guidance will be given to providers
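One way the two derived tags might be computed is sketched below. The slide does not give the actual rules, so everything specific here is an assumption: the degree thresholds for geographic extent, the file suffixes, and the URL clues are placeholders for whatever guidance the GOS Team issues.

```python
from urllib.parse import urlparse

# Assumed file-suffix clues; not taken from the slide.
DATA_SUFFIXES = (".zip", ".shp", ".e00", ".tif", ".tar.gz")
GRAPHIC_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".pdf")


def geographic_extent(west: float, east: float, north: float, south: float) -> str:
    """Classify a bounding box by its larger side, in degrees.

    Arguments follow the Bounding Coordinates order (west, east, north, south).
    The 180 / 20 / 2 degree thresholds are illustrative assumptions.
    """
    width = east - west
    if width < 0:  # box crosses the 180-degree meridian
        width += 360
    span = max(width, north - south)
    if span >= 180:
        return "Global"
    if span >= 20:
        return "National"
    if span >= 2:
        return "Regional"
    return "Local"


def resource_type(presentation_form: str, urls: list) -> str:
    """Guess the resource type from the presentation form and linkage URLs."""
    for url in urls:
        lower = url.lower()
        if "service=wms" in lower or "request=getcapabilities" in lower:
            return "map service"
    for url in urls:
        path = urlparse(url).path.lower()
        if path.endswith(DATA_SUFFIXES):
            return "data"
        if path.endswith(GRAPHIC_SUFFIXES):
            return "map graphic"
    form = presentation_form.lower()
    if "digital data" in form:
        return "data"
    if "map" in form:
        return "map graphic"
    return "ref"
```

URL clues are checked before the presentation form on the assumption that a working link says more about what the user will get than a free-text form value does.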
Getting Content into GOS • Metadata may currently be harvested from existing Z39.50 and ArcIMS (Metadata) servers by the GOS Team • Future options for publishing to GOS include: • Z39.50 GEO Profile on ISO or FGDC metadata • ArcIMS Metadata Server • Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) • Web-accessible folder of FGDC XML metadata • Form entry of metadata on geodata.gov site • XML upload of FGDC or ISO metadata
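For the OAI-PMH option, a harvester repeatedly issues ListRecords requests and follows the resumptionToken until the provider stops returning one. The sketch below only builds the request URL and reads the token from a response; the base URL is hypothetical, and `oai_dc` is used as the metadata prefix because every OAI-PMH provider must support it (a prefix for FGDC or ISO records would have to be agreed with the provider).

```python
from typing import Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords request URL.

    A resumptionToken is an exclusive argument: when it is present,
    metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml: str) -> Optional[str]:
    """Return the resumptionToken in a response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()
```

A harvest loop would fetch `list_records_url(base)`, store the records, then call `next_token` on each response and request again until it returns `None`.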
Anticipated Publishing Threads for GOS • [Flow diagram; labels recovered from the slide] • Harvested on a schedule/registry by the metadata harvester: OAI-PMH service, ArcIMS service, Z39.50 service, XML Web folder • Submitted directly: Entry Form, XML upload • Both paths pass through VALIDATE: ok = yes continues to the GOS Metadatabase; ok = no leads to reject/inform • Stores and access: GOS Metadatabase, GOS Subset Repository (XML), Search interface
To participate in One-Stop: • Provide XML form of CSDGM V2 (1998) • Metadata must pass mp without errors • Areas to focus on: • Assure minimum mandatories • Populate online_linkage with valid URL to the data being described • Fix dates • Review content of text fields • Select publishing method (XML upload, form, harvest of Z39.50, ArcIMS, Web, or OAI) • If a collection is to be harvested: • Develop a mapping table of Theme Keywords to ISO Category values – or add ISO Categories into metadata • Register collection with GOS Portal
Issues • Policy decision is needed for harvesting big collections. There are about 25 image collections with between 10,000 and 1 million+ records each. Choices: • Require export of XML representations of metadata • Publish only a collection-level metadata record with links for further search • Provide a Z39.50 cascaded search of the deep servers, as in Geography Network • Set up an OAI-PMH Provider on each collection • Not include deep collections in geodata.gov
Discussion: Big collections • One-Stop metaphor shouldn’t require one to search again on image collections – incorporate them somehow • Image collections should be included • Collection-level metadata record could be provided instead of full inventory • Passthrough distributed search might work well enough on a few collections • Need to ask the Big providers their preferences
What's Next? • GOS Team will refine harvesting and publishing services to more automatically process remote metadata • FGDC staff will enhance testing tools and perform audits and metadata counseling to improve metadata content • Guidance will be provided as to how to construct the mappings from local Theme Keywords to ISO Categories • Instructions on setting up registered Web Folders and OAI-PMH providers to follow
Interface Enhancements • Clarify the options for publishing (XML upload, form entry, collection via Z39.50, OAI, web folders) • Clarify the search and display modes of the interface • Display search results rectangle on map • Elimination of the Primary/Secondary/Tertiary classification in favor of more legible and derivable resource types • Allow discovery and loading of a single layer (not all layers from a service)
Research Topics (6-12 months) • Following USGS example, evaluate spatial ranking to improve geographic relevance of results • Provide developers with technical information on how applications can connect to catalog, data access, and mapping services identified through GOS • Register organizations and services to the public UDDI as another method to promote GOS resources • Investigate a common thematic thesaurus for GOS/NSDI use • Implement transition strategy from FGDC to full ISO 19115 metadata
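To make the spatial ranking topic concrete, here is one simple scoring approach: rank each record by how much its bounding box overlaps the search rectangle, measured as intersection area divided by union area. This is a generic technique chosen for illustration and is not claimed to be the USGS or Isearch2 method; the box layout and function names are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (west, south, east, north) in degrees


def area(box: Box) -> float:
    """Area of a box in square degrees (no antimeridian handling)."""
    west, south, east, north = box
    return max(east - west, 0.0) * max(north - south, 0.0)


def overlap_score(query: Box, record: Box) -> float:
    """Intersection-over-union of two boxes: 1.0 if identical, 0.0 if disjoint."""
    width = min(query[2], record[2]) - max(query[0], record[0])
    height = min(query[3], record[3]) - max(query[1], record[1])
    if width <= 0 or height <= 0:
        return 0.0
    intersection = width * height
    union = area(query) + area(record) - intersection
    return intersection / union


def rank(query: Box, records: List[Tuple[str, Box]]) -> List[Tuple[str, float]]:
    """Return (record id, score) pairs, best spatial match first."""
    scored = [(rid, overlap_score(query, box)) for rid, box in records]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Dividing by the union penalizes very large records (for example a global dataset) that technically contain a small search area but are rarely the best match for it.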
Discussion • What are the timelines for testing and deploying geodata.gov enhancements? • Unknown, pending assessment of workflow and assignments • What are the priorities? • ArcIMS harvesting underway • Z39.50 and Web Folder next • OAI provider connections
Who are good candidates? The following sites could be good Z39.50 harvest candidates for GOS: • Cornell University (CUGIR) has metadata in good shape • USGS Geosciences collection • Some EDC International collections