AIP-2 Design Review Catalogue, Clearinghouse, Registry, Metadata (CCRM) WG Use Case Review. Josh Lieberman GEOSS AIP-2 Design Review 2 December 2008. Review Outline. Transverse use cases relevant to CCRM GEOSS Registries Components GetRecordById Standards Registry coordination
AIP-2 Design Review: Catalogue, Clearinghouse, Registry, Metadata (CCRM) WG Use Case Review. Josh Lieberman, GEOSS AIP-2 Design Review, 2 December 2008
Review Outline • Transverse use cases relevant to CCRM • GEOSS Registries • Components • GetRecordById • Standards Registry coordination • GEOSS Clearinghouse deployments (Compusult, ESRI, USGS): • Registry harvest • Community Catalog harvest • Other service harvest • Discovery interface and model • Metadata standards and practices • Discovery / binding / evaluation roles • ISO 19115 / 19119 / 19139 profiles • GEOSS Common record / queryables • ISO Application Profile • Next Steps
Clearinghouse and Common Infrastructure GEOSS Registries Components Services Standards Requirements
GEOSS Clearinghouse OGC CSW ISO 23950 / SRU OpenSearch? ebRS?
What is infrastructure (and who cares)? • Clearinghouse / Registry – the tracks • Portals – the terminals • Applications – the trains • Users – the passengers • Content – the baggage • SBAs – the destinations
Without metadata, SOA itself would be impossible [Diagram. Labels: Community Catalogs; Clearinghouse (Harvests / Cascades?); Service / Dataset Description Metadata; GetCapabilities; Service Instances; Datasets; Provisions]
Annex B Use Case • Disaster Scenario • Discover which data, products and services are available for this area of interest and this theme. Connect to the FedEO networks. • After identifying products of interest for [user] needs, perform a catalogue search of these products to get a quick look and more information. • Order the previously identified products directly from the providers.
Transverse Use Cases • Register organization • Define & publish component with services • Develop component metadata system • Register component & services • Search for components • Search community catalogs / metadata services • Bind client to service • Access services for application • Evaluate component quality / usability • Subscribe to alerts • Construct & publish workflow • Register standards & best practices
Define & Publish Component with Services • Precondition: Organization registration • Select component type to represent contribution • Select service types which represent component access • Expose service endpoints • Postcondition: Services expose component
Develop Component Metadata System • Precondition: Develop component and services • Choose component metalevels • Choose metadata standards and formats • Choose publishing method (e.g. community catalog) • Construct metadata document(s) • Deploy / publish metadata documents • Postcondition: Descriptions of components and services are available
Register component and services • Precondition: component and services are exposed and described adequately by metadata • Enter information into Registry application for component • Enter information into Registry applications for associated services • Postcondition: component and services are registered
Search for components and services (data first) • Precondition: appropriate components are registered in the Registry and are searchable through a Clearinghouse. • User opens a Clearinghouse client, e.g. Geo Portal • User enters one or more target values for queryable parameters and initiates search. • Alternative: user browses candidate components presented by the Geo Portal • Alternative: user issues search by way of a semantic mediator which expands / maps the search terms for finding components and services in disparate domains. • User adjusts search parameters to narrow the results • User searches / browses services associated with candidate components • Postcondition: set of candidate components with suitable service interfaces for drill down.
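The "enter target values for queryable parameters" step above can be sketched as a CSW 2.0.2 GetRecords request over HTTP GET. This is only an illustration, not the actual client used by any GEOSS Clearinghouse: the endpoint URL and the search term are hypothetical, and it assumes the catalogue supports the key-value-pair GET binding with CQL text constraints.

```python
from urllib.parse import urlencode


def build_getrecords_url(endpoint, keyword, max_records=10):
    """Build a CSW 2.0.2 GetRecords KVP URL for a free-text search.

    `endpoint` is a placeholder; substitute a real catalogue address.
    """
    # Double any single quotes so the keyword stays inside the CQL literal.
    safe = keyword.replace("'", "''")
    # csw:AnyText is the core full-text queryable defined by CSW.
    constraint = f"csw:AnyText LIKE '%{safe}%'"
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",  # brief records for a first pass
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": constraint,
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and search term, for illustration only.
url = build_getrecords_url("https://clearinghouse.example.org/csw", "flood")
```

Refining the search then amounts to rebuilding the URL with a tighter constraint and reissuing the request.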
Search community “catalogs” • Precondition: community catalog is registered in Registry and findable through Clearinghouse and/or otherwise known to user. • User opens client application (e.g. community portal) with capabilities to access community catalog. • User searches community catalog for holdings of interest • Alternative: user searches Clearinghouse for holdings of community catalogs which have been harvested by or federated with the Clearinghouse. • Alternative: user finds candidate community catalog summary records through Clearinghouse, then drills down into more detailed metadata in the originating community catalog • Postcondition: user has identified resources of interest: both data products and the services which make them available via the Web or other communication medium.
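The drill-down alternative above (summary record in the Clearinghouse, full metadata in the originating catalog) maps naturally onto the CSW GetRecordById operation. The sketch below assumes the summary record carries the record identifier and the address of its source catalog; the `summary` dictionary, its keys, and both values are hypothetical.

```python
from urllib.parse import urlencode

# Namespace commonly used as outputSchema for ISO 19139 (gmd) records.
ISO_GMD_SCHEMA = "http://www.isotc211.org/2005/gmd"


def build_getrecordbyid_url(source_catalog, record_id, iso_output=True):
    """Build a CSW 2.0.2 GetRecordById KVP URL for one full record."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecordById",
        "id": record_id,
        "elementSetName": "full",  # ask for the detailed description
    }
    if iso_output:
        # Only meaningful if the catalog supports the ISO profile.
        params["outputSchema"] = ISO_GMD_SCHEMA
    return f"{source_catalog}?{urlencode(params)}"


# Hypothetical summary record as it might come back from a Clearinghouse.
summary = {
    "identifier": "urn:example:dataset:42",
    "source": "https://community-catalog.example.org/csw",
}
detail_url = build_getrecordbyid_url(summary["source"], summary["identifier"])
```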
Evaluate component quality / usability • Precondition: user has identified a component (e.g. dataset) relevant to their project needs and accessed it through appropriate services. • User develops decision support workflow to derive analysis / visualization of dataset(s). • User obtains metadata descriptions of dataset(s) sufficient to determine validity of derived result (e.g. statistical power of a decision). • Optional: user contacts provider of dataset to obtain additional metadata elements needed for evaluation • Optional: user iterates through workflow and evaluation to optimize decision validity. • Postcondition: user has determined the significance of an observation-based decision.
Registry Providers • Organization / Component / Service Registry – GMU • ebRIM based, but user interface is limited to O / C / S • Service interface is discovery only • Standard and Special Arrangements Registry – IEEE • Coordination with Service registry being developed • User Requirements Registry – IEEE • Role is loosely defined • Best Practices Wiki - IEEE • Not an authoritative register
Component Registration • Observing System or Sensor Network • Exchange and Dissemination System • Modeling and Data Processing Center • Data set or Database • Catalog, Registry, Metadata Collection • Portal or website • Software or application • Computational model • Initiative or Programme • Information feed, RSS, or alert • Training or educational resources • Web-accessible document, file, or graphic • Other, enter information in box:
Registry Status • Component Types • GetRecordById • Standards Registry coordination • Content cleaning, testing, status • Work on attributes for harvestable metadata and other links
Resource Discovery / Summary Needs • Catalogs • Record types • Holdings / collections • Supported interfaces • Queryable properties • Response types / formats • Tags / categories / relations • Portals / applications • Functionality • Client interfaces • Supported workflow • Intended users • Technology platform • Datasets • Data type / feature type • Observable(s) • Coverage in space and time • Origin / authority • Quality / usage • Services • Service type • Accessed content / data • Functionality / operations / options • Bindings • Quality / availability
Resource Description Relationships [Diagram. Relationship labels: Operates on; Provided by. Nodes: Dataset Description, Service Description, Provision, Operation, Collection Description, Catalog Description, Product Description, Application Description, Derivative Description, Workflow Description]
Service – data model (Cat 2.0.2 ISO Profile) • Includes extensions to ISO 19119 • Not ingestible from OGC Capabilities without a constrained MetadataURL provision • Related to, but not the same as, the INSPIRE metadata profile • Open question: is this sufficient / needed for discovery?
Community Catalog / Component Providers • NOAA WAF • JAXA EO Catalog • CNES / Erdas Catalog • FedEO Community Clearinghouse • ICAN Coastal Atlas / Mediator • WUSTL / ESIP AQ Catalog • NOAA SNAAP • IP3 Mediator
Clearinghouses • Clearinghouse Status – Archie (FGDC) • "Simple Clearinghouse" harvests component and service records from the registry. Working on the ingest procedure from e.g. Z39.50 community catalogs and moving on to CS/W catalogs, FGDC records from Landsat. • Lots of non-functioning endpoints still in the service registry, but they are slowly being cleaned out and/or set up for testing. • Not yet harvesting registered services except for catalogs and the service registry, but have experimented with some WMS instances. • Doesn't yet implement a CS/W Discovery interface, but working on it. • Clearinghouse Status – Marten (ESRI) • Still some issues with harvesting the service registry: complaints about CS/W capabilities not being valid. Need to sync up with Yuqi and Archie to resolve this. • Able to harvest Ted Habermann's WAF records. Some metadata validity issues worked through, but how lenient should the ingest be? At what point does lenience interfere with "findability"? (Josh) • Also harvested the Renewable Energy registered services and Biodiversity site. • GeoGratis Z39.50 interface is still problematic (e.g. AVHRR imagery). • Last week at GEO meeting there were some folks from CEOS (EROS) with lots of medium-resolution imagery which is only searchable through Globus. • Clearinghouse Status – Robert (Compusult) • Ingestion from Registry being run every few days. No notification yet if harvesting fails.
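Two open points in the status notes above, how lenient ingest should be and the lack of notification when harvesting fails, can be made concrete with a small sketch. Everything here is an assumption for illustration: the required and recommended field lists, the record layout, and the `notify_if_failed` helper are hypothetical and not taken from any of the three Clearinghouse implementations.

```python
from dataclasses import dataclass, field

# Assumed minimum without which a record is not usefully findable.
REQUIRED = ("identifier", "title")
# Assumed nice-to-have fields; lenient ingest accepts records lacking them.
RECOMMENDED = ("abstract", "bbox")


@dataclass
class HarvestReport:
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (record, reason) pairs
    warnings: list = field(default_factory=list)


def ingest(records, lenient=True):
    """Sort harvested records into accepted / rejected, with warnings."""
    report = HarvestReport()
    for rec in records:
        missing_required = [k for k in REQUIRED if not rec.get(k)]
        missing_recommended = [k for k in RECOMMENDED if not rec.get(k)]
        if missing_required:
            report.rejected.append((rec, f"missing required: {missing_required}"))
        elif missing_recommended and not lenient:
            report.rejected.append((rec, f"missing recommended: {missing_recommended}"))
        else:
            if missing_recommended:
                report.warnings.append(
                    f"{rec['identifier']}: missing {missing_recommended}"
                )
            report.accepted.append(rec)
    return report


def notify_if_failed(report, send=print):
    """Call `send` with a one-line summary when any record was rejected."""
    if report.rejected:
        send(f"harvest: {len(report.rejected)} record(s) rejected")
        return True
    return False
```

Running the same batch with `lenient=True` and `lenient=False` shows the trade-off directly: the lenient run keeps more records but fills the warning list with exactly the gaps that hurt findability.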
Clearinghouse Distributed search vs. Harvest • Harvest: advantage is quick searches; disadvantages are metadata duplication and the scale of processing for large catalogs / archives • Distributed search: advantage is that metadata is maintained closer to the source; disadvantage is that searches take longer and are more likely to fail to complete • Recommend Harvest when possible • Harvest only collection metadata at appropriate scale • Policy of community catalogue must be respected
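The trade-off above can be illustrated with a toy model. The `Catalog` class, its records, and the availability flag are hypothetical stand-ins for remote community catalogs; a real Clearinghouse would speak CSW, Z39.50, or a similar protocol instead of calling Python methods.

```python
from concurrent.futures import ThreadPoolExecutor


class Catalog:
    """Toy stand-in for a remote community catalog."""

    def __init__(self, name, records, available=True):
        self.name = name
        self.records = records
        self.available = available

    def _check(self):
        if not self.available:
            raise ConnectionError(f"{self.name} unreachable")

    def list_records(self):
        self._check()
        return list(self.records)

    def search(self, term):
        self._check()
        return [r for r in self.records if term.lower() in r.lower()]


def harvest(catalogs):
    """Copy records into a local index ahead of time (duplicates metadata)."""
    index = []
    for cat in catalogs:
        try:
            index.extend(cat.list_records())
        except ConnectionError:
            pass  # skip for now; a later harvest run can retry
    return index


def search_index(index, term):
    """Fast local search; does not depend on the remote catalogs being up."""
    return [r for r in index if term.lower() in r.lower()]


def distributed_search(catalogs, term, timeout=5.0):
    """Fan the query out at search time; results may be partial."""
    hits, failed = [], []
    with ThreadPoolExecutor(max_workers=max(1, len(catalogs))) as pool:
        futures = {pool.submit(cat.search, term): cat for cat in catalogs}
        for future, cat in futures.items():
            try:
                hits.extend(future.result(timeout=timeout))
            except Exception:  # unreachable catalog or timeout
                failed.append(cat.name)
    return hits, failed
```

If a catalog goes offline after a harvest, `search_index` still returns its records (possibly stale), while `distributed_search` returns only what the reachable catalogs provide and reports the rest as failed.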
Integration Issues • Catalogues registered with GEOSS vary widely in the standards they implement. Protocols and metadata formats include: • ISO 23950 (Z39.50) “GEO” Profile Version 2.2 • FGDC (CSDGM Metadata) • ANZLIC Metadata • ISO 19115 Metadata • OGC Catalogue Service for the Web (Version 2.0.1 and 2.0.2) • ebRIM Profile (incl. ISO and EO Extension Packages) • FGDC Profile • ISO 19115 Profile • SRU/SRW / OpenSearch • OAI Protocol for Metadata Harvesting (OAI-PMH) • Dublin/Darwin Core Metadata • Web-accessible folder/ftp?
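One common way to cope with this variety is to map each source format onto a small set of common fields at ingest time. The sketch below is illustrative only: records are assumed to be already parsed into flat dictionaries keyed by element path, the ISO 19115 paths are deliberately simplified, and the three common field names are an assumption, not the actual GEOSS common record definition.

```python
# Source-format element paths for each common field (simplified, assumed).
FIELD_MAPS = {
    "dublin_core": {
        "title": "dc:title",
        "abstract": "dc:description",
        "keywords": "dc:subject",
    },
    "fgdc_csdgm": {
        "title": "idinfo/citation/citeinfo/title",
        "abstract": "idinfo/descript/abstract",
        "keywords": "idinfo/keywords/theme/themekey",
    },
    "iso_19115": {
        "title": "identificationInfo/citation/title",
        "abstract": "identificationInfo/abstract",
        "keywords": "identificationInfo/descriptiveKeywords/keyword",
    },
}


def to_common_record(record, source_format):
    """Project a flat, format-specific record onto the common fields.

    Fields absent from the source come back as None so gaps stay visible.
    """
    mapping = FIELD_MAPS[source_format]  # KeyError for unknown formats
    return {common: record.get(path) for common, path in mapping.items()}


# Hypothetical FGDC record after parsing into path -> value form.
fgdc = {
    "idinfo/citation/citeinfo/title": "Coastal elevation grid",
    "idinfo/descript/abstract": "Gridded elevation for a coastal region.",
}
common = to_common_record(fgdc, "fgdc_csdgm")
```

A table-driven mapping keeps each format's quirks in one place, so supporting another format means adding a dictionary entry instead of new ingest code.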
Mediation Issues - TBD • Where should mediation occur and when? • What and how many controlled vocabularies, taxonomies, ontologies? • Organizations, SBAs, CoPs • Top-down <-> Bottom-up <-> Free-for-all • How to manage and leverage mappings? • Is there a role for knowledgebase inference?
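As a starting point for the mapping question, the simplest form of mediation is term expansion against a table of equivalent terms before the query is sent. The sketch below is a toy: the `MAPPINGS` table and its vocabulary are invented for illustration, and real mediation would draw on maintained controlled vocabularies or ontologies.

```python
# Hypothetical equivalence groups: preferred term -> alternate terms.
MAPPINGS = {
    "flood": {"flooding", "inundation"},
    "precipitation": {"rainfall"},
}


def expand_term(term, mappings=MAPPINGS):
    """Return the term plus every term that shares a mapping group with it."""
    term = term.lower()
    expanded = {term}
    for preferred, alternates in mappings.items():
        group = {preferred, *alternates}
        if term in group:
            expanded |= group
    return sorted(expanded)
```

For example, `expand_term("inundation")` yields the whole flood group, while a term with no mapping is passed through unchanged.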
Workplan Elements for Catalogue / Clearinghouse Thread • Persistence, completeness, findability • More resources and resource types, e.g. applications, workflows • Minimum interoperability measures, e.g. geoss:Record • Best practices for federated harvest and query • User requirements refinement and added registry / clearinghouse value • Controlled vocabularies, mediation resources, cross-community enablement • On-going role for search and discovery in scenarios and decision support applications • Facilitation of usable OpenSearch / GeoSearch entry points to the Clearinghouse • Role for publish-subscribe-notify interaction style in Clearinghouse
Other CCRM Issues • Telecon Notes 26 November