Welcome! DAY 2
ECHO Future Work
Multi-Format Ingest REST API
REST Overview • REST API v1 is available on Testbed • Backed by ECHO operations • Will deploy to operations with Reverb on May 16 • Services (Resources) exposed in v1 • Users • Tokens (Authentication) • Groups • ACLs • Providers • Data Quality Summaries • Option Definitions • Orders • Calendar Events • Services (Entries, Options)
Data access via REST v1 • Access to datasets and granules is through the echo_datasets and echo_granules resources • These resources provide a subset of the full metadata • Sample request: • curl http://testbed.echo.nasa.gov/echo-rest/echo_datasets/?provider=NSIDC_ECS • REST Documentation available • http://testbed.echo.nasa.gov/echo-rest/
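The sample request above can also be issued from a script. The following is a minimal Python sketch that assumes only the testbed base URL and the `provider` query parameter shown on the slide. The helper names `dataset_query_url` and `fetch_datasets` are illustrative, and because the response format is not described here, the body is returned as raw bytes.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the sample request on the slide.
BASE_URL = "http://testbed.echo.nasa.gov/echo-rest"


def dataset_query_url(provider: str) -> str:
    """Build the echo_datasets query URL for one provider."""
    return f"{BASE_URL}/echo_datasets/?{urlencode({'provider': provider})}"


def fetch_datasets(provider: str, timeout: float = 30.0) -> bytes:
    """Issue the GET request and return the raw response body."""
    with urlopen(dataset_query_url(provider), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only prints the URL; call fetch_datasets() to actually hit the testbed.
    print(dataset_query_url("NSIDC_ECS"))
```

Keeping URL construction separate from the network call makes the query easy to check without reaching the testbed.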
REST API v2 • Adds provider set of resources for REST based ingest • Enforces INGEST ACLs • Provider access only • Data is indexed asynchronously • Data ingest is done via HTTP PUTs to appropriate locations • Metadata crawlers introduced to support legacy FTP Ingest • Will introduce a small lag from when FTP ingest is processed to when data is searchable.
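Ingest by HTTP PUT could look roughly like the sketch below. The slide only says that data is PUT "to appropriate locations", so the resource URL in the usage example is a hypothetical placeholder, as are the `application/xml` content type and the helper names. Any authentication header is left to the caller because it is not specified here.

```python
from typing import Dict, Optional
from urllib.request import Request, urlopen


def build_ingest_request(
    url: str,
    metadata_xml: str,
    headers: Optional[Dict[str, str]] = None,
) -> Request:
    """Build (but do not send) an HTTP PUT carrying one metadata record."""
    all_headers = {"Content-Type": "application/xml"}  # assumed content type
    all_headers.update(headers or {})
    return Request(
        url,
        data=metadata_xml.encode("utf-8"),
        headers=all_headers,
        method="PUT",
    )


def send(request: Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical location; the real resource layout is not given on the slide.
    req = build_ingest_request(
        "http://example.invalid/providers/PROV/granules/G1",
        "<Granule/>",
    )
    print(req.get_method(), req.full_url)
```

Because indexing is asynchronous, a successful PUT would not by itself mean the record is already searchable.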
REST API v2 • Builds on REST API v1 • Will be released as part of Multiformat Metadata support • Adds echo-catalog set of resources • Full access to original provider metadata • Support for format translation on retrieval • Support for AQL searches • Support for parameter based searches • Support for facet (valids) retrieval
ECHO Query Engine Replacement • ECHO database stores original metadata plus bookkeeping information • Format-specific indexers extract indexable information from original metadata • Indexable data is stored in a separate high-performance indexing application • The ECHO Kernel’s SOAP-based Catalog Service will be reimplemented to route searches through the REST API
Current ECHO Data Model • [Diagram: Collection, Granule, and Browse records with ECHO-Hosted and DAAC-Hosted Images, connected by required and optional links]
Motivations for Change • Internal Data Model Inconsistency • Formal browse record only exists during Ingest. • “Legacy” DTD query results include in-lined browse information. • Format Translation Inconsistency • Formats, such as ISO 19115, do not have a separate browse record. • Translating between ECHO and ISO 19115 is made unnecessarily complicated when dealing with browse record associations. • Simplified Data Exports • Data Partner exports would be simpler without a third metadata type. • Simplified REST API • Metadata presentation & translation simpler w/o browse record. • Clients transitioning from SOAP to REST Querying would require additional & inconsistent interactions to request browse metadata.
Proposed ECHO Data Model • [Diagram: Collection, Granule, and Browse elements with ECHO-Hosted and DAAC-Hosted Images, connected by required and optional links]
Proposed Browse Handling • Browse Reclassification • Collection & Granule metadata formats will be enhanced with a new <BrowseImage> element containing relevant browse metadata. • Legacy Browse Record Management • Browse metadata & image processing remain unchanged. • AssociatedBrowseImage elements will persist. • New <BrowseImage> element will not initially be supported. • REST API Browse Record Management • ECHO-hosted images submitted via FTP and deleted via REST. • DAAC-hosted browse images no longer require independent tracking. • Collection & Granule deletion rules based on browse existence TBD. • Multi-Format Ingest Metadata Generation • Generated metadata will contain new <BrowseImage> elements.
Sample Metadata Modifications

Existing (slide callouts: References Browse Record; References Browse Record)

Collection:
    <Collection>
      <DatasetId>MODIS Aqua/Terra</DatasetId>
      …
      <OnlineResources>
        <OnlineResource type="BROWSE">ftp://lpdaac/:BR:1234.jpg</OnlineResource>
      </OnlineResources>
      …
      <AssociatedBrowseImages>
        <ProviderBrowseId>BR:123456</ProviderBrowseId>
      </AssociatedBrowseImages>
    </Collection>

Granule:
    <Granule>
      <GranuleUr>:SC:MOD10A1:12345</GranuleUr>
      …
      <OnlineResources>
        <OnlineResource type="BROWSE">ftp://lpdaac/:BR:1234.jpg</OnlineResource>
      </OnlineResources>
      …
      <AssociatedBrowseImages>
        <ProviderBrowseId>BR:123456</ProviderBrowseId>
      </AssociatedBrowseImages>
    </Granule>

Proposed (slide callouts: References Browse Record; Inline Browse Metadata)

Collection:
    <Collection>
      <DatasetId>MODIS Aqua/Terra</DatasetId>
      …
      <OnlineResources>
        <OnlineResource type="BROWSE">ftp://lpdaac/:BR:1234.jpg</OnlineResource>
      </OnlineResources>
      …
      <BrowseImages>
        <BrowseImage mimeType="" description="">http://api.echo.nasa.gov/LPDAAC/:BR:1234.jpg</BrowseImage>
      </BrowseImages>
    </Collection>

Granule:
    <Granule>
      <GranuleUr>:SC:MOD10A1:12345</GranuleUr>
      …
      <OnlineResources>
        <OnlineResource type="BROWSE">ftp://lpdaac/:BR:1234.jpg</OnlineResource>
      </OnlineResources>
      …
      <BrowseImages>
        <BrowseImage mimeType="" description="">http://api.echo.nasa.gov/LPDAAC/:BR:1234.jpg</BrowseImage>
      </BrowseImages>
    </Granule>
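To show what inline browse metadata means for a client, here is a small Python sketch that reads the proposed <BrowseImage> elements directly from a granule record, with no separate browse-record lookup. The element and attribute names follow the sample above. The `mimeType` and `description` values are filled in only for illustration (the slide leaves them empty), and the function name `browse_images` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Granule record modeled on the proposed sample; attribute values are illustrative.
SAMPLE_GRANULE = """\
<Granule>
  <GranuleUr>:SC:MOD10A1:12345</GranuleUr>
  <BrowseImages>
    <BrowseImage mimeType="image/jpeg" description="Sample browse image">
      http://api.echo.nasa.gov/LPDAAC/:BR:1234.jpg
    </BrowseImage>
  </BrowseImages>
</Granule>
"""


def browse_images(record_xml: str) -> List[Dict[str, str]]:
    """Return URL, MIME type, and description for each inline BrowseImage."""
    root = ET.fromstring(record_xml)
    return [
        {
            "url": (element.text or "").strip(),
            "mime_type": element.get("mimeType", ""),
            "description": element.get("description", ""),
        }
        for element in root.iter("BrowseImage")
    ]


if __name__ == "__main__":
    for image in browse_images(SAMPLE_GRANULE):
        print(image["url"], image["mime_type"])
```

The same function works on a collection record, since it only looks for BrowseImage elements under the root.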
System Impacts • No Required Changes to ECHO Data or Client Partners • “Legacy” Ingest processing of existing metadata will remain unchanged. • The “legacy” DTD query results will return the same metadata. • Data Provider Transition from FTP to REST • Data Partners transitioning their ECHO 10 exports from FTP delivery to the REST API will need to update their metadata exports accordingly. • What You Sent Is Not What You Get (WYSINWYG) • Metadata submitted via FTP may not match metadata returned by REST. • Simplified & Consistent REST API • ECHO Clients that plan on utilizing the new REST Query API will only interact with collection and granule metadata records in each of the planned formats (ECHO, ISO 19115, Atom+Frost).
Current ECHO Data Model • Collection & Granule • OnlineAccessURL • Typically used to represent a direct download of binary data • Attributes • URL – Actual resource URL • URLDescription – Textual description intended for human consumption • MimeType – MimeType value intended for programmatic usage • OnlineResource • Used for all other downloadable files or related web pages & documents • Attributes • URL – Actual resource URL • Description – Textual description intended for human consumption • Type – Short description regarding URL type. • Values are typically not intended for end-users • WIST & Reverb utilize this field for interface display • MimeType – MimeType value intended for programmatic usage
Motivations for Change • Data Model Inefficiencies • OnlineAccessURLs do not have a ‘Type’ attribute • ‘Description’ naming differs between OnlineAccessURLs & OnlineResources • XSD must define two separate elements with very similar content • Format Translation Inconsistency • Formats, such as ISO 19115 & DIF, do not have separate URL elements. • Translating between ECHO and ISO 19115 may lose information due to model differences. • Insufficient OnlineAccessURL Information • Missing ‘Type’ attribute reduces the ability for a data partner to have multiple download URLs.
Proposed ECHO Data Model • Collection & Granule • OnlineResource • Used for all external references including downloadable data, related web pages, search tools, & documentation • Attributes • URL – Actual resource URL • Description – Textual description intended for human consumption • Type – Short description regarding URL type. • Additional enumerated values intended for format translation & client usage. • MimeType – MimeType value intended for programmatic usage • Proposed controlled ‘Type’ Values • DIRECT_ACCESS – URLs pointing to the downloadable collection or granule • ANCILLARY_FILES – URLs pointing to ancillary files • METADATA – URLs pointing to metadata for the associated item • DOCUMENTATION – URLs pointing to any documentation • SEARCH – URLs pointing to a search interface (programmatic or GUI) • BROWSE – URLs pointing to browse imagery
System Impacts • Multi-Format Metadata Crawlers • Translation from OnlineAccessURL to OnlineResource will occur during metadata crawling with the new Multi-Format Ingest changes. • New OnlineResources will have a ‘Type’ attribute value of DIRECT_ACCESS • The translation from BROWSE OnlineResource URLs to BrowseImage could be done during metadata crawling as well. • This is open for discussion, but would consolidate all browse links into a single location. • No Required Changes to ECHO Data or Client Partners • “Legacy” Ingest processing of existing metadata will remain unchanged. • The “legacy” DTD query results will return the same metadata. • Data Provider Transition from FTP to REST • Data Partners transitioning their ECHO 10 exports from FTP delivery to the REST API will need to update their metadata exports accordingly. • What You Sent Is Not What You Get (WYSINWYG) • Metadata submitted via FTP may not match metadata returned by REST.
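The OnlineAccessURL to OnlineResource translation described above can be sketched in a few lines of Python. The field names mirror the attributes listed on the data model slides, and the controlled values are the proposed ‘Type’ values. The class and function names are illustrative, not the crawler's actual code.

```python
from dataclasses import dataclass

# Proposed controlled 'Type' values from the data model slide.
CONTROLLED_TYPES = (
    "DIRECT_ACCESS",
    "ANCILLARY_FILES",
    "METADATA",
    "DOCUMENTATION",
    "SEARCH",
    "BROWSE",
)


@dataclass(frozen=True)
class OnlineAccessURL:
    url: str
    url_description: str = ""
    mime_type: str = ""


@dataclass(frozen=True)
class OnlineResource:
    url: str
    description: str = ""
    type: str = ""
    mime_type: str = ""


def to_online_resource(access_url: OnlineAccessURL) -> OnlineResource:
    """Translate a legacy OnlineAccessURL into a DIRECT_ACCESS OnlineResource."""
    return OnlineResource(
        url=access_url.url,
        description=access_url.url_description,  # URLDescription becomes Description
        type="DIRECT_ACCESS",
        mime_type=access_url.mime_type,
    )


if __name__ == "__main__":
    # Hypothetical download URL used only for demonstration.
    legacy = OnlineAccessURL("ftp://example.invalid/data.hdf", "Granule download")
    print(to_online_resource(legacy))
```

The mapping loses nothing from the legacy element, which is consistent with the claim that existing partners need no changes.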
Service Framework (1 of 2) • ECHO Service Registry • Allows ECHO Providers to register the following service entities: • Service Interface (Simple Subset Wizard) • Service Implementation (GESDISC Simple Subset Agent) • Graphical User Interface (Reverb) • Service Advertisement (MRT) • Services can be associated w/ tag groups for discovery • Virtual Tag Groups (e.g. Datasets, Interfaces, & Implementations) • Managed Tag Groups (e.g. Science Keywords) • Unmanaged Tag Groups (e.g. Generic Key Words) • Service Virtual Tag Group Associations • Services may be associated with a public dataset outside of provider’s holdings • Interfaces - associations with 'datasets' virtual tag group only. • Implementations - associate with interface and dataset virtual group tags • GUIs - associate with implementation and dataset virtual group tags • Advertisements - associations with 'datasets' virtual tag group only.
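The virtual tag group association rules above amount to a small lookup table. The Python sketch below encodes them. The entity type and tag group identifiers are illustrative spellings chosen for this example, not names from the ECHO API.

```python
# Which virtual tag groups each service entity type may be associated with,
# following the four rules listed on the slide. Identifiers are illustrative.
ALLOWED_VIRTUAL_TAG_GROUPS = {
    "SERVICE_INTERFACE": {"datasets"},
    "SERVICE_IMPLEMENTATION": {"interfaces", "datasets"},
    "GUI": {"implementations", "datasets"},
    "ADVERTISEMENT": {"datasets"},
}


def can_associate(entity_type: str, virtual_tag_group: str) -> bool:
    """Return True if the entity type may join the given virtual tag group."""
    return virtual_tag_group in ALLOWED_VIRTUAL_TAG_GROUPS.get(entity_type, set())


if __name__ == "__main__":
    print(can_associate("SERVICE_IMPLEMENTATION", "interfaces"))
    print(can_associate("SERVICE_INTERFACE", "interfaces"))
```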
Service Framework (2 of 2) • ECHO Service Forms • Based on order forms specification • Assigned to a “Service Implementation” Service & Dataset • Service Implementation must be associated with a service interface • Simple Subset Wizard API • EOSDIS Service Interface API • Workflow • User discovers data with an associated service • Client recognizes supported service interface • Client renders the service form, collects user input, and uses the pruned “model” element from the form as an input to the target service. • Target service implementation is invoked on behalf of the user for selected data. • [Diagram: Service Interface (SSW), Service Implementation (SS Agent), Service Form, Dataset A, and Reverb]
Reverb Service Support • Reverb Service Discovery • Reverb requirements include basic service discovery and presentation to user • Service Discovery to include discovery based on tag groups & associated datasets • Significant improvement for service discovery in ECHO • Service Invocation • Reverb requirements include integration with GLAS Subsetter • GLAS Subsetter integration includes custom adapter to generate email to NSIDC user services. • Future requirements will include integration w/ ECS Service Interface • Available services will depend on ECS DAAC registration in ECHO
Existing Service Overview (1 of 2) • SubscriptionService • Capabilities • Deliver notifications regarding collection or granule metadata updates. • Updates filtered by record type, provider, dataset name, AQL filter, or time range. • Results may contain a subset of metadata attributes • Users may specify a result size limit (in MB) and whether or not to compress. • Delivery mechanisms include EMAIL or FTPPUSH • Limitations • Filtering by provider, dataset, and AQL allows for incoherent subscription definitions • The record type (e.g. COLLECTIONS_ONLY) may be inconsistent with AQL filtering on granules. • Users are not emailed when a subscription is paused, renewed, or expired. • An EMAIL subscription may have FTPPUSH information configured • There is no support for metadata translation
Existing Service Overview (2 of 2) • EventNotificationService • Capabilities • Deliver notifications regarding catalog, service, provider, or subscription events. • Catalog notifications can be filtered by an AQL filter. • An XPath filter may be provided to filter events • Delivery mechanisms include EMAIL, FTPPUSH, or HTTP SOAP. • A separate URL can be provided to support early subscription ending messages. • Limitations • No support for metadata translation. • An AQL filter may be provided, but the filter topic may not be science metadata. • XPath filtering requires a deeper understanding of the event notification XML. • Notifications re: event notifications that end prior to the expiration date are sent only through an HTTP POST
Motivations for Change • Multi-Format Support • Metadata could be provided to users in either ECHO 10, ISO 19115, or Atom • In lieu of filtering metadata fields, the simplified Atom format could be used • Instead of sending full metadata records, links to the metadata could be sent • Periodic Execution • Most users are not interested in on-demand notifications. • XPath Filtering Superseded by AQL • XPath filtering requires too much detailed information • Action Subscriptions • In addition to receiving metadata, a user could configure a subscription to perform an action (e.g. Ordering or Service Invocation). • Data & Service Casting • An improved subscription capability would lay the groundwork for ECHO support of data and service casts. • Reverb Platform • Reverb provides a way to expose new subscription functionality.
Proposed ECHO Changes (1 of 2) • Information Subscription • Subscriptions for informational updates. • Notifications sent on 15 minute, Hourly, Daily, Weekly, or Monthly intervals • Subscription Types • Catalog Item Subscription • Filtered by AQL • Results include Metadata links, Full metadata links, Download links • Specific format (ECHO, ISO, Atom) may be specified • Provider Item Subscription • No filter information • Results formatted in XML or JSON • Service Item Subscription • Filter by Tag Group or Provider • Results formatted in XML or JSON • Delivery Types • Email • FTPPush • Event Notification
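One way to read the Information Subscription proposal is as a small record with a few consistency rules. The Python sketch below models it under stated assumptions: the intervals, delivery types, and per-type result formats come from the slide, while the class name, field names, identifier spellings, and the decision to make the result format mandatory are simplifications made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

INTERVALS = {"15_MINUTE", "HOURLY", "DAILY", "WEEKLY", "MONTHLY"}
DELIVERY_TYPES = {"EMAIL", "FTPPUSH", "EVENT_NOTIFICATION"}
RESULT_FORMATS = {
    "CATALOG_ITEM": {"ECHO", "ISO", "ATOM"},
    "PROVIDER_ITEM": {"XML", "JSON"},
    "SERVICE_ITEM": {"XML", "JSON"},
}


@dataclass
class InformationSubscription:
    item_type: str
    interval: str
    delivery: str
    result_format: str
    # AQL for catalog items, tag group or provider for service items.
    filter_text: Optional[str] = None


def validation_errors(sub: InformationSubscription) -> List[str]:
    """Return a list of problems; an empty list means the definition is coherent."""
    if sub.item_type not in RESULT_FORMATS:
        return [f"unknown item type: {sub.item_type}"]
    errors = []
    if sub.interval not in INTERVALS:
        errors.append(f"unsupported interval: {sub.interval}")
    if sub.delivery not in DELIVERY_TYPES:
        errors.append(f"unsupported delivery type: {sub.delivery}")
    if sub.result_format not in RESULT_FORMATS[sub.item_type]:
        errors.append(f"{sub.result_format} is not valid for {sub.item_type}")
    if sub.item_type == "PROVIDER_ITEM" and sub.filter_text:
        errors.append("provider item subscriptions take no filter")
    return errors


if __name__ == "__main__":
    sub = InformationSubscription("CATALOG_ITEM", "DAILY", "EMAIL", "ATOM", "<aql/>")
    print(validation_errors(sub))
```

Checking the item type against its allowed formats at creation time is the kind of rule that would avoid the incoherent definitions possible in the existing SubscriptionService.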
Proposed ECHO Changes (2 of 2) • Action Subscription • Subscriptions for actions invoked on behalf of a user. • Subscriptions fire on 15 minute, Hourly, Daily, Weekly, or Monthly intervals. • One or more users may be notified when action is invoked. • Subscription Types • Catalog Item Subscription • Filtered by AQL • Actions include order submission and service invocation • Order or Service Option Selection required during subscription creation • Service Item Subscription • Targeted service must be of type SERVICE_IMPLEMENTATION • User provides a pre-defined service selection that is passed to the target service
Client Changes • Reverb • Users can subscribe to collection or granule searches during search workflow. • Users would be able to subscribe to a service invocation or order submission based on search criteria and options. • Users would be able to create and manage existing subscriptions via Reverb. • Data & Service Cast • New Information Subscriptions would allow a “-caster” to be built on top of ECHO. • Metadata would already be in the correct (Atom) format.
With a little squinting… • ECHO Service Forms • Based on order forms specification • Assigned to a “Service Implementation” Service & Dataset • Service Implementation must be associated with a service interface • Simple Subset Wizard API • EOSDIS Service Interface API • Workflow • User discovers data with an associated service • Client recognizes supported service interface • Client renders the service form, collects user input, and uses the pruned “model” element from the form as an input to the target service. • Target service implementation is invoked on behalf of the user for selected data. • [Diagram: the service diagram overlaid with ordering labels: Service Interface (SSW) / Order Fulfillment API; Service Implementation (SS Agent) / EWOC; Service Form / Ordering Options; Dataset A; Reverb]
Ordering as a Service • The new ECHO service framework would allow data providers implementing the OrderFulfillmentAPI (EWOC) to register it as a service in the service registry. Service forms & Order forms are already the same. • This would be an opportunity to simplify and clean up the ordering process • May require the end-of-life for WIST
ECHO URS Support • ECHO will be a Phase I participant in the User Registration System • Several external applications take advantage of the ECHO authentication API • ECHO will proxy authentication requests for the URS • Applications authenticating through ECHO will automatically gain URS support and should not be impacted… • With the exception of migration issues inherent in the URS changes (passwords, account consolidation, etc.) • URS user data will be stored in the URS; ECHO-specific user information will be kept in ECHO, but clients should be unaware of the change
Client Timeline (chart spanning March 2011 to March 2012, plus TBD; tracks: Reverb, WIST / Web Access, User Registration, Coherent Web) • GLAS Subsetter integrated (May 2011) • Multi-metadata format (August 2011) • Adopt DAAC client capabilities (September 2011) • Web Access functionality integrated (TBD) • Open Beta to end users (March 2011) • Reverb 1.0 Operational (5/22/11) • Data visualization (November 2011) • Services integrated, including HEG (8.1 PSR in March 2012) • Incorporate URS (February 2012) • Transition Web Access users to Reverb • Transition WIST users to Reverb • Retire WIST (December 2011) • Web Access user survey (June/July 2011) • Retire Web Access (TBD) • Incorporate Coherent Web stylesheet (TBD) • Deploy URS (December 2011) • Implement EOSDIS User Registration in Coherent Web (TBD) • Transition LANCE, ECHO, LP DAAC (February 2012) • EOSDIS Portal Website / DAAC top-hat (May 2011) • Expose search/order functionality via ECHO interface (TBD)
Oracle to Postgres Study • ECHO has been asked to evaluate the feasibility of moving from Oracle to Postgres • The Multiformat support significantly simplifies ECHO DB usage, which makes the migration easier to evaluate • ECHO relies heavily on Oracle spatial capabilities • A significant part of the effort will be spent on evaluating PostGIS as a replacement • ECHO uses Oracle RAC for high availability and load balancing • Will evaluate Postgres HA architectures, typically handled at the OS level • Study should complete by September
Upcoming Operations Changes • Operations/Dev Split • [Diagram: two Firewall / SLB boxes; Operations, Testbed, and Partner Test environments, each with WIST and ECHO instances and an ECHO DB; two “Hosted w/ Ops” labels] • Operations Dashboard Enhancements • Improved usability & deployment model • Advanced Query Analysis
ECHO Website Modifications • Static Content Migration • All documents & static content will be migrated to the new EOSDIS website. • Migration scheduled to occur by mid-June 2011 • Dynamic Content Migration • ECHO Data Catalog • Considered for migration during Coherent Web Phase II Activities • Holdings Report • Will continue to be hosted via the ECHO website & referenced from the ED site. • Dynamic inclusion w/i the EarthData site to be discussed w/ Coherent Web Team • Ingest Documentation • Will be hosted off of ECHO API in 10.36/7 • ECHO Status Feed • Will continue to be hosted off of ECHO website • ECHO to consider consolidating this capability into the ECHO system. • EIAT • Will continue to be hosted via ECHO website URLs