A Web Services Vision and Strategy for Earth Explorer and GloVis
CEOS WGISS #26, Boulder, CO, September 25, 2008
Lyndon R. Oleson, U.S. Geological Survey Earth Resources Observation and Science (EROS) Center, Sioux Falls, SD
Service-Oriented Architecture (SOA) and Web Services • A service-oriented architecture (SOA), and in particular a Web services-based SOA, offers significant promise for addressing data access and interoperability challenges. • By inserting a Web Services interface between the client and servers of an existing “tightly-coupled” data and information access system, we create many opportunities for increased flexibility and enhanced interoperability: • The client can now interact with other servers that also offer the same standard Web Services. • Many other clients that use the same Web Service standards can now reach our servers. • We can evolve and enhance the client without significantly impacting the servers, and vice versa.
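The decoupling described on this slide can be sketched in Python. This is only an illustration: the names `CatalogService`, `ErosCatalog`, `OtherAgencyCatalog`, and `search_all`, as well as the scene IDs, are hypothetical and do not come from the presentation. The point is that a client written against a shared interface works with any server that implements it.

```python
from typing import Protocol


class CatalogService(Protocol):
    """Hypothetical standard interface that every server agrees to offer."""

    def search(self, bbox: tuple) -> list:
        ...


class ErosCatalog:
    def search(self, bbox):
        # Placeholder result; a real server would query its inventory.
        return ["eros-scene-1"]


class OtherAgencyCatalog:
    def search(self, bbox):
        return ["other-scene-7", "other-scene-9"]


def search_all(services, bbox):
    """Client code depends only on the interface, not on any one server."""
    results = []
    for service in services:
        results.extend(service.search(bbox))
    return results


hits = search_all([ErosCatalog(), OtherAgencyCatalog()], (-104.0, 43.0, -96.0, 46.0))
```

Adding a third server, or replacing the client, requires no change on the other side of the interface, which is the flexibility the slide argues for.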
Why Employ Web Services? Historically we’ve been “tightly coupled” between client and server. [Diagram: Client; Tightly Coupled Interface; Server; Data and Information]
Why Employ Web Services? [Diagram: Client; Service Interface; Server; Data and Information]
Why Employ Web Services? [Diagram: Client; Service Interface (×3); Server; Data and Information (×3)]
Opportunities • USGS/EROS is in the process of re-architecting and redesigning Earth Explorer and GloVis to take advantage of some of the benefits of a Web services-based SOA approach and to create a new baseline for future capabilities • Currently tightly coupled interface between Web client and servers • Insert Web service interface between Web client and server • Extend access to other clients • Includes both direct access to data and catalog access paradigms depending on data type and production model • Additional investigations and prototypes planned in 2008
Earth Explorer History [Diagram: Tightly-Coupled; Earth Explorer; EROS Servers]
Vision for Earth Explorer [Diagram: Clients: Earth Explorer; Service Interfaces: CSW, WMS, WCS; Servers: Browse Images, Inventory, Online Data]
Vision for Earth Explorer [Diagram: Earth Explorer; CSW, WMS, WCS; Private Provider Servers, EROS Servers, Other Agency Servers]
Vision for Earth Explorer [Diagram: Earth Explorer “Client”: Catalog Searches, View Browse, Direct Data Download; Registration, Shopping Basket, On-demand Product Orders, other?; Services: Registration, Pricing, Shopping Basket; Service Interfaces: CSW, WMS, WCS; Servers: Browse Images, Inventory, Online Data]
Catalog Search and Results • OGC Standard & Applicability • The OGC Catalogue Service for the Web (CSW, also written CS-W) is the standard that best describes our goal for inventory services, but it is not widely deployed. • The Keyhole Markup Language (KML) is a much more popular standard but does not work well for catalog searching. • Although not an OGC standard per se, geo-tagged Really Simple Syndication (GeoRSS) can be used as a simple catalog search method.
Catalog Search and Results • Strengths and Weaknesses of Possible Approaches • We currently provide catalog service through Z39.50 and the FGDC protocol. The FGDC output is verbose and cumbersome, and it is not easily parsed. • The main strength of GeoRSS is that there are many options for RSS feed readers, so client software is readily available. Since RSS is primarily intended for distributing newly available content, it works well for new data acquisitions but not for existing, large collections. • KML readers are not as widespread as GeoRSS readers and tend to be heavyweight applications (e.g., Google Earth or NASA World Wind). However, KML provides a richer set of options for displaying images. • Both KML and GeoRSS can be indexed by the Google search crawler, which allows a form of geographic searching using standard web search tools (Google Maps and Google Earth).
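To make the GeoRSS option concrete, here is a minimal sketch of how a newly acquired scene could be written as an RSS item carrying a GeoRSS-Simple bounding box. The function name, scene ID, and link are hypothetical; the presentation does not describe an actual feed layout.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)


def scene_to_rss_item(scene_id, link, south, west, north, east):
    """Build an RSS <item> with a GeoRSS-Simple bounding box."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = scene_id
    ET.SubElement(item, "link").text = link
    # GeoRSS-Simple box order: lower corner then upper corner, latitude first.
    box = ET.SubElement(item, f"{{{GEORSS_NS}}}box")
    box.text = f"{south} {west} {north} {east}"
    return item


# Hypothetical scene ID and URL, for illustration only.
item = scene_to_rss_item(
    "EXAMPLE_SCENE_001",
    "https://example.org/scenes/EXAMPLE_SCENE_001",
    42.5, -97.5, 44.5, -95.0,
)
xml_text = ET.tostring(item, encoding="unicode")
```

Any feed reader that understands GeoRSS can then place the item on a map, which is why client software is readily available for this approach.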
Catalog Search and Results • Performance Risks and Possible Mitigations • Existing catalog services like the FGDC Z39.50 tend to work well when searching for collection-level metadata or for collections with a small number of granules. • They don’t work as well for large historical collections like Landsat that could involve hundreds of thousands of granules being returned.
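One common way to limit the cost of very large result sets is to page through them. The slides do not prescribe this, so the following is only a sketch of the idea using the `startPosition` and `maxRecords` parameters of a CSW 2.0.2 GetRecords key-value request. The endpoint URL and function names are hypothetical, and a real request would also carry a spatial or temporal constraint.

```python
from urllib.parse import urlencode


def csw_getrecords_url(base_url, start_position=1, max_records=100):
    """Build a CSW 2.0.2 GetRecords KVP request for one page of results."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "brief",  # ask for the smallest record form
        "startPosition": start_position,
        "maxRecords": max_records,
    }
    return f"{base_url}?{urlencode(params)}"


def page_starts(total_matched, page_size):
    """Return the 1-based startPosition of every page needed to cover the result set."""
    return list(range(1, total_matched + 1, page_size))


# Hypothetical endpoint; 250 matching granules fetched 100 at a time.
urls = [
    csw_getrecords_url("https://example.org/csw", start, 100)
    for start in page_starts(250, 100)
]
```

The client requests one page at a time, so neither side has to hold hundreds of thousands of granule records in a single response.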
Catalog Search and Results • Planned investigations • A prototype employing CS-W to support Landsat catalog metadata search and results. • KML has been accepted as an OGC best practice and seems to have some overlap with the OGC GML standard. This relationship could be the topic of another investigation.
Image Browse • Current Functionality • The image browse function provides a reduced-resolution image that allows a user to visually assess a satellite image or other science data to determine if the data is suitable. • Traditionally this has been done by generating a small (on the order of 1024 by 1024 pixels) overview image from the full resolution data by selecting three bands (for multispectral data), subsampling, enhancing the contrast, and compressing the result (in most cases using JPEG lossy compression). • The browse images are usually delivered using HTTP. The combination of HTTP and JPEG allows the images to be easily viewed by any web browser and to be saved for use by other image manipulation software.
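The browse generation steps listed above (band selection, subsampling, contrast enhancement) can be sketched in plain Python on nested lists. This is an illustration under simplifying assumptions: the function names are hypothetical, the stretch is a simple linear one, and the final JPEG compression step is omitted because it would need an imaging library.

```python
def subsample(band, step):
    """Keep every step-th pixel in both directions (simple decimation)."""
    return [row[::step] for row in band[::step]]


def stretch(band, low, high):
    """Linearly map values in [low, high] to 0..255, clipping values outside the range."""
    return [
        [min(255, max(0, round((v - low) * 255 / (high - low)))) for v in row]
        for row in band
    ]


def make_browse(bands, rgb_indices, step, low, high):
    """Select three bands, subsample each, and contrast-stretch the result."""
    return [stretch(subsample(bands[i], step), low, high) for i in rgb_indices]


# Toy 4x4 band reused four times to stand in for a multispectral scene.
band = [
    [0, 10, 20, 30],
    [40, 50, 60, 70],
    [80, 90, 100, 110],
    [120, 130, 140, 150],
]
browse = make_browse([band, band, band, band], (3, 2, 1), step=2, low=0, high=100)
```

Each of the three output planes is a quarter-size, 8-bit version of the selected band, ready to be encoded as a browse image.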
Image Browse • OGC Standard & Applicability • The OGC Web Map Service (WMS) is the best candidate to replace the existing image browse service. • Strengths and Weaknesses of Possible Approaches • WMS is one of the most popular OGC standards and is supported by many different software packages. It supports output in a variety of image formats (e.g., JPEG, PNG, TIFF) as well as dynamic image resizing and subsetting. • The output is georeferenced and can support multiple map projections. This would allow browse images to be combined with other data in a GIS application. If sufficient resources exist on the server, image data can be delivered at up to full spatial resolution.
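A WMS browse request is an ordinary HTTP GET with standard query parameters. The sketch below builds a WMS 1.1.1 GetMap URL; the server address and layer name are hypothetical, and the set of layers, projections, and formats actually offered would come from the server's GetCapabilities response.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   srs="EPSG:4326", image_format="image/jpeg"):
    """Build a WMS 1.1.1 GetMap request URL."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server and layer name.
url = wms_getmap_url(
    "https://example.org/wms", "landsat_browse",
    (-104.0, 43.0, -96.0, 46.0), 1024, 384,
)
```

Changing `BBOX`, `WIDTH`, and `HEIGHT` gives the dynamic subsetting and resizing mentioned above, and changing `SRS` asks the server to reproject.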
Image Browse • Performance Risks and Possible Mitigations • Support for full spatial resolution images would require more storage space on the server. • The ability to re-project data on the fly is a valuable feature of the WMS, but it can be costly in terms of server CPU performance. • In the ideal case, the source data would be stored in its native projection and offered in a variety of “standard” projections (e.g., UTM, geographic), but if CPU resources are insufficient, dynamic re-projection can be disabled and the images would be available only in the native projection. • Similarly, dynamic subsetting and resizing of images can have a significant impact on CPU resources. • This impact can generally be reduced by pre-generating reduced-resolution overviews when the source data is loaded.
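Pre-generating reduced-resolution versions of each image (often called overviews or pyramids) trades storage for CPU time. As a rough sketch of what that involves, the function below lists the sizes of the levels that would be built by repeated halving; the function name and the stopping size are assumptions, not values from the presentation.

```python
def overview_sizes(width, height, min_size=256):
    """Return (width, height) of each overview level, halving until the larger side is <= min_size."""
    sizes = []
    while max(width, height) > min_size:
        width = max(1, width // 2)
        height = max(1, height // 2)
        sizes.append((width, height))
    return sizes


# An 8000 x 7000 pixel scene with overviews down to roughly browse size.
levels = overview_sizes(8000, 7000, min_size=1000)
```

Because each level has a quarter of the pixels of the one before it, the full set of overviews adds only about one third to the storage of the source image, while letting the server answer small-scale requests without resampling the full-resolution data.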
Image Browse • Data type specific issues • The WMS browse would be most useful if the source data is geometrically corrected (level-1 G product). • Almost all implementations of WMS require the data to have spatial references defined. It is possible to define and use a pixel coordinate space instead, but the WMS result will then not integrate with external data sources.
Image Browse • Planned investigations • The Landsat Data Continuity Mission (LDCM) has done an investigation of the OGC WMS for browse delivery, including a prototype implementation using the Global Land Survey 2000 data. • The LDCM prototype could be extended to other data collections to gain a better understanding of system resource requirements and performance. • Different implementations of WMS could be investigated for performance and resource use. This could be a collaborative effort with some of the existing projects that already have operational WMS servers. • Another potential area of investigation would be using data for which no level-1 G products exist (e.g. NAPP).
Direct Download • Current Functionality • The direct download service provides science data directly to end users. • Traditionally this was done using an HTTP or FTP transfer of a compressed tar or zip file. • The main drawback to this approach is that the user must download the entire scene, even if only a portion of the scene is desired. • There is also no choice of file format or map projection.
Direct Download • OGC Standard & Applicability • The Web Coverage Service (WCS) is the best candidate for a direct data download service. • Strengths and Weaknesses of Possible Approaches • The WCS addresses many of the limitations of the current data download mechanism. • It allows for spatial and spectral subsetting, reprojection, resampling, and data reformatting. • The main weakness of the WCS is that it is not as widely supported by client software as the WMS is. • WCS is a relatively new specification that has still not been implemented by many servers. The strengths and weaknesses are difficult to judge and seem to depend largely upon the completeness of the implementation. Server resource and bandwidth requirements are not easy to estimate.
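For comparison with the whole-scene download, a WCS request names the area, output size, projection, and format that the user actually wants. The sketch below builds a WCS 1.0.0 GetCoverage URL; the server address and coverage name are hypothetical, and spectral (band) subsetting is left out because its parameter name depends on how a given server describes its range axes.

```python
from urllib.parse import urlencode


def wcs_getcoverage_url(base_url, coverage, bbox, width, height,
                        crs="EPSG:4326", response_crs=None, data_format="GeoTIFF"):
    """Build a WCS 1.0.0 GetCoverage request URL for a spatial subset."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": crs,  # coordinate system in which BBOX is expressed
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,   # output grid size, which implies resampling
        "HEIGHT": height,
        "FORMAT": data_format,
    }
    if response_crs is not None:
        params["RESPONSE_CRS"] = response_crs  # ask the server to reproject
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server and coverage name.
url = wcs_getcoverage_url(
    "https://example.org/wcs", "example_scene",
    (-97.5, 43.0, -96.5, 44.0), 2000, 2000, response_crs="EPSG:32614",
)
```

Unlike a WMS response, the result is the data itself rather than a rendered picture, so the server must keep the source data online to answer such requests.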
Direct Download • Performance Risks and Possible Mitigations • Many of the same performance risks that apply to the WMS also apply to the WCS. The source data must be stored online, either uncompressed or using a lossless, high-performance compression. Also, dynamic reprojection, reformatting, subsetting, and resampling can require significant CPU resources on the server. Restricting the available options may be required to minimize this risk. • Data type specific issues • As with the WMS, the WCS is most useful if the source data is geometrically corrected (level-1 G). Unlike WMS, the WCS supports an “image coordinate reference system (ImageCRS)” for data that is not georeferenced. This may be useful for data that is only available in a level-0 format (or for which processing to level-1 G is not feasible).
Direct Download • Planned investigations • The LDCM project has also investigated the use of WCS for delivery of Landsat data, including a prototype using selected scenes from the Multi-Resolution Land Characterization (MRLC) data collection. • This prototype included a simple web browser based WCS client. • The WCS client developed for the LDCM prototype was very limited; a more robust prototype could be useful to several projects. • The use of the image CRS for delivery of data that is not georeferenced should be investigated. • Research into the current demand, support, and clients of WCS would be useful.
Other Functions • User Registration • This functionality allows users to register as new users, to log in to the system, and to modify existing registration information. • EROS has developed a non-standard Web Service for this purpose. It is a stand-alone, loosely coupled software component that uses Web services standards and technology to handle requests for user code and password validation, new user registration, updating of user information, and validation of address information by external websites or Web services. • The Registration Web Service also provides rights management for access to licensed commercial data at EROS. During the registration process, users identify the U.S. Government agency they are associated with. Access to the licensed data is based on the user’s U.S. Government affiliation. Depending on their user profile, users may be able to view the data but not download or use it.
User Registration • OGC Standard & Applicability • Current OGC specifications do not cover security-related aspects. To enable protected geospatial information processing and licensing, security must be handled in an interoperable way. This can be achieved by leveraging existing IT standards. We recommend tying this work to the OGC Security WG. • Data type specific issues • The current proposal provides the capability to identify user access, determine the user type, and control the types of data the user can see and download. This is important for datasets that are licensed only for particular user groups; e.g., commercial data purchased by USGS is licensed for USGS staff only. • In addition, the current capability allows the USGS to gather metrics on the types of users querying and downloading data. The purpose of the metrics is to provide better customer service and to identify popular data sets.
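The affiliation-based access rule described above can be sketched as a small check. Everything here is hypothetical: the `Dataset` type, the `access_rights` function, the dataset names, and the assumption that every dataset stays viewable are illustrations, not the actual Registration Web Service interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    licensed_to: frozenset = frozenset()  # empty set means unrestricted


def access_rights(user_affiliation, dataset):
    """Decide what a user may do with a dataset based on agency affiliation."""
    # Assumption: every user may view catalog entries and browse images.
    can_view = True
    # Download is allowed for unrestricted data or for licensed affiliations.
    can_download = (not dataset.licensed_to) or (user_affiliation in dataset.licensed_to)
    return {"view": can_view, "download": can_download}


public_data = Dataset("example_public_collection")
commercial = Dataset("example_commercial_collection", frozenset({"USGS"}))

usgs_rights = access_rights("USGS", commercial)
other_rights = access_rights("Other Agency", commercial)
```

Keeping the rule in one service lets every client enforce the same license terms, and the same call is a natural place to record the usage metrics mentioned above.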
Shopping Basket and On-demand Product Ordering • Current Functionality • The Shopping Basket Service is a web-based, stand-alone interface that can be used by any website to finalize product ordering. • The Shopping Basket Service receives order requests from an ordering site and is callable using URL parameters. • The Shopping Basket interface guides users through ordering using a checkout process in which users can view and modify ordered items, receive accurate pricing information, enter or modify address information, enter payment information, and submit orders for processing. • OGC Standard & Applicability • The Shopping Basket interface, or future Item Selection Basket, is tied closely to the Web Processing Service (WPS). A WPS defines a standardized interface that facilitates the publishing of geospatial processes, and the discovery of and binding to those processes by clients. Processes include any algorithm, calculation, or model that operates on spatially referenced data.
Advertising Services • Registry Services? • Registry services are a means to catalog and collate system services that are exposed to the network. • These system services are generally consumed on a machine-to-machine basis. As such, the XML schemas that describe the services within a registry are typically self-documenting, meaning that definitions of parameters and responses are embedded within the registered service descriptor. • A web page with links to a WSDL and human-readable developer documentation. These HTML pages can be screen-scraped by a system or application to extract the WSDL information. This implementation is low tech, centralized, manual, and rarely up to date. • An established convention may be adopted. Although not a formal standard, such conventions can be effective within a developer community or enterprise. An example might be to publish service WSDLs (conforming to an established naming convention) at the root directory of web servers. This is neither centralized nor formalized and as such is not a viable long-term option for robust enterprise systems. • A packaged Service Registry implementation (COTS, GOTS, or open-source package) such as those listed below. These are formal mechanisms intended for enterprise deployments and Service Oriented Architectures. • A search engine registry is a viable appliance-based option (e.g., Google Mini Appliance). This device can be configured to index (crawl or spider) only .wsdl files that exist in numerous distributed systems. The appliance would update the index periodically to maintain freshness. The appliance could be used to respond to machine requests seeking a service WSDL.
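The "web page with WSDL links" option relies on a client scraping the page. A minimal sketch of that scraping step, using only the Python standard library, is shown below; the class and function names and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class WsdlLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at WSDL documents."""

    def __init__(self):
        super().__init__()
        self.wsdl_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            lowered = value.lower()
            # Accept both "service.wsdl" paths and "service?wsdl" style URLs.
            if lowered.split("?")[0].endswith(".wsdl") or lowered.endswith("?wsdl"):
                self.wsdl_links.append(value)


def find_wsdl_links(html_text):
    finder = WsdlLinkFinder()
    finder.feed(html_text)
    return finder.wsdl_links


# Hypothetical developer documentation page.
page = (
    '<a href="/services/registration.wsdl">Registration</a>'
    '<a href="/docs/index.html">Docs</a>'
    '<a href="/svc/basket?WSDL">Basket</a>'
)
links = find_wsdl_links(page)
```

The fragility of this approach is visible even in the sketch: it depends on how the page happens to write its links, which is one reason the slide calls it low tech and rarely up to date.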
Advertising Services • Will formal Web Service registries be sufficient to advertise and communicate the subtleties of satellite catalog search, image browse, and data download services? • Many of our initial services may need to extend beyond current standards • How do we communicate the necessary information to prospective clients? • Current thinking: • Employ earth science clearinghouses to advertise availability of services • Provide documentation and code examples (e.g. portions of client software) to illustrate functionality and approach • Offer technical point of contact
Landsat Data Continuity Mission (LDCM) Vision • OGC Standard & Applicability • The future LDCM web client architecture will be built on OGC and other Web Services to provide metadata (CSW), browse (WMS), KML, and products (WCS). The goal of this future interface is to provide a web client and open interfaces to allow other cooperators to create their own client, using the components available through a published registry. • Strengths and Weaknesses of Possible Approaches • The advantages of this approach include: • Meet/exceed the needs and expectations of the user and science community • Provide greater access to land remote sensing information and products to a wider range of users • Ensure access methods are easy to use, well documented and efficient • WMS specification has been implemented in a wide variety of settings
Landsat Data Continuity Mission (LDCM) Vision • Performance Risks and Possible Mitigations • Potential risks include: • Current WCS capabilities do not scale very well for large imagery • Dealing with multiple time dimensions over the same area • WPS just starting to evolve • Planned investigations • Continue to cooperate with the OGC group and report findings and limitations with the standards based on future design of the Landsat Data Continuity Mission.
Summary and Recommendations • Continue with plans to evolve EE and GloVis into a Web services-based SOA • Collaborate with CEOS and US GEO interoperability efforts • Initiate investigations into performance mitigation approaches and solutions • Pursue prototypes to more deeply explore Web Service implementation, performance issues, and feasibility: • A prototype employing CS-W to support Landsat catalog metadata search and results • Extension of that prototype to include WCS direct download access to Landsat L1-T products • Exploration of dynamic processing extensions to WCS for Landsat L1-T products