Overview
• Earth System Grid
• Grid-enabled OPeNDAP
• Architecture: server and application access
• Framework experience
• Summary
• Plans for the coming year
Fox (pfox@ucar.edu)
Earth System Grid Overview
• The goal of ESG is to make climate data, particularly climate model data, an easily accessible community resource. The project is funded by the SciDAC program (Scientific Discovery through Advanced Computing).
• Enabling researchers to understand and make effective use of very large, distributed climate datasets is critical. The broad strategy is to develop a collection of server-side capabilities that minimize the amount of data movement.
• Multiple interfaces to ESG will allow researchers to focus on science rather than on issues of data transfer, format, and dataset manipulation.
• The foundation is Globus Grid technology.
Earth System Grid Portal
ESG: U.S. Collaborations & Development
• ANL: Computational grids and grid-based applications
• LBNL: Climate storage facility
• LLNL: Model diagnostics and inter-comparison
• USC/ISI: Computational grids and grid-based applications
• ORNL: Climate storage and computational resources
• LANL: Next-generation coupled models and computing
• NCAR: Climate change prediction and scenarios
ESG: ESG-II Architecture
The Earth System Grid
[Figure: ESG deployment diagram. Service categories: data storage, security, metadata, transport, analysis & visualization, monitoring, and framework services. Sites shown: LBNL, ANL, NCAR, NERSC, ORNL, LLNL, ISI. Components shown include gridFTP server/client, HRM, HPSS, NCAR MSS, disk, MySQL, Xindice, RLS, MCS, OGSA-DAIS, THREDDS catalogs, GSI, CAS server/client, MyProxy server/client, GRAM, Tomcat, Axis, SLAMON daemon, openDAPg server/client, LAS server, NCL, and CDAT.]
Metadata-centric view of ESG services
[Figure: metadata-centric diagram of ESG services. Service labels: data transport; user authentication and authorization; data analysis & visualization; data browsing; data search & discovery; system monitoring and control; metadata services. Metadata labels: location, access and authorization, aggregation, cataloguing, content, annotation & history, and logging metadata.]
OPeNDAP and Grid systems
• DODS, in use since about 1995, was based on HTTP and a CGI-style architecture
• Two concerns:
  • Application support and performance of HTTP
  • Housekeeping abilities of the CGI architecture
• Solution: evolve OPeNDAP, the discipline-neutral aspect of DODS
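To make the HTTP, CGI-style access pattern concrete, here is a minimal sketch of how a DAP2-style request URL is commonly composed: a dataset URL, a response suffix (.das, .dds or .dods), and an optional constraint expression after "?". The function name dap_url and the host example.org are hypothetical and used for illustration only; this is not taken from the ESG or OPeNDAP code base.

```python
from urllib.parse import quote


def dap_url(dataset_url, response="dods", variable=None, slices=()):
    """Compose a DAP2-style request URL (illustrative helper, not a library API).

    response: "das" (attributes), "dds" (structure) or "dods" (binary data).
    slices:   sequence of (start, stride, stop) index triples, one per dimension.
    """
    if response not in ("das", "dds", "dods"):
        raise ValueError(f"unknown response type: {response}")
    url = f"{dataset_url}.{response}"
    if variable:
        # Each dimension is subset with a [start:stride:stop] hyperslab.
        hyperslab = "".join(f"[{a}:{s}:{b}]" for a, s, b in slices)
        # Keep the bracket and colon syntax readable in the query string.
        url += "?" + quote(variable + hyperslab, safe="[]:,.")
    return url


# Hypothetical dataset on a hypothetical server.
print(dap_url("http://example.org/dods/sst.nc", "dods", "sst", [(0, 1, 9)]))
```

Each such request is a plain HTTP GET, which is why the performance and housekeeping concerns listed above apply to every data access.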
OPeNDAP (continued)
• Data transport protocol and data access protocol separated
• Revised server architecture
• Address Grid-style authentication
• Memory management
• Exception handling
• All of these changes retain interoperation with HTTP and CGI
• Advanced requirement: a URL should support more than one dataset or object, i.e. aggregation
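The separation of transport from access, and the aggregation requirement, can be sketched as two independent registries: one keyed by URL scheme that only moves bytes, and one keyed by format that only decodes them. Everything below is an assumption made for illustration: the "mem" scheme, the in-memory store, the CSV decoder and all function names are hypothetical stand-ins and do not reflect the actual OPeNDAP-g design.

```python
from typing import Callable, Dict, List
from urllib.parse import urlparse

# Hypothetical in-memory "remote" store so the sketch runs without a network.
_STORE: Dict[str, bytes] = {"mem://a.csv": b"1,2,3", "mem://b.csv": b"4,5"}


def mem_transport(url: str) -> bytes:
    """Transport layer: return raw bytes for a URL, knowing nothing about formats."""
    return _STORE[url]


def csv_decoder(payload: bytes) -> List[float]:
    """Access layer: decode bytes into values, knowing nothing about transport."""
    return [float(x) for x in payload.decode().split(",")]


# A real server would register e.g. file, http, ftp or GridFTP transports here.
TRANSPORTS: Dict[str, Callable[[str], bytes]] = {"mem": mem_transport}
DECODERS: Dict[str, Callable[[bytes], List[float]]] = {"csv": csv_decoder}


def read(url: str, fmt: str) -> List[float]:
    """Pick the transport by URL scheme and the decoder by format name."""
    scheme = urlparse(url).scheme
    return DECODERS[fmt](TRANSPORTS[scheme](url))


def aggregate(urls: List[str], fmt: str) -> List[float]:
    """Serve one request that names several datasets by concatenating them."""
    values: List[float] = []
    for url in urls:
        values.extend(read(url, fmt))
    return values


print(aggregate(["mem://a.csv", "mem://b.csv"], "csv"))
```

With this split, adding a new transport or a new file format touches only one registry, and aggregation is written once on top of both.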
OPeNDAP 3.x vs OPeNDAP-g Architecture
OPeNDAP 3.x:
• Simple and easy to install
• One CGI process per URL request
• Limited memory management (external)
• Limited scalability
• Limited status reporting to web server
• Returns data stream from one format
OPeNDAP-g:
• Standalone server or httpd module
• Can manage multiple daemon processes
• Strong memory management (internal)
• Reuses processes, scales
• Coupled to OPeNDAP server for status
• Returns multiple formats in a single stream, multiple protocols
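The difference between starting one CGI process per request and reusing a fixed set of workers can be illustrated with a small worker pool. This is only a sketch of the general pattern: threads from Python's standard concurrent.futures module stand in for the daemon processes, and the handle and serve functions are hypothetical names, not part of OPeNDAP-g.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def handle(request: str) -> Tuple[str, int]:
    """Hypothetical request handler: returns a response and the worker's id."""
    return request.upper(), threading.get_ident()


def serve(requests: List[str], workers: int = 2) -> List[Tuple[str, int]]:
    """Handle all requests with a fixed pool instead of one new worker each."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves request order; workers are reused across requests.
        return list(pool.map(handle, requests))


results = serve(["das", "dds", "dods", "das", "dds", "dods"], workers=2)
distinct_workers = {worker_id for _, worker_id in results}
print(len(results), "requests handled by", len(distinct_workers), "worker(s)")
```

Because the workers persist between requests, start-up cost is paid once and memory can be managed inside the server rather than by the operating system tearing down each CGI process.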
Application development
Status
• Operational/production release of the standalone OPeNDAP server (no dependence on a web server) for ESG
• The OPeNDAP server can run as a client to a GridFTP or HTTP server
• Multi-protocol support: file, HTTP, GridFTP, FTP, etc.
• File format support: netCDF, CDF, FITS, CEDAR, …
• Re-architected for aggregation support and performance
• Portal application client in production, netCDF client operational
• Authentication is handled outside the OPeNDAP server framework
• URL syntax is more complex but more expressive
• Will become part of the community OPeNDAP release very soon
ESG: Framework experience
• ESG is a highly collaborative effort that gives users quick, application-independent access to petabytes of raw or processed data.
• Payoffs of this distributed collaborative infrastructure have included:
  • Distributed data sharing: RLS works! SRM/HRM work! OPeNDAP-g works!
  • Simplified discovery of climate data: the work on metadata paid off! Scalability?
  • Large-scale climate data processing and analysis via a highly integrated portal
  • Increased collaboration among climate research scientists: people use it!
  • Aid in climate assessments and estimates of future climate variability and trends: IPCC!
ESG: Framework experience
• Transport: GridFTP versus HTTP
  • Server to server: very good performance
  • Depends on a very specific version of the GridFTP server (stripped)
  • Clients are not as capable due to the ‘weight’ of Globus, so they revert to HTTP
• Scalability and response times (data AND metadata)
  • Framework architecture supports re-layering for tuning
• Service monitoring
  • Needed to support the distributed collaborative infrastructure
  • Most or all services are needed to really make a production environment work
• Try out ESG by visiting the website at: http://www.earthsystemgrid.org
Success?
• Users are generally happy, developers are very happy
• Exploited new technology components
• Integration: when and how does it work and scale?
• XML -> SQL
• DODS -> OPeNDAP and OPeNDAP-g
• Globus provides a suite of framework components; some are easier to integrate than others, and some just don’t fit our use cases and architecture
• Data framework: e.g. OPeNDAP has been extremely successful
• Carrying this to space science (solar-terrestrial)
Summary
• Basic success in both data systems and data frameworks
• Satisfying user and sponsor needs (from ‘just’ to ‘outstanding’)
• Experience with Globus ranges from very good to not ready for our needs
• Experience with OPeNDAP is very good, especially with core services
• Scalability and performance require an adaptable architecture, which system-level interfaces can still hide from the user
• Challenge: to bring these attributes to a framework, i.e. a setting in which the user is more exposed
Plans
• IDL application-level access to the new OPeNDAP server framework
• Outreach to NASA communities and data centers to install and test new capabilities (server and client)
• Joint development of accompanying semantic catalogs for Sun-Earth Connection datasets within the OPeNDAP framework
• SPDML-enabled OPeNDAP server