Unidata Program Center provides data, tools, and community leadership for enhanced Earth-system education and research. Learn about their real-time data access, tools, and support for faculty and staff.
CRAFT and Unidata: A Natural Partnership • Internet2 Members Meeting, 20 April 2004, Arlington, VA • Mohan Ramamurthy, Unidata Program Center, UCAR Office of Programs, Boulder, CO
Unidata: A Program in UCAR* • Unidata Mission Statement: Provide data, tools, and community leadership for enhanced Earth-system education and research. At the Unidata Program Center, we • Facilitate [Real-time] Data Access • Provide Tools • Support Faculty and Staff • Build and Advocate for a Community *University Corporation for Atmospheric Research (UCAR) is a nonprofit consortium of 68 universities committed to advancing our understanding of the atmosphere and other Earth systems.
Internet Data Distribution (IDD) [Figure labels: Model, Radar, Satellite] 150+ sites participate in the Unidata Internet Data Distribution (IDD) system.
Internet2 Traffic (Week of 4/5/04): Unidata IDD/LDM uses more Internet2 bandwidth than any other advanced application.
The Unidata LDM • The Local Data Manager (LDM) • is a collection of cooperating programs that select, capture, manage, and distribute arbitrary data products. • is designed for event-driven data distribution, and is currently used in Internet Data Distribution (IDD). • includes network client and server programs and their shared protocols. • supports flexible, site-specific configuration, multiple sources of data products, and user-customizable actions on received data products
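The "user-customizable actions on received data products" idea can be sketched as a small pattern-action table. This is an illustration only, not the LDM's actual configuration syntax or code: the rule table, the action functions, and the product identifiers below are all hypothetical.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical rule: (pattern matched against the product identifier, action).
Rule = Tuple[re.Pattern, Callable[[str, bytes], str]]

def file_action(ident: str, data: bytes) -> str:
    # Stand-in for writing the product to disk.
    return f"FILE {ident} ({len(data)} bytes)"

def decode_action(ident: str, data: bytes) -> str:
    # Stand-in for handing the product to a decoder.
    return f"DECODE {ident} ({len(data)} bytes)"

# Site-specific configuration: every matching rule fires, in table order.
RULES: List[Rule] = [
    (re.compile(r"^RADAR/"), file_action),
    (re.compile(r"^SURFACE/"), decode_action),
    (re.compile(r"/KDEN$"), file_action),
]

def on_product(ident: str, data: bytes, rules: List[Rule] = RULES) -> List[str]:
    """Event handler: run each action whose pattern matches the identifier."""
    return [action(ident, data)
            for pattern, action in rules
            if pattern.search(ident)]
```

A product that matches no rule is simply ignored, which is one way a site could select only the data it cares about.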
Data-Feed Topology [Diagram: a source LDM feeds relay-node LDMs, which in turn feed further relay-node and leaf-node LDMs; links are marked as Primary Feed or Secondary Feed]
Real-time data distribution via IDD/LDM6: the Unidata community now extends across several continents
Technology Transfer: Operational LDM Use in the NWS Recently, the Korean Meteorological Administration has started using the LDM for some of their internal data distribution to/from nearly 40 weather service offices.
IDD Topologies [Figure panels: Lightning data, Satellite data, Surface/Upper-air data, Radar data]
SuomiNet and LDM • A network of GPS receivers to provide real-time atmospheric precipitable water vapor measurements and other geodetic and meteorological information • SuomiNet collects data from 100+ GPS receivers distributed throughout the world. • The observations are sent to Boulder, CO for processing and analysis and then redistributed to the community using the LDM.
Generic LDM Installation [Diagram components: Ingester, LDM Server, Product Queue, Receiving LDM, Sending LDM, pqact, Decoder; arrows indicate Data Flow and Execution]
LDM Specifics • Uses registered port 388 • Uses TCP and ONC RPC • Actions driven by configuration file • Arbitrary data of arbitrary size (0 to 2^32-1 bytes) • One process per downstream (upstream) LDM • Metadata • Size of data in bytes • Creation time • Creation hostname • Feed-type (32-bit integer) • Product identifier (255-byte string) • MD5 checksum
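The metadata fields listed above can be pictured as a small record built alongside each product. The sketch below is an assumption-laden illustration, not the LDM's real data structure: the class name, field names, and helper function are hypothetical, and only the field list and size limits come from the slide.

```python
import hashlib
import socket
import time
from dataclasses import dataclass
from typing import Optional

MAX_IDENT_BYTES = 255          # product identifier limit from the slide
MAX_PRODUCT_BYTES = 2**32 - 1  # largest product size from the slide

@dataclass(frozen=True)
class ProductInfo:
    size: int          # size of data in bytes
    created: float     # creation time (seconds since the epoch)
    origin: str        # creation hostname
    feedtype: int      # 32-bit integer
    ident: str         # product identifier
    signature: bytes   # 16-byte MD5 checksum of the data

def make_info(data: bytes, feedtype: int, ident: str,
              origin: Optional[str] = None,
              created: Optional[float] = None) -> ProductInfo:
    """Build the metadata record for one product (hypothetical helper)."""
    if len(data) > MAX_PRODUCT_BYTES:
        raise ValueError("product too large")
    if not 0 <= feedtype < 2**32:
        raise ValueError("feedtype must fit in 32 bits")
    if len(ident.encode("utf-8")) > MAX_IDENT_BYTES:
        raise ValueError("identifier longer than 255 bytes")
    return ProductInfo(
        size=len(data),
        created=time.time() if created is None else created,
        origin=socket.gethostname() if origin is None else origin,
        feedtype=feedtype,
        ident=ident,
        signature=hashlib.md5(data).digest(),
    )
```

Because the checksum depends only on the product bytes, two hosts that ingest the same product would compute the same signature regardless of hostname or creation time.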
LDM Strengths • Robust operation in the face of network congestion and outages • Efficient • Event driven • Capable of transmitting 2 TB/d from a well-connected, upstream host • Highly user-configurable • Authentication of feed requests • Requested products • Processing of received products • Proven technology: operational since 1994
LDM Limitations • Current implementation is UNIX-only • Data-product availability is limited by the size of the product-queue on the upstream LDM • Upstream sites are fixed for the duration of an LDM session (i.e., static routing) • Performance over unreliable connections is limited by the inherent receive-timeouts of ONC RPC • Adaptability is limited by the single-threaded, multiple-process implementation
LDM-6 Upgrade • Use TCP guaranteed delivery instead of RPCs • User-selectable product “chunking” • Real-time statistics gathering and display • Code cleanup for greater efficiency
LDM-6 Performance: Model Data, the Highest-Volume Datastream [Figure panels: U. Albany, U. Illinois, U. Washington, U. Utah]
NLDM: News Server Technology to Relay Data in Near Real Time NLDM Benefits Over LDM • Flooding algorithm: • automated routing through redundancy • Virtually unlimited number of hierarchically structured newsgroups: • finer granularity of product categorization and subscription • Cross posting: • Multiple “views” of same data products • Multiple numbers and types of storage buffers • Dynamic construction and destruction of connections • Backlog handling • Protocol supports both push and pull transmission
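"Automated routing through redundancy" can be illustrated with a toy flooding model: every node forwards a new product to all of its peers, and peers that have already seen it drop the copy. This is a minimal sketch, not the news-server implementation; the function and the node names (borrowed loosely from the Boulder to D.C. routing slide) are hypothetical.

```python
from collections import deque
from typing import Dict, Set

def flood(links: Dict[str, Set[str]], source: str) -> Dict[str, int]:
    """Return the hop count at which each node first receives the product."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for peer in sorted(links.get(node, ())):
            if peer not in hops:      # duplicates are dropped, not re-flooded
                hops[peer] = hops[node] + 1
                queue.append(peer)
    return hops

# Redundant topology: a direct link plus two indirect paths.
with_direct = {
    "Boulder": {"DC", "Oregon1", "Oregon2"},
    "Oregon1": {"DC"},
    "Oregon2": {"DC"},
}
# Same topology with the direct Boulder-to-DC link removed.
without_direct = {
    "Boulder": {"Oregon1", "Oregon2"},
    "Oregon1": {"DC"},
    "Oregon2": {"DC"},
}
```

In this model the product still reaches "DC" when the direct link is gone, just one hop later, with no routing table to reconfigure.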
NLDM Results • INN used “out of the box” with one modification: • Message ID generation and handling modified to use product signatures • Default approach is to create message ID based on host name • Doesn't allow duplicate detection based on product content • Robust delivery with latencies comparable to current LDM • May be improved further with additional tuning • Local management functionality comparable to current LDM • Robust automated, dynamic routing • Automated connection management
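The message-ID change described above, keying the ID on the product signature rather than on the host name, can be sketched as follows. This is an illustration under stated assumptions, not the modified INN code: the ID format, the `nldm.example` domain, and the `Relay` class are hypothetical, and MD5 is assumed as the signature because the LDM slide lists an MD5 checksum.

```python
import hashlib

def content_message_id(data: bytes, domain: str = "nldm.example") -> str:
    """Derive a message ID from the product content, not from the sender."""
    return f"<{hashlib.md5(data).hexdigest()}@{domain}>"

class Relay:
    """Toy relay that accepts each distinct product only once."""

    def __init__(self) -> None:
        self.seen = set()

    def offer(self, data: bytes) -> bool:
        mid = content_message_id(data)
        if mid in self.seen:
            return False   # duplicate content, even if offered by another host
        self.seen.add(mid)
        return True
```

With a host-name-based ID, the same product arriving over two paths would carry two different IDs and both copies would be accepted; a content-based ID makes the second copy recognizable as a duplicate.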
NLDM Routing Statistics: CONDUIT, Boulder to D.C. [Plots: Average Latencies and Maximum Latencies; two-day window, 30-second bin size. Paths shown: direct path Boulder → D.C.; Boulder → U.Oregon.1 → D.C.; Boulder → U.Oregon.2 → D.C. Annotation: Isabel hit Washington, D.C.]
MeteoForum [Map: MeteoForum Network, with AMPATH and CLARA links and sites at Bridgetown, San Jose, Caracas, Belem, Rio de Janeiro, Buenos Aires] MeteoForum is a joint project between Unidata and COMET, two programs in UCAR
IDD-Brazil Connects to Unidata IDD • Current Participants • Unidata • University of Miami • UFRJ (Brazil) • UFPA (Brazil) • CPTEC/INPE (Brazil) • USP (Brazil) • Infrastructure • Internet2 • AMPATH (NSF) • FIU • Global Crossing • RNP • ANSP
IDD-Brazil - Stress Testing IDD relay of all open feeds to UFRJ • Sustained 1.6 GB/hr over 10 day period at end of December • Peak rates routinely over 2.3 GB/hr over same period • Typical latencies range from 1 to several seconds • Impact on relay machine negligible
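To put the stress-test figures in network terms, the hourly volumes can be converted to average bit rates. The helper names are hypothetical, and the conversion assumes decimal gigabytes (1 GB = 10^9 bytes), which the slide does not state.

```python
def gb_per_hour_to_mbit_per_s(gb_per_hour: float) -> float:
    """Convert a volume in GB/hr to an average rate in Mbit/s (decimal units)."""
    bits_per_hour = gb_per_hour * 1e9 * 8
    return bits_per_hour / 3600 / 1e6

def total_gb(gb_per_hour: float, days: float) -> float:
    """Total volume moved at a sustained hourly rate over a number of days."""
    return gb_per_hour * 24 * days

sustained = gb_per_hour_to_mbit_per_s(1.6)   # sustained rate from the slide
peak = gb_per_hour_to_mbit_per_s(2.3)        # peak rate from the slide
volume = total_gb(1.6, 10)                   # 10-day test period
```

Under that assumption, the sustained rate works out to about 3.6 Mbit/s, the peak to about 5.1 Mbit/s, and the 10-day total to roughly 384 GB.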
Shaping the Future of Data Use in the Geosciences: We are moving from an era of data provision towards one in which data services and related web services are important, and in which multidisciplinary integration and synthesis are emphasized.
THREDDS Middleware Thematic Real-time Environmental Distributed Data Servers (THREDDS) • To make it possible to publish, locate, analyze, visualize, and integrate a variety of environmental data • Combines IDD “push” with several forms of “pull” and DL discovery • About 25 data providers are partners in THREDDS • Connecting People with Documents and Data