LEAD: An Overview for the Unidata Users Committee 7 October 2004
Motivation for LEAD • Each year, mesoscale weather – floods, tornadoes, hail, strong winds, lightning, and winter storms – causes hundreds of deaths, routinely disrupts transportation and commerce, and results in annual economic losses > $13B.
The Roadblock • The study of events responsible for these losses is stifled by rigid information technology frameworks that cannot accommodate: • the real-time, on-demand, and dynamically adaptive needs of mesoscale weather research; • its disparate, high-volume data sets and streams; and • its tremendous computational demands, which are among the greatest in all areas of science and engineering.
The LEAD Goal Provide the IT necessary to allow People (scientists, students, operational practitioners) and Technologies (models, sensors, data mining) TO INTERACT WITH WEATHER
Traditional Methodology • [Diagram: static observations (radar data, mobile mesonets, surface observations, upper-air balloons, commercial aircraft, geostationary and polar-orbiting satellites, wind profilers, GPS satellites) feed analysis/assimilation (quality control, retrieval of unobserved quantities, creation of gridded fields), which feeds prediction/detection on PCs to teraflop systems, which feeds product generation, display, and dissemination to end users (NWS, private companies, students).] • The process is entirely serial and static (pre-scheduled): no response to the weather!
The Desired Approach: Dynamic Adaptivity • [Diagram: nested forecast grids spawned adaptively, from 20 km CONUS ensembles down to 10 km, 3 km, and 1 km domains.]
The Limitations of Today’s Research Environment: Example #2 • Applied Modeling Inc. (Vietnam) MM5 • Atmospheric and Environmental Research MM5 • Colorado State University RAMS • Florida Division of Forestry MM5 • Geophysical Institute of Peru MM5 • Hong Kong University of Science and Technology MM5 • IMTA/SMN, Mexico MM5 • India's NCMRWF MM5 • Iowa State University MM5 • Jackson State University MM5 • Korea Meteorological Administration MM5 • Maui High Performance Computing Center MM5 • MESO, Inc. MM5 • Mexico / CCA-UNAM MM5 • NASA/MSFC Global Hydrology and Climate Center, Huntsville, AL MM5 • National Observatory of Athens MM5 • Naval Postgraduate School MM5 • Naval Research Laboratory COAMPS • National Taiwan Normal University MM5 • NOAA Air Resources Laboratory RAMS • NOAA Forecast Systems Laboratory LAPS, MM5, RAMS • NCAR/MMM MM5 • North Carolina State University MASS • Environmental Modeling Center of MCNC MM5 • NSSL MM5 • NWS-BGM MM5 • NWS-BUF (COMET) MM5 • NWS-CTP (Penn State) MM5 • NWS-LBB RAMS • Ohio State University MM5 • Penn State University MM5 • Penn State University MM5 Tropical Prediction System • RED IBERICA MM5 (consortium of Iberian modelers) MM5 • Saint Louis University MASS • State University of New York - Stony Brook MM5 • Taiwan Civil Aeronautics Administration MM5 • Texas A&M University MM5 • Technical University of Madrid MM5 • United States Air Force, Air Force Weather Agency MM5 • University of L'Aquila MM5 • University of Alaska MM5 • University of Arizona / NWS-TUS MM5 • University of British Columbia UW-NMS/MC2 • University of California, Santa Barbara MM5 • Universidad de Chile, Department of Geophysics MM5 • University of Hawaii MM5 • University of Hawaii RSM • University of Illinois MM5, workstation Eta, RSM, and WRF • University of Maryland MM5 • University of Northern Iowa Eta • University of Oklahoma/CAPS ARPS • University of Utah MM5 • University of Washington MM5 36 km, 12 km, 4 km • University of Wisconsin-Madison UW-NMS • University of Wisconsin-Madison MM5 • University of Wisconsin-Milwaukee MM5 • Mesoscale forecast models are being run by universities, in real time, at dozens of sites around the country, often in collaboration with local NWS offices • Tremendous value • Leading to the notion of “distributed” NWP • Yet only a few (OU, U of Wash, Utah) are actually assimilating local observations – which is one of the fundamental reasons for such models!
The LEAD Vision: No Longer Serial or Static • [Diagram: the starting point is the same serial pipeline, with static observations feeding analysis/assimilation, prediction/detection, product generation/display/dissemination, and end users (NWS, private companies, students).]
The LEAD Vision: No Longer Serial or Static • [Diagram: the pipeline redrawn around dynamic observations, with analysis/assimilation, prediction/detection, product generation/display/dissemination, and end users no longer arranged in a fixed serial chain.]
LEAD: Users INTERACTING with Weather (Interaction Level I) • [Diagram: users reach mesoscale weather through the portal, MyLEAD, and virtual/digital resources and services (ADAS, ADaM, tools), drawing on NWS national static observations and grids, local observations, local physical resources, and remote physical (Grid) resources.]
LEAD: Users INTERACTING with Weather (Interaction Level II) • [Diagram: the same architecture, now including experimental dynamic observations alongside the NWS national static observations and grids and the local observations.]
The LEAD Goal • To create an integrated, scalable framework that allows analysis tools, forecast models, and data repositories to be used as dynamically adaptive, on-demand systems that can • change configuration rapidly and automatically in response to weather; • continually be steered by new data (i.e., the weather); • respond to decision-driven inputs from users; • initiate other processes automatically; • steer remote observing technologies to optimize data collection for the problem at hand; and • operate independently of data formats and the physical location of data or computing resources.
The LEAD Foundation • WOORDS: Workflow Orchestration for On-demand, Real-time, Dynamically-Adaptive Systems
And This Means… • Workflow Orchestration – The automation of a process, in whole or part, during which tasks or information are passed from one or more components of a system to others, for specific action, according to a set of procedural rules. • On-Demand – The capability to perform an action immediately, with or without prior planning or notification. • Real-Time – The transmission or receipt of information about an event nearly simultaneously with its occurrence, or the processing of data immediately upon receipt or request. • Dynamically-Adaptive – The ability of a system, or any of its components, to respond automatically, in a coordinated manner, to both internal and external influences in a manner that optimizes overall system performance (illustrated in the sketch below). • System – A group of independent but interrelated components that operate in a unified, holistic manner.
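The sketch below is a minimal, hypothetical illustration of these definitions in Python: a stand-in observation stream is monitored in real time, and an on-demand forecast is launched automatically when a detection threshold is exceeded. The names observation_stream, REFLECTIVITY_THRESHOLD_DBZ, and launch_forecast are assumptions for illustration only, not part of the LEAD software.

```python
# Minimal sketch of on-demand, dynamically adaptive behavior.
# All names (observation_stream, REFLECTIVITY_THRESHOLD_DBZ,
# launch_forecast) are hypothetical illustrations, not LEAD APIs.
import time
from typing import Iterator

REFLECTIVITY_THRESHOLD_DBZ = 50.0  # assumed trigger criterion

def observation_stream() -> Iterator[float]:
    """Stand-in for a real-time radar reflectivity feed."""
    sample_values = [12.0, 33.5, 48.0, 56.5, 41.0]
    for value in sample_values:
        yield value
        time.sleep(0.1)  # pretend data arrives continuously

def launch_forecast(trigger_value: float) -> None:
    """Stand-in for starting an on-demand forecast workflow."""
    print(f"launching forecast: {trigger_value} dBZ exceeded threshold")

def monitor() -> None:
    # Real-time: act on each observation as it arrives.
    for value in observation_stream():
        # Dynamically adaptive: the system reconfigures itself (here,
        # by launching a new forecast) in response to the weather.
        if value >= REFLECTIVITY_THRESHOLD_DBZ:
            launch_forecast(value)

if __name__ == "__main__":
    monitor()
```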
LEAD is a Unique “Poster Child” Because it is End-to-End and Contains Just About Everything in Cyberinfrastructure • Collection of data by remote sensors • Analysis and prediction of physical phenomena • Huge data sets and streaming data • Visualization • On-demand, real time, dynamic adaptability • Resource prediction and scheduling • Fault tolerance • Remote and local resource usage • Interoperability • Grid and Web services • Personal virtual spaces • Education • An extremely broad user base (students, researchers, operational practitioners) that is in place • A long-standing mechanism for deployment (Unidata)
LEAD Grid and Web Services Testbeds • There will be five testbed sites: • Unidata • University of Oklahoma • Indiana University • University of Alabama in Huntsville • NCSA/University of Illinois • They will be using the Grid framework • Initially, it will be a nearly homogeneous environment, with every site running the same software stack
Grid and Web Services Test Beds • [Architecture diagram: a user sub-system (portal with geo-reference GUI, MyLEAD workspace, task design, workflow GUI, and IDV), an orchestration sub-system (workflow engine with allocation and scheduling, monitoring, estimation, and user-specified detection algorithms), a tools sub-system (ADAS, ADaM, WRF), and a data sub-system (personal catalogs, THREDDS catalogs, storage, and semantics and interchange technologies), all running over servers and live feeds, controllable devices, and local and Grid resources and services.]
The Grid • Refers to an infrastructure that enables the integrated, collaborative use of computers, networks, databases, and scientific instruments owned and managed by distributed organizations. • The terminology originates from a crude analogy to the electrical power grid: most users do not care about the details of power generation and distribution; their appliances simply work when plugged into the socket. • Grid applications often involve large amounts of data and/or computing and require secure resource sharing across organizational boundaries. • Grid services are essentially web services running in a Grid framework.
LEAD CS/IT Research • Workflow orchestration – the construction and scheduling of execution task graphs with data sources drawn from real-time sensor streams and outputs (a minimal sketch follows this list). • Data streaming – to support robust, high-bandwidth transmission of multi-sensor data. • Distributed monitoring and performance evaluation – to enable soft real-time performance guarantees by estimating resource behavior. • Data management – for storage and cataloging of observational data, model output, and results from data mining. • Data mining tools – that detect faults, allow incremental processing (interrupt/resume), and estimate run time and memory requirements based on properties of the data (e.g., number of samples, dimensionality). • Semantic and data interchange technologies – to enable use of heterogeneous data by diverse tools and applications.
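As a hedged illustration of the workflow-orchestration item above, here is a minimal execution task graph in Python. The task names (decode, assimilate, forecast, mine) and their dependencies are assumptions that mirror the general LEAD pipeline; this is not the project's actual orchestration engine, which must also schedule tasks onto Grid resources and react to streaming inputs.

```python
# Minimal sketch of a workflow as an execution task graph.
# Task names and dependencies are illustrative, not LEAD's real engine.
from graphlib import TopologicalSorter  # Python 3.9+

def decode():      print("decode: unpack incoming radar/surface data")
def assimilate():  print("assimilate: run analysis on decoded data (ADAS-like step)")
def forecast():    print("forecast: run the prediction model (WRF-like step)")
def mine():        print("mine: detect hazardous features in the forecast output")

# Each task maps to the set of tasks that must finish before it runs.
task_graph = {
    "decode": set(),
    "assimilate": {"decode"},
    "forecast": {"assimilate"},
    "mine": {"forecast"},
}

tasks = {"decode": decode, "assimilate": assimilate,
         "forecast": forecast, "mine": mine}

# Execute tasks in dependency order; a real orchestrator would also
# allocate resources, monitor progress, and restart failed tasks.
for name in TopologicalSorter(task_graph).static_order():
    tasks[name]()
```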
LEAD Meteorology Research • ARPS Data Assimilation System (ADAS) for the WRF model – adaptation of the CAPS ADAS to the WRF model to allow users to assimilate a wide variety of observations in real time, especially those obtained locally • Orchestration system for the WRF model – to allow users to manage flows of data, model execution streams, creation and mining of output, and linkages to other software and processes for continuous or on-demand application, including steering of remote observing systems • Fault tolerance in the WRF model for on-demand, interrupt-driven utilization – to accommodate interrupts in streaming data and user execution commands • Continuous model updating – to allow numerical models to be steered continually by observations and thus be dynamically responsive to them • Hazardous weather detection – to identify hazardous features in gridded forecasts and assimilated data sets, using data mining technologies, for comparison with sensor-only approaches • Storm-scale ensemble forecasting – to create multiple, concurrently valid forecasts from slightly different initial conditions, from different models, or by using different options within the same or multiple models.
LEAD Design Features • Entirely web- and web service-based; requires only a browser and Java Web Start • Minimal local resources needed for significant functionality (to empower grades 6-12 and reach underprivileged areas) • Highly intuitive functionality with separate portal interfaces for different classes of users • Grid toolkit for security, authentication, job management, resource allocation, replication, etc. • Maximum utilization of existing capabilities (OPeNDAP, THREDDS, Globus, DLESE) • Transparent access to all requisite resources (data, tools, computing, visualization) • Minimum-depth accessibility (fewest possible mouse clicks) • Backward software compatibility • Scalable to large numbers of users • User extensibility • Ability to use each service in a stand-alone manner, outside of the orchestration and portal infrastructures
The 5 Canonical Problems • #1. Create a 10-year detailed climatology of thunderstorm characteristics across the U.S. using historical and streaming NEXRAD radar data. This could be expanded to a fine-scale hourly re-analysis using ADAS. • #2. Run a broad parameter suite of convective storm simulations to relate storm characteristics to the environments in which they form/move • #3. Produce high-resolution nested WRF forecasts that respond dynamically to prevailing and predicted weather conditions and compare with single static forecasts • #4. Dynamically re-task a Doppler radar to optimally sense atmospheric targets based upon a continuous interrogation of streaming data • #5. Produce weather analyses and ensemble forecasts on demand – in response to the evolving weather and to the forecasts themselves
LEAD Technology Roadmap • [Roadmap chart: technology and capability grow across Years 1-5, with Generation 1 (static workflow) spanning the full project, Generation 2 (dynamic workflow) entering in the middle years, and Generation 3 (adaptive sensing) in the later years, each new generation preceded by look-ahead research.]
In LEAD, Everything is a Web Service • Finite number of services – they’re the “low-level” elements but consist of lots of hidden pieces…services within services. • [Diagram: Service A (ADAS), Service B (WRF), Service C (NEXRAD Stream), Service D (MyLEAD), Service E (VO Catalog), Service F (IDV), Service G (Monitoring), Service H (Scheduling), Service I (ESML), Service J (Repository), Service K (Ontology), Service L (Decoder), and many others.]
Web Services • Self-contained, self-describing, modular applications that can be published, located, and invoked across the Web. • XML-based web services are emerging as tools for creating next-generation distributed systems that facilitate program-to-program interaction without requiring user-to-program interaction. • Because they treat heterogeneity as a fundamental ingredient, web services are independent of platform and environment; they can be packaged and published on the Internet and communicate with other systems using common protocols. • Emerging web services standards such as SOAP, WSDL, and UDDI are enabling much easier system-to-system integration (a minimal invocation sketch follows).
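For readers unfamiliar with SOAP, the sketch below shows the general shape of invoking a SOAP web service by hand in Python. The endpoint URL, namespace, operation name (RunForecast), and parameters are hypothetical; a real client would typically be generated from the service's WSDL description rather than hand-built.

```python
# Minimal, hand-rolled sketch of invoking a SOAP web service.
# The endpoint, namespace, and RunForecast operation are hypothetical.
import urllib.request

ENDPOINT = "http://example.org/lead/ForecastService"   # assumed endpoint
SOAP_ACTION = "http://example.org/lead/RunForecast"    # assumed SOAPAction

envelope = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <RunForecast xmlns="http://example.org/lead">
      <domain>CONUS</domain>
      <resolutionKm>3</resolutionKm>
    </RunForecast>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": SOAP_ACTION,
    },
)

# This call would only succeed against a real service; shown for shape only.
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))
```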
Decoder & Data Mover Service • [Diagram: the service receives a source URL and a destination URL.] • When it receives the two messages (source and destination URL), it decodes the file and invokes GridFTP to move it. • Note: each message also identifies the user and the experiment name. • When the move is complete, it sends a message indicating that it is done and where to find the file. • It also sends a “notification event” with the same information (explained in a later slide). • A minimal sketch of this behavior follows.
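A minimal sketch of the behavior described above, in Python. The request structure, decode_file, and notify functions are illustrative assumptions; the transfer step is shown by shelling out to globus-url-copy, the GridFTP command-line client, rather than the service's actual GridFTP integration.

```python
# Minimal sketch of the decoder & data mover behavior described above.
# Message format, decode_file(), and notify() are illustrative assumptions.
import subprocess
from dataclasses import dataclass

@dataclass
class TransferRequest:
    user: str
    experiment: str
    source_url: str        # where the encoded file lives
    destination_url: str   # where the decoded file should land

def decode_file(source_url: str) -> str:
    """Stand-in for decoding the raw product; returns a local scratch path."""
    local_path = "/tmp/decoded_product"  # hypothetical scratch location
    print(f"decoding {source_url} -> {local_path}")
    return local_path

def notify(user: str, experiment: str, location: str) -> None:
    """Stand-in for the 'done' message and the notification event."""
    print(f"notify {user}/{experiment}: file available at {location}")

def handle(request: TransferRequest) -> None:
    local_path = decode_file(request.source_url)
    # Move the decoded file with GridFTP (globus-url-copy is the
    # GridFTP command-line client shipped with the Globus Toolkit).
    subprocess.run(
        ["globus-url-copy", f"file://{local_path}", request.destination_url],
        check=True,
    )
    notify(request.user, request.experiment, request.destination_url)
```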
Start by Building Simple Prototypes to Establish the Services/Other Capabilities… • [Diagram: a prototype (Prototype Z) assembled from Service A (ADAS), Service D (MyLEAD), Service E (VO Catalog), Service F (IDV), and Service L (Decoder).]
Solve General Problems by Linking Services Together in Workflows • [Diagram: a workflow linking Service C (NEXRAD Stream), Service L (Decoder), Service A (ADAS), Service B (WRF), Service L (Mining), Service J (Repository), and Service D (MyLEAD).] • Note that these services can be used as stand-alone capabilities, independent of the LEAD infrastructure (e.g., the portal). A sketch of such a chain follows.
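A hedged sketch of such a chain in Python, with each service represented by a plain function standing in for a remote web-service call. The function names follow the diagram; none of this is the actual LEAD orchestration code.

```python
# Minimal sketch of composing services into a workflow by chaining calls.
# Each function stands in for a remote web-service invocation.
def nexrad_stream() -> list:
    """Service C: deliver a batch of raw radar volumes (stand-in data)."""
    return ["raw_volume_1", "raw_volume_2"]

def decoder(raw_volumes: list) -> list:
    """Decoder service: turn raw volumes into usable data."""
    return [f"decoded({v})" for v in raw_volumes]

def adas(decoded: list) -> str:
    """Service A: assimilate decoded observations into an analysis."""
    return f"analysis_from_{len(decoded)}_volumes"

def wrf(analysis: str) -> str:
    """Service B: run a forecast from the analysis."""
    return f"forecast_based_on_{analysis}"

def mining(forecast: str) -> list:
    """Mining service: detect features of interest in the forecast."""
    return [f"detected_storm_in_{forecast}"]

def repository(products: list) -> None:
    """Service J: store products; MyLEAD (Service D) would catalog them."""
    for product in products:
        print(f"stored: {product}")

# The workflow is simply the composition of the services.
repository(mining(wrf(adas(decoder(nexrad_stream())))))
```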
LEAD Prototype 4 • Employ components of WRF prediction as a series of linked web services in a Grid environment. • [Diagram, on the Grid and Web Services Test Beds (GWSTBs): IDD data stream → decoders → ADAS → ADAS output → WRF model → WRF output → IDV/NCL and ADaM.]
The LEAD E&O Goal • To scale, integrate, and make extensible the new opportunities and environments for teaching and learning created by a Web Services framework that brings data accessibility, sharing, analysis, and visualization tools to end users at different educational levels, through the development of: • LEAD LEARNING COMMUNITIES (LLC) • LEAD-TO-LEARN Modules • Evaluation and assessment rubrics • Outreach activities
LEAD-TO-LEARN Modules • Using technology tools in collecting, processing, analyzing, evaluating, visualizing, and interpreting data • Using technology tools to enhance learning, increase productivity, and promote creativity • Conducting scientific investigations • Predicting and explaining using evidence • Understanding the 1) concept of models as a method of representing processes; 2) application of scientific and technological concepts; 3) use of models in prediction • Recognizing and analyzing alternative explanations and models • Identifying questions and concepts that guide scientific investigations • Employing technology to improve investigations and develop problem-solving strategies • Evaluating and selecting new information resources based on specific tasks • Communicating and defending a scientific argument • Conducting outcomes assessment
LEAD LEARNING COMMUNITIES • Pre-College • Undergraduate • Graduate • Meteorology/Computer Science Research
This Can Only Be Achieved With Broad Deployment and Sustainability • LEAD’s audience: higher education, operations research, grades 6-12 • LEAD will be integrated into dozens of universities and operational research centers via the UCAR Unidata Program, which includes • 150 organizations • 21,000 university students • 1,800 faculty • hundreds of operational practitioners • Unidata will serve as the focal point for community-wide deployment, updates, training, and integration • LEAD will accelerate the transfer of WRF-based research results into operations via its links with the • NOAA Forecast Systems Laboratory • National Center for Atmospheric Research Developmental Testbed Center • LEAD will draw in traditionally underrepresented groups via its education programs (Howard University is a major player in WRF via a NOAA center)
Prototype Ia Demo • Now the dreaded demo • Keep your fingers crossed!!!!
Prototype 1a • [Diagram, on the GWSTBs: IDD Eta grids → decoder → IDV, with both automatic and user-driven modes.]
LEAD Contact Information • LEAD PI: Prof. Kelvin Droegemeier, kkd@ou.edu • Project Coordinator: Terri Leyton, tleyton@ou.edu • LEAD/UCAR PI: Mohan Ramamurthy • Other LEAD UPC staff: Doug Lindholm, Tom Baltzer, Brian Kelly, Ben Domenico, Anne Wilson, and Don Murray • http://lead.ou.edu/