A Quick Tour of LEAD for the VGrADS
Dennis Gannon, Beth Plale and Suresh Marru
Department of Computer Science, School of Informatics, Indiana University
(Lavanya and Dan have seen this many times)
A Science Driven Grid: LEAD
• A Grid designed to change the paradigm for mesoscale weather prediction
• Building dynamic, adaptive workflows driven by data streams, under better-than-real-time execution constraints
Traditional Methodology
[Diagram: a fixed pipeline from static observations to end users]
• Static observations: radar data, mobile mesonets, surface observations, upper-air balloons, commercial aircraft, geostationary and polar-orbiting satellites, wind profilers, GPS satellites
• Analysis/assimilation (PCs to teraflop systems): quality control, retrieval of unobserved quantities, creation of gridded fields
• Prediction/detection, then product generation, display, and dissemination
• End users: NWS, private companies, students
The Process is Entirely Serial and Static (Pre-Scheduled): No Response to the Weather!
A Major Paradigm Shift: CASA NETRAD adaptive Doppler radars.
The LEAD Vision: Adaptive Cyberinfrastructure
[Diagram: the same pipeline, now closed-loop, driven by dynamic observations]
• Dynamic observations feed analysis/assimilation (quality control, retrieval of unobserved quantities, creation of gridded fields) on PCs to teraflop systems
• Prediction/detection feeds product generation, display, and dissemination to end users (NWS, private companies, students)
• Models and algorithms drive the sensors, closing the loop
The CS challenge: build cyberinfrastructure services that provide adaptability, scalability, availability, usability, and real-time response.
Change the Paradigm
• To make fundamental advances we need:
  • Adaptivity in the computational model
  • But also cyberinfrastructure to:
    • Execute complex scenarios in response to weather events (stream processing, triggers)
    • Close the loop with the instruments
    • Acquire computational resources on demand: supercomputer-scale resources, invoked in response to weather events
    • Deal with the data deluge: users can no longer manage their own experiment products
Reaching the LEAD Goal
• Understanding the role of data is the key
  • From streams, from mining, from simulations to visualizations
• Enabling discovery
  • The role of the experiment
  • Sharing both process and results
  • Creating an educational “context”
• An agile architecture of composable services
  • The data is distributed and the computational resources are distributed
  • Requires all the Grid attributes (distributed resource allocation, robust and scalable, secure)
  • All application components are services
• Access to all LEAD resources should be easy
Experiment as a Control/Data-Flow Graph
• Another paradigm shift, this time for the users
• Each activity a user initiates in LEAD is an Experiment:
  • Data discovery and collection
  • Applied analysis and transformation
  • A graph of activities (a workflow)
  • Curated data products and results
• Each activity is logged by an event system and stored as metadata in the user's workspace
  • This provides a complete provenance of the work (see the sketch below)
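Purely illustrative, a minimal Python sketch of the idea on this slide: an experiment as a graph of activities, each of which logs a provenance event as it runs. The class and field names are invented for the sketch and are not LEAD's actual data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Activity:
    name: str                                   # e.g. "data-discovery"
    inputs: list = field(default_factory=list)  # upstream activities

@dataclass
class Experiment:
    name: str
    activities: list = field(default_factory=list)
    provenance: list = field(default_factory=list)  # the event log

    def run_activity(self, activity):
        # In LEAD the event would go to the notification system and be
        # stored as metadata in the user's myLEAD workspace.
        self.provenance.append({
            "experiment": self.name,
            "activity": activity.name,
            "depends_on": [a.name for a in activity.inputs],
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })

discover = Activity("data-discovery")
analyze = Activity("analysis-and-transformation", inputs=[discover])
exp = Experiment("spring-forecast-01", activities=[discover, analyze])
for act in exp.activities:
    exp.run_activity(act)
print(exp.provenance)
```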
The Architecture
[Diagram: the user's desktop talks to the LEAD Grid portal server, which sits on gateway services over core Grid services and the physical resource layer]
• Gateway services: proxy certificate server/vault, MyLEAD user metadata catalog, workflow engine, application deployment, application events, resource broker, application and resource catalogs
• Core Grid services (OGSA-like layer): security services, information services, self-management, resource management, execution management, data services
• Physical resource layer
[Diagram: middleware tools and resources connecting the user's workspace to the computational layer. Legend: D = data, Q = query, M = model, A = analysis.]
• My Workspace: myWorkspace catalog, search tools, myExperiment space, authorization, experiment builder and experiment GUI, My Tools (IDV client, …), all registered with the catalog via the portal gateway
• Middleware tools and community resources: ontology service, query service, geo GUI, community resource catalog, storm event detection, community datasets, forecast models, data-mining tools, ensemble initial-condition generation, …
• Computational layer: compute servers, workflow orchestration engine, monitoring and control service; results are published back to the myWorkspace catalog
Data Discovery
• Select community data products for import to the workspace or for use in an experiment
LEAD Data Use Scenario
• Importing community data products into the user's workspace
[Diagram: THREDDS, OPeNDAP, and LDM feeds are indexed by the resource catalog; the query service, backed by the Noesis ontology service, resolves searches; selected products are staged into the myLEAD user workspace on a Grid storage repository]
• Data is (often) binary; metadata is in the LEAD schema
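A hedged sketch of the import path just described: query a community catalog, then stage the chosen product into the user's workspace. The function names, catalog fields, and URL are hypothetical stand-ins, not the real LEAD query-service interface.

```python
def query_catalog(region, product_type, date):
    # A real query would go through the LEAD query service, which resolves
    # terms via the Noesis ontology service and searches the indexed
    # resource catalog (THREDDS / OPeNDAP / LDM feeds). Canned answer here.
    return [{"id": "nexrad-KTLX-20050829",
             "url": "opendap://example.edu/nexrad/KTLX",   # invented URL
             "schema": "LEAD-metadata"}]

def import_to_workspace(user, product):
    # The binary data goes to grid storage; the LEAD-schema metadata is
    # registered in the user's myLEAD workspace catalog.
    print(f"staging {product['url']} for {user}; cataloging {product['id']}")

for p in query_catalog("Oklahoma", "level-II radar", "2005-08-29"):
    import_to_workspace("alice", p)
```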
User's Workspace (myLEAD)
• Metadata catalog of the user's data products
• User's storage on the LEAD Grid
• An agent actively archives data products:
  • Derived data products: data products that result from processing original raw data
  • Temporally changing data products: data that changes continuously through regular additions streamed into the archive
  • Ad hoc actions taken by content creators, or
  • In conjunction with workflow processes
• Approach: a general, reusable data model; open-source database (MySQL); standardized metadata schemas (XML); service-oriented architecture (SOAP, WSDL, GridFTP, X.509 certificates)
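To make the archiving concrete, here is a small sketch of the kind of record the myLEAD agent might write for a derived data product. The element names are invented for illustration; LEAD uses its own standardized XML metadata schema.

```python
import xml.etree.ElementTree as ET

# Invented element names for a derived-product record; not the LEAD schema.
record = ET.Element("dataProduct")
ET.SubElement(record, "id").text = "wrf-forecast-2005082912"
ET.SubElement(record, "derivedFrom").text = "adas-analysis-2005082906"
ET.SubElement(record, "owner").text = "alice"
ET.SubElement(record, "storageUrl").text = (
    "gridftp://example.edu/lead/alice/wrf-2005082912")  # invented URL

# In myLEAD the record would land in the catalog's relational store
# (MySQL), with transfers over GridFTP under X.509 credentials.
print(ET.tostring(record, encoding="unicode"))
```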
Workflows: Execution of Complex Experiments
LEAD requires the ability to construct workflows that are:
• Data driven
  • Weather data streams define the nature of the computation
• Persistent and agile
  • Data mining of a data stream detects an “interesting” feature; the event triggers a workflow scenario that has been waiting for months (a sketch follows below)
• Adaptive
  • In response to the weather: as the weather changes, the nature of the workflow may have to change on the fly
• Resources
  • More may be needed; sometimes they become unavailable
  • Workflows need to be self-aware
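The persistent-and-agile pattern above, sketched in Python with invented names: a mining function scores each scan in a stream, and crossing a threshold fires the long-dormant workflow.

```python
import random

def mesocyclone_score(radar_scan):
    # Stand-in for a real data-mining algorithm over Doppler radar data.
    return random.random()

def launch_forecast_workflow(region):
    print(f"triggering on-demand WRF forecast workflow over {region}")

THRESHOLD = 0.95
stream = ({"region": "central Oklahoma"} for _ in range(1000))  # stand-in stream
for scan in stream:
    if mesocyclone_score(scan) > THRESHOLD:
        launch_forecast_workflow(scan["region"])
        break  # a real workflow could keep listening, adapt, or re-trigger
```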
The LEAD Application Codes
• All community codes
  • Data transformers, data miners, converters, data assimilators, forecast codes
  • Fortran, C and Java
• We don't mess with them beyond instrumentation Dan's group may insert
• We don't have good profiles for how they behave … yet
• The big one: WRF, the Weather Research and Forecasting code, is the standard
  • Over 3000 known versions exist … all incompatible
  • Atmospheric scientists like to play with it
  • Hmmm. What happens if I change this do loop?
Application Factory and Application Services
• Workflows are built by composing web services
• Fortran applications are “wrapped” by an Application Factory, which generates a web service for the application
  • Instances of the service are created dynamically using Globus
  • The service's WSDL is registered with a registry
• Each service generates a stream of notifications that log the service's actions back to the MyLEAD experiment
[Diagram: the factory creates an App Service instance, which runs the program and publishes events]
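Conceptually, the factory turns a command-line application into a callable operation that stages inputs, runs the program, and publishes lifecycle events. The real factory generates a WSDL-described web service via Globus; the Python below is a simplified, invented stand-in.

```python
import subprocess

def make_app_service(app_name, executable, publish):
    """Wrap a command-line application as a callable 'service' operation."""
    def invoke(input_files, args):
        publish(f"{app_name}: request received")
        publish(f"{app_name}: staging inputs {input_files}")
        result = subprocess.run([executable, *args], capture_output=True)
        publish(f"{app_name}: finished with exit code {result.returncode}")
        return result.returncode
    return invoke

# Hypothetical usage; the executable path is illustrative only:
# wrf_service = make_app_service("WRF", "/apps/wrf/wrf.exe", publish=print)
# wrf_service(["wrfinput_d01"], ["-np", "64"])
```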
Service Monitoring via Events
• The service output is a stream of events:
  • I am running your request
  • I have started to ftp your input files
  • I have all the files
  • I am running your application
  • The application is finished
  • I am moving the output to your file space
  • I am done
• These are generated automatically by the service using a distributed event system (WS-Eventing / WS-Notification)
• A topic-based pub-sub system with a well-known “channel”
[Diagram: the Application Service instance publishes on topic x to the notification channel; a listener subscribes to topic x]
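A toy, in-process version of the topic-based channel on this slide; LEAD actually uses WS-Eventing / WS-Notification over a well-known distributed channel, so everything here is illustrative.

```python
from collections import defaultdict

class NotificationChannel:
    """Minimal topic-based pub-sub: topic -> list of subscriber callbacks."""
    def __init__(self):
        self.listeners = defaultdict(list)

    def subscribe(self, topic, callback):
        self.listeners[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.listeners[topic]:
            callback(message)

channel = NotificationChannel()
# A listener (e.g. the myLEAD agent) subscribes to the experiment's topic.
channel.subscribe("experiment-42", lambda m: print("logged:", m))
for step in ["running your request", "ftp of input files started",
             "all input files present", "running your application",
             "application finished", "moving output to your file space",
             "done"]:
    channel.publish("experiment-42", step)
```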
Creating structure in the user's archive that models their investigation steps
[Diagram: a 12-hour investigation timeline — gather data products; run a 12-hour forecast (6 hours to complete); analyze results; based on the analysis, gather other products; run a 6-hour forecast (3 hours to complete); analyze results. The workflows send product requests, product registrations, and notification messages through the notification and decoder services to the myLEAD agent, which records them in the myLEAD server.]
The Workflow Composer
• The user designs the graph; a compiler then generates GBPEL
Workflow Applied to Katrina
[Images: a 2D image of sea level generated by the ARPS Plotting Service, and a 3D image generated by IDV]
Resource Scheduling in LEAD
• What we tell people:
  • “Huh? Oh, VGrADS is doing that. So that is off our plate.”
LEAD Static Workflow
[Diagram: the static forecast pipeline from data ingest through initialization to forecast and visualization]
• Static data: terrain data files → terrain preprocessor; surface and terrestrial data files → WRF static preprocessor
• Real-time data: NAM, RUC, GFS data → 3D model data interpolators (initial and lateral boundary conditions); surface, upper-air mesonet, and wind-profiler data → ADAS; radar data (level II) → 88D radar remapper; radar data (level III) → NIDS radar remapper; satellite data → satellite data remapper
• Initialization: ADAS analysis → ARPS-to-WRF data interpolator → WRF
• Forecast: WRF → WRF-to-ARPS data interpolator
• Visualization: ARPS plotting program → IDV bundle
Dynamic Workflows in LEAD
[Diagram: the same pipeline, with its stages numbered 1–13 and a data-mining trigger added]
• Run once per forecast region: terrain preprocessor, WRF static preprocessor
• Repeated periodically for new data: the radar, satellite, and 3D model data remappers and interpolators, and the ADAS analysis
• Data mining (ADAM) watches the data for a storm signature; if a storm is detected, it triggers additional WRF forecast runs
• Visualization (ARPS plotting program, IDV bundle) runs on the user's request
Where Does LEAD Need VGrADS?
• The static-case workflows:
  • We can build execution-time models for each of the major workflow components
    • An outgrowth of the RENCI team's work
  • We can convert this into a “workflow requirements schedule”:
    • A graph of “tasks”, where each task is a service invocation of an application
    • Associated metadata: required resources (memory, number of processors) and the volume of input/output data
    • Edges are dataflow dependences, annotated with data requirements
  • A sketch of such a schedule follows below
• VGrADS can create a service contract that schedules the right resources at the right time
  • This is wide-area, deadline-driven contract negotiation
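A hedged sketch of what a workflow requirements schedule could look like as data: nodes carry resource metadata, edges carry data-volume annotations, and the whole graph carries a deadline. The field names and numbers are invented; LEAD and VGrADS would exchange an agreed document format rather than Python objects.

```python
# Tasks are service invocations of applications, annotated with the
# resources they need and an estimated run time (all values illustrative).
tasks = {
    "ADAS": {"procs": 16,  "memory_gb": 32,  "est_minutes": 20},
    "WRF":  {"procs": 256, "memory_gb": 512, "est_minutes": 120},
    "PLOT": {"procs": 1,   "memory_gb": 4,   "est_minutes": 5},
}

# Edges are dataflow dependences, annotated with data requirements.
edges = [
    ("ADAS", "WRF",  {"data_gb": 10}),  # analysis feeds the forecast
    ("WRF",  "PLOT", {"data_gb": 50}),  # forecast feeds visualization
]

deadline_minutes = 180  # the better-than-real-time constraint
```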
What Would Our Ideal Outcome Be?
• A contract-negotiator service
  • We pass it a Workflow Requirements Schedule document
  • The negotiator returns a contract with details of the form:
    • “You have cluster A for task x at 6:30 for 20 minutes; then you have clusters B and C and data archive V for 30 minutes at 7:00 …”
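In the same spirit, a stub of the hoped-for negotiator interaction: submit a requirements schedule, get back a timed resource contract. The service, reply fields, and schedule format are all hypothetical.

```python
def negotiate(requirements_schedule):
    # A real negotiator would perform wide-area, deadline-driven matching
    # against available resources; this stub returns a canned contract
    # shaped like the example on the slide.
    return {
        "granted": True,
        "slots": [
            {"task": "x", "resource": "cluster-A",
             "start": "06:30", "minutes": 20},
            {"task": "y", "resource": ["cluster-B", "cluster-C", "archive-V"],
             "start": "07:00", "minutes": 30},
        ],
    }

contract = negotiate({"deadline_minutes": 180})  # schedule format invented
for slot in contract["slots"]:
    print(slot)
```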
Issues
• Time to completion is critical
• If the contract can't be satisfied, then perhaps a reduced request can be made
  • “OK, I don't really need 400,000 processors. How about 400?” (see the retry sketch below)
• The dynamic case is harder
  • Do I build a contract based on a worst-case storm scenario, or just renegotiate frequently?
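One way the reduced-request idea might look, again with invented names and behavior: retry the negotiation with progressively smaller demands until a contract is granted.

```python
def try_negotiate(procs, deadline_minutes):
    # Stand-in for the negotiator: pretend only requests up to 1024
    # processors can be scheduled before the deadline.
    return procs <= 1024

for procs in (400_000, 4_000, 400):
    if try_negotiate(procs, deadline_minutes=180):
        print(f"contract accepted with {procs} processors")
        break
else:
    print("no contract possible; degrade the forecast or reschedule")
```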