Presentation Transcript


  1. Transparent Grid Enablement of Weather Research and Forecasting
     S. Masoud Sadjadi1, Liana Fong6, Rosa M. Badia2, Javier Figueroa1,9, Javier Delgado1, Xabriel J. Collazo-Mojica8, Khalid Saleem1, Raju Rangaswami1, Shu Shimizu4, Hector A. Duran Limon5, Pat Welsh3, Sandeep Pattnaik10, Anthony Praino6, David Villegas1, Selim Kalayci1, Gargi Dasgupta7, Onyeka Ezenwoye1, Juan Carlos Martinez1, Ivan Rodero2, Shuyi Chen9, Javier Muñoz1, Diego Lopez1, Julita Corbalan2, Hugh Willoughby1, Michael McFail1, Christine Lisetti1, and Malek Adjouadi1
     1: Florida International University (FIU), Miami, Florida, USA; 2: Barcelona Supercomputing Center, Barcelona, Spain; 3: University of North Florida, Jacksonville, Florida, USA; 4: IBM Tokyo Research Laboratory, Tokyo, Japan; 5: University of Guadalajara, CUCEA, Mexico; 6: IBM T. J. Watson, NY, USA; 7: IBM IRL, India; 8: University of Puerto Rico, Mayaguez Campus, Puerto Rico; 9: University of Miami, Coral Gables, Florida, USA; 10: Florida State University, Tallahassee, Florida, USA
     Contact: sadjadi@cs.fiu.edu

  2. Outline • Motivation • Grid Enablement • Application and Scenario • System Overview • Remaining Challenges & Lessons Learned

  3. Motivation • Weather Prediction can: • Save Lives • Help Business Owners • How? • Accurate Results • Precise Location Information • What do we have? • WRF – Weather Research and Forecasting • “The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs.”

  4. Motivation (Cont.) • WRF Status • Single Machine/Cluster • Single Domain • Fine Resolution -> Resource Requirements • How to Overcome this? • Through Grid Enablement • Expected Benefits to WRF • More available resources – Different Domains • Faster results • Improved Accuracy

  5. Grid Enablement • “Grid-enabling is the practice of taking existing applications, which currently run on a single node or on a cluster of homogeneous nodes, and adapt them (either automatically or manually) so that they can be deployed over non-homogeneous computing resources connected through the Internet across multiple organizational boundaries (e.g., multiple clusters from different organizations) without major modifications to the underlying source code.” • The Grid-enablement process is successful if the resulting Grid-enabled application “performs better” than the original application. • “Performs better” can be interpreted in different ways • Improved execution time, better resource utilization, enabling collaboration, …

  6. Application and Scenario: Three-Layer Nested Domain

  7. Application and Scenario: Three-Layer Nested Domain • Nested domain resolutions: 15 km, 5 km, 1 km

  8. Application and Scenario: Three-Layer Nested Domain
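
The link between fine resolution and resource requirements can be made concrete with a rough cell count for the three resolutions on this slide. This is only a sketch: the 300 km square region, the function name, and the assumption that the same area is covered at every resolution are hypothetical choices for illustration, not values from the presentation.

```python
def cells_per_layer(side_km: float, resolution_km: float) -> int:
    """Number of horizontal grid cells needed to cover a square region."""
    per_side = round(side_km / resolution_km)
    return per_side * per_side


# Hypothetical region size; the slide only gives the three resolutions.
SIDE_KM = 300.0
RESOLUTIONS_KM = [15.0, 5.0, 1.0]

# Map each resolution to the number of cells it needs for the same area.
counts = {r: cells_per_layer(SIDE_KM, r) for r in RESOLUTIONS_KM}

for resolution, count in counts.items():
    print(f"{resolution:>4} km grid: {count} cells")
```

Going from 15 km to 1 km multiplies the horizontal cell count by 15 squared. In a nested run each finer domain usually covers a smaller area than its parent, so the real cost depends on how the nests are sized.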

  9. System Overview • Web-Based Portal • Grid Middleware (Plumbing) • Job-Flow Management • Meta-Scheduling • Profiling and Benchmarking • Development Tools and Environments • Transparent Grid Enablement (TGE) • TRAP: Static and Dynamic adaptation of programs • TRAP/BPEL, TRAP/J, TRAP.NET, etc. • GRID superscalar: Programming Paradigm for parallelizing a sequential application dynamically in a Computational Grid

  10. System Architecture Grid Middleware

  11. Web-Based Portal Screenshot Meteorologist Login Interface

  12. Web-Based Portal Screenshot Business Owners/Emergency Official’s Login Interface

  13. Grid Middleware Middleware: • “A layer between network operating systems and applications that aims to resolve heterogeneity and distribution” • Examples: CORBA, Java’s RMI and .NET. Grid Middleware: • Middleware for Grid Enablement • Examples: Globus, Legion, Condor-G, etc.

  14. Peer-to-Peer Inter-Domain Interactions [Figure: two peer domains, BSC and FIU, each with a Meteorologist, Web-Based Portal, Job-Flow Manager, Meta-Scheduler, Resource Policies, Local Schedulers, and Local Resources; the two domains communicate through peer-to-peer protocols.]

  15. Peer-to-Peer Inter-Domain Interactions [Figure: the same BSC and FIU domains as on the previous slide, annotated with numbered interaction steps 1 to 7.]

  16. Peer-to-Peer Inter-Domain Interactions [Figure: Job-Flow Managers and Meta-Schedulers at FIU, IBM, and BSC connected peer-to-peer; other labels in the figure: GCB, GCBViz, Fork, TDWB, SGE, LL/Fork, CEPBA, IBM-USA, IBM-India, BSCgrid.]
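
The peer-to-peer arrangement on these slides, where each domain's meta-scheduler can hand work to a peer domain, might be sketched as follows. Everything here is an assumption made for illustration: the `Domain` class, the `place_job` function, the "local first, then peers" rule, and the CPU counts are hypothetical and are not taken from the actual meta-scheduling protocol.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Domain:
    """Hypothetical view of one administrative domain (e.g. FIU or BSC)."""
    name: str
    free_cpus: int
    # Peers are excluded from repr/compare to avoid recursion between peers.
    peers: List["Domain"] = field(default_factory=list, repr=False, compare=False)

    def accepts(self, cpus_needed: int) -> bool:
        """Stand-in for a resource policy: accept if enough CPUs are free."""
        return self.free_cpus >= cpus_needed


def place_job(home: Domain, cpus_needed: int) -> Optional[str]:
    """Try the home domain first, then each peer; return the chosen name."""
    if home.accepts(cpus_needed):
        return home.name
    for peer in home.peers:
        if peer.accepts(cpus_needed):
            return peer.name
    return None  # no domain can currently run the job


# Illustrative CPU counts only.
fiu = Domain("FIU", free_cpus=8)
bsc = Domain("BSC", free_cpus=64)
fiu.peers.append(bsc)
bsc.peers.append(fiu)

print(place_job(fiu, 4))    # small job stays local
print(place_job(fiu, 32))   # larger job is forwarded to the peer
```

A real meta-scheduler would also weigh data transfer cost and per-domain policies; this sketch only shows the forwarding decision.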

  17. Fault-Tolerant Job-Flow Management

  18. Job Flow Management Architecture [Figure: at deployment time, a Flow Adapter turns the input job flow into an adapted job flow using Patterns; at run time, the Job Flow Manager (FM) calls the Meta-Scheduler (MS) through a Generic Proxy. Other components shown: Rule Editor, Monitor, Recovery, Policies, Correlator, Logs, FM::Notification, Proxy::Generic Invoke, MS::Job Submission, MS::Notification and Monitoring.] • Sample job flow (WS-BPEL + JSDL), to adapt: • Operation: submitJob • PartnerLink: MS_JobSubmissionService • Sample adapted job flow, after adaptation: • Operation: submitJob • PartnerLink: Proxy_JobSubmissionService • Operation: genericInvoke • PartnerLink: Proxy_GenericInvoke
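
The adaptation named on this slide redirects the `submitJob` invocation from `MS_JobSubmissionService` to `Proxy_JobSubmissionService`. A minimal sketch of that rewrite is shown below. The dictionary representation of an activity, the `PROXY_FOR` table, and the `adapt_job_flow` function are assumptions for illustration; the actual system works on WS-BPEL documents, which are not reproduced here.

```python
from typing import Dict, List

# Partner links that should be routed through a proxy (names from the slide).
PROXY_FOR = {"MS_JobSubmissionService": "Proxy_JobSubmissionService"}


def adapt_job_flow(activities: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return a copy of the flow with known partner links replaced by proxies."""
    adapted = []
    for activity in activities:
        new_activity = dict(activity)  # leave the input flow untouched
        link = activity.get("partnerLink")
        if link in PROXY_FOR:
            new_activity["partnerLink"] = PROXY_FOR[link]
        adapted.append(new_activity)
    return adapted


# Hypothetical in-memory stand-in for the invoke activity on the slide.
input_flow = [
    {"operation": "submitJob", "partnerLink": "MS_JobSubmissionService"},
]
adapted_flow = adapt_job_flow(input_flow)
print(adapted_flow)
```

Routing the call through a proxy gives the job-flow manager a single place to monitor invocations and apply recovery policies without changing the flow's logic.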

  19. The Meta-Scheduling Protocol

  20. FIU: Meta-Scheduler Internal Architecture

  21. Better Scheduling by Modeling WRF Behavior • Mathematical Modeling • An Incremental Process • An Iterative Process • Modeling WRF Behavior: Start → Profiling → Code Inspection & Modeling → Parameter Estimation • Texe = (a0 + a1 / #nodes) × (b0 + b1 / clock), where a0, a1, b0, and b1 are the estimated model parameters
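
With the clock speed held fixed, the execution-time model on this slide reduces to the form T = A + B / #nodes, which is linear in 1 / #nodes. The parameter-estimation step can then be sketched as an ordinary least-squares fit. The function names and the timings below are assumptions: the data is synthetic, generated from T = 100 + 2400 / n purely to illustrate the fit, and is not a measurement from the presentation.

```python
from typing import List, Tuple


def fit_inverse_nodes(nodes: List[int], times: List[float]) -> Tuple[float, float]:
    """Least-squares fit of T = A + B / n; returns (A, B)."""
    xs = [1.0 / n for n in nodes]  # the model is linear in x = 1 / n
    count = len(xs)
    mean_x = sum(xs) / count
    mean_t = sum(times) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    return intercept, slope


def predict(intercept: float, slope: float, n: int) -> float:
    """Predicted execution time on n nodes."""
    return intercept + slope / n


# Synthetic timings (seconds), NOT measured WRF data.
nodes = [2, 4, 8, 16]
times = [100 + 2400 / n for n in nodes]

a, b = fit_inverse_nodes(nodes, times)
print(f"A = {a:.1f}, B = {b:.1f}, predicted T(32) = {predict(a, b, 32):.1f}")
```

A scheduler could use such a fitted model to estimate how long a run would take on a candidate allocation before committing resources to it.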

  22. Results: Execution Time vs. Allocated CPUs

  23. Results: Model Validation: A Linear Model!

  24. Challenges That Remain to Be Addressed • High latency of the Internet compared to high-speed LANs • High overhead of the Grid middleware software • Risk of losing compatibility with future WRF versions • High volume of the WRF source code • Compiling WRF on unsupported platforms

  25. Lessons Learned • There is no current and complete methodology for Grid enablement • Issues in Grid-enabling cluster applications: LAN vs. WAN • WRF lacks sufficient documentation and uses old programming techniques • Mathematical Model – May Optimize Speedup but also Error Margin – More Clusters Needed • Still at an early stage of a concrete scenario for forecast ensembles

  26. Acknowledgements • We are thankful to the following individuals for their contributions to some of the ideas presented in this paper: Yanbin Liu, Norman Bobroff, Balaji Viswanathan, Steve Luis, Shu-Ching Chen, Lloyd Trinish, Jason Liu, Alex Orta, T. N. Krishnamurti, Eric Johnson, and Donald Llopis. • This work was supported in part by IBM (SUR and Student Support awards) and the National Science Foundation (grants OISE-0730065, OCI-0636031, REU-0552555, and HRD-0317692). • This work is part of the Latin American Grid (LA Grid) project.

  27. Contact Information: S. Masoud Sadjadi • http://www.cs.fiu.edu/~sadjadi/ • sadjadi@cs.fiu.edu • Thank you! Questions?
