
A modeling approach for estimating execution time of long-running Scientific Applications

A modeling approach for estimating execution time of long-running Scientific Applications. Seyed Masoud Sadjadi (1), Shu Shimizu (2), Javier Figueroa (1,3), Raju Rangaswami (1), Javier Delgado (1), Hector Duran (4), Xabriel J. Collazo-Mojica (5). Presented by: Xabriel J. Collazo-Mojica (5)


Presentation Transcript


  1. A modeling approach for estimating execution time of long-running Scientific Applications
     Seyed Masoud Sadjadi (1), Shu Shimizu (2), Javier Figueroa (1,3), Raju Rangaswami (1), Javier Delgado (1), Hector Duran (4), Xabriel J. Collazo-Mojica (5)
     Presented by: Xabriel J. Collazo-Mojica (5)
     (1) Florida International University (FIU), Miami, Florida, USA
     (2) IBM Tokyo Research Laboratory, Tokyo, Japan
     (3) University of Miami, Coral Gables, Florida, USA
     (4) University of Guadalajara, CUCEA, Mexico
     (5) University of Puerto Rico, Mayagüez Campus, Puerto Rico
     Miami, Florida – April 2008

  2. Presentation Outline
     • Motivation
     • Research Approach
     • Research Validation
     • Related Work
     • Concluding Remarks
     • Future Research
     HPGC '08 - April 14 - LA Grid

  3. Motivation
     • The impact of hurricanes is devastating
     • The Weather Research and Forecasting (WRF) model
        • Most popular
        • It is computationally and storage intensive
     • We need higher-resolution and more precise forecasts
     • Many organizations are willing to share resources
     • But these resources are dynamic and unpredictable

  4. Motivation
     • At the time of a hurricane, we need to act fast
     • What resources should we allocate?
     • We need to finish within a strict deadline (i.e., in time for the hurricane forecast)
     • We need to make a decision within a matter of seconds
     • We need to model the execution time of WRF based on the target resources
     • In our case: clusters with different parameters

  5. Approach to Modeling Resource Usage
     [Diagram labels: WRF; Network Latency; CPU Speed; Hard Disk I/O; Number of Nodes; Network Bandwidth; FSB Bandwidth; RAM Size; L2 Cache; Application Resource Usage Model]

  6. Approach to Modeling Execution Parallelism
     • Platform heterogeneity
        • We assume identical individual resource characteristics of computation, communication and storage power.
     • Execution scale
        • We add a parameter to model the number of nodes utilized during execution.
     [Diagram: nodes 1, 2, 3, …, N]

  7. Application Resource Usage Model
     • Characterize applications according to their resource usage characteristics (i.e., application "profiles")
     • Assumptions:
        • Execution time is based on contributors
        • Product of contributors determines total execution time
        • Computation nodes are homogeneous (e.g., Beowulf cluster)
        • Non-ad-hoc application characteristics

  8. Application Resource Usage Model - Contributors
     • The model aims to allow as many contributors as necessary
     • This paper focuses on two contributors
     • First contributor: Parallelism
        • P_para = degree of parallelism
        • α0 = constant contribution
        • α1 = variable contribution
     • Second contributor: CPU Performance
        • P_clock = clock speed of compute node
        • β0 = constant contribution related to CPU performance
        • β1 = variable contribution related to CPU performance
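The slides list the two contributors and their coefficients but do not show the formula itself. The following is a minimal sketch, assuming each contributor has the form constant + variable / resource (an assumption consistent with the constant and variable contributions named on this slide, not something the slides state) and that the contributors are multiplied, as slide 7 says. The function names and coefficient values are hypothetical and chosen only to show the shape of the model.

```python
def contributor(constant, variable, power):
    """One contributor: a fixed part plus a part that shrinks as `power` grows.

    The form constant + variable / power is an assumption, not taken from the slides.
    """
    if power <= 0:
        raise ValueError("power must be positive")
    return constant + variable / power


def predicted_time(p_para, p_clock, alpha0, alpha1, beta0, beta1):
    """Product of the parallelism and CPU-performance contributors."""
    parallelism = contributor(alpha0, alpha1, p_para)
    cpu_performance = contributor(beta0, beta1, p_clock)
    return parallelism * cpu_performance


# Hypothetical coefficients, for illustration only (not fitted to any WRF run).
coeffs = dict(alpha0=0.2, alpha1=0.8, beta0=100.0, beta1=3000.0)

t_small = predicted_time(p_para=2, p_clock=3.0, **coeffs)  # 2 nodes at 3.0 GHz
t_large = predicted_time(p_para=8, p_clock=3.0, **coeffs)  # 8 nodes at 3.0 GHz
```

Under these assumptions, adding nodes shrinks only the variable part of the parallelism contributor, so the predicted time levels off toward alpha0 times the CPU contributor rather than falling to zero.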

  9. Experimental Approach - Environment
     • GCB cluster: Rocks ver. 4.0, 8 nodes, each with 32-bit x86 Intel 3.0 GHz processors, 1 GB of main memory, and a gigabit network connection
     • Mind cluster: Rocks ver. 4.0, 16 nodes, each with dual Xeon 3.6 GHz processors, 2 GB of main memory, and a gigabit network connection
     • CPU vs. number of nodes: CPU percentages from 100% down to 10% in steps of 10%
     • We use CPULimit to cap the CPU percentage
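The experiment grid described on this slide can be enumerated as follows. This is a sketch under two assumptions that the slides do not confirm: that every node count from 1 up to the cluster size was tested, and that capping a node at a given CPU percentage behaves like scaling its clock speed by that percentage. The function name `experiment_grid` is hypothetical.

```python
from itertools import product


def experiment_grid(max_nodes, base_clock_ghz):
    """Yield (nodes, cpu_percent, effective_clock_ghz) for every run.

    Assumes node counts 1..max_nodes and a linear clock scaling; both are
    illustrative assumptions rather than details taken from the slides.
    """
    cpu_percents = range(100, 0, -10)  # 100, 90, ..., 10
    for nodes, pct in product(range(1, max_nodes + 1), cpu_percents):
        yield nodes, pct, base_clock_ghz * pct / 100.0


# GCB cluster from the slide: 8 nodes at 3.0 GHz.
gcb_runs = list(experiment_grid(max_nodes=8, base_clock_ghz=3.0))
```

With these assumptions the GCB cluster yields 8 node counts times 10 CPU levels, and each tuple gives one (number of nodes, clock speed) point for the model.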

  10. Experimental Approach - Monitoring and Prediction
     • Two tools were used
     • Amon – A Monitoring Tool
        • Daemon-like application that collects and reports exploratory variables
     • Aprof – A Profiling Tool
        • Statistical prediction program
        • Listens to Amon reports from compute nodes
        • Stores collected data as a matrix for each application

  11. Experimental Approach - Monitoring and Prediction

  12. Application Resource Usage Model - Validation
     • Intuitive assumption: execution time varies linearly with the inverse of total computational power.
     • Predictions within a cluster (e.g., GCB to GCB)
        • GCB: FE 5.34%, ME 5.86%
        • Mind: FE 5.66%, ME 3.80%
     • Predictions across clusters
        • GCB to Mind: FE 9.97%, ME 5.86%
        • Mind to GCB: FE 5.83%, ME 4.13%
     • These results validate our simple model.
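The linear assumption above can be illustrated with an ordinary least-squares fit of execution time against 1 / (nodes × clock speed), followed by a prediction for a different cluster. This is only a sketch: the timing data is synthetic, "total computational power" is assumed to mean nodes × clock speed, and the mean relative error computed here is a generic metric that is not necessarily the FE or ME reported on the slide.

```python
def inverse_power(nodes, clock_ghz):
    """Inverse of total computational power, assumed to be nodes * clock."""
    return 1.0 / (nodes * clock_ghz)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_relative_error(predicted, measured):
    """Average of |predicted - measured| / measured."""
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


# Synthetic (nodes, clock in GHz, seconds) samples standing in for one cluster.
train = [(1, 3.0, 1050.0), (2, 3.0, 550.0), (4, 3.0, 300.0), (8, 3.0, 175.0)]
xs = [inverse_power(n, c) for n, c, _ in train]
ys = [t for _, _, t in train]
intercept, slope = fit_line(xs, ys)

# Synthetic measurement standing in for a run on a second cluster.
target = [(4, 3.6, 270.0)]
predicted = [intercept + slope * inverse_power(n, c) for n, c, _ in target]
error = mean_relative_error(predicted, [t for _, _, t in target])
```

The synthetic training points lie exactly on a line, so the fit recovers it; real measurements would scatter around the fitted line and the cross-cluster error would depend on how well the clock-speed scaling holds.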

  13. Application Resource Usage Model - Mind to GCB prediction

  14. Concluding Remarks
     • We have proposed a new approach for modeling the resource usage and execution time of a distributed application
     • Experimental results using WRF execution on two different clusters show good accuracy: within 10% error for cross-cluster predictions
     • Using only two parameters: CPU speed and number of nodes.
     • Specifically for WRF, we are one step closer to devising a complete solution for our goal of higher-resolution weather predictions and simulations.

  15. Related Work
     • S. Shimizu, R. Rangaswami, and H. A. Duran-Limon. "Platform-independent Modeling and Prediction of Application Resource Usage Characteristics."
        • Basis for prediction model
        • It is limited to one node
     • D. M. Swany and R. Wolski. "Multivariate Resource Performance Forecasting in the Network Weather Service."
        • High-accuracy prediction model
        • They emphasize latency and bandwidth

  16. Related Work
     • R. Badia, F. Escale, E. Gabriel, J. Gimenez, R. Keller, J. Labarta, M. S. Müller. "Perf. Prediction in a Grid Environment."
        • Offline prediction
        • Need to link their library to the application to be profiled

  17. Future Research
     • Extend our parallelism model to address heterogeneous resources.
     • Include more resource parameters in the model
     • Started joint research with Barcelona Supercomputing Center
     • We acknowledge that Amon & Aprof have limitations
     • We will integrate our tools with their simulation application, DIMEMAS

  18. Acknowledgements
     • National Science Foundation
        • REU Grant # IIS-0552555
        • PIRE Grant # OISE-0730065
        • CREST Grant # HRD-0317692
        • GCB Grant # OCI-0636031
     • IBM Research
     • LA Grid
     • FIU SCIS

  19. Questions?
