
Achieving Application Performance on Distributed Resources: Experience with AppLeS






Presentation Transcript


  1. Achieving Application Performance on Distributed Resources: Experience with AppLeS • Francine Berman, U. C. San Diego

  2. Distributed “Computers” • clusters of workstations • programs include medium-sized, MPP-style SPMD apps and proudly parallel apps • computational grids • programs include resource-intensive, coupled apps • for users, performance is the key criterion in evaluating a platform

  3. Program Performance • Current distributed programs achieve performance by • dedicating resources • careful staging of data • considerable coordination • It must be possible for ordinary users to achieve distributed program performance on ordinary days ...

  4. Achieving Performance • On ordinary days, many users share system resources • load and availability of resources vary • application behavior is hard to predict • poor predictions make scheduling hard • Challenge: develop application schedules that can leverage the deliverable performance of the system at execution time.

  5. Self-Centered Scheduling • Everything in the system is evaluated in terms of its impact on the application. • performance of each system component can be considered as a measurable quantity • forecasts of quantities relevant to the application can be manipulated to determine schedule • This simple paradigm forms the basis for AppLeS.
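The paradigm above can be sketched in a few lines: score every candidate resource solely by its forecast impact on this one application's completion time, then pick the minimum. This is an illustrative sketch; the function and field names are invented, not AppLeS APIs.

```python
# "Self-centered" scheduling sketch: each resource is evaluated only
# by its forecast impact on the application. Field names (cpu_rate,
# bw_mbps, data_mb) are illustrative assumptions.

def predicted_time(work_units, forecast):
    """Forecast the application's time on one resource:
    compute time plus time to move the input data there."""
    compute = work_units / forecast["cpu_rate"]          # units/sec deliverable to us
    transfer = forecast["data_mb"] / forecast["bw_mbps"] # seconds to stage data
    return compute + transfer

def select_resource(work_units, forecasts):
    """Pick the resource whose forecast minimizes application time."""
    return min(forecasts, key=lambda name: predicted_time(work_units, forecasts[name]))

forecasts = {
    "clusterA": {"cpu_rate": 50.0, "bw_mbps": 2.0, "data_mb": 3.0},  # 4.0 s + 1.5 s
    "clusterB": {"cpu_rate": 80.0, "bw_mbps": 0.5, "data_mb": 3.0},  # 2.5 s + 6.0 s
}
best = select_resource(200.0, forecasts)   # clusterA wins despite slower CPUs
```

Note that the faster CPU loses here: the forecast of *deliverable* network bandwidth, not peak machine speed, decides the schedule.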

  6. AppLeS • Joint project with Rich Wolski • AppLeS = Application-Level Scheduler • Each application has its own self-centered AppLeS • Schedule achieved through • selection of potentially efficient resource sets • performance estimation of dynamic system parameters and application performance for the execution time frame • adaptation to perceived dynamic conditions

  7. AppLeS Architecture • AppLeS incorporates • application-specific information • dynamic information • prediction • Schedule developed to optimize user’s performance measure • minimal execution time • turnaround time = staging/waiting time + execution time • other measures: precision, resolution, speedup, etc. • Components: NWS (Wolski), User Prefs, App Perf Model, Resource Selector, Planner, Application • Actual resources/infrastructure: Grid/cluster
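The turnaround-time measure on this slide is a simple sum; a minimal worked example with invented numbers:

```python
# Turnaround time as defined on the slide:
#   turnaround = staging/waiting time + execution time

def turnaround(staging_s, waiting_s, execution_s):
    """Total time the user experiences, in seconds."""
    return staging_s + waiting_s + execution_s

# Invented example: 12 s to stage data, 30 s in a queue, 95 s to run.
t = turnaround(staging_s=12.0, waiting_s=30.0, execution_s=95.0)  # 137.0 s
```

The point of the breakdown is that an AppLeS optimizes what the user sees: a schedule with a slightly longer execution time can still win if it avoids staging or queue delay.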

  8. SARA: An AppLeS-in-Progress • SARA = Synthetic Aperture Radar Atlas • Goal: Assemble/process files for user’s desired image • thumbnail image shown to user • user selects desired bounding box within image for more detailed viewing • SARA provides detailed image in variety of formats • Simple SARA: focuses on obtaining remote data quickly • code developed by Alan Su

  9. Focusing in with SARA • (Figure: thumbnail image with user-selected bounding box)

  10. Simple SARA • Network shared by a variable number of users • Computation servers and data servers are logical entities, not necessarily different nodes • Computation assumed to be done at compute servers • (Diagram: several data servers and a compute server connected over the shared network)

  11. Simple SARA AppLeS • Focus on the resource selection problem: which site can deliver the data fastest? • Data for the image accessed over shared networks • Wolski’s Network Weather Service provides forecasts of network load and availability • Servers used for experiments (reached via vBNS and via the general Internet) • lolland.cc.gatech.edu • sitar.cs.uiuc • perigee.chpc.utah.edu • mead2.uwashington.edu • spin.cacr.caltech.edu
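The resource-selection step can be sketched as follows: given per-server bandwidth forecasts such as NWS might supply, pick the server predicted to deliver the image file fastest. The server names come from the slide; the bandwidth numbers are invented, and this is an illustration rather than the Simple SARA code.

```python
# Hypothetical Simple SARA selection: smallest predicted transfer time
# wins. Forecast values are invented; in the real system they would
# come from Network Weather Service bandwidth predictions.

def fastest_server(file_mb, bw_forecast_mbps):
    """Return the server with the smallest predicted transfer time."""
    def transfer_time(server):
        return file_mb / bw_forecast_mbps[server]
    return min(bw_forecast_mbps, key=transfer_time)

forecasts = {
    "spin.cacr.caltech.edu": 4.0,     # e.g. reached via vBNS
    "perigee.chpc.utah.edu": 1.2,     # e.g. via general Internet
    "lolland.cc.gatech.edu": 0.8,
}
best = fastest_server(3.0, forecasts)  # 3 MB, typical SARA file size
```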

  12. Simple SARA Experiments • Ran back-to-back experiments from remote sites to UCSD/PCL • Data sets 1.4 - 3 megabytes, representative of SARA file sizes • Simulates user selecting bounding box from thumbnail image • Experiments run during normal business hours mid-week

  13. Preliminary Results • Experiment with the larger data set (3 Mbytes) • NWS tries to track “trends” in the measurements and eventually converges on the right resource choice

  14. More Preliminary Results • Experiment with smaller data set (1.4 Mbytes) • NWS chooses the best resource
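The "trend tracking" behavior seen in these results can be illustrated with a toy forecaster: exponential smoothing, one of the simple predictors a forecasting system like NWS can employ. This is an illustration under that assumption, not NWS code.

```python
# Toy trend-tracking forecaster: exponential smoothing. When the
# measured bandwidth steps up, the forecast climbs toward the new
# level over a few observations, as the slides describe NWS doing.

def smooth_forecasts(measurements, alpha=0.5):
    """Return the forecast issued before each measurement after the first."""
    forecast = measurements[0]
    out = []
    for m in measurements[1:]:
        out.append(forecast)                    # prediction for this step
        forecast = alpha * m + (1 - alpha) * forecast
    return out

bw = [2.0, 2.0, 4.0, 4.0, 4.0]                  # bandwidth steps up mid-trace
preds = smooth_forecasts(bw)                    # climbs: 2.0, 2.0, 3.0, 3.5
```

A larger `alpha` tracks sudden changes faster but is noisier; NWS's actual approach is more sophisticated, selecting among multiple predictors by observed accuracy.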

  15. Distributed Data Applications • SARA representative of larger class of distributed data applications • Simple SARA template being extended to accommodate • replicated data sources • multiple files per image • parallel data acquisition • intermediate compute sites • web interface, etc.

  16. SARA AppLeS -- Phase 2 • Move the computation or move the data? • Client and servers are “logical” nodes; computation and data servers may “live” at the same nodes • Data servers may access the same storage media • How long will data access take when the data is needed? • Which servers should the client use? • (Diagram: client connected to multiple compute servers and data servers)
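The move-the-computation-or-move-the-data question reduces to a cost comparison between the two plans; a minimal sketch with invented numbers and illustrative parameter names:

```python
# Compare two plans: ship the data to a fast remote compute server,
# or run (perhaps more slowly) on the node where the data lives.
# All rates and sizes below are invented for illustration.

def plan(data_mb, bw_mbps, remote_rate, local_rate, work_units):
    """Return the plan with the smaller predicted total time."""
    move_data = data_mb / bw_mbps + work_units / remote_rate  # transfer + remote compute
    move_comp = work_units / local_rate                       # data stays put
    return "move_data" if move_data < move_comp else "move_computation"

# Small file, fast remote machine: shipping the data wins (10 s vs 20 s).
choice = plan(data_mb=3.0, bw_mbps=0.5, remote_rate=100.0,
              local_rate=20.0, work_units=400.0)
```

The same comparison flips as the data grows or the network degrades, which is why the decision must use forecasts for the execution time frame rather than static configuration.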

  17. A Bushel of AppLeS … almost • During the first “phase” of the project, we’ve focused on getting experience building AppLeS • Jacobi2D, DOT, SRB, Simple SARA, Genetic Algorithm, Tomography, INS2D, ... • Using this experience, we are beginning to build AppLeS “templates”/tools for • master/slave applications • parameter sweep applications • distributed data applications • proudly parallel applications, etc. • What have we learned ...

  18. Lessons Learned from AppLeS • Dynamic information is critical

  19. Lessons Learned from AppLeS • Program execution and parameters may exhibit a range of performance

  20. Lessons Learned from AppLeS • Knowing something about performance predictions can improve scheduling

  21. Lessons Learned from AppLeS • Performance of scheduling policy sensitive to application, data, and system characteristics

  22. Show Stoppers • Queue prediction time • How long will the program wait in a batch queue? • How accurate is the prediction? • Experimental Verification • How do we verify the performance of schedulers in production environments? • How do we achieve reproducible and relevant results? • What are the right measures of success? • Uncertainty • How do we capture time-dependent information? • What do we do if the range of information is large?

  23. Current AppLeS Projects • AppLeS and more AppLeS • AppLeS applications • AppLeS templates/tools • Globus AppLeS, Legion AppLeS, IPG AppLeS • Plans for integration of AppLeS and NWS with NetSolve, Condor, Ninf • Performance Prediction Engineering • structural modeling with stochastic predictions • development of quality of information measures • accuracy • lifetime • overhead

  24. New Directions • Contingency Scheduling • scheduling during execution • Scheduling with • partial information, poor information, dynamically changing information • Multischeduling • resource economies • scheduling “social structure”

  25. AppLeS in Context • (Roadmap diagram, Performance vs. Usability/Integration over time) • Short-term: development of basic infrastructure • application scheduling, resource scheduling, throughput scheduling (“You are here”) • Medium-term: integration of schedulers and other tools, performance interfaces • “grid-aware” programming; languages, tools, PSEs; performance assessment and prediction • multi-scheduling, resource economy • Long-term: integration of multiple grid constituencies • architectural models which support high-performance, high-portability, collaborative and other users • automation of program execution

  26. Project Information • Thanks to NSF, NPACI, Darpa, DoD, NASA • AppLeS Corps: Francine Berman, Rich Wolski, Walfredo Cirne, Marcio Faerman, Jamie Frey, Jim Hayes, Graziano Obertelli, Jenny Schopf, Gary Shao, Neil Spring, Shava Smallen, Alan Su, Dmitrii Zagorodnov • AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html
