Application Scheduling on Distributed Resources
Francine Berman
U. C. San Diego and NPACI
The Computational Grid
• The Computational Grid is becoming increasingly prevalent as a computational platform
• Focus is on using distributed resources as an ensemble
  • clusters of workstations
  • MPPs
  • remote instruments
  • visualization sites
  • storage archives
Programming the Grid
• How do we write Grid programs?
• How do we achieve program performance?
• First try: extend MPP programs ...
Programming the Grid
MPP Programming Model
• processors, network are uniform
• single administrative domain
• “machine” is typically dedicated to user
Grid Programming Model
• resources are distributed, heterogeneous
• grid may comprise multiple administrative domains
• resources are shared by multiple users
Achieving Program Performance
MPP programs achieve performance by
• dedicating resources
• careful staging of computation and data
• considerable coordination
Computational Grids are dynamic
• load and availability of resources vary with time and load
• both system and application behavior are hard to predict
Grid Programming Challenge: How can programs leverage the deliverable performance of the Grid at execution time?
Scheduling
• Scheduling is fundamental to performance
• On the Computational Grid, the scheduling mechanism must
  • perceive the performance impact of system resources on the application
  • adapt to dynamic conditions
  • optimize the application schedule for the Grid at execution time
Whose Job Is It?
• Application scheduling can be performed by many entities
  • Resource scheduler, job scheduler, program developer, system administrator, user, application scheduler
[Figure: Grid Application Development System. Labels: Source application, PSE, whole program compiler, libraries, Software components, Config. object program, Service negotiator, negotiation, Scheduler, Grid runtime system, Dynamic optimizer, Realtime perf monitor, Perf problem, Performance feedback.]
Scheduling and Performance
• Achieving application performance can conflict with system performance goals
  • Resource Scheduler -- perf measure is utilization
  • Job Scheduler -- perf measure is throughput
  • System Administrator -- focuses on system perf
• Goal of application scheduling is to promote application performance over performance of other applications and system components
  • Application Scheduler -- perf measure is app.-specific
Self-Centered Scheduling
• Everything in the system is evaluated in terms of its impact on the application.
  • performance of each system component can be considered as a measurable quantity
  • forecasts of quantities relevant to the application can be manipulated to determine schedule
• This simple paradigm forms the basis for AppLeS.
AppLeS
Joint project with Rich Wolski
• AppLeS = Application-Level Scheduler
• Each application has its own self-centered AppLeS agent.
• Custom application schedule achieved through
  • selection of potentially efficient resource sets
  • performance estimation of dynamic system parameters and application performance for execution time frame
  • adaptation to perceived dynamic conditions
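The selection and estimation steps listed on this slide can be illustrated with a minimal sketch. It is not the AppLeS implementation: the function names, the cost model (work divided by the summed forecast rates, plus a fixed coordination overhead per resource), and all numbers are assumptions made up for illustration.

```python
from itertools import combinations

def candidate_resource_sets(resources, max_size):
    """Enumerate non-empty resource subsets up to max_size (hypothetical selector)."""
    for k in range(1, max_size + 1):
        yield from combinations(resources, k)

def estimate_time(resource_set, work_units, forecast_rate, overhead_per_resource):
    """Assumed model: perfectly divisible work plus a fixed cost per resource used."""
    total_rate = sum(forecast_rate[r] for r in resource_set)
    return work_units / total_rate + overhead_per_resource * len(resource_set)

def choose_schedule(resources, work_units, forecast_rate,
                    overhead_per_resource, max_size=3):
    """Return the candidate set with the lowest estimated execution time."""
    best = min(
        candidate_resource_sets(resources, max_size),
        key=lambda rs: estimate_time(rs, work_units, forecast_rate,
                                     overhead_per_resource),
    )
    return best, estimate_time(best, work_units, forecast_rate,
                               overhead_per_resource)

# Illustrative forecast rates (work units per second); values are invented.
rates = {"a": 10.0, "b": 5.0, "c": 1.0}
best_set, best_time = choose_schedule(["a", "b", "c"], 100.0, rates, 2.0)
print(best_set, round(best_time, 2))
```

With these invented numbers the slow resource "c" is left out, because its coordination overhead outweighs the small amount of work it would take on. Adaptation would amount to re-running the selection whenever the forecasts change.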
AppLeS Architecture
• AppLeS incorporates
  • application-specific information
  • dynamic information
  • prediction
• Each AppLeS schedule is customized for its application and environment
• AppLeS scheduler promotes performance as defined by the user
  • execution time
  • convergence
  • turnaround time
[Figure: AppLeS architecture. Labels: NWS (Wolski), User Prefs, App Perf Model, Resource Selector, Planner, Application, Act., Grid/cluster resources/infrastructure.]
Network Weather Service (Wolski)
• The NWS provides dynamic resource information for AppLeS
• NWS is a stand-alone system
• NWS
  • monitors current system state
  • provides best forecast of resource load from multiple models
[Figure: NWS structure. Labels: Sensor Interface, Reporting Interface, Forecaster, Model (three instances).]
The Role of Prediction
• Is monitoring enough for scheduling?
Monitoring vs. Forecasting
• Monitored data provides a snapshot of what has happened; forecasting tells us what will happen.
• Last value is not always the best predictor...
[Figure: plot of monitored data.]
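The idea of picking the best forecast from several models can be sketched as follows. This is an illustrative assumption, not the NWS code: the three predictors, the error measure (cumulative absolute one-step error), and the sample series are all hypothetical.

```python
def last_value(history):
    """Predict that the next measurement equals the most recent one."""
    return history[-1]

def running_mean(history):
    """Predict the mean of everything seen so far."""
    return sum(history) / len(history)

def median_window(history, window=5):
    """Predict the median of the most recent `window` measurements."""
    recent = sorted(history[-window:])
    mid = len(recent) // 2
    if len(recent) % 2:
        return recent[mid]
    return (recent[mid - 1] + recent[mid]) / 2

def best_forecast(measurements, models):
    """Replay the series, score each model by cumulative absolute error,
    and return the winning model's name with its forecast for the next step."""
    errors = {name: 0.0 for name in models}
    for t in range(1, len(measurements)):
        past = measurements[:t]
        for name, model in models.items():
            errors[name] += abs(model(past) - measurements[t])
    winner = min(errors, key=errors.get)
    return winner, models[winner](measurements)

models = {
    "last_value": last_value,
    "running_mean": running_mean,
    "median_window": median_window,
}
# Invented CPU-availability series that oscillates between two levels.
series = [0.2, 0.8, 0.2, 0.8, 0.2, 0.8]
name, forecast = best_forecast(series, models)
print(name, round(forecast, 2))
```

On an oscillating series like this one, the last-value predictor is wrong at every step, which is one concrete way the last value can fail to be the best predictor.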
Using Forecasting in Scheduling
• How much work should each processor be given?
• Jacobi2D AppLeS solves equations for Area
[Figure: partition of work among processors P1, P2, P3.]
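The equations themselves are not recoverable from this slide, so the following is only a simplified sketch of the underlying idea: give each processor a strip whose area is proportional to its forecast compute rate, so that all processors are predicted to finish at about the same time. Communication cost is ignored, and the function name and rates are hypothetical.

```python
def strip_widths(total_columns, forecast_rates):
    """Split `total_columns` into integer strip widths proportional to
    each processor's forecast rate (largest-remainder rounding)."""
    total_rate = sum(forecast_rates)
    ideal = [total_columns * r / total_rate for r in forecast_rates]
    widths = [int(x) for x in ideal]
    # Hand out the columns lost to truncation, largest fractional part first.
    leftover = total_columns - sum(widths)
    order = sorted(range(len(ideal)),
                   key=lambda i: ideal[i] - widths[i], reverse=True)
    for i in order[:leftover]:
        widths[i] += 1
    return widths

# Invented forecast rates for P1, P2, P3 (grid points per second).
print(strip_widths(1000, [4.0, 2.0, 2.0]))
```

A processor forecast to be twice as fast receives a strip twice as wide. If the forecasts change before or during execution, the widths can be recomputed.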
Good Predictions Promote Good Schedules
• Jacobi2D experiments
SARA: An AppLeS-in-Progress
• SARA = Synthetic Aperture Radar Atlas
  • application developed at JPL and SDSC
• Goal: Assemble/process files for user’s desired image
  • thumbnail image shown to user
  • user selects desired bounding box for more detailed viewing
  • SARA provides detailed image in variety of formats
Simple SARA
• Simple SARA focuses on obtaining remote data quickly
• Code developed by Alan Su
[Figure: Compute Server and Data Servers connected by a network. Annotations: network shared by variable number of users; computation servers and data servers are logical entities, not necessarily different nodes; computation assumed to be done at compute servers.]
Simple SARA AppLeS
• Focus on resource selection problem: Which site can deliver data the fastest?
  • Data for image accessed over shared networks
  • Data sets 1.4 - 3 megabytes, representative of SARA file sizes
• Servers used for experiments
  • lolland.cc.gatech.edu
  • sitar.cs.uiuc
  • perigee.chpc.utah.edu
  • mead2.uwashington.edu
  • spin.cacr.caltech.edu
[Legend on the server list: via vBNS, via general Internet.]
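The resource selection problem on this slide can be sketched as choosing the server with the smallest predicted transfer time. The model below (latency plus size divided by forecast bandwidth) and the site names and numbers are assumptions for illustration; they are not measurements from the experiments.

```python
def predicted_transfer_seconds(size_mbytes, bandwidth_mbits_per_s, latency_s):
    """Assumed model: fixed latency plus size over forecast bandwidth."""
    return latency_s + (size_mbytes * 8.0) / bandwidth_mbits_per_s

def fastest_server(size_mbytes, forecasts):
    """`forecasts` maps a server name to (bandwidth in Mbit/s, latency in s)."""
    return min(
        forecasts,
        key=lambda name: predicted_transfer_seconds(size_mbytes, *forecasts[name]),
    )

# Hypothetical forecasts for two placeholder sites.
quiet_network = {"site_a": (8.0, 0.05), "site_b": (2.0, 0.01)}
busy_network = {"site_a": (1.0, 0.05), "site_b": (2.0, 0.01)}

print(fastest_server(3.0, quiet_network))
print(fastest_server(3.0, busy_network))
```

The same pair of sites yields a different choice once the forecast bandwidth to the first site drops, which is why the selection is made from current forecasts rather than fixed once.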
Which is “Closer”?
• Sites on the east coast or sites on the west coast?
• Sites on the vBNS or sites on the general Internet?
• Consistently the same site or different sites at different times?
Depends a lot on traffic ...
Simple SARA Experiments
• Ran back-to-back experiments from remote sites to UCSD/PCL
• Wolski’s Network Weather Service provides forecasts of network load and availability
• Experiments run during normal business hours mid-week
Preliminary Results
• Experiment with larger data set (3 Mbytes)
• During this time frame, the general Internet mostly delivers data faster than the vBNS
More Preliminary Results
• Experiment with smaller data set (1.4 Mbytes)
• During this time frame, east coast sites mostly deliver data faster than west coast sites
9/21/98 Experiments
• Clinton Grand Jury webcast commenced at trial 62
Distributed Data Applications
• SARA representative of larger class of distributed data applications
• Simple SARA template being extended to accommodate
  • replicated data sources
  • multiple files per image
  • parallel data acquisition
  • intermediate compute sites
  • web interface, etc.
Distributed Data Applications
• Move the computation or move the data?
• Which servers to use for multiple files?
• Which compute servers to use?
[Figure: Client, Data Servers, Compute Servers.]
A Bushel of AppLeS … almost
• During the first “phase” of the project, we’ve focused on developing AppLeS applications
  • Jacobi2D
  • DOT
  • SRB
  • Simple SARA
  • Genetic Algorithm
  • CompLib
  • INS2D
  • Tomography, ...
• What have we learned?
Lessons Learned From AppLeS
• Dynamic information is critical.
[Figure: comparison of Compile-time Blocked Partitioning and Run-time AppLeS Non-Uniform Strip Partitioning.]
Lessons Learned from AppLeS
• Program execution and parameters may exhibit a range of performance
Lessons Learned from AppLeS
• Knowing something about the “goodness” of performance predictions can improve scheduling
[Figure: results for SOR and CompLib.]
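One way to read this lesson is that a forecast with a large error band should be trusted less than a slightly lower but steadier one. The sketch below is a hypothetical illustration, not the method behind the SOR or CompLib results: it plans with "forecast minus error" as a pessimistic rate, and all names and numbers are invented.

```python
def expected_time(work_units, rate):
    """Time to finish at a given rate; a non-positive rate never finishes."""
    return float("inf") if rate <= 0 else work_units / rate

def pick_resource(work_units, forecasts, use_error=True):
    """`forecasts` maps a resource name to (forecast rate, forecast error).
    With use_error=True, plan with the pessimistic rate (forecast - error)."""
    def planned_time(name):
        rate, error = forecasts[name]
        planning_rate = rate - error if use_error else rate
        return expected_time(work_units, planning_rate)
    return min(forecasts, key=planned_time)

# Invented forecasts: one steady resource, one faster on average but erratic.
forecasts = {"steady": (8.0, 0.5), "bursty": (10.0, 6.0)}
print(pick_resource(100.0, forecasts, use_error=False))
print(pick_resource(100.0, forecasts, use_error=True))
```

Ignoring the error picks the nominally faster resource. Accounting for it picks the steadier one, whose worst case is much better.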
Lessons Learned from AppLeS
• Performance of application is sensitive to scheduling policy, data, and system characteristics
Achieving Performance on the Computational Grid
Adaptivity is a fundamental paradigm for achieving performance on the Grid.
• AppLeS uses adaptivity to leverage deliverable resource performance
• Performance impact of all components considered
• AppLeS agents target dynamic, multi-user distributed environments
Related Work
• Application Schedulers
  • Mars, Prophet/Gallop, VDCE
• Scheduling Services
  • Globus GRAM
• Resource Allocators
  • I-Soft, PBS, LSF, Maui Scheduler, Nile
• PSEs
  • Nimrod, NEOS, NetSolve, Ninf
• High-Throughput Schedulers
  • Condor
• Performance Steering
  • Autopilot, SCIRun
Current AppLeS Projects
• AppLeS Templates
  • distributed data applications
  • parameter sweeps
  • master/slave applications
  • data parallel stencil applications
• Performance Prediction Engineering
  • scheduling with quality of information
    • accuracy
    • lifetime
    • overhead
AppLeS Projects
• Real World Scheduling
  • Contingency Scheduling
    • scheduling during execution
  • Imperfect Scheduling
    • scheduling with
      • partial information
      • poor information
      • dynamically changing information
  • Multischeduling
    • resource economies
    • scheduling “social structure”
The Brave New World
• “Grid-aware” programming will require a comprehensive development and execution environment
• Adaptation will be a fundamental paradigm
[Figure: Grid Application Development System. Labels: Source application, PSE, whole program compiler, libraries, Software components, Config. object program, Service negotiator, negotiation, Scheduler, Grid runtime system, Dynamic optimizer, Realtime perf monitor, Perf problem, Performance feedback.]
Project Information
Thanks to NSF, NPACI, DARPA, DoD, NASA
AppLeS Corps: Francine Berman, Rich Wolski, Walfredo Cirne, Henri Casanova, Marcio Faerman, Markus Fischer, Jaime Frey, Jim Hayes, Graziano Obertelli, Jenny Schopf, Gary Shao, Shava Smallen, Alan Su, Dmitrii Zagorodnov
AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html