This study explores the impact of inaccurate job size information on the performance and fairness of size-based scheduling policies. Results show the importance of correlation and the potential for improvement with effective job size estimators.
Size-Based Scheduling Policies with Inaccurate Scheduling Information Dong Lu*, Huanyuan Sheng+, Peter A. Dinda* *Prescience Lab, Dept. of Computer Science +Dept. of Industrial Engineering & Management Science Northwestern University, Evanston, IL 60201 USA
Outline • Review of size-based scheduling • Motivation • Simulation Setup • Simulation Results • New applications
Non-size-based scheduling • FCFS, PS, etc. • FCFS: First Come First Served • Intuitive • Easiest to implement • PS: Processor Sharing • Fair: all jobs receive equal resources • Also easy to implement Problem: Unaware of job size information, which results in a large mean response time
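As a point of reference for the size-based policies that follow, here is a minimal sketch of Processor Sharing on a single server. It assumes an idealized PS model in which the n jobs present each receive service at rate 1/n; the function name `processor_sharing` and the `(arrival_time, size)` job format are illustrative choices, not taken from the authors' simulator.

```python
def processor_sharing(jobs):
    """Simulate idealized Processor Sharing on one server.

    jobs: list of (arrival_time, size) tuples (hypothetical format).
    Returns completion times in the same order as jobs.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    done = [None] * len(jobs)
    remaining = {}          # job index -> remaining work
    now, k = 0.0, 0         # current time, position of next arrival in order
    while k < len(order) or remaining:
        if not remaining:   # server idle: jump to the next arrival
            now = max(now, jobs[order[k]][0])
        # admit every job that has arrived by now
        while k < len(order) and jobs[order[k]][0] <= now:
            i = order[k]
            remaining[i] = float(jobs[i][1])
            k += 1
        n = len(remaining)
        first = min(remaining, key=remaining.get)
        # each job is served at rate 1/n, so the smallest one needs this long
        to_finish = max(remaining[first], 0.0) * n
        next_arrival = jobs[order[k]][0] if k < len(order) else float("inf")
        dt = min(to_finish, next_arrival - now)
        now += dt
        for i in remaining:
            remaining[i] -= dt / n
        if dt == to_finish:  # the smallest job completed in this step
            del remaining[first]
            done[first] = now
    return done
```

Two jobs of size 2 that arrive together both finish at time 4 under this model, which illustrates why PS is fair but can delay short jobs.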
Review of size-based scheduling • SRPT, FSP, etc. • Utilize the job size (processing time, service time) information for scheduling • Optimal in mean response time • Fair? • Easy to implement? We use Job Size to refer to the Processing Time (Service Time) of the job
Shortest Remaining Processing Time (SRPT) • Always serves the job with the minimum remaining processing time first; preemptive scheduling • Yields minimum mean response time [Schrage, Operations Research, 1968] • Performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, SRPT is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics ‘01] • Easy to implement? • With accurate a priori job size information, YES • Otherwise, NO
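The SRPT rule can be sketched as a small event-driven simulation. This is an illustrative single-server version that assumes exact job sizes are known on arrival; the name `srpt` and the `(arrival_time, size)` job format are assumptions made for the example, not the authors' C++ simulator.

```python
import heapq


def srpt(jobs):
    """Simulate preemptive SRPT on one server.

    jobs: list of (arrival_time, size) tuples (hypothetical format).
    Returns completion times in the same order as jobs.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    done = [None] * len(jobs)
    ready = []              # min-heap of (remaining_time, job_index)
    now, k = 0.0, 0         # current time, position of next arrival in order
    while k < len(order) or ready:
        if not ready:       # server idle: jump to the next arrival
            now = max(now, jobs[order[k]][0])
        # admit every job that has arrived by now
        while k < len(order) and jobs[order[k]][0] <= now:
            i = order[k]
            heapq.heappush(ready, (jobs[i][1], i))
            k += 1
        # serve the job with the least remaining work
        remaining, i = heapq.heappop(ready)
        next_arrival = jobs[order[k]][0] if k < len(order) else float("inf")
        run = min(remaining, next_arrival - now)
        now += run
        if run < remaining:  # an arrival interrupts; re-evaluate priorities
            heapq.heappush(ready, (remaining - run, i))
        else:
            done[i] = now
    return done
```

For jobs `[(0, 10), (1, 2), (2, 1)]` the sketch returns `[13.0, 3.0, 4.0]`: the two short jobs preempt the long one and finish almost immediately, and the long job completes at time 13.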
Fair Sojourn Protocol (FSP) • Combines SRPT with PS; preemptive scheduling • Mean response time is close to that of SRPT, and it is more fair than PS [Friedman, et al, Sigmetrics ‘03] • Easy to implement? • With accurate a priori job size information, YES • Otherwise, NO
Motivation • Size-based scheduling requires accurate knowledge of job sizes • In practice, a priori job size information is not always available • All the previous work assumes perfect a priori knowledge of job sizes • How does performance depend on the quality of job size information?
Correlation We study the performance of Size-based schedulers as a function of the correlation coefficient (Pearson’s R) between actual job sizes and estimated job sizes.
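For concreteness, Pearson's R between actual and estimated job sizes can be computed as below. This is the standard sample formula; the function name `pearson_r` and the example values are illustrative only.

```python
import math


def pearson_r(actual, estimated):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_a * var_e)


# Estimates that are a fixed multiple of the true size are perfectly correlated,
# even though every individual estimate is wrong by a factor of two.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

Note that R measures linear association and not absolute accuracy, so a biased estimator can still have R close to 1.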
Simulation Setup: Trace generator [Figure: the trace generator takes a correlation (Pearson’s R), Distribution A, and Distribution B as inputs and outputs a table of X and Y values (sample entries: 100, 300, ...)] • Correlated random pairs of X and Y • X has distribution A • Y has distribution B • X and Y are correlated with coefficient R
Simulation Setup: Trace generator • Algorithm: “Normal-To-Anything” • First developed by Cario and Nelson, in INFORMS Journal on Computing 10, 1 (1998). • We simplified the algorithm and were the first to introduce it into simulation studies of computer systems
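The core idea of Normal-To-Anything can be sketched as follows: draw a pair of correlated standard normals, map each through the normal CDF to a uniform value, then apply the inverse CDF of the target distribution. The slide does not describe the authors' simplification, so this is only an assumed minimal version. In particular, it takes the correlation `rho` of the underlying normals as input and does not solve for the value of `rho` that yields a requested Pearson's R in the output; the resulting R should be measured on the generated trace. All function names and the exponential and Pareto choices are illustrative.

```python
import math
import random
from statistics import NormalDist


def exponential_inv_cdf(mean):
    """Inverse CDF of an exponential distribution (illustrative marginal)."""
    return lambda u: -mean * math.log(1.0 - u)


def pareto_inv_cdf(x_min, alpha):
    """Inverse CDF of a Pareto distribution (illustrative heavy-tailed marginal)."""
    return lambda u: x_min / (1.0 - u) ** (1.0 / alpha)


def norta_pairs(n, rho, inv_cdf_a, inv_cdf_b, seed=0):
    """Generate n (X, Y) pairs with marginals A and B from correlated normals.

    rho is the correlation of the underlying normals, not the final Pearson's R.
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal with correlation rho to z1
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # clamp away from 0 and 1 so the inverse CDFs stay finite
        u1 = min(max(phi(z1), 1e-12), 1.0 - 1e-12)
        u2 = min(max(phi(z2), 1e-12), 1.0 - 1e-12)
        pairs.append((inv_cdf_a(u1), inv_cdf_b(u2)))
    return pairs
```

A call such as `norta_pairs(1000, 0.8, pareto_inv_cdf(1.0, 1.5), exponential_inv_cdf(2.0))` would produce a trace in which X could play the role of actual job size and Y the estimate.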
Scatter plot of example traces [Figure: two scatter plots of Y against X, one with R=0.78 and one with R=0.13]
Simulation Setup: Performance metrics • Performance metrics • Mean response time: also called sojourn time or turn-around time • Slowdown: the ratio of a job’s response time to its size; used as the fairness metric
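Both metrics follow directly from per-job arrival times, sizes, and completion times. The helper names below are illustrative, and the sketch reports mean slowdown as one possible summary of the per-job slowdown values.

```python
def mean_response_time(arrivals, completions):
    """Average of (completion - arrival) over all jobs."""
    responses = [c - a for a, c in zip(arrivals, completions)]
    return sum(responses) / len(responses)


def mean_slowdown(arrivals, sizes, completions):
    """Average of response time divided by job size over all jobs."""
    slowdowns = [(c - a) / s for a, s, c in zip(arrivals, sizes, completions)]
    return sum(slowdowns) / len(slowdowns)
```

A slowdown of 1 means a job was never delayed; a small job that waits behind a large one has a high slowdown even if its response time is modest, which is why slowdown is used to judge fairness.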
Simulation Setup: Simulator • Simulator • Written in C++ • Supports M/G/1 and G/G/n/m queuing model • Simulator validation • Little’s law • Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics ‘03] • Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ‘01]
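A Little's law check of the kind listed above can be sketched as follows. It computes the time-average number of jobs in the system, L, by sweeping over arrival and departure events, and compares it with the arrival rate times the mean response time. The function name and the choice of measurement window (first arrival to last completion, assumed to have nonzero length) are assumptions for the example; the slide does not say how the authors performed the check.

```python
def littles_law_check(arrivals, completions):
    """Return (L, lambda * W) measured over one finished trace.

    L is the time-average number of jobs in the system; the two values
    should agree if the simulator's bookkeeping is consistent.
    """
    events = sorted([(t, 1) for t in arrivals] + [(t, -1) for t in completions])
    start, end = events[0][0], events[-1][0]
    area, in_system, prev = 0.0, 0, start
    for t, delta in events:
        area += in_system * (t - prev)   # jobs present during [prev, t)
        in_system += delta
        prev = t
    span = end - start
    avg_in_system = area / span
    arrival_rate = len(arrivals) / span
    mean_response = sum(c - a for a, c in zip(arrivals, completions)) / len(arrivals)
    return avg_in_system, arrival_rate * mean_response
```

A mismatch between the two returned values would point to a job that was lost, counted twice, or given an inconsistent completion time.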
Simulation Setup: Scheduling Policies • PS: Processor sharing • Size-based scheduling policies • SRPT: Ideal SRPT scheduler • SRPT-E: SRPT scheduler using estimated job size • FSP: Ideal Fair Sojourn Protocol • FSP-E: FSP scheduler using estimated job size Each simulation is repeated 20 times and we present the average
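One plausible reading of SRPT-E is sketched below: jobs are prioritized by estimated remaining time (the estimate minus the service received so far), while a job leaves only when its actual work is done. The slide does not say how underestimated jobs are handled; this sketch assumes the estimated remaining time simply goes negative, so such a job keeps the highest priority until it finishes. The function name `srpt_e` and the `(arrival_time, actual_size, estimated_size)` format are illustrative.

```python
import heapq


def srpt_e(jobs):
    """Simulate SRPT driven by size estimates on one server.

    jobs: list of (arrival_time, actual_size, estimated_size) tuples.
    Returns completion times in the same order as jobs.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    done = [None] * len(jobs)
    ready = []              # min-heap of (estimated_remaining, index, actual_remaining)
    now, k = 0.0, 0
    while k < len(order) or ready:
        if not ready:       # server idle: jump to the next arrival
            now = max(now, jobs[order[k]][0])
        while k < len(order) and jobs[order[k]][0] <= now:
            i = order[k]
            _, actual, estimate = jobs[i]
            heapq.heappush(ready, (estimate, i, actual))
            k += 1
        # priority uses the estimate; completion uses the actual work left
        est_left, i, act_left = heapq.heappop(ready)
        next_arrival = jobs[order[k]][0] if k < len(order) else float("inf")
        run = min(act_left, next_arrival - now)
        now += run
        if run < act_left:
            # assumption: est_left may go negative for underestimated jobs
            heapq.heappush(ready, (est_left - run, i, act_left - run))
        else:
            done[i] = now
    return done
```

With exact estimates the sketch behaves like ideal SRPT. With estimates that invert the true ordering, for example `[(0, 10, 1), (1, 2, 5), (2, 1, 8)]`, the long job runs first and the short jobs finish at times 12 and 13 instead of 3 and 4.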
Simulation Results: Conclusions • Performance heavily depends on correlation • SRPT-E and FSP-E can outperform PS given an effective job size estimator • Crossover point of performance metrics is a function of correlation • Also of job size distributions (See TR NWU-CS-04-33)
New Applications: Web server scheduling (TR NWU-CS-04-33) • Is file size a good estimator of a job’s service time (processing time)? Not really (R 0.14) [Figure: scatter plot of file size and service time (wall clock time)]
New Applications: Web server scheduling • Domain-based estimator: much more accurate prediction of the service time at low overhead
New Applications: P2P server side scheduling (LCR ’04) • The “server side” of current file sharing P2P applications is superficially similar to a web server • Both send back files upon request. • However, a P2P application cannot even know the file size accurately a priori • Partial downloads • Our ongoing work shows that SRPT-E performs well using our time-series based job size estimators.
New Applications: Network backup system scheduling • Incremental backup copies only the files that have been created or modified since a previous backup • With incremental backup, the actual job size is difficult to know until the backup finishes • We believe that SRPT-E or FSP-E can be applied with time-series based job size predictors
Summary • Performance of size-based scheduling policies depends on correlation between size estimates and actual sizes • Fairness, mean response time, etc. • Estimator must preserve ordering of job sizes for high performance • Performance degrades as correlation degrades • Effective new estimators for Web and P2P
For More Information • Prescience Laboratory • http://plab.cs.northwestern.edu For more details on the applications, please also see our short paper “Applications of SRPT Scheduling with Inaccurate Scheduling Information” in the digital proceedings of MASCOTS ‘04 and a poster this evening.