Simulation Evaluation of Hybrid SRPT Policies Mingwei Gong and Carey Williamson Department of Computer Science University of Calgary April 19, 2004
Introduction • Web: large-scale, client-server system • WWW: World Wide Wait! • User-perceived Web response time involves: • Transmission time, propagation delay in network • Queueing delays at busy routers in the Internet • Delays caused by TCP protocol effects (e.g., handshake, slow start, packet loss, retransmits) • Queueing delays at the Web server itself, which may be servicing 100’s or 1000’s of concurrent requests • Our focus in this work: Web request scheduling
Example Scheduling Policies • FCFS: First Come First Serve • typical policy for a single shared resource (“unfair”) • e.g., drive-thru restaurant; playoff tickets • PS: Processor Sharing • time-sharing a resource amongst J jobs • each job gets 1/J of the resources (equal, “fair”) • e.g., CPU; VM; multi-tasking; Apache Web server • SRPT: Shortest Remaining Processing Time • pre-emptive version of Shortest Job First (SJF) • give full resources to the job that will complete quickest • e.g., ??? (express lanes in a grocery store, almost)
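The behaviour of these policies is easy to see on a toy workload. Below is a minimal sketch (not from the talk; the job sizes are made up) comparing per-job completion times under FCFS and SRPT when all jobs arrive at time 0, in which case SRPT reduces to Shortest Job First:

```python
def fcfs_completion_times(sizes):
    """FCFS: serve jobs in arrival order; return completion time per job."""
    t, done = 0.0, []
    for s in sizes:
        t += s
        done.append(t)
    return done

def srpt_completion_times(sizes):
    """With simultaneous arrivals and no later arrivals, SRPT never
    preempts and simply serves shortest-first (SJF)."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done = [0.0] * len(sizes)
    t = 0.0
    for i in order:
        t += sizes[i]
        done[i] = t
    return done

sizes = [5.0, 1.0, 2.0]                 # hypothetical job sizes (service times)
print(fcfs_completion_times(sizes))     # [5.0, 6.0, 8.0]
print(srpt_completion_times(sizes))     # [8.0, 1.0, 3.0]
```

Note how SRPT helps the two small jobs (done at times 1 and 3 instead of 6 and 8) at the cost of the largest job, which finishes last either way; this small/large tension is exactly the fairness question the talk studies.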
Research Methodology • Trace-driven simulation • Input workload is empirical/synthetic trace • Web server simulator • Empirical trace (1 million requests, World Cup 1998) • Synthetic traces (WebTraff) • Probe-based sampling methodology • Based on PASTA: Poisson Arrivals See Time Averages • Any scheduling policy, any arrival process, any service time distribution.
Simulation Assumptions • User requests are for static Web content • Server knows response size in advance • Network bandwidth is the bottleneck • All clients are in the same LAN environment • Ignores variations in network bandwidth and propagation delay • Fluid flow approximation: service time = response size • Ignores packetization issues • Ignores TCP protocol effects • Ignores network effects • (These are consistent with SRPT literature)
Performance Metrics • Slowdown: • The slowdown of a job is its observed response time divided by the ideal response time if it were the only job in the system • Lower is better • We consider mean slowdown as well as the variance of slowdown (complete distribution)
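As a concrete illustration (not from the talk), slowdown follows directly from the definition above; the `service_rate` normalization is an assumption of this sketch:

```python
def slowdown(observed_response, job_size, service_rate=1.0):
    """Slowdown = observed response time / ideal response time, where
    the ideal is the time the job would take alone in the system
    (job_size / service_rate). Values are >= 1; lower is better."""
    ideal = job_size / service_rate
    return observed_response / ideal
```

A job of size 2 that takes 4 time units to finish has a slowdown of 2; a job served with no interference has a slowdown of exactly 1.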
Probe-based Sampling Algorithm • The algorithm is based on the PASTA (Poisson Arrivals See Time Averages) principle. • [Diagram: a probe job is inserted into the request stream under policy S, its slowdown is recorded (1 sample), and the process is repeated N times]
Probe-based Sampling Algorithm For scheduling policy S =(PS, SRPT, FCFS, LRPT, …) do For load level U = (0.50, 0.80, 0.95) do For probe job size J = (1B, 1KB, 10KB, 1MB...) do For trial I= (1,2,3… N) do Insert probe job at randomly chosen point; Simulate Web server scheduling policy; Compute and record slowdown value observed; end of I; Plot marginal distribution of slowdown results; end of J; end of U; end of S;
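The nested loops above can be sketched in Python. Here `simulate_with_probe` is a hypothetical callback standing in for the Web server simulator: it inserts one probe at a random point, simulates the chosen policy at the chosen load, and returns the probe's slowdown (one sample, per PASTA). The trace, policies, loads, and probe sizes are passed in as parameters:

```python
import random

def run_probe_experiment(simulate_with_probe, trace, policies, loads,
                         probe_sizes, n_trials=3000, seed=1):
    """Skeleton of the nested probe-sampling loops from the slide."""
    rng = random.Random(seed)
    results = {}
    for policy in policies:                  # S = PS, SRPT, FCFS, LRPT, ...
        for load in loads:                   # U = 0.50, 0.80, 0.95
            for size in probe_sizes:         # J = 1B, 1KB, 10KB, 1MB, ...
                samples = [
                    simulate_with_probe(trace, policy, load, size,
                                        rng.random())  # random insert point
                    for _ in range(n_trials)
                ]
                results[(policy, load, size)] = samples  # marginal dist.
    return results
```

Each `(policy, load, size)` key collects the N slowdown samples that the slide's innermost loop would plot as a marginal distribution.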
Slowdown Profile Plot • [Figure: slowdown (y-axis, from 1 to ∞) vs. job size (x-axis, from 0 to ∞)] • PS slowdown is flat at 1/(1-p) across job sizes, where p is the system load • SRPT slowdown is low for small jobs, rises through a “crossover region” (the mystery hump) at intermediate sizes, and shows “asymptotic convergence” toward PS for the largest jobs
Notation Details • Number of jobs in the system: J • Number of threads for a single server: K • Number of servers in the system: M • Probe jobs: 1KB, 10KB, 100KB, 1MB... • Number of probes: 3000 • All simulation results are for 95% load
Single Server Scenario (M = 1) • PS: Processor Sharing • SRPT: Shortest Remaining Processing Time • FSP: Fair Sojourn Protocol • FSP computes the times at which jobs would complete under PS and then orders the jobs in terms of earliest PS completion times. FSP then devotes full service to the uncompleted job with the earliest PS completion time. • “FSP response time dominates PS” (i.e., is never worse) E. Friedman and S. Henderson: “Fairness and Efficiency in Web Server Protocols”, Proc. ACM SIGMETRICS 2003.
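Assuming all jobs are present at time 0 and a unit-rate server (a simplification for illustration; the simulator handles arbitrary arrivals), FSP's ordering rule can be sketched as:

```python
def ps_completion_times(sizes):
    """PS completion times when all jobs arrive at time 0 and share a
    unit-rate server equally: the smallest job finishes first."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done = [0.0] * len(sizes)
    t, prev, remaining = 0.0, 0.0, len(sizes)
    for i in order:
        t += (sizes[i] - prev) * remaining  # `remaining` jobs share the server
        done[i] = t
        prev = sizes[i]
        remaining -= 1
    return done

def fsp_service_order(sizes):
    """FSP devotes full service to the uncompleted job with the
    earliest PS completion time, so jobs are served in that order."""
    ps = ps_completion_times(sizes)
    return sorted(range(len(sizes)), key=lambda i: ps[i])
```

For sizes [5, 1, 2], the simulated PS completion times are [8, 3, 5], so FSP serves jobs in the order 1, 2, 0: in this simultaneous-arrival case FSP's order coincides with SRPT's, but each job still finishes no later than it would under PS.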
A Hybrid SRPT/PS Policy for a Single Server (M = 1) • Threshold-based policy, with threshold T • T-SRPT • Whether the system is “busy” or not depends on the number of jobs (J) in the system • If J <= T • Then use PS • Else use SRPT • Special cases: T = 0 is SRPT, T = ∞ is PS
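The T-SRPT mode switch is just a comparison against the threshold; a minimal sketch (the function name is ours, not from the talk):

```python
def t_srpt_mode(num_jobs, threshold):
    """T-SRPT: use PS while the system holds at most T jobs (not busy),
    and switch to SRPT once it holds more than T (busy)."""
    return "PS" if num_jobs <= threshold else "SRPT"
```

Setting `threshold = 0` recovers pure SRPT for any nonempty system, and `threshold = float('inf')` recovers pure PS, matching the special cases on the slide.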
A Generalized SRPT Policy for a Multi-threaded Single Server • K-SRPT • Multi-threaded version of SRPT that allows up to K jobs (the K smallest-RPT ones) to be in service concurrently (like PS), though with the same fixed aggregate service rate. Additional jobs (if any) in the system wait in the queue. Also preemptive, like SRPT. • Let s = min(J, K) • If J <= K • Then J jobs each receive 1/s • Else K jobs each receive 1/s (while J-K wait) • Special cases: K = 1 is SRPT, K = ∞ is PS
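The K-SRPT service-rate allocation can be sketched as follows (unit aggregate service rate assumed; the function name is ours):

```python
def k_srpt_rates(remaining, K):
    """K-SRPT: the s = min(J, K) jobs with the smallest remaining
    processing time share the unit aggregate service rate equally;
    the other J - s jobs wait at rate 0."""
    J = len(remaining)
    s = min(J, K)
    in_service = set(sorted(range(J), key=lambda i: remaining[i])[:s])
    return [1.0 / s if i in in_service else 0.0 for i in range(J)]
```

With K = 1 only the smallest-RPT job is served (SRPT); with K >= J every job gets 1/J (PS), so varying K sweeps between the two extremes.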
Multi-Server Scenario • M-SRPT: • Let s = M • If J <= M • Then J jobs each receive 1/s (M-J idle servers) • Else M jobs each receive 1/s (while J-M wait) • M-PS: • Let s = max(M, J) • Each job receives a service rate of 1/s • M-FSP: • Let s = M • If J <= M • Then J jobs (under PS) each receive 1/s • Else M jobs (under PS) each receive 1/s
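Under the slide's rate convention (each of the M servers contributes 1/M of the aggregate capacity), the M-SRPT and M-PS allocations above can be sketched as (function names are ours):

```python
def m_srpt_rates(remaining, M):
    """M-SRPT: s = M; up to M jobs (smallest remaining time first) are
    each served at rate 1/s, leaving any idle servers or excess jobs."""
    J = len(remaining)
    in_service = set(sorted(range(J), key=lambda i: remaining[i])[:min(J, M)])
    return [1.0 / M if i in in_service else 0.0 for i in range(J)]

def m_ps_rates(J, M):
    """M-PS: s = max(M, J); every job receives rate 1/s."""
    s = max(M, J)
    return [1.0 / s] * J
```

Note the asymmetry: under M-SRPT a job in service always gets a full server's rate 1/M, while under M-PS rates shrink to 1/J once J exceeds M.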
Summary • Slowdown profile plots for several policies • For the largest jobs, FSP is better than SRPT and PS • For small jobs, FSP is sometimes worse than SRPT • Multi-threaded server results • T-SRPT and K-SRPT provide a smooth transition between SRPT and PS, implying a smoother tradeoff in fairness between small jobs and large jobs • Multi-server results • With more servers, mean slowdown worsens, but variance of slowdown often improves • FSP does not response-time dominate PS for M > 1
Thank You! Questions? For more information Email: {gongm,carey}@cpsc.ucalgary.ca