This study investigates how the correlation between file size and service time affects web server scheduling policies, and proposes a domain-based service time estimator. Measurements show weak correlation on a typical web server and on web cache servers, which highlights the need for a better service time estimator.
Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies • Dong Lu*+, Peter Dinda*, Yi Qiao*, Huanyuan Sheng* • *Northwestern University, +Ask Jeeves, Inc.
Outline • Quick review of size-based scheduling • Motivation and approach • Correlation between file size and service time: a measurement study • Performance of SRPT scheduling under real workload • Domain-based scheduling
Quick Review of Size-based Scheduling • SRPT • Shortest Remaining Processing Time • Assuming perfect knowledge of service times • FSP • Fair Sojourn Protocol • Assuming perfect knowledge of service times • Typical non-size-based scheduling • Processor Sharing (PS) • First Come First Serve (FCFS)
SRPT • Always serve the job with the minimum remaining processing time first; preemptive scheduling • Performance: minimum mean response time [Schrage, Operations Research, 1968] • Fairness: the performance gains of SRPT over PS usually do not come at the expense of large jobs; in other words, SRPT is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics '01]
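The SRPT rule on this slide can be sketched as a small single-server simulation. This is only an illustrative sketch, not the authors' C++ simulator: the function name `srpt_mean_response_time` and the `(arrival_time, service_time)` job format are assumptions made for the example, and service times are assumed to be known exactly (ideal SRPT).

```python
import heapq


def srpt_mean_response_time(jobs):
    """Mean response time under preemptive SRPT on a single server.

    jobs: list of (arrival_time, service_time) pairs (hypothetical format).
    Service times are assumed to be known exactly (ideal SRPT).
    """
    jobs = sorted(jobs)
    n = len(jobs)
    ready = []            # min-heap of (remaining_time, arrival_time)
    now = 0.0
    i = 0                 # index of the next job that has not arrived yet
    total_response = 0.0

    while i < n or ready:
        if not ready:
            # Server is idle: jump to the next arrival.
            now = max(now, jobs[i][0])
        # Admit every job that has arrived by the current time.
        while i < n and jobs[i][0] <= now:
            heapq.heappush(ready, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining processing time.
        remaining, arrival = heapq.heappop(ready)
        next_arrival = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_arrival - now)
        now += run
        if run < remaining:
            # A new arrival interrupts service; re-queue the unfinished job.
            heapq.heappush(ready, (remaining - run, arrival))
        else:
            total_response += now - arrival

    return total_response / n


if __name__ == "__main__":
    # One long job followed by two short ones: the short jobs preempt it.
    print(srpt_mean_response_time([(0, 10), (1, 2), (2, 1)]))
```

In this toy input the long job is preempted twice, and the mean response time is 17/3, compared with 32/3 if the same jobs were served first come, first served.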
FSP • Combines SRPT with PS; preemptive scheduling [Friedman, et al, Sigmetrics '03] • SRPT + the longer a job stays in the queue, the higher its priority • Performance: mean response time is close to that of SRPT • Fairness: fairer than PS
Motivation • Current implementations of SRPT and FSP use file size as service time (jobs are sorted by file size) • Is file size a good estimator of service time? • How do SRPT and FSP perform when file size is used as service time, and how can they be improved? • Service time: the time needed to send the requested data in the absence of other requests in the system
Trace-driven Simulation • Simulator: • C++ • Supports G/G/n/m queuing model • Driven by enhanced web server traces • Validation • Little’s law • Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics ‘03] • Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ‘01]
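A Little's law check of the kind listed under Validation could look like the sketch below. It assumes the simulator can emit one `(arrival, departure)` pair per completed job, which is a hypothetical output format; the function name `littles_law_check` is also illustrative. The time-average number in system L is computed by an event sweep and compared with λW computed from the per-job response times.

```python
def littles_law_check(records):
    """Return (L, lambda * W) for a list of (arrival, departure) pairs.

    L is the time-average number of jobs in the system, computed by
    sweeping over arrival/departure events. lambda is the arrival rate
    and W the mean response time over the same observation window.
    Assumes the window (first arrival to last departure) has nonzero length.
    """
    start = min(a for a, _ in records)
    end = max(d for _, d in records)
    horizon = end - start

    events = sorted(
        [(a, +1) for a, _ in records] + [(d, -1) for _, d in records]
    )
    area = 0.0        # integral of the number of jobs in system over time
    in_system = 0
    prev = start
    for t, delta in events:
        area += in_system * (t - prev)
        in_system += delta
        prev = t

    L = area / horizon
    lam = len(records) / horizon
    W = sum(d - a for a, d in records) / len(records)
    return L, lam * W


if __name__ == "__main__":
    L, lam_w = littles_law_check([(0, 13), (1, 3), (2, 4)])
    print(L, lam_w)   # the two values should agree
```

When every job both arrives and departs inside the window, the two quantities agree up to floating-point error, so a noticeable gap would point to a bookkeeping bug in the simulator rather than to statistical noise.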
Scheduling Policies Studied • SRPT: Ideal SRPT • SRPT-FS: File size as service time • SRPT-D: Domain-estimated service time • FSP: Ideal FSP • FSP-FS: File size as service time • FSP-D: Domain-estimated service time • PS: Processor sharing
Outline • Quick review of size-based scheduling • Motivation and approach • Correlation between file size and service time: a measurement study • Performance of SRPT-FS and FSP-FS scheduling under real workload • Domain-based scheduling
Correlation is Weak on a Typical Web Server • Measurement on a departmental web server, with requests from the whole Internet • [Figure: scatter plot of file size versus service time (log-log scale); correlation coefficient R ≈ 0.14]
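For reference, a correlation coefficient of this kind can be computed from paired measurements as sketched below. The slide does not say whether R was computed on raw or log-transformed values; this sketch assumes the plain Pearson coefficient on raw values, and the function name `pearson_r` and the sample data are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up sample: file sizes in bytes and service times in seconds.
    file_sizes = [1200, 5000, 800, 64000, 3000]
    service_times = [0.30, 0.05, 0.90, 0.40, 0.02]
    print(pearson_r(file_sizes, service_times))
```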
Correlation is Weak on Web Cache Servers • Measurement on 10 Squid web cache servers: www.ircache.net • [Figure: P[R>x] versus the correlation coefficient R between file size and service time; R axis from 0 to 0.7]
Main reason for the weak correlation • End-to-end path diversity • [Figure: a web server connected to Client 1, Client 2, Client 3, and Client 4]
Mean Response Time Much Worse Than Expected • Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load). • [Figure: mean response time (millisec) versus load on the queue (0 to 2.0) for SRPT-FS, FSP-FS, PS, and ideal SRPT and FSP]
Mean Queue Length Much Worse Than Expected • Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load). • [Figure: mean queue length versus load on the queue (0 to 2.0) for SRPT-FS, PS, FSP-FS, and ideal SRPT and FSP]
Requirements For A Better Service Time Estimator • Low overhead • Passive measurement • Low computational complexity • Low / adjustable memory usage • Effective • Approximates the correct ordering of the service times, i.e., high correlation with the actual service time
Domain-based estimator • Divide the Internet into smaller “domains” by leveraging CIDR (Classless Inter-Domain Routing) • Hosts in the same domain are likely to share the same or similar routes to the web server, and thus similar throughput
Supporting Facts • Statistical Internet stability and locality • Routing stability [Paxson, Sigcomm 1996] • TCP throughput locality and stability [Balakrishnan, et al, Sigmetrics 1997]; [Seshan, et al, USITS 1997]; [Myers, et al, Infocom 1999] • Classless Inter-domain Routing • implies that routes from machines in the domain to a server outside the domain will share many hops.
Algorithm • Use the high-order k bits of the client IP address to classify clients into 2^k domains • For each domain, calculate R = F/S • R: representative service rate • F: sum of file sizes delivered to the domain • S: sum of the corresponding service times • For each request, first extract its domain; the service time can then be estimated as B/R • B: requested file size • R: representative service rate of that domain, obtained as above
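The algorithm on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles IPv4 addresses only, the class and method names (`DomainEstimator`, `observe`, `estimate`) are hypothetical, and returning `None` for a domain with no history is an assumption, since the slide does not say what to do in that case.

```python
import ipaddress
from collections import defaultdict


class DomainEstimator:
    """Domain-based service time estimator (illustrative sketch, IPv4 only)."""

    def __init__(self, k_bits):
        if not 0 <= k_bits <= 32:
            raise ValueError("k_bits must be between 0 and 32")
        self.k_bits = k_bits
        self.bytes_sent = defaultdict(float)   # F: file sizes summed per domain
        self.time_spent = defaultdict(float)   # S: service times summed per domain

    def domain(self, client_ip):
        """Domain id = high-order k bits of the client's IPv4 address."""
        addr = int(ipaddress.IPv4Address(client_ip))
        return addr >> (32 - self.k_bits)

    def observe(self, client_ip, file_size, service_time):
        """Record a completed transfer (passive measurement)."""
        d = self.domain(client_ip)
        self.bytes_sent[d] += file_size
        self.time_spent[d] += service_time

    def estimate(self, client_ip, file_size):
        """Estimate service time as B / R, where R = F / S for the domain.

        Returns None when the domain has no usable history (assumption:
        the caller decides on a fallback, e.g. ordering by file size).
        """
        d = self.domain(client_ip)
        f = self.bytes_sent.get(d, 0.0)
        s = self.time_spent.get(d, 0.0)
        if f <= 0 or s <= 0:
            return None
        rate = f / s
        return file_size / rate


if __name__ == "__main__":
    est = DomainEstimator(k_bits=8)
    est.observe("10.1.2.3", file_size=1000, service_time=2.0)
    est.observe("10.9.9.9", file_size=3000, service_time=2.0)
    print(est.estimate("10.0.0.1", file_size=500))     # same /8 domain as above
    print(est.estimate("192.168.0.1", file_size=500))  # domain with no history
```

Memory use grows with the number of domains actually seen, at most 2^k, which is one way to read the "low / adjustable memory usage" requirement: a smaller k trades estimation accuracy for a smaller table.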
Higher Correlation Can Be Achieved • [Figure: correlation coefficient R versus bits used to define a domain (0 to 32)]
Much Lower Response Times Can Be Achieved • [Figure: mean response time (millisec) versus bits used to define a domain (0 to 32) for FSP-D, FSP-FS, SRPT-FS, SRPT-D, PS, and ideal SRPT and FSP]
Much Lower Queue Lengths Can Be Achieved • [Figure: mean queue length versus bits used to define a domain (0 to 35) for FSP-D, FSP-FS, SRPT-FS, SRPT-D, PS, and ideal SRPT and FSP]
Conclusions • File size may not be a good estimator of service time for many regimes • File size-based SRPT and FSP can perform worse than PS in these regimes • Domain-based scheduling brings the benefits of size-based scheduling to these regimes
For more information • Prescience Lab at Northwestern University • www.presciencelab.org
Jeeves’ Invitation … • Have you ever seen the whole Web at once? • Did you ever wonder how to rein the power of thousands of machines? • We are hiring talents for Internet Search • Software Engineer • Development Manager Send us your Resume: talentacquisition@askjeeves.com