Measuring the Capacity of a Web Server

Gaurav Banga, Peter Druschel
Presenter: Sumit Lohia
February 22, 2000

Outline
• Introduction
• HTTP Server Dynamics
• Simple Request Generator
• Scalable Clients
• Evaluation of Scalable Clients
• Conclusion

Introduction
• What do we measure?
  • Requests/sec
• How do we measure?
  • Live workloads
  • Synthetic workloads

Realistic Traffic
• Large number of simultaneous clients
• Bursty traffic (peaks of 8 to 10 times the average load)
• Delay and loss characteristics of a WAN
• Request file types
• Transfer sizes
• Locality of reference in the URLs requested

HTTP Server Dynamics
• Client to server: GET <uri> HTTP/1.0
• Server to client: HTTP/1.0 200 OK
• A new socket connection is made for each request

Connection Establishment Phase
• SYN to server
• SYN-ACK to client
• ACK to server
[Figure: HTTP connection establishment timeline]

OS and TCP Limitations
• Maximum backlog on a listen socket
  • This is the upper bound on the sum of the lengths of the SYN-RCVD and accept queues
  • Most systems today have high somaxconn values (e.g. 32767)
• Exponential backoff or connection timeout by the client if it misses the SYN-ACK packet
  • BSD retransmits at 6 seconds and 30 seconds, and gives up at 75 seconds

OS and TCP Limitations (2)
• Accept queue length depends on how fast the server calls accept()
• FIN also involves a 3-way exchange; a slow FIN sequence limits the number of sockets active on the server
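The backlog and accept-queue behaviour described on these two slides can be tried on a loopback socket. The following is a minimal Python sketch, assuming a Unix-like host; `open_listener` and `queue_then_accept` are hypothetical helper names, not from the talk. It illustrates that connections complete the handshake and wait in the accept queue before the server calls accept(). On many systems the kernel silently caps the requested backlog at its somaxconn limit.

```python
import socket

def open_listener(backlog, host="127.0.0.1"):
    """Listen on an ephemeral loopback port with the requested backlog."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))      # port 0: let the OS pick a free port
    srv.listen(backlog)      # the kernel may cap this value (somaxconn)
    srv.settimeout(5)        # avoid hanging forever in accept()
    return srv

def queue_then_accept(n_clients, backlog):
    """Connect n_clients before any accept(), then drain the accept queue."""
    srv = open_listener(backlog)
    addr = srv.getsockname()
    # These connects complete although accept() has not been called yet:
    # the kernel finishes each handshake and parks the connection in the queue.
    clients = [socket.create_connection(addr, timeout=5)
               for _ in range(n_clients)]
    accepted = 0
    for _ in range(n_clients):
        conn, _peer = srv.accept()
        conn.close()
        accepted += 1
    for c in clients:
        c.close()
    srv.close()
    return accepted
```

Keeping `n_clients` at or below `backlog` keeps the sketch deterministic; beyond that point the behaviour depends on the operating system.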


Simple Request Generator
• A set of N web client processes executing on P client machines
• Client machines and server share a LAN
• Each client sends a request, receives the response, waits for a think time, and repeats the cycle
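The request/response/think cycle of one such client can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `http10_get` and `simple_client` are hypothetical names, and the transaction is the plain HTTP/1.0 GET shown earlier in the deck, with one new connection per request.

```python
import socket
import time

def http10_get(host, port, uri):
    """One HTTP/1.0 transaction on a fresh connection; returns the raw response."""
    with socket.create_connection((host, port)) as s:
        s.sendall(f"GET {uri} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            chunk = s.recv(65536)
            if not chunk:    # an HTTP/1.0 server closes the connection when done
                break
            chunks.append(chunk)
    return b"".join(chunks)

def simple_client(fetch, think_time_s, cycles):
    """Request, wait for the response, think, repeat.

    `fetch` is any zero-argument callable performing one transaction.
    Returns the number of completed requests.
    """
    completed = 0
    for _ in range(cycles):
        fetch()              # blocks until the server has answered
        completed += 1
        time.sleep(think_time_s)
    return completed
```

A client process would call something like `simple_client(lambda: http10_get(host, 80, "/index.html"), 1.0, 100)`, with `host` set to the server under test. Because `fetch` blocks, such a client cannot issue its next request until the server has answered the previous one.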

Problems with Simple Method
• Clients stay in lockstep with the server
• When the server is running at full capacity, all additional clients wait in the accept queue and generate no additional requests
• When clients are added beyond the accept queue length, TCP exponential backoff generates further requests only at a very low rate (0.04 requests/sec)
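One way to arrive at the 0.04 requests/sec figure is from the BSD timings quoted earlier in the deck. The sketch below assumes a blocked client sends its initial SYN plus the retransmissions at 6 and 30 seconds, then gives up at 75 seconds and starts over. That reading, and the names used, are assumptions for illustration.

```python
# Seconds after the first attempt at which a blocked BSD client sends a SYN:
# the initial SYN, then retransmissions at 6 s and 30 s (assumed reading).
SYN_TIMES_S = (0, 6, 30)
CONNECT_TIMEOUT_S = 75       # the client gives up here and starts over

def backoff_request_rate(syn_times=SYN_TIMES_S, timeout_s=CONNECT_TIMEOUT_S):
    """Connection attempts per second from one client stuck in backoff."""
    return len(syn_times) / timeout_s

rate = backoff_request_rate()    # 3 SYNs every 75 seconds
```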

Problems with Simple Method (2)
[Figure: request rate versus number of clients]

Problems with Simple Method (3)
• Does not model WAN characteristics, which cause long SYN-RCVD queues
• Resource constraints on a client machine may make the client the bottleneck


Scalable Clients
[Figure: testbed architecture]

Scalable Clients (2)
• WAN effects are modeled by adding an artificial delay to the router’s forwarding mechanism
• Request rate is independent of the server’s service rate
• Ability to model burstiness

Scalable Clients (3)
• Multiple clients on a single client machine
• 2 processes per client
  • Connection Establishment Process
  • Connection Handling Process

Connection Establishment Process
• Opens connections to the server in non-blocking mode
• Connections are spaced out over T milliseconds
• If a connection completes within T ms, send the request and hand the socket over to the Connection Handling process through a Unix domain socket
• If T ms have elapsed, close the socket and initiate another connection
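The core step, a non-blocking connect that is abandoned after T ms, could look roughly like the Python sketch below. It assumes a Unix-like host and covers a single attempt; `connect_with_deadline` is a hypothetical name, and the pacing of attempts and the hand-over to the other process are left out.

```python
import errno
import select
import socket

def connect_with_deadline(addr, timeout_s):
    """Try a non-blocking connect to addr; give up after timeout_s seconds.

    Returns the connected socket, or None after closing it on failure/timeout.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    err = s.connect_ex(addr)     # returns at once, usually with EINPROGRESS
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        s.close()
        return None
    # The socket becomes writable once the handshake succeeds or fails.
    _, writable, _ = select.select([], [s], [], timeout_s)
    if writable and s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
        return s
    s.close()                    # T elapsed or connect failed: abandon it
    return None
```

A caller that gets None back would immediately start another attempt, so the deadline chosen by the application replaces TCP's own much longer connection timeout.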

Connection Handling Process
• Waits for data to arrive on connected sockets
• Waits for new connections to arrive on the Unix domain socket
• Closes the socket once the response is complete
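The hand-over and the handling loop can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: it needs Python 3.9 or later on a Unix-like host for `socket.send_fds` and `socket.recv_fds`, the names `hand_over` and `handling_loop` are hypothetical, and a response is treated as complete when the server closes the connection, as in HTTP/1.0.

```python
import select
import socket

def hand_over(channel, conn):
    """Establishment side: pass conn's descriptor over the Unix domain socket."""
    socket.send_fds(channel, [b"c"], [conn.fileno()])
    conn.close()                 # the receiving process now owns the connection

def handling_loop(channel):
    """Handling side: read every response to EOF; return bytes read per response."""
    active = {}                  # connected socket -> bytes received so far
    sizes = []
    channel_open = True
    while channel_open or active:
        watch = ([channel] if channel_open else []) + list(active)
        readable, _, _ = select.select(watch, [], [])
        for s in readable:
            if s is channel:
                data, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
                if not data:     # establishment side closed the channel
                    channel_open = False
                for fd in fds:   # a newly handed-over connection
                    active[socket.socket(fileno=fd)] = 0
            else:
                chunk = s.recv(65536)
                if chunk:
                    active[s] += len(chunk)
                else:            # EOF: the response is complete
                    sizes.append(active.pop(s))
                    s.close()
    return sizes
```

In this sketch the loop ends once the channel is closed and every handed-over connection has reached EOF; a long-running load generator would keep the channel open instead.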

Scalable Client Model
[Figure: a scalable client]

Design Rationale
• Shorten the TCP connection timeout
  • Non-blocking connects, with socket close() if not connected within T ms
• Maintain a constant number of unconnected sockets
  • Establish another connection as soon as a connection fails

S-client Performance
[Figure: request rate versus number of clients]

Quantitative Evaluation
• NCSA httpd server
• somaxconn set to 1024
• No WAN delays
• 4 client machines, with the clients distributed equally among them

Quantitative Evaluation (2)
[Figure: request rate versus number of clients]

Overload Behavior
[Figure: web server throughput versus request rate]

Overload Behavior (2)
• Server saturates at 130 transactions/second
• At 2056 requests/second, the transaction rate drops to 75 transactions/second
• The drop in throughput is due to CPU resources spent on protocol processing for incoming requests

Bursty Conditions
• S-client configured to generate bursty traffic
• Two parameters are configured
  • Burst ratio: ratio of the maximum request rate to the average request rate
  • Burst duration: fraction of time for which the request rate exceeds the average rate
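To see how the two parameters interact, the Python sketch below assumes a simple two-level traffic model: the generator runs at the peak rate for the burst fraction of the time and at a lower base rate otherwise, so that the time-weighted mean equals the average rate. The two-level model and the name `burst_rates` are assumptions for illustration; the slides do not say what rate is used outside bursts.

```python
def burst_rates(avg_rate, burst_ratio, burst_fraction):
    """Return (peak_rate, base_rate) for a two-level bursty load.

    burst_ratio    = peak rate / average rate
    burst_fraction = fraction of time spent at the peak rate
    """
    if not 0 < burst_fraction < 1:
        raise ValueError("burst_fraction must lie strictly between 0 and 1")
    if burst_ratio < 1 or burst_ratio * burst_fraction > 1:
        raise ValueError("bursts alone would exceed the average load")
    peak = burst_ratio * avg_rate
    # Choose the base rate so the time-weighted mean equals avg_rate:
    # burst_fraction * peak + (1 - burst_fraction) * base == avg_rate
    base = avg_rate * (1 - burst_ratio * burst_fraction) / (1 - burst_fraction)
    return peak, base
```

For example, an average of 100 requests/second with a burst ratio of 4 and a burst fraction of 0.1 gives a peak of 400 requests/second, with the base rate reduced to compensate.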

Bursty Conditions (2)
[Figure: server throughput under bursty conditions versus request rate]

Conclusion
• S-clients offer a substantial improvement over the simple request generator for modeling overload and bursty conditions
• Combined with related work on traffic models, a more accurate benchmark can be developed

Q & A ?
