420 likes | 627 Views
Apache Architecture. How do we measure performance?. Benchmarks Requests per Second Bandwidth Latency Concurrency (Scalability). Building a scalable web server. handling an HTTP request map the URL to a resource check whether client has permission to access the resource
E N D
How do we measure performance? • Benchmarks • Requests per Second • Bandwidth • Latency • Concurrency (Scalability)
Building a scalable web server • handling an HTTP request • map the URL to a resource • check whether client has permission to access the resource • choose a handler and generate a response • transmit the response to the client • log the request • must handle many clients simultaneously • must do this as fast as possible
Resource Pools • one bottleneck to server performance is the operating system • system calls to allocate memory, access a file, or create a child process take significant amounts of time • as with many scaling problems in computer systems, caching is one solution • resource pool: application-level data structure to allocate and cache resources • allocate and free memory in the application instead of using a system call • cache files, URL mappings, recent responses • limits critical functions to a small, well-tested part of code
Multi-Processor Architectures • a critical factor in web server performance is how each new connection is handled • common optimization strategy: identify the most commonly-executed code and make this run as fast as possible • common case: accept a client and return several static objects • make this run fast: pre-allocate a process or thread, cache commonly-used files and the HTTP message for the response
Connections • must multiplex handling many connections simultaneously • select(), poll(): event-driven, singly-threaded • fork(): create a new process for a connection • pthread create(): create a new thread for a connection • synchronization among processes/threads • shared memory: semaphores • message passing
Select • select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); • Allows a process to block until data is available on any one of a set of file descriptors. • One web server process can service hundreds of socket connections
Event Driven Architecture • one process handles all events • must multiplex handling of many clients and their messages • use select() or poll() to multiplex socket I/O events • provide a list of sockets waiting for I/O events • sleeps until an event occurs on one or more sockets • can provide a timeout to limit waiting time • must use non-blocking system calls • some evidence that it can be more efficient than process or thread architectures
Process Driven Architecture • devote a separate process/thread to each event • master process listens for connections • master creates a separate process/thread for each new connection • performance considerations • creating a new process involves significant overhead • threads are less expensive, but still involve overhead • may create too many processes/threads on a busy server
Process/Thread Pool Architecture • master thread • creates a pool of threads • listens for incoming connections • places connections on a shared queue • processes/threads • take connections from shared queue • handle one I/O event for the connection • return connection to the queue • live for a certain number of events (prevents long-lived memory leaks) • need memory synchronization
Hybrid Architectures • each process can handle multiple requests • each process is an event-driven server • must coordinate switching among events/requests • each process controls several threads • threads can share resources easily • requires some synchronization primitives • event driven server that handles fast tasks but spawns helper processes for time-consuming requests
What makes a good Web Server? • Correctness • Reliability • Scalability • Stability • Speed
Correctness • Does it conform to the HTTP specification? • Does it work with every browser? • Does it handle erroneous input gracefully?
Reliability • Can you sleep at night? • Are you being paged during dinner? • It is an appliance?
Scalability • Does it handle nominal load? • Have you been Slashdotted? • And did you survive? • What is your peak load?
Speed (Latency) • Does it feel fast? • Do pages snap in quickly? • Do users often reload pages?
Apache the General Purpose Webserver Apache developers strive forcorrectness first, andspeed second.
Classic “Prefork” Model • Apache 1.3, and • Apache 2.0 Prefork • Many Children • Each child handles one connection at a time. Parent Child Child Child … (100s) http://httpd.apache.org/docs/2.2/mod/prefork.html
Multithreaded “Worker” Model • Apache 2.0 Worker • Few Children • Each child handles many concurrent connections. Parent Child Child Child … (10s) 10s of threads
Dynamic Content: Modules • Extensive API • Pluggable Interface • Dynamic or Static Linkage
In-process Modules • Run from inside the httpd process • CGI (mod_cgi) • mod_perl • mod_php • mod_python • mod_tcl
Out-of-process Modules • Processing happens outside of httpd (eg. Application Server) • Tomcat • mod_jk/jk2, mod_jserv • mod_proxy • mod_jrun Parent Tomcat Child Child Child
Architecture: The Big Picture Parent 100s of threads 10s of threads Tomcat Child Child Child … (10s) DB mod_jk mod_rewrite mod_php mod_perl
“MPM” • Multi-Processing Module • An MPM defines how the server will receive and manage incoming requests. • Allows OS-specific optimizations. • Allows vastly different server models(eg. threaded vs. multiprocess).
“Child Process” aka “Server” • Called a “Server” in httpd.conf • A single httpd process. • May handle one or more concurrent requests (depending on the MPM). Parent Child Child Servers Child … (100s)
“Parent Process” • The main httpd process. • Does not handle connections itself. • Only creates and destroys children. • Shared memory scoreboard to determine who handles connections Parent Only one Parent Child Child Child … (100s)
“Thread” • In multi-threaded MPMs (eg. Worker). • Each thread handles a single connection. • Allows Children to handle many connections at once.
Prefork MPM • Apache 1.3 and Apache 2.0 Prefork • Each child handles one connection at a time • Many children • High memory requirements • “You’ll run out of memory before CPU”
Prefork Directives (Apache 2.0) • StartServers • MinSpareServers • MaxSpareServers • MaxClients • MaxRequestsPerChild
Worker MPM • Apache 2.0 and later • Multithreaded within each child • Dramatically reduced memory footprint • Only a few children (fewer than prefork)
Worker Directives • MinSpareThreads • MaxSpareThreads • ThreadsPerChild • MaxClients • MaxRequestsPerChild
Apache 1.3 and 2.0Performance Characteristics Multi-process, Multi-threaded, or Both?
Prefork • High memory usage • Highly tolerant of faulty modules • Highly tolerant of crashing children • Fast • Well-suited for 1 and 2-CPU systems • Tried-and-tested model from Apache 1.3 • “You’ll run out of memory before CPU.”
Worker • Low to moderate memory usage • Moderately tolerant to faulty modules • Faulty threads can affect all threads in child • Highly-scalable • Well-suited for multiple processors • Requires a mature threading library(Solaris, AIX, Linux 2.6 and others work well) • Memory is no longer the bottleneck.
Important Performance Considerations • sendfile() support • DNS considerations
sendfile() Support • No more double-copy • Zero-copy* • Dramatic improvement for static files • Available on • Linux 2.4.x • Solaris 8+ • FreeBSD/NetBSD/OpenBSD • ... * Zero-copy requires both OS support and NIC driver support.
aio • Process does not block on • Socket • Disk read or write • semaphores