Improving Web Servers performance Objectives: • Scalable Web server System • Locally distributed architectures • Cluster-based Web systems • Distributed Web systems • Cluster-based solutions • Distributed Web-based solutions • Dispatching algorithms for cluster-based Web systems
Reference “The State of the Art in Locally Distributed Web-server Systems” Valeria Cardellini, Emiliano Casalicchio, Michele Colajanni and Philip S. Yu
Concepts • A Web server system is a system that provides web services • The trend is • An increasing number of clients • Growing complexity of web applications • Scalable Web server systems • The ability to support large numbers of accesses and resources while still providing adequate performance
Locally Distributed Web System • Cluster-based Web system • The server nodes mask their IP addresses from clients, using a virtual IP (VIP) address that corresponds to one device (the web switch) in front of the set of servers • The web switch receives all packets and then sends them to the server nodes • Distributed Web system • The IP addresses of the web server nodes are visible to clients • No web switch; just a layer-3 router may be employed to route the requests
Request routing mechanisms • After classifying the two Web systems • Cluster-based Web system • Distributed Web system • The question now becomes “how are packets routed to each of the web servers?”
Request routing mechanisms for cluster-based Web systems • Layer-4 switch • Content-blind routing • Layer-7 switch • Content-aware switches • Also called layer-5 switches in the TCP/IP protocol stack • What are the trade-offs between layer-4 and layer-7 switches?
Layer-4 one-way mechanisms • Packet single-rewriting • Same as the two-way architecture; the only difference is in the modification of the source address of outbound packets • Packet tunneling • Also known as IP encapsulation • IP datagrams are encapsulated within IP datagrams • Requires that all servers support IP tunneling • Packet forwarding • Assumes that the web switch and the server nodes are on the same LAN • All nodes share the VIP address • Server nodes need to disable ARP • The web switch forwards the inbound packet to the target server without modifying the TCP/IP header
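One reading of packet single-rewriting can be sketched as follows. The slide does not spell out who rewrites what, so this sketch assumes that the web switch rewrites only the destination address of inbound packets and that each server rewrites the source address of its own replies to the VIP. The `Packet` class, the function names, and all addresses are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, replace

VIP = "10.0.0.100"  # hypothetical virtual IP address advertised to clients


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    payload: bytes = b""


def switch_inbound(pkt: Packet, server_ip: str) -> Packet:
    """Web switch: rewrite the destination from the VIP to the chosen server."""
    if pkt.dst_ip != VIP:
        raise ValueError("inbound packet is not addressed to the VIP")
    return replace(pkt, dst_ip=server_ip)


def server_outbound(pkt: Packet) -> Packet:
    """Server: rewrite the source to the VIP so the reply can bypass the switch."""
    return replace(pkt, src_ip=VIP)
```

Because the reply already carries the VIP as its source, it can travel straight to the client without passing back through the switch, which is what makes the mechanism one-way.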
LAN Addresses • Each adapter on a LAN has a unique LAN address
LAN Address (more) • MAC address allocation administered by IEEE • manufacturer buys portion of MAC address space (to assure uniqueness) • Analogy: • MAC address: like Social Security Number • IP address: like postal address • MAC flat address => portability • IP hierarchical address NOT portable
Routing discussion • Starting at A, given an IP datagram addressed to B: • Look up the network address of B and find that B is on the same network as A • The link layer sends the datagram to B inside a link-layer frame • Frame source and destination addresses: A’s MAC address, B’s MAC address • Datagram source and destination addresses: A’s IP address, B’s IP address • [Figure: example network with hosts A, B, and E; interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27; the frame wraps the datagram, which carries the IP payload]
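The "same network" test in the first step can be sketched with Python's standard `ipaddress` module. The addresses below appear in the figure, but which of them belong to A and B is not recoverable from the slide, and the /24 prefix length is an assumption; the function name `same_network` is hypothetical.

```python
import ipaddress


def same_network(src: str, dst: str, prefix_len: int = 24) -> bool:
    """Return True if dst lies in the network that src belongs to.

    prefix_len=24 is an assumed prefix length, not taken from the slide.
    """
    network = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    return ipaddress.ip_address(dst) in network


# Same /24: deliver directly inside a link-layer frame.
print(same_network("223.1.1.1", "223.1.1.3"))
# Different /24: the datagram must go through a router instead.
print(same_network("223.1.1.1", "223.1.2.2"))
```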
ARP: Address Resolution Protocol • Question: how to determine the MAC address of B knowing B’s IP address? • Each IP node (host or router) on a LAN has an ARP table • ARP table: IP/MAC address mappings for some LAN nodes, stored as <IP address; MAC address; TTL> • TTL (Time To Live): time after which the address mapping will be forgotten (typically 20 min)
ARP protocol • A wants to send a datagram to B, and A knows B’s IP address • Suppose B’s MAC address is not in A’s ARP table • A broadcasts an ARP query packet containing B’s IP address • All machines on the LAN receive the ARP query • B receives the ARP packet and replies to A with its (B’s) MAC address • The frame is sent to A’s MAC address (unicast) • A caches (saves) the IP-to-MAC address pair in its ARP table until the information becomes old (times out) • Soft state: information that times out (goes away) unless refreshed • ARP is “plug-and-play”: nodes create their ARP tables without intervention from the network administrator
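The soft-state behaviour of the ARP table can be sketched as a small cache whose entries expire after a TTL. This is only an illustration of the idea, not how any particular operating system implements ARP; the class name `ArpTable`, its methods, and the injectable `clock` parameter are assumptions introduced here. The 20-minute default mirrors the typical value quoted above.

```python
import time


class ArpTable:
    """Toy ARP cache: IP -> (MAC, expiry time). Entries vanish unless refreshed."""

    def __init__(self, ttl_seconds: float = 20 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be exercised
        self._entries = {}

    def learn(self, ip: str, mac: str) -> None:
        """Record (or refresh) a mapping, e.g. after receiving an ARP reply."""
        self._entries[ip] = (mac, self._clock() + self._ttl)

    def lookup(self, ip: str):
        """Return the cached MAC, or None if it is unknown or has timed out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip]  # soft state: forget stale mappings
            return None
        return mac
```

A `None` result from `lookup` corresponds to the case where A must broadcast an ARP query before it can send the frame.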
Layer-7 two-way mechanisms • TCP gateway • An application-level proxy running on the web switch mediates the communication between the client and the server • Makes separate TCP connections to the client and to the server • TCP splicing • Reduces the overhead of the TCP gateway • For outbound packets, packet forwarding occurs at the network level by rewriting the client IP address
Layer-7 two-way mechanisms • TCP gateway • An application-level proxy running on the web switch mediates the communication between the client and the server • TCP splicing • Reduces the overhead of the TCP gateway • Packet forwarding occurs at the network level, between the network interface driver and the TCP/IP stack, and is carried out directly by the operating system • [Figure: user-space and kernel-space data paths for the TCP gateway and for TCP splicing]
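Content-aware Switch • Front end of a set of web servers • Routes packets based on layer-5/7 (content) information • [Figure: a client request for www.yahoo.com arrives from the Internet at the switch as IP, TCP, and application data (GET /cgi-bin/form HTTP/1.1 Host: www.yahoo.com…); the switch sits in front of an image server, an application server, and an HTML server]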
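The routing decision in the figure can be sketched as a prefix match on the request target. The three back-end names follow the figure; the routing table itself, the `/images/` prefix, and the function names are assumptions made for illustration, and a real switch would work on reassembled TCP payloads rather than a complete byte string.

```python
# Hypothetical routing table: path prefix -> back-end server name.
ROUTES = [
    ("/cgi-bin/", "application-server"),
    ("/images/", "image-server"),
]
DEFAULT_SERVER = "html-server"


def parse_request_target(raw_request: bytes) -> str:
    """Extract the request target from the first line of an HTTP request."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    _method, target, _version = request_line.split(" ")
    return target


def choose_server(raw_request: bytes) -> str:
    """Pick a back end by inspecting layer-7 content (the URL path)."""
    target = parse_request_target(raw_request)
    for prefix, server in ROUTES:
        if target.startswith(prefix):
            return server
    return DEFAULT_SERVER


request = b"GET /cgi-bin/form HTTP/1.1\r\nHost: www.yahoo.com\r\n\r\n"
print(choose_server(request))
```

Note that the switch cannot make this choice until the client has sent the request, which means the TCP connection must already be established with the switch itself.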
Why use Content-aware Switching? • Servers can be specialized for certain types of request • Content segregation • Exploit locality • Affinity-based routing • Increases performance because of the improved cache hit rate • Partial replication of the server file set • Partition the server’s file set over different nodes
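One simple way to obtain affinity-based routing is to hash the requested path so that the same URL always lands on the same node, which keeps that node's cache warm for it. This is a sketch of the idea only; the hashing scheme, the function name `node_for_url`, and the node names are assumptions, not something the slides prescribe.

```python
import hashlib


def node_for_url(path: str, nodes: list) -> str:
    """Map a URL path to one node deterministically (static hash partitioning)."""
    if not nodes:
        raise ValueError("at least one server node is required")
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical server names
# Repeated requests for the same path always reach the same node.
print(node_for_url("/cgi-bin/form", nodes) == node_for_url("/cgi-bin/form", nodes))
```

A plain modulo hash like this remaps most paths whenever the node list changes, so it suits a fixed cluster better than one that grows or shrinks often.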
URL Parsing is expensive!! • Performing content-aware routing implies that some kind of string searching and matching algorithm is required • Such a time-consuming function is expensive on a heavy-traffic web site • Experience has shown that system performance is severely degraded if URL parsing functions are implemented in the distributor
TCP splicing • Once the two TCP connections are established, they are spliced • IP packets are forwarded at the network layer • TCP splicing requires • Connection binding • Packet analyzer to rewrite packets • Appropriate address translation • Sequence number modifications to be performed on the packets • Basically, we are deploying connection re-use
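The sequence number modification can be sketched as a constant offset applied modulo 2^32. The sketch is deliberately simplified: it covers only the server-to-client sequence space, and it assumes the switch sent no data bytes of its own to the client before splicing. The function and parameter names (`make_seq_translator`, `switch_isn`, `server_isn`) are hypothetical.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def make_seq_translator(switch_isn: int, server_isn: int):
    """Build translators between the server's and the switch's sequence spaces.

    switch_isn: initial sequence number the switch used toward the client.
    server_isn: initial sequence number the server used toward the switch.
    """
    delta = (switch_isn - server_isn) % SEQ_MOD

    def server_to_client_seq(seq: int) -> int:
        # Outbound data: shift the server's numbering into what the client expects.
        return (seq + delta) % SEQ_MOD

    def client_to_server_ack(ack: int) -> int:
        # Inbound ACKs: shift the client's acknowledgment back for the server.
        return (ack - delta) % SEQ_MOD

    return server_to_client_seq, client_to_server_ack
```

Once the offset is fixed at splice time, each packet needs only this arithmetic plus address and checksum updates, which is why forwarding can stay at the network layer.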
Layer-7 one-way mechanisms • TCP handoff • The switch hands off the TCP connection endpoint to the server • Needs changes to the OS on both components • TCP connection hop • Software-based proprietary solution • Encapsulates the IP packet and sends it to the server
Layer-7 one-way mechanisms • Migrate the created TCP connection from the switch to the back-end server • Create a TCP connection at the back end without going through the TCP three-way handshake • Retrieve the state of an established connection and destroy the connection without going through the normal message handshake required to close a TCP connection • Once the connection is handed off to the back-end server, the switch must forward packets from the client to the appropriate back-end server
Summary • So far, we have discussed: