Web Servers & Load Balancing Techniques 3/20/2001 송준화 김영호 Network Computing Laboratory EE. KAIST
Part I: Web Servers
Overview
• What is a web server?
• Market share
• How does a web server work?
• How does a web server serve content?
• Architectures of web servers
• Examples: Apache, AOLServer, Jigsaw
• Issues on web servers
• Load balancing techniques (Part II)
• References
What is a Web Server?
• An application that runs on a server and does the following
  • Provides connections to remote computers
  • Sends web pages to remote computers via the Internet or an intranet
• Examples of web servers
  • Apache
  • MS Internet Information Server for Windows NT
  • AOLServer
Market Share
How does a Web Server Work?
• Static content
  • The web server receives a request for a web page such as http://www.kaist.ac.kr/index.html
  • The server maps the URL to a local file on the host.
  • The server then loads this file from disk and sends it across the network to the user's web browser.
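The URL-to-file mapping step for static content can be sketched as follows. This is a minimal illustration, not how any particular server implements it: the function name `url_to_local_path`, the default file name `index.html`, and the document root `/usr/httpd/pub` are assumptions chosen for the example.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def url_to_local_path(url, document_root="/usr/httpd/pub"):
    """Map a request URL to a file path under a (hypothetical) document root."""
    # Keep only the path component of the URL and decode %xx escapes.
    url_path = unquote(urlsplit(url).path) or "/"
    # Assumption for this sketch: a directory request serves index.html.
    if url_path.endswith("/"):
        url_path += "index.html"
    # Normalizing an absolute path collapses "." and ".." segments, so the
    # result can never climb above the document root.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    return document_root.rstrip("/") + normalized
```

A real server would then open the resulting file and stream its bytes to the browser; the sketch stops at the mapping step and does not touch the file system.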
• Dynamic content
  • "Dynamic" means web pages created in response to a user's input (e.g., via CGI, the Common Gateway Interface)
  • The web server runs a program locally and transmits its output to the web browser that requested the dynamic content
  • The browser never needs to know that the content is dynamic, because CGI is essentially a web server extension protocol.
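A rough sketch of the CGI idea: the server passes request data to a local program through environment variables, runs it, and relays its output. The helper `run_cgi` and the tiny `DEMO_SCRIPT` are hypothetical names made up for this illustration; only the environment variable names come from the CGI convention.

```python
import os
import subprocess
import sys

# A stand-in CGI program: prints a header, a blank line, then the body.
DEMO_SCRIPT = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('Hello, ' + os.environ['QUERY_STRING'])\n"
)


def run_cgi(command, query_string):
    """Run a local program CGI-style and split its output into headers and body."""
    # CGI hands request data to the program through environment variables.
    env = dict(
        os.environ,
        REQUEST_METHOD="GET",
        QUERY_STRING=query_string,
        GATEWAY_INTERFACE="CGI/1.1",
    )
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    # The program's output is a header block, a blank line, then the body.
    headers, _, body = result.stdout.partition("\n\n")
    return headers, body


if __name__ == "__main__":
    print(run_cgi([sys.executable, "-c", DEMO_SCRIPT], "name=kaist"))
```

From the browser's point of view the relayed output looks like any other response, which is why it does not need to know the content was generated on demand.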
How does a web server serve content?
• The primary mechanism for telling the browser how to display content is the MIME type, sent in the Content-Type response header.
• Multipurpose Internet Mail Extensions (MIME) types tell a web browser what sort of document is being sent.
• More than 370 MIME types are distributed with the Apache web server by default in the mime.types configuration file.
• e.g., entries from Apache's mime.types file:
  • text/xml xml
  • video/mpeg mpeg mpg mpe
  • video/quicktime qt mov
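The mime.types entries above pair one MIME type with a list of file extensions. A small sketch of how such a file could be turned into an extension lookup is shown below; the function names and the `application/octet-stream` fallback are assumptions for this example, not Apache's actual code.

```python
MIME_TYPES_SAMPLE = """\
text/xml xml
video/mpeg mpeg mpg mpe
video/quicktime qt mov
"""


def parse_mime_types(text):
    """Build an extension -> MIME type table from mime.types-style lines."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace; skip empty lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        mime_type, *extensions = line.split()
        for extension in extensions:
            mapping[extension.lower()] = mime_type
    return mapping


def content_type_for(filename, mapping, default="application/octet-stream"):
    """Pick the Content-Type value for a file based on its extension."""
    _, dot, extension = filename.rpartition(".")
    if not dot:
        return default  # no extension at all
    return mapping.get(extension.lower(), default)
```

With the sample table, `content_type_for("movie.mov", parse_mime_types(MIME_TYPES_SAMPLE))` yields `video/quicktime`, which the server would send as the Content-Type header.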
[Figure: web server reference architecture. Labels: Browser, Web Server, Reception, Request Analysis, Access Control, Resource Handler, Record Transaction, Utility, Operating System Abstraction Layer, Operating System]
Architecture of Web Server
• Reception
  • Interprets the resource request protocol
  • Parses the request and builds an internal representation of it
  • Determines the capabilities of the browser (e.g., simple text browser or graphics-capable browser)
• Request Analyzer
  • Translates the location of the resource from a network location to a local file name
  • e.g., ~/index.html could be translated to the local file /usr/httpd/pub/index.html
• Access Control
  • Enforces the access rules employed by the server
  • Authenticates the browser and authorizes its access to the requested resources
February 19, 2001 PC Data Online
• Resource Handler
  • Determines the type of resource requested by the browser, executes it, and generates the response.
• Record Transaction
  • Records all requests and their results.
• Support Layer
  • Consists of the Utility subsystem and the Operating System Abstraction Layer
  • Provides functions used by the subsystems above
• Utility subsystem
  • Contains functions that are used by all other subsystems.
  • Includes functions for manipulating strings and URLs, and other commonly used functions
• Operating System Abstraction Layer
  • Encapsulates OS-specific functionality to make it easier to port the server to different platforms
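The subsystems described above can be read as stages of a request pipeline. The sketch below strings them together in the simplest possible way; every function name, the in-memory `files` table, the `/private/` rule, and the document root are assumptions invented for the illustration and do not correspond to any specific server.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Internal representation of a request, built by Reception."""
    method: str
    url_path: str
    user_agent: str
    local_path: str = ""


def reception(raw):
    # Parse the request line and headers into the internal representation.
    head, _, _ = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, url_path, _version = lines[0].split()
    headers = dict(line.split(": ", 1) for line in lines[1:] if ": " in line)
    return Request(method, url_path, headers.get("User-Agent", ""))


def request_analysis(request, document_root="/usr/httpd/pub"):
    # Translate the network location into a local file name.
    request.local_path = document_root + request.url_path
    return request


def access_control(request, denied_prefixes=("/private/",)):
    # Hypothetical rule: anything under /private/ is refused.
    return not request.url_path.startswith(denied_prefixes)


def resource_handler(request, files):
    # "files" stands in for the disk: local path -> content.
    body = files.get(request.local_path)
    return (200, body) if body is not None else (404, "Not Found")


def record_transaction(log, request, status):
    log.append(f"{request.method} {request.url_path} {status}")


def serve(raw, files, log):
    request = request_analysis(reception(raw))
    if access_control(request):
        status, body = resource_handler(request, files)
    else:
        status, body = 403, "Forbidden"
    record_transaction(log, request, status)
    return status, body
```

Each stage only depends on the `Request` object passed to it, which mirrors the way the reference architecture separates the subsystems.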
Example (1): Apache
• Freely available
  • Source code
  • Binaries for many platforms (version 1.3.x also supports Windows NT)
• Web server originally based on the NCSA server (in 1995)
• Over 60% of Internet web servers run Apache or an Apache derivative (December 2000 survey)
• Process-based
  • Version 2.0 will support multithreading
• Highly configurable, with many directives
• Optional modules provide extra functionality
• Apache is "A PAtCHy server"
  • Built from patches to NCSA httpd 1.3
• Strong performance and continual upgrades
[Figure: Apache architecture. Labels: core, Reception, Request Analysis, Translation, Access Control, Authentication, Authorization, Resource Handler, Mime type, Response, Record Transaction, Logging, Util, OS Layer]
Apache
• Core: maintains multiple processes
• request_rec: internal representation of a request
Example (2): AOLServer
• Commercial web server
  • Developed by AOL
  • Source opened in 1999
  • First released in 1995
• Powerful database support
• Provides extensibility
  • Through a maintainable and safe extension language
  • Uses Tcl (Tool Command Language) as the extension language
[Figure: AOLServer architecture (NS: NaviSoft). Labels: Communication Driver, Reception, Request Analysis, Daemon Core, NSPerm, Access Control, NSLog, Record Transaction, URL Handle, Resource Handler, Util, OSAL, Database Interface, TCL Interpreter, Timer, NSthread (platform-independent thread library)]
AOLServer
• Richer OSAL and Utility subsystems than Apache
  • Portable thread library implementation
  • Database interface
  • Timer
    • Event scheduling, connection time-outs, etc.
  • Tcl interpreter
• Support for multiple network protocols
• Internal structure: Conn
Jigsaw
• Experimental server developed by W3C
  • For analyzing Internet protocols and standards
• Open source, first released in 1996
• Written in Java
  • Platform independent
  • Has no OSAL
• Extensibility
  • Object-oriented design
[Figure: Jigsaw architecture. Labels: Daemon, Reception, Protocol Frame In Filter, Protocol Frame Out Filter, Access Control, Record Transaction, Resource In Filter, Resource Out Filter, Resource Handler, Resource, Util]
Jigsaw
• Daemon: maintains a thread pool for concurrency
• Filters: for different experiments?
Issues on Web Servers
• Connection explosion
  • With the rapid growth of WWW applications on the Internet, a web server may receive a huge number of connection requests in a very short time
• Research trends on web servers
  • Load balancing
  • Distributed, scalable web servers
Part II: Load Balancing Techniques
Junehwa Song, Young Ho Kim
Load Balancing Techniques
• Mirror
• Client-based approach
• DNS-based approach
• Dispatcher-based approach
  • Packet single rewriting
  • Packet double rewriting
  • Network Dispatcher
• Server-based approach
  • HTTP redirection
  • Packet redirection
Mirror
• Replicates information across a mirrored server architecture
• The user manually selects an alternative URL
  • Not user transparent
  • Does not allow the web-server system to control request distribution
Client-Based Approach
• Web client
  • The web client selects a node of the cluster and submits the request to that node
  • The Netscape home page (http://www.netscape.com) uses this technique
  • When a user accesses this site, Navigator selects a random number i between 1 and the number of servers and directs the request to the node wwwi.netscape.com
  • Has limited practical applicability and is not scalable
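The client-side selection described for Navigator amounts to one random draw. A minimal sketch, assuming the number of servers is known to the client (the function name and the `rng` parameter are hypothetical):

```python
import random


def pick_server(num_servers, rng=random):
    """Choose a node name the way the slide describes: www<i>.netscape.com."""
    # randint is inclusive on both ends, so i ranges over 1..num_servers.
    i = rng.randint(1, num_servers)
    return f"www{i}.netscape.com"
```

Because the server count is fixed in the client, adding or removing servers requires updating every client, which is one way to see why the approach scales poorly.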
• Smart client
  • Migrates server functionality to the client through a Java applet
  • Increases network traffic and network delay
• Client-side proxies
  • From the web cluster's standpoint, proxy servers are similar to clients
DNS-Based Approach
• The DNS server maps the domain name to multiple IP addresses
  • It returns more than one IP address for the hostname, or a different IP address for each DNS request it receives (round robin)
• User transparent
• Simple and easy to implement
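Round-robin resolution can be sketched with a rotating iterator per hostname. The class name, the hostname, and the 192.0.2.x addresses (a range reserved for documentation) are placeholders for this example, not part of any real DNS server.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy name server that hands out a hostname's addresses in rotation."""

    def __init__(self, records):
        # One endless rotation per hostname.
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, hostname):
        # Each request receives the next address in the rotation.
        return next(self._rotations[hostname])
```

The rotation ignores how busy each server is, which is the fairness limitation of plain round robin.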
• Drawbacks
  • The DNS server does not know the state of the whole system
  • Distribution is not really fair, because DNS uses simple round robin
  • DNS may run into the TTL problem with cached IP addresses
    • Between the client and the web server's DNS, many intermediate name servers can cache the name-to-IP-address mapping to reduce network traffic, and every web browser typically caches some address resolutions
• Because of address caching, each address mapping can cause a burst of future requests to the selected server and quickly make the current load information obsolete
• Many DNS-based solutions address this problem
  • System-stateless algorithms
  • Server-state-based algorithms
  • Client-state-based algorithms
  • Adaptive TTL algorithms
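The caching effect can be seen in a small simulation: an intermediate resolver that honors a TTL keeps sending every request to the same server until the entry expires. Everything here (class name, one request per time unit, two placeholder addresses) is an assumption made for the illustration.

```python
from itertools import cycle


class CachingResolver:
    """Intermediate name server that caches one answer for `ttl` time units."""

    def __init__(self, authoritative_lookup, ttl):
        self._lookup = authoritative_lookup
        self._ttl = ttl
        self._cached_ip = None
        self._expires_at = 0

    def resolve(self, now):
        # Ask the authoritative DNS only when the cached entry has expired.
        if self._cached_ip is None or now >= self._expires_at:
            self._cached_ip = self._lookup()
            self._expires_at = now + self._ttl
        return self._cached_ip


def simulate(ttl, requests=10):
    """Return the server chosen for each of `requests` requests, one per time unit."""
    rotation = cycle(["192.0.2.1", "192.0.2.2"])  # round-robin authoritative DNS
    resolver = CachingResolver(lambda: next(rotation), ttl)
    return [resolver.resolve(now) for now in range(requests)]
```

With `ttl=0` the requests alternate between the two servers; with `ttl=5` the first five requests all land on one server and the next five on the other, which is the burst the slide describes.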
Dispatcher-Based Approach
• Centralizes request scheduling and completely controls client-request routing
• Request routing among servers is transparent, unlike in the DNS-based approach
  • DNS-based approaches handle addresses at the URL level, whereas the dispatcher has a single virtual IP address (IP-SVA)
• The dispatcher uniquely identifies each server in the system through a private address
• The dispatcher typically uses simple algorithms to select the web server
Packet Single Rewriting
• A TCP router acts as an IP address dispatcher
• The router tracks the source IP address of every established TCP connection so that packets belonging to the same connection are routed to the same web server node
• High system availability
  • When a server fails, its address can be removed from the router's table
• Can be combined with a DNS-based solution
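The router's bookkeeping can be sketched as a connection table plus a destination rewrite. This is an illustrative model only: the `Packet` fields, the round-robin choice for new connections, and keying the table on source IP and port are assumptions, and a real TCP router works on actual IP packets rather than Python objects.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str


class TcpRouter:
    """Rewrites the destination of incoming packets from the virtual IP to a server."""

    def __init__(self, virtual_ip, servers):
        self.virtual_ip = virtual_ip
        self.servers = list(servers)
        self.connections = {}  # (src_ip, src_port) -> chosen server address
        self._next = 0

    def route(self, packet):
        key = (packet.src_ip, packet.src_port)
        server = self.connections.get(key)
        # New connection, or its server was removed: choose one (round robin).
        if server is None or server not in self.servers:
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.connections[key] = server
        # Single rewriting: only the destination address is changed.
        return replace(packet, dst_ip=server)

    def remove_server(self, server):
        # A failed server is dropped from the table of candidates.
        self.servers.remove(server)
```

Packets from the same client endpoint keep reaching the same server, and calling `remove_server` takes a failed node out of rotation without any change visible to clients.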
Packet Double Rewriting
• Two solutions use this approach
  • Magicrouter
  • Cisco Systems' LocalDirector
• Because outgoing packets typically outnumber incoming request packets, the dispatcher, which must also rewrite the outgoing packets, becomes a bottleneck
Network Dispatcher
• Extends the basic TCP router mechanism to work with both LANs and WANs
• The dispatcher forwards packets to the selected server using the server's physical address, without modifying the IP header
Core and Sore Lab NRL project • http://core.kaist.ac.kr/nrlintro2.htm
Server-Based Approach
• Uses a two-level dispatching mechanism
  • Integrates the DNS-based approach with redirection techniques executed by the web servers
• Solves most DNS scheduling problems
• Two solutions
  • HTTP redirection
  • Packet redirection
HTTP Redirection
• In the figure above, server 1 redirects the request to server 2. Not client transparent!
• Overhead of intra-cluster communication
  • Every server must periodically transmit status information to the cluster DNS
• Increases response time on the client side, because the client must resend its request to the second server
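The redirection decision on an overloaded server can be sketched as follows. The load table, the threshold, the host names, and the function name are all hypothetical; the only protocol facts used are the HTTP 302 status code and the Location header.

```python
def handle_request(path, server_name, loads, max_load):
    """Return (status, headers): serve locally, or redirect to a less loaded server."""
    least_loaded = min(loads, key=loads.get)
    if loads[server_name] > max_load and least_loaded != server_name:
        # 302 tells the browser to repeat the request at the other server,
        # which is why the mechanism is visible to the client.
        return 302, {"Location": f"http://{least_loaded}{path}"}
    return 200, {"Content-Type": "text/html"}
```

The `loads` table stands in for the status information the servers exchange; keeping it current is the intra-cluster overhead mentioned above.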
Packet Redirection
• Uses a round-robin DNS mechanism to schedule requests among the web servers
• The server reached by a request reroutes the connection to another server through packet rewriting
• Transparent to the client!
• Packet rewriting overhead
References
[1] A reference architecture for Web servers. Proceedings of the Seventh Working Conference on Reverse Engineering, 2000, pp. 150-159.
[2] Cardellini, V.; Colajanni, M.; Yu, P.S. Dynamic load balancing on Web-server systems. IEEE Internet Computing, vol. 3, no. 3, May-June 1999, pp. 28-39.
[3] Hong, H.C.; Chen, Y.C. Design and practice of a dispatch server architecture. Proceedings of the 7th IEEE Workshop on Future Trends of Distributed Computing Systems, 1999, pp. 246-251.
[4] Mourad, A.; Huiqun Liu. Scalable Web server architectures. Proceedings of the Second IEEE Symposium on Computers and Communications, 1997, pp. 12-16.