
Advanced Internet and Web Systems


Presentation Transcript


  1. Advanced Internet and Web Systems C. Edward Chow

  2. Outline of the Talk • Syllabus • Introduction to WWW Systems • Survey of Web Cluster Systems • Survey of Caching Techniques • Server Selection and Load Balancing chow

  3. Introduction to WWW Systems • Web Server: hosts web pages • Web Client (browser): retrieves web pages over the Internet using the HTTP protocol • Web Authoring System: creates and publishes web pages; input devices include scanner, video capture, and sound card • Web page: a document written in HTML

  4. What is Unique in WWW? • Hyperlink: uses the Hypertext Markup Language (HTML) to describe the document in ASCII text (extended to ISO-8859-1) • Naming scheme: names each object on the web with a Uniform Resource Locator (URL), with syntax protocol://domain_name/<uri or path name> • HTTP: HyperText Transfer Protocol, a simple request-response protocol for transferring HTML documents • ASCII text based (not binary, therefore easy to debug)

  5. Web Authoring System • Text editor: type in HTML <tag>s and content • HTML editor: works like a normal word processor, so the user does not need to know much HTML syntax; provides easy upload/download functions. Examples: Dreamweaver, Netscape Page Composer, MS FrontPage • FrontPage goes a step further by providing templates and hyperlink management functions • Most desktop publishing software and word processors have built-in converters from their internal format to HTML, for example FrameMaker and Office 97 (requires a special viewer)

  6. Web Delivery Systems • Deliver web documents efficiently and reliably to web clients. • Content Distribution and Content Delivery • Performance is determined by web server performance, network path performance, and client browser performance. • Use multiple physical servers (a server farm), and multiple server farms across the wide area. • A new generation of proxy servers/content switches is emerging.

  7. Content Delivery Network (CDN) [Diagram. Nodes: Host Server, Clients. Networks: Sprint, UUnet, Gloobix, QWest, @Home, PSINet, MindSpring. Labels: Huge Requests, Slow Response, Server Crash.]

  8. Content Delivery Problems http://www.akamai.com

  9. Use Client Cache/Client Side Cache Server [Diagram. Nodes: Host Server, Client Cache, Client Side Cache Server, Clients. Networks: Sprint, UUnet, Gloobix, QWest, @Home, PSINet, MindSpring. Labels: Fewer Requests, Fast Response.]

  10. Use Mirror Sites [Diagram. Nodes: Host Server, Mirror Site, Clients. Networks: Sprint, UUnet, Gloobix, QWest, @Home, PSINet, MindSpring. Labels: Fewer Requests, Fast Response.] • Needs improvement: guide the selection of mirror servers with server load/network bandwidth measurements.

  11. Edge Network Cache Servers [Diagram. Nodes: Host Server, Mirror Site, Edge Network Cache Servers, Client Cache, Client Side Cache Server, Clients. Networks: Sprint, UUnet, Gloobix, QWest, @Home, PSINet, MindSpring. Labels: Fewer Requests, Fast Response.]

  12. Architecture solutions for scalable Web-server systems (Fig. 1)

  13. Fig. 2. Model architecture for a locally distributed Web system

  14. Fig. 3. Architecture of a cluster-based Web system

  15. Fig. 4. Architecture of a virtual Web cluster

  16. Fig. 5. Architecture of a distributed Web system

  17. Content Distribution • Secure, automated content/application distribution to a single site (multiple servers) or to wide-area Internet sites. • Provides replication, synchronization, staged rollout, and rollback. • With revision control, transmits only updates. • User-defined file distribution profiles/rules

  18. Content Delivery Problem • Cache Location Problem: Where to put cache servers? • How many are needed? • When/where/how to push/deliver the content? • How about dynamic content?

  19. Akamai Edge Delivery Service • Peering Bottleneck Problem: access traffic is spread evenly over 7400+ networks (no one network carries over 5%; most carry << 1%), so edge servers need to be placed in many networks. • 11/2000: 4 billion bits/day for 2800 sites. • Source: http://www.akamai.com

  20. F5 Web System Product [Diagram. Sites: Site I newyork.domain.com, Site II losangeles.domain.com, Site III tokyo.domain.com, london.domain.com. Components: 3-DNS, BIG-IP, GLOBAL-SITE, Router, Local DNS, Server Array. Other labels: Internet, User, Webmaster.]

  21. BIG/ip - Delivers High Availability • E-commerce - ensures sites are not only up and running, but taking orders • Fault-tolerance - eliminates single points of failure • Content Availability - verifies servers are responding with the correct content • Directory & Authentication - load balances multiple directory and/or authentication services (LDAP, Radius, and NDS) • Portals/Search Engines - using EAV, administrators perform keyword searches • Legacy Systems - load balances services to multiple interactive services • Gateways - load balances gateways (SAA, SNA, etc.) • E-mail (POP, IMAP, SendMail) - balances traffic across a large number of mail servers

  22. 3DNS Intelligent Load Balancing • Intelligent Load Balancing • QoS Load Balancing: Quality of Service load balancing is the ability to select and apply different load balancing methods for different users or request types • Modes of Load Balancing: Round Robin, Ratio, Least Connections, Random, User-defined Quality-of-Service, Round Trip Time, Completion Rate (Packet Loss), BIG/ip Packet Rate, Global Availability, Hops, Topology Distribution, Access Control, LDNS Round Robin, Dynamic Ratio, E-Commerce
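Two of the modes named on this slide, Ratio and Least Connections, can be illustrated with a minimal Python sketch. This is a generic illustration of the ideas, not F5's implementation; the server names and the tie-breaking rule are assumptions made for the example.

```python
from itertools import cycle


def ratio_schedule(weights):
    """Ratio mode (sketch): a server with weight 3 gets three turns per round.

    `weights` maps a hypothetical server name to a positive integer weight.
    """
    order = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(order)


def least_connections(active):
    """Least Connections mode (sketch): pick the server with the fewest
    active connections; ties go to the alphabetically first name."""
    return min(sorted(active), key=lambda name: active[name])
```

Setting every weight to 1 reduces `ratio_schedule` to plain Round Robin.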

  23. GLOBAL-SITE Replicate Multiple Servers and Sites • File archiving engine and scheduler for automated site and server replication • BIG-IP controls server availability during replication and synchronization: gracefully shuts servers down for update; updates in a group/scheduled manner • FTP transfers files from GLOBAL-SITE to target servers (agent free, scalable) • RCE for source control • No client-side software • Complete, turnkey system (appliance) (adapted from F5 presentation)

  24. Intel NetStructure • Routing based on XML tags (e.g., giving preferred treatment to buyers, large volume) • http://www.intel.com/network/solutions/xml.htm

  25. 1. Compared to SUN E450 server

  26. Simple Web Access Example: Step1
     • Someone requests a document using a browser (web client) on a computer connected to the Internet.
     • In the browser window, type in a URL: http://news.netcraft.com/archives/web_server_survey.html
     • Equivalent of
         % telnet www.netcraft.co.uk 80 > out
         GET /survey/ HTTP/1.0<cr><cr>
       Here <cr> is a "carriage return", entered by pressing the "Enter" key.
     • The browser parses the URL:
         • obtains the domain name from the URL, www.netcraft.co.uk
         • asks the Domain Name Server (DNS) to translate the domain name to an IP address
         • with the IP address, the client computer sets up an HTTP connection to the server
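The browser-side steps on this slide (parse the URL, resolve the name, connect, send the request) can be sketched in Python roughly as follows. The function names `plan_request` and `fetch` are made up for this illustration, and the `Host:` header is an addition that does not appear on the slide.

```python
import socket
from urllib.parse import urlsplit


def plan_request(url):
    """Parse a URL into (host, port, request bytes) for an HTTP/1.0 GET."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # port 80 is the HTTP default
    path = parts.path or "/"
    # The Host header is not on the slide; many servers host several names
    # on one address and need it to pick the right site.
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    return host, port, request


def fetch(url):
    """Resolve the name via DNS, open a TCP connection, send the request."""
    host, port, request = plan_request(url)
    ip_address = socket.gethostbyname(host)      # DNS: name -> IP address
    with socket.create_connection((ip_address, port), timeout=10) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:                         # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

`plan_request` needs no network access, so it can be tried on its own; `fetch` performs the real DNS lookup and connection.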

  27. Computer Network • Local Area Network (LAN): a privately owned network within a single building or campus of up to a few kilometers in size (Tanenbaum). • Wide Area Network (WAN): a network that spans a large geographical area, often a country or continent, and connects LANs or MANs. It consists of transmission lines (called circuits, channels, or trunks) and switching elements (called switching nodes, data switching exchanges, or routers). [Diagram labels: web client, web server, DNS, DNS]

  28. Protocol and Protocol Layer • A set of rules for achieving a global objective exercised by geographically distributed nodes. (Robert Gallager, Prof. EE MIT)

  29. Protocol Data Encapsulation

  30. Internet Protocol Layer Interface

  31. Simple Web Access Example: Step2
     • Browser sends the following character string to the server:
         GET /survey/ HTTP/1.0
         User-agent: Mosaic for X windows/2.4
         Accept: text/plain
         Accept: text/html
         Accept: image/*
     • The httpd server:
         • parses the request according to HTTP protocol 1.0
         • interprets the rest of the metainfo for browser capabilities
         • maps /survey/ to c:/InetPub/wwwroot/survey/default.htm, a file path in its file system, according to the server configuration
         • retrieves c:/InetPub/wwwroot/survey/default.htm or index.html
         • sends the information back using HTTP/1.0 format
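The mapping from a URL path to a file path can be sketched as below. This is a simplified illustration, not the logic of any particular server: the function name `candidate_files` is hypothetical, and the default document names are taken from the slide.

```python
import posixpath


def candidate_files(doc_root, url_path, index_names=("default.htm", "index.html")):
    """Return the file paths a server might try for a URL path, in order.

    Directory-style URLs (ending in "/") map to the default documents.
    normpath() collapses ".." segments so the result stays under doc_root.
    """
    wants_directory = url_path.endswith("/")
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    base = doc_root.rstrip("/") + clean.rstrip("/")
    if wants_directory or clean == "/":
        return [f"{base}/{name}" for name in index_names]
    return [base]
```

For example, `candidate_files("c:/InetPub/wwwroot", "/survey/")` lists `default.htm` first and `index.html` second, matching the order on the slide.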

  32. Simple Web Access Example: Step3
     • Server replies using HTTP/1.0 format:
         HTTP/1.0 200 Document follows
         Date: Tue, 19 Jan 1999 18:10:20 GMT
         Server: NCSA/1.5
         Content-type: text/html

         <html> <head><title>Netcraft Web Server Survey</title></head>
     • Server closes the file, sets a timeout, and waits for subsequent requests, such as images/MIDI files referenced in the web page (called a keep-alive connection). When the timeout expires, it closes the connection.
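To make the reply format concrete, here is a minimal Python sketch that splits a raw response into status line, headers, and body. The function name `parse_response` is an assumption for this example, and the sketch ignores details such as repeated or folded headers.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into (version, code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # names are case-insensitive
    return version, int(code), reason, headers, body
```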

  33. Simple Web Access Example: Step3a
     • Browser sends:
         GET /sample.htm HTTP/1.0
     • Server replies:
         HTTP/1.0 404 Object Not Found
         Content-Type: text/html

         <body><h1>HTTP/1.0 404 Object Not Found </h1></body>
     • Server closes the file and the network connection, then waits for the next request.

  34. Simple Web Access Example: Step4 • Browser receives the HTTP response, a web document with HTML tags, from the server. • Browser parses/processes the HTML document and displays the document content according to the tags. • When other images/audio/video data are referenced by <img> <object> <applet> tags, the browser initiates the retrieval of those data. • Some of these will be HTTP requests to the same web server, which is why keep-alive connections improve web server throughput. • A URL request may trigger many HTTP requests to several web servers.
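The point that one page can trigger many requests to several servers can be illustrated with a short Python sketch that collects the resources referenced by the three tags named above and groups them by host. The class and function names, and the example hosts in the usage note, are hypothetical; treating the `code` attribute of `<applet>` as a plain URL is a simplification.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Attribute that carries the resource reference for each tag (simplified).
ATTR_FOR_TAG = {"img": "src", "object": "data", "applet": "code"}


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of embedded resources while parsing a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = ATTR_FOR_TAG.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(urljoin(self.base_url, value))


def embedded_requests(base_url, html):
    """Group the follow-up requests a page triggers by target host."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    by_host = {}
    for url in collector.urls:
        by_host.setdefault(urlsplit(url).hostname, []).append(url)
    return by_host
```

Hosts that receive more than one entry are the ones where a keep-alive connection saves repeated connection setup.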

  35. HTTP
     • HTTP 1.0/1.1: http://www.w3.org/Protocols/rfc2068/rfc2068
     • An HTTP request consists of
         • method: GET, HEAD, POST, PUT, DELETE
         • Uniform Resource Identifier (URI)
         • protocol version
         • other info to modify or supplement the request, for example:
             • If-Modified-Since: (only return the object if it is newer than the given date)
             • Authorization: (user password or other authentication as required)
             • Accept: application/postscript
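The If-Modified-Since rule can be sketched as a small server-side decision in Python. The function name `status_for` is made up for this illustration; it returns 304 (Not Modified) when the stored object is no newer than the date the client sent, and 200 otherwise.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def status_for(if_modified_since, last_modified):
    """Choose 200 or 304 for a GET.

    if_modified_since: header value as a string, or None if absent.
    last_modified: timezone-aware datetime of the stored object.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200                    # unparsable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    return 304 if last_modified <= since else 200
```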

  36. HTTP Response
     • consists of
         • status line (success or failure), e.g., HTTP/1.1 400 Bad Request
           Status codes: 200 (Document Follows), 301 (Moved Permanently), 302 (Moved Temporarily), 304 (Not Modified), 401 (Unauthorized), 402 (Payment Required), 403 (Forbidden), 404 (Not Found), 500 (Server Error)
         • description of the information (metaheader): Server, Date, Content-Length, Content-Type, Content-Encoding, Last-Modified
         • the actual info requested

  37. Content-Type: MIME Type
     MIME Type                    File Extension
     text/plain                   txt, default (most servers)
     text/html                    htm, html
     application/postscript       ps
     application/ms-powerpoint    ppt
     application/x-javascript     js
     image/gif                    gif
     image/jpeg                   jpg
     audio/midi                   mid
     video/mpeg                   mpg
     x-world/x-vrml               wrl

  38. Configure MIME Types
     • To support new MIME types, both the web server and the web client may need to be reconfigured.
     • For the web server:
         • Include the new MIME type definition in the mime.types file in the web server's configuration directory
         • By default, most servers deliver unknown types as text/plain; the browser may then display them as "gibberish"
         • Restart the web server
     • For the web client:
         • Specify the external viewer associated with the MIME type
         • Or, install the plug-in associated with the MIME type
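The server-side lookup, including the text/plain fallback for unknown types, can be sketched in Python as follows. The table reuses the extension-to-type pairs from the previous slide; the function name `content_type` and the `.xyz` example are assumptions for this illustration.

```python
import posixpath

# Extension -> MIME type, taken from the table on the previous slide.
MIME_TYPES = {
    "txt": "text/plain",
    "htm": "text/html",
    "html": "text/html",
    "ps": "application/postscript",
    "ppt": "application/ms-powerpoint",
    "js": "application/x-javascript",
    "gif": "image/gif",
    "jpg": "image/jpeg",
    "mid": "audio/midi",
    "mpg": "video/mpeg",
    "wrl": "x-world/x-vrml",
}


def content_type(path, table=MIME_TYPES, default="text/plain"):
    """Look up the Content-Type for a file; unknown extensions get the default."""
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return table.get(extension, default)
```

Adding a new type then amounts to adding one entry to the table, which mirrors editing mime.types and restarting the server.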

  39. Brief Survey of Web Servers • http://www.w3c.org/hypertext/WWW/Servers.html • Jigsaw, http://www.w3c.org/Jigsaw/ • http://java.sun.com/products/java-servers/ • http://www.yahoo.com/computers_and_Internet/Internet/World_Wide_Web/HTTP/Servers • http://www.netcraft.co.uk/Survey/ • “Web Server Technologies” by Nancy J. Yeager and Robert E. McGrath, Morgan Kaufmann 1996.

  40. CGI Script Example
     • Client types http://owl.uccs.edu/cgi-bin/chow/uptime.pl
     • or clicks on <A HREF="http://owl.uccs.edu/cgi-bin/chow/uptime.pl">Show the load on owl</A> in a web page.
     • uptime.pl
         #!/usr/bin/perl
         $UPTIME = '/usr/ucb/uptime';
         select(STDOUT); $| = 1;   # make output unbuffered
         # the header must be sent before any output of the uptime command
         print "Content-type: text/html\n\n";
         if (-x $UPTIME) {
             exec($UPTIME);        # replace this process with the uptime command
         } else {
             print "cannot find uptime command on this system.\n";
             exit(1);
         }

  41. CGI Script Example (Step 2) • Web browser sends "GET /cgi-bin/chow/uptime.pl HTTP/1.0" to owl.uccs.edu • The httpd server at owl parses the request and discovers that a Perl script needs to be executed. • It locates the script in the file system. • It creates the execution environment by starting a process: with the appropriate shell environment variables set, with STDIN from the httpd program, and with STDOUT to httpd

  42. CGI Script Example (Step 3)
     • uptime.pl generates
         Content-type: text/plain

         15:55 up 18 days, 7:15, 5 users, load average: 0.89, 0.81, 0.79
     • This is sent over STDOUT back to httpd
     • httpd adds
         HTTP/1.0 200 OK
         Server: Netscape-Communications/1.1
         Date: Tuesday, 27-Jan-98 23:12:45 GMT
     • httpd relays the text string back to the web browser

  43. What problems can occur? • How to detect a script running in an infinite loop? • How to detect a hung script?
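The CGI steps from the previous slides, together with one possible answer to the hung-script question, can be sketched in Python. This is an illustration under assumptions, not how any particular httpd works: the function name `run_cgi`, the chosen environment variables, and the 5-second limit are all made up for the example. A wall-clock timeout cannot tell an infinite loop from a slow script; it only bounds how long the server waits.

```python
import os
import subprocess
import sys


def run_cgi(argv, script_name="/cgi-bin/demo", timeout=5.0):
    """Run a CGI program and wrap its output in an HTTP/1.0 response.

    The child gets CGI-style environment variables, an empty STDIN, and a
    pipe on STDOUT; it is killed if it runs longer than `timeout` seconds.
    """
    env = dict(os.environ,
               REQUEST_METHOD="GET",
               SCRIPT_NAME=script_name,
               SERVER_PROTOCOL="HTTP/1.0",
               GATEWAY_INTERFACE="CGI/1.1")
    error = b"HTTP/1.0 500 Internal Server Error\r\nContent-type: text/plain\r\n\r\n"
    try:
        done = subprocess.run(argv, env=env, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, timeout=timeout)
    except subprocess.TimeoutExpired:
        return error + b"CGI script timed out\n"
    if done.returncode != 0:
        return error + b"CGI script failed\n"
    # The script supplies Content-type and the blank line; httpd adds the status line.
    return b"HTTP/1.0 200 OK\r\n" + done.stdout


if __name__ == "__main__":
    demo = 'print("Content-type: text/plain\\n\\nhello from a CGI sketch")'
    print(run_cgi([sys.executable, "-c", demo]).decode())
```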

  44. Handle Multiple Requests • Sequential processing is not affordable, since some requested documents are big. • Three basic approaches: 1. Fork a new child process: clone a copy of httpd 2. Use multithreading (if the OS or language supports it), e.g., IIS, Java Web Server, Jigsaw 3. Spread the load among several helper programs, e.g., Apache • Apache allows the starting, min, and max number of child web server processes to be specified in a configuration file. It can dynamically adjust to the load.
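The multithreaded approach with a bounded number of workers can be sketched with Python's standard thread pool. The request tuples and the `handle` function are stand-ins invented for this example; `time.sleep` simulates the time spent sending a large document.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    """Simulated request handler: `cost` stands for the time to send a document."""
    path, cost = request
    time.sleep(cost)
    return f"200 {path}"


def serve_all(requests, max_workers=4):
    """Serve requests concurrently with at most `max_workers` threads,
    so one big document does not hold up the small ones behind it."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle, requests))
```

The `max_workers` cap plays a role similar to the maximum number of child processes in the Apache configuration mentioned on the slide.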

  45. More than One Web Service on the Same Server Platform • Run different/same httpd programs on different ports: http://www.server.org/intro.html (port 80 by default), http://www.server.org:8080/intro.html (port 8080), http://www.server.org:8081/intro.html (port 8081) • They may have different document trees, content, and access control, and serve different user groups (customer, sales, authorized) • Note that running a program on any port < 1024 requires root privilege.

  46. Virtual Hosting • Allows one server to serve requests for multiple IP addresses. • It is a low-cost option for clients that want their own identity but cannot afford a separate machine/connection. • Hosts other domain names on the same machine, e.g., http://www.a.com/home.html and http://www.b.com/home.html • Requires an OS with virtual host support. • Assign multiple IP numbers to the same interface using the ifconfig command in UNIX or ipconfig in NT.

  47. Assign Multiple IP Addresses to the Same Interface
     • On FreeBSD, execute
         ifconfig ep0 192.168.123.2
         ifconfig ep0 192.168.123.3 alias netmask 0xffffffff
         ifconfig ep0 192.168.124.1 alias
       (the netmask option is used to suppress an error message)
     • On Linux, execute
         ifconfig eth0:0 192.168.123.3 192.168.124.1
       You may add
         # route add -host 192.168.123.3 dev eth0:0
         # route add -host 192.168.124.1 dev eth0:0
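Once several addresses are bound to one interface, a server can tell the virtual hosts apart by the local address on which a connection arrived. A minimal Python sketch follows; the document-root paths and function names are hypothetical, and only the IP addresses come from the slide.

```python
# Hypothetical mapping from local IP address to document tree.
DOC_ROOTS = {
    "192.168.123.2": "/www/default",
    "192.168.123.3": "/www/a.com",
    "192.168.124.1": "/www/b.com",
}


def doc_root_for(local_ip, table=DOC_ROOTS, fallback="/www/default"):
    """Pick the document tree for the address the request arrived on."""
    return table.get(local_ip, fallback)


def doc_root_for_connection(conn):
    """Same lookup for an accepted socket: getsockname() returns the
    local (address, port) pair of the connection."""
    local_ip = conn.getsockname()[0]
    return doc_root_for(local_ip)
```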

  48. New Hosting Technique • Set up a virtual machine for each customer • Related software packages: User-mode Linux; VMware ESX and VirtualCenter/Infrastructure; MS VS 2005 • Utility Computing (On-Demand Computing)

  49. Improving WWW Delivery Systems • Currently the network is the bottleneck. • The retrieval of web pages can be improved by: • increasing network bandwidth, e.g., an ADSL link • reducing round trips, e.g., using client-side programming to check data with Java/JavaScript • caching (both at the client and at proxy cache servers) • increasing the number and processing power of web servers • load balancing by partitioning client-server requests

  50. Large Web Sites [Diagram. Components: RRDNS, DMZ, Firewall, Router/Firewall (to Internet), Web Server1 through Web Server9, Internal Proxy Server, two HA NFS Servers (Web Pages), Router/Firewall (to Intranet).] • Map requests for one name, e.g., ftp.netscape.com, evenly across a set of servers, e.g., ftp[1-28].netscape.com
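The round-robin mapping described on this slide can be sketched in a few lines of Python. This is only an illustration of the rotation idea, not a DNS server; the function name `round_robin` is made up, and the host names follow the ftp[1-28].netscape.com pattern from the slide.

```python
from itertools import cycle


def round_robin(prefix="ftp", count=28, domain="netscape.com"):
    """Yield ftp1 ... ftp28 in turn, then wrap around, so successive
    lookups of the public name land on successive servers."""
    return cycle(f"{prefix}{i}.{domain}" for i in range(1, count + 1))
```

Each call to `next()` on the returned iterator stands for one lookup of ftp.netscape.com. Plain rotation spreads requests evenly by count but does not take server load into account.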
