
Web Servers


dotsonj

Presentation Transcript


  1. Web Servers

  2. Pre-lecture Survey: What is the #1 web server: • Apache • Google • MS IIS HTTP server • nginx • Sun • Other

  3. http://en.wikipedia.org/wiki/Web_servers Generic Overview

  4. Web Servers • A web server can be a: • Computer Program • Responsible for accepting HTTP requests from clients (web browsers) • Returns HTTP responses with optional data contents • Usually web pages • HTML documents • Linked objects (images, etc.). • Computer • Running a computer program which provides the above functionality

  5. Common Features

  6. Common Features • HTTP • Accepts HTTP requests from a client • Provides HTTP responses to the client • The returned "HTML" document can typically be: • File containing HTML statements • Raw text file • Image • Some other type of document • Defined by MIME types • In case of an error in a client request or while trying to service the request: • Web server sends an error response • May include custom HTML • May have text messages • These better explain the problem to the end user
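
As a rough illustration of how a MIME type can be chosen for a returned document, the sketch below guesses a Content-Type value from a file name using Python's standard `mimetypes` module. The function name `content_type_for` and the fallback type are assumptions made for this example; real servers usually take the mapping from their own configuration.

```python
import mimetypes


def content_type_for(path: str) -> str:
    """Guess a Content-Type header value from a file name (hypothetical helper)."""
    guessed, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    for name in ("index.html", "notes.txt", "logo.png", "data.zzzunknown"):
        print(name, "->", content_type_for(name))
```

The lookup is based only on the file extension, so it says nothing about what the file actually contains.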

  7. Common Features • Logging • Web servers write detailed information to log files • Client requests • Server responses • Allows the Webmaster to collect data • By running log analyzers
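
A very small log analyzer might look like the sketch below. It assumes the log uses the Common Log Format (the slides do not name a format), and the sample lines are made up for illustration. The names `LOG_LINE`, `count_statuses` and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "method path protocol" status size
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def count_statuses(lines):
    """Count how many logged responses carried each HTTP status code."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            counts[match.group("status")] += 1
    return counts


# Made-up sample entries, not taken from any real server.
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /path/file.html HTTP/1.1" 200 2326',
    '192.0.2.2 - - [10/Oct/2000:13:55:40 -0700] "GET /missing.html HTTP/1.1" 404 209',
    '192.0.2.1 - - [10/Oct/2000:13:56:01 -0700] "GET /path/file.html HTTP/1.1" 200 2326',
]

if __name__ == "__main__":
    print(count_statuses(SAMPLE))
```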

  8. Additional Available Features • Authentication • Optional authorization before allowing access to some or all resources • Requires a user name and password • Handles: • Static content • Dynamic content • Supports one or more related interfaces • SSI, CGI, SCGI, FastCGI, JSP, PHP, ASP, ASP.NET, server APIs such as NSAPI, ISAPI, etc.

  9. Additional Available Features • HTTPS support • Via SSL or TLS • Allows secure (encrypted) connections • Uses port 443 by default instead of port 80 • Content compression • E.g. by gzip encoding • Reduces the size of the responses • Lower bandwidth usage, etc.
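
The effect of gzip content compression can be seen with a few lines of Python. This is only a sketch: the repetitive sample body and the helper name `compress_body` are assumptions for the example, and a real server would also check that the client accepts gzip and would send a `Content-Encoding: gzip` header with the response.

```python
import gzip


def compress_body(body: bytes) -> bytes:
    """Return the gzip-compressed form of a response body (hypothetical helper)."""
    return gzip.compress(body)


# A deliberately repetitive body, which compresses well.
body = b"<p>Hello, web server!</p>" * 200
compressed = compress_body(body)

if __name__ == "__main__":
    print(f"original: {len(body)} bytes, compressed: {len(compressed)} bytes")
```

How much is saved depends heavily on the content: text such as HTML usually shrinks a lot, while already-compressed images barely shrink at all.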

  10. Additional Available Features • Virtual hosting • Serve many web sites using one IP address • Large file support • Serve files greater than 2 GB • Overcomes a typical 32-bit OS restriction • Bandwidth throttling • Limit the speed of responses • Do not saturate the network • Able to serve more clients
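
One common way to serve many sites from one IP address is to choose the site by the `Host` header of each request (name-based virtual hosting). The sketch below assumes that approach; the host names, directories and the helper `document_root_for` are hypothetical.

```python
# Hypothetical mapping from host name to that site's document root.
VIRTUAL_HOSTS = {
    "www.example.com": "/var/www/example",
    "images.example.com": "/var/www/images",
}
DEFAULT_ROOT = "/var/www/html"  # assumed fallback for unknown host names


def document_root_for(host_header: str) -> str:
    """Pick a document root from the value of the HTTP Host header."""
    # Drop an optional ":port" suffix and ignore case.
    # (IPv6 literals such as "[::1]:80" are not handled in this sketch.)
    hostname = host_header.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_ROOT)


if __name__ == "__main__":
    for host in ("www.example.com", "IMAGES.example.com:80", "other.example.org"):
        print(host, "->", document_root_for(host))
```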

  11. Where does the requested material come from? Origin of the returned content

  12. Content Origin • Origin of the returned content may be: • Static • Pre-existing data file • Content changes only if manually edited • Contents loaded on request • Dynamic • Content generated by another program • Script (programming language) • Creates/retrieves the requested information • Static content is usually delivered much faster than dynamic content • 2 to 100 times • Especially if the latter involves data pulled from a database

  13. How does it find it? Path translation

  14. Path translation • Web servers map the path component of a Uniform Resource Locator (URL) into: • Local file system resource • Static requests • Internal or external program name • Dynamic requests • For a static request the URL path specified by the client is relative to the Web server's root directory • This is not the same as the computer's root directory

  15. Path translation • Consider the following URL requested by a client Web browser: • http://www.example.com/path/file.html • The client's Web browser splits it into: • http:// • Use the HTTP protocol • www.example.com • The Web server to connect to • DNS translates this name to an IP address • The request is sent to 93.184.216.119:80 • Note port 80 is usually implicit • /path/file.html • The resource to access • The browser generates the following HTTP/1.1 request and sends it to that IP address: • GET /path/file.html HTTP/1.1 • Host: www.example.com
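
The client-side step on this slide can be sketched in Python: split the URL into its parts and assemble the text of the GET request. The function name `build_get_request` and the extra `Connection: close` header are assumptions for this example; a real browser sends many more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request for the given URL (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"  # assumed: ask the server to close after one response
        "\r\n"  # a blank line ends the header section
    )
    return request.encode("ascii")


if __name__ == "__main__":
    print(build_get_request("http://www.example.com/path/file.html").decode("ascii"))
```

Note that each header line ends with a carriage return and line feed, so the request line and the `Host` header are on separate lines.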

  16. Path translation (cont.) • Web server host (www.example.com) • Sees the request is for port 80 • Sends request to the Web server software • Appends the given path/file to the path of the server's Web root directory • Linux Apache typical roots: • /var/www/htdocs • /var/www • /var/www/html • Result would then be the local file system resource: • /var/www/htdocs/path/file.html • /var/www/path/file.html • /var/www/html/path/file.html • Web server: • Retrieves the file, if it exists • Processes it by the Web server's rules • Sends a response to the client's web browser • Response: • Describes the content of the returned data/file • Contains the requested data or an error message
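
The server-side mapping from URL path to local file can be sketched as follows. This is a simplified illustration, not how any particular server implements it: the function name `translate_path` and the default root are assumptions, and the normalization step is added here so that `..` segments cannot climb above the Web root.

```python
import posixpath
from urllib.parse import unquote


def translate_path(url_path: str, web_root: str = "/var/www/html") -> str:
    """Map the path from a request line onto a file under the Web root."""
    # Drop any query string and decode %-escapes such as %20.
    path = unquote(url_path.split("?", 1)[0])
    # Normalize relative to "/" so that ".." cannot climb above the Web root.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    return posixpath.join(web_root, normalized.lstrip("/"))


if __name__ == "__main__":
    print(translate_path("/path/file.html"))
    print(translate_path("/path/file.html", "/var/www/htdocs"))
    print(translate_path("/../../etc/passwd"))
```

With the default root, `/path/file.html` maps to `/var/www/html/path/file.html`, and a request for `/../../etc/passwd` stays inside the root as `/var/www/html/etc/passwd`.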

  17. Performance

  18. Performance • Web servers must: • Serve requests quickly! • From more than one TCP/IP connection at a time • The key performance parameters are: • Number of requests per second • Depends on the type of request, etc. • Latency (response time) in milliseconds • For each new connection or request • Throughput in bytes per second • Depends on • File size • Content cached or not • Available network bandwidth • etc. • Concurrency level • How the server responds to multiple simultaneous client requests
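
To make the first three parameters concrete, the sketch below derives them from a list of measured requests. The input shape (pairs of latency in seconds and response size in bytes), the function name `summarize` and the sample numbers are all assumptions for this example.

```python
from statistics import mean


def summarize(samples, wall_clock_seconds):
    """Compute basic performance figures from (latency_seconds, bytes_sent) pairs."""
    latencies_ms = [latency * 1000 for latency, _size in samples]
    total_bytes = sum(size for _latency, size in samples)
    return {
        "requests_per_second": len(samples) / wall_clock_seconds,
        "mean_latency_ms": mean(latencies_ms),
        "throughput_bytes_per_second": total_bytes / wall_clock_seconds,
    }


if __name__ == "__main__":
    # Made-up measurements: two requests completed during a 2-second test run.
    print(summarize([(0.010, 1000), (0.030, 3000)], wall_clock_seconds=2.0))
```

A mean hides outliers, so benchmark tools often report latency percentiles as well.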

  19. Performance • Measured under: • Varying load of clients • Varying requests per client • Performance parameters may vary noticeably depending on the number of active connections • The specific server model used to implement a web server program can limit the performance and scalability that can be reached under heavy load or when using high-end hardware • Many CPUs, disks, etc.

  20. Load limits

  21. Load limits • Web servers have load limits • Can be set in a configuration file • Can handle only a limited number of concurrent client connections per IP address (and TCP port) • Usually between 2 and 60,000 • Default between 500 and 1,000 • Can serve only a certain maximum number of requests per second depending on: • Settings • HTTP request type • Content origin • Static • Dynamic • Served content cached or not • Hardware and software limits of the native OS • A web server near or over its limits • Becomes overloaded • Becomes unresponsive
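
A limit on concurrent connections can be modeled with a bounded semaphore, as in the sketch below. The class name `ConnectionLimiter` and its methods are hypothetical; real servers enforce such limits through their own configuration settings and operating system resources.

```python
import threading


class ConnectionLimiter:
    """Allow at most max_connections clients to be served at the same time."""

    def __init__(self, max_connections: int):
        self._slots = threading.BoundedSemaphore(max_connections)

    def try_accept(self) -> bool:
        # Non-blocking: returns False immediately when every slot is in use,
        # which is the point where a server might refuse or queue the client.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        # Call once a connection that was accepted has been closed.
        self._slots.release()


if __name__ == "__main__":
    limiter = ConnectionLimiter(max_connections=2)
    print([limiter.try_accept() for _ in range(3)])  # third attempt is rejected
```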

  22. Overload causes

  23. Overload causes • A sample daily graph of a web server's load, indicating a spike in the load early in the day.

  24. Overload causes • Web servers may be overloaded because of: • Too much legitimate web traffic • Thousands or even millions of clients hitting the web site in a short interval of time • DDoS • Distributed Denial of Service attacks • Coordinated • Computer worms • Abnormal traffic because of millions of infected computers • Not coordinated • XSS viruses • Millions of infected browsers and/or web servers • Internet web robots • Traffic not filtered / limited on large web sites with very few resources (bandwidth, etc.) • Internet (network) slowdowns • Client requests are served more slowly and the number of connections increases so much that server limits are reached • Partial unavailability of Web servers (computers) • Required / urgent maintenance or upgrade • HW or SW failures • Back-end (e.g. DB) failures, etc. • Remaining web servers get too much traffic and become overloaded

  25. Overload symptoms

  26. Overload symptoms • Symptoms of an overloaded web server include: • Requests are served with (possibly long) delays • from 1 second to a few hundred seconds • 500, 502, 503, 504 HTTP errors returned to clients • Sometimes also unrelated 404 error or even 408 error may be returned • TCP connections are refused or reset (interrupted) before any content is sent to clients • In very rare cases, only partial contents are sent • This behavior may well be considered a bug • Even if it stems from unavailable system resources

  27. Anti-overload techniques

  28. Anti-overload techniques • To partially overcome load limits and to prevent overload, use techniques like: • Managing network traffic by using: • Firewalls • Block unwanted traffic • Bad IP sources • Bad patterns • HTTP traffic managers • Drop, redirect or rewrite requests having bad HTTP patterns • Bandwidth management and traffic shaping • Smooth down peaks in network usage • Deploying web cache techniques • Use different domains to serve different content (static and dynamic) by separate Web servers, e.g.: • http://images.example.com • Serves static images • http://www.example.com • Serves dynamic data requests
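
Blocking "bad IP sources" comes down to checking each client address against a list of unwanted networks. The sketch below shows that check with Python's standard `ipaddress` module. The blocked ranges are reserved documentation addresses chosen for illustration, and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names; a real firewall does this at the network level, not in application code.

```python
import ipaddress

# Hypothetical block list, using address ranges reserved for documentation.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True when the client address falls inside a blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.9", "198.51.100.7", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```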

  29. Anti-overload techniques • Techniques continued: • Use different domain names and/or computers to separate big files from small/medium files • Be able to fully cache small and medium sized files • Efficiently serve big or huge (over 10 - 1000 MB) files by using different settings • Using many Web servers (programs) per computer • Each bound to its own network card and IP address • Use many Web servers that are grouped together • Act or are seen as one big Web server • See Load balancer
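
One simple way for a group of Web servers to act as one is to hand each new request to the next server in turn (round-robin). The slides do not say which balancing policy is meant, so round-robin is an assumption here, and the class name and backend names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1", "web2", "web3"])
    print([balancer.next_backend() for _ in range(5)])
```

Round-robin ignores how busy each server is; other policies pick the server with the fewest active connections instead.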

  30. Anti-overload techniques • Techniques continued: • Add more hardware resources • RAM, disks, NICs, etc. • Tune OS parameters for: • Hardware capabilities • Usage • Use more efficient computer programs for web servers, etc. • nginx • Use workarounds • Especially if dynamic content is involved

  31. Historical notes

  32. Historical notes • World's first web server • 1989 - Tim Berners-Lee proposed to CERN a new project • Ease the exchange of information between scientists • Using a hypertext system • 1990 - Berners-Lee wrote two programs: • Browser • WorldWideWeb • Web server • Ran on NeXTSTEP

  33. Historical notes • First web server in USA • Installed December 12, 1991 • Bebo White at SLAC • After returning from a sabbatical at CERN • Between 1991 and 1994: • Simplicity and effectiveness of early technologies used to surf and exchange data through the World Wide Web helped to: • Port them to many different operating systems • Spread their use among lots of different social groups of people • First in scientific organizations • Then in universities • Finally in industry

  34. Historical notes • 1994: Tim Berners-Lee founded the World Wide Web Consortium (W3C) • To regulate the further development of the many technologies through a standardization process: • HTTP • HTML • etc. • The following years saw exponential growth in the number of web sites and servers

  35. Resume 2/27

  36. Software

  37. Software • There are thousands of different web server programs available • Many specialized for very specific purposes • About 50 mainstream • The fact that a web server is not very popular does not necessarily mean it has: • Lots of bugs • Poor performance • See Category:Web server software for a longer list of HTTP server programs.

  38. Statistics

  39. Statistics • Most popular web servers, used for public web sites, are tracked by • Netcraft.com • Details given by • Netcraft Web Server Reports • According to this site: • Apache has been the most popular web server on the Internet since April 1996 • July 2010 Netcraft Web Server Survey: • 54.90% of web sites on the Internet use Apache • 25.87% of web sites use IIS

  40. Web Servers

  41. Post-survey: What is the #1 web server: • Apache • Google • MS IIS HTTP server • nginx • Sun

  42. Summary

  43. Summary • Concentrated on HTTP servers • Apache and IIS are the main web serving tools • nginx is rising fast • Apache/Microsoft battling • Apache currently declining • IIS currently up • Usage tracked • Netcraft Web Server Survey
