Chapter 9 Application Layer, HTTP
Professor Rick Han
University of Colorado at Boulder
rhan@cs.colorado.edu
Announcements
• Read Sections 9.1 - 9.2, Skip 9.3
• HW #4 due April 16
• Programming Assignment #3 soon…
• Midterm: hand back April 4
• Next, Application Layer
Prof. Rick Han, University of Colorado at Boulder
Recap of Previous Lecture
• Domain Name System (DNS)
  • Translates/resolves a name to an IP address
    • www.cs.colorado.edu => 128.9.17.42
  • Hierarchical name space
  • Hierarchical name servers
    • Root name servers – about a dozen
    • Then .edu, .com, .gov, .mil, .org, .net, …
    • Local name server
    • Authoritative name server – gives back the final IP address
  • Recursive vs. iterative queries
  • Caching
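The recap can be made concrete with a toy model. The sketch below is not real DNS: the server names (`root`, `edu-tld`, `colorado-auth`) and the `TOY_SERVERS` table are hypothetical, and only the name/address pair comes from the slide. It shows a local name server issuing iterative queries down the hierarchy and caching the final answer.

```python
# Toy name-server hierarchy. All server names and tables are hypothetical;
# only the example name and address come from the slide.
TOY_SERVERS = {
    "root": {"edu": ("referral", "edu-tld")},
    "edu-tld": {"colorado.edu": ("referral", "colorado-auth")},
    "colorado-auth": {"www.cs.colorado.edu": ("address", "128.9.17.42")},
}


def query(server, name):
    """Ask one toy server about a name; return its longest-suffix answer."""
    table = TOY_SERVERS[server]
    matches = [s for s in table if name == s or name.endswith("." + s)]
    if not matches:
        raise KeyError(f"{server} knows nothing about {name}")
    return table[max(matches, key=len)]


class LocalNameServer:
    """Resolves names iteratively on behalf of clients and caches results."""

    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, name):
        if name in self.cache:          # cache hit: no queries leave the site
            return self.cache[name]
        server = "root"
        while True:                     # iterative: follow referrals ourselves
            self.queries_sent += 1
            kind, value = query(server, name)
            if kind == "address":       # authoritative answer
                self.cache[name] = value
                return value
            server = value              # referral to the next server down
```

In this model the first lookup costs three queries (root, .edu, authoritative server); a repeated lookup is answered from the cache and sends none.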
HyperText Transfer Protocol (HTTP)
• Basis for the Web
• Application-layer protocol built on top of TCP
• Request-response protocol
  • Request: e.g. “GET URL HTTP_version_#”
  • Response from server
• Requests and responses are encoded in text
• Stateless: after a request and its response, no further state is maintained
• Cookies maintain session state outside of HTTP
HTTP Request
• Request headers
  • Authorization – authentication info
  • Accept, Accept-Encoding – acceptable document types/encodings
  • From – user email
  • If-Modified-Since – return page only if modified after the given date
  • Referer – what caused this page to be requested (HTTP spells the header name “Referer”)
  • User-Agent – client software
• Blank line
• Body
HTTP Request Example: GET
    GET / HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
    Host: www.seshan.org
    Connection: Keep-Alive
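Because requests are plain text, a message like the one above can be assembled with ordinary string operations. The following is a minimal sketch, not a full client: `build_get_request` is a hypothetical helper name, and the default header set is an assumption chosen to mirror the example.

```python
def build_get_request(host, path="/", extra_headers=None):
    """Return the text of an HTTP/1.1 GET request for the given host and path."""
    headers = {"Host": host, "Accept": "*/*", "Connection": "Keep-Alive"}
    headers.update(extra_headers or {})
    lines = [f"GET {path} HTTP/1.1"]                      # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end in CRLF; a blank line ends the headers. A GET carries no body.
    return "\r\n".join(lines) + "\r\n\r\n"
```

For example, `build_get_request("www.seshan.org", extra_headers={"Accept-Language": "en-us"})` yields a request line followed by four header lines and the terminating blank line.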
HTTP Response
• Headers
  • Location – for redirection
  • Server – server software
  • WWW-Authenticate – request for authentication
  • Allow – list of methods supported (GET, HEAD, etc.)
  • Content-Encoding – e.g. x-gzip
  • Content-Length – number of bytes in content
  • Content-Type – MIME type
  • Expires – when contents become stale
  • Last-Modified – time contents were last modified by the server
• Blank line
• Body
HTTP Response Example
    HTTP/1.1 200 OK
    Date: Tue, 27 Mar 2001 03:49:38 GMT
    Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
    Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT
    ETag: "7a11f-10ed-3a75ae4a"
    Accept-Ranges: bytes
    Content-Length: 4333
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html    (MIME type)
    …..
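The response layout (status line, headers, blank line, body) can be taken apart just as simply. This is a sketch under simplifying assumptions: the whole response is already in one string, header names are unique, and `parse_response` is a hypothetical helper rather than a library function.

```python
def parse_response(raw):
    """Split a text HTTP response into (version, code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")        # blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")         # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body
```

Header names are lower-cased because HTTP treats them as case-insensitive; splitting on the first colon keeps values such as dates, which contain colons themselves, intact.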
HTTP 0.9/1.0
• One HTTP 1.0 request/response per TCP connection
  • Simple to implement
• Disadvantages
  • Multiple connection setups: a three-way handshake each time
  • Several extra round trips added to each transfer
  • Netscape browser opens up to 4 parallel HTTP 1.0 connections
  • Multiple slow starts
HTTP 1.0 Interaction With TCP
(Timeline diagram between client and server, reconstructed as a list)
• 0 RTT: client opens TCP connection (client SYN, server SYN)
• 1 RTT: client sends ACK and the HTTP request for the HTML (DAT); server sends ACK, reads from disk, then sends DAT and FIN
• 2 RTT: client sends ACK and parses the HTML; client opens a new TCP connection (FIN, ACK, SYN from client; SYN from server)
• 3 RTT: client sends ACK and the HTTP request for the image (DAT); server reads from disk and sends ACK
• 4 RTT: DAT from server; image begins to arrive
Courtesy: Srini Seshan
More HTTP 1.0 & TCP Interaction Problems
• Lots of extra connections
  • Increases server state/processing
• Server is also forced to keep TIME_WAIT connection state for dead TCP connections
  • The number of TIME_WAIT connections tends to be an order of magnitude greater than the number of active connections. Why?
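One way to reason about the "why?" is Little's law: the average number of connections in a state equals the arrival rate times the time spent in that state. The numbers below are assumptions for illustration only (100 connections per second, 2 s of active lifetime, 60 s in TIME_WAIT); they are not measurements from the lecture, and they presume the server is the side that closes first, as in the HTTP 1.0 timeline.

```python
def expected_connections(arrival_rate_per_s, seconds_in_state):
    """Little's law: average number in a state = arrival rate x time in state."""
    return arrival_rate_per_s * seconds_in_state


# Assumed workload, for illustration only.
RATE = 100          # new HTTP 1.0 connections per second
ACTIVE_S = 2        # seconds a connection spends transferring data
TIME_WAIT_S = 60    # seconds a closed connection lingers in TIME_WAIT

active = expected_connections(RATE, ACTIVE_S)
lingering = expected_connections(RATE, TIME_WAIT_S)
ratio = lingering / active
```

Under these assumptions the server holds 200 active connections but 6000 in TIME_WAIT, a ratio of 30: short transfers combined with a long TIME_WAIT interval make the dead connections dominate.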
HTTP 1.1 Persistent Connection Solution
• Multiplex multiple requests onto one open TCP connection (and multiple responses in the reverse direction)
• Serialize transfers: client makes the next request only after the previous response
• Reduce slow start latency
• Reduce amount of TCP state at both endpoints
• Reduce overhead
• HTTP 1.1 adds complexity because multiple requests (and responses) have to be multiplexed and demultiplexed
HTTP 1.1 Persistent Connection Example
(Timeline diagram between client and server, reconstructed as a list; the TCP connection is already open)
• 0 RTT: client sends the HTTP request for the HTML (DAT); server sends ACK, reads from disk, then sends DAT
• 1 RTT: client sends ACK and parses the HTML; client sends the HTTP request for the image (DAT); server reads from disk, sends ACK, then DAT
• 2 RTT: image begins to arrive
Courtesy: Srini Seshan
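The HTTP 1.0 and persistent-connection timelines can be compared with simple round-trip arithmetic. This sketch assumes objects are fetched one after another, and it ignores transmission time, server disk time, slow start, and parallel connections; the function name and its parameters are hypothetical.

```python
def rtts_until_last_object_arrives(num_objects, persistent,
                                   connection_already_open=False):
    """Round trips until the last of num_objects begins to arrive."""
    if persistent:
        # One handshake at most, then one request/response RTT per object.
        handshake = 0 if connection_already_open else 1
        return handshake + num_objects
    # HTTP 1.0: every object pays a handshake RTT plus a request/response RTT.
    return 2 * num_objects
```

For the HTML page plus one image, this gives 4 RTTs with HTTP 1.0 and 2 RTTs on an already open persistent connection, matching the two timelines. The gap grows linearly with the number of embedded objects.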
Web Caching Proxies
(Figure: HTTP client/browser <-> HTTP caching proxy <-> HTTP Web server)
• Place a Web caching proxy in the network between Web client and Web server
• Reduces client response time
  • On a cache hit, the HTTP GET only goes as far as the intermediate cache, rather than all the way to the server
• Reduces network bandwidth usage
  • On a cache hit, the HTTP GET doesn’t travel over the wide area from caching proxy to server
• Reduces server load
  • On a cache hit, the HTTP GET never reaches the server
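The core behavior of a caching proxy fits in a few lines. This is a minimal in-memory sketch, assuming every response is cacheable and never expires; `CachingProxy` and the `fetch_from_origin` callback are hypothetical names standing in for the real network path to the Web server.

```python
class CachingProxy:
    """Answers GETs from a local cache, contacting the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> body
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.cache:
            self.hits += 1              # hit: the GET never reaches the server
            return self.cache[url]
        self.misses += 1                # miss: forward the GET to the server
        body = self.fetch_from_origin(url)
        self.cache[url] = body
        return body
```

Requesting the same URL twice through one proxy instance calls `fetch_from_origin` only once, which is where the savings in response time, bandwidth, and server load come from.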
Web Proxies
• Used for caching
  • Improved response time, etc. from previous slide
  • Provides a centralized coordination point to share cached information across all of a company’s client hosts
• Also used for security
  • Proxy for a company can be the only host that can access the Internet
  • Administrators make sure that it is secure
• Used for protocol translation
  • Translate HTTP 1.0 to/from HTTP 1.1. Enables old HTTP 1.0 clients to connect to HTTP 1.1 servers and benefit from HTTP 1.1 performance boosts
Designing Caching Proxies
• How much can/should be cached?
  • How large a cache is necessary?
  • On disk vs. in memory: typically on disk
• What are the cache hit rates?
  • If user behavior is uncorrelated, the proxy has to cache a lot of data to improve response time, and the cache hit rate stays small
  • If user behavior is correlated, i.e. everyone visits only a few Web sites, the proxy can cache less data and still improve response time (high cache hit rate)
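The effect of correlated behavior on hit rate can be explored with a small simulation. Everything here is an assumption for illustration: an LRU replacement policy, 1000 distinct URLs, a cache of 50 entries, and a 1/rank popularity skew standing in for "everyone visits only a few Web sites".

```python
import random
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Fraction of requests served from an LRU cache of the given capacity."""
    cache = OrderedDict()
    hits = 0
    for url in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)          # mark as most recently used
        else:
            cache[url] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used
    return hits / len(requests)


def simulate(num_urls=1000, capacity=50, num_requests=20000, seed=0):
    """Return (uncorrelated hit rate, correlated hit rate) for one cache size."""
    rng = random.Random(seed)
    urls = list(range(num_urls))
    uniform = rng.choices(urls, k=num_requests)              # uncorrelated users
    weights = [1 / (rank + 1) for rank in range(num_urls)]   # few popular sites
    skewed = rng.choices(urls, weights=weights, k=num_requests)
    return lru_hit_rate(uniform, capacity), lru_hit_rate(skewed, capacity)
```

With uniform requests the hit rate stays close to capacity / num_urls (about 5% here), while the skewed workload achieves a much higher hit rate with the same small cache.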
Designing Caching Proxies (2)
• What can be cached?
  • Cache first-time unknown documents/objects
• Non-cacheable documents
  • CGI scripts
  • Personalized documents (cookies, etc.)
  • Encrypted data (SSL)
• Document should no longer be cached if updated/expired before reuse
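The rules on this slide can be written as a simple predicate. This is a heuristic sketch, not how any particular proxy decides: the `/cgi-bin/` and query-string tests are assumed stand-ins for "CGI script", the cookie headers stand in for "personalized", and `is_cacheable` is a hypothetical name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import urlsplit


def is_cacheable(url, request_headers, response_headers, now=None):
    """Apply the slide's rules to decide whether a response may be cached."""
    now = now or datetime.now(timezone.utc)
    parts = urlsplit(url)
    if parts.scheme == "https":                      # encrypted data (SSL)
        return False
    if "/cgi-bin/" in parts.path or parts.query:     # assumed CGI heuristic
        return False
    if "Cookie" in request_headers or "Set-Cookie" in response_headers:
        return False                                 # personalized document
    expires = response_headers.get("Expires")
    if expires is not None:
        try:
            expiry = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False                             # unparsable date: be safe
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry <= now:                            # already stale
            return False
    return True
```

The `now` parameter exists only to make the expiry check reproducible; a real proxy would also consult headers the slide does not mention.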
Designing Caching Proxies (3)
• Performance:
  • How many TCP connections can the proxy handle?
  • How to efficiently index into the database/cache?
    • Early caches used the file system to find a file
    • Metadata is now kept in memory on most caches
• Prefetching – combine with caching to reduce response time
  • Proxy parses a Web page and prefetches its hyperlinked objects before the client asks for them
  • Example: when a proxy fetches a Web page on behalf of a client, the proxy parses and caches the Web page returned by the server, and then prefetches all links before the client requests them
  • Not widely used due to poor hit rates?
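The parsing step of prefetching can be sketched with Python's standard HTML parser. The choice of which tags to follow (`a` and `img`) is an assumption, and `LinkExtractor` and `links_to_prefetch` are hypothetical names; actually fetching and caching the returned URLs is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs of hyperlinks and images found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Attribute that holds the URL for the tags we follow (None otherwise).
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(urljoin(self.base_url, value))


def links_to_prefetch(base_url, html):
    """Return the URLs a prefetching proxy might request ahead of the client."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

Relative links are resolved against the page's own URL, so the proxy can request them directly. Since a page may link to many objects the client never visits, the list would normally be filtered or capped, which relates to the hit-rate concern in the last bullet.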