CSE 524: Lecture 4 Application layer protocols
Administrative • Reading assignment Chapter 2 • Mid-term exam may be delayed to 11/2/2004 • Mostly on lecture material, but may have some chapter material not covered in class
Where we’re at… • Internet architecture and history • Internet protocols in practice • Application layer • Overview and functions • Network programming interface • Specific application protocols • Transport layer • Network layer • Data-link layer • Physical layer
AL: Specific applications/protocols • HTTP • DNS • FTP • SMTP
AL: WWW/HTTP basics • Web page: consists of “objects”, each addressed by a URL • Most Web pages consist of a base HTML page and several referenced objects • URL has two components, host name and path name: http://www.someSchool.edu/someDept/pic.gif (host name www.someSchool.edu, path name /someDept/pic.gif) • Client is called a browser: MS Internet Explorer, Netscape Communicator • Server is called a Web server: Apache (open source), MS Internet Information Server
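As a small illustration of the two URL components, the sketch below splits the slide's example URL with Python's standard urllib.parse module. The helper name host_and_path is made up for this example, and treating an empty path as "/" is an assumption of the sketch.

```python
from urllib.parse import urlsplit

def host_and_path(url: str) -> tuple[str, str]:
    """Return the (host name, path name) components of an http URL."""
    parts = urlsplit(url)
    # An empty path is treated as "/" here (an assumption of this sketch).
    return parts.netloc, parts.path or "/"

host, path = host_and_path("http://www.someSchool.edu/someDept/pic.gif")
print(host)  # www.someSchool.edu
print(path)  # /someDept/pic.gif
```

The host name tells the client which server to connect to; the path name is what goes into the request line.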
AL: HTTP basics • http: hypertext transfer protocol • Web’s application layer protocol • client/server model • client: browser that requests, receives, “displays” Web objects • server: Web server sends objects in response to requests • HTTP/1.0: RFC 1945 • http://www.rfc-editor.org/rfc/rfc1945.txt • HTTP/1.1: RFC 2068 (obsoleted by RFC 2616) • http://www.rfc-editor.org/rfc/rfc2068.txt • HTTP state management (cookies): RFC 2109 • http://www.rfc-editor.org/rfc/rfc2109.txt • Figure: a PC running Explorer and a Mac running Navigator each send http requests to, and receive http responses from, a server running the NCSA Web server
AL: http • http uses TCP transport service: • client initiates bi-directional TCP connection (via socket) to server, port 80 • server accepts TCP connection • http protocol messages exchanged between client and server • Requests/responses encoded in text • Client sends request to server, followed by response from server to client • Server closes connection • http is “stateless”: server maintains no information about past client requests • Aside: protocols that maintain “state” are complex! • past history (state) must be maintained • if server/client crashes, their views of “state” may be inconsistent and must be reconciled
AL: http example • Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 images) • 1a. http client initiates TCP connection to http server (process) at www.someSchool.edu. Port 80 is default for http server • 1b. http server at host www.someSchool.edu, waiting for TCP connection at port 80, “accepts” connection, notifying client • 2. http client sends http request message (containing URL) into TCP connection socket • 3. http server receives request message, forms response message containing requested object (someDepartment/home.index), sends message into socket
AL: http example (cont.) • 4. http server closes TCP connection • 5. http client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects • 6. Steps 1-5 repeated for each of 10 jpeg objects
AL: http message format: request • two types of http messages: request, response • http request message: ASCII (human-readable format) • Example: GET /somedir/page.html HTTP/1.0 User-agent: Mozilla/4.0 Accept: text/html, image/gif, image/jpeg Accept-language: fr (extra carriage return, line feed) • First line is the request line (GET, POST, HEAD commands) • Remaining lines are header lines • A carriage return, line feed on a line by itself indicates the end of the header lines, and of the message when there is no body
AL: http request • Request line • Method • HTTP 1.0 • GET – return object specified by URI • HEAD – return headers only of GET response • POST – send data to the server (forms, etc.) • HTTP 1.1 • PUT – upload file to URI specified • DELETE – remove file specified by URI • OPTIONS, TRACE, CONNECT • Host: header required • Connection: header for persistence • URI • E.g. http://www.cse.ogi.edu/index.html with a proxy • E.g. /index.html if no proxy • HTTP version
AL: http request • Header lines (HTTP request headers) • Authorization • Authentication info • Accept • Acceptable document types, encodings, languages, character sets • From • User email (when privacy is disabled) • If-Modified-Since • For use with caching • Referer • URL which caused this page to be requested • User-Agent • Client software • Host • For multiple web sites hosted on same server • Connection • Keep connection alive for subsequent request or close connection
AL: http request • Blank-line • Separate request headers from POST information • End of request • Body • If POST, send POST information
AL: http 1.1 request example GET / HTTP/1.1 Accept: */* Accept-Language: en-us Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Host: www.cse.ogi.edu Connection: Keep-Alive
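The request structure described above (request line, header lines, blank line, optional body) can be sketched as a small Python helper. The function name build_request and its parameters are hypothetical, chosen only for this illustration; the Host check mirrors the "Host: header required" note for HTTP 1.1.

```python
def build_request(method, uri, version="HTTP/1.0", headers=None, body=b""):
    """Assemble an HTTP request: request line, header lines, blank line, body."""
    headers = dict(headers or {})
    # HTTP/1.1 requires a Host: header; header names are case-insensitive.
    if version == "HTTP/1.1" and not any(k.lower() == "host" for k in headers):
        raise ValueError("HTTP/1.1 requests must carry a Host: header")
    lines = [f"{method} {uri} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CR LF; an empty line closes the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body

request = build_request(
    "GET", "/", "HTTP/1.1",
    {"Host": "www.cse.ogi.edu", "Connection": "Keep-Alive"},
)
```

For a POST, the form data would be passed as body and described with Content-Type and Content-Length headers.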
AL: http message format: response • Example: HTTP/1.0 200 OK Date: Thu, 06 Aug 1998 12:00:15 GMT Server: Apache/1.3.0 (Unix) Last-Modified: Mon, 22 Jun 1998 …... Content-Length: 6821 Content-Type: text/html (blank line) data data data data data ... • First line is the status line (protocol, status code, status phrase) • Following lines are header lines • After the blank line comes the data, e.g., requested html file
AL: http response • Status-line • HTTP version • 3 digit response code • 1XX – informational • 2XX – success • 3XX – redirection • 4XX – client error • 5XX – server error • Reason phrase
AL: http response codes • In first line in server->client response message. A few sample codes: • 200 OK: request succeeded, requested object later in this message • 301 Moved Permanently: requested object moved, new location specified later in this message (Location:) • 400 Bad Request: request message not understood by server • 404 Not Found: requested document not found on this server • 505 HTTP Version Not Supported
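A minimal sketch of how a client might take the status line apart and map the three-digit code to its class (1XX to 5XX). The names parse_status_line and status_class are illustrative only.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split 'HTTP/1.0 200 OK' into (version, code, reason phrase)."""
    # maxsplit=2 keeps reason phrases that contain spaces in one piece.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

def status_class(code):
    """Map a response code to its class using the first digit."""
    return STATUS_CLASSES[code // 100]

version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
print(code, status_class(code))  # 404 client error
```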
AL: http response • Header lines • Location • redirection • Server • server software • WWW-Authenticate • request for authentication • Allow • list of methods supported (GET, HEAD, etc) • Content-Encoding • x-gzip • Content-Length • Content-Type • Expires • Last-Modified • ETag
AL: http response • Blank-line • Separate headers from data • Body • Data being returned to client
AL: http response example HTTP/1.1 200 OK Date: Tue, 27 Mar 2001 03:49:38 GMT Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24 Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT ETag: "7a11f-10ed-3a75ae4a" Accept-Ranges: bytes Content-Length: 4333 Keep-Alive: timeout=15, max=100 Connection: Keep-Alive Content-Type: text/html …..
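The response layout (status line, header lines, blank line, body) can be pulled apart with a few lines of Python. This is a sketch with a made-up function name, parse_response; it assumes the whole response is already in memory and ignores details such as repeated headers.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (status line, headers dict, body bytes)."""
    # The first blank line (CR LF CR LF) separates headers from data.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Length: 5\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"hello")
status_line, headers, body = parse_response(raw)
```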
AL: Handling user input (forms) • GET method: input is uploaded in the URL field of the request line: GET /search?name=george&animal=monkey HTTP/1.1 Host: www.somesite.com • POST method: input is uploaded to server in the entity body: POST /search HTTP/1.1 Host: www.somesite.com Content-type: application/x-www-form-urlencoded (blank line) name=george&animal=monkey
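To make the GET/POST difference concrete, the sketch below encodes the same form fields both ways using urllib.parse.urlencode. The helper names form_as_get and form_as_post are hypothetical, and the Content-Length header is an addition of this sketch (a server generally needs it to know where the body ends).

```python
from urllib.parse import urlencode

def form_as_get(path, fields, host):
    """Form input carried in the URL field of the request line."""
    query = urlencode(fields)
    return f"GET {path}?{query} HTTP/1.1\r\nHost: {host}\r\n\r\n"

def form_as_post(path, fields, host):
    """Form input carried in the entity body."""
    body = urlencode(fields)  # percent-encoded, so the result is pure ASCII
    return (f"POST {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
            f"{body}")

fields = {"name": "george", "animal": "monkey"}
print(form_as_get("/search", fields, "www.somesite.com"))
print(form_as_post("/search", fields, "www.somesite.com"))
```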
AL: More HTTP examples • http://www.cse.ogi.edu/class/cse524/http.txt • http://www.cse.ogi.edu/class/cse524/http_post.txt
AL: Trying out HTTP (client side) for yourself • 1. Telnet to your favorite Web server: telnet www.eurecom.fr 80 (opens TCP connection to port 80, the default HTTP server port, at www.eurecom.fr; anything typed in is sent to port 80 at www.eurecom.fr) • 2. Type in a GET HTTP request: GET /~ross/index.html HTTP/1.0 (by typing this in and hitting carriage return twice, you send this minimal but complete GET request to the HTTP server) • 3. Look at response message sent by HTTP server • 4. Example using If-Modified-Since:
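The same experiment can be run from a program instead of telnet. The sketch below uses Python's standard socket module; the function names exchange and fetch are made up for this example, and it assumes an HTTP/1.0-style server that closes the connection after sending its response.

```python
import socket

def exchange(sock, request: bytes) -> bytes:
    """Send a request on a connected socket and read until the peer closes."""
    sock.sendall(request)
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty read: the server closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)

def fetch(host, path, port=80):
    """Open a TCP connection, send a minimal GET, return the raw response."""
    # Request line followed by an empty line (the two carriage returns).
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=10) as sock:
        return exchange(sock, request)
```

Calling fetch("www.eurecom.fr", "/~ross/index.html") would reproduce the telnet session above, provided that host still answers on port 80; whether that page still exists is not something this sketch can promise.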
AL: Non-persistent and persistent connections • Non-persistent (HTTP 0.9/1.0) • One request/response per connection; simple to implement • Server parses request, responds, and closes TCP connection • Each object transfer goes through slow start and incurs a connection setup three-way handshake each time • Several extra round trips added to transfer (if done serially) • But most 1.0 browsers use parallel TCP connections • Persistent (default for HTTP/1.1) • Several requests/responses per connection • On same TCP connection: server parses request, responds, parses new request, ... • Client sends requests for all referenced objects as soon as it receives base HTML
AL: Non-persistent connections • Short transfers are hard on TCP • Stuck in slow start • Loss recovery is poor when windows are small • Lots of extra connections • Increases server state/processing • Server also forced to keep TIME_WAIT connection state • Number of TIME_WAIT connections tends to be an order of magnitude greater than the number of active connections • More on TIME_WAIT later
AL: Single non-persistent example (client/server timeline) • 0 RTT: client opens TCP connection (SYN); server answers (SYN) • 1 RTT: client sends ACK and HTTP request for HTML (DAT); server ACKs, reads from disk, sends data (DAT) and FIN • 2 RTT: client ACKs and parses HTML; connection is closed (FIN, ACK) and client opens a new TCP connection (SYN, SYN) • 3 RTT: client sends ACK and HTTP request for image (DAT); server ACKs and reads from disk • 4 RTT: image begins to arrive (DAT)
AL: Parallel non-persistent connections (Netscape) • Improve non-persistent latency by using multiple concurrent connections • Different parts of Web page arrive independently on separate connections (object demux via connections) • Can grab more of the network bandwidth than other users • Doesn’t necessarily improve response time • TCP loss recovery ends up being timeout dominated because windows are small
AL: Persistent Connection Solution • Multiplex multiple transfers onto one TCP connection • Serialize transfers: client makes next request only after previous response • Benefits greatest for small objects • Up to 2x improvement in response time • Server resource utilization reduced due to fewer connection establishments and fewer active connections • TCP behavior improved • Longer connections help adaptation to available bandwidth • Larger congestion window improves loss recovery • HTTP/1.1 vs. HTTP/1.0 example • Multiple requests to www.cse.ogi.edu • Problem: serial delivery of objects (head-of-line object blocking)
AL: Persistent Connection Example (client/server timeline, connection already open) • 0 RTT: client sends HTTP request for HTML (DAT); server ACKs, reads from disk, sends data (DAT) • 1 RTT: client ACKs, parses HTML, sends HTTP request for image (DAT); server ACKs, reads from disk, sends data (DAT) • 2 RTT: image begins to arrive
AL: Persistent Connection Solution • Pipelining requests • Getall – request HTML document and all embeds • Requires server to parse HTML files • Embeds returned serially • Doesn’t consider client cached documents • Getlist – request a set of documents • Implemented as a simple set of GETs • Problems with pipelined serialized requests • Stall in one object prevents delivery of others • Much of the useful information in first few bytes (layout info) • Multiple connections allow incremental rendering of images as they come in • Need application-level demux to emulate multiple connections • HTTP-NG, HTTP/2.0, HTTP range requests • Application specific solution to transport protocol problems • SCTP
AL: Persistent vs. non-persistent summary • Nonpersistent HTTP issues: • requires 2 RTTs per object • OS must work and allocate host resources for each TCP connection • separate connections solve the demux issue for multiple objects • Persistent HTTP: • server leaves connection open after sending response • subsequent HTTP messages between same client/server are sent over connection • Persistent without pipelining: • client issues new request only when previous response has been received • one RTT for each referenced object • Persistent with pipelining: • default in HTTP/1.1 • client sends requests as soon as it encounters a referenced object • as little as one RTT for all the referenced objects • objects returned one at a time (HOL blocking vs. parallel non-persistent connections)
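The RTT figures in this summary can be turned into a rough back-of-the-envelope calculation. The sketch below counts round trips only; it assumes one RTT for TCP setup plus one RTT per request/response, and it ignores transmission time, slow start, and server processing. The function names are illustrative.

```python
def rtts_nonpersistent_serial(n_objects):
    """Base page plus n objects, each on its own connection: 2 RTTs apiece."""
    return 2 * (1 + n_objects)

def rtts_persistent(n_objects):
    """One connection setup, then one RTT for the base page and each object."""
    return 1 + 1 + n_objects

def rtts_persistent_pipelined(n_objects):
    """Setup, base page, then as little as one RTT for all referenced objects."""
    return 1 + 1 + (1 if n_objects else 0)

# Base HTML page referencing 10 images, as in the earlier example.
print(rtts_nonpersistent_serial(10))   # 22
print(rtts_persistent(10))             # 12
print(rtts_persistent_pipelined(10))   # 3
```

Parallel non-persistent connections fall between these extremes, since several of the 2-RTT fetches overlap in time.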
AL: Some HTTP headers by function • Authentication • Client • Authorization, Proxy-Authorization • Server • WWW-authenticate, Proxy-Authenticate • User, server tracking • Client • Cookie, Referer, From, User-agent • Server • Set-cookie, Server
AL: Some HTTP headers by function • Caching • General • Cache-control, Pragma • Client • If-Modified-Since, If-Unmodified-Since, If-Match • Server • Last-Modified, Expires, ETag, Age
AL: Authentication • Authentication goal: control access to server documents • stateless: client must present authorization in each request • authorization: typically name, password • Authorization: header line in request • if no authorization presented, server refuses access, sends WWW-Authenticate: header line in response • Browser caches name & password so that user does not have to repeatedly enter them • http://www.sandbox.com/clipboard/pub-doc/home.jsp • Exchange between client and server, in time order: • client sends usual http request msg • server replies 401: authorization req., with WWW-Authenticate: • client sends usual http request msg + Authorization:cred • server sends usual http response msg • client sends usual http request msg + Authorization:cred • server sends usual http response msg
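A sketch of the exchange above, assuming the Basic authentication scheme (the slide does not name a scheme; Basic is the common name/password one, where the credential is the base64 encoding of name:password). The function names, the realm string, and the shape of the return values are all hypothetical.

```python
import base64

def basic_credentials(name, password):
    """Value of the Authorization: header under the Basic scheme."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def respond(request_headers, allowed):
    """Server side: challenge with 401 unless an accepted credential is sent.

    `allowed` is a set of acceptable Authorization header values.
    """
    if request_headers.get("Authorization") in allowed:
        return "200 OK", {}
    # No (or wrong) credential: refuse and ask the client to authenticate.
    return "401 Unauthorized", {"WWW-Authenticate": 'Basic realm="example"'}

cred = basic_credentials("Aladdin", "open sesame")
print(respond({}, {cred}))                       # 401 plus challenge header
print(respond({"Authorization": cred}, {cred}))  # 200
```

Because the server is stateless, the client repeats the Authorization header on every request. Note that base64 is an encoding, not encryption, so Basic credentials are readable by anyone who sees the request.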
AL: Authentication example • http://www.cse.ogi.edu/class/cse524/http_ba.txt
AL: Cookies (keeping “state”) • Many major Web sites use cookies • Four components: • 1) cookie header line in the HTTP response message (Set-cookie:) • 2) cookie header line in the HTTP request message (Cookie:) • 3) cookie file kept on user’s host and managed by user’s browser • 4) back-end database at Web site • Example: • Susan always accesses the Internet from the same PC • She visits a specific e-commerce site for the first time • When the initial HTTP request arrives at the site, the site creates a unique ID and creates an entry in its backend database for that ID
AL: Cookies: keeping “state” (cont.) • Client cookie file initially holds ebay: 8734 • Client sends usual http request msg; server creates ID 1678 for user and an entry in backend database • Server sends usual http response + Set-cookie: 1678; client cookie file now holds amazon: 1678, ebay: 8734 • Client sends usual http request msg with cookie: 1678; server takes cookie-specific action (database access) and sends usual http response msg • One week later: client sends usual http request msg with cookie: 1678; server again takes cookie-specific action (database access) and sends usual http response msg
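The server side of this exchange can be sketched with an in-memory dictionary standing in for the backend database. The class name CookieSite, the starting ID 1678 (borrowed from the example), and the choice to record visited paths are assumptions of this illustration, not how any particular site works.

```python
import itertools

class CookieSite:
    """Toy server: hands out IDs via Set-cookie and keeps per-ID state."""

    def __init__(self, first_id=1678):
        self._ids = itertools.count(first_id)
        self.database = {}  # cookie ID -> list of paths requested so far

    def handle(self, path, cookie=None):
        """Return (response headers, history recorded for this user)."""
        headers = {}
        if cookie is None or cookie not in self.database:
            # First visit: create a unique ID and a database entry for it.
            cookie = str(next(self._ids))
            self.database[cookie] = []
            headers["Set-cookie"] = cookie
        # Cookie-specific action: here, just remember what was requested.
        self.database[cookie].append(path)
        return headers, list(self.database[cookie])

site = CookieSite()
headers, history = site.handle("/")                 # first visit
headers2, history2 = site.handle("/cart", "1678")   # later visit with cookie
```

HTTP itself stays stateless; the state lives in the client's cookie file and the site's database, tied together by the ID.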
AL: Cookies and usage • What cookies can bring: • authorization • shopping carts • site preferences • site personalization • user session state (Web e-mail) • Aside: cookies and privacy: • cookies permit sites to learn a lot about you • you may supply name and e-mail to sites • search engines use redirection & cookies to learn yet more • advertising companies obtain info across sites
AL: Caching • Do not send content if it has not changed • Can be done directly between client and server (browser cache) • Can be done along path between client and server (web/proxy caches) • Why Web caching? • Reduce response time for client request • Reduce network traffic • Reduce load on servers
AL: Client caching • “Conditional GET” • client: specify date of cached copy in http request: If-modified-since: <date> • server: response contains no object if cached copy is up-to-date: HTTP/1.0 304 Not Modified • Exchange when object not modified: client sends http request msg with If-modified-since: <date>; server sends http response HTTP/1.0 304 Not Modified • Exchange when object modified: client sends http request msg with If-modified-since: <date>; server sends http response HTTP/1.1 200 OK … <data>
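The server's decision in a conditional GET can be sketched as a date comparison. This example uses email.utils.parsedate_to_datetime from the standard library to read HTTP dates; the function name conditional_get and its return value (just the status line) are simplifications made for illustration.

```python
from email.utils import parsedate_to_datetime

def conditional_get(if_modified_since, last_modified):
    """Pick the status line for a GET, given the object's Last-Modified date.

    if_modified_since is the request header value, or None if absent.
    """
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(last_modified)
        if current <= cached:
            # Cached copy is up to date: send headers only, no object.
            return "HTTP/1.0 304 Not Modified"
    return "HTTP/1.0 200 OK"

stamp = "Mon, 29 Jan 2001 17:54:18 GMT"
print(conditional_get(stamp, stamp))  # HTTP/1.0 304 Not Modified
print(conditional_get(None, stamp))   # HTTP/1.0 200 OK
```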
AL: HTTP caching • Additional caching methods • ETag and If-None-Match • HTTP 1.1 has file signature as well • When/how often should the original be checked for changes? • Check every time? • Check each session? Day? Etc? • Use Expires header • If no Expires, often use Last-Modified as estimate
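One way a cache might decide how long a copy stays fresh is sketched below. Using Expires when present follows the slide; the fallback of 10% of the object's age at fetch time is a commonly used heuristic and is an assumption of this sketch, as are the function name and the plain-dict header representation.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers, fraction=0.1):
    """Estimate how long a cached response may be reused without rechecking."""
    date = parsedate_to_datetime(headers["Date"])
    if "Expires" in headers:
        # Explicit expiry: fresh until Expires (never a negative lifetime).
        return max(parsedate_to_datetime(headers["Expires"]) - date, timedelta(0))
    if "Last-Modified" in headers:
        # Heuristic: an object unchanged for a long time is likely to stay so.
        age_at_fetch = date - parsedate_to_datetime(headers["Last-Modified"])
        return age_at_fetch * fraction
    return timedelta(0)  # nothing to go on: recheck every time

lifetime = freshness_lifetime({
    "Date": "Thu, 11 Jan 2001 00:00:00 GMT",
    "Last-Modified": "Mon, 01 Jan 2001 00:00:00 GMT",
})
```

With the headers shown, the object was 10 days old when fetched, so the heuristic gives roughly one day before the cache checks the original again.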
AL: Example Cache Check Request GET / HTTP/1.1 Accept: */* Accept-Language: en-us Accept-Encoding: gzip, deflate If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT If-None-Match: "7a11f-10ed-3a75ae4a" User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Host: www.cse.ogi.edu Connection: Keep-Alive
AL: Example Cache Check Response HTTP/1.1 304 Not Modified Date: Tue, 27 Mar 2001 03:50:51 GMT Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24 Connection: Keep-Alive Keep-Alive: timeout=15, max=100 ETag: "7a11f-10ed-3a75ae4a"
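The ETag check in this request/response pair can be sketched as a simple comparison of the If-None-Match value against the object's current tag. The function name validate and the dict-based headers are hypothetical, and the sketch compares tags as plain strings without distinguishing weak from strong validators.

```python
def validate(request_headers, current_etag):
    """Return (status line, response headers) for an ETag cache check."""
    raw = request_headers.get("If-None-Match", "")
    # The header may list several tags separated by commas, or "*".
    candidates = [tag.strip() for tag in raw.split(",")]
    if current_etag in candidates or "*" in candidates:
        # The client's copy matches: no body needs to be sent.
        return "HTTP/1.1 304 Not Modified", {"ETag": current_etag}
    return "HTTP/1.1 200 OK", {"ETag": current_etag}

etag = '"7a11f-10ed-3a75ae4a"'
status, headers = validate({"If-None-Match": etag}, etag)
print(status)  # HTTP/1.1 304 Not Modified
```

Unlike a date comparison, the tag changes whenever the content does, even for two edits within the same second.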