HTTP Hypertext Transfer Protocol. RFC 1945 (HTTP 1.0) RFC 2616 (HTTP 1.1). World Wide Web. Web consists of a large set of documents, called Web pages, that are accessible over the Internet. Each Web page is classified as a hypermedia document.
HTTP: Hypertext Transfer Protocol. RFC 1945 (HTTP 1.0), RFC 2616 (HTTP 1.1)
World Wide Web The Web consists of a large set of documents, called Web pages, that are accessible over the Internet. Each Web page is classified as a hypermedia document. The suffix media indicates that a document can contain items other than text (e.g., graphic images); the prefix hyper is used because a document can contain selectable links that refer to other, related documents. Two main building blocks are used to implement the Web on top of the global Internet: the Web browser and the Web server. Pages that contain a mixture of text and other items are represented using HyperText Markup Language (HTML). An HTML document consists of a file that contains text along with embedded commands, called tags, that give guidelines for display.
Uniform Resource Locators (URL) Each Web page is assigned a unique name that is used to identify it. The name, which is called a Uniform Resource Locator (URL), begins with a specification of the scheme used to access the item. http://hostname[:port]/path[;parameters][?query]
URLs http://discovery.bits-pilani.ac.in/index.html http://bitsaa.bitspilani.ac.in/bitsaa.bits?l=campusnews/campusnews.bits http://www.bitspilani.ac.in:12349/Default.aspx Relative URLs: /arcd/arc_nucleus.htm
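The URL template can be checked against the example URLs with a short Python sketch. The helper name `describe_url` is an assumption made for illustration; `urlsplit` comes from the standard library and leaves any `;parameters` inside the path. The fallback to port 80 assumes the well-known HTTP port.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split an http URL into the parts named in the URL template."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        # urlsplit reports None when the URL has no explicit port;
        # fall back to 80, the well-known HTTP port.
        "port": parts.port if parts.port is not None else 80,
        "path": parts.path,
        "query": parts.query,
    }

if __name__ == "__main__":
    print(describe_url("http://www.bitspilani.ac.in:12349/Default.aspx"))
    print(describe_url("http://bitsaa.bitspilani.ac.in/bitsaa.bits?l=campusnews/campusnews.bits"))
```

A relative URL such as /arcd/arc_nucleus.htm has no scheme or hostname of its own; the browser resolves it against the URL of the page that contains it.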
HTTP HTTP is the protocol that supports communication between web browsers and web servers. HTTP is an application-level protocol with the lightness and speed necessary for distributed, hypermedia information systems. The RFC states that HTTP communication generally takes place over a TCP connection, but the protocol itself is not dependent on a specific transport layer.
HTTP characteristics Application Level. Request/Response. Stateless: each HTTP request is self-contained; the server does not keep a history of previous requests or previous sessions. Bi-Directional Transfer. Capability Negotiation. Support For Caching: to improve response time, a browser caches a copy of each Web page it retrieves. If a user requests a page again, HTTP allows the browser to interrogate the server to determine whether the contents of the page have changed since the copy was cached. Support For Intermediaries: HTTP allows a machine along the path between a browser and a server to act as a proxy server that caches Web pages.
Request - Response HTTP has a simple structure: the client sends a request, and the server returns a reply. HTTP can support multiple request-reply exchanges over a single TCP connection.
Well Known Address The “well known” TCP port for HTTP servers is port 80. Other ports can be used as well...
HTTP Versions The original version now goes by the name “HTTP Version 0.9”. HTTP 0.9 was used for many years. Starting with HTTP 1.0, the version number is part of every request. It tells the server which version the client speaks (which options are supported, etc.).
HTTP 1.0+ Request Lines of text (ASCII). Lines end with CRLF “\r\n”. The first line is called the “Request-Line”.
Request Line Method URI HTTP-Version\r\n The request line contains 3 tokens (words). Space characters “ “ separate the tokens. Newline (\n) seems to work by itself (but the protocol requires CRLF).
Request Method The Request Method can be: GET, HEAD, PUT, POST, DELETE, TRACE, OPTIONS. Future expansion is supported.
Methods GET: retrieve information identified by the URI. HEAD: retrieve meta-information about the URI. POST: send information to a URI and retrieve result.
Methods (cont.) PUT: Store information in location named by URI. DELETE: remove entity identified by URI.
More Methods TRACE: used to trace HTTP forwarding through proxies, tunnels, etc. OPTIONS: used to determine the capabilities of the server, or characteristics of a named resource.
HTTP Version Number “HTTP/1.0” or “HTTP/1.1”. HTTP 0.9 did not include a version number in a request line. If a server gets a request line with no HTTP version number, it assumes 0.9.
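A minimal Python sketch of how a server might split a Request-Line into its three tokens and apply the 0.9 fallback. The function name `parse_request_line` is hypothetical, and real servers validate far more than this.

```python
def parse_request_line(line):
    """Return (method, uri, version) from a raw Request-Line string."""
    # Accept a bare "\n" as well as the CRLF the protocol requires.
    tokens = line.rstrip("\r\n").split(" ")
    if len(tokens) == 3:
        method, uri, version = tokens
    elif len(tokens) == 2:
        # No version token: treat the request as HTTP 0.9.
        method, uri = tokens
        version = "HTTP/0.9"
    else:
        raise ValueError(f"malformed request line: {line!r}")
    return method, uri, version

if __name__ == "__main__":
    print(parse_request_line("GET /about.html HTTP/1.1\r\n"))
    print(parse_request_line("GET /index.html\n"))
```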
The Header Lines After the Request-Line come a number (possibly zero) of HTTP header lines. Each header line contains an attribute name followed by a “:” followed by a space and the attribute value. The Name and Value are just text.
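The name/value structure of header lines maps naturally onto a dictionary. The sketch below is illustrative only: `parse_headers` is a hypothetical helper, it lowercases names because HTTP header names are case-insensitive, and it ignores repeated headers and folded lines.

```python
def parse_headers(lines):
    """Turn 'Name: value' header lines into a dict keyed by lowercase name."""
    headers = {}
    for line in lines:
        # Split at the first ":" only, so values may contain colons.
        name, sep, value = line.rstrip("\r\n").partition(":")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        headers[name.strip().lower()] = value.strip()
    return headers

if __name__ == "__main__":
    print(parse_headers(["Host: www.bits-pilani.ac.in\r\n", "Accept-Encoding: gzip\r\n"]))
```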
Headers Request headers provide information to the server about the client: what kind of client it is, what kind of content will be accepted, and who is making the request. There can be 0 headers (HTTP 1.0). HTTP 1.1 requires a Host: header.
Example HTTP Headers
GET /about.html HTTP/1.1
Host: www.bits-pilani.ac.in   (required in HTTP 1.1)
Connection: Keep-Alive
User-Agent: Mozilla/4.06 [en] (X11; U; Linux 2.1.121 i686)
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,utf-8
<blank line>
End of the Headers Each header ends with a CRLF ( \r\n ). The end of the header section is marked with a blank line (just CRLF). For GET and HEAD requests, the end of the headers is the end of the request!
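The CRLF and blank-line rules can be seen in the raw bytes of a request. The following sketch assembles a GET request by hand; `build_get_request` and its parameters are names assumed for illustration.

```python
def build_get_request(path, host, extra_headers=None):
    """Assemble the bytes of an HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; one extra CRLF (an empty line)
    # marks the end of the headers, which for GET ends the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

if __name__ == "__main__":
    print(build_get_request("/about.html", "www.bits-pilani.ac.in",
                            {"Accept-Encoding": "gzip"}))
```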
POST A POST request includes some content (some data) after the headers (after the blank line). There is no format for the data (just raw bytes). A POST request must include a Content-Length line in the headers: Content-length: 267
Example POST Request
POST /about.html HTTP/1.1
Host: www.bits-pilani.ac.in   (required in HTTP 1.1)
Connection: Keep-Alive
User-Agent: Mozilla/4.06 [en] (X11; U; Linux 2.1.121 i686)
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,utf-8
Content-Length: 40
<blank line>
idno=2007A1PS001&item=test1&name=Krishna
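Because Content-Length must match the body byte for byte, it is safer to compute it than to type it by hand. A minimal sketch, assuming a hypothetical helper `build_post_request` and form-style fields; it omits the other headers of the example for brevity.

```python
from urllib.parse import urlencode

def build_post_request(path, host, fields):
    """Assemble a POST request whose Content-Length is computed from the body."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line separates the headers from the content
    ).encode("ascii")
    return head + body

if __name__ == "__main__":
    fields = {"idno": "2007A1PS001", "item": "test1", "name": "Krishna"}
    print(build_post_request("/about.html", "www.bits-pilani.ac.in", fields).decode("ascii"))
```

For the three fields used here, the encoded body is 40 bytes long, so the request carries Content-Length: 40.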
Typical Method Usage GET: used to retrieve an HTML document. HEAD: used to find out if a document has changed. POST: used to submit a form.
HTTP Response An ASCII Status-Line, then a headers section, then a blank line, then the content. Content can be anything (not just text); typically it is an HTML document or some kind of image. Layout: Status-Line, Headers ..., blank line, Content ...
Response Status Line HTTP-Version Status-Code Message. The Status-Code is a 3-digit number (for computers). The Message is text (for humans).
Status Codes 1xx Informational 2xx Success 3xx Redirection 4xx Client Error 5xx Server Error
Example Status Lines HTTP/1.0 200 OK HTTP/1.0 301 Moved Permanently HTTP/1.0 400 Bad Request HTTP/1.0 500 Internal Server Error
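A short sketch of how a client could split a status line and map the first digit of the code to its class. `parse_status_line` and `STATUS_CLASSES` are illustrative names, not part of any library.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def parse_status_line(line):
    """Return (version, code, message, status_class) for a status line."""
    # Split on the first two spaces only: the message may contain spaces.
    version, code, message = line.rstrip("\r\n").split(" ", 2)
    code = int(code)
    return version, code, message, STATUS_CLASSES.get(code // 100, "Unknown")

if __name__ == "__main__":
    for raw in ("HTTP/1.0 200 OK\r\n", "HTTP/1.0 500 Internal Server Error\r\n"):
        print(parse_status_line(raw))
```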
Response Headers Provide the client with information about the returned entity (document): what kind of document it is, how big the document is, how the document is encoded, and when the document was last modified. Response headers end with a blank line.
Response Header Examples
Date: Sat, 30 Jan 2010 12:48:17 IST
Server: Apache/1.17
Content-Type: text/html
Content-Length: 1756   (length of the content that arrives after the headers)
Content-Encoding: gzip
Content Content can be anything (sequence of raw bytes). Content-Length header is required for any response that includes content. Content-Type header also required.
Single Request/Reply The client sends a complete request. The server sends back the entire reply. The server closes its socket. If the client needs another document, it must open a new connection. This was the default for HTTP 1.0.
Persistent Connections HTTP 1.1 supports persistent connections (this is the default). Multiple requests can be handled over a single TCP connection. The Connection: header is used to exchange information about persistence (HTTP/1.1). HTTP 1.0 clients asked for persistence with Connection: Keep-Alive.
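The persistence defaults can be summarized as a small decision function. This is a sketch under stated assumptions: `is_persistent` is a hypothetical helper, `headers` is a dict keyed by lowercase header names, and HTTP 1.0 persistence is treated as opt-in via the `keep-alive` token in the Connection header.

```python
def is_persistent(version, headers):
    """Decide whether the connection stays open after this exchange."""
    tokens = {
        token.strip().lower()
        for token in headers.get("connection", "").split(",")
        if token.strip()
    }
    if version == "HTTP/1.1":
        # Persistent by default; "Connection: close" opts out.
        return "close" not in tokens
    # HTTP 1.0 and earlier: one request per connection unless
    # the client explicitly asks to keep the connection alive.
    return "keep-alive" in tokens

if __name__ == "__main__":
    print(is_persistent("HTTP/1.1", {}))
    print(is_persistent("HTTP/1.0", {"connection": "Keep-Alive"}))
```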
Persistent Connections And Lengths In HTTP 1.0, a client opens a TCP connection and sends a GET request. The server transmits a copy of the requested item, and then closes the TCP connection. Until it encounters an end of file condition, the client reads data from the TCP connection. Finally, the client closes its end of the connection.
Persistent Connections And Lengths The chief advantage of persistent connections lies in reduced overhead. A browser using a persistent connection can further optimize by pipelining requests (i.e., sending requests back-to-back without waiting for a response). The chief disadvantage of using a persistent connection lies in the need to identify the beginning and end of each item sent over the connection. There are two possible techniques that handle the situation: either send a length followed by the item, or send a sentinel value after the item to mark the end. To avoid ambiguity between sentinel values and data, HTTP uses the approach of sending a length followed by an item of that size.
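Length-based framing is what lets a reader separate back-to-back responses on one connection. The sketch below simulates the connection with an in-memory byte stream; `read_response` is a hypothetical helper, and treating a missing Content-Length as zero is a simplifying assumption.

```python
import io

def read_response(stream):
    """Read one response from a binary stream; return None at end of stream."""
    status_line = stream.readline()
    if not status_line:
        return None
    headers = {}
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break  # blank line: end of the headers
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()
    # The length tells the reader exactly where this item ends,
    # so the next response can follow immediately.
    length = int(headers.get("content-length", "0"))
    body = stream.read(length)
    return status_line.decode("ascii").strip(), headers, body

# Two responses sent back-to-back, as on a persistent connection.
stream = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"
)
first = read_response(stream)
second = read_response(stream)

if __name__ == "__main__":
    print(first[2], second[2])
```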
Data Length And Program Output It may not be convenient or even possible for a server to know the length of an item before sending. Servers use the Common Gateway Interface (CGI) mechanism to create dynamic documents. To provide for dynamic Web pages, the HTTP standard specifies that if the server does not know the length of an item a priori, the server can inform the browser that it will close the connection after transmitting the item
Data Length And Program Output
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Fri, 08 Oct 2010 05:08:14 GMT
Connection: close
Content-Type: text/html
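On the receiving side, the two cases (known length versus close-delimited) lead to two ways of reading the body. A minimal sketch, assuming a hypothetical `read_body` helper, a file-like binary stream, and headers keyed by lowercase name:

```python
import io

def read_body(stream, headers):
    """Read a response body using Content-Length if present, else until EOF."""
    if "content-length" in headers:
        return stream.read(int(headers["content-length"]))
    # No length was sent: the server marks the end of the item by
    # closing the connection, which the reader sees as end of file.
    return stream.read()

if __name__ == "__main__":
    dynamic = io.BytesIO(b"<html>generated output</html>")
    print(read_body(dynamic, {"connection": "close"}))
```

With a real socket, the same calls work on the file object returned by `socket.makefile("rb")`.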
Conditional Requests HTTP allows a sender to make a request conditional. For example: If-Modified-Since: Sat, 01 Jan 2000 05:00:01 GMT
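The server-side check behind If-Modified-Since amounts to comparing two dates. In the sketch below, `needs_full_response` is a hypothetical helper; the remark that a server answers an unchanged page with status 304 (Not Modified) comes from the HTTP specification rather than from these slides.

```python
from email.utils import parsedate_to_datetime

def needs_full_response(if_modified_since, last_modified):
    """True if the document changed after the date the client supplied.

    Both arguments are HTTP date strings such as
    'Sat, 01 Jan 2000 05:00:01 GMT'.
    """
    return parsedate_to_datetime(last_modified) > parsedate_to_datetime(if_modified_since)

if __name__ == "__main__":
    cached_at = "Sat, 01 Jan 2000 05:00:01 GMT"
    # Changed later: send the document again.
    print(needs_full_response(cached_at, "Fri, 08 Oct 2010 05:08:14 GMT"))
    # Unchanged: a server would reply 304 Not Modified with no content.
    print(needs_full_response(cached_at, cached_at))
```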
HTTP Proxy Server [Diagram: Browser, Proxy, HTTP Server; the proxy sits on the path between the browser and the server.]
Proxy Server Security by filtering. Performance by caching.