The abs_path in a URI • If the abs_path is not present in the URL, it must be given as "/" in a Request-URI for a resource. • Thus, if a user points a browser at http://www.ucc.ie, the browser acts, when writing the URI in a HTTP request, as if the user had entered http://www.ucc.ie/ where the extra / is the default abs_path, the path for the default resource in the HTTP document-root directory at the server. • Note that the HTTP document-root directory is not the same as the root directory of the operating system on the server machine. • The default resource in a HTTP directory (whether it is the HTTP document-root or not) is usually a file called index.html or welcome.html, depending on the configuration of the server software.
Example 5: two pipelined requests which actually refer to the same resource
    interzone.ucc.ie> telnet student.cs.ucc.ie 80
    Trying 143.239.211.125...
    Connected to student.cs.ucc.ie.
    Escape character is '^]'.
    HEAD http://student.cs.ucc.ie/cs4400/jabowen/ HTTP/1.1
    Host: student.cs.ucc.ie

    HEAD http://student.cs.ucc.ie/cs4400/jabowen/welcome.htm HTTP/1.1
    Host: student.cs.ucc.ie
    Connection: close

Example 5: responses show same resource
    HTTP/1.1 200 OK
    Date: Wed, 31 Jan 2001 20:54:06 GMT
    Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
    Last-Modified: Thu, 25 Jan 2001 13:26:32 GMT
    ETag: "2160-2e25-3a702988"
    Accept-Ranges: bytes
    Content-Length: 11813
    Content-Type: text/html

    HTTP/1.1 200 OK
    Date: Wed, 31 Jan 2001 20:54:06 GMT
    Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
    Last-Modified: Thu, 25 Jan 2001 13:26:32 GMT    (Same time/date as above)
    ETag: "2160-2e25-3a702988"    (Is given same ETag since it is the same resource)
    Accept-Ranges: bytes
    Content-Length: 11813    (Same file size as above -- it is the same file.)
    Connection: close
    Content-Type: text/html

    Connection closed by foreign host.
URI Comparison • A comparison of two URIs should be case-sensitive, with these exceptions: • A port that is empty or not given is equivalent to the default port for that URI-reference; • Comparisons of host names must be case-insensitive; • Comparisons of scheme names must be case-insensitive; • An empty abs_path is equivalent to an abs_path of "/". • Characters other than those in the "reserved" and "unsafe" sets are equivalent to their escaped encoding (%HexHex encoding). • The hex digits in an escaped encoding are case-insensitive (%7E is the same as %7e).
URI comparison (contd.) • For example, the following URIs are equivalent:
    http://abc.com:80/~smith/home.html
    http://ABC.com/%7Esmith/home.html
    http://ABC.com/%7esmith/home.html
    HTTP://ABC.com/%7esmith/home.html
because • scheme names and host names are case-insensitive, • 80 is the default port, • the escaped (URL) encoding for ~ is %7E, and • escaped encodings are case-insensitive.
URI comparison (contd.) • For example, the following URIs are equivalent: http://abc.com http://abc.com/ because • an empty abs_path is equivalent to an abs_path of "/".
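The comparison rules above can be sketched in Python. This is a minimal illustration, not a complete implementation: the names normalize and equivalent are hypothetical, only the http default port is listed, and the set of characters whose escapes are treated as equivalent is assumed to be the "unreserved" set of RFC 2396.

```python
import re
from urllib.parse import urlsplit

# Assumption: only the http scheme is handled here.
DEFAULT_PORTS = {"http": 80}

# Assumption: characters outside the "reserved" and "unsafe" sets are taken
# to be the RFC 2396 "unreserved" characters (alphanumerics plus marks).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.!~*'()"
)


def _normalize_escapes(path):
    """Decode escapes of unreserved characters; upper-case all other escapes."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, path)


def normalize(uri):
    """Reduce a URI to a tuple that can be compared case-sensitively."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()                  # scheme: case-insensitive
    host = (parts.hostname or "").lower()          # host: case-insensitive
    port = parts.port or DEFAULT_PORTS.get(scheme) # missing port = default port
    path = _normalize_escapes(parts.path) or "/"   # empty abs_path = "/"
    return (scheme, host, port, path, parts.query)


def equivalent(a, b):
    return normalize(a) == normalize(b)


print(equivalent("http://abc.com:80/~smith/home.html",
                 "HTTP://ABC.com/%7esmith/home.html"))
print(equivalent("http://abc.com", "http://abc.com/"))
```

Everything that is not covered by one of the exceptions (for example, the letters in the path) is still compared case-sensitively.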
HTTP Message Types • HTTP messages consist of requests from client to server and responses from server to client. • Both types of message consist of • a start-line (a request-line or a status-line) • zero or more header-fields (also known as "headers"), • an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, • and (possibly) a message-body. • We will consider first the features that are common to requests and responses: • Header-fields • Message-Bodies • Later, we will consider features specific to requests (request-lines) and responses (status-lines)
HTTP Header-fields • HTTP header-fields include • general-header-fields, • request-header-fields (or response-header-fields), and • entity-header-fields. • Each header-field consists of a name followed by a colon and the field value. • There are many different types of header-fields, which we will consider later.
Header-fields (contd.) • Field names are case-insensitive. • The field value may be preceded by any amount of white-space. • Header fields can extend over multiple lines if each extra line starts with at least one SP or HT. • The order in which header fields with differing field names are received is not significant. • However, it is "good practice" to send general-header fields first, followed by request-header (or response-header) fields, and ending with the entity-header fields.
Header-fields (contd.) • Multiple message-header fields with the same field-name may be present in a message if and only if • the entire field-value for that header field is defined as a comma-separated list • Appending the multiple header field-values into a comma-separated list must not alter the meaning of the message • Therefore, the order in which header fields with the same field-name are received is significant. • Thus a proxy must NOT change the order of these field values when a message is forwarded.
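The header-field rules on the last two slides can be sketched as a small parser. This is an illustrative sketch only; the function name parse_header_fields is hypothetical, and the input is assumed to be the header block of a message with lines separated by CRLF.

```python
def parse_header_fields(raw):
    """Return a dict mapping lower-cased field names to combined field values."""
    # Unfold continuation lines: a line that starts with SP or HT
    # extends the header field on the previous line.
    lines = []
    for line in raw.split("\r\n"):
        if not line:
            break  # the empty line marks the end of the header fields
        if line[0] in " \t" and lines:
            lines[-1] += " " + line.strip()
        else:
            lines.append(line)

    fields = {}
    for line in lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()  # field names are case-insensitive
        value = value.strip()       # white-space before the value is ignored
        if key in fields:
            # Same field-name seen again: append to a comma-separated list,
            # keeping the order in which the values were received.
            fields[key] += ", " + value
        else:
            fields[key] = value
    return fields


raw = ("Accept: text/html\r\n"
       "Host:   student.cs.ucc.ie\r\n"
       "ACCEPT: text/plain;\r\n"
       " q=0.5\r\n"
       "\r\n")
print(parse_header_fields(raw))
```

Because the values are appended in arrival order, the combined list preserves the ordering that a proxy is required to keep.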
Message-bodies • The message-body (if any) of a HTTP message is used to carry the entity-body associated with the request or response. • The message-body differs from the entity-body only when a transfer-coding has been applied, as indicated by a header-field called the Transfer-Encoding header field
Transfer Encoding • The Transfer-Encoding header is used to indicate any transfer-codings applied by an application to ensure safe and proper transfer of the message. • Transfer-Encoding is a property of the message, not of the entity, and thus may be added or removed by any application along the request/response chain.
One type of Transfer-Encoding: chunked • The chunked encoding method modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator • This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message: • such content cannot be preceded by a Content-Length header if the program producing the content dynamically is not able to predict how long its output will be • A chunk-size indicator is a line containing a string of hex digits, giving the number of octets in the chunk. • The chunked encoding is ended by any chunk whose size is zero
Example: request for dynamic output
The resource in the request below is a CGI program
    interzone.ucc.ie> telnet student.cs.ucc.ie 80
    Trying 143.239.211.125...
    Connected to student.cs.ucc.ie.
    Escape character is '^]'.
    GET http://student.cs.ucc.ie/cs1064/jabowen/cgi-bin/short.cgi HTTP/1.1
    Host: student.cs.ucc.ie

Example (contd.): chunked response
    HTTP/1.1 200 OK
    Date: Wed, 31 Jan 2001 17:52:27 GMT
    Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
    Transfer-Encoding: chunked
    Content-Type: text/html

    7c
    <HTML> <HEAD> <TITLE> Short response </TITLE> </HEAD> <BODY> This is a short response produced by short.cgi </BODY> </HTML>
    0

    Connection closed by foreign host.
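A recipient can reassemble a chunked body by reading each hex size line and then that many octets, as in the response above, where 7c means a chunk of 124 octets. The sketch below is a simplified illustration: the name decode_chunked is hypothetical, the whole body is assumed to be in memory already, and any trailer after the zero-size chunk is ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble the entity-body from a chunked message-body."""
    decoded = b""
    pos = 0
    while True:
        # The chunk-size line: hex digits, optionally followed by ";extension".
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";")[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # a chunk of size zero ends the chunked encoding
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return decoded


chunked = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
print(decode_chunked(chunked))
```

A sender that cannot predict the length of its output would do the reverse: write the size of each piece in hex, then the piece itself, and finish with a zero-size chunk.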
Presence of message-body • Not every message (request or response) can have a message-body • The rules for when a message-body is allowed in a message differ for requests and responses.
Message-bodies in requests • Presence of a message-body in a request is signaled by inclusion of a Content-Length or Transfer-Encoding header field in the request's message-headers. • A message-body must NOT be included in a request if the specification of the request method (see later) does not allow sending an entity-body in requests.
Message-bodies in responses • For response messages, whether or not a message-body is included with a message is dependent on both • the method used in the request which prompted the response and • the status-code (see later) in the status-line of the response. • As we have already seen, no response to a HEAD method may include a message-body, even if entity-header fields are present. • No response with one of the following status-code types may include a message-body: 1xx (informational), 204 (no content), and 304 (not modified). • All other responses do include a message-body, although it may be of zero length.
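The rules for responses reduce to a short test on the request method and the status-code. A minimal sketch, with the hypothetical function name response_has_body:

```python
def response_has_body(request_method: str, status_code: int) -> bool:
    """Say whether a response includes a message-body (possibly of zero length)."""
    if request_method == "HEAD":
        return False  # never a body, even if entity-headers are present
    if 100 <= status_code < 200:
        return False  # 1xx (informational)
    if status_code in (204, 304):
        return False  # 204 (no content), 304 (not modified)
    return True


print(response_has_body("HEAD", 200))
print(response_has_body("GET", 200))
print(response_has_body("GET", 304))
```

A client needs this test before reading a response, since it decides whether any octets follow the empty line that ends the header fields.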
General Header Fields • These are header fields which can appear in both request and response messages. • These header fields apply only to the message being transmitted (as opposed to the entity being carried by the message). • The types of general-header-fields are Cache-Control: Connection: Date: Pragma: Trailer: Transfer-Encoding: Upgrade: Via: Warning: • We have seen some of these headers already (e.g., Date:, Connection:, Transfer-Encoding:). • Some of the others may be presented later. • Otherwise, use the web to read RFC 2616.
Request format • Remember that a request message consists of • a request-line, • zero or more header-fields (general-headers or request-headers or entity-headers), • an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, • and (possibly) a message-body.
Request-line • The first line of a request message, the request-line, contains • a method-token, followed by • the request-URI and • the protocol version, and ending with • a CRLF. • These elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
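Splitting a request-line into its three elements can be sketched as follows. The function name parse_request_line is hypothetical, and the sketch is deliberately strict: it assumes exactly one SP between elements and rejects anything else.

```python
def parse_request_line(line: str):
    """Split a request-line into (method, request_uri, version)."""
    if not line.endswith("\r\n"):
        raise ValueError("request-line must end with CRLF")
    core = line[:-2]
    if "\r" in core or "\n" in core:
        raise ValueError("no CR or LF allowed except in the final CRLF")
    parts = core.split(" ")  # the elements are separated by SP characters
    if len(parts) != 3 or "" in parts:
        raise ValueError("expected: method SP request-URI SP version")
    method, request_uri, version = parts
    return method, request_uri, version


print(parse_request_line("GET /cs1064/jabowen/ HTTP/1.1\r\n"))
```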
Method-token • The Method token indicates the method to be performed on the resource identified by the Request-URI. • The token is case-sensitive. • HTTP/1.1 defines the following method tokens: OPTIONS GET HEAD POST PUT DELETE TRACE CONNECT • The semantics of these predefined method tokens will be defined later. • In addition to these predefined methods, HTTP/1.1 allows arbitrary “extension-methods” to appear in request-lines, provided sender and recipient programs have implemented semantics for them.
Request-URI • The Request-URI is a Uniform Resource Identifier • It identifies the resource upon which to apply the request. • It must be one of the following forms: "*" | absoluteURI | abs_path | authority • These options are dependent on the nature of the request.
Request-URI (contd.) • The asterisk "*" request-URI means that the request applies to the server itself, rather than to any specific resource on the server • therefore, it is allowed only when the method used does not require a resource. • One example request-line would be OPTIONS * HTTP/1.1 in which the client asks for the capabilities of the server.
Request-URI (contd.) • We have already seen the abs_path form, as in GET /cs1064/jabowen/ HTTP/1.1 • We could have used the absoluteURI form instead, as in GET http://student.cs.ucc.ie/cs1064/jabowen/ HTTP/1.1 • However, the absoluteURI form is required when the request is being made to a proxy. • The proxy is requested to forward the request, or service it from a valid cache, and return the response. • Note that the proxy may forward the request on to another proxy or directly to the server specified by the absoluteURI.
Request-URIs (contd.) • The last form of Request-URI, the authority form, is only used by a method we have not seen yet • the CONNECT method, which is reserved by the protocol for use with a proxy that can dynamically switch to being a tunnel, e.g. in Secure Sockets Layer (SSL) tunneling.
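The four forms of Request-URI can be told apart with a few simple checks. The sketch below is a rough classifier for illustration only; the function name request_uri_form is hypothetical, and it does not validate the URI itself.

```python
import re


def request_uri_form(request_uri: str) -> str:
    """Classify a Request-URI as one of the four forms described above."""
    if request_uri == "*":
        return "*"            # applies to the server itself
    if request_uri.startswith("/"):
        return "abs_path"     # path on the server named by the Host: header
    if re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", request_uri):
        return "absoluteURI"  # required when the request goes to a proxy
    return "authority"        # host[:port], used only by CONNECT


for uri in ("*",
            "/cs1064/jabowen/",
            "http://student.cs.ucc.ie/cs1064/jabowen/",
            "student.cs.ucc.ie:443"):
    print(uri, "->", request_uri_form(uri))
```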
Request Header Fields • The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. • The following are the types of request-header-fields defined in HTTP/1.1: Accept: Accept-Charset: Accept-Encoding: Accept-Language: Authorization: Expect: From: Host: If-Match: If-Modified-Since: If-None-Match: If-Range: If-Unmodified-Since: Max-Forwards: Proxy-Authorization: Range: Referer: TE: User-Agent: • The Host: header, which we have seen before, must appear in all HTTP/1.1 requests • The semantics of some of these fields will be given later. • Otherwise, use the web to read RFC 2616
Identification of resource in a request • The exact resource identified by an Internet request is determined by examining both • the Request-URI in the request-line and • the Host: request-header field. • HTTP/1.1 allows origin servers to support several “virtual” hosts and the Host: header is used to distinguish among the virtual hosts supported by the server listening to a connection • An origin server that does not support virtual hosts may ignore the Host: header field value when determining the resource identified by an HTTP/1.1 request. • An origin server that does support virtual hosts must use the following rules for determining the requested resource on a HTTP/1.1 request:
Identification (contd.) 1. If the Request-URI in the Request-line is an absoluteURI, the host is part of the Request-URI. • so any Host: request-header in the request must be ignored. 2. If the Request-URI is not an absoluteURI, and the request includes a Host request-header, the host is determined by the value in the Host request-header. 3. If the host as determined by rule 1 or 2 is not a valid host on the server, the response must be a 400 (Bad Request) error message.
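The three rules can be sketched as one function. The names effective_host and BadRequest are hypothetical, headers is assumed to be a dict keyed by lower-cased field name, and valid_hosts is assumed to be the set of lower-cased virtual host names that the server supports. A request with neither an absoluteURI nor a Host: header is also treated as a 400 here, since the Host: header must appear in all HTTP/1.1 requests.

```python
from urllib.parse import urlsplit


class BadRequest(Exception):
    """Stands for a 400 (Bad Request) response."""
    status_code = 400


def effective_host(request_uri, headers, valid_hosts):
    """Apply rules 1 to 3 to find the virtual host a request is aimed at."""
    parts = urlsplit(request_uri)
    if parts.scheme and parts.netloc:
        # Rule 1: absoluteURI, so the host comes from the Request-URI
        # and any Host: header is ignored.
        host = parts.hostname
    elif "host" in headers:
        # Rule 2: otherwise the host comes from the Host: header
        # (a port, if given, is dropped).
        host = headers["host"].split(":")[0].strip()
    else:
        host = None

    # Rule 3: a host that is not valid on this server gives a 400 error.
    if not host or host.lower() not in valid_hosts:
        raise BadRequest("400 Bad Request")
    return host.lower()


hosts = {"student.cs.ucc.ie", "www.ucc.ie"}
print(effective_host("http://student.cs.ucc.ie/cs1064/jabowen/",
                     {"host": "www.ucc.ie"}, hosts))
print(effective_host("/cs1064/jabowen/", {"host": "www.ucc.ie"}, hosts))
```

In the first call the Host: header names a different valid host, but rule 1 means the host in the absoluteURI wins.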