HTTP Messages: Entities and Encoding. Herng-Yow Chen.
Outline.
The format and behavior of HTTP message entities as HTTP containers.
How HTTP describes the size of entity bodies, and what HTTP requires in the way of sizing.
The entity headers used to describe the format, alphabet, and language of content, so clients can process it properly.
Reversible content encoding transforms the data format to take up less space or be more secure.
Transfer encoding modifies how HTTP ships data to enhance the communication of some kinds of data.
Chunked encoding chops data into multiple pieces to deliver content of unknown length safely.
The assortment of tags, labels, times, and checksums helps clients get the latest version of requested content.
Ranges are useful for continuing aborted downloads where they left off.
Delta encoding extensions allow a client to request just those parts of a web page that have actually changed since a previously viewed revision.
Checksums of entity bodies are used to detect changes in entity content as it passes through proxies.
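Message is made up of header and body.
    HTTP/1.0 200 OK
    Server: Netscape-Enterprise/3.6
    Date: Sun, 17 Sep 2000 00:01:05 GMT
    Content-Type: text/plain
    Content-Length: 18

    Hi! I'm a message!
Content-Type and Content-Length are the entity headers. The text after the blank line is the entity body. The entity headers and the entity body together form the entity.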
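HTTP/1.1 defines 10 entity headers: Content-Type, Content-Length, Content-Language, Content-Encoding, Content-Location, Content-Range, Content-MD5, Last-Modified, Expires, Allow. ETag (a response header) and Cache-Control (a general header) are not entity headers in RFC 2616, but they also describe the entity and how it may be cached.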
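Why is Content-Length important?
Detecting truncation: without a Content-Length, the receiver cannot tell a normal connection close from a crash in the middle of a message.
Incorrect Content-Length: a wrong value can be worse than a missing one, because the receiver will read too little or too much.
Persistent connections: Content-Length tells the receiver where one entity body ends and the next message begins.
Chunked encoding is an alternative: the data is sent in a series of chunks, each with a specified chunk size.
When a content encoding is applied, Content-Length refers to the length of the encoded body, not the length of the original, unencoded body.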
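The sizing rules above can be sketched in a few lines of Python. This is a minimal illustration, not a real HTTP implementation; the helper names build_response and read_body are made up for this example, and header parsing is deliberately simplistic.

```python
import gzip


def build_response(body: bytes, use_gzip: bool = False) -> bytes:
    """Build a tiny response; Content-Length counts the bytes actually sent."""
    headers = ["HTTP/1.1 200 OK", "Content-Type: text/plain"]
    if use_gzip:
        body = gzip.compress(body)  # length below is the *encoded* length
        headers.append("Content-Encoding: gzip")
    headers.append(f"Content-Length: {len(body)}")
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + body


def read_body(message: bytes) -> bytes:
    """Return exactly Content-Length bytes of body, or raise on truncation."""
    head, _, rest = message.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("no Content-Length header")
    if len(rest) < length:
        raise ValueError("truncated body")
    return rest[:length]  # anything after this belongs to the next message
```

On a persistent connection the bytes following the returned body would be the start of the next message, which is why a correct length matters.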
Entity Digest. The Content-MD5 header carries an MD5 digest of the entity body and is used to check message integrity. The digest can also be used as a key into a hash table to quickly locate documents and reduce duplicate storage of content.
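As a rough sketch of both uses, the following Python computes a Content-MD5 value (per RFC 1864, the base64 encoding of the 128-bit MD5 digest) and uses it as a dictionary key. The function names and the in-memory store are assumptions made for illustration only.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64 of the raw MD5 digest, the form used in the Content-MD5 header."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_is_intact(body: bytes, header_value: str) -> bool:
    """Recompute the digest on receipt and compare it with the header."""
    return content_md5(body) == header_value.strip()


def store_once(store: dict, body: bytes) -> str:
    """Use the digest as a hash-table key so identical bodies are kept once."""
    key = content_md5(body)
    store.setdefault(key, body)
    return key
```

MD5 detects accidental modification only; it should not be relied on against a deliberate attacker.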
Media Type and Charset. Content-Type describes the type of the original entity body, before any content encoding. It supports optional parameters that further specify the content type. Character encodings for text media are given with the charset parameter, for example:
    Content-Type: text/html; charset=iso-8859-4
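A client has to split the header into media type and charset before it can turn body bytes into text. The sketch below does this with Python's standard email.message.Message class; the helper names and the choice of iso-8859-1 as the fallback charset are assumptions of this example.

```python
from email.message import Message


def parse_content_type(value: str):
    """Split a Content-Type value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


def decode_text_body(body: bytes, content_type: str, default: str = "iso-8859-1") -> str:
    """Decode body bytes using the declared charset (fallback is an assumption)."""
    _, charset = parse_content_type(content_type)
    return body.decode(charset or default)
```

The same bytes decoded with the wrong charset yield different characters, which is why the parameter matters for text media.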
Multipart Media Types. MIME "multipart" email messages contain multiple messages stuck together and sent as a single, complex message. Each component is self-contained, with its own headers describing its contents; the components are concatenated and delimited by a boundary string. HTTP also supports multipart bodies, but they are generally used in only two cases: fill-in form submissions and range responses carrying pieces of a document.
Multipart Form Submissions
    <form action="http://xxx/cgi" enctype="multipart/form-data" method="POST">
    <P>
    Your Name? <INPUT type="text" name="submit-name"><br>
    Your File to send? <INPUT type="file" name="files"><br>
    <INPUT type="submit" value="send"> <INPUT type="reset">
    </form>
If the user enters "John" and selects the text file "hello.txt", the request carries:
    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"; filename="hello.txt"
    Content-Type: text/plain

    … contents of hello.txt …
    --AaBo3x--
Each part starts with "--" followed by the boundary; the last part is closed by the boundary followed by a trailing "--".
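The layout of such a body can be reproduced with a short Python sketch. The function build_form_data and its argument shapes are hypothetical, and the sketch does no escaping of names or filenames, which a real client would need.

```python
def build_form_data(fields, files, boundary: str) -> bytes:
    """Assemble a multipart/form-data body.

    fields: list of (name, text value)
    files:  list of (name, filename, content type, bytes content)
    """
    delimiter = "--" + boundary
    parts = []
    for name, value in fields:
        head = f'{delimiter}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(head.encode("utf-8") + value.encode("utf-8") + b"\r\n")
    for name, filename, ctype, content in files:
        head = (
            f'{delimiter}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: {ctype}\r\n\r\n'
        )
        parts.append(head.encode("utf-8") + content + b"\r\n")
    # The closing delimiter carries an extra trailing "--".
    parts.append((delimiter + "--\r\n").encode("ascii"))
    return b"".join(parts)
```

Calling it with the field ("submit-name", "John"), the file ("files", "hello.txt", "text/plain", ...) and the boundary "AaBo3x" yields a body with the same structure as the slide's example.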
If the user selects the text file "hello.txt" and a second file, the image "image.gif", the files are nested in a multipart/mixed part:
    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"
    Content-Type: multipart/mixed; boundary=BbC04y

    --BbC04y
    Content-Disposition: file; filename="hello.txt"
    Content-Type: text/plain

    … contents of hello.txt …
    --BbC04y
    Content-Disposition: file; filename="image.gif"
    Content-Type: image/gif
    Content-Transfer-Encoding: binary

    … contents of image.gif …
    --BbC04y--
    --AaBo3x--
Multipart Range Response
    HTTP/1.0 206 Partial Content
    Server: Microsoft-IIS/5.0
    Content-Location: http://xxx/hello.txt
    Content-Type: multipart/x-byteranges; boundary=--[abcdefghik…z]--

    ----[abcdefghik…z]--
    Content-Type: text/plain
    Content-Range: bytes 0-174/1441

    … Part I content …
    ----[abcdefghik…z]--
    Content-Type: text/plain
    Content-Range: bytes 1344-1440/1441

    … Part II content …
    ----[abcdefghik…z]----
Byte positions are zero-based, so the last byte of a 1441-byte document is byte 1440.
Content-Encoding. HTTP applications sometimes want to encode content before sending it, to help lessen the time it takes to transmit the data. Content-Type is the type of the original format, before encoding. Content-Length is the length of the encoded body.
Content Encoding (figure). The original content (Content-Type: text/html, Content-Length: 17571) passes through a gzip content encoder and becomes the content-encoded content (Content-Type: text/html, Content-Length: 5746, Content-Encoding: gzip). At the receiver, a gzip content decoder restores the original content (Content-Type: text/html, Content-Length: 17571).
Accept-Encoding Headers
Request message (client to server):
    GET /logo.gif HTTP/1.1
    Accept-Encoding: gzip
    […]
Response message (server to client):
    HTTP/1.1 200 OK
    Content-Type: image/gif
    Content-Encoding: gzip
    […]
The server compresses the image with gzip to transport a smaller file over the thin network connection between itself and the client. This saves network bandwidth and reduces the amount of time that the client waits for the transfer, though the client has to spend time decompressing the image once it is served.
A client can indicate preferred encodings by attaching Q (quality) values, from 0.0 (not acceptable) to 1.0 (most preferred):
    Accept-Encoding: compress, gzip
    Accept-Encoding:
    Accept-Encoding: *
    Accept-Encoding: compress;q=0.5, gzip;q=1.0
    Accept-Encoding: gzip;q=1.0, identity;q=0.5, *;q=0
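How a server might pick an encoding from these headers can be sketched as follows. The function names are invented for this example, and the tie-breaking rule (prefer whichever supported coding the server lists first) is an assumption, not something HTTP mandates.

```python
def parse_accept_encoding(header: str) -> dict:
    """Map each listed coding to its q value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                q = float(value)
        prefs[coding.strip().lower()] = q
    return prefs


def choose_encoding(header: str, supported: list):
    """Return the best supported coding, or None if nothing is acceptable."""
    prefs = parse_accept_encoding(header)

    def quality(coding: str) -> float:
        if coding in prefs:
            return prefs[coding]
        if "*" in prefs:
            return prefs["*"]
        # Unlisted codings are unacceptable, except identity.
        return 1.0 if coding == "identity" else 0.0

    # Higher q wins; ties go to the coding listed first in `supported`.
    best = max((quality(c), -i, c) for i, c in enumerate(supported))
    return best[2] if best[0] > 0 else None
```

For example, with the header "compress;q=0.5, gzip;q=1.0" and a server supporting gzip and compress, the sketch selects gzip.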
Transfer Encoding. Content encodings encode the entity content itself, to save space or for security reasons, and are tightly associated with the content format. Transfer encodings, in comparison, are applied for architectural reasons and are independent of the content format.
Content encoding vs. transfer encoding
Content-encoded response (normal header block, normal entity that is just encoded):
    HTTP/1.0 200 OK
    Content-Encoding: gzip
    Content-Type: text/html
    […]

    [encoded message]
Transfer-encoded response (basic header, body split into encoded blocks):
    HTTP/1.1 200 OK
    Transfer-Encoding: chunked

    b
    abcdefghijk
    1
    a
A content-encoded message just encodes the entity section of the message. With transfer-encoded messages, the encoding is a function of the entire message, changing the structure of the message itself. Each chunk starts with its size in hexadecimal (b is 11 bytes), and a chunk of size 0 ends the body.
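The chunked format shown in the transfer-encoded response can be sketched in Python. Both functions are illustrative helpers invented here; the decoder ignores chunk extensions and does not parse trailers.

```python
def encode_chunked(pieces) -> bytes:
    """Frame each non-empty piece as <hex size>CRLF<data>CRLF, then a 0 chunk."""
    out = bytearray()
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, no trailers
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from chunked framing (trailers are not handled)."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size_field = data[pos:eol].split(b";", 1)[0]  # drop any chunk extension
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            return bytes(body)
        body += data[pos:pos + size]
        pos += size + 2  # skip the data and its trailing CRLF
```

Because every chunk announces its own size, the sender never needs to know the total length in advance, and the connection can stay open for the next message.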
Transfer-Encoding Headers
TE: used in the request header to tell the server which extension transfer encodings are okay to use.
Transfer-Encoding: tells the receiver which transfer encoding has been performed on the message, for example in a response sent to the client.
Example
Request:
    GET /1.html HTTP/1.1
    Host: www.csie.ncnu.edu.tw
    User-Agent: Mozilla/4.61
    TE: trailers, chunked
Response:
    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Server: Apache/3.0
Chunked Encoding (continued). Topics: chunking and persistent connections; trailers in chunked messages; combining content and transfer encodings.
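Combining Content and Transfer Encodings (figure). The sender starts with the original entity (Content-Type: text/html). Content encoding is applied first, producing a gzip-compressed entity (Content-Type: text/html, Content-Encoding: gzip). Transfer encoding is applied second, splitting the compressed entity into chunks for transmission (Content-Type: text/html, Content-Encoding: gzip, Transfer-Encoding: chunked). The receiver reverses the steps: it reassembles the chunks, then decompresses the entity.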
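The ordering of the two encodings can be illustrated with a small Python sketch using the standard gzip module. The function names and the 1024-byte chunk size are arbitrary choices for this example; trailers and chunk extensions are not handled.

```python
import gzip


def chunk(data: bytes, size: int = 1024) -> bytes:
    """Transfer-encode: split data into hex-sized chunks, end with a 0 chunk."""
    out = b""
    for i in range(0, len(data), size):
        part = data[i:i + size]
        out += b"%x\r\n" % len(part) + part + b"\r\n"
    return out + b"0\r\n\r\n"


def unchunk(wire: bytes) -> bytes:
    """Undo the chunked framing."""
    body, pos = b"", 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol].decode("ascii"), 16)
        if size == 0:
            return body
        start = eol + 2
        body += wire[start:start + size]
        pos = start + size + 2


def send(html: bytes) -> bytes:
    # Content encoding first (gzip), transfer encoding second (chunked).
    return chunk(gzip.compress(html))


def receive(wire: bytes) -> bytes:
    # The receiver peels the layers off in the opposite order.
    return gzip.decompress(unchunk(wire))
```

The chunk sizes refer to the compressed bytes, since chunking operates on whatever the content encoding produced.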
Time-Varying Instances. Web objects are usually not static. The same URL can, over time, point to different versions of an object. Examples are the websites of media companies such as CNN and BBC.
Validators and Freshness. In the previous CNN example, the client got the initial resource V1 and can cache this copy, but for how long? Once the document has "expired" at the client, the client must request a fresh copy from the server. It can use a "conditional request" to tell the server which version it currently has, identified by a validator, and ask for a copy to be sent only if its current copy is no longer valid.
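The server side of such a conditional request can be sketched in Python, assuming the validator is an entity tag sent in If-None-Match. The function name, the dictionary-based headers, and the tag values are hypothetical; only the weak-comparison prefix W/ is handled beyond plain string matching.

```python
def handle_conditional_get(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body): 304 if the client's copy is still valid."""

    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag  # ignore weakness flag

    raw = request_headers.get("If-None-Match", "")
    client_tags = [opaque(t) for t in raw.split(",") if t.strip()]
    if "*" in client_tags or opaque(current_etag) in client_tags:
        # The cached copy is current: send headers only, no entity body.
        return 304, {"ETag": current_etag}, b""
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return 200, headers, body
```

A client holding version V1 that sends the tag for V1 receives a bodiless 304 while V1 is current, and a full 200 response once the server has moved on to a newer version.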
Range Request. HTTP allows clients to request just a part, or range, of a document. Applications: requesting a region of interest (RoI); media indexing and access; streaming applications.
Range Requests (client talking to www.csie.ncnu.edu.tw)
Request message:
    GET /bigfile.html HTTP/1.1
    […]
Response message:
    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 65537
    Accept-Ranges: bytes
    […]
Range request message:
    GET /bigfile.html HTTP/1.1
    Range: bytes=20224-
    […]
Range response message:
    HTTP/1.1 206 Partial Content
    Content-Type: text/html
    Content-Range: bytes 20224-65536/65537
    Content-Length: 45313
    Accept-Ranges: bytes
    […]
The client's original request was interrupted, but a second request for the part of the message that was not received allows the client to resume from the point of the interruption. A successful range response uses status 206 Partial Content and a Content-Range header; Range itself is a request header.
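A server's handling of a single byte range can be sketched in Python as below. The function names are made up for this example, only one range per request is supported, and unparseable or unsatisfiable ranges simply fall back to the full body here, whereas a real server would answer an unsatisfiable range with a 416 status.

```python
def parse_range(header: str, total: int):
    """Return (first, last) byte positions, inclusive, or None if unusable."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # only a single byte range is handled in this sketch
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    if first == "":  # suffix form "bytes=-N": the final N bytes
        if not last.isdigit() or int(last) == 0:
            return None
        start, end = max(total - int(last), 0), total - 1
    else:
        if not first.isdigit():
            return None
        start = int(first)
        if last == "":
            end = total - 1  # open-ended form "bytes=N-"
        elif last.isdigit():
            end = min(int(last), total - 1)
        else:
            return None
    if start > end or start >= total:
        return None
    return start, end


def serve_range(body: bytes, range_header: str):
    """Return (status, headers, payload) for a request carrying a Range header."""
    rng = parse_range(range_header, len(body))
    if rng is None:
        return 200, {"Content-Length": str(len(body))}, body
    start, end = rng
    part = body[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(body)}",
        "Content-Length": str(len(part)),
    }
    return 206, headers, part
```

For a 65537-byte file and the header "bytes=20224-", the sketch returns the final 45313 bytes, which is what a client resuming an interrupted download needs.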
Delta Encoding. An extension to the HTTP protocol that optimizes transfers by communicating changes instead of entire objects. RFC 3229 describes delta encoding.
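Delta-encoding headers.
ETag: unique identifier for each instance of a document, sent by the server.
If-None-Match: request header in which the client names the version it already has.
A-IM: request header listing the instance manipulations (delta algorithms) the client accepts.
IM: response header naming the instance manipulation applied to the response.
Delta-Base: response header giving the ETag of the base document the delta was computed against.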
For More Information
http://www.ietf.org/rfc/rfc2616.txt Hypertext Transfer Protocol -- HTTP/1.1
http://www.ietf.org/rfc/rfc3229.txt Delta Encoding in HTTP
http://www.ietf.org/rfc/rfc1521.txt MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
http://www.ietf.org/rfc/rfc2045.txt Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
http://www.ietf.org/rfc/rfc1864.txt The Content-MD5 Header Field
http://www.ietf.org/rfc/rfc3230.txt Instance Digests in HTTP