Content Negotiation and Transcoding Herng-Yow Chen
Outline
A single URL may need to correspond to different resources, for example to serve users who request different languages. HTTP provides content-negotiation methods that let clients and servers decide which of the resources behind a single URL to return; the alternative versions (e.g., French or English) are called variants.
Servers can also make other kinds of decisions about what content is best to send to a client for a particular URL. Servers can even generate customized pages automatically, for instance by converting an HTML page into a WML page for your handheld device. These dynamic content transformations are called transcodings.
Content-Negotiation Techniques
There are three distinct methods for deciding which page at a server is the right one for a client:
- Present the choice to the client.
- Decide automatically at the server.
- Ask an intermediary to select.
Client-Driven Negotiation
- The client makes a request.
- The server sends a list of choices to the client.
- The client chooses.
Disadvantage: two requests are needed, one to get the list and a second to get the selected copy. This increases latency, and the choice is a tedious manual step made in the browser on the client side.
Servers have two ways to present the choices (the selection itself is made manually):
- Send back an HTML page with links to the different versions of the page and descriptions of each.
- Send back an HTTP/1.1 response with the 300 Multiple Choices response code. The client browser may receive this response and display a page with the links, as in the first method, or it may pop up a dialog asking for a selection.
Another problem: multiple URLs are required, one for the main page and one for each specific page.
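The second option can be sketched in a few lines of Python. This is only an illustration: the variant table, the paths in it, and the function name multiple_choices_response are assumptions made for the example, not part of the slides or of any server's API.

```python
# Hypothetical variant table for one negotiated URL; the paths are illustrative.
VARIANTS = {
    "en": "/index.en.html",
    "fr": "/index.fr.html",
}


def multiple_choices_response(variants):
    """Build a raw HTTP/1.1 300 response whose HTML body links to each variant."""
    links = "\n".join(
        f'<li><a href="{path}">{lang}</a></li>'
        for lang, path in sorted(variants.items())
    )
    body = f"<html><body><ul>\n{links}\n</ul></body></html>"
    head = (
        "HTTP/1.1 300 Multiple Choices\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
    )
    return head + body
```

The client still has to issue a second request for whichever link the user picks, which is the latency cost noted above.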
Server-Driven Negotiation
The client-driven approach has several drawbacks, as discussed previously; the most significant is the extra communication between client and server needed to decide on the best page. Why not let the server decide which page to send back? For that, the client must send enough information about its preferences.
The server has two mechanisms for evaluating the proper response:
- Examining the set of content-negotiation headers. The server looks at the client’s Accept headers and tries to match them with the corresponding response headers.
- Varying on other (non-content-negotiation) headers. For example, the server could send responses based on the client’s User-Agent header.
We discussed entity headers in Chapter 15; they are like shipping labels that describe the attributes of the message body. Content-negotiation headers, on the other hand, are used by clients and servers to exchange preference information and to choose between different versions, so that the version that best matches the client’s preferences (as ranked by q values) is served.
Content-Negotiation Header Quality Values
For example, a client sends an Accept-Language header such as:
    Accept-Language: en;q=0.5, fr;q=0.0, nl; q=1.0, tr;q=0.0
The q value ranges from 0.0 to 1.0 (the highest preference). In this case, the client prefers to receive a Dutch (nl) version, but an English (en) version will do. Under no circumstances does the client want a French (fr) or Turkish (tr) version. The order of the entries is not important. Occasionally, the server may not have any documents that match any of the client’s preferences. In this case, the server may change or transcode the document to match the client’s preferences (discussed later).
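The q-value rule can be sketched as follows. This is a minimal illustration, not a complete HTTP implementation: the function names are invented for the example, the wildcard "*" and language-range prefix matching are not handled, and a language the client did not list is treated here as unacceptable.

```python
def parse_accept_language(header):
    """Return {language: q} from an Accept-Language value; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        prefs[lang] = q
    return prefs


def best_language(header, available):
    """Pick the available language with the highest non-zero q, or None."""
    prefs = parse_accept_language(header)
    # Simplifying assumption: unlisted languages get q = 0.0.
    candidates = [(prefs.get(lang, 0.0), lang) for lang in available]
    q, lang = max(candidates, default=(0.0, None))
    return lang if q > 0 else None
```

With the header from the slide and a server that holds only en and fr variants, this sketch picks en; with only fr and tr it returns None, which is the case where the server might fall back to transcoding.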
Varying on Other Headers
Servers can also attempt to match responses to other client request headers, such as User-Agent. A server may know that old versions of a browser, or certain browser types, do not support JavaScript, for example, and may therefore send back a version without JavaScript. In this case, there is no q value with which to look for an approximate best match. The server either looks for an exact match or simply serves whatever it has.
Because caches must attempt to serve the correct “best” version of a cached document, HTTP defines a Vary header that the server sends in responses. The Vary header tells caches (and clients, and any downstream proxies) which headers the server is using to determine the best version of the response to send (discussed later).
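A rough sketch of this exact-match behavior is shown below. The User-Agent markers and file names are hypothetical placeholders chosen for the example; a real server would take them from its own configuration.

```python
# Hypothetical markers for browsers assumed (for this example) to lack JavaScript.
NO_JAVASCRIPT_AGENTS = ("OldBrowser/1.", "TextOnly")


def choose_page(user_agent):
    """Return the script-free page when a known marker matches, else the default."""
    for marker in NO_JAVASCRIPT_AGENTS:
        if marker in user_agent:
            return "index.noscript.html"
    # No match and no q values to rank by: serve whatever the server has.
    return "index.html"
```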
Content Negotiation on Apache
Suppose a web site content provider (Joe, for example) wants to provide different versions of his index page. Joe must put all his index page files in the appropriate directory of the Apache server. There are two ways to enable content negotiation:
- In the web site directory, create a type-map file for each URI in the web site that has variants.
- Enable the MultiViews directive, which causes Apache to create type-map files for the directory automatically.
Using a type-map file
Configure the type-map handler in the server configuration:
    AddHandler type-map .var
Here is a sample type-map file:
    URI: joes-hardware.html

    URI: joes-hardware.en.html
    Content-type: text/html
    Content-language: en

    URI: joes-hardware.fr.de.html
    Content-type: text/html;charset=iso-8859-2
    Content-language: fr, de
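To make the file layout concrete, here is a small Python sketch that reads a type-map into one record per variant. It is not Apache's parser; the function name is invented, and the SAMPLE text is an assumption modeled on the slide's type-map (blank lines separate records, each line is a "Name: value" pair).

```python
import re

# Assumed sample, modeled on the type-map shown on the slide.
SAMPLE = """URI: joes-hardware.html

URI: joes-hardware.en.html
Content-type: text/html
Content-language: en

URI: joes-hardware.fr.de.html
Content-type: text/html;charset=iso-8859-2
Content-language: fr, de
"""


def parse_type_map(text):
    """Split a type-map into records (dicts with lower-cased header names)."""
    variants = []
    for block in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            record[name.strip().lower()] = value.strip()
        if record:
            variants.append(record)
    return variants
```

Each record after the first describes one variant, so a selection routine can compare its content-language value against the client's Accept-Language preferences.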
Using MultiViews
Use the Options directive to enable MultiViews for the directory (in a <Directory>, <Location>, or <Files> section). The server looks for all files with “joes-hardware” in the name and creates a type-map file for them. Based on the file names, the server guesses the content-negotiation headers to which the files correspond.
There are two other ways to implement content negotiation at the server:
- A server-side extension, such as Microsoft’s Active Server Pages (ASP).
- Any CGI program, i.e., doing it yourself.
Transparent Negotiation
Transparent negotiation seeks to move the load of server-driven negotiation away from the server, while minimizing message exchange with the client, by having an intermediary proxy negotiate on behalf of the client. The proxy is assumed to know the client’s expectations and to be capable of performing the negotiation on its behalf.
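The name-based guessing can be illustrated with a short sketch. It does not reproduce Apache's actual logic: the extension tables and the function name guess_variant are assumptions for the example, whereas a real server derives these mappings from its configuration.

```python
# Illustrative extension tables (assumed for this example).
LANGUAGES = {"en", "fr", "de", "es"}
TYPES = {"html": "text/html", "txt": "text/plain"}


def guess_variant(filename, base):
    """Guess negotiation attributes from the extensions that follow `base`."""
    if not filename.startswith(base + "."):
        return None  # not a variant of this resource
    info = {"uri": filename, "languages": [], "type": None}
    for ext in filename[len(base) + 1:].split("."):
        if ext in LANGUAGES:
            info["languages"].append(ext)
        elif ext in TYPES:
            info["type"] = TYPES[ext]
    return info
```

Running it over a directory listing yields the same kind of records that a hand-written type-map would contain.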
Caches use content-negotiation headers to send back correct responses to clients
[Figure: users, a cache, and a web server. The page exists in three variants: “Hi! Welcome to Joe’s Hardware Store.”, “Hola! Bienvenido a Joe’s Hardware Store.”, and “Bonjour! Bienvenue a Joe’s Hardware Store.”]
Request from the French-speaking user:
    GET / HTTP/1.1
    Host: www.joes-hardware.com
    User-agent: spiffy multimedia browser
    Accept-language: fr;q=1.0
The French-speaking user receives the “Bonjour!” variant.
Request from the Spanish-speaking user:
    GET / HTTP/1.1
    Host: www.joes-hardware.com
    User-agent: spiffy multimedia browser
    Accept-language: es;q=1.0
The Spanish-speaking user receives the “Hola!” variant.
The Vary Header
The huge number of different User-Agent and Cookie values could generate many variants:
    Vary: User-Agent, Cookie
Caches match request headers
Request from French-speaking user 1:
    GET / HTTP/1.1
    Host: www.joes-hardware.com
    User-agent: spiffy multimedia browser
    Accept-language: fr;q=1.0
Web server: “I need to send her the French document. Since she has such a cool browser, I’ll send her a media-rich version of the page.”
Response:
    HTTP/1.1 200 OK
    Content-language: fr
    Vary: User-agent

    Bonjour […media-rich content]
Caches match request headers
Request from French-speaking user 2:
    GET / HTTP/1.1
    Host: www.joes-hardware.com
    User-agent: wimpy wireless device
    Accept-language: fr;q=1.0
Cache: “He wants a French copy of the document and I have it in my cache, but I’d better not send it to him. The server said my cached copy was for a spiffy browser. This guy has a wimpy wireless one. I had better ask the server for a French version for the wireless browser.”
Response from the web server:
    HTTP/1.1 200 OK
    Content-language: fr
    Vary: User-agent

    Bonjour […simple text content]
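The cache's reasoning in these two slides can be sketched as a toy Vary-aware cache. This is an illustration under simplifying assumptions, not a real cache: the class name VaryCache is invented, header dictionaries are assumed to use lower-case keys, and header values are compared as exact strings.

```python
class VaryCache:
    """Toy cache holding several variants per URL, selected by the Vary'd headers."""

    def __init__(self):
        # url -> list of (vary_header_names, request_values_at_store_time, body)
        self._entries = {}

    def store(self, url, request_headers, response_headers, body):
        vary = response_headers.get("vary", "")
        names = [n.strip().lower() for n in vary.split(",") if n.strip()]
        values = tuple(request_headers.get(n, "") for n in names)
        self._entries.setdefault(url, []).append((names, values, body))

    def lookup(self, url, request_headers):
        for names, values, body in self._entries.get(url, []):
            if tuple(request_headers.get(n, "") for n in names) == values:
                return body
        return None  # miss: the request must be forwarded to the server
```

Because the stored response carried Vary: User-agent, a request with a different User-agent value misses even though the URL and language are the same, which matches the cache's decision for user 2.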
Transcoding
We have discussed the mechanisms by which clients and servers can choose among a set of documents for a URL and send the one that best matches the client’s needs. What happens, however, when a server does not have a document that matches the client’s needs at all? The server could respond to the client with an error, but there is another solution: transcoding, that is, transforming the unsatisfactory document into something that the client can use.
Three Categories of Transcoding
- Format conversion
  - Compatibility problems
  - Bandwidth issues
- Information synthesis
  - Information summary
  - Advertisement removal
- Content injection (increasing the amount of content)
  - Automatic ad generators
  - User-tracking systems, which collect statistics about how the page is viewed and how clients surf the Web
Transcoding Versus Static Pregeneration
An alternative to transcoding is to build different copies of web pages at the web server: for example, one with HTML, one with WML, one with high-resolution images, and one with low-resolution images. Is this practical? It raises storage costs and management problems.
Content transformation or transcoding at a proxy cache
[Figure: a French-speaking user, a cache with a transmogrifier, and a web server.]
Request:
    GET / HTTP/1.1
    Host: www.joes-hardware.com
    User-agent: wimpy wireless device
    Accept-language: fr;q=1.0
Cache: “I have a French copy of the document that he wants, but my copy is very media-rich and he has a wimpy wireless browser. I will strip out all of the multimedia content and send it to him.”
Response:
    HTTP/1.1 200 OK
    Content-language: fr
    Vary: User-agent

    Bonjour […simple text content]
Cache: “Since I have transformed this document for a wireless device, I will store the transformed copy as an alternate in case someone else wants it as well.”
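One possible form of the "strip out all of the multimedia content" step is sketched below with Python's standard html.parser module. It is an assumption-laden illustration rather than the slides' transmogrifier: the sets of tags treated as media are chosen for the example, and comments and doctype declarations are simply dropped.

```python
from html.parser import HTMLParser

VOID_MEDIA = {"img", "embed"}                   # media tags with no closing tag
CONTAINER_MEDIA = {"video", "audio", "object"}  # media tags that wrap content


class MediaStripper(HTMLParser):
    """Copy HTML through, dropping media elements and everything inside them."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a container media element

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINER_MEDIA:
            self.skip_depth += 1
        elif tag not in VOID_MEDIA and self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>: keep them unless they are media.
        if tag not in VOID_MEDIA | CONTAINER_MEDIA and self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in CONTAINER_MEDIA:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag not in VOID_MEDIA and self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")


def strip_media(html_text):
    """Return html_text with media elements removed."""
    parser = MediaStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

A proxy cache could store the output of strip_media as an alternate variant next to the media-rich original, as the cache in the figure describes.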
For More Information
- RFC 2616, Hypertext Transfer Protocol -- HTTP/1.1
- RFC 2295, Transparent Content Negotiation in HTTP
- RFC 2296, HTTP Remote Variant Selection Algorithm -- RVSA/1.0
- RFC 2936, HTTP MIME Type Handler Detection
- http://www.imc.org/ietf-medfree/index.html, a link to the Content Negotiation (CONNEG) working group