The Web • Core Ideas and Technologies • HTTP • MIME Types • URIs • ReST
Uniform Access - HTTP • Main access/manipulation protocol of the Web • A means of exchanging resource representations across the network • Request/response based interactions • Supports a limited number of requests • GET – retrieve a resource – no payload • POST – send a resource to a server – payload • PUT – update or alter a resource on a server - payload • If the resource does not exist, the server may allow the resource to be created • DELETE – remove a resource from a server – no payload • HEAD – find out what metadata you would get back if you performed a GET – no payload • OPTIONS – find out what operations are permitted on a resource – no payload
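The request types above can be sketched against a toy in-memory store. This is a minimal illustration, not a real server: the `resources` dict, the `handle` function and the particular status codes chosen (204 for a successful DELETE, 405 for an unknown method) are assumptions made for the example.

```python
# Hypothetical in-memory "server": maps paths to stored representations.
resources = {}

def handle(method, path, body=None):
    """Return (status_code, payload) for one request against `resources`."""
    if method == "GET":
        # Retrieve a representation; the request carries no payload.
        return (200, resources[path]) if path in resources else (404, None)
    if method == "HEAD":
        # Same status as GET, but the response never carries a payload.
        return (200 if path in resources else 404, None)
    if method == "PUT":
        # Replace the resource, or create it if it does not exist yet.
        status = 200 if path in resources else 201
        resources[path] = body
        return (status, None)
    if method == "POST":
        # The server, not the client, picks the URI of the new resource.
        new_path = f"{path}/{len(resources) + 1}"
        resources[new_path] = body
        return (201, new_path)
    if method == "DELETE":
        if path in resources:
            del resources[path]
            return (204, None)
        return (404, None)
    if method == "OPTIONS":
        return (200, "GET, HEAD, PUT, POST, DELETE, OPTIONS")
    return (405, None)
```

For instance, `handle("PUT", "/weather", "sunny")` creates the resource and a following `handle("GET", "/weather")` returns its representation.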
HTTP • When a server receives a request, it returns a response containing a response code, headers and possibly a resource representation. • Codes are divided into types • 100 – 199 – informational • 200 – 299 – success • 300 – 399 – redirect • 400 – 499 – client error • 500 – 599 – server error • examples: • 200 OK • 201 Created • 403 Forbidden • 301 Moved Permanently (a redirect, with ‘Location’ header) • 500 Internal Server Error
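Because the class of a response is carried by the first digit of the code, a client can classify any code it receives. A small sketch follows; `status_class` is a name invented for this example, while `HTTPStatus` comes from Python's standard library.

```python
from http import HTTPStatus

# First digit of the code -> class of response, as listed on the slide.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}

def status_class(code):
    """Classify a status code by its hundreds digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return CLASSES[code // 100]

for code in (200, 201, 301, 403, 500):
    # HTTPStatus supplies the standard reason phrase for each code.
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```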
HTTP Headers • The HTTP message is an envelope • It starts with headers • Then the body (payload), if any.
• request:
GET /weather/ HTTP/1.1
Host: www.bbc.co.uk
User-Agent: curl/7.16.3
Accept: */*
• response:
HTTP/1.1 200 OK
Server: Apache/2.0.59 (Unix)
Content-Length: 12345
Content-Type: text/html

…payload…
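The envelope structure can be made concrete by splitting a raw message into its start line, headers and body. This is a simplified sketch: `parse_message` is a hypothetical helper, and it ignores details a real parser must handle, such as repeated headers.

```python
def parse_message(raw):
    """Split a raw HTTP message into (start_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

# The request from the slide, with explicit line endings.
raw_request = (
    "GET /weather/ HTTP/1.1\r\n"
    "Host: www.bbc.co.uk\r\n"
    "User-Agent: curl/7.16.3\r\n"
    "Accept: */*\r\n"
    "\r\n"
)
start_line, headers, body = parse_message(raw_request)
print(start_line)        # the request line
print(headers["host"])   # the target host
print(repr(body))        # empty: a GET carries no payload
```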
MIME • Multipurpose Internet Mail Extensions • not used just in email now • The Content-Type header field points to a MIME Type • A (slowly growing) list of known media types • text/plain • text/xml • text/html • image/gif • image/jpeg • Developers/designers are encouraged to reuse MIME types • keeping the list manageable helps interoperability • A different approach to Object interfaces
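Since MIME types come from email, Python's standard `email` package can take apart a Content-Type value as it would appear in an HTTP response. A brief sketch; the `charset=UTF-8` parameter is an assumption added for illustration and does not appear on the slide.

```python
from email.message import Message

msg = Message()
# A Content-Type value as a server might send it (charset is illustrative).
msg["Content-Type"] = "text/html; charset=UTF-8"

print(msg.get_content_type())      # full media type
print(msg.get_content_maintype())  # top-level type
print(msg.get_content_subtype())   # subtype
print(msg.get_content_charset())   # charset parameter, lower-cased
```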
Common HTTP Headers • Host: the target host • Content-Length: length (in bytes) of the resource representation • Content-Type: MIME type of the representation • Accept: MIME types the client will accept • Transfer-Encoding: can be chunked. This means the data is sent in chunks, rather than all in one go. • Useful for data whose length is not known when transmission starts. • Content-Encoding: e.g. gzip, deflate. Allows the content to be compressed. • ETag: a server-defined ID for a version of a resource • If-Match, If-None-Match: allow conditional requests based on the ETag • Date: the date and time at which the message was generated • If-Modified-Since, If-Unmodified-Since: allow conditional requests based on modification time • Expect: e.g. 100-continue • The server responds with a 100 status code if the client should continue with the request by sending the payload. • WWW-Authenticate: authentication challenge issued by a server • Many caching-related headers
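A conditional request built on the entity tag and If-None-Match can be sketched as follows. The helper names and the choice of a truncated SHA-256 hash as the tag are assumptions for this example; a server is free to generate its tags any way it likes.

```python
import hashlib

def make_etag(representation):
    """Derive a quoted entity tag from the bytes of a representation."""
    return '"' + hashlib.sha256(representation).hexdigest()[:16] + '"'

def conditional_get(representation, request_headers):
    """Return (status, headers, body), honouring If-None-Match."""
    etag = make_etag(representation)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send no payload.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag, "Content-Length": str(len(representation))}
    return 200, headers, representation

page = b"<html>sunny</html>"
status, headers, body = conditional_get(page, {})
# Revalidate the cached copy using the tag from the first response.
status2, _, body2 = conditional_get(page, {"If-None-Match": headers["ETag"]})
print(status, status2)
```

The second request costs only headers on the wire, which is what makes this kind of revalidation attractive for caches.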
HTTP Headers are extensible • Typically start with ‘X-’ • Allows application specific metadata • Examples from Rackspace: • X-Auth-User – user name in request • X-Auth-Key – user API key in request • X-Auth-Token – user token with lifetime in response • Mobiles: • x-roaming - is the user roaming? • x-nokia-msisdn – user’s mobile number in plain text • HTTP_X_BROWSER_HEIGHT • HTTP_X_DEVICE_TYPE • Even: • HTTP_X_ME_CUSTOM_ITEM_1
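The upper-case names in the list (HTTP_X_BROWSER_HEIGHT and so on) look like the CGI-style form in which a web server hands request headers to an application: an HTTP_ prefix, upper case, hyphens turned into underscores. Reading them that way is an assumption, and `cgi_name` is a helper invented for this sketch.

```python
def cgi_name(header):
    """Convert a header name to its CGI-style environment variable name."""
    return "HTTP_" + header.upper().replace("-", "_")

for header in ("X-Device-Type", "X-Browser-Height", "x-nokia-msisdn"):
    print(header, "->", cgi_name(header))
```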
Uniform Naming - URIs • Uniform Resource Identifier • An identifier for a resource on the Web. • A superset comprising URNs (Uniform Resource Names) and URLs (Uniform Resource Locators). URLs can be dereferenced (i.e. they represent an ‘address’) • Examples: • http://cs.cf.ac.uk/user • urn:isbn:0-395-36341-1 • urn:jxta:uuid:59616261646162614A78746150325033F3BC76 • The first is a URL - it represents an address - you can GO there - PPT spotted that • The second is a URN - it identifies a book uniquely • The third is a URN, but using the Jxta Protocol, one may be able to resolve this to a URL. PPT obviously does not support Jxta.
URIs • URIs can be either hierarchical or non-hierarchical. • A non-hierarchical URI is considered opaque. • A hierarchical URI has a number of elements. • All URIs start with a scheme element • This example is hierarchical: http://cs.cf.ac.uk:8080/harrison/resume.html#intro • scheme: http • authority: cs.cf.ac.uk:8080 (host: cs.cf.ac.uk, port: 8080) • path: /harrison/resume.html • fragment: intro • This example is not hierarchical: mailto:a.b.harrison@cs.cf.ac.uk • scheme: mailto • opaque part: a.b.harrison@cs.cf.ac.uk
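The elements of the two example URIs can be pulled apart with `urlsplit` from Python's standard library. Note that the library has no separate "opaque" field: for the mailto URI the opaque part simply ends up in `path`, with an empty authority.

```python
from urllib.parse import urlsplit

hier = urlsplit("http://cs.cf.ac.uk:8080/harrison/resume.html#intro")
print(hier.scheme)    # scheme
print(hier.netloc)    # authority (host and port together)
print(hier.hostname)  # host
print(hier.port)      # port, as an integer
print(hier.path)      # path
print(hier.fragment)  # fragment

opaque = urlsplit("mailto:a.b.harrison@cs.cf.ac.uk")
print(opaque.scheme)        # scheme
print(repr(opaque.netloc))  # empty: no authority in an opaque URI
print(opaque.path)          # the opaque part
```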
Resources and URIs • Hierarchical URIs can also have a query string: http://www.google.com/search?q=web (key: q, value: web) • This is a simplified URI generated by searching Google for the word “web” • q is the key that tells Google that the value after the = is what I typed in. • Query strings can contain multiple key/value pairs separated by ampersands: ?client=safari&rls=en-us&q=web&ie=UTF-8&oe=UTF-8 • Experiment - change the value after the q=. Add &hl=ja to the end of the URI and see what happens
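The experiment can also be carried out in code: split the query string into key/value pairs, append hl=ja, and rebuild the URI. A sketch using only Python's standard library; the full URI is assembled here from the slide's host, path and query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

uri = "http://www.google.com/search?client=safari&rls=en-us&q=web&ie=UTF-8&oe=UTF-8"
parts = urlsplit(uri)

# The query string becomes an ordered list of (key, value) pairs.
params = parse_qsl(parts.query)
print(dict(params)["q"])  # the search term

# Append another pair and rebuild the URI around the new query string.
params.append(("hl", "ja"))
new_uri = urlunsplit(parts._replace(query=urlencode(params)))
print(new_uri)
```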
Resources and URIs • Resources have state but not operations over that state. • This is supported through URIs • The scheme of the URI determines the operations permitted on a resource. • By exposing representations of resources at URIs with different schemes, you offer different operations on that resource.
Representational State Transfer (ReST) • Term coined by Roy Fielding in his PhD dissertation in 2000 • A PhD thesis that has actually had an impact!!! • ReST is an architectural style derived from looking at the Web and what makes it scalable • It is a client/server architectural style in which clients retrieve representations of resources. When a new resource is retrieved, the state of the client, e.g., the browser, is changed. • The term is now heavily used and misused • ReST != HTTP • Not all the Web is ReSTful
ReST • Primary constraints • Client/server • Request/response • Statelessness • interactions are stateless • Caching • reduces network load • HTTP headers support lots of caching policies • Uniform Interface • URIs • HTTP • MIME types • Layering • Allows servers to act as gateways to areas of the network for firewalling, caching • Allows evolution of parts of the network to happen incrementally without affecting the whole network • Code-on-Demand • e.g. Applets, SWF
Statelessness • Interactions on the Web are usually stateless • Resources have (represent) state, but the interactions do not • When I query Google, each query happens in isolation. There is no state preserved at the server regarding my query, even if I move through a number of results pages • Experiment: add &start=20 to the end of your query • What happens? You move to a new URI which is independent of the previous URI. (Look at the Gooooooogle links at the bottom of the page) • The Cookie is typically used to represent interaction state • Cookie Header is an opaque pointer to state • Cookie is stored at the client • Understood by the server – the state of interactions is on the server – not ReSTful
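The results-page experiment can be mimicked with a handler whose output depends only on the parameters in the URI, never on earlier requests. This is a toy sketch: `RESULTS`, `search_page` and the page size of 10 are assumptions for illustration and say nothing about how Google actually works.

```python
# Hypothetical result set standing in for a search index.
RESULTS = [f"result {i}" for i in range(1, 101)]
PAGE_SIZE = 10

def search_page(query_params):
    """Return one page of results using only what the request carries."""
    start = int(query_params.get("start", 0))
    return RESULTS[start:start + PAGE_SIZE]

# Asking for start=20 works without having visited the earlier pages,
# and repeating the request gives the same answer.
page = search_page({"q": "web", "start": "20"})
print(page[0], "...", page[-1])
```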
Caching and Linkability • Caching • Allows resources to be stored and retrieved locally, or closer to the source of the request • This saves on bandwidth • Is a form of load balancing • Is possible because of the resource abstraction. • Caching also applies to storage outside of the Web via URIs - in desktop applications, on TV, newspapers, billboards, human memory • Linkability • Web Architecture (Vol 1): It is a strength of Web Architecture that links can be made and shared; a user who has found an interesting part of the Web can share this experience just by republishing a URI. • Makes the Web a web
Caching and Linkability • Linkability and cacheability are achievable through two primary qualities • URIs being ‘addressable’ • I can share them, publish them knowing others will get the same result as me when they de-reference the URI. • A certain longevity to URIs - If I link to a resource today, will it still be there tomorrow? • Safe and idempotent interactions • Safe interactions are those with no side-effects. • The user is not to be held accountable for the result of the interaction. • I am not entering into a contract or agreeing to terms and conditions when I follow a link. • If this were the case, how could you be sure that you were not at a deeper level within the domain, having unknowingly bypassed the terms and conditions page? – Servers use redirects for this. • Idempotent interactions • Repeating an identical request any number of times has the same effect on the server as making it once.
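Idempotence can be checked directly on a toy store: applying PUT or DELETE twice leaves the same state as applying it once, whereas a second POST creates a second resource. The functions below are hypothetical stand-ins for the HTTP methods; each returns a new store so the states can be compared.

```python
def put(store, key, value):
    """PUT: set the resource at a client-chosen key."""
    new = dict(store)
    new[key] = value
    return new

def delete(store, key):
    """DELETE: remove the resource if it is present."""
    new = dict(store)
    new.pop(key, None)
    return new

def post(store, value):
    """POST: create a new resource under a server-chosen key."""
    new = dict(store)
    new[f"item-{len(store) + 1}"] = value
    return new

empty = {}
once = put(empty, "weather", "sunny")
twice = put(once, "weather", "sunny")
print(once == twice)  # PUT: repeating changes nothing further

posted_once = post(empty, "sunny")
posted_twice = post(posted_once, "sunny")
print(posted_once == posted_twice)  # POST: the repeat added a resource
```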
Uniform Interface • separation of resource from representation • What you see is not the actual thing, but a representation. Things may have multiple representations. • manipulation of resources by representations • Changing the representation may induce a change in the state of the resource itself • self-descriptive messages • Allows each message to be independent • No need to track state changes across multiple exchanges • hypermedia as the engine of application state (?) • Hypermedia is a medium that allows non-linear progression through states. • A page with links - it's up to you where you go. • So state is only perceived and controlled by the client
“Hypermedia as the engine of application state” • [Diagram: a URL with its possible next states] • Every GET to Google is independent. Google doesn’t perceive your state changes • The client chooses the next state • Only the client perceives the continuum of states
Conclusion • The Web core concepts • Uniform Access • HTTP • Naming • URIs • Representation • MIME types • primarily HTML • ReST • stateless interactions • caching • state at the client side