Internet and Intranet Fundamentals Class 4 Session A&B
HTTP Topics • Overview • Source Documentation • How It Works • Status • Future Directions
HTTP Overview • HyperText Transfer Protocol • Application Layer Protocol • Generic Protocol • gateway to SMTP, NNTP, FTP, Gopher, WAIS • Uses TCP Port 80 (by default) • presumes reliable transport
HTTP Overview • Language of the World Wide Web • Provides Open-Ended Set of Methods • indicating purpose of request • Builds on URI, URL, URN disciplines
HTTP Overview • URI = Uniform Resource Identifier • identifies points of content • mechanism used to access resource • specific computer housing the resource • specific name of resource on computer • formatted strings which indicate characteristics of a resource
HTTP Overview • URL = Uniform Resource Locator • a particular form of URI • Web page address • URN = Uniform Resource Name • institutional persistence • identifies agency responsible for a definition, for example, but not the location • a nameable resource may exist at zero, one, or several locations
HTTP/1.0 Source Documentation • RFC 1945 • HTTP/1.0 (deprecated) • http://www.ics.uci.edu/pub/ietf/http/rfc1945 • May 1996 • HTTP in use since 1990 • Authors • Tim Berners-Lee • Roy Fielding • Henrik Frystyk
HTTP/1.0 • HTTP/1.0 superseded HTTP/0.9 • HTTP/0.9 allowed raw data transfer • HTTP/1.0 introduced MIME types • Multipurpose Internet Mail Extensions • Content-Type: text/html • Content-Type: text/plain • MIME-like headers modify request/response semantics
HTTP/1.0 • Shortcomings of HTTP/1.0 • weak on proxies, caching, persistent connections, and virtual hosts • proliferation of imposters • incompletely implemented applications calling themselves “HTTP/1.0” • stateless • new connection for each request/response exchange
HTTP/1.1 Source Documentation • RFC 2068 • HTTP/1.1 • http://www.w3.org/Protocols/rfc2068/rfc2068 • January 1997 • addresses shortcomings of HTTP/1.0
How HTTP Works: Request/Response Protocol • Request from client contains ... • request method • URI • protocol version • MIME-like message with request modifiers, client info, possible body content
How HTTP Works: Request/Response Protocol • Response from server contains ... • status line • message protocol version • success or error code • MIME-like message • server info • entity meta-information • possible entity body content
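The request and response parts listed above can be sketched in a few lines of Python. This is a minimal illustration, not a full client: the helper names `build_request` and `parse_status_line` and the `example-client/0.1` User-Agent string are assumptions made for the example, and nothing is sent over the network.

```python
def build_request(method, uri, version="HTTP/1.0", headers=None, body=b""):
    """Assemble a request: request line, header fields, blank line, optional body."""
    lines = [f"{method} {uri} {version}"]          # method, URI, protocol version
    for name, value in (headers or {}).items():    # MIME-like request modifiers
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"         # empty line ends the header section
    return head.encode("ascii") + body


def parse_status_line(line):
    """Split a status line into protocol version, numeric code, and reason phrase."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


# Hypothetical request for the path used elsewhere in these slides.
request = build_request("GET", "/directory/file.htm",
                        headers={"User-Agent": "example-client/0.1"})
print(request.decode("ascii"))
print(parse_status_line("HTTP/1.0 200 OK\r\n"))   # ('HTTP/1.0', 200, 'OK')
```

The status code is the "success or error code" from the slide: 2xx codes report success, while 4xx and 5xx codes report client and server errors.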
How HTTP Works: Request/Response Protocol • More Sophisticated Interactions • proxies: forwarding agents • gateways: receiving agents • tunnels: relay points between two connections • e.g., through firewalls • non-caching
How HTTP Works: URIs • Two Forms of URIs • absolute • relative to some known base URI • Absolute • "http:" "//" host [ ":" port ] [ abs_path ] • http://www.csz.com:8023/directory/file.htm • Relative • [abs_path] • See RFC 1738: • http://www.ietf.org/rfc/rfc1738.txt
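To make the absolute and relative forms concrete, the sketch below splits the slide's example URL into its parts and resolves relative references against it using Python's standard `urllib.parse` module. The relative references `other.htm` and `/top/index.htm` are made up for illustration.

```python
from urllib.parse import urlsplit, urljoin

base = "http://www.csz.com:8023/directory/file.htm"

# Absolute form: scheme, host, optional port, absolute path.
parts = urlsplit(base)
print(parts.scheme, parts.hostname, parts.port, parts.path)
# http www.csz.com 8023 /directory/file.htm

# Relative forms are resolved against a known base URI.
print(urljoin(base, "other.htm"))        # http://www.csz.com:8023/directory/other.htm
print(urljoin(base, "/top/index.htm"))   # http://www.csz.com:8023/top/index.htm
```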
How HTTP Works • Caching • not all responses are cacheable • national hierarchies of proxy caches to save transoceanic bandwidth • systems that broadcast or multicast cache entries • organizations that distribute subsets of cached data via CD-ROM
How HTTP Works: Media Types • Type / Subtype • followed by 0 or more optional parameters, each introduced by “;” • parameters are of the form attribute=value • Content-Type: text/html • Content-Type: text/plain (default) • Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p • Assigned by IANA
How HTTP Works: Media Types • Media Type Parameters: charset • default is ISO-8859-1 • Multipart Types • multipart/form-data • multipart/mixed • multipart/parallel
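The type/subtype and parameter structure can be illustrated with a small parser. This is a simplified sketch: `parse_media_type` is a hypothetical helper, and it assumes parameter values contain no quoted semicolons, which a production parser would have to handle.

```python
def parse_media_type(value):
    """Return (type, subtype, parameters) for a Content-Type header value."""
    media, *params = [piece.strip() for piece in value.split(";")]
    mtype, _, subtype = media.partition("/")
    parsed = {}
    for param in params:
        if not param:
            continue
        attribute, _, val = param.partition("=")      # attribute=value
        parsed[attribute.strip().lower()] = val.strip().strip('"')
    return mtype.lower(), subtype.lower(), parsed


mixed = parse_media_type("multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p")
print(mixed)   # ('multipart', 'mixed', {'boundary': 'gc0p4Jq0M2Yt08jU534c0p'})

# When no charset parameter is given, fall back to the default named on the slide.
_, _, html_params = parse_media_type("text/html")
print(html_params.get("charset", "ISO-8859-1"))   # ISO-8859-1
```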
HTTP Language Tags • Identifies language • Controlled by IANA • en • en-US • en-cockney • x-pig-latin • i-cherokee
HTTP Messages • Request or Response • Use the RFC 822 generic message format for transferring entities • i.e., the payload of a message
generic-message = start-line *message-header CRLF [ message-body ]
start-line = Request-Line | Status-Line
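The generic-message grammar above maps directly onto a small parser: a start line, zero or more header lines, an empty line, then an optional body. The sketch below is illustrative only; `parse_message` is a hypothetical helper that ignores folded (multi-line) headers and keeps just the last value when a header name repeats.

```python
def parse_message(raw):
    """Split raw bytes into (start_line, headers, body) per the generic-message rule."""
    head, _, body = raw.partition(b"\r\n\r\n")        # the bare CRLF separates head and body
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return start_line, headers, body


raw_response = (b"HTTP/1.0 200 OK\r\n"
                b"Content-Type: text/html\r\n"
                b"Content-Length: 5\r\n"
                b"\r\n"
                b"hello")
start_line, headers, body = parse_message(raw_response)
print(start_line)   # HTTP/1.0 200 OK  (a Status-Line, so this is a response)
print(headers)      # {'content-type': 'text/html', 'content-length': '5'}
print(body)         # b'hello'
```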
HTTP Messages • Methods • GET, HEAD must be supported • POST • for sending data back to server • although GET can also be used indirectly to pass parameter information back to the server
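The difference between the two ways of passing parameters can be sketched with Python's standard `urllib`: with GET the parameters travel in the query string of the URI, with POST they travel in the message body. The requests are only constructed here, never sent, and the `x=y` parameter is the one used in the slides' CGI example.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"x": "y"}
query = urlencode(params)    # 'x=y'

# GET: parameters are appended to the URI after '?'.
get_req = Request("http://csz.com/cgi-bin/test-cgi?" + query)

# POST: the same parameters are carried as the entity body instead.
post_req = Request("http://csz.com/cgi-bin/test-cgi", data=query.encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

`urllib` chooses POST automatically when a request carries a body, which is why no method is named explicitly above.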
HTTP • Authentication • .htaccess files • Secure Sockets Layer (SSL) • https • RSA Encryption • public key / private key • not really part of HTTP
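As one concrete case, the Basic authentication scheme defined in RFC 1945 (commonly what an .htaccess-protected directory asks for) sends the user name and password Base64-encoded in an Authorization header. The sketch below shows that encoding; the helper names and the password are hypothetical. Base64 is an encoding, not encryption, which is one reason SSL is layered underneath for sensitive pages.

```python
import base64


def basic_auth_header(user, password):
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def decode_basic_auth(header_value):
    """Recover (user, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


# Hypothetical credentials, for illustration only.
header = basic_auth_header("mussatto", "example-password")
print("Authorization:", header)
print(decode_basic_auth(header))   # anyone who sees the header can reverse it
```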
HTTP • Dynamic Pages: • .pl, .asp, .jsp, .stm • Information from “environment” and from “forms” • examples: • http://csz.com/cgi-bin/test-cgi?x=y {get syntax} • https://secure1.csz.com/cgi-bin/ssienvirodump.pl
HTTP • http://csz.com/cgi-bin/test-cgi?x=y
HTTP_USER_AGENT = Mozilla/4.72 [en] (Win95; U)
HTTP_REFERER =
HTTP_COOKIE =
SERVER_SOFTWARE = Apache/1.3.3 (Unix)
SERVER_NAME = www.csz.com
GATEWAY_INTERFACE = CGI/1.1
SERVER_PROTOCOL = HTTP/1.0
SERVER_PORT = 80
REQUEST_METHOD = GET
HTTP_ACCEPT = image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
PATH_INFO =
PATH_TRANSLATED =
SCRIPT_NAME = /cgi-bin/test-cgi
QUERY_STRING = x=y
REMOTE_HOST =
REMOTE_ADDR = 38.253.188.243
REMOTE_USER =
AUTH_TYPE =
CONTENT_TYPE =
CONTENT_LENGTH =
HTTP • https://secure1.csz.com/cgi-bin/ssienvirodump.pl
HTTP_USER_AGENT = Mozilla/4.72 [en] (Win95; U)
HTTP_REFERER = https://secure1.csz.com/vitafree/order.html
HTTP_COOKIE = www.symantec.com FALSE / FALSE 972781498 Pass 1
SERVER_SOFTWARE = Apache/1.3.3 (Unix)
SERVER_NAME = www.csz.com
GATEWAY_INTERFACE = CGI/1.1
SERVER_PROTOCOL = HTTP/1.0
SERVER_PORT = 80
REQUEST_METHOD = GET
HTTP_ACCEPT = image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
PATH_INFO =
PATH_TRANSLATED =
SCRIPT_NAME = /cgi-bin/test-cgi
QUERY_STRING = x=y
REMOTE_HOST =
REMOTE_ADDR = 38.253.188.243
REMOTE_USER = mussatto
AUTH_TYPE = basic
CONTENT_TYPE =
CONTENT_LENGTH =
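A dump like the ones above can be produced by a very small CGI program, since the server passes request details to the script as environment variables. The following is a sketch in Python rather than the Perl or shell scripts the slides reference; `describe_request` and the choice of variables to print are assumptions made for illustration.

```python
import os
from urllib.parse import parse_qs


def describe_request(environ):
    """Format a few CGI variables plus the decoded form fields from the query string."""
    names = ["REQUEST_METHOD", "SCRIPT_NAME", "QUERY_STRING",
             "REMOTE_ADDR", "HTTP_USER_AGENT"]
    lines = [f"{name} = {environ.get(name, '')}" for name in names]
    form = parse_qs(environ.get("QUERY_STRING", ""))   # 'x=y' becomes {'x': ['y']}
    for key, values in form.items():
        lines.append(f"form field {key} = {values}")
    return "\n".join(lines)


if __name__ == "__main__":
    # A CGI response starts with headers, then an empty line, then the body.
    print("Content-Type: text/plain")
    print()
    print(describe_request(os.environ))
```

Run outside a web server, the variables are simply empty; under a CGI-capable server, a request such as `?x=y` would fill in QUERY_STRING and the form field line.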
HTTP-NG • Limitations of HTTP/1.1 • lack of modularity • message transport, method invocation, document processing too tightly interwoven • performance concerns • HTTP accounts for too much of the load on the Net • wireless at a disadvantage • OBE'd (overtaken by events) by XML
XML Definitions • XML is a method for putting structured data in a text file • XML looks a bit like HTML but isn't HTML • XML is text, but isn't meant to be read • XML is a family of technologies • XML is verbose, but that is not a problem • XML is new, but not that new • XML is license-free, platform-independent and well-supported
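The first point, structured data in a text file, is easy to see in a short example. The sketch below parses a small XML document with Python's standard `xml.etree.ElementTree`; the element and attribute names (`course`, `class`, `topic`) are invented for illustration and do not come from any real schema.

```python
import xml.etree.ElementTree as ET

# Plain text, but with explicit structure that a program can walk.
document = """<course title="Internet and Intranet Fundamentals">
  <class number="4">
    <topic>HTTP</topic>
    <topic>XML</topic>
  </class>
</course>"""

root = ET.fromstring(document)
topics = [topic.text for topic in root.iter("topic")]

print(root.get("title"))                 # Internet and Intranet Fundamentals
print(root.find("class").get("number"))  # 4
print(topics)                            # ['HTTP', 'XML']
```

The tags resemble HTML, but they describe what the data is rather than how to display it, and the names are chosen by whoever designs the document.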