HTTP WEB

HTTP WEB. Risanuri Hidayat, Ir., M.Sc. World Wide Web. T. Berners-Lee, R. Fielding, H. Frystyk: “Hypertext Transfer Protocol - HTTP/1.0”, RFC 1945, 1996. Naming scheme for resources URL, URN, URI Multimedia documents MIME encoding (RFC) Transfer protocol HTTP/1.0, HTTP/1.1


Presentation Transcript


  1. HTTP WEB Risanuri Hidayat, Ir., M.Sc.

  2. World Wide Web • T. Berners-Lee, R. Fielding, H. Frystyk: “Hypertext Transfer Protocol - HTTP/1.0”, RFC 1945, 1996. • Naming scheme for resources • URL, URN, URI • Multimedia documents • MIME encoding (RFC) • Transfer protocol • HTTP/1.0, HTTP/1.1 • Implemented over TCP/IP • Integrated with Internet infrastructure • DNS, SMTP

  3. History • Earlier hypertext systems: no network access protocol • Gopher, WAIS: no hyperlinks • WWW @ CERN (Tim Berners-Lee, 1990) • HTTP/0.9 (1992)

  4. Internet Applications (application: application layer protocol, underlying transport protocol) • e-mail: smtp [RFC 821], TCP • remote terminal access: telnet [RFC 854], TCP • Web: http [RFC 2068], TCP • file transfer: ftp [RFC 959], TCP • streaming multimedia: proprietary (e.g. RealNetworks), TCP or UDP • remote file server: NFS, TCP or UDP • Internet telephony: proprietary (e.g., Vocaltec), typically UDP

  5. What is HTTP • HTTP stands for Hypertext Transfer Protocol. It's the network protocol used to deliver virtually all files and other data (collectively called resources) on the World Wide Web, whether they're HTML files, image files, query results, or anything else. Usually, HTTP takes place through TCP/IP sockets (and this tutorial ignores other possibilities). • A browser is an HTTP client because it sends requests to an HTTP server (Web server), which then sends responses back to the client. The standard (and default) port for HTTP servers to listen on is 80, though they can use any port. • HTTP is used to transmit resources, not just files. A resource is some chunk of information that can be identified by a URL

  6. HTTP • Request message: method, URL or pathname, HTTP version, headers, message body • e.g. GET //www.dcs.qmw.ac.uk/index.html HTTP/1.1 • Reply message: HTTP version, status code, reason, headers, message body • e.g. HTTP/1.1 200 OK, followed by the resource data • Methods: • GET, HEAD, POST • PUT, DELETE, TRACE, OPTIONS, CONNECT • Resource := MIME-encoded data • Content negotiation • Authentication

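  7. URL • URL: http://www.cdk3.net:8888/WebExamples/earth.html • After the DNS lookup, the resource ID is (IP number, port number, pathname): 55.55.55.55, 8888, WebExamples/earth.html • On the Web server: the IP number corresponds to the network address 2:60:8c:2:b0:5a, the port number to a socket, and the pathname to a file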
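As a rough illustration of how the URL on this slide breaks down into host, port, and pathname, Python's standard `urllib.parse` module can split it. The DNS step is left out, since 55.55.55.55 is only the slide's example address.

```python
from urllib.parse import urlsplit

# URL taken from the slide.
url = "http://www.cdk3.net:8888/WebExamples/earth.html"

parts = urlsplit(url)
scheme = parts.scheme      # protocol name
host = parts.hostname      # the name a DNS lookup would resolve to an IP number
port = parts.port or 80    # explicit port if present, else the HTTP default of 80
path = parts.path          # pathname of the resource on the server

print(scheme, host, port, path)
```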

  8. HTTP Transactions • HTTP uses the client-server model: • An HTTP client opens a connection and sends a request message to an HTTP server; • the server then returns a response message, usually containing the resource that was requested. • After delivering the response, the server closes the connection (making HTTP a stateless protocol, i.e. not maintaining any connection information between transactions).

  9. HTTP Protocol • http: hypertext transfer protocol • WWW’s application layer protocol • client/server model • client: browser that requests, receives, “displays” WWW objects • server: WWW server sends objects in response to requests • http1.0: RFC 1945 • http1.1: RFC 2068 • Figure: a PC running Explorer and a SUN running Netscape Navigator each send http requests to, and receive http responses from, a server running the Apache Web server

  10. HTTP Protocol • http: TCP transport service: • client initiates TCP connection (creates socket) to server, port 80 • server accepts TCP connection from client • http messages (application-layer protocol messages) exchanged between browser (http client) and WWW server (http server) • TCP connection closed • http is “stateless” • server maintains no information about past client requests • Protocols that maintain “state” are complex! • past history (state) must be maintained • if server/client crashes, their views of “state” may be inconsistent, must be reconciled

  11. HTTP Protocol • The formats of the request and response messages are similar, and English-oriented. Both kinds of messages consist of: • an initial line, • zero or more header lines, • a blank line (i.e. a CRLF by itself), and • an optional message body (e.g. a file, or query data, or query output).
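The four-part layout listed on this slide can be sketched as a small splitter. This is a simplified sketch, not a full parser: the function name `split_http_message` is made up for illustration, repeated header names overwrite each other, and the sample message is invented.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP message into (initial_line, headers, body)."""
    # The first blank line (a CRLF by itself) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body


# Invented sample response with two header lines and a five-byte body.
raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
initial_line, headers, body = split_http_message(raw)
print(initial_line)
print(headers)
print(body)
```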

  12. Request • Initial Request Line • A request line has three parts, separated by spaces: a method name, the local path of the requested resource, and the version of HTTP being used. • A typical request line is: • GET /path/to/file/index.html HTTP/1.0 • GET is the most common HTTP method; it says "give me this resource". Other methods include POST and HEAD (more on those later). Method names are always uppercase. • The path is the part of the URL after the host name, also called the request URI (a URI is like a URL, but more general). • The HTTP version always takes the form "HTTP/x.x", uppercase.
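A minimal sketch of the three-part request line described above, using the slide's own example. The helper name `parse_request_line` and its error messages are illustrative assumptions, and the checks are deliberately loose.

```python
def parse_request_line(line: str):
    """Return (method, path, version) from a request line."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected three space-separated parts: {line!r}")
    method, path, version = parts
    if not method.isupper():
        raise ValueError(f"method names are uppercase: {method!r}")
    if not version.startswith("HTTP/"):
        raise ValueError(f"version must look like HTTP/x.x: {version!r}")
    return method, path, version


method, path, version = parse_request_line("GET /path/to/file/index.html HTTP/1.0")
print(method, path, version)
```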

  13. HTTP Request Header Format • Two types of messages: request, response • http request message: • ASCII (human-readable format)
      request line (GET, POST, HEAD commands):
          GET /somedir/page.html HTTP/1.1
      header lines:
          Connection: close
          User-agent: Mozilla/4.0
          Accept: text/html, image/gif, image/jpeg
          Accept-language: en
      (extra carriage return, line feed)
      A carriage return, line feed on a line by itself indicates the end of the message.

  14. HTTP Request Header Format

  15. Response/Reply • Initial Response Line (Status Line). The initial response line, called the status line, also has three parts separated by spaces: • the HTTP version, • a response status code that gives the result of the request, and • an English reason phrase describing the status code. • Typical status lines are: • HTTP/1.0 200 OK • HTTP/1.0 404 Not Found

  16. HTTP Reply Header Format
      status line (protocol, status code, status phrase):
          HTTP/1.1 200 OK
      header lines:
          Connection: close
          Date: Thu, 06 Aug 1998 12:00:15 GMT
          Server: Apache/1.3.0 (Unix)
          Last-Modified: Mon, 22 Jun 1998 …...
          Content-Length: 6821
          Content-Type: text/html
      data, e.g., requested html file:
          data data data data data ...

  17. HTTP Reply Status Code • 200 OK • request succeeded, requested object later in this message • 301 Moved Permanently • requested object moved, new location specified later in this message (Location:) • 400 Bad Request • request message not understood by server • 404 Not Found • requested document not found on this server • 505 HTTP Version Not Supported
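To make the status codes listed here concrete, the sketch below splits a status line and groups codes by their first digit. The helper names `parse_status_line` and `status_class` are assumptions for illustration; the reason phrases come from Python's standard `http.HTTPStatus`.

```python
from http import HTTPStatus


def parse_status_line(line: str):
    """Return (version, code, reason) from a status line."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code: int) -> str:
    """Map a status code to its general class using the first digit."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes.get(code // 100, "unknown")


# The five codes listed on the slide.
for code in (200, 301, 400, 404, 505):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```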

  18. Sample HTTP Exchange • To retrieve the file at the URL http://www.somehost.com/path/file.html, first open a socket to the host www.somehost.com, port 80 (use the default port of 80 because none is specified in the URL). Then, send something like the following through the socket:
          GET /path/file.html HTTP/1.0
          From: someuser@jmarshall.com
          User-Agent: HTTPTool/1.0
          [blank line here]

  19. Sample HTTP Exchange • The server should respond with something like the following, sent back through the same socket:
          HTTP/1.0 200 OK
          Date: Fri, 31 Dec 1999 23:59:59 GMT
          Content-Type: text/html
          Content-Length: 1354

          <html>
          <body>
          <h1>Happy New Millennium!</h1>
          (more file contents)
            .
            .
            .
          </body>
          </html>
      • After sending the response, the server closes the socket.
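The exchange on these two slides could be reproduced with a raw TCP socket roughly as follows. This is a sketch under assumptions: `build_request` and `fetch` are illustrative names, and `fetch` is defined but never called because www.somehost.com is only a placeholder host.

```python
import socket


def build_request(path: str, headers: dict) -> bytes:
    """Build an HTTP/1.0 GET request: request line, header lines, blank line."""
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines are joined with CRLF; the trailing CRLF CRLF ends the last
    # header line and adds the blank line that finishes the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host: str, path: str, port: int = 80) -> bytes:
    """Send the request and read until the server closes the socket."""
    request = build_request(path, {"User-Agent": "HTTPTool/1.0"})
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


request = build_request(
    "/path/file.html",
    {"From": "someuser@jmarshall.com", "User-Agent": "HTTPTool/1.0"},
)
print(request.decode("ascii"))
```

Reading until `recv` returns an empty result relies on the HTTP/1.0 behaviour described on the slide, where the server closes the socket after sending the response.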

  20. User-server interaction: authentication • goal: control access to server documents • stateless: client must present authorization in each request • authorization: typically name, password • Authorization: header line in request • if no authorization presented, server refuses access, sends a WWW-Authenticate: header line in response • Exchange between client and server over time: • client: usual http request msg • server: 401: authorization req., WWW-Authenticate: • client: usual http request msg + Authorization: line • server: usual http response msg • client: usual http request msg + Authorization: line • server: usual http response msg
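The slide does not name an authentication scheme. Assuming the common Basic scheme, where the name and password are joined with a colon and base64-encoded, the exchange might be sketched like this. The credentials, the realm string, and the helper names are all hypothetical.

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Return the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def check_request(headers: dict, expected: str):
    """Server side: return (status_code, extra_response_headers)."""
    if headers.get("Authorization") != expected:
        # No (or wrong) credentials: refuse access and ask for authorization.
        return 401, {"WWW-Authenticate": 'Basic realm="example"'}
    return 200, {}


expected = basic_authorization("alice", "secret")  # hypothetical credentials

print(check_request({}, expected))                           # first request: refused
print(check_request({"Authorization": expected}, expected))  # retry with the header
```

Because the server keeps no state, the client has to repeat the Authorization header on every later request, as the slide notes.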

  21. User-server interaction: cookies • Server sends “cookie” to client in response: Set-cookie: # • Client presents cookie in later requests: cookie: # • Server matches presented cookie with server-stored cookies • authentication • remembering user preferences, previous choices • Exchange between client and server over time: • client: usual http request msg • server: usual http response + Set-cookie: # • client: usual http request msg + cookie: # • server: cookie-specific action, usual http response msg • client: usual http request msg + cookie: # • server: cookie-specific action, usual http response msg
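A toy version of the cookie exchange, using Python's standard `http.cookies` module to read the Cookie header. The cookie name `id`, the in-memory `sessions` store, and the `visits` counter are assumptions made for the sketch.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # server-side store: cookie value -> remembered data


def handle_request(request_headers: dict):
    """Return (response_headers, session_data) for one request."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    if "id" in cookie and cookie["id"].value in sessions:
        # Known client: match the presented cookie with the stored one.
        return {}, sessions[cookie["id"].value]
    # New client: create an identifier and send it with Set-Cookie.
    new_id = secrets.token_hex(8)
    sessions[new_id] = {"visits": 0}
    return {"Set-Cookie": f"id={new_id}"}, sessions[new_id]


# The first request carries no cookie, so the server issues one.
response_headers, _ = handle_request({})
issued = response_headers["Set-Cookie"]

# A later request presents the cookie, and the server finds the stored entry.
_, data = handle_request({"Cookie": issued})
data["visits"] += 1
print(issued, data)
```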

  22. User-server interaction: conditional GET • Goal: don’t send object if client has up-to-date stored (cached) version • client: specify date of cached copy in http request: If-modified-since: <date> • server: response contains no object if cached copy up-to-date: HTTP/1.0 304 Not Modified • Exchange when object not modified: • client: http request msg with If-modified-since: <date> • server: http response HTTP/1.0 304 Not Modified • Exchange when object modified: • client: http request msg with If-modified-since: <date> • server: http response HTTP/1.1 200 OK … <data>
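The server-side decision in a conditional GET might look like the sketch below, which compares the resource's last-modified time with the date sent by the client. The function name and all timestamps are hypothetical; date parsing uses the standard `email.utils` helpers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers: dict, last_modified: datetime, body: bytes):
    """Return (status_code, body) for a GET that may carry If-Modified-Since."""
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        cached_at = parsedate_to_datetime(since)
        # HTTP dates have one-second resolution, so drop microseconds first.
        if last_modified.replace(microsecond=0) <= cached_at:
            return 304, b""  # cached copy is up to date: no object is sent
    return 200, body  # object modified, or client has no cached copy


# Hypothetical timestamps for illustration.
modified = datetime(1998, 6, 22, 9, 0, 0, tzinfo=timezone.utc)
fresh = format_datetime(datetime(1998, 8, 6, 12, 0, 15, tzinfo=timezone.utc), usegmt=True)
stale = format_datetime(datetime(1998, 1, 1, tzinfo=timezone.utc), usegmt=True)

print(conditional_get({"If-Modified-Since": fresh}, modified, b"<data>"))
print(conditional_get({"If-Modified-Since": stale}, modified, b"<data>"))
print(conditional_get({}, modified, b"<data>"))
```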

  23. Message format: multimedia extensions • MIME: Multipurpose Internet Mail Extensions, RFC 2045, 2046 • additional lines in msg header declare MIME content type
          From: alice@crepes.fr
          To: bob@hamburger.edu
          Subject: Picture of yummy crepe.
          MIME-Version: 1.0                      (MIME version)
          Content-Transfer-Encoding: base64      (method used to encode data)
          Content-Type: image/jpeg               (multimedia data type, subtype, parameter declaration)

          base64 encoded data .....              (encoded data)
          .........................
          ......base64 encoded data .
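A message with the same MIME headers as the slide's example can be built with Python's standard `email` package. The image bytes below are a placeholder, not a real JPEG file.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@crepes.fr"
msg["To"] = "bob@hamburger.edu"
msg["Subject"] = "Picture of yummy crepe."

# Placeholder bytes standing in for real JPEG data (an assumption of this sketch).
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 16

# With a bytes payload, set_content() adds Content-Type and MIME-Version and,
# by default, encodes the data with a base64 Content-Transfer-Encoding.
msg.set_content(fake_jpeg, maintype="image", subtype="jpeg")

print(msg.as_string())
```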

  24. MIME types • Text: example subtypes: plain, html • Image: example subtypes: jpeg, gif • Audio: example subtypes: basic (8-bit mu-law encoded), 32kadpcm (32 kbps coding) • Video: example subtypes: mpeg, quicktime • Application: other data that must be processed by reader before “viewable”; example subtypes: msword, octet-stream

  25. HTTP Headers (samples) • Mean #bytes per header: 300 (requests), 160 (responses) • Require parsing! • User-Agent • Mozilla/4.0 • Accept: (client-side) • text/html, image/* • Content-type: (server-side) • text/html • Expires, Last-Modified, If-Modified-Since • absolute time stamps (1-sec resolution) • E.g.: Thu, 03 Jun 1999 20:16:34 GMT • Accept-Language, Accept-Charset • Content-encoding

  26. HTTP/1.1 Improvements • B/W optimization • persistent connections • pipelining • does not block waiting for previous responses • end-of-message mechanism • Content-range • access only specified “range” of a resource • Explicit cache control (Cache-control) • Digest authentication (Content-MD5)
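As one illustration of range access, the sketch below serves a single `bytes=first-last` range and reports it in a Content-Range header. It is a simplified assumption-laden sketch: `apply_range` is a made-up helper, and suffix ranges, open-ended ranges, and multiple ranges are not handled.

```python
def apply_range(resource: bytes, range_header: str):
    """Return (status_code, headers, body) for a 'bytes=first-last' Range header."""
    unit, _, spec = range_header.partition("=")
    first_text, _, last_text = spec.partition("-")
    if unit != "bytes" or not first_text.isdigit() or not last_text.isdigit():
        # Anything this sketch does not understand gets the full resource.
        return 200, {}, resource
    first = int(first_text)
    last = min(int(last_text), len(resource) - 1)
    if first > last:
        # The requested range lies outside the resource.
        return 416, {"Content-Range": f"bytes */{len(resource)}"}, b""
    body = resource[first:last + 1]  # both ends of the range are inclusive
    headers = {"Content-Range": f"bytes {first}-{last}/{len(resource)}"}
    return 206, headers, body


resource = bytes(range(100))  # a 100-byte stand-in for a real resource
status, headers, body = apply_range(resource, "bytes=10-19")
print(status, headers, len(body))
```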

  27. Web Caches (proxy server) • Goal: satisfy client request without involving origin server • User sets browser: WWW accesses via web cache • client sends all http requests to web cache • if object at web cache, web cache immediately returns object in http response • else requests object from origin server, then returns http response to client • Figure: clients exchange http requests and responses with the proxy server, which in turn exchanges http requests and responses with the origin servers
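The hit-or-miss logic of a web cache can be sketched with a dictionary in front of a function that stands in for the origin server. The class `WebCache` and the function `fake_origin` are illustrative names, and expiry and cache size limits are ignored.

```python
class WebCache:
    """Toy proxy cache: serve from the cache on a hit, else ask the origin server."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> bytes
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self.store:
            self.hits += 1  # object at web cache: return it immediately
            return self.store[url]
        self.misses += 1
        obj = self.fetch_from_origin(url)  # else request it from the origin server
        self.store[url] = obj  # keep a copy for later requests
        return obj


origin_requests = []


def fake_origin(url: str) -> bytes:
    """Hypothetical origin server that records which URLs reach it."""
    origin_requests.append(url)
    return f"contents of {url}".encode()


cache = WebCache(fake_origin)
for url in ["/a.html", "/b.html", "/a.html", "/a.html"]:
    cache.get(url)

print(cache.hits, cache.misses, origin_requests)
```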

  28. Why WWW Caching? • Assume: cache is “close” to client (e.g., in same network) • smaller response time: cache “closer” to client • decrease traffic to distant servers • link out of institutional/local ISP network often bottleneck • Figure: an institutional network (10 Mbps LAN with an institutional cache) connects through a 1.5 Mbps access link to the public Internet and the origin servers

  29. Web caching (in)effectiveness • Observed hit ratios below 50% • even lower byte-weighted ratios ! • Possible remedies ? • Prefetching • Delta-encoding • HTML macros • Duplicate suppression (digest-based)
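The gap between the request hit ratio and the byte-weighted ratio can be seen with a small calculation. The request log below is invented purely for illustration: the small objects hit the cache while the large ones miss.

```python
def hit_ratios(log):
    """log: list of (size_in_bytes, was_hit). Returns (hit_ratio, byte_hit_ratio)."""
    requests = len(log)
    total_bytes = sum(size for size, _ in log)
    hits = sum(1 for _, was_hit in log if was_hit)
    hit_bytes = sum(size for size, was_hit in log if was_hit)
    return hits / requests, hit_bytes / total_bytes


# Hypothetical log: two small hits, two large misses.
log = [(1_000, True), (2_000, True), (50_000, False), (47_000, False)]
hit_ratio, byte_hit_ratio = hit_ratios(log)
print(hit_ratio, byte_hit_ratio)
```

With this made-up log, half of the requests are served from the cache, yet only a small fraction of the bytes are, which is the effect the slide points to.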

  30. HTTP status & perspective • J. C. Mogul, “What’s wrong with HTTP (and why it doesn’t matter)”, Proc. USENIX Technical Conference, 1999 • Definitely not optimal • Probably adequate • It works well enough • It’s not the only game in town • Two-way initiation of operations • Real-time • Deferred delivery • Revising it again would be too hard • HTTP/1.0 -> HTTP/1.1 evolution took 4+ years !
