Application Level Protocols • Outline • Simple Mail Transfer Protocol (SMTP) • Hypertext Transfer Protocol (HTTP) CMSC 332
Three Points • Application protocols are not application programs • HTTP is a protocol used to fetch web pages, while Netscape and Mosaic are programs that use HTTP • Both protocols examined here use the request/reply paradigm, yet they are implemented on top of TCP rather than RPC • Both have a companion protocol that specifies the format of the data that can be exchanged • RFC 822 and Multipurpose Internet Mail Extensions (MIME) are companions to SMTP • HTML is the companion to HTTP that specifies the form of web pages
RFC 822 • Defines header and body, both ASCII • Augmented by MIME to allow message body to carry all sorts of data, but data still represented as ASCII text • Header • Series of <CRLF> terminated lines (<CRLF> stands for carriage return + linefeed ASCII characters) • Separated from body by blank line • Each header line has form <type>:<value> • Ex. To:, Subject:, Date:, From:
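The header/body layout described above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_rfc822` and the sample message are made up here, and the sketch ignores details such as header lines folded across several lines.

```python
def parse_rfc822(raw: str):
    """Split a message into (headers, body) at the first blank line."""
    # A blank line is two consecutive <CRLF> sequences.
    head, _, body = raw.partition("\r\n\r\n")
    headers = {}
    for line in head.split("\r\n"):
        # Each header line has the form <type>:<value>.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


# Hypothetical sample message, used only to exercise the parser.
message = (
    "From: Doug Szajda <dszajda@richmond.edu>\r\n"
    "To: registrar@richmond.edu\r\n"
    "Subject: grades\r\n"
    "\r\n"
    "Dear Ron,\r\n"
)
headers, body = parse_rfc822(message)
```

Here `headers` maps each header type to its value, and `body` holds everything after the blank line.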
MIME • RFC 822 extended in 1993 & 1996 to allow email to carry video, Word files, etc. • Three basic pieces • Header lines augmenting RFC 822 • Set of content types and subtypes • Encoding schemes for various data types to allow shipment as ASCII
MIME (cont.) • Header extensions describe data being carried by message body • MIME-Version: • Content-Description: human readable description of what’s in message • Content-Type: type of data • Content-Transfer-Encoding: how data is encoded
MIME Content Types • image/gif, image/jpeg • text/plain, text/richtext: “marked up” text (i.e., special fonts, etc.) • application/ • application/postscript, etc. • multipart: how a message carrying more than one data type is structured • Like a programming language with base types and structs, etc.
MIME Encoding • Ex. JPEG: Each byte can assume one of 256 values, not all of which are valid ASCII characters • Must use ASCII encoding because gateways can corrupt non-ASCII portions • Solution is the base64 encoding scheme: • Three bytes of binary data converted to four ASCII characters • 24 bits divided into four groups of 6 bits, each of which maps to a valid ASCII character (so note that such an encoding contains only alphanumeric characters plus + and /, with = used as padding at the end) • Regular text can be encoded using 7-bit ASCII
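The three-bytes-to-four-characters step can be sketched by hand and checked against Python's standard `base64` module. The helper name `encode_three_bytes` is introduced here for illustration, and it handles only full 3-byte groups (no padding).

```python
import base64

# The 64-character base64 alphabet: letters, digits, '+' and '/'.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_three_bytes(chunk: bytes) -> str:
    """Map exactly three bytes (24 bits) to four base64 characters."""
    if len(chunk) != 3:
        raise ValueError("this sketch handles only full 3-byte groups")
    bits = int.from_bytes(chunk, "big")  # one 24-bit integer
    # Take the four 6-bit groups from most to least significant.
    return "".join(B64_ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))


sample = b"Man"
encoded = encode_three_bytes(sample)  # "TWFu"
library_result = base64.b64encode(sample).decode("ascii")
```

For real messages the library call is the one to use; the hand-written version only makes the 6-bit grouping visible.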
Example
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="-------417CA6E2DE4ABCAFBC5"
From: Doug Szajda <dszajda@richmond.edu>
To: registrar@richmond.edu
Subject: grades for CMSC 332 class this semester
Date: Thu, 11 Apr 2002 12:22:19 -0400

-------417CA6E2DE4ABCAFBC5
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Dear Ron,
I’ve decided to fail them all! It will save me the trouble of having to grade their finals. I’ve also included the picture you requested, and a copy of my latest paper.
Doug
Example (cont.)
-------417CA6E2DE4ABCAFBC5
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

. . . unreadable encoding of a JPEG image . . .
-------417CA6E2DE4ABCAFBC5
Content-Type: application/postscript; name="draft.ps"
Content-Transfer-Encoding: 7bit

. . . readable encoding of a PostScript document . . .
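A message with the same overall shape can be built with Python's standard `email` package, which picks the boundary string and the transfer encodings itself. This is a sketch, not how the example message was actually produced: the body text, the four placeholder bytes standing in for a JPEG, and the filename `picture.jpg` are all made up for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "Doug Szajda <dszajda@richmond.edu>"
msg["To"] = "registrar@richmond.edu"
msg["Subject"] = "grades for CMSC 332 class this semester"

# Plain ASCII text part.
msg.set_content("Dear Ron, the picture you requested is attached. Doug")

# Placeholder bytes, not a real image; binary data is shipped as base64.
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0])
msg.add_attachment(fake_jpeg, maintype="image", subtype="jpeg",
                   filename="picture.jpg")

parts = list(msg.iter_parts())
wire_format = msg.as_string()  # header lines, boundaries and encoded parts
```

Adding the attachment turns the message into `multipart/mixed`, with one part per data type, much like the struct analogy on the content-types slide.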
SMTP • SMTP is the protocol that handles the actual transfer • The players: • mail reader: program with which user interacts (e.g., Pine, Elm, Netscape Mail, etc.). Transfers messages to/from mail daemon • mail daemon: process running on each host. Uses SMTP over TCP to transfer message to mail daemon running on other host. Though any code implementing SMTP could be used, typical daemon is a descendant of the original Berkeley implementation of sendmail
SMTP • Players (cont.) • mail gateway: similar to an IP gateway (what we have called a router), but willing to buffer messages and retry sending for days
Why Gateways? • I.e., why not just send message directly to receiver’s host? • Receiver doesn’t want the email address to include the specific host on which he/she reads email • May read mail on several hosts • Host on which mail is read may not be connected to network • Solution: have mail delivered to a mail gateway at destination (e.g., at umiacs.umd.edu) • Mail then forwarded (using another SMTP/TCP connection) to user (after database lookup) OR • User fetches mail from server using something like Post Office Protocol (POP3)
SMTP (cont.) • Independent SMTP connection used between each pair of mail gateways along the path • SMTP sessions are a dialog between two daemons, one acting as client, the other as server • SMTP is ASCII based (makes sense since RFC 822 defines messages using this base representation) • Means a human at a keyboard can pretend to be an SMTP client program
Example SMTP Dialogue
HELO richmond.edu
250 Hello daemon@mail.richmond.edu [141.166.127.54]
MAIL FROM:<Ron@registrar.richmond.edu>
250 OK
RCPT TO:<dszajda@richmond.edu>
250 OK
DATA
354 Start mail input; end with <CRLF>.<CRLF>
Hi Doug, I realize they whine a lot, and they probably deserve to fail, but they’ll be alums some day and that means money! And thanks for the picture. Ron
<CRLF>.<CRLF>
250 OK
QUIT
221 Closing connection
SMTP (cont.) • Client posts commands: HELO, MAIL, etc. • Server responds with a numeric code along with human readable explanation • Note the server verifies that name supplied in HELO command corresponds to IP address being used for TCP connection • Mail daemon parses message for info needed to run SMTP (info is called envelope for message). Daemon then uses this to parameterize SMTP exchange. sendmail popular because no one wanted to reimplement parsing function!
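The client side of such a dialogue can be sketched without touching the network. The function names `smtp_commands` and `reply_code` are hypothetical, and the dot-stuffing step (doubling a leading "." in a body line so it is not mistaken for the end-of-data marker) is an SMTP detail that the slides do not cover.

```python
def smtp_commands(helo_domain, sender, recipient, body_lines):
    """Return, in order, the lines an SMTP client would send."""
    commands = [
        f"HELO {helo_domain}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    for line in body_lines:
        # Dot-stuffing (an assumption beyond the slides): escape a leading "."
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")  # a line holding only "." ends the message
    commands.append("QUIT")
    return commands


def reply_code(reply_line):
    """Extract the three-digit numeric code from a server reply."""
    return int(reply_line[:3])


commands = smtp_commands(
    "richmond.edu",
    "Ron@registrar.richmond.edu",
    "dszajda@richmond.edu",
    ["Hi Doug,", "Thanks for the picture.", "Ron"],
)
code = reply_code("354 Start mail input; end with <CRLF>.<CRLF>")
```

A real client would send each command terminated by <CRLF> over a TCP connection and check the reply code before moving on; Python's `smtplib` does this for you.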
HTTP • Assume you know and understand the following terms • Uniform Resource Locator (URL) • Web browser • Hypertext link • Basic Web mechanism • User “selects” page, browser (client) fetches page from server using HTTP running over TCP
HTTP (cont.) • Like SMTP, HTTP is text-oriented • HTTP message form • START_LINE <CRLF> :request or response • MESSAGE_HEADER <CRLF> :as in email messages • <CRLF> :blank line that terminates header • MESSAGE_BODY <CRLF> :typically empty for requests
HTTP Request Messages • START_LINE specifies: • Operation to be performed (see next slide) • GET: retrieve and display a web page • HEAD: test validity of a hypertext link or check if page has been modified since last fetched • Web page the operation should be performed on • Version of HTTP being used • Ex. GET http://www.mathcs.richmond.edu/index.html HTTP/1.1 • Ex. GET /index.html HTTP/1.1 followed by the header line Host: www.mathcs.richmond.edu
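The request layout (start line, header lines, blank line, empty body) can be assembled as a string. The helper name `build_request` is made up for this sketch, which sends only the Host header.

```python
def build_request(method, path, host, version="HTTP/1.1"):
    """Assemble a request with an empty body, each line ending in <CRLF>."""
    start_line = f"{method} {path} {version}"
    header_lines = [f"Host: {host}"]
    # The two trailing empty strings yield the <CRLF> that ends the last
    # header line plus the blank line that terminates the header.
    return "\r\n".join([start_line, *header_lines, "", ""])


request = build_request("GET", "/index.html", "www.mathcs.richmond.edu")
```

Writing `request` to a TCP connection on port 80 of the named host would be the next step; that part is left out so the sketch runs without a network.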
HTTP Request Operations • OPTIONS: request information about available options • GET: retrieve document identified in URL • HEAD: retrieve metainformation about document identified in URL • POST: give information (e.g., annotation) to server • PUT: store document under specified URL • DELETE: delete specified URL • TRACE: loopback request message • CONNECT: for use by proxies
HTTP Response Message • START_LINE specifies: • Version of HTTP being used • Three digit code indicating success/failure of request • Text string giving reason for response • Ex. HTTP/1.1 202 Accepted • Ex. HTTP/1.1 404 Not Found
HTTP Response Message (cont.) • Possible header lines: • Location: requested URL available at another location • Content-Length • Expires • Last-Modified: time at which contents last modified at server • Ex. HTTP/1.1 301 Moved Permanently followed by the header line Location: http://www.mathcs.richmond.edu/cs/index.html
HTTP Result Codes • 1xx: Informational • 2xx: Success • 3xx: Redirection • 4xx: Client error • 5xx: Server error
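A response start line can be split into its three fields, and the first digit of the code gives its class (HTTP groups result codes into five classes by leading digit). The names `parse_status_line`, `status_class` and `STATUS_CLASSES` are introduced only for this sketch.

```python
# The five result-code classes, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line):
    """Split 'HTTP/1.1 404 Not Found' into (version, code, reason)."""
    # Split on the first two spaces only; the reason may contain spaces.
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code):
    """Return the class name for a three-digit result code."""
    return STATUS_CLASSES[code // 100]


version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
kind = status_class(code)
```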
HTTP and TCP • HTTP Version 1.0 • Separate TCP connection established for each data item retrieved from server • Connection setup and teardown required even if just checking on page status • A page with text plus a dozen icons requires 13 separate connections to be established and closed • Needless to say, this raises efficiency concerns
HTTP and TCP • HTTP Version 1.1 (latest) • Persistent connections: client and server can exchange multiple request/response messages over same TCP connection • Eliminate connection setup/teardown, reducing load on server, load on network, and delay perceived by user • TCP congestion window operates more efficiently, because it is not necessary to go through slow start phase for each page • The cost (there always is one): Neither client nor server knows how long to keep connection open. • Especially problematic for server • Both sides must watch for signal to close their side of connection
Caching Web Pages • Active area of research (and entrepreneurship) • Advantages: • Pages can be displayed faster if fetched from a closer server • Reduced server load since fewer requests to handle • Where to cache? • User’s browser • Sitewide cache (user browsers configured to check this proxy cache first) • ISPs: router snoops packets on way in and way out • Various HTTP design features support this • Server assigns expiration date to each page it sends to client • Various cache directives (can a document be cached, for how long, etc.)
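The expiration-date idea can be sketched as a small in-memory cache. The class name `PageCache`, its methods, and the use of plain numbers as timestamps are assumptions made for illustration; real caches also honor the other cache directives mentioned above.

```python
class PageCache:
    """Toy cache that serves a page only until its expiration time."""

    def __init__(self):
        self._entries = {}  # url -> (body, expires_at)

    def store(self, url, body, expires_at):
        """Remember a page together with the expiration time the server gave."""
        self._entries[url] = (body, expires_at)

    def lookup(self, url, now):
        """Return the cached body, or None if missing or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if now >= expires_at:
            # Expired: drop the entry so the caller refetches from the server.
            del self._entries[url]
            return None
        return body


cache = PageCache()
cache.store("/index.html", "<html>hello</html>", expires_at=100)
fresh = cache.lookup("/index.html", now=50)
stale = cache.lookup("/index.html", now=150)
```

Passing `now` in explicitly keeps the sketch deterministic; a real cache would read the clock and compare against the Expires header.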
Simple Network Management Protocol (SNMP) • As sysadmin, might want to know about state info at any of dozens of routers, hundreds of hosts • Address translation tables • Routing tables • TCP connection state • Number of IP datagram reassemblies aborted (does timeout that garbage collects IP fragments need to be tuned?) • Loads on various nodes
SNMP • Allows reading and writing of various pieces of state information on different nodes • Specialized request/response protocol (that runs over UDP) supporting • GET • SET • GET-NEXT (explained below) • Typically, sysadmin uses a browser-like interface, which turns requests into SNMP operations, which are in turn fulfilled by the SNMP server running on the host in question
Management Information Base (MIB) • Two questions • How does client indicate which piece of info it wants to retrieve? • How does server know which variable in memory to read to satisfy request? • MIB is a companion specification that defines specific pieces of information (MIB variables) that can be retrieved from network node • Current version: MIB-II
MIB-II • Variables organized in 10 groups • System: general system (node) parameters (name, location, etc.) • Interfaces: address(es), how many packets sent/received • Address translation: ARP info, including ARP table • IP: routing table entries, how many packets forwarded, stats about datagram reassembly, dropped packets, etc. • TCP: number of open connections, timeouts, default timeout settings. Also per-connection info while connection exists • UDP: total number of UDP datagrams sent/received • Also groups for Internet Control Message Protocol (ICMP), EGP, SNMP, “different media”
Two More Problems • Precise syntax used to state which MIB variables wanted • Precise representation for values returned by server • ASN.1/BER: Abstract Syntax Notation One/Basic Encoding Rules (Recall text Section 7.1) • Takes care of second problem • ASN.1/BER also defines an object identification system (not described in text) and this is used to give globally unique ID to all MIB variables
Example • 1.3.6.1.2.1.4.3 is unique ASN.1 identifier for IP related MIB variable ipInReceives • 1.3.6.1.2.1: prefix identifies MIB database (ASN.1 object IDs are for all possible objects in world) • 4: IP group • 3: third variable in this group
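Splitting such an identifier into its parts can be sketched as follows. The sketch takes 1.3.6.1.2.1 as the MIB prefix (the standard mib-2 subtree), so that 4 and 3 are the components that follow it; the names `parse_oid`, `split_mib_oid` and `MIB_PREFIX` are made up for illustration.

```python
# Assumed prefix of the MIB database, as a tuple of integers.
MIB_PREFIX = (1, 3, 6, 1, 2, 1)


def parse_oid(text):
    """Turn '1.3.6.1.2.1.4.3' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def split_mib_oid(oid):
    """Return (group, rest) for an identifier under the MIB prefix."""
    if oid[:len(MIB_PREFIX)] != MIB_PREFIX:
        raise ValueError("identifier is not under the MIB prefix")
    group, *rest = oid[len(MIB_PREFIX):]
    return group, tuple(rest)


oid = parse_oid("1.3.6.1.2.1.4.3")
group, rest = split_mib_oid(oid)  # group 4 (IP), variable 3
```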
SNMP Summary • SNMP client puts ASN.1 identifier for MIB variable into request message, sends this to server • Server maps identifier into a local variable, retrieves value held in variable, uses ASN.1/BER to encode value it sends back to client • GET-NEXT: Many MIB variables are structs or tables. GET-NEXT, when applied to a variable ID, returns the value of variable plus the ID of the next variable (next table entry or field in struct). Thus helps walk through data structures.
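Walking a table with GET-NEXT can be sketched with a toy in-memory agent. Everything here is hypothetical: the `ToyAgent` class, the `walk` helper and the stored values are made up, no network or BER encoding is involved, and `get_next` is assumed to return the next identifier together with that entry's value, which is how the operation is usually defined.

```python
import bisect


class ToyAgent:
    """In-memory stand-in for an SNMP server holding a few variables."""

    def __init__(self, variables):
        self._values = dict(variables)
        # Tuples compare element by element, which matches identifier order.
        self._oids = sorted(self._values)

    def get(self, oid):
        return self._values[oid]

    def get_next(self, oid):
        """Return (next_oid, value) for the first identifier after oid."""
        index = bisect.bisect_right(self._oids, oid)
        if index == len(self._oids):
            return None  # walked off the end of the agent's variables
        next_oid = self._oids[index]
        return next_oid, self._values[next_oid]


def walk(agent, subtree):
    """Collect every (oid, value) pair under subtree via repeated GET-NEXT."""
    results = []
    step = agent.get_next(subtree)
    while step is not None and step[0][:len(subtree)] == subtree:
        results.append(step)
        step = agent.get_next(step[0])
    return results


# Made-up values keyed by identifier tuples.
agent = ToyAgent({
    (1, 3, 6, 1, 2, 1, 4, 1): 1,
    (1, 3, 6, 1, 2, 1, 4, 3): 1200,
    (1, 3, 6, 1, 2, 1, 6, 5): 7,
})
ip_group = walk(agent, (1, 3, 6, 1, 2, 1, 4))
```

The walk stops as soon as GET-NEXT returns an identifier outside the requested subtree, so the client never needs to know in advance how many entries the group holds.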