Internet Standards: E-Mail Dr. Jussi Kangasharju Telecooperation Group TU Darmstadt
A bit of history
• E-Mail predates the Internet
• The first electronic mail existed on time-sharing computers
• Sending mail is one of the earliest services on the Internet
• It used the most bandwidth until the early 1990s
• E-Mail is one of the few actual standards on the net (STD 11); most protocols are just Requests for Comments
• It was a driving force for interconnecting networks through the Internet
Standards for E-Mail
• Layered model
• Layers are independent of each other
• Various protocols for each layer
[Figure: layer diagram, top to bottom]
  Body: RFC 2822, RFC 2045ff, …
  Headers: STD 11 / RFC 2822, …
  Mail transport: SMTP, POP, IMAP, UUCP
  Network: TCP/IP
Mail Transport
• Standard Internet mail transport protocols (SMTP, POP) are text based
  • Human readable
  • Easier to debug
  • Possibly easier to implement
• Downside
  • Uses the smallest common base of “what is text”
  • Text = 7-bit US-ASCII, i.e., standard Latin characters, Arabic digits, punctuation marks
  • Other characters or binary data have to be mapped onto ASCII
SMTP
• Simple Mail Transfer Protocol (STD 10)
• Really very simple:
  1. Connect to the SMTP server
  2. Specify the sender and receiver addresses
  3. Dump header + body of the mail
     • May contain different “From” and “To” header fields than the addresses given in (2)
• No authentication, no authorization in the original protocol
  • Ever wondered where all the spam is coming from?
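The gap between the addresses given to the server and the “From”/“To” header fields can be made concrete with a small sketch. It only builds the lines a client would send and does not talk to a server. The function name `smtp_client_lines`, the host name `client.example`, and all addresses are hypothetical; server replies are left out.

```python
def smtp_client_lines(envelope_from, envelope_to, message):
    """Return the command lines a minimal SMTP client would send, in order."""
    lines = ["HELO client.example"]               # hypothetical client host name
    lines.append(f"MAIL FROM:<{envelope_from}>")  # sender address given to the server
    lines.extend(f"RCPT TO:<{rcpt}>" for rcpt in envelope_to)
    lines.append("DATA")
    for line in message.split("\r\n"):
        # A leading "." is doubled so it cannot be mistaken for the end marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")                             # a single "." ends the mail data
    lines.append("QUIT")
    return lines


# The header fields inside the message are not checked against the envelope.
message = (
    "From: someone.else@example.org\r\n"
    "To: undisclosed@example.org\r\n"
    "Subject: Hello\r\n"
    "\r\n"
    "Body text"
)
dialog = smtp_client_lines("real.sender@example.com", ["real.rcpt@example.net"], message)
```

Here the server is told `real.sender@example.com`, while the recipient's mail program displays `someone.else@example.org`.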
POP
• Post Office Protocol (RFC 1939)
• SMTP servers are designed to be always online
  • No problem for multiuser machines
  • Not feasible for most PC users (especially before DSL)
• Solution: POP
  • Stores mail on a server
  • POP clients must actively poll for new mail
  • Clients are expected to delete mails on the server
• Text based, few commands
  • Responses always start with “+OK” or “-ERR”
  • Responses containing more than one line always end with <CRLF>.<CRLF>
POP sample session
• Connect to POP server
    +OK Really Simple POP3 Server ready
• Send credentials
    user user
    +OK Name is a valid mailbox
    pass pass
    +OK Login succeeded
• Get number of mails
    stat
    +OK 5 140875
• Retrieve mail
    retr 1
    +OK 1826 octets
    <Mail content>
    .
• Exit
    quit
    +OK Goodbye
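The sample session can be replayed in code. The sketch below is one possible shape for such a client, not a reference implementation: `pop3_session` and `replay_sample_session` are hypothetical names, and the "server" is a scripted byte stream instead of a socket, so nothing here touches the network. Commands are sent in upper case; POP3 commands are case-insensitive.

```python
import io


def pop3_session(rfile, wfile, user, password):
    """Log in, read the mailbox status, fetch mail 1 if present, then quit."""

    def command(line):
        if line is not None:                  # None means: only read the greeting
            wfile.write(line.encode("ascii") + b"\r\n")
            wfile.flush()
        reply = rfile.readline().decode("ascii").rstrip("\r\n")
        if not reply.startswith("+OK"):       # anything else is treated as an error
            raise RuntimeError(reply)
        return reply

    command(None)
    command("USER " + user)
    command("PASS " + password)
    count, size = (int(x) for x in command("STAT").split()[1:3])
    first_mail = None
    if count:
        command("RETR 1")
        lines = []
        while True:
            line = rfile.readline()
            if line in (b".\r\n", b""):       # a lone "." ends a multi-line response
                break
            if line.startswith(b"."):         # undo the server's dot-stuffing
                line = line[1:]
            lines.append(line)
        first_mail = b"".join(lines)
    command("QUIT")
    return count, size, first_mail


def replay_sample_session():
    """Feed the replies from the sample session to the client."""
    server = io.BytesIO(
        b"+OK Really Simple POP3 Server ready\r\n"
        b"+OK Name is a valid mailbox\r\n"
        b"+OK Login succeeded\r\n"
        b"+OK 5 140875\r\n"
        b"+OK 1826 octets\r\n"
        b"Subject: hi\r\n\r\n..dot line\r\n.\r\n"
        b"+OK Goodbye\r\n"
    )
    sent = io.BytesIO()
    result = pop3_session(server, sent, "user", "pass")
    return result, sent.getvalue()
```

With a real server, `rfile` and `wfile` could come from `socket.makefile("rb")` and `socket.makefile("wb")`; Python's standard `poplib` module offers a ready-made client as well.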
E-Mail
• E-Mail is an old protocol with a lot of legacy
  • Very limited range of characters: 7 bit
  • CRLF to separate lines instead of automatic line wrap
  • Maximum line length: at most 998 characters, 78 characters recommended
• Headers are just standard text
  • Separated from the body by an empty line
  • Syntax: <Header-Name> “:” <Header-Value>
  • Header names are case-insensitive, header values may be case-sensitive
  • Values may span several lines: line continuation is signaled by CRLF followed by one or more space characters
• E-Mail headers are defined in various documents
  • Standard headers (RFC 2822): From, To, CC, Subject, Date
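The header rules above (empty line before the body, “name: value” syntax, continuation lines, case-insensitive names) fit in a few lines of code. This is a simplified sketch with hypothetical helper names `parse_headers` and `get_header`; it joins continuation lines with a single space, which is a simplification of the exact unfolding rule, and it also accepts a tab as continuation whitespace.

```python
def parse_headers(raw):
    """Split a raw mail into a list of (name, value) headers and the body."""
    head, _, body = raw.partition("\r\n\r\n")     # the first empty line ends the headers
    headers = []
    for line in head.split("\r\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: append it to the previous header's value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return headers, body


def get_header(headers, wanted):
    """Look up a header by name, ignoring case; None if it is absent."""
    for name, value in headers:
        if name.lower() == wanted.lower():
            return value
    return None


raw_mail = (
    "Subject: a very\r\n"
    " long subject\r\n"
    "from: alice@example.org\r\n"
    "\r\n"
    "Body"
)
headers, body = parse_headers(raw_mail)
```

For real mail, Python's standard `email` package handles these rules and many corner cases.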
MIME
• Multipurpose Internet Mail Extensions
  • RFCs 2045-2047
• Extends plain text mail by:
  • Sending non-ASCII characters
  • Sending binary data
  • Multipart messages: attachments, formatted mail
• Important headers
  • MIME-Version: must be 1.0; E-Mails without this header are not MIME compliant
  • Content-Type
    • Defines the type of data; useful for selecting a viewing/editing program
    • Also used outside mail now (HTTP, operating systems)
  • Content-Transfer-Encoding: representation of non-ASCII characters
Binary mail transfer
• How can I send binary data?
• Simple idea
  • Map range [0;255] to [0;127]
  • Form groups of 7 bytes (byte = 8 bits)
  • Create an extra character out of the most significant bits
  • Send 8 7-bit “bytes”
• Does not work
  • Maximum line length: no guarantee that CRLF is part of the data
  • Characters 0-31 are control characters; mail servers are allowed to change them, which corrupts the data
  • What if the number of bytes is not divisible by 7?
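A short sketch of the simple idea shows why it fails. The function name `pack_7_to_8` and the bit order of the extra character are assumptions made for illustration; the slide does not fix them.

```python
def pack_7_to_8(chunk):
    """Turn exactly 7 bytes into 8 values that each fit into 7 bits."""
    if len(chunk) != 7:
        raise ValueError("the scheme only handles groups of exactly 7 bytes")
    out = bytearray()
    extra = 0
    for i, byte in enumerate(chunk):
        out.append(byte & 0x7F)        # keep the lower 7 bits
        extra |= (byte >> 7) << i      # collect the most significant bit
    out.append(extra)                  # the 8th "byte" carries the 7 collected bits
    return bytes(out)


packed = pack_7_to_8(bytes([0x80, 0x0A, 0xFF, 0x00, 0x00, 0x00, 0x80]))
fits_in_7_bits = all(b < 128 for b in packed)
has_control_chars = any(b < 32 for b in packed)
```

Every output value fits into 7 bits, but `has_control_chars` is true for this input: the result still contains values in the range 0-31, which is exactly what a mail server may alter.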
Base64
• Same idea for binary data
  • Uses fewer characters
  • Guaranteed to work, even if servers change control characters
• Content-Transfer-Encoding: Base64
• Idea (RFC 2045)
  • Group 3 bytes together
  • Make 4 groups of 6 bits each out of them
  • Convert each 6-bit group into a character [A-Za-z0-9+/]
  • Add CRLF after at most 76 characters
  • If the data is not divisible by 3: stuff with 0-bytes, use “=” as the encoding of a stuffed 0 to indicate this
• Base64 has ~33% overhead compared to binary transmission
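The grouping steps can be written out by hand and checked against Python's standard `base64` module. The function name `b64_encode` is hypothetical, and line wrapping with CRLF is left out to keep the sketch short.

```python
import base64
import string

# The 64 output characters: A-Z, a-z, 0-9, "+", "/"
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64_encode(data):
    """Encode bytes as Base64 text, 3 input bytes per 4 output characters."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        pad = 3 - len(group)                           # 0, 1 or 2 stuffed zero bytes
        n = int.from_bytes(group + b"\x00" * pad, "big")
        # Cut the 24 bits into four 6-bit values and map each to a character.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if pad:
            chars[-pad:] = "=" * pad                   # mark the stuffed positions
        out.append("".join(chars))
    return "".join(out)


samples = [b"", b"M", b"Ma", b"Man", bytes(range(256))]
matches_stdlib = all(
    b64_encode(s) == base64.b64encode(s).decode("ascii") for s in samples
)
```

The overhead follows from the grouping: 300 input bytes become 400 output characters, i.e. 4/3 of the original size before line breaks are added.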
Quoted printable
• Second way of sending binary data
• Content-Transfer-Encoding: quoted-printable
  • Any byte may be represented by an equals sign + the 2-digit hexadecimal encoding of the byte (e.g. “A” becomes “=41”)
  • Any standard US-ASCII character may also be used without encoding (with the exception of the equals sign)
• “Quoted unreadable”
  • Used if standard text is interspersed with some non-ASCII characters, e.g., German text with umlauts
  • Compared to Base64, the text is still almost readable without explicit decoding
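A minimal encoder illustrates the rule. The function name `qp_encode` is hypothetical, and the sketch is deliberately incomplete: a full encoder must also insert soft line breaks for long lines and encode whitespace at the end of a line. Decoding is done with Python's standard `quopri` module.

```python
import quopri


def qp_encode(data):
    """Encode bytes as quoted-printable text (no line-length handling)."""
    out = []
    for byte in data:
        if 32 <= byte <= 126 and byte != ord("="):
            out.append(chr(byte))          # printable US-ASCII stays as it is
        else:
            out.append("=%02X" % byte)     # everything else becomes =XX
    return "".join(out)


# German text with umlauts, in ISO-8859-1 (one byte per character)
original = "Gr\u00fc\u00dfe".encode("latin-1")
encoded = qp_encode(original)
decoded = quopri.decodestring(encoded.encode("ascii"))
```

The encoded form `Gr=FC=DFe` is still recognizable as the original word, which is the advantage over Base64 for mostly-ASCII text.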
Multipart messages
• MIME mails may contain other MIME mails recursively
• Indicated by a special Content-Type
  • multipart/mixed: attachments
  • multipart/alternative: the mail has different representations (e.g. plain text and HTML); the client may select any alternative
• How are these mails separated in the content?
  • Content-Type defines a boundary = a line that separates the mail parts
  • Split the mail body along the boundary; each chunk is a separate mail
  • Each mail may have its own header, separated from its body by an empty line
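Python's standard `email` package performs this boundary splitting. The sketch below parses a hand-written multipart/alternative mail; the boundary string `XYZ`, the mail content, and the helper name `list_parts` are made up for illustration.

```python
import email

RAW_MAIL = (
    "MIME-Version: 1.0\r\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\r\n'
    "\r\n"
    "--XYZ\r\n"
    "Content-Type: text/plain; charset=us-ascii\r\n"
    "\r\n"
    "Hello\r\n"
    "--XYZ\r\n"
    "Content-Type: text/html; charset=us-ascii\r\n"
    "\r\n"
    "<p>Hello</p>\r\n"
    "--XYZ--\r\n"
)


def list_parts(raw):
    """Return (content type, decoded payload bytes) for every non-multipart part."""
    message = email.message_from_string(raw)
    parts = []
    for part in message.walk():           # walks nested parts recursively
        if part.is_multipart():
            continue                      # containers carry no payload of their own
        parts.append((part.get_content_type(), part.get_payload(decode=True)))
    return parts


parts = list_parts(RAW_MAIL)
```

`get_payload(decode=True)` also undoes a Base64 or quoted-printable Content-Transfer-Encoding on a part, so the same loop works for attachments.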
Assignment 2
• Create a simple POP mail client
  • Only for receiving mail, no mail sending needed
  • Simple POP server for local testing provided at the homepage
• Mail client should display
  • Plain text mails
  • MIME mails with Base64 and Quoted-Printable encoding in the body
  • MIME mails with attachments
• Saving attachments should be possible
• Each alternative of multipart/alternative mails should be displayed