The Secret Life of your email server Colin Chaplin Bsc (Hons) MCSE Technical Architect, Unisys Global Information Services
Topics • Email Server Concepts • Email Components • Email Routing • Design Considerations • Availability • Backup and restore • Sizing
History • Multitude of different and non-communicating systems since the 1970s… • Internet and SMTP (RFC 821 – Jonathan Postel) made email a de facto standard from 1982 • Massive growth as a serious business and personal communication tool
Email System Components • Client – what the end user controls • Server(s) – provides the following services • Database – holds the emails and communicates with clients • Mail Transfer Agent (MTA) – responsible for shipping the email to the outside world. Also called a Connector • Directory – tells the MTA where to route emails and can also allow clients to pick addresses
The Raw Parts [Diagram labels: Email Server(s); Client Actions – receive, retrieve, send email; Email In/Out; To/From foreign MTAs; Check Where to Send Email; Backup/Restore]
Email Routing across the internet • mailbox @ organisation, e.g. Colin.Chaplin@Unisys.com • MTA uses DNS (MX records) to translate the organisation into one (or more) servers it can ship to • unisys.com -> smtp1.unisys.com -> 151.176.34.2 • MTA ships email to the remote MTA using the SMTP protocol • Remote mail system decides what to do with the email
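The lookup chain above can be sketched in a few lines of Python. This is a minimal illustration only: the two dictionaries are a hypothetical in-memory stand-in for DNS, the MX preference value of 10 is an assumption, and the function names are made up for this example. A real MTA would query live MX and A records instead.

```python
# Hypothetical stand-in for DNS. The preference value (10) is assumed.
MX_RECORDS = {
    "unisys.com": [(10, "smtp1.unisys.com")],
}
A_RECORDS = {
    "smtp1.unisys.com": "151.176.34.2",
}


def split_address(address):
    """Split 'mailbox@organisation' into its two parts."""
    mailbox, _, organisation = address.rpartition("@")
    if not mailbox or not organisation:
        raise ValueError(f"not a valid address: {address!r}")
    return mailbox, organisation.lower()


def resolve_route(address):
    """Return (hostname, ip) candidates, lowest MX preference first."""
    _, organisation = split_address(address)
    hosts = sorted(MX_RECORDS.get(organisation, []))
    return [(host, A_RECORDS[host]) for _, host in hosts if host in A_RECORDS]


route = resolve_route("Colin.Chaplin@Unisys.com")
print(route)
```

Sorting by preference means that, if several servers were listed, the MTA would try the most preferred one first and fall back to the others.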
Email Routing Example [Diagram labels: Trifle.dessert.local; DNS; Smtp1.unisys.com, Smtp2.unisys.com, Smtp3.unisys.com; Active Directory; Mailbox1.unisys.com, Mailbox2.unisys.com, Mailbox3.unisys.com, Mailbox4.unisys.com]
SMTP Communication
220 usbb-lacimss1 Trend Micro InterScan Messaging Security Suite, Version: 5.5 ready at Sun, 27 Nov 2005 07:36:10 -0500
helo trifle.dessert.local
250 usbb-lacimss1 Hello [62.49.21.166]
mail from:<colin@chaplin.me.uk>
250 <colin@chaplin.me.uk>: Sender Ok
rcpt to:<colin.chaplin@unisys.com>
250 <colin.chaplin@unisys.com>: Recipient Ok
data
354 usbb-lacimss1: Send data now. Terminate with "."
subject: test hello 123 How Now Brown Cow
.
250 usbb-lacimss1: Message accepted for delivery
SMTP Continued • Simple! • NO authentication of sender (president@whitehouse.gov) • One of the main reasons why spam exists • Only a method for sending email, NOT a method for retrieving
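To make the client side of that dialogue concrete, here is a small Python sketch that only builds the lines a client would send; it opens no network connection. The function name and parameters are hypothetical, and it assumes a single-line body that is not a lone dot.

```python
def smtp_client_lines(helo_host, sender, recipient, subject, body):
    """Return the lines a client sends, in order, for one message."""
    return [
        f"HELO {helo_host}",
        f"MAIL FROM:<{sender}>",    # the protocol itself does not verify this
        f"RCPT TO:<{recipient}>",
        "DATA",
        f"Subject: {subject}",
        "",                         # blank line separates headers from body
        body,
        ".",                        # a lone dot ends the message
    ]


lines = smtp_client_lines(
    "trifle.dessert.local",
    "president@whitehouse.gov",     # any sender string is accepted here
    "colin.chaplin@unisys.com",
    "test",
    "How Now Brown Cow",
)
print("\n".join(lines))
```

Nothing in the sequence proves who the sender is, which is the weakness the slide points out.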
Microsoft Exchange • Industry-standard corporate messaging system • 5th version (2003) – mature • Own client and protocol (Outlook & MAPI) plus industry standards (POP3, IMAP4, HTTP) • Rich functionality • Scalable • Probably the most common business application
Design Considerations • Size: how many users (50 – 100,000+); how much email (50MB per user?); where are the users? • Availability: how much downtime is tolerated (SLA); how to mitigate single points of failure • BACKUP AND RESTORE
Sizing • An email server typically demands a high-bandwidth connection with its clients, and connection speed can dictate placement of servers • Political and administrative boundaries can also dictate placement of servers • Cheaper bandwidth means a business can run email from one datacenter
Less than 500 Users • Single Server Solution, basic configuration • Local Backup (Direct Attached Tape Drives) • Supported occasionally by local techie
500 Plus Users • Multiple Mailbox Servers • Perhaps multiple connector (MTA) servers • Perhaps Clustered • Workgroup Class Backup (small tape robots, etc.) • Supported by support team as part of their duties
1000 – 100,000 Plus Users • Multiple Mailbox Servers, very high spec • Multiple Servers for other features - MTA, Web Access, FAX-to-email gateway, Blackberry • Enterprise Class Storage and Backup (SAN and Tape Library) • Cold Standby parts • Dedicated Support Staff
Backup and Restore • Restore time drives many design decisions: even the fastest backup/restore system can take hours with big databases • Backups are typically done nightly to tape, direct-to-disk, or tape robots, and stored offsite • Restore process documented, practised, and tested regularly • Goal for backup & restore is NO email loss and quick recovery
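A rough way to see why restore time matters is to divide database size by sustained restore throughput. The sketch below does exactly that. The 20 MB/s figure and the two database sizes are hypothetical values chosen for illustration, not measurements from any real backup system.

```python
def restore_hours(database_gb, throughput_mb_per_s):
    """Estimate hours to restore a database at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = database_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


# Hypothetical sizes and throughput, for comparison only.
for size_gb in (50, 500):
    hours = restore_hours(size_gb, 20)
    print(f"{size_gb} GB at 20 MB/s: about {hours:.1f} hours")
```

The estimate covers only the raw data copy. Locating tapes, replaying logs, and verifying the database would add to it.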
The Wonder of Transaction Logs • Disaster at 16:00! • Restore last night's backup • Log files will feed data back into the database
Transaction Logs cont. • Transaction logs record all changes to the database • If transaction logs are applied to a restored database, Exchange will ‘play forward’ and bring the database up to the point it was lost • Transaction log access is sequential read/write, whereas database access is random read/write
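The ‘play forward’ idea can be illustrated with a toy model in Python. This is a sketch of the general technique only: the dictionary database, the tuple-based log format, and the function name are assumptions for this example and do not reflect how Exchange actually stores its logs.

```python
def play_forward(backup, log):
    """Apply logged changes, in order, to a copy of the restored backup."""
    database = dict(backup)
    for operation, key, value in log:
        if operation == "put":
            database[key] = value
        elif operation == "delete":
            database.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return database


# State captured by last night's backup (hypothetical data).
last_night = {"msg1": "Budget"}

# Changes recorded between the backup and the 16:00 disaster.
log_since_backup = [
    ("put", "msg2", "Lunch?"),
    ("put", "msg3", "Minutes"),
    ("delete", "msg1", None),
]

restored = play_forward(last_night, log_since_backup)
print(restored)
```

Because every change since the backup is in the log, replaying it in order brings the restored copy back to the state just before the failure, provided the logs themselves survived.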
Achieving High Availability • Resilient Servers – redundant fans, disks (RAID), anything that moves! • Multiple Servers performing same job • Clustering (NOT perfect!) • Proper Design • Monitoring and Support
5 Minute RAID • Redundant Array of Inexpensive/Independent Disks • Split data across multiple disks for resiliency and/or performance. ‘Hot Swappable’ • RAID 0 = split (stripe) data across two or more disks, no resilience, but fast (never use in a high-availability design) • RAID 1 = copy data across two or more disks (often called mirroring). ½ of disk space ‘wasted’ in providing resiliency • RAID 5 = compute a checksum (parity) and use maths to figure out what information was on the missing disk. ‘Wastes’ one disk's worth of space • More spindles, more speed!
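Let's store the number ‘123456’ on a RAID set • RAID 0: one disk holds 1 3 5, the other holds 2 4 6. Lose a disk and ALL data is lost! • RAID 1: both disks hold 1 2 3 4 5 6. Lose a disk and all is OK, all data intact • RAID 5: split data into chunks; compute a checksum every nth chunk, where n is the number of disks; write the chunks to disk, and put the checksum on a different disk every time we write a ‘line’; use simple maths if we lose a disk • RAID 5 example with three disks: line 1 holds 1, 2 and checksum 3 (no maths needed); line 2 holds 3, 4 and checksum 7 (7 – 3 = 4); line 3 holds 5, 6 and checksum 11 (11 – 5 = 6)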
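The slide uses addition and subtraction to keep the RAID 5 maths easy to follow. Real RAID 5 implementations typically use XOR parity instead, which works the same way in principle. The Python sketch below shows the XOR variant on the same ‘123456’ data; the function names and the three-disk layout are assumptions for illustration.

```python
from functools import reduce
from operator import xor


def parity(chunks):
    """XOR parity across the data chunks of one stripe."""
    return reduce(xor, chunks, 0)


def rebuild_missing(surviving_chunks, stripe_parity):
    """Recover the one missing chunk from the survivors and the parity."""
    return reduce(xor, surviving_chunks, stripe_parity)


# Three stripes of two data chunks each, as on a three-disk set.
stripes = [(1, 2), (3, 4), (5, 6)]
parities = [parity(stripe) for stripe in stripes]

# Pretend the disk holding the second chunk of each stripe has failed.
rebuilt = [
    rebuild_missing([first], stripe_parity)
    for (first, _), stripe_parity in zip(stripes, parities)
]
print(parities, rebuilt)
```

XOR is its own inverse, so the same operation both computes the parity and rebuilds a lost chunk. That holds for any single missing disk, but not for two at once.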
Clustering • Two or more servers (nodes), each connected to a common disk backend (can access the same data) • Virtual Server presented to the network, owned by one of the machines (the active node) • When one of the other nodes detects the active node is unavailable, it will take over • Active/Passive – Active/Active also possible
[Diagram: Client PC; Virtual Server; two Email Server(s) nodes. The active node will respond to requests to the Virtual Server. The other node detects failure, obtains resources, and responds to requests to the Virtual Server]
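The takeover behaviour described above can be modelled with a simple heartbeat timeout. The Python sketch below is illustrative only: the class, the node names, and the five-second timeout are hypothetical, and real cluster software also has to arbitrate ownership of the shared disks.

```python
class Cluster:
    """Toy active/passive cluster that fails over on a missed heartbeat."""

    def __init__(self, nodes, timeout_seconds):
        self.nodes = list(nodes)
        self.timeout = timeout_seconds
        self.active = self.nodes[0]      # first node owns the virtual server
        self.last_heartbeat = {node: 0.0 for node in self.nodes}

    def heartbeat(self, node, now):
        """Record that a node was seen alive at time `now`."""
        self.last_heartbeat[node] = now

    def is_alive(self, node, now):
        return now - self.last_heartbeat[node] <= self.timeout

    def check(self, now):
        """Fail over to the first healthy node if the active one went quiet."""
        if not self.is_alive(self.active, now):
            for node in self.nodes:
                if node != self.active and self.is_alive(node, now):
                    self.active = node
                    break
        return self.active


cluster = Cluster(["node-a", "node-b"], timeout_seconds=5)
cluster.heartbeat("node-a", 0)
cluster.heartbeat("node-b", 0)
owner_before = cluster.check(now=3)    # node-a is still within the timeout

cluster.heartbeat("node-b", 8)         # only node-b keeps reporting in
owner_after = cluster.check(now=10)    # node-a has been silent too long
print(owner_before, owner_after)
```

Clients keep talking to the virtual server throughout and never need to know which node currently owns it.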
Great… so what? • Provides a method for complete server resiliency • Technically possible to have nodes in separate locations • Applications must be ‘cluster aware’ (Exchange is) • BUT… • Database is still a single point of failure; perception challenge • Hardware costs double • Clustering complexity can cause outages!
Design a server for 1000 people • Mailbox size = 50MB • Additional part of existing infrastructure • 4 hours SLA • On-site computer room lacking in space • GBit backbone network • Large Tape Robot System • Single Site, LAN speed • 50MB x 1000 users = ~50GB database size • Single Server • Use existing backup system, over LAN • Exchange 2003 running on Windows Server 2003, McAfee GroupShield for email virus scanning, NetIQ AppManager for monitoring
Typical Spec • HP Compaq DL380 G4 • 4GB RAM • 2 Processors • Onboard SCSI RAID • 4 x 18GB Disks, 2 x 144GB Disks • Redundant fans, power supply, network connection
Design a server for 1000 people • C: Drive – 2 x 18 GB RAID 1 – System (Windows) • L: Drive – 2 x 18 GB RAID 1 – Transaction Logs • N: Drive – 2 x 144 GB RAID 1 – Database
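The sizing arithmetic behind this layout can be checked with a short Python sketch. It uses the slide's round figures (1000 users, 50MB each, treated as about 50GB) and ignores formatting and filesystem overhead. The function and variable names are hypothetical.

```python
def raid1_usable_gb(disk_gb, disks=2):
    """RAID 1 mirrors every disk, so usable space equals one disk."""
    if disks < 2:
        raise ValueError("RAID 1 needs at least two disks")
    return disk_gb


users = 1000
mailbox_mb = 50
database_gb = users * mailbox_mb / 1000    # ~50GB, using round figures

# Usable capacity of each mirrored volume in the layout above.
volumes = {
    "C:": raid1_usable_gb(18),
    "L:": raid1_usable_gb(18),
    "N:": raid1_usable_gb(144),
}

headroom_gb = volumes["N:"] - database_gb
print(f"Database ~{database_gb:.0f}GB, headroom on N: ~{headroom_gb:.0f}GB")
```

The headroom on the database volume leaves space for growth beyond the planned mailbox size.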