Exchange 2007 and NetApp – Best Practices and Guidelines
Mark Arnold
Consulting Systems Engineer, Microsoft Solutions
NetApp

Agenda
• Exchange 2007 Server Recap:
• Roles
• Server Sizing
• Exchange Sizing
• NetApp Sizing
• Disks
• Aggregates
• Volumes & LUNs
• Snap*
• Exchange Care & Maintenance
• Exchange Backup, DR & HA

Exchange 2007 Roles
Mailbox
• Clustered role if necessary
Hub Transport
• Internally load balanced
• External-facing servers need a load balancing solution (NLB, hardware etc.)
Client Access
• Not load balanced
• Use DNS RR, NLB, hardware etc.
Edge
• DMZ
• Anti-spam engine, content filtering, Outlook safe list sync
Unified Messaging
• Voicemail & inbound faxing

Server Sizing
http://msexchangeteam.com/archive/2007/01/16/432222.aspx
Mailbox Servers
• Start with 2GB minimum
• Add 2GB per 4 SGs added
• Add approx. 5MB per mailbox
Hub Transport (HT)
• Start with 2GB
• Add 3KB per message (simultaneous)
• Add 1KB per recipient (per message)
CAS
• Start with 2GB
• Add 2MB per concurrent connection
Multi Role
• Start with mailbox role
• Add 1GB per additional role
• Add further memory as per calcs for individual role(s)

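The per-role memory rules on this slide can be sketched as simple arithmetic. This is a minimal sketch, not Microsoft's calculator: the function names are hypothetical, and reading "2GB per 4 SGs added" as 2GB for each further block of four storage groups beyond the first four is an assumption.

```python
import math

KB_PER_GB = 1024 * 1024  # kilobytes in one gigabyte
MB_PER_GB = 1024         # megabytes in one gigabyte


def mailbox_memory_gb(storage_groups, mailboxes, mb_per_mailbox=5.0):
    """2GB base, 2GB per extra block of four SGs (assumed reading), ~5MB per mailbox."""
    extra_sgs = max(storage_groups - 4, 0)
    sg_gb = 2.0 * math.ceil(extra_sgs / 4)
    return 2.0 + sg_gb + mailboxes * mb_per_mailbox / MB_PER_GB


def hub_transport_memory_gb(simultaneous_messages, recipients_per_message):
    """2GB base, 3KB per simultaneous message, 1KB per recipient of each message."""
    kb = simultaneous_messages * (3 + recipients_per_message)
    return 2.0 + kb / KB_PER_GB


def cas_memory_gb(concurrent_connections):
    """2GB base, 2MB per concurrent connection."""
    return 2.0 + concurrent_connections * 2 / MB_PER_GB


if __name__ == "__main__":
    print(mailbox_memory_gb(storage_groups=8, mailboxes=2000))
    print(hub_transport_memory_gb(simultaneous_messages=5000, recipients_per_message=10))
    print(cas_memory_gb(concurrent_connections=1500))
```

Treat the results as a starting point for the Microsoft spreadsheet rather than a replacement for it.
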
Host Memory
Maximize it
Our sizing example
• 4000 users, 8GB server = 1.48MB cache per user
• 4000 users, 16GB server = 3.48MB cache per user
IOPS
• More hits from cache = fewer reads from disk
• Microsoft tout 70% reduction
• Verifiable, realistic and repeatable

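The cache-per-user figures in the sizing example can be approximated with a short calculation. This is a sketch under assumptions: the slide does not state how much memory is held back for the system, so the 2GB reservation and the function name are hypothetical. With these assumptions the results land near, but not exactly on, the slide's 1.48MB and 3.48MB.

```python
def cache_per_user_mb(server_ram_gb, users, reserved_gb=2.0):
    """Memory left after an assumed system reservation, split evenly across users."""
    if users <= 0:
        raise ValueError("users must be positive")
    return (server_ram_gb - reserved_gb) * 1024 / users


if __name__ == "__main__":
    for ram_gb in (8, 16):
        print(ram_gb, "GB server:", round(cache_per_user_mb(ram_gb, 4000), 2), "MB per user")
```
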
Database Sizing
Microsoft:
• 100 to 200GB per store
• http://technet.microsoft.com/en-us/library/bb331954(EXCHG.80).aspx
NetApp
• Varies, depending on scenario
• Always larger
Limited Microsoft capability for rapid restores
Microsoft recommendation based on RTO deliverables

Why That Matters
Insufficient memory & processor will lead to “Back Pressure”
• Server throttles back – reduced traffic flow
• http://technet.microsoft.com/en-us/library/bb201658(EXCHG.80).aspx
The Exchange admins will see suspicious event log entries
Server reboots will impact FAS I/O
• Cache rebuild
• Maximize the memory and suffer the short-term performance degradation

Bottom Line
Use the Microsoft spreadsheet
Review their disk results
Remember that NetApp stores can be far larger
• Restore times are better and steps easier
NetApp will be better on IOPS due to larger disk availability

Disks
FC disks best, naturally
• Best reliability and capacity sweet-spot
• Always FC for the stores if SATA/FC mix
SATA?
• Acceptable solution for logs
• If there are enough spindles
• Have more spares
SAS (20n0 series)
• Acceptable for both
• Usual aggregate separation rules

Aggregates
One Aggregate
• Maximizes disk utilization
• Lose three disks, lose everything
• Perfectly acceptable for low to mid I/O requirements
• Review FlexShare
Two Aggregates
• No dependency between logs and store disks
• More parity (at least four rather than two)
• Necessary for highest I/O requirements

Volumes
Always FlexVols
• Space utilization flexibility
Volume limitations
• 5.3.7x = 23, 6.0.x = 25, 6.1.x = 150, 6.2x = 200, 6.3x = 200, 6.4x = 200, 6.5x = 200, 7.0 = 200 volumes per storage system (up to 100 traditional volumes or aggregates), 7.1 = 200 volumes per storage system (up to 100 traditional volumes or aggregates)
• 7.2 = 200 to 500 volumes on a storage system, depending on the storage system model
See the problem?
• Co-host logs first to reduce volume count
• Likelihood of restoring those volumes minimal
• But don’t share volumes across servers
• Snapshot busy

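To see how quickly the volume limit is reached, and how much co-hosting logs helps, the count can be sketched as below. The layout is an assumption for illustration: one volume per store database, and either one log volume per store or a single shared log volume per server (never shared across servers). The function names and the default limit of 200 are hypothetical choices, not NetApp tooling.

```python
def volumes_needed(servers, stores_per_server, cohost_logs):
    """Count volumes for an assumed one-volume-per-database layout."""
    db_volumes = servers * stores_per_server
    if cohost_logs:
        # One shared log volume per server; volumes are never shared across servers.
        log_volumes = servers
    else:
        log_volumes = servers * stores_per_server
    return db_volumes + log_volumes


def fits_volume_limit(volume_count, limit=200):
    """Compare against a per-system volume limit (200 for the 7.0/7.1 releases listed)."""
    return volume_count <= limit


if __name__ == "__main__":
    separate = volumes_needed(servers=4, stores_per_server=30, cohost_logs=False)
    cohosted = volumes_needed(servers=4, stores_per_server=30, cohost_logs=True)
    print(separate, fits_volume_limit(separate))
    print(cohosted, fits_volume_limit(cohosted))
```
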
LUNs
Dedicate a LUN per store
Have separate LUNs for DB and Logs
Can share SMTP queue and Logs
Don’t share a LUN for DB and SMTP queues
• Changes random read / seq write profile
Assign a Volume / LUN for the Hub Transport
• Execute http://technet.microsoft.com/en-us/library/bb125177.aspx to move the database
• SnapMirror this to the DR site
• Makes your DR faster

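The LUN placement rules above lend themselves to a quick sanity check on a planned layout. This is an illustrative sketch only: the data shape (pairs of LUN name and content kind) and the function name are hypothetical, and the check covers just the rules stated on this slide.

```python
from collections import defaultdict


def check_lun_layout(placements):
    """placements: iterable of (lun_name, kind), kind in {'db', 'logs', 'smtp_queue'}.

    Returns a list of rule violations; an empty list means the layout passes.
    """
    by_lun = defaultdict(list)
    for lun, kind in placements:
        by_lun[lun].append(kind)

    problems = []
    for lun, kinds in sorted(by_lun.items()):
        if kinds.count("db") > 1:
            problems.append(f"{lun}: more than one store on a LUN")
        if "db" in kinds and "logs" in kinds:
            problems.append(f"{lun}: database and logs share a LUN")
        if "db" in kinds and "smtp_queue" in kinds:
            problems.append(f"{lun}: database and SMTP queue share a LUN")
    return problems


if __name__ == "__main__":
    plan = [("lun_db1", "db"), ("lun_log1", "logs"), ("lun_log1", "smtp_queue")]
    print(check_lun_layout(plan))
```

Sharing a LUN between logs and the SMTP queue passes the check, matching the slide.
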
snap_info
Place transaction logs and snap_info on same volume
• Same LUN for high I/O (i.e. many logs)
• Keeps LUN count in check on big deployments
SME Utilizes NTFS “Hard Links”
• SME edits pointers rather than executes a move
• More efficient on I/O for NTFS operations

Sample Layout
http://media.netapp.com/documents/tr-3578.pdf

FlexShare
Option for Single Aggregate operation
Select transaction logs for high priority
[Diagram labels: Clients, Server, Server, Switch, Windows Applications, CIFS Home Directories, High Priority, Medium Priority, FAS Storage System Running Data ONTAP® with FlexShare™]

Fractional Reserve
Set to 100% for all Volumes…..
• Certainly for Logs
• Highly recommended for databases
Considerations for Reducing
• Database growth
• Archiving solution ~ stable DB ~ reduction in FR
• Backup failure
• Remote environment, accessibility ~ out-of-space
Retained SnapShots
• Reduced online retention (SnapMirroring perhaps) promotes LUN size stability

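The effect of fractional reserve on volume sizing can be sketched with a common rule of thumb: volume size is the LUN size, plus the reserve (LUN size times the fractional reserve), plus the space held by retained Snapshot copies. This formula is not stated on the slide, so treat it and the parameter names as assumptions for illustration.

```python
def volume_size_gb(lun_gb, fractional_reserve=1.0, daily_change_gb=0.0, snapshot_days=0):
    """Rule-of-thumb volume size: LUN + overwrite reserve + retained snapshot change."""
    if not 0.0 <= fractional_reserve <= 1.0:
        raise ValueError("fractional_reserve must be between 0.0 and 1.0")
    reserve_gb = lun_gb * fractional_reserve
    snapshot_gb = daily_change_gb * snapshot_days
    return lun_gb + reserve_gb + snapshot_gb


if __name__ == "__main__":
    # 200GB database LUN, 5GB of change per day, seven days of online snapshots.
    print(volume_size_gb(200, 1.0, daily_change_gb=5, snapshot_days=7))
    print(volume_size_gb(200, 0.5, daily_change_gb=5, snapshot_days=7))
```

Shortening online retention reduces only the last term, which is why it helps keep volume sizes stable.
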
BlackBerrys
Can be a pain point in an Exchange environment
Easily avoided
• Add disks & monitor
Symptoms: Log Record Stalls
• Too much data being added
• Not enough IO capability to logs
• Outlook client writes stop
• Outlook client gets a “Waiting for Exchange Server” message
Increase IOPS calculations for BB environments

Exchange Online Maintenance
By default between midnight and 4am
• Avoid Snaps at these times
• Co-ordinate Snaps and Maintenance across controllers
• Don’t schedule Maintenance and SME verification at the same time on the same controller
• Especially if other operations are active on the controller
• Such as MRM – Message Records Management

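A planned snapshot schedule can be checked against the online maintenance window with a few lines of code. This is a hedged sketch: the function names are hypothetical, the default window is the midnight to 4am default mentioned above, and the window is treated as half-open (4:00 itself counts as outside).

```python
from datetime import time


def in_window(t, start=time(0, 0), end=time(4, 0)):
    """True if t falls in [start, end); handles windows that wrap past midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def safe_snapshot_times(candidates, start=time(0, 0), end=time(4, 0)):
    """Keep only the snapshot times that avoid the maintenance window."""
    return [t for t in candidates if not in_window(t, start, end)]


if __name__ == "__main__":
    schedule = [time(2, 0), time(6, 0), time(12, 0), time(22, 0)]
    print(safe_snapshot_times(schedule))
```
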
Overnight Operations
If time proves a constraint, change the schedule to alternate days
MRM could easily be a weekend-only operation
Beware extensive MRM operations
• Increases cross-site SnapMirror traffic if used

Offline Defragmentation
What?
• Physical reduction in the EDB file size
• Exchange admins monitor for Event ID 1221
When?
• Never (never say never)
Why?
• Keep a spare store available and move mailboxes into it, then dismount and delete the old store files

Content Indexing
Now “MS Search”
Turned ON by default in 2007
Consumes an additional approx. 5% of database size
• Factor into LUN sizing
Useful for:
• Outlook Web Access environments
• Non “Cached Mode” Outlook environments
• Cross mailbox searches
Improvements
• More efficient than 2003, far less I/O, far less space consumed

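Factoring the content index into LUN sizing is a one-line adjustment. A minimal sketch, assuming the index lives on the same LUN as the database and using the roughly 5% figure from this slide; the function name is hypothetical.

```python
def lun_size_with_index_gb(database_gb, index_fraction=0.05):
    """Database size plus the content index overhead (about 5% of the database)."""
    return database_gb * (1 + index_fraction)


if __name__ == "__main__":
    print(round(lun_size_with_index_gb(200), 1))
```
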
Microsoft DPM Solution
Typical recovery time: Hours / days
Single copy of data
Periodic backup
Offsite tape
[Diagram labels: Exchange Server, Backup server, Data Disk Space, Logs Disk Space, Empty tape, Tape]

Asynchronous Mirroring with SnapMirror®
• Add new system
• Copying snapshot 1
• Backup, DR and database portability
• Configure as mirror
[Diagram labels: Blocks in LUN or File (A, B, C), Blocks on the Disk (A, B, C), Snap 1, SnapMirror, WAN]

Exchange 2007 HA and DR solutions
When to use Microsoft solutions and when to use NetApp to enhance availability

Exchange Clustering & NetApp
LCR
• One server, two copies of nominated stores
SCC
• Two servers, one copy of data
CCR
• Two servers, two copies of all data
SCR
• A target server for LCR, SCC & CCR deployments
• SCR is the “DR” solution to the others’ HA solution
Database Portability
• The solution for SnapMirror

Local Continuous Replication
Single server environment
Separate spindles for respective databases
Log shipping between log locations
No chance for log delays
No chance for database forking/cloning etc.

Single Copy Clustering
Traditional model
Two servers, one copy of the data
No longer the “recommended” solution
DB still prone to corruption
• Rendering the servers useless
Can’t use the passive to back up the stores

Clustered Continuous Replication
• Dual server environment
• Active/Passive only
• Log shipping between log locations
• No chance for log delays
• No chance for database forking/cloning etc.

Clustered Continuous Replication (CCR)
Uses MSCS
Automatic failover
• Should not assume a failback operation
Back up the passive
• The same or more spindles on the passive
The Exchange admins will push back
• Many touting “performance degraded” CCR
• Do CCR properly and faithfully duplicate nodes & storage
Change volume layout
• Active / Passive different controllers
• More aggregates than normal

Standby Continuous Replication
Designed as the “Disaster Recovery” solution
Two servers, two copies of the information
Inbuilt 50 log delay
No ability to fork a database or restore to previous version

SCR (2)
Acts as “target” for Standalone, CCR, SCC
• Many Sources to one Target
• Ships 1MB log files using RPC
• Manual failover
Recovery to original node is an extensive undertaking
Alternatively……
• SnapMirror
• Database Portability
• http://technet.microsoft.com/en-us/library/bb123954(EXCHG.80).aspx

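SCR's inbuilt 50-log delay, combined with 1MB log files, gives a rough feel for how much data the target holds back from replay. A small sketch under assumptions: the log generation rate is a hypothetical input you would measure on your own servers, and the function names are illustrative.

```python
def scr_replay_lag_mb(log_delay=50, log_size_mb=1):
    """Log data held back from replay on the SCR target."""
    return log_delay * log_size_mb


def scr_replay_lag_minutes(logs_per_minute, log_delay=50):
    """Time taken to generate the delayed logs at a measured (assumed) log rate."""
    if logs_per_minute <= 0:
        raise ValueError("logs_per_minute must be positive")
    return log_delay / logs_per_minute


if __name__ == "__main__":
    print(scr_replay_lag_mb())
    print(scr_replay_lag_minutes(logs_per_minute=10))
```
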
[Diagram labels: DR Site, Archival & Compliance, Archival Application, Exchange Server (failover), NetApp Secondary Storage, NetApp SnapVault, SnapMirror, NetApp FAS (failover), NetApp FAS, NetApp SnapLock, Database Portability, Primary Data Center, Production Exchange Server (primary), SnapManager for Exchange/SMBR, FC / iSCSI]

Single Mailbox Recovery
Take the option
Microsoft capabilities
• None
• Restore to RSG and Exmerge/Alternate
NetApp capabilities
• FlexClone the SnapShot (if you have it)
• Mount the LUN
• SMBR
• Forget about the RSG
• Has PF recoverability

SnapShot Retention
Store sensible numbers
• How many depends on you
• RPO and RTO trade-offs
SnapVault
NearStore
Can assist “Discovery” & “Compliance” issues

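The retention trade-off can be made concrete by counting how many Snapshot copies a schedule keeps online. This is a sketch with hypothetical names: the worst-case RPO is taken to be the snapshot interval, and the 255-per-volume figure is the Data ONTAP limit of this era as commonly documented, so verify it for your release.

```python
MAX_SNAPSHOTS_PER_VOLUME = 255  # assumed per-volume limit; check your Data ONTAP release


def snapshots_retained(retention_days, interval_hours):
    """Number of snapshots kept online for a given retention and interval."""
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    return int(retention_days * 24 // interval_hours)


def retention_plan(retention_days, interval_hours):
    """Summarize snapshot count, worst-case RPO, and whether the count fits the limit."""
    count = snapshots_retained(retention_days, interval_hours)
    return {
        "snapshots": count,
        "worst_case_rpo_hours": interval_hours,
        "within_volume_limit": count <= MAX_SNAPSHOTS_PER_VOLUME,
    }


if __name__ == "__main__":
    print(retention_plan(retention_days=7, interval_hours=4))
    print(retention_plan(retention_days=30, interval_hours=1))
```

Schedules that exceed the limit are candidates for moving older copies to SnapVault secondary storage.
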
Rapid Granular Data Recovery from Exchange Backups: Single Mailbox Recovery
If you don’t actually need a compliance solution but need to retain mail
Rapidly recover storage group, message, folder, or mailbox
From any location or media
Exchange recovery environment not required
Advanced search capabilities
[Diagram labels: Exchange Environment, FC or IP SAN, Single Mailbox Recovery, Primary Storage, Tape, Secondary Storage]

Summary
Memory
Disk mixing
Exchange Calculator
• Have NetApp run internal calculator
Resiliency if necessary (SCC / CCR)
Storage Replication
Volumes / LUNs
Store Size
SMBR
Manage Maintenance Windows

Questions
mark.arnold@netapp.com
© 2008 NetApp. All rights reserved.