Learn how to perform a datacenter switchover for a database availability group in Exchange 2010, including commands, expected outcomes, and common errors. Ensure a smooth transition for your DAG.
Workflow Steps Perform a datacenter switchover for a database availability group Version 1.2 (Updated 12/2012)
Exchange 2010 - Datacenter Switchover Stop-DatabaseAvailabilityGroup Restore-DatabaseAvailabilityGroup Exchange 2010 - Datacenter Switchback Start-DatabaseAvailabilityGroup
Stop-DatabaseAvailabilityGroup Has the datacenter switchover been approved? YES NO
Stop-DatabaseAvailabilityGroup Is the primary datacenter online or physically accessible? YES NO
Stop-DatabaseAvailabilityGroup Do the remote and primary datacenters have network connectivity? YES NO
Stop-DatabaseAvailabilityGroup Are the Exchange servers in the primary datacenter online? YES NO
Stop-DatabaseAvailabilityGroup Is your DAG extended to multiple Active Directory sites? YES NO
Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
• Repeat the above command for all Active Directory sites containing DAG members that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up, and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
• Repeat the above command for all DAG members that are not in the recovery datacenter.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up, and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?
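The expected-outcome check above (primary servers on StoppedMailboxServers, recovery servers on StartedMailboxServers) can be expressed as a small comparison. The following Python sketch is illustrative only: the function name and inputs are hypothetical, it does not call Exchange, and it assumes you have already copied the two lists from the Get-DatabaseAvailabilityGroup output and normalized the server names to the same form (for example, short names rather than FQDNs).

```python
def check_switchover_state(started, stopped, primary_servers, recovery_servers):
    """Return a list of problems; an empty list means the DAG lists match
    the expected outcome (primary servers stopped, recovery servers started).

    All arguments are lists of server names supplied by the operator.
    """
    # Compare case-insensitively, since Windows host names are not case-sensitive.
    started_set = {name.upper() for name in started}
    stopped_set = {name.upper() for name in stopped}
    problems = []

    for server in primary_servers:
        if server.upper() not in stopped_set:
            problems.append(f"{server} (primary) is missing from StoppedMailboxServers")
        if server.upper() in started_set:
            problems.append(f"{server} (primary) is still on StartedMailboxServers")

    for server in recovery_servers:
        if server.upper() not in started_set:
            problems.append(f"{server} (recovery) is missing from StartedMailboxServers")
        if server.upper() in stopped_set:
            problems.append(f"{server} (recovery) is on StoppedMailboxServers")

    return problems


if __name__ == "__main__":
    # Hypothetical four-member DAG: MBX1/MBX2 in the primary site, MBX3/MBX4 in recovery.
    issues = check_switchover_state(
        started=["MBX3", "MBX4"],
        stopped=["MBX1", "MBX2"],
        primary_servers=["MBX1", "MBX2"],
        recovery_servers=["MBX3", "MBX4"],
    )
    print(issues or "DAG lists match the expected outcome")
```

An empty result corresponds to the state the slide describes; any entry names a server that still needs the Stop-DatabaseAvailabilityGroup step or a closer look.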
Stop-DatabaseAvailabilityGroup Is your DAG extended to multiple Active Directory sites? YES NO
Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
• Repeat the command for all DAG members that are not in the recovery datacenter.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?
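The command variants on the preceding slides differ in two ways: whether the target is an Active Directory site or an individual Mailbox server, and whether -ConfigurationOnly:$True is appended. The Python sketch below only assembles the command text for review before it is pasted into the Exchange Management Shell. The function and parameter names are hypothetical, and the rule that ConfigurationOnly is added when the primary-datacenter servers cannot be reached is one reading of how these slides are arranged, not a statement taken from them.

```python
def build_stop_commands(dag_name, targets, by_site, servers_reachable):
    """Build one Stop-DatabaseAvailabilityGroup command line per target.

    dag_name          -- name of the DAG
    targets           -- AD site names (by_site=True) or Mailbox server names
    by_site           -- True when the DAG spans multiple Active Directory sites
    servers_reachable -- assumption: True when the primary-datacenter Exchange
                         servers are online and reachable from the recovery site
    """
    scope = "-ActiveDirectorySite" if by_site else "-MailboxServer"
    # When the servers cannot be contacted, only the configuration is updated.
    suffix = "" if servers_reachable else " -ConfigurationOnly:$True"
    return [
        f"Stop-DatabaseAvailabilityGroup -Identity {dag_name} {scope} {target}{suffix}"
        for target in targets
    ]


if __name__ == "__main__":
    # Hypothetical names: DAG1 with two unreachable primary-site members.
    for line in build_stop_commands("DAG1", ["MBX1", "MBX2"], by_site=False,
                                    servers_reachable=False):
        print(line)
```

Generating the lines this way makes it easier to confirm that every site or server outside the recovery datacenter is covered, which is what the "Repeat" instruction on each slide asks for.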
Stop-DatabaseAvailabilityGroup Are the Exchange servers in the primary datacenter online? YES NO
Stop-DatabaseAvailabilityGroup Is your DAG extended to multiple Active Directory sites? YES NO
Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
• Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Optional: If Exchange Management Shell access to the primary datacenter is available, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up, and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
• If no Exchange server instance is functional to service the Exchange Management Shell, this step can be skipped.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
• Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?
Stop-DatabaseAvailabilityGroup Is your DAG extended to multiple Active Directory sites? YES NO
Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• OPTIONAL: Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Optional: If Exchange Management Shell access to the primary datacenter is available, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
• Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?
Stop-DatabaseAvailabilityGroup Is your DAG extended to multiple Active Directory sites? YES NO
Stop-DatabaseAvailabilityGroup
COMMANDS:
• Optional: If Exchange Management Shell access to the primary datacenter is available, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
• Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
• The StoppedMailboxServers attribute is written to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
COMMON ERRORS:
• If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
Command Completed?

Stop-DatabaseAvailabilityGroup
COMMANDS:
• Using the Exchange Management Shell on a server in the recovery datacenter, run:
    Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
• Repeat the command for all DAG members that are not in the recovery datacenter.
EXPECTED OUTCOMES:
• Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
• The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.
Command Completed?
Restore-DatabaseAvailabilityGroup Did Stop-DatabaseAvailabilityGroup complete successfully? YES NO
Restore-DatabaseAvailabilityGroup
COMMANDS:
• Stop the Cluster service on each DAG member in the recovery datacenter. To do this, run the appropriate command for your DAG member's operating system:
    • Windows Server 2008 R2: Stop-Service Clussvc
    • Windows Server 2008 SP2: Net Stop Clussvc
EXPECTED OUTCOMES:
• Cluster services are stopped on the remaining nodes.
COMMON ERRORS:
• Access denied: use an elevated command prompt (Run as administrator) if you are not using the default administrator account.
Command Completed?
Restore-DatabaseAvailabilityGroup Is the Cluster service stopped on all DAG members in your recovery datacenter? YES NO
Restore-DatabaseAvailabilityGroup
COMMANDS:
• From the Exchange Management Shell on an Exchange server in the recovery datacenter, run:
    Restore-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <recovery site> -AlternateWitnessDirectory:<AWSPath> -AlternateWitnessServer:<AWSName>
EXPECTED OUTCOMES:
• A DAG member in the recovery datacenter is randomly selected and its Cluster service is started in /forcequorum mode.
• DAG members on the StoppedMailboxServers list are evicted from the DAG's cluster, thereby adjusting the membership count.
• If the resulting membership count is EVEN or results in a SINGLE node, the Cluster is configured with a Node and File Share Majority quorum and begins using the Alternate Witness Server and Alternate Witness Directory.
• Cluster services are started on the remaining DAG members and they successfully join the DAG's cluster.
VERIFICATION:
• Verify that the DAG members are up and the Cluster Group is online by running the following commands:
    • Windows Server 2008 R2:
        Import-Module FailoverClusters
        Get-ClusterNode -Cluster <DAGName>
        Get-ClusterGroup -Cluster <DAGName>
    • Windows Server 2008 SP2:
        Cluster <DAGName> node
        Cluster <DAGName> group
COMMON ERRORS:
• Nodes fail to evict with error 0x46. See http://aka.ms/0x46
Command Completed?
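The membership arithmetic in the expected outcomes above can be worked through ahead of time. The Python sketch below is a rough planning aid with hypothetical names, not part of Exchange: it removes the stopped servers from the member list and applies the slide's rule that an even count or a single remaining node uses Node and File Share Majority. The "Node Majority" result for other odd counts is an assumption based on general DAG quorum behavior; the slide does not state it.

```python
def quorum_after_restore(dag_members, stopped_servers):
    """Return (remaining_members, quorum_model) after the stopped servers
    are evicted from the DAG's cluster.

    Both arguments are lists of server names supplied by the operator.
    """
    stopped = {name.upper() for name in stopped_servers}
    remaining = [name for name in dag_members if name.upper() not in stopped]
    count = len(remaining)
    if count == 0:
        raise ValueError("no DAG members remain in the recovery datacenter")

    # Slide rule: EVEN count or a SINGLE node -> witness is used.
    uses_witness = count == 1 or count % 2 == 0
    # Assumption: other odd counts fall back to Node Majority (no witness).
    model = "Node and File Share Majority" if uses_witness else "Node Majority"
    return remaining, model


if __name__ == "__main__":
    # Hypothetical four-member DAG with two members stopped in the primary site.
    members = ["MBX1", "MBX2", "MBX3", "MBX4"]
    print(quorum_after_restore(members, ["MBX1", "MBX2"]))
```

Knowing in advance whether the witness will be used shows whether the Alternate Witness Server and Alternate Witness Directory must be reachable before Restore-DatabaseAvailabilityGroup is run.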
Restore-DatabaseAvailabilityGroup
Assuming all prerequisites have been met, any activation blocks can now be removed and databases can be mounted.
Command Completed?
Start-DatabaseAvailabilityGroup Is your primary datacenter online? YES NO
Start-DatabaseAvailabilityGroup
Ensure that supporting services are available, including but not limited to:
• Active Directory / domain controllers / global catalog / FSMO role holders
• Domain Name Services (DNS)
• Witness Server
• Supporting Exchange roles: Client Access and Hub Transport
OPTIONAL:
• Dynamic Host Configuration Protocol (DHCP) servers, if DHCP addresses are used for DAG networks
• Edge Transport server
• Unified Messaging server
Continue…
Start-DatabaseAvailabilityGroup Are the necessary services established and functioning? YES NO
Start-DatabaseAvailabilityGroup
COMMANDS:
• Verify network connectivity between all DAG members. Suggested methods:
    • Ping test between DAG members
    • Map administrative shares between DAG members
EXPECTED OUTCOMES:
• Connectivity between datacenters is functioning and all cluster inter-node communications are operating normally.
Command Completed?
Start-DatabaseAvailabilityGroup Have datacenter communications been verified? YES NO
Start-DatabaseAvailabilityGroup
• Verify that the Cluster service on each DAG member in the primary datacenter has a startup type of DISABLED. If it does not, either the Stop-DatabaseAvailabilityGroup command was not successful or the DAG members in the primary datacenter failed to receive the eviction notification after network connectivity between the datacenters was restored.
• Do not proceed until Cluster service cleanup has occurred and the Cluster service has a startup type of DISABLED.
• You can optionally run the following command on the DAG members in the primary datacenter to forcibly clean up the outdated cluster information:
    Cluster node /forcecleanup
Continue…
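The "do not proceed" gate above amounts to checking one value per server. The following Python sketch shows that check under stated assumptions: the function name is hypothetical, and the startup types are assumed to have been collected by hand (for example, from Services.msc on each primary-datacenter DAG member) into a simple mapping.

```python
def servers_blocking_switchback(startup_types):
    """Return the sorted names of servers whose Cluster service startup type
    is not Disabled. An empty list means the switchback can continue.

    startup_types maps server name -> startup type string as read by the operator.
    """
    return sorted(
        server
        for server, startup in startup_types.items()
        # Tolerate differences in case and stray whitespace in the recorded value.
        if startup.strip().lower() != "disabled"
    )


if __name__ == "__main__":
    # Hypothetical readings from two primary-datacenter DAG members.
    observed = {"MBX1": "Disabled", "MBX2": "Manual"}
    blocking = servers_blocking_switchback(observed)
    if blocking:
        print("Clean up before continuing:", ", ".join(blocking))
    else:
        print("All Cluster services are disabled; continue.")
```

Any server the function returns is a candidate for the optional forced cleanup command shown on the slide.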
Start-DatabaseAvailabilityGroup Does the Cluster service show a startup type of disabled? YES NO
Start-DatabaseAvailabilityGroup Is your DAG extended to multiple Active Directory sites? YES NO
Start-DatabaseAvailabilityGroup
COMMAND:
• Using the Exchange Management Shell, run the following command:
    Start-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
• Repeat for all other Active Directory sites that were stopped during the datacenter switchover process.
EXPECTED OUTCOMES:
• DAG members in the primary datacenter are added to the DAG's cluster.
• If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum.
VERIFICATION:
• Verify that the DAG members are up and the Cluster Group is online by running the following commands:
    • Windows Server 2008 R2:
        Import-Module FailoverClusters
        Get-ClusterNode -Cluster <DAGName>
        Get-ClusterGroup -Cluster <DAGName>
    • Windows Server 2008 SP2:
        Cluster <DAGName> node
        Cluster <DAGName> group
• The following command should show a StartedMailboxServers list containing all DAG members and an empty StoppedMailboxServers list:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
COMMON ERRORS:
• Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
Continue…

Start-DatabaseAvailabilityGroup
COMMAND:
• Using the Exchange Management Shell, run the following command:
    Start-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
• Repeat for all other Mailbox servers that were stopped during the datacenter switchover process.
EXPECTED OUTCOMES:
• DAG members in the primary datacenter are added to the DAG's cluster.
• If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum.
VERIFICATION:
• Verify that the DAG members are up and the Cluster Group is online by running the following commands:
    • Windows Server 2008 R2:
        Import-Module FailoverClusters
        Get-ClusterNode -Cluster <DAGName>
        Get-ClusterGroup -Cluster <DAGName>
    • Windows Server 2008 SP2:
        Cluster <DAGName> node
        Cluster <DAGName> group
• The following command should show a StartedMailboxServers list containing all DAG members and an empty StoppedMailboxServers list:
    Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
COMMON ERRORS:
• Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
Continue…
Start-DatabaseAvailabilityGroup Were the DAG members added to the cluster successfully? YES NO
Start-DatabaseAvailabilityGroup Were the DAG members added to the cluster successfully? YES NO
Start-DatabaseAvailabilityGroup
COMMANDS:
• Reset the DAG's Witness Server and Alternate Witness Server properties by running the following command:
    Set-DatabaseAvailabilityGroup -Identity <DAGName> -WitnessServer <WSName> -AlternateWitnessServer <AWSName>
EXPECTED OUTCOMES:
• Witness Server and Alternate Witness Server properties are configured to ensure the appropriate witness server is in use.
• If the Cluster configuration does not match the DAG configuration, the Cluster is updated with the proper configuration.
COMMON ERRORS:
• Administrators incorrectly verify which file share witness is currently in use. See http://aka.ms/E14FSW.
Continue…

Start-DatabaseAvailabilityGroup
After any activation blocks have been removed, active database copies can be moved to servers in the primary datacenter.
Continue…