Database Availability Group Switchover Guide

Learn how to perform a datacenter switchover for a database availability group in Exchange 2010, including commands, expected outcomes, and common errors. Ensure a smooth transition for your DAG.

Presentation Transcript


  1. Workflow Steps: Perform a datacenter switchover for a database availability group. Version 1.2 (Updated 12/2012)

  2. Exchange 2010 - Datacenter Switchover
     • Stop-DatabaseAvailabilityGroup
     • Restore-DatabaseAvailabilityGroup
     Exchange 2010 - Datacenter Switchback
     • Start-DatabaseAvailabilityGroup

  3. Stop-DatabaseAvailabilityGroup: Has the datacenter switchover been approved? (YES / NO)

  4. Stop-DatabaseAvailabilityGroup: Is the primary datacenter online or physically accessible? (YES / NO)

  5. Stop-DatabaseAvailabilityGroup: Do the remote and primary datacenters have network connectivity? (YES / NO)

  6. Stop-DatabaseAvailabilityGroup: Are the Exchange servers in the primary datacenter online? (YES / NO)

  7. Stop-DatabaseAvailabilityGroup: Is your DAG extended to multiple Active Directory sites? (YES / NO)

  8. Stop-DatabaseAvailabilityGroup
     COMMANDS:
     • Using the Exchange Management Shell on a server in the recovery datacenter, run:
       Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
     • Repeat the above command for every Active Directory site that contains DAG members and is not the recovery datacenter AD site.
     EXPECTED OUTCOMES:
     • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
     • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
     • On Exchange servers that were accessible in the primary datacenter, the Cluster service should have been forcibly cleaned up and configured with a startup type of DISABLED. You can verify this using Services.msc.
     • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
     COMMON ERRORS:
     • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
     Command Completed?
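
The list check in the EXPECTED OUTCOMES above can be sketched as a small helper. This is a minimal sketch, not part of the original deck: Test-DagStopState is a hypothetical function name, and the server lists you pass in are your own inventory of each datacenter. It assumes the DAG object exposes StartedMailboxServers and StoppedMailboxServers as name lists, and it compares short host names so that a fully qualified name and a short name are treated as the same server.

```powershell
# Hypothetical helper (not an Exchange cmdlet): compares a DAG object's
# StartedMailboxServers / StoppedMailboxServers lists with the servers
# expected in each datacenter.
function Test-DagStopState {
    param(
        [Parameter(Mandatory = $true)] $Dag,                        # e.g. output of Get-DatabaseAvailabilityGroup
        [Parameter(Mandatory = $true)] [string[]]$PrimaryServers,   # assumed: your list of primary datacenter Mailbox servers
        [Parameter(Mandatory = $true)] [string[]]$RecoveryServers   # assumed: your list of recovery datacenter Mailbox servers
    )

    # Reduce every name to its short host name so 'EX1' matches 'EX1.contoso.com'.
    $started = @($Dag.StartedMailboxServers | Where-Object { $_ } | ForEach-Object { ("$_").Split('.')[0] })
    $stopped = @($Dag.StoppedMailboxServers | Where-Object { $_ } | ForEach-Object { ("$_").Split('.')[0] })

    # -notcontains compares strings case-insensitively.
    $notStopped = @($PrimaryServers  | Where-Object { $stopped -notcontains $_.Split('.')[0] })
    $notStarted = @($RecoveryServers | Where-Object { $started -notcontains $_.Split('.')[0] })

    New-Object PSObject -Property @{
        PrimaryNotStopped  = $notStopped   # primary servers missing from StoppedMailboxServers
        RecoveryNotStarted = $notStarted   # recovery servers missing from StartedMailboxServers
        AsExpected         = ($notStopped.Count -eq 0 -and $notStarted.Count -eq 0)
    }
}
```

A possible call, with made-up server names: `Test-DagStopState -Dag (Get-DatabaseAvailabilityGroup -Identity <DAGName>) -PrimaryServers EX1,EX2 -RecoveryServers EX3,EX4`. The same check applies to the later Stop-DatabaseAvailabilityGroup slides that repeat this expected outcome.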

  9. Stop-DatabaseAvailabilityGroup
     COMMANDS:
     • Using the Exchange Management Shell on a server in the recovery datacenter, run:
       Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
     • Repeat the above command for all DAG members that are not in the recovery datacenter.
     EXPECTED OUTCOMES:
     • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
       Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
     • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
     • On Exchange servers that were accessible in the primary datacenter, the Cluster service should have been forcibly cleaned up and configured with a startup type of DISABLED. You can verify this using Services.msc.
     • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
     COMMON ERRORS:
     • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
     Command Completed?

  10. Stop-DatabaseAvailabilityGroup: Is your DAG extended to multiple Active Directory sites? (YES / NO)

  11. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  12. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
      • Repeat the command for all DAG members that are not in the recovery datacenter.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  13. Stop-DatabaseAvailabilityGroup: Are the Exchange servers in the primary datacenter online? (YES / NO)

  14. Stop-DatabaseAvailabilityGroup: Is your DAG extended to multiple Active Directory sites? (YES / NO)

  15. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  16. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
      • Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  17. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Optional: If Exchange Management Shell access to the primary datacenter is available, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • On Exchange servers that were accessible in the primary datacenter, the Cluster service should have been forcibly cleaned up and configured with a startup type of DISABLED. You can verify this using Services.msc.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      • No Exchange server instance is functional to service the Exchange Management Shell. In this case, this step can be skipped.
      Command Completed?

  18. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
      • Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  19. Stop-DatabaseAvailabilityGroup: Is your DAG extended to multiple Active Directory sites? (YES / NO)

  20. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  21. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  22. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • OPTIONAL: Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  23. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Optional: If Exchange Management Shell access to the primary datacenter is available, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
      • Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  24. Stop-DatabaseAvailabilityGroup: Is your DAG extended to multiple Active Directory sites? (YES / NO)

  25. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Optional: If Exchange Management Shell access to the primary datacenter is available, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True
      • Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      • The StoppedMailboxServers attribute is written twice: to a domain controller in the recovery datacenter and to a domain controller in the primary datacenter. This double write bypasses Active Directory site replication latency.
      COMMON ERRORS:
      • If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
      Command Completed?

  26. Stop-DatabaseAvailabilityGroup
      COMMANDS:
      • Using the Exchange Management Shell on a server in the recovery datacenter, run:
        Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True
      • Repeat the command for all DAG members that are not in the recovery datacenter.
      EXPECTED OUTCOMES:
      • Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      • The StoppedMailboxServers list should contain all Mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all Mailbox servers in the recovery datacenter.
      Command Completed?
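
The Stop-DatabaseAvailabilityGroup slides use the same command in four forms: by Active Directory site or by Mailbox server, with or without -ConfigurationOnly, repeated once per target. The following is a minimal sketch, assuming a hypothetical helper named Get-StopDagCommandLine that only builds the command text for each target so the list can be reviewed before anything is run. It does not decide which form applies; that still depends on the answers to the decision slides.

```powershell
# Hypothetical helper (not an Exchange cmdlet): returns one
# Stop-DatabaseAvailabilityGroup command line per target. Nothing is executed.
function Get-StopDagCommandLine {
    param(
        [string]$DagName,             # name of the DAG
        [string[]]$Target,            # AD site names or Mailbox server names outside the recovery datacenter
        [switch]$BySite,              # use -ActiveDirectorySite instead of -MailboxServer
        [switch]$ConfigurationOnly    # append -ConfigurationOnly:$true
    )

    $parameter = '-MailboxServer'
    if ($BySite) { $parameter = '-ActiveDirectorySite' }

    foreach ($t in $Target) {
        # Quote the target so names containing spaces survive copy and paste.
        $line = "Stop-DatabaseAvailabilityGroup -Identity $DagName $parameter '$t'"
        if ($ConfigurationOnly) { $line += ' -ConfigurationOnly:$true' }
        $line
    }
}
```

For example, `Get-StopDagCommandLine -DagName DAG1 -Target EX1,EX2 -ConfigurationOnly` returns two per-server command lines; DAG1, EX1 and EX2 are made-up names.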

  27. Restore-DatabaseAvailabilityGroup: Did Stop-DatabaseAvailabilityGroup complete successfully? (YES / NO)

  28. Restore-DatabaseAvailabilityGroup
      COMMANDS:
      • Stop the Cluster service on each DAG member in the recovery datacenter. To do this, run the appropriate command for your DAG member's operating system:
        Windows Server 2008 R2: Stop-Service Clussvc
        Windows Server 2008 SP2: Net Stop Clussvc
      EXPECTED OUTCOMES:
      • Cluster services are stopped on the remaining nodes.
      COMMON ERRORS:
      • Access denied: you must use an elevated command prompt (Run as administrator) if the default administrator account is not used.
      Command Completed?

  29. Restore-DatabaseAvailabilityGroup: Is the Cluster service stopped on all DAG members in your recovery datacenter? (YES / NO)

  30. Restore-DatabaseAvailabilityGroup
      COMMANDS:
      • From the Exchange Management Shell on an Exchange server in the recovery datacenter, run:
        Restore-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <recovery site> -AlternateWitnessDirectory:<AWSPath> -AlternateWitnessServer:<AWSName>
      EXPECTED OUTCOMES:
      • A DAG member in the recovery datacenter is randomly selected and its Cluster service is started in /forcequorum mode.
      • DAG members on the StoppedMailboxServers list are evicted from the DAG's cluster, which adjusts the membership count.
      • If the resulting membership count is EVEN or results in a SINGLE node, the cluster is configured with a Node and File Share Majority quorum and begins using the Alternate Witness Server and Alternate Witness Directory.
      • Cluster services are started on the remaining DAG members and they successfully join the DAG's cluster.
      VERIFICATION:
      • Verify that the DAG members are up and the Cluster Group is online by running the following commands:
        Windows Server 2008 R2:
          Import-Module FailoverClusters
          Get-ClusterNode -Cluster <DAGName>
          Get-ClusterGroup -Cluster <DAGName>
        Windows Server 2008 SP2:
          Cluster <DAGName> node
          Cluster <DAGName> group
      COMMON ERRORS:
      • Nodes fail to evict with error 0x46. See http://aka.ms/0x46
      Command Completed?
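
The quorum rule in the EXPECTED OUTCOMES above can be written down as a one-line decision. This is a minimal sketch with a hypothetical function name, Get-ExpectedQuorumModel. The even-or-single-node case comes from the slide; the Node Majority result for larger odd counts is background knowledge about how DAGs use failover clustering and is not stated in the deck.

```powershell
# Hypothetical helper: returns the quorum model to expect for a given number
# of DAG members remaining in the cluster after eviction.
function Get-ExpectedQuorumModel {
    param([int]$MemberCount)

    if ($MemberCount -lt 1) { throw 'MemberCount must be at least 1.' }

    # Even count or a single node: a witness is needed to break ties (per the slide).
    if (($MemberCount -eq 1) -or (($MemberCount % 2) -eq 0)) {
        'Node and File Share Majority'
    }
    else {
        # Odd count greater than one (assumption from general DAG behavior, not from the slide).
        'Node Majority'
    }
}
```

On Windows Server 2008 R2, the result could be compared by hand with what the cluster reports, for example through Get-ClusterQuorum in the FailoverClusters module, if that cmdlet is available in your environment.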

  31. Restore-DatabaseAvailabilityGroup
      Assuming all prerequisites have been met, any activation blocks can now be removed and databases can be mounted.
      Command Completed?

  32. Start-DatabaseAvailabilityGroup: Is your primary datacenter online? (YES / NO)

  33. Start-DatabaseAvailabilityGroup
      Ensure that supporting services are available, including but not limited to:
      • Active Directory / domain controllers / global catalog / FSMO role holders
      • Domain Name System (DNS)
      • Witness Server
      • Supporting Exchange roles: Client Access and Hub Transport
      OPTIONAL:
      • Dynamic Host Configuration Protocol (DHCP) servers, if DHCP addresses are used for DAG networks
      • Edge Transport server
      • Unified Messaging server
      Continue…

  34. Start-DatabaseAvailabilityGroup: Are the necessary services established and functioning? (YES / NO)

  35. Start-DatabaseAvailabilityGroup
      COMMANDS:
      • Verify network connectivity between all DAG members. Suggested methods:
        Ping test between DAG members
        Map administrative shares between DAG members
      EXPECTED OUTCOMES:
      • Connectivity between datacenters is functioning and all cluster inter-node communications are operating normally.
      Command Completed?
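
Checking connectivity "between all DAG members" means every member should reach every other member, in both directions. A minimal sketch, assuming a hypothetical helper named Get-DagMemberPair, that lists the source and destination pairs to test; the member names you pass in are your own.

```powershell
# Hypothetical helper: enumerates every ordered (source, destination) pair of
# DAG members so that no direction is skipped during the connectivity test.
function Get-DagMemberPair {
    param([string[]]$Members)

    foreach ($source in $Members) {
        foreach ($destination in $Members) {
            if ($source -ne $destination) {
                New-Object PSObject -Property @{
                    Source      = $source
                    Destination = $destination
                }
            }
        }
    }
}
```

Each pair can then be checked with whichever method you prefer from the slide, such as a ping test run from the source member against the destination member.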

  36. Start-DatabaseAvailabilityGroup: Have datacenter communications been verified? (YES / NO)

  37. Start-DatabaseAvailabilityGroup
      • Verify that the Cluster service on each DAG member in the primary datacenter has a startup type of DISABLED. If it does not, either the Stop-DatabaseAvailabilityGroup command was not successful or the DAG members in the primary datacenter failed to receive the eviction notification after network connectivity between datacenters was restored.
      • Do not proceed until Cluster service cleanup has occurred and the Cluster service has a startup type of DISABLED.
      • You can optionally run the following command on the DAG members in the primary datacenter to forcibly clean up the outdated cluster information:
        Cluster node /forcecleanup
      Continue…
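
The "do not proceed" gate above can be sketched as a filter over service records. This is a minimal sketch with a hypothetical function name, Get-ClusterServiceNotDisabled. It assumes each record has SystemName and StartMode properties, which is the shape returned by a WMI query against the Win32_Service class; how you collect the records is up to you.

```powershell
# Hypothetical helper: given Cluster service records (objects with SystemName
# and StartMode), returns the names of servers whose startup type is NOT Disabled.
# An empty result means the gate on this slide is satisfied.
function Get-ClusterServiceNotDisabled {
    param($ServiceRecords)

    @($ServiceRecords) |
        Where-Object { $_ -and $_.StartMode -ne 'Disabled' } |
        ForEach-Object { $_.SystemName }
}
```

One way to collect the records, as an assumption about your environment, is `Get-WmiObject Win32_Service -Filter "Name='ClusSvc'" -ComputerName <primary DAG members>`. Any server name returned by the helper still needs cleanup before you continue.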

  38. Start-DatabaseAvailabilityGroup: Does the Cluster service show a startup type of DISABLED? (YES / NO)

  39. Start-DatabaseAvailabilityGroup: Is your DAG extended to multiple Active Directory sites? (YES / NO)

  40. Start-DatabaseAvailabilityGroup
      COMMAND:
      • Using the Exchange Management Shell, run the following command:
        Start-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>
      • Repeat for all other Active Directory sites that were stopped during the datacenter switchover process.
      EXPECTED OUTCOMES:
      • DAG members in the primary datacenter are added to the DAG's cluster.
      • If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum.
      VERIFICATION:
      • Verify that the DAG members are up and the Cluster Group is online by running the following commands:
        Windows Server 2008 R2:
          Import-Module FailoverClusters
          Get-ClusterNode -Cluster <DAGName>
          Get-ClusterGroup -Cluster <DAGName>
        Windows Server 2008 SP2:
          Cluster <DAGName> node
          Cluster <DAGName> group
      • The following command should show a StartedMailboxServers list with all DAG members and an empty StoppedMailboxServers list:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      COMMON ERRORS:
      • Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
      Continue…

  41. Start-DatabaseAvailabilityGroup
      COMMAND:
      • Using the Exchange Management Shell, run the following command:
        Start-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>
      • Repeat for all other Mailbox servers that were stopped during the datacenter switchover process.
      EXPECTED OUTCOMES:
      • DAG members in the primary datacenter are added to the DAG's cluster.
      • If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum.
      VERIFICATION:
      • Verify that the DAG members are up and the Cluster Group is online by running the following commands:
        Windows Server 2008 R2:
          Import-Module FailoverClusters
          Get-ClusterNode -Cluster <DAGName>
          Get-ClusterGroup -Cluster <DAGName>
        Windows Server 2008 SP2:
          Cluster <DAGName> node
          Cluster <DAGName> group
      • The following command should show a StartedMailboxServers list with all DAG members and an empty StoppedMailboxServers list:
        Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL
      COMMON ERRORS:
      • Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
      Continue…

  42. Start-DatabaseAvailabilityGroup: Were the DAG members added to the cluster successfully? (YES / NO)

  43. Start-DatabaseAvailabilityGroup: Were the DAG members added to the cluster successfully? (YES / NO)

  44. Start-DatabaseAvailabilityGroup
      COMMANDS:
      • Reset the DAG's Witness Server and Alternate Witness Server properties by running the following command:
        Set-DatabaseAvailabilityGroup -Identity <DAGName> -WitnessServer <WSName> -AlternateWitnessServer <AWSName>
      EXPECTED OUTCOMES:
      • Witness Server and Alternate Witness Server properties are configured to ensure the appropriate witness server is in use.
      • If the cluster configuration does not match the DAG configuration, the cluster is updated with the proper configuration.
      COMMON ERRORS:
      • Administrators incorrectly verify which file share witness is currently in use. See http://aka.ms/E14FSW.
      Continue…

  45. Start-DatabaseAvailabilityGroup
      After any activation blocks have been removed, active database copies can be moved to servers in the primary datacenter.
      Continue…
