430 likes | 933 Views
DBI316. SQLCAT: SQL Server HA and DR Design Patterns, Architectures, and Best Practices Using Microsoft SQL Server 2012 AlwaysOn. Sanjay Mishra Program Manager Microsoft Corporation. Setting the Stage. Assumed Pre-requisites for this presentation: Basic knowledge of
E N D
DBI316 SQLCAT: SQL Server HA and DR Design Patterns, Architectures, and Best Practices Using Microsoft SQL Server 2012 AlwaysOn Sanjay Mishra Program Manager Microsoft Corporation
Setting the Stage • Assumed Pre-requisites for this presentation: Basic knowledge of • AlwaysOn Failover Cluster Instances (FCI) • AlwaysOn Availability Groups (AG) • Definition: For the purpose of this presentation • High Availability (Local HA): Availability within a data center • Disaster Recovery (DR): Availability across data centers
Setting the Stage • AlwaysOn ≠ Availability Groups • AlwaysOn = { SQL Server Failover Cluster Instances, Availability Groups } • Availability Groups ≠ Database Mirroring
SQL Server 2012 AlwaysOn HA+DR Design Patterns Slight variations of these design patterns are occasionally observed as well.
Wednesday, June 13, 10:15 AM – 11:30 AM SQLCAT: HA/DR Customer Panel - Microsoft SQL Server 2012 AlwaysOn Deployment Considerations N 320E • Michael Steineke, Edgenet, Inc. • David P. Smith, ServiceU Corporation • AyadShammout, CareGroup Healthcare Systems • Wolfgang Kutschera, bwin party • Thomas Grohser, Hedge fund in Connecticut
title Multi-site Failover Cluster Instance for HA and DR
Multi-site Failover Cluster Instance for HA and DR Windows Server Failover Cluster Node 3 Node 1 Node 4 Node 2 Primary Site DR Site Passive SQL-FCI Passive Active Passive Storage Replication
Multi-site Failover Cluster InstanceKey Elements • A single SQL Server failover cluster instance (FCI) providing HA as well as DR • spanning across multiple sites (usually multiple subnets as well) • Key components: • Storage • Storage level replication • Cluster Enabler • Provided by the storage vendor • Work with your storage vendor to get the appropriate software and best practices • Network • Multi-subnet support in SQL Server configuration and engine • Key improvement in SQL Server 2012 • IP address OR dependency set within SQL Server setup • SQL Engine skips binding to any IP’s which are not online at start-up • RegisterAllProvidersIP for Network Name improves application failover time
Multi-site Failover Cluster InstanceDeployment Considerations • Storage Validation • Storage Validation Check Requirement is relaxed due to make-up of multi-site storage infrastructure (but still get the pop-up!) • Multi-site FCI Solution does not require passing the storage validation tests, to be supported. http://support.microsoft.com/kb/943984 • Appropriate Quorum Model • Validation suggests “Node and Disk Majority” which can be ignored • Consider “Node and File Share Majority” or “Node Majority” based on number of nodes
Multi-site Failover Cluster InstanceDeployment Considerations • TEMPDB on Local Disk • Not specific to “multi-site” FCIs, but has some great positive side effects for “multi-site” scenarios • Enables use of local storage for TEMPDB • Can use solid state storage to improve performance of TEMPDB-heavy workloads • Saves money on storage replication licensing • Reduces cross-data center storage replication traffic
title Availability Groups for HA and DR
Availability Groups for HA and DRDeployment Considerations • Pre-requisites: • Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server) • Unit of Failover • Group of databases – not the instance • Consider Contained Database for containing logins for failover • For jobs and other objects outside the database, simple customization needed • Considerations for Replacing Log Shipping • No delayed apply on the secondary • Removing log shipping means the regular log backup job is removed • Need to re-establish periodic log backup (essential for truncating the log) • New tools for monitoring and alerting • AlwaysOnDashboard • New DMVs • System Center Operations Manager
Availability Groups for HA and DRQuorum Considerations • Quorum is managed by the WSFC, irrespective of the number of SQL Server instances, number of nodes, number of availability groups • Important goal: Design to ensure • Unavailability of the DR site (or the node at DR site) , or loss of network connectivity between sites should not impact the quorum of the WSFC • Two steps: • Node votes: First decide which nodes should have a vote • Quorum Model: Then choose the appropriate quorum model
Availability Groups for HA and DRQuorum Considerations • Node Votes • By default, every node has a vote => May not be ideal for the HA / DR goals • Windows Server hotfix: http://support.microsoft.com/kb/2494036 • Guidelines: http://msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVoting • For the example topology discussed here, this means: • 1 vote to each node in the primary data center • 0 vote to the node in the disaster recovery data center • = total 2 votes in the Windows Cluster => not ideal ! • Need odd number of votes for a “majority” based quorum model • Since this is a purely non-shared storage solution, two possible quorum models: • Node and File Share Majority, or • Node Majority
Quorum Model and Node VotesNode and Fileshare Majority Use the “Node and File Share Majority” quorum model with a protected file share witness. Note: The Fileshare Witness always has 1 vote.
Quorum Model and Node VotesNode Majority Add an additional voting node to the WSFC in the primary data center, and then use the “Node Majority” quorum model.
Quorum Model and Node VotesHow to set / view To View Quorum Model To Change Quorum Model • Windows Failover Cluster Manager GUI • PowerShell • Cluster.exe • SQL Server DMVs • AlwaysOn Dashboard in SSMS • Windows Failover Cluster Manager GUI • PowerShell • Cluster.exe To View Node Votes To Change Node Votes • PowerShell • Cluster.exe • PowerShell • Cluster.exe • SQL Server DMVs • AlwaysOn Dashboard in SSMS
Recovering from a Disaster • Disaster = Primary site is down • Manual Process involved to bring database service online on the DR site • Force Quorum on the secondary in the DR site • Execute FORCE SERVICE ALLOW DATA LOSS • Adjust quorum model and/or node votes
Migration: From DBM+LS to AGPlanning and Key Considerations • Hardware: new hardware, reuse existing hardware? • Windows Clustering: involve the Windows System Administration team and the networking team • Quorum considerations across multiple data centers • Cluster network communication across multiple data centers • Stages: migrate the whole configuration at once, or migrate the DR afterwards? • Application connection string change
Special Case: Automatic Failover for DRUse of 3rd Data Center
title Failover Cluster Instance for HA, and Availability Group for DR
FCI for HA + AG for DRDeployment Considerations • Pre-requisites: • Windows Server Service packs / QFEs: • Asymmetric Storage • Windows Server 2008 with http://support.microsoft.com/kb/976097 • OR, Windows Server 2008 R2 SP1 • Node Votes: http://support.microsoft.com/kb/2494036 • Validate disk test QFE: http://support.microsoft.com/kb/2531907 • Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server) • Different units of failover for HA and DR • Instance-level failover for local HA (FCI) • Group of databases (AG) for DR • AG Failover Mode • In FCI+AG configuration, FCI provides automatic failover, and AG provides manual failover
FCI for HA + AG for DRDeployment Considerations • Asymmetric Storage • Key concept behind this architecture • New Windows Server Failover Clustering capability introduced in: • Windows Server 2008 R2 SP1 • Windows Server 2008 with QFE • Symmetric storage = a cluster disk that is shared between all the WSFC nodes • Asymmetric storage = a cluster disk that is shared between a subset of nodes • Instance Naming • Each FCI within the WSFC needs to have a different instance name • Database File Paths • (recommended) use identical drive letters for the disks for each FCI • (recommended) use identical file paths for data and log files for each FCI
FCI for HA + AG for DRQuorum Considerations • Quorum is managed by the WSFC, irrespective of the number of SQL Server instances (FCI or standalone), number of nodes, number of availability groups • Important goal: Design to ensure • Unavailability of the DR site, or loss of network connectivity between sites should not impact the quorum of the WSFC • Two steps: • Node votes: First decide which nodes should have a vote • Quorum Model: Then choose the appropriate quorum model
FCI for HA + AG for DRQuorum Considerations • Node Votes • By default, every node has a vote => May not be ideal for the HA / DR goals • Windows Server hotfix: http://support.microsoft.com/kb/2494036 • Guidelines: http://msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVoting • For the example topology discussed here, this means: • 1 vote to each node in the primary data center • 0 vote to each node in the disaster recovery data center • = total 2 votes in the Windows Cluster => not ideal ! • Need odd number of votes for a “majority” based quorum model • Quorum models: • Pick one of the “majority” based quorum models with odd number of votes • Node and File Share Majority, or • Node Majority, or • Node and (asymmetric) Disk Majority • Or, pick (asymmetric) Disk Only (special case!) – votes don’t matter
Quorum Model and Node VotesExample: Node and Fileshare Majority Note: The Fileshare Witness always has 1 vote.
Quorum Model and Node VotesHow to set / view To View Quorum Model To Change Quorum Model • Windows Failover Cluster Manager GUI • PowerShell • Cluster.exe • SQL Server DMVs • AlwaysOn Dashboard in SSMS • Windows Failover Cluster Manager GUI • PowerShell • Cluster.exe To View Node Votes To Change Node Votes • PowerShell • Cluster.exe • PowerShell • Cluster.exe • SQL Server DMVs • AlwaysOn Dashboard in SSMS Note: Only cluster.exe can be used to set quorum model to “Node and (asymmetric) Disk Majority” or “(asymmetric) Disk Only”
Recovering from a Disaster • Disaster = Primary site is down • Manual Process involved to bring database service online on the DR site • Force Quorum on the secondary in the DR site • Execute FORCE SERVICE ALLOW DATA LOSS on the Availability Group • Adjust quorum model and/or node votes • Rethink quorum model: needs for another fileshare at the DR site?
Migration: From FCI+DBM to FCI+AGPlanning and Key Considerations • Hardware: new hardware, reuse existing hardware? • Windows Clustering • Quorum considerations across multiple data centers • Cluster network communication across multiple data centers • Stages: migrate the whole configuration at once, or migrate the DR afterwards? • Secondary (DR site) needs re-seeding • Uninstall existing SQL FCI • Destroy existing WSFC at the DR site • Re-install SQL FCI after joining DR nodes to primary data center WSFC • Backup from primary, and Restore on the secondary • Application connection string change
SQL Server 2012 AlwaysOn HA+DR Design Patterns Slight variations of these design patterns are occasionally observed as well.
Wednesday, June 13, 10:15 AM – 11:30 AM SQLCAT: HA/DR Customer Panel - Microsoft SQL Server 2012 AlwaysOn Deployment Considerations N 320E • Michael Steineke, Edgenet, Inc. • David P. Smith, ServiceU Corporation • AyadShammout, CareGroup Healthcare Systems • Wolfgang Kutschera, bwin party • Thomas Grohser, Hedge fund in Connecticut
Track Resources Hands-On Labs @sqlserver @ms_teched SQL Server 2012 Eval Copy Get Certified! mva Microsoft Virtual Academy
www.sqlcat.com sanjaymi@microsoft.com @sqlcat Sanjay Mishra
Resources Learning TechNet • Connect. Share. Discuss. • Microsoft Certification & Training Resources http://northamerica.msteched.com www.microsoft.com/learning • Resources for IT Professionals • Resources for Developers • http://microsoft.com/technet http://microsoft.com/msdn
Required Slide Complete an evaluation on CommNet and enter to win!
Required Slide • *delete this box when your slide is finalized • Your MS Tag will be inserted here during the final scrub. MS Tag Scan the Tag to evaluate this session now on myTechEd Mobile
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.