1 / 42

Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008

Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008. Matt Hollingsworth Principal Program Manager Microsoft Corporation DAT303. Agenda. Introduction to High Availability and Disaster Recovery SQL Server Always On Technologies

ziven
Download Presentation

Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008 Matt Hollingsworth Principal Program Manager Microsoft Corporation DAT303

  2. Agenda • Introduction to High Availability and Disaster Recovery • SQL Server Always On Technologies • Developing Your Availability Solution • Conclusion

  3. Introduction to High Availability and Disaster Recovery • Definitions • Introduce key terms and concepts • Business Continuity Planning • Overview of the BCP process • SQL Server High Availability Planning • How does BCP apply to SQL Server availability?

  4. High Availability and Disaster Recovery: Definition High Availability Disaster Recovery Processes and procedures designed to restore business operations due to a natural or human-induced disaster Typically involves providing redundancy spanning multiple sites or across geographic regions • High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period • Availability defined in terms of service level agreements (SLA) • Recovery Time • Data loss during unplanned downtime • A highly available application should be accessible by users x% of the time

  5. Defining x and SLA • Recovery Time Objective (RTO) guided by availability requirements • How much downtime can you tolerate? • Recovery Point Objective (RPO) guided by criticality of application data • How much data can you lose? Availability Requirements Tier3 Tier2 • RPO Tier1 • RTO

  6. Protection Levels • Protection against resource failures • Machine • Database Corruption • Disk • Location Redundancy • Building • < 10 miles Regional DR Geographic DR • Protection against • Natural Disasters • Location Redundancy • State, Country • > 100-200 miles • Protection against • Network Outages • Site Failures • Location Redundancy • City, County • < 100-200 miles Local HA

  7. Business Continuity Planning • Impact Analysis • Critical Functions • Threat Identification • Recovery Objectives • Solution Design • Achieve recovery objectives for relevant threats within specified constraints like budget, human resources etc • Cost\Benefit analysis of solutions • Implementation • Deploy the recommended solution • Testing • Test to see if the solution meets the recovery requirements • Maintenance • Yearly testing and review of procedures

  8. SQL Server High Availability Planning • Analysis • Application tiers serviced by the databases • Causes of database downtime • Protection levels: Local HA, Regional DR, Geographic DR • Solution Design • Need to understand what solutions exists? • What are the characteristics and cost of the solution? • Implementation • What are the deployment steps and best practices? • Testing • How do I test my implementation? • Maintenance • How do I monitor and maintain the solution?

  9. Database Downtime Drivers • Analysis

  10. Understand the solutions and choices before making a decision Solution Design • Solution Design

  11. SQL Server Always On Technologies • Solution Design

  12. Always On Technologies • Solution Design • Provides a full range of options to minimize downtime and maintain appropriate levels of application availability

  13. Always On Technology Overview • Solution Design • Architecture Overview • How does it work? • Solution Characteristics • Data Loss Guarantees • Failover Characteristics • Redundancy Levels and Utilization • Cost • Limitations and Caveats

  14. Backup and Restore • Solution Design • Base availability technology for any solution • Protects against failures and recovery from errors • Provides Local HA and Site DR • Need to ensure the backups are accessible if site goes down • High RTO due to restore time • RPO=0 can never be guaranteed • Types: Full, Differential, and Transaction Log • File-group backup/restore for large databases • Backup Compression provides faster and smaller backups in SQL Server 2008

  15. Log Shipping • Solution Design • Automated transaction log backup and restore provides redundancy at the database level • SQLLogship.exe provides the underlying framework for doing automated backup, copy and restore • Backup on primary instance • Restore on secondary instance(s) • Scheduling is done through SQL Server Agent jobs • SQL Server 2008 provides sub-minute scheduling interval providing the ability to do quick backup and restores • No automatic failover capabilities

  16. Database Mirroring • Solution Design • A database level high availability solution that provides complete protection against data loss and fast recovery through automatic failover • Maintains a redundant database by shipping log blocks when the transactions are committed on the principal • Synchronous and Asynchronous modes provide the spectrum of options to choose between availability and performance • Automatic failover when using witness server

  17. Failover Clustering • Solution Design • Instance level protection built on Windows Failover Clustering shared disk model • Cluster nodes typically co-located within the same site to provide local HA • Regional DR possible using VLAN and stretch storage level replication • No built in data redundancy like database mirroring and log shipping • Data protection has to be provided at the storage level or by combining with other solutions

  18. Transactional Replication • Solution Design • A high performance data replication solution that provides granular table level replication • Logical data movement provides flexibility and better hardware utilization • Key scenarios: • Customized application-specific DR • Real-time reporting on secondary server that be used for Site DR • Scale out application queries with ability to use any one database copy for Site DR • Two types relevant for HA and DR • Transactional and Peer-to-Peer

  19. Always On Solution Characteristics • Solution Design Redundancy and Utilization RPO Failover Cost * Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or database snapshots respectively ** Database Mirroring provides fastest failover to hot secondary *** Depends on SAN technology

  20. What’s New in SQL Server 2008 New Features Feature Enhancements Database Mirroring Automatic recovery from page corruption Log stream compression Faster recovery on failover Log Shipping Sub-Minute Log Shipping Backup compression Failover Clustering 16 nodes Rolling upgrade Peer-Peer Replication Hot add new nodes • Resource Governor • Manage SQL Server workloads and resources by specifying limits on resource consumption • Backup Compression • Reduce backup and restore time

  21. Backup Compression • Common questions: • “How much compression will I see?” • “Will it be comparable to, say, SQL Litespeed?” • One simple answer: “It depends!” • All data compresses differently – the compression ratio achieved depends on: • The type of data in the database • Whether the data in the database is already compressed • Whether the data/database is encrypted • “We saw an 85 percent reduction in file size using SQL Server 2008 Backup Compression,” says Colin Neller, Senior Software Engineer at ServiceU and part of the company’s SQL Server 2008 implementation team. “A backup file that was previously over 300 GB is now only 40 GB, and the job runs in about half the time.”

  22. Backup Compression: Backup Performance • Backup of a 322 MB Adventureworks database Compressed Uncompressed • A LOT more CPU used (avg 25%) BUT runtime = 21.6s (45% improvement) and backup stored in 76.7MB (4.2x compression ratio) • Hardly any CPU used (avg 5%), runtime = 39.5s, compression ratio of 0.

  23. demo DEMO: Increasing Availability Using Always On Technologies

  24. Developing Your Availability Solution using SQL Server Always On Technologies • Solution Design

  25. Recap • Solution Design • Application availability requirements or SLA drive primary solution choices • RPO and RTO are the key metrics used to define the SLA • Need mitigation against planned and unplanned downtimes • Multiple solution choices that provides varying cost\benefits • Other requirements apart from application SLA factor into the choice • Understand constraints and tradeoffs you can make Database Mirroring Clustering Peer-Peer Replication Log Shipping

  26. Always On Solution Characteristics • Solution Design Redundancy and Utilization RPO Failover Cost * Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or database snapshots respectively ** Database Mirroring provides fastest failover to hot secondary *** Depends on SAN technology

  27. AdventureWorks Inc Scenario • Solution Design One datacenter located in Omaha Three applications Manufacturing – Tier 1 Finance – Tier 2 Scheduling – Tier 3 Manufacturing application runs on a dedicated SQL Server 2008 Instance All other applications run on a second instance Availability of manufacturing application is critical Implement a solution at the lowest possible cost Adventureworks Inc is a manufacturing company that manufactures and sells bicycles across the world. There are a number of applications, some that are mission critical that run on multiple SQL Server Instances • The DBA team is run by Darren who is responsible for deploying and managing the application databases. One of his core responsibilities is to ensure availability of all application databases in order to meet the application SLA

  28. Application Requirements • Solution Design • Manufacturing application has strict SLA’s • Finance application requires readability on the secondary • The reports are run every 4 hours and need to be fresh as of the last one hour. To offload the reporting load from the main system they would like to utilize the mirror

  29. Solution Choice for Manufacturing Application • Solution Design • Clustering can provide a zero data loss solution that can also provide fast instance level failover • Use RAID configuration to provide data redundancy on the SAN • If a redundant copy is required that can provide instance failover with zero data loss use SAN replication • High Cost Solution • Use synchronous database mirroring if instance failover is not needed Clustering with RAID

  30. Solution Choice for Finance Application • Solution Design Reports Finance Scheduling Db Snapshot every hour • For database level redundancy with acceptable data loss with minimal perf impact, asynchronous database mirroring is an optimal choice • Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting • Low cost solution Async Database Mirroring Omaha Datacenter

  31. Adding a Regional Datacenter Into the Mix • Solution Design • Regulatory and compliance requirements drive the need for having a additional datacenter within a 10 mile radius to provide redundancy against site level failure. • It is now required that all applications have the ability to failover to the regional datacenter across the river in Council Bluff • The SLA need to be maintained for tier 1 applications even in the case of site failures

  32. Regional Site Solution Choices • Solution Design Manufacturing Cluster with SAN Reports Sync Mirroring no witness Finance Scheduling Db Snapshot every hour Async Database Mirroring Omaha Datacenter Log Shipping CB Datacenter

  33. A Complete Topology • Solution Design • Considering the potential of floods and tornadoes destroying the regional data centers, Adventureworks Inc wants to maintain a disaster recovery site in San Antonio, TX • The disaster recovery site has lower SLA requirements for all applications • The manufacturing application can have an RPO of 1 hour • The RTO is set at 4 hours

  34. Topology Diagram • Solution Design Sync Mirroring No witness Manufacturing Cluster with SAN Log Shipping

  35. Scale Out and Availability Scenario • Solution Design • Requirements • Geo Redundancy • Data Locality • High Availability • Local Read-Scale • Workload Characteristics • Mainly reads • Few writes • Application Characteristics • Each user logging in connects to a particular server • Partitioned based on user-id and region • Writes from a user always happen on one server regardless of the region the user log in from • All reads redirected to the closest geo-location • Reasonable tolerance for latency (5-10 minutes) • Adventureworks is building a new web based order management system that allows customers from all over the world access the system and place orders • The core group of customers are in Western Europe, South East Asia and North America

  36. Replication Topology • Solution Design Asia1 Asia2 Peer Nodes Read-Only Servers

  37. demo Implementing and maintaining a HA solution

  38. Licensing Facts • Passive servers are mirror, log shipped secondary and clustering passive node • No license required on passive if it is truly passive • A passive server does not need a license if the number of processors in the passive server is equal to or less than the number of processors in the active server. • The passive server can take the duties of the active server for 30 days. Afterwards, it must be licensed accordingly.

  39. HA Features Edition Support ₁Single thread redo ₂ Limited to 2 node cluster

  40. Summary • There is no “one size fits all” solution • Consider the cost\benefits\constraints and compare that to availability requirements of the organization to determine the best solution • Use the charts to understand cost, benefit and constraints of the various SQL Server High Availability solutions • TEST the solution to ensure it can meet the availability requirements and meet SLA’s

  41. question & answer

More Related