220 likes | 341 Views
SQL Server Disaster Recovery. Chris Shaw Sr. SQL Server DBA, Xtivia Inc. Christopher Shaw. Microsoft SQL Server MVP Cshaw@Xtivia.com Twitter @ SQLShaw http ://chrisshaw.wordpress.com/ SQL PASS Regional Mentor Contributing Author for 3 SQL Server Books (working on 4 th )
E N D
SQL Server Disaster Recovery Chris Shaw Sr. SQL Server DBA, Xtivia Inc.
Christopher Shaw • Microsoft SQL Server MVP • Cshaw@Xtivia.com • Twitter @SQLShaw • http://chrisshaw.wordpress.com/ • SQL PASS Regional Mentor • Contributing Author for 3 SQL Server Books (working on 4th) • A full time VDBA
Success Requires Planning • Start planning now. • Document your plan, and update regularly. • Create hard copies and soft copies of your plan, keep a copy of your plan with your backups. • Predetermine layers of recovery. Tip – Have increased understanding of your business and goals as a whole and not just the task goal. Maximize resource usage.
Define “Disaster” Anything impacting the availability of your data. • Human Error • Storage Failure • Power Failure • Memory Failure • Natural disasters • The cable guy down the street Tip – Your company servers don’t know the difference between a fire, a flood or a bad disk drive. A crash is a just a crash. Down is simply down.
Define “Disaster” • Your company is much more likely to experience a hardware issue than a natural disaster. • 76% of effected businesses have not been impacted by a natural disaster. • Most common cause of system failures is HARDWARE. • Annual disk replacements have been documented as high as 13%. • Inside attacks and user error account for the majority of business downtime. • 70% of all successful attacks on networks were carried out by employees and insiders. • Disaster “downtime” can co$t your company • Companies lose an average of $84,000 for each hour of downtime. Tip – Your company servers don’t know the difference between a fire, a flood or a bad disk drive. A crash is a just a crash.
The Two Design Questions • Define the Goal – What is: • RTO – Recovery Time Objective. • The Amount of acceptable downtime • RPO – Recovery Point Objective. • The Amount of maximum data loss. Tip –This is planning for worst case scenario. No one wants to say they can be down for extended periods of time, or lose any data. Yet over aggressive goals can increase the cost of your solution.
The Keys to Success • Identify the Resources • Define the Budget • Evaluate costs of downtime to cost of high availability. What is it worth? • Compliance Requirements (HIPA, SOX, SSAE 16 etc.) Tip –This is planning for worst case scenario. No one wants to say they can be down for extended periods of time, or lose any data. Yet over aggressive goals can increase the cost of your solution.
Outage Types • Site Outage • Hardware Outage • Database Outage
Possible Site Outages Effect all levels Potential Examples: Natural disaster, Fire, Flood, Global emergency, Power Outages. Technology Options • Transaction Log Shipping • Mirror • AlwaysOn
Possible Hardware Outages Technology Options • Redundant Hardware • RAID • Windows Clustering • Transaction Log Shipping • Geo – Cluster • Mirror • AlwaysOn Potential Examples: Storage issue, bad drive, memory crash, power supply
Possible Database Outages Technology Options • Backup / Restore • Transaction Log Backups • Database Snapshots • Snapshot Replication • Snapshots • Transaction Log Shipping • Geo – Cluster • Mirror • AlwaysOn Potential Examples: Human error, bug, bad release
Transaction Log Shipping Multiple Secondary Outside objects Potential Data Loss Failover Time Cost True Require Configuration Backups Database Based on Backups Slow $ Inexpensive Automated failover Fail back Latency Secondary Readable False False Based on Backups Yes, with restrictions Tip – Great inexpensive way to have a DR copy and a reporting server. However requires a lot of added configuration, and failback requires reconfiguration.
Transactional Replication Multiple Secondary Outside objects Potential Data Loss Failover Time Cost Database True Require Configuration Based on Configuration Slow $ Inexpensive Automated failover Fail back Latency Secondary Readable Distributor Copy False False Based on Agents Yes
Clusters Two or more servers that act as one Multiple Secondary Outside objects Potential Data Loss Failover Time Cost False Move with database Small Fast ( 5 seconds +) $ Expensive Automated failover Fail back Latency Secondary Readable True True None No Tip – Does a great job of protecting you from hardware issues. However if the storage fails your database will go down.
Mirroring Similar to Replication Multiple Secondary Outside objects Potential Data Loss Failover Time Cost False Requires Manual Configuration Database Small Fast (Dependent on connection) $ Witness Varies on Configuration Automated failover Fail back Latency Secondary Readable Varies on Configuration Database True Varies on Configuration No Tip – Great option for companies that don’t run standard edition.
AlwaysOn Availability Groups Multiple Secondary Outside objects Potential Data Loss Failover Time Cost True Partial Small Database Fast (Dependent on connection) $ Most Expensive Automated failover Fail back Latency Secondary Readable Varies on Configuration Database Database True Varies on Configuration Yes Tip – Ideal for companies that want to make use of all the resources they have however the cost is larger than other solutions
DR Options Matrix 1 - Objects required outside of database need to be preconfigured 2 - Consider the failover to be one way. To go back to original hardware reconfig and failover required 3 - Not 100% some items may need to be pre-configured 4 – Common misconception 5 – Feature is going to be removed from SQL Server in future editions to be replaced by AlwaysOn
Past Experience • There is no single cookie cutter solution that will fit every companies needs. • Technology changes, offering new solutions with each release. • Businesses grow, and requirements change. • Building without testing and updating is similar to not doing anything at all. • The shorter the latency the greater likelihood to propagate corruption. Increased latency increases data loss potential
Contact XTIVIA Today Email: CShaw@xtivia.com Phone: 719-387-0980 Website: www.XTIVIA.com
Thank You for Your Time… Questions?