AZR203. Business Continuity in the Windows Azure Cloud. Yousef A. Khalidi Distinguished Engineer Microsoft Corporation. Session Objectives and Takeaways. Session Objectives: Understand business continuity support provided by Windows Azure
AZR203 Business Continuity in the Windows Azure Cloud Yousef A. Khalidi Distinguished Engineer Microsoft Corporation
Session Objectives and Takeaways • Session Objectives: • Understand business continuity support provided by Windows Azure • Learn methods to maintain application availability • Key Takeaways: • Windows Azure provides highly-available and geo-distributed infrastructure • You have to architect your app for high availability • Your SLA requirements and budget constraints will dictate the solution
Cloud + Business Continuity • Some things change: • New trust relationships • Plan for failure at multiple levels • Design to operate seamlessly through failures • A new option for the disaster recovery site • And some remain the same: • Your business goals • Your availability and recovery objectives • What can you expect from the platform? • How can you make your application highly available?
The Big Picture • Application Architecture: Design your application to meet your availability goals • Platform Services: Optional availability services your applications can leverage • Platform Preparedness: Preventing and recovering from outages
Windows Azure: World-Class By Design • Physical Features: • World-class data centers with redundant power, climate control, and fire prevention and suppression • Leading innovator in power efficiency • Multi $billion cloud infrastructure • Platform Availability and Security: • 99.9% uptime, financially-backed SLAs • Highly available platform services • Service isolation over virtualized compute and network • Clear boundaries and multiple lines of defense • State-of-the-art security and access control • Geo-Distribution: • Multiple data centers in different geographies • Local and geo-replication • Redundant platform services and failover • Compliance and DR: • Physical facilities have broad compliance certifications • Service-level compliance on near-term roadmap • Preparedness, testing, refinement
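To make the 99.9% uptime figure concrete, the following minimal sketch converts an uptime percentage into a downtime budget, and shows how availability drops when a request depends on several components in series. The function names and the 30-day period are illustrative assumptions, not part of any Windows Azure SLA document, and the serial calculation assumes independent failures.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float = 30.0) -> float:
    """Minutes of downtime per period that still meet the given uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


def serial_availability_percent(*component_percents: float) -> float:
    """Availability of a request path that needs every component to be up.

    Assumes component failures are independent (an illustrative simplification).
    """
    result = 1.0
    for pct in component_percents:
        result *= pct / 100.0
    return result * 100.0


if __name__ == "__main__":
    # Downtime budget for a 99.9% target over a 30-day month.
    print(round(allowed_downtime_minutes(99.9), 1), "minutes per 30 days")
    # Two 99.9% components in series give a lower combined figure.
    print(round(serial_availability_percent(99.9, 99.9), 4), "% combined")
```

A 99.9% target leaves roughly 43 minutes of downtime in a 30-day month, which is one reason the deck stresses that the application itself must be architected for high availability.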
Highly Available Infrastructure • Redundancy • Duplicate copies of all data • No single point of failure platform services • Redundant network switches, routers, etc. • Partitioning • Many separate compute and storage stamps • Separate fabric controller and related services for each stamp • Optimized for MTTR • Expect and recover from failures quickly
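The emphasis on MTTR can be illustrated with the standard steady-state availability relation, availability = MTBF / (MTBF + MTTR). The sketch below is a generic reliability calculation with made-up numbers, not a description of how Windows Azure measures itself.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Same failure rate, different recovery times (illustrative values only).
    for mttr in (4.0, 1.0, 0.1):
        print(f"MTTR {mttr:>4} h -> availability {availability(1000.0, mttr):.5f}")
```

With the failure rate held fixed, shortening recovery time raises availability, which is the reasoning behind expecting failures and recovering from them quickly.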
Windows Azure Global Presence • North America Region: • N. Central – U.S. Sub-Region • S. Central – U.S. Sub-Region • East – U.S. Sub-Region • West – U.S. Sub-Region • Europe Region: • N. Europe Sub-Region • W. Europe Sub-Region • Asia Pacific Region: • E. Asia Sub-Region • S.E. Asia Sub-Region • Map legend: Major datacenter, CDN node
Platform-Level DR Preparedness • On-going investment in disaster preparedness • Platform meta state: • Stored in storage system • Frequently check-pointed, backed-up and geo-replicated • Testing, simulations and process refinements • Capacity management: • Extra capacity reserved in each datacenter for DR purposes • “N+1” model for failover
Platform Services • A set of building blocks: • BLOB and Table Geo-replication • SQL Azure DB Copy • Application Health Management • Traffic Manager CTP
WA Storage Geo-Replication • Data geo-replicated across data centers hundreds of miles apart • Turned on right now for Blob and Table data • Provides data durability in face of major data center disasters • Data geo-replicated within regions only • User chooses primary location during account creation • Other datacenter in region is the secondary location • Asynchronous geo-replication • Off critical path of live requests • Diagram: geo-replication between North Central US and South Central US, North Europe and West Europe, East Asia and South East Asia
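The primary/secondary rule on this slide can be sketched as a simple lookup. The pairs below are read off the slide's diagram labels; the `GEO_PAIRS` table and `secondary_for` helper are hypothetical names for illustration and are not a Windows Azure API.

```python
# Pairs as they appear on the slide's diagram (illustrative data, not an API).
GEO_PAIRS = [
    ("North Central US", "South Central US"),
    ("North Europe", "West Europe"),
    ("East Asia", "South East Asia"),
]

# Build a symmetric lookup: each location's secondary is the other one in its pair.
_SECONDARY = {}
for a, b in GEO_PAIRS:
    _SECONDARY[a] = b
    _SECONDARY[b] = a


def secondary_for(primary: str) -> str:
    """Return the geo-replication secondary for a primary location chosen at account creation."""
    try:
        return _SECONDARY[primary]
    except KeyError:
        raise KeyError(f"no geo-replication pair listed for {primary!r}") from None


if __name__ == "__main__":
    print(secondary_for("North Europe"))
```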
WA Geo-Failover • Existing URL works after failover • Failover trigger: failover would only be used if primary could not be recovered • Asynchronous geo-replication: may lose recent updates during failover • Typically geo-replicate data within minutes • Diagram labels: Azure DNS, Updating IP Address, Failover, North Central US, South Central US, Geo-replication
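Why asynchronous geo-replication "may lose recent updates" can be seen in a toy model: writes are acknowledged by the primary immediately and copied to the secondary later, so anything still queued at failover time is gone. The class below is a hypothetical in-memory simulation for illustration only; it does not reflect the actual storage implementation.

```python
from collections import deque


class AsyncGeoReplicatedStore:
    """Toy model of a primary store with an asynchronously updated secondary."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._pending = deque()  # writes acknowledged but not yet replicated

    def write(self, key, value):
        # The write completes against the primary; replication is off the critical path.
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self, max_items=None):
        """Copy up to max_items queued writes to the secondary; return how many were copied."""
        n = len(self._pending) if max_items is None else min(max_items, len(self._pending))
        for _ in range(n):
            key, value = self._pending.popleft()
            self.secondary[key] = value
        return n

    def fail_over(self):
        """Promote the secondary; return keys whose latest writes were not yet replicated."""
        lost = [key for key, _ in self._pending]
        self._pending.clear()
        self.primary = dict(self.secondary)
        return lost


if __name__ == "__main__":
    store = AsyncGeoReplicatedStore()
    store.write("a", 1)
    store.write("b", 2)
    store.replicate(max_items=1)  # only "a" reaches the secondary
    store.write("c", 3)
    print("lost on failover:", store.fail_over())
```

The list returned by `fail_over` corresponds to the recovery point an application has to tolerate when it relies on asynchronous replication.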
Location of Customer Data • Customers may specify the geographic region in which their Data will be stored: • Asia: East and Southeast • Europe: North and West • United States: North Central, South Central, East, West • Microsoft will not transfer Customer Data outside the major geographic region(s) customer specifies (for example, from Europe to U.S. or from U.S. to Asia) except: • Where the customer configures the account to enable this, e.g., through use of the Content Delivery Network (CDN) feature • Where necessary for Microsoft to provide customer support, to troubleshoot the service, or comply with legal requirements • Microsoft does not control or limit the regions from which customers or their end users may access Customer Data • Microsoft may transfer Customer Data within a major geographic region (e.g., within Europe) for data redundancy or other purposes
SQL Data Sync • Goals of Data Sync • Synchronization of data between SQL Server databases and SQL Azure databases • Synchronization of data between two or more SQL Azure databases • Challenges • Preservation of transaction boundaries • Some schemas are not supported • No support for multiple versions Sync SQL Azure
SQL Azure HA Recommendations • Enable resiliency with application retry logic • Enable point-in-time recovery by maintaining several snapshots • Convert snapshots to BACPAC and store them as blobs to minimize storage cost • Enable geo-redundancy by exporting BACPAC(s) into multiple datacenters • Consider using blob geo-replication to minimize storage and bandwidth cost
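The first recommendation, application retry logic, might look like the following minimal sketch: retry an operation on transient failures with exponential backoff and jitter. `TransientError`, `call_with_retry`, and the default delays are assumptions made for illustration; a real application would map its database driver's transient error codes onto this pattern.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (throttling, dropped connections)."""


def call_with_retry(operation, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Run operation(), retrying on TransientError with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from many clients hitting the same database.
            sleep(delay + random.uniform(0, delay / 2))


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_query():
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise TransientError("connection dropped")
        return "ok"

    print(call_with_retry(flaky_query, base_delay=0.01))
```

Only errors known to be transient are retried; anything else propagates immediately so that genuine bugs are not masked by the retry loop.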
Roadmap: Evolution of HA in SQL Azure • User initiated geo-replication • Automatic replication and synchronization • Optional RPO enforcement • Read-only geo-secondary • Multiple geo-secondaries • User-controlled termination for failover
Roadmap: Point in time recovery • Backup to attached storage • Highly available • Restore to new database • Any point in time within retention period • Diagram labels: P, S
Windows Azure Traffic Manager • Load balance user traffic across hosted services running in same or different datacenters to build globally available, high performing apps • DNS based traffic management based on policies: Performance, Round-robin, Failover • Load-balancing • Endpoint monitoring • Improve app performance by serving user requests with services ‘closest’ to them • Improve app availability by automatically failing over when a service goes down • Diagram: www.foo.com CNAME foo.trafficmgr.cloudapp.net, with policies applied across three hosted services
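The three policies named on this slide can be sketched as endpoint-selection functions over a monitored endpoint list. Everything here (the `Endpoint` fields, the function names, the use of a single latency number) is a simplifying assumption for illustration and is not the Traffic Manager API or its actual selection algorithm.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str          # hosted service name (hypothetical)
    healthy: bool      # result of endpoint monitoring
    latency_ms: float  # assumed latency from the requesting user's location
    priority: int      # lower number = preferred in the failover policy


def _healthy(endpoints):
    alive = [e for e in endpoints if e.healthy]
    if not alive:
        raise RuntimeError("no healthy endpoints available")
    return alive


def pick_failover(endpoints):
    """Failover policy: the highest-priority endpoint that is still healthy."""
    return min(_healthy(endpoints), key=lambda e: e.priority)


def pick_performance(endpoints):
    """Performance policy: the healthy endpoint 'closest' to the user (lowest latency)."""
    return min(_healthy(endpoints), key=lambda e: e.latency_ms)


def pick_round_robin(endpoints, request_number):
    """Round-robin policy: rotate across healthy endpoints by request count."""
    alive = _healthy(endpoints)
    return alive[request_number % len(alive)]


if __name__ == "__main__":
    services = [
        Endpoint("us-service", True, 80.0, 1),
        Endpoint("eu-service", True, 20.0, 2),
    ]
    print(pick_failover(services).name, pick_performance(services).name)
    services[0].healthy = False  # monitoring marks the primary as down
    print(pick_failover(services).name)
```

In this sketch, unhealthy endpoints are filtered out before any policy runs, which is how failing over "when a service goes down" falls out of the same selection code.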
Application Design Best Practices • Deploy to multiple regions: • Specify locations of compute and storage resources • Capacity and app arch considerations • Route traffic intelligently with Traffic Manager: • “Performance” policy for active-active • “Failover” policy for active-passive • Synchronize data: • SQL Azure Backup and Data Sync • Other storage (custom-built replication)
Consider Your Application Portfolio • Mission critical • High impact • Low impact
Application Design Patterns • Redeploy on failure: • Single data center deployment • Everything ready for redeploy • Capacity as available • Active / passive: • Single data center active • Staged in additional data center(s) • Reserve capacity, scale as needed • Active / active: • Multiple data centers active • Use all of what you reserve • Optimize connections for performance • Balance: • Cost • Complexity • Recovery time • Recovery point • Reserved capacity • Plan it, test it: • Assets • People • Procedures • Connections • Dependencies
Ideal Approaches • Active/Active: Mission critical • Active/Passive: High impact • Redeploy on Failure: Low impact
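The pairing of portfolio tiers with deployment patterns can be captured as a small lookup, for example when tagging an application inventory. The table below restates the slide; the `recommended_pattern` helper and the sample application names are hypothetical.

```python
# Tier-to-pattern pairing as stated on the slide.
PATTERN_BY_TIER = {
    "mission critical": "active/active",
    "high impact": "active/passive",
    "low impact": "redeploy on failure",
}


def recommended_pattern(tier: str) -> str:
    """Return the deployment pattern the slide pairs with an application tier."""
    key = tier.strip().lower()
    if key not in PATTERN_BY_TIER:
        raise ValueError(f"unknown tier: {tier!r}")
    return PATTERN_BY_TIER[key]


if __name__ == "__main__":
    # Hypothetical application inventory.
    portfolio = {"order-processing": "Mission critical", "internal-wiki": "Low impact"}
    for app, tier in portfolio.items():
        print(f"{app}: {recommended_pattern(tier)}")
```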
Things To Think About • What components can be distributed and stateless? • Logic that needs to be site/instance-aware • Availability objectives versus cost • Cold/warm/hot standby • Synchronous or asynchronous replication, tolerance for loss
Important Considerations • Data stored in Windows Azure blobs and tables is automatically replicated to a peer data center • Can't access remote data until storage failover is complete • Microsoft decides when the failover occurs • Other data and applications are not replicated and do not automatically fail over between data centers • Maintain deployments in secondary data center to guarantee capacity
Using Windows Azure as a Disaster Recovery Site • Use Windows Azure for data backup • SQL Azure Sync • Backup data to blob store • 3rd party appliances • Run VMs in cloud or on-premises • Periodically back up VHDs in blob storage • Launch VMs in the cloud • Consider application architecture and dependencies • AD, databases, other services
Using Windows Azure as Online Backup with Windows Server 2012 • Diagram labels: Microsoft Online Backup Portal (Sign up & Billing); Microsoft Online Backup Service; 3rd Party Cloud with 3rd Party Online Backup Service (Sign up & Billing); IT Admin or VAP; Registration; Backup/Restore; Windows Server 2012 Backup (Extensible) with Inbox UI and Inbox Engine
Comprehensive Compliance Framework • Industry Standards and Regulations: • Payment Card Industry Data Security Standard • Health Insurance Portability and Accountability Act • Media Ratings Council • Sarbanes-Oxley, GLBA, etc. • Controls Framework: • Identify and integrate • Regulatory requirements • Customer requirements • Assess and remediate • Eliminate or mitigate gaps in control design • Test effectiveness and assess risk • Attain certifications and attestations • Improve and optimize • Examine root cause of non-compliance • Track until fully remediated • Predictable Audit Schedule • Certification and Attestations: • ISO/IEC 27001:2005 certification • SSAE 16 attestations
More Information: Windows Azure Trust Center • http://www.windowsazure.com/en-us/support/trust-center/ • One location to aggregate content across Security, Privacy, and Compliance
Summary • Multi-level failure handling built into Windows Azure platform • Platform provides you building blocks to use in your app • You have to architect your app for high availability • Availability objectives versus cost • Design to operate seamlessly through failures • Windows Azure continues to invest in high availability
Related Content • DBI334: Business Continuity Solutions in Microsoft SQL Azure Find Me Later At the TLC
Track Resources • @WindowsAzure • @ms_teched • Hands-On Labs • Meetwindowsazure.com • DOWNLOAD Windows Azure: Windowsazure.com/teched
Resources • Connect. Share. Discuss.: http://northamerica.msteched.com • Learning: Microsoft Certification & Training Resources, www.microsoft.com/learning • TechNet: Resources for IT Professionals, http://microsoft.com/technet • Resources for Developers: http://microsoft.com/msdn
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.