290 likes | 470 Views
DBI314. Microsoft SQL Azure Performance Considerations and Troubleshooting. Henry Zhang Senior Program Manager Microsoft. Objectives. Performance Consideration beyond the Database Network Latency Load Balancer and Cluster-wide Management Throttling Service Drill down
E N D
DBI314 Microsoft SQL Azure Performance Considerations and Troubleshooting Henry Zhang Senior Program Manager Microsoft
Objectives • Performance Consideration beyond the Database • Network Latency • Load Balancer and Cluster-wide Management • Throttling Service Drill down • DMVs and E2E Monitoring • Perf Baseline Facilitates Meaningful Comparison • Leverage Elasticity
Shared infrastructure at SQL database and below Massively distributed cluster w/ commodity hardware Scalable HA technology provides the glue Each SQL Azure DB has 3 replicas Automatic replication and failover Gateway Service forwards TDS requests Review - SQL Azure Architecture Logical Server Machine 4 Machine 5 Machine 6 SQL Instance SQL Instance SQL Instance SQL DB SQL DB SQL DB UserDB1 UserDB1 UserDB1 UserDB2 UserDB2 UserDB2 UserDB3 UserDB3 UserDB3 UserDB4 UserDB4 UserDB4 SQL Azure Gateway Service Scalability and Availability: Fabric, Failover, Replication, and Load balancing
Microsoft Azure Data Centers World Wide North Europe North Central US West Europe South Central US East Asia Southeast Asia
Network Latency Latency_2 Latency_1 • Network Latency: • Userand Application • Applicationand SQL Azure DB • Perceived performance: • Response Time= 2x(Latency_1 + Latency_2) + Query_Exec_Time • Optimization • Minimize latency 1: select data center closest to majority of your users • Minimize latency 2: co-locate with Windows Azure application • Minimize network round trips SQL Azure
SQL Azure Blog Post: Testing Client Latency to SQL Azure http://blogs.msdn.com/b/sqlazure/archive/2010/05/27/10016392.aspx
Resource Management and Multitenancy • Resource shared on machine with neighbor databases • CPU, memory, data/log spindles • TempDB, worker threads, network • Neighbors: size and activity can affect your DB • Multi-tenancy management provided in SQL Azure • Load Balancer • Throttling Service
Load Balancer • Balance resource utilization across all machines • Minimize overloaded machines and reduce throttling • Swap vs. move mechanisms • Runs periodically, solves long term imbalance for cluster • Reactive Load Balancer solves short term spikes • React to spikes before the next regular LB run • Alleviate high throttling occurrences on hot machines • Local optimization, fast solution
Resource Throttling in SQL Azure • Throttling Service • Protect a machine from sustained high usage of system resources • Evaluate actual resource usage vs. safe thresholds real-time • Throttle the busiest DBs first (soft throttle) • Throttle every DB if necessary (hard throttle) • Throttling show as connection error 40501 • “The service is currently busy. Retry the request after 10 seconds. Code: %d.” • Decode throttling code for more insight
Decoding Throttling Code Code = 131075 Why am I throttled? How bad is it? Step 1: Reasons = Code/256 = 512 Throttling Impact = Code % 4 If remainder is 0: No throttling 1: Reject Update/Insert 2: Reject All Writes 3: Reject all Step 2: Convert Reasons to binary 512 => 1000000000(2) Throttling Type – Hard vs. Soft 00: not throttled on this resource 01: soft throttled on this resource 10: hard throttled on this resource Step 3: Group in sets of 2 digits from right to left: 10|00|00|00|00(2) Resource Code 0: Physical Database Space 1: Physical Log Space 2: LogWriteIODelay 3: DataReadIODelay 4: CPU 5: Database Size 6: Internal 7: SQL Worker Threads 8: Internal Example: Resource Code: (4) - CPU throttling Throttling Type: (10)- Hard throttling Conclusion: CPU Hard throttling
Throttling – Scenario 1 • Customer A using 30% CPU on a machine • Customer B kicks of load of 70% additional CPU on the same machine • Customer B gets throttled Solution: Yes. Load balancer moves A or B away from this machine Throttling Trigger: B Throttling Victim: B Fairness: Fair to throttle B - B uses more CPU than A - B triggered throttling on the machine
Throttling – Scenario 2 • Customer A using 70% CPU on a machine • Customer B kicks of load to 30% additional CPU on the same machine • Customer A gets throttled Solution: Yes. Load balancer moves A or B away from this machine Throttling Trigger: B Throttling Victim: A Fairness: Not quite fair to throttle A - B triggered throttling on the machine
Throttling – Scenario 3 • Machine has no active workload • Customer A kicks of load to 100% CPU and gets throttled repeatedly • Customer A gets throttled Solution: No. A will get throttle anywhere it is placed. A exceeds a machine’s total CPU Throttling Trigger: A Throttling Victim: A Fairness: Fair from system perspective but customer will not be happy =( Customer A needs to optimize and reduce resource usage to fit within a SQL Azure machine
Get the Most out of Throttling Information • Monitor throttling reasons for your DB, find distribution • Use throttling code to identify potential inefficiency in DB • May need to scale out to more than 1 DB
DMVs and Monitoring • 10 Perf Related DMVs: select * from sys.all_views where name like '%dm%' • DMV data mapped to proper userDB context • Works identical to SQL Server 2008 DMVs • Update Statistics Supported • Minimize Index Fragmentation • Profiler not yet, think E2E monitoring • DMV Examples
DMV Example: Find Total DB Storage Used select sum(reserved_page_count)*8.0/1024 AS [Storage_in_MB] from sys.dm_db_partition_stats
DMV Example: Find CPU Intensive Queries select highest_cpu_queries.total_worker_time, q.text AS [Query_Text], highest_cpu_queries.plan_handle from (select top 50 qs.plan_handle, qs.total_worker_time from sys.dm_exec_query_statsqs order by qs.total_worker_timedesc) as highest_cpu_queries cross apply sys.dm_exec_sql_text(plan_handle) as q order by highest_cpu_queries.total_worker_timedesc
DMV Example: Find IO Intensive Queries select top 25 (total_logical_reads/execution_count) as avg_logical_reads, (total_logical_writes/execution_count) as avg_logical_writes, (total_physical_reads/execution_count) as avg_phys_reads, Execution_count, sql_handle, plan_handle from sys.dm_exec_query_stats order by (total_logical_reads + total_logical_writes) Desc
Setting Performance Expectations • Cloud does not SOLVE perf problems • Cloud does not guarantee same perf compared to on-prem • Hardware is different • Multi-tenancy environment • Network latency • Know your on-prem DB well before migrating to SQL Azure
Get a Baseline before We Compare Perf • Baselining on-prem Performance • What is the on-prem hardware spec? • Data size in on-prem testing? • Use of DOP? • Concurrent txns? • How chatting is the middle-tier to the DB? • Has anything changed after moving to SQL Azure? • A busy DB may exceed the hardware limits of 1 machine • Think Scale out
Leveraging Elasticity for New DB Applications • Traditional Capacity Planning = Buy sufficient hardware • SQL Azure Capacity Planning = Determine number of DBs needed • Create DB = Get more resource • Drop DB = Release resource • When to create/drop? • Use DB Copy to Separate Read/Write Workload • Partition Aware Middle-tier • Build 2 level Composite Key for Federation/Scale out • Customer ID • Month • Composite Key = (bigint) [customer_ID + MonthYear_Key]
Required Slide Speakers, please list the Breakout Sessions, Interactive Discussions, Labs, Demo Stations and Certification Exam that relate to your session. Also indicate when they can find you staffing in the TLC. Related Content • DBI 403 -Building Scalable Database Solutions Using Microsoft SQL Azure Database Federations • DBI 313 - Building Scalable Database Solutions Using Microsoft SQL Azure Database Federations
Required Slide Track PMs will supply the content for this slide, which will be inserted during the final scrub. Database Platform (DAT) Resources • Visit the updated website for SQL Server® Code Name “Denali” on www.microsoft.com/sqlserverand sign to be notified when the next CTP is available • Follow the @SQLServer Twitter account to watch for updates • Try the new SQL Server Mission Critical BareMetal Hand’s on-Labs • Visit the SQL Server Product Demo Stations in the DBI Track section of the Expo/TLC Hall. Bring your questions, ideas and conversations!
Resources • Connect. Share. Discuss. http://northamerica.msteched.com Learning • Sessions On-Demand & Community • Microsoft Certification & Training Resources www.microsoft.com/teched www.microsoft.com/learning • Resources for IT Professionals • Resources for Developers http://microsoft.com/technet http://microsoft.com/msdn
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.