260 likes | 437 Views
Windows Azure SQL Database (WASD) Troubleshooting. I will assume basic SQL Server knowledge. Bob Ward Principal Architect Escalation Engineer bobward@microsoft.com. My Goals for You Today. What Will We Cover Today. The Azure Troubleshooting Challenge. WASD is a platform service (PAAS)
E N D
Windows Azure SQL Database (WASD) Troubleshooting I will assume basic SQL Server knowledge Bob Ward Principal Architect Escalation Engineer bobward@microsoft.com
The Azure Troubleshooting Challenge • WASD is a platform service (PAAS) • This is not a VM running SQL Server “box” (IAAS) • Multi-tenant platform • You are sharing a SQL instance with other databases from other customers • You are abstracted from the SQL Server instance, Windows, and computer server • Less admin tasks means lower TCO but also means less access • You are isolated to a specific database • You have a logical server and a master but most things are done in your database • Most things are database scoped (Ex. DMVs) • We make decisions to maximize all database availability • Application design may be required • The service can be updated far quicker than the “box” product
Use min 30sec login timeout WASD Connectivity Errors
Network latency Example Connectivity Errors Be sure to give this to support 40XXX errors unique to WASD May see this after deleting a server After getting dropped on idle connection
Troubleshooting Connectivity History tables – not real time
Demo Tools for Connectivity
WASD Errors full list here These can result in connection termination and possible future rejection of work
The partition is in transition and transactions are being terminated. SHUTDOWN is in progress. Failover
Governance • Max number of concurrent worker threads (currently 180) per database • Msg 10928 if you exceed the limit • Connection terminated. Retry when your concurrent work subsides • Check for blocking problems or inefficient queries • Msg 10929 if the overall system has too many workers • You may get less than 180 max • Connection terminated. You can retry but it may take longer to stabilize • Still could be an application issue but a service issue could also be occurring Resource ID : 1 = worker threads
Quotas • Quota errors for space used • Msg 40544 when you run out of space for your max size for your db • Only reads and DELETE/DROP allowed until you free up space • Use sys.dm_db_partition_stats to find what is consuming space • Solutions • Increase max size • Delete data or drop tables/indexes • Partition out database • But…freeing up may not be immediately recognized Changing MAXSIZE disconnects all users
Throttling Limits Rebuild index Online
Engine Throttling • This is more of a legacy monitoring method used to keep instances healthy • Another external service monitors the health of the instance and computer • Soft throttling – we have detected a resource issue so pick specific databases • Hard throttling – entire instance at risk so all databases are affected • How it Works • Existing requests run to completion • New requests for existing connections and new connections may get Msg 40501 and connection terminated depending on type of request • Reason code in Error has more details on soft vs hard, what will be rejected, and why • throttling in sys.event_log 0x8003 x03 = RejectAll x80 = Hard Throttling on I/O Decode reason codes Another resource
“Not Supported” Errors • USE <db> not supported – specify when connecting • ALTER DATABASE supported minimally (Ex. Name, Edition, MAXSIZE, READ_ONLY) • All DBCC commands not supported except for DBCC SHOW_STATISTICS • Database scoped DMVs supported • Feature Support for Windows Azure SQL Database • Unsupported Transact-SQL Statements (Windows Azure SQL Database) • Partially Supported Transact-SQL Statements (Windows Azure SQL Database)
Demo Using Event Tables to Troubleshoot WASD Errors
Demo Troubleshooting Query Performance on WASD
Before you contact support • We can do RCA but…. • It can take some time and we may • not have enough history
References • Retry Logic for Transient Failures in Windows Azure SQL Database • Error Messages (Windows Azure SQL Database) • Windows Azure SQL Database Performance and Elasticity Guide • Windows Azure SQL Database Connection Management • sys.event_log documentation • CSS SQL Escalation Blog • Troubleshoot and Optimize Queries with Windows Azure SQL Database
Questions? http://sdrv.ms/Zqdkex Thank you!
The Troubleshooting Checklist • Does the Windows Azure Portal work and list your databases? • Is there a dashboard posting for an outage in your region? • Does the SQL Management Portal work? • Does SQL Server Management Studio work? • Is there an internet provider issue? • Is your firewall configuration correct? • Is the problem Windows Azure vs WASD? • Is there blocking? • Are your queries and index tuned? • Is this really an application retry issue? • Governance, quotas, limits, and throttling are “part of this platform” • Have you looked at Event Tables?