70 likes | 81 Views
Learn optimization strategies and pitfalls in Transact-SQL query tuning to enhance performance. Understand optimizer strengths and weaknesses, common mistakes to avoid, and how to improve query efficiency.
E N D
Tuning Transact-SQL Queries • Learn the Strengths and Weaknesses of the Optimizer • One of the largest factors determining performance is T-SQL! Test not only forefficient plans but also semantic correctness ! • Optimizer “Plans” Join Order 4 Tables at a Time • Every combination of “4-table permutations” is costed (e.g. {t1,t2,t3,t4}, {t1,t2,t3,t5}, {t1,t2,t3,t6}, {t1,t2,t4,t5}, {t1,t2,t4,t6}, {t2,t1,t3,t4}, etc.). • Best “outer” table is saved and the remaining combinations are “costed” to determine “next outer-most” table until done. • Scrutinize these types of queries for best possible plan. • Adding redundant predicates (eg. where a = b and b = c and a = c) gives the optimizer more choices! .
Tuning Transact-SQL Queries • Learn the Strengths and Weaknesses of the Optimizer • Avoid the following, if possible : • Mathematical Manipulation of SARGsSELECT name FROM employee WHERE salary * 12 > 100000 • Incompatible Datatypes (Columns, SARGs, or SPROC Parameters)Float & Int, Char & Varchar, Binary & Varbinary are Incompatible Int & Intn (allow nulls) OK • Multiple “OR” Clauses (especially on different columns in same table)If any portion of the OR clause requires a table scan, it will ! OR strategy requires cost of creating and sorting a work table. Evaluate UNIONs as an alternative ! • Not Using the Leading Index Keys (unless query is covered) Without leading key, B-tree index can’t be searched ! • Not Equal Expressions (!=)
Tuning Transact-SQL Queries • Subquery Processing in SQL Server • Subquery Flattening (Transform to Normal or Existence Join) • Normal joins execute more quickly because best join order can be chosen • Existence joins stop after first match • Queries that CAN be flattened include: • Many IN, ANY, and EXISTS subqueries • Expression subqueries (ie., column {<, <=, >, >=, !=, =} subquery) with unique joins or returning unique columns • Queries that CAN NOT be flattened include: • Most NOT IN, NOT EXISTS, ALL subqueries • IN, ANY, EXISTS subqueries in an OR clause • IN, ANY, EXISTS subqueries in a correlated subquery with aggregates • Expression subqueries without unique joins or not returning unique columns • Subquery Materialization • Evaluate subquery once before outer query and store results for later use • Makes sense only when subquery will evaluate to same result for every outer row! • Non-correlated expression subqueries • select title_id from titles where total_sales = (select max(total_sales) from sales) • Quantified subqueries (ie., IN, ALL, ANY, EXISTS) containing aggregates • select name from employees where salary IN (select max(salary) from employee group by office)
Tuning Transact-SQL Queries Simple Tricks to Use for Good Performance Use “>=“ rather than “>“ Whenever Possible ý “select * from foo where x > 3” must scan all index pages where x=3 just to find the first qualified row! þ “select * from foo where x >= 4” can go directly to first qualified row. n Use “EXISTS” and “IN”rather than “NOT EXISTS” and “NOT IN” (or COUNT) þ Faster in both subqueries and IF statements þ Easy to re-write sprocs using EXISTS or IN. For example, if not exists (select * from ...) begin ... /* statement group */ end could be re-written as : if exists (select * from ...) begin goto exists_label end ... /* statement group */ exists_label: ... þ EXISTS stop after 1st match as opposed to COUNT which does all the I/O to count!
Tuning Transact-SQL Queries • Creating Tables in Stored Procedures • n When tables are created in the same stored procedure in which they are used, • the optimizer can’t know their size. • n In these cases, the optimizer assumes the table has 10 data pages & 100 rows • ý This assumption can lead to poor plans when the table is large. • ý Previously recommended creating the table outside the stored procedure in • which it is used. This allows the optimizer to see its real size. For example, • create proc p as • select * into #huge_result from ... • select * from foo, #huge_result where ... • /* may result in #huge_result as outer table */ • can be re-written as : • create proc p as • select * into #huge_result from ... • exec s • create proc s as • select * from foo, #huge_result where ...
Tuning Transact-SQL Queries Use Parameters, Not Local Variables, in WHERE Clauses n The optimizer can’t predict the value of a declared variable at compile-time. n The optimizer does know the value of a parameter to the sproc at compile-time perhaps leading the “right” plan. n To avoid using local variables in a WHERE clause, split up the sproc whenever possible (see warning on previous page). For example, create proc p as declare @x int select @x = col1 from foo1 where ... select * from foo2 where col2 = @x can be re-written as : create proc p as declare @x int select @x = col1 from foo1 where ... exec s @x create proc s @x int as select * from foo2 where col2 = @x
Tuning Transact-SQL Queries • Enhanced SHOWPLAN in SQL Server • Shows the plan chosen as “most efficient” by the optimizer. Run all queries through during development & testing to ensure accurate access model and performance • SELECT title_id FROM titles WHERE total_sales > ALL (SELECT total_sales FROM titles WHERE type = “business”) • QUERY PlAN FOR STATEMENT 1 (at line 1). STEP 1 The type of query is SELECT FROM TABLE titles Nested iteration Table scan, Ascending scan, Positioned at start of table Run Subquery 1 (at nesting level 1) Using I/O Size 16kbytes, With LRU Buffer Replacement strategy NESTING LEVEL 1 SUBQUERIES FOR STATEMENT 1 QUERY PLAN FOR SUBQUERY 1 (at nesting level 1 and at line 2) Correlated Subquery, Subquery under an ALL predicate STEP 1 The type of query is SELECT Evaluate ungrouped ANY AGGREGATE FROM TABLE titles EXISTS TABLE: nested iteration Table scan, Ascending scan, Positioned at start of table Using I/O Size 16kbytes, With LRU Buffer Replacement strategy END OF QUERY PLAN FOR SUBQUERY 1 • n Verify Proper Index Selections. Watch out for table scans. • n Verify Proper Join Order. Watch out for large tables. • n Verify Proper I/O Block Size and Cache Strategy. • n Verify Proper Subquery Optimizations • n Look at TEMPDB usage. May require optimizations for multi-user issues.