Transact-SQL Tips and Tricks

Required Slide SESSION CODE: DAT405 Transact-SQL Tips and Tricks Itzik Ben-Gan Mentor and Co-Founder Solid Quality Mentors Tobias Ternstrom Senior Program Manager Lead SQL Server Engine

Agenda • 75 mins • Intro • 5 minutes • Warm-up, a couple of Tips to get started! • 15 minutes • Fetching Query Results • 30 minutes • Why it does matter! • When can T-SQL Cursors actually make sense (including coverage of CLR alternative)? • 25 minutes

Warm-Up Tips • Virtual Auxiliary Table of Numbers • Packing Date Intervals

Virtual Auxiliary Table of Numbers • A helper table of numbers is a very useful tool • Problem: can’t always create a real one due to policies • Solution: can generate a virtual auxiliary table of numbers on the fly

Definition of GetNums Function CREATEFUNCTIONdbo.GetNums(@n ASBIGINT)RETURNSTABLE AS RETURN WITH L0 AS(SELECT 1 AS c UNIONALLSELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSSJOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSSJOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSSJOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSSJOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSSJOIN L4 AS B), NumsAS(SELECTROW_NUMBER()OVER(ORDERBY (SELECTNULL))ASn FROM L5) SELECTTOP (@n) n FROMNumsORDERBY n; GO

Packing Date Intervals • No built-in support for temporal interval types and operators • Packing is a common need with intervals • Business case: return periods of project activity

Solution to Packing Intervals WITH Dates AS ( SELECTDATEADD(day, n-1, @from)ASdt FROMdbo.GetNums(DATEDIFF(DAY, @from, @to)+ 1)ASNums ), Groups AS ( SELECTD.dt, DATEADD(day,-1*DENSE_RANK()OVER(ORDERBYD.dt),D.dt)ASgrp FROMdbo.ProjectsAS P JOIN Dates AS D OND.dtBETWEENP.start_dateANDP.end_date ) SELECTMIN(dt)ASstart_period,MAX(dt)ASend_period FROM Groups GROUPBYgrp;

Fetching Query Results • Pull vs. Push model for fetching Query Results • Dynamic vs. Static Execution plans • What about Performance? • Latency • Throughput • Scalability • What about Result set Stability?

Pull vs. Push model for fetching Query Results • Push model • Fire hose ”Cursor” • Pull model (each fetch is a ”Fire hose”) • Static Cursor • Keyset Cursor • Dynamic Cursor • Fast Forward Cursor • What about MARS?

Dynamic vs. Static Execution plans • Dynamic plan • Is restartable • Requires fully streaming plan • Static plan • Cannot be restarted • Does not require streaming plan • Dynamic Cursors require a Dynamic plan • Static & Keyset Cursors use Static plans • Fast Forward Cursors uses ”the best option”

What about Performance? • Throughput • Latency • Scalability

Throughput • Default result set • Best possible • Default optimization performed by QO is for throughput • Cursor using Dynamic plan • Not optimized for throughput • May use a non-optimal plan • Cursor using Static plan • Uses the same plan as the Default result set but always adds extra spooling operation at the end decreasing throughput

Latency • Default result set • Can be optimized for low latency by changing query or using query hints • Cursor using Dynamic plan • It depends, did you get a “good” streaming plan? • Cursor using Static plan • Never as good as default result set • Always spools the entire result set

Scalability “Dormant result sets consume threads and memory” • Default result set • Experiences this problem • MARS? • Cursor using Dynamic plan • Uses a restartable plan so does not use resources when not fetching rows • Cursor using Static plan • Does not experience threading & memory problems but: • Can experience tempdb-contention

What about Result set Stability? “The stability of result sets vary” Example:(using default isolation level) CREATE TABLE TestStability (RowNo INT NOT NULL PRIMARY KEY, Value INT NOT NULL); • SELECT * FROM MyTableWHERE RowNo BETWEEN 1000 AND 4000 • May see changes that occurred after the first row was returned by the query • SELECT * FROM TestStabilityWHERE RowNo BETWEEN 1000 AND 4000ORDER BY Value • Will not be able to see changes that occurred after the first row was returned by the query • SELECT Value, COUNT(*) FROM TestStabilityGROUP BY Value • Will not be able to see changes that occurred after the firstrow was returned by the query

What about Result set Stability? • Default result set • Unpredictable, may or may not be sensitive • Because of network buffer • Because of stop & go operators in exec. plan • Cursor using Dynamic plan • Unpredictable, may or may not be sensitive • Because of network buffer • May degrade to static cursor • Cursor using Static plan • Stable, always insensitive to data changes(as of full population of temp. table)

So, does Stability matter? • There are different types of consumers • Hungry Reader • Always consumes the entire result “immediately” • Slow Reader • Slowly consumes the result (spends at least X ms on each row) • Lazy Reader • Consumes results as the user requires it (think paging in a client application)

Variations on Reader requirements • A Reader can be either one of … • … “Doesn’t care” Reader • Uninterested in getting dynamic results (seeing what is happening “in front of the cursor”) • … Dynamic Reader • Requires dynamic results • … Static Reader • Requires static results

Which gives us the following options • Hungry • “Doesn’t care” Reader • Static Reader • Dynamic Reader • Slow • “Doesn’t care” Reader • Static Reader • Dynamic Reader • Lazy • “Doesn’t care” Reader • Static Reader • Dynamic Reader

Hungry • “Doesn’t care” Reader • Static Reader • Should use SNAPSHOT isolation • Dynamic Reader • Doesn’t make sense • Slow • “Doesn’t care” Reader • Static Reader • Should use SNAPSHOT isolation • Dynamic Reader • Lazy • “Doesn’t care” Reader • Static Reader • Should use SNAPSHOT isolation • Dynamic Reader

… which brings us to … • Hungry • “Doesn’t care” Reader • Slow • “Doesn’t care” Reader • Dynamic Reader • Lazy • “Doesn’t care” Reader • Dynamic Reader

… • Hungry • “Doesn’t care” Reader • Slow • “Doesn’t care” Reader • Dynamic Reader • A dynamic reader should be using dynamic paging, i.e. re-execute the query • Lazy • “Doesn’t care” Reader • Dynamic Reader • A dynamic reader should be using dynamic paging, i.e. re-execute the query

…and now we are finally here: • Hungry Reader • Slow Reader • Lazy Reader • If you require an insensitive result set • You should use SNAPSHOT isolation • If you require a dynamic result set • You should re-execute the query • I.e. Result set stability matters,but you don’tuse a specific cursor type to achieve it

T-SQL Cursors • Cursors - why not by default? • Sensible use of cursors • Examples

Cursors – Why Not By Default? • Contradicts the Relational Model—not set-based • Cursors incur high overhead for each record manipulation • Usually more code than set-based solutions (how vs. what)—maintenance burden

Sensible Use Of Cursors • The process is iterative by nature—e.g., need to run a procedure per row • Still, consider using APPLY if procedure logic can be expressed as table function • Performance—when algorithmic complexity (scaling) of best set-based solution is problematic • Still, keep revisiting the problem and see if a good performing set-based solution can be found

Examples • Calculating maximum number of concurrent sessions • Running totals

Calculating Maximum Number of Concurrent Sessions • Given a set of intervals return the maximum number of intervals that were active concurrently • Business case example: per-user license 3

Calculating Maximum Number of Concurrent Sessions • Solutions: • Traditional set-based: quadratic algorithmic complexity—very slow! • Cursor-based: linear complexity • Behold: new set-based solution with linear complexity! (credits: Ben Flanaghan, Arnold Fribble, and RBarryYoung)

Benchmark Results

Running Totals • Very common need! • Business examples: • Account balance • Inventory • Cumulative sales values • Solutions: • Traditional set-based using subqueries/joins: quadratic complexity—very slow! • Cursor-based: linear complexity • Hopefully SQL Server will add support for missing window functions functionality in the future: will allow optimal set-based solution with linear complexity

Benchmark Results

CLR Alternative to T-SQL Cursors • A .NET SqlDataReader is a form of a cursor—only much more efficient than the T-SQL one…

Required Slide Track PMs will supply the content for this slide, which will be inserted during the final scrub. DAT Track Scratch 2 Win • Find the DAT Track Surface Table in the Yellow Section of the TLC • Try your luck to win a Zune HD • Simply scratch the game pieces on the DAT Track Surface Table and Match 3 Zune HDs to win

Required Slide Resources Learning • Sessions On-Demand & Community • Microsoft Certification & Training Resources www.microsoft.com/teched www.microsoft.com/learning • Resources for IT Professionals • Resources for Developers • http://microsoft.com/technet • http://microsoft.com/msdn

Required Slide Complete an evaluation on CommNet and enter to win!

Sign up for Tech·Ed 2011 and save $500 starting June 8 – June 31st http://northamerica.msteched.com/registration You can also register at the North America 2011 kiosk located at registrationJoin us in Atlanta next year

© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Required Slide

Transact-SQL Tips and Tricks