SQL Server for the .NET Developer

with Michelle Ufford SQL Server for the .NET Developer

About Michelle • Sr. DBA for GoDaddy.com • Working with SQL for 6 years, focused on performance tuning and VLDB’s • Member of the PASS Performance SIG • President, 380PASS SQL User Group - http://380pass.org • Active blogger at http://sqlfool.com • Twitters at http://twitter.com/sqlfool

Table Structures • The general syntax for creating a table is: CREATE TABLE [schema].[tableName] ( <column definition> ); i.e. CREATE TABLE dbo.myTable ( id int NOT NULL , lastName varchar(20) NOT NULL , firstName varchar(20) NULL ); • In general, narrow tables perform better.

What the Heck is a Heap? Heap Table LastNameFirstName Carson Amanda Smith John Adams Mike Thomson Laura Johnson Brian Ordered Table (Clustered Index) LastNameFirstName Adams Mike Carson Amanda Johnson Brian Smith John Thomson Laura • ordered data • can have fast writes (depends on if clustering key is sequential) • fast reads (fastest reads when searching on the clustering key) • recommended table structure! • un-ordered data • fast writes • slow reads • not typically recommended

Creating a Clustered Index • General syntax for creating a clustered index: CREATE [UNIQUE] CLUSTERED INDEX <index name> ON <table name>(<column list>); i.e. CREATE UNIQUE CLUSTERED INDEX CIX_myTable ON dbo.myTable(id); • Best practice is to create a clustered index on a narrow column with static values • Try to create a unique clustered index for a little better performance

Creating a Primary Key • A primary key enforces uniqueness within a table • General syntax for creating a primary key: CREATE TABLE [schema].[tableName] ( <column definition> CONSTRAINT <constraint name> PRIMARY KEY [NONCLUSTERED|CLUSTERED](<column list>)); i.e. CREATE TABLE dbo.myTable ( id int NOT NULL , lastNamevarchar(20) NOT NULL , firstNamevarchar(20) NOT NULL CONSTRAINT PK_myTable PRIMARY KEY NONCLUSTERED(id));

Clustered Index vs Primary Key • Pay attention to where you’re placing the primary key! • CREATE TABLE dbo.employee • ( id int NOT NULL • , ssnint NOT NULL PRIMARY KEY • , lastNamevarchar(20) NOT NULL • , firstNamevarchar(20) NOT NULL ); SSN ID LastNameFirstName 123-45-6789 8 Carson Amanda 212-12-1212 3 Smith John 323-23-2323 5 Adams Mike 424-24-2424 1 Thomson Laura 505-05-0505 4 Johnson Brian SSN ID LastNameFirstName 123-45-6789 8 Carson Amanda 200-98-7653 9 Edwards Frank 212-12-1212 3 Smith John 323-23-2323 5 Adams Mike 424-24-2424 1 Thomson Laura 505-05-0505 4 Johnson Brian New Hire

Clustered Index vs Primary Key • Creating a table with a non-clustered primary key and separate clustered index: CREATE TABLE dbo.employee ( id int Identity(1,1) NOT NULL , ssn int NOT NULL , lastName varchar(20) NOT NULL , firstName varchar(20) NOT NULL CONSTRAINT PK_employee PRIMARY KEY NONCLUSTERED(ssn)); CREATE UNIQUE CLUSTERED INDEX CIX_employee ON dbo.employee(id);

Choosing The Right Data Types • Use the most appropriate data type to meet the requirements of the column • Use the smallest data type while allowing a little room for growth • If the data type is too big, you’re… • wasting space • wasting IO • wasting time

The WRONG Data Type • Assume you have 1 million records to store: a unique id, a number 1 to 100, and a timestampCREATE TABLE dbo.myBigTable ( myID INT IDENTITY(1,1) , myNumber INT , myDate DATETIME ); • Bytes per row: 4 + 4 + 8 + 10 = 26 bytes (myID + myNumber + myDate + overhead) • Rows per page: 8,060 / 26 = 310 rows • Pages per 1mm rows: 1,000,000 / 310 = 3225 pages

The RIGHT Data Type • We’re still storing 1 million records:CREATE TABLE dbo.mySmallTable ( myID INT IDENTITY(1,1) , myNumber TINYINT , myDate SMALLDATETIME ); • Bytes per row: 4 + 1 + 4 + 10 = 19 bytes (myID + myNumber + myDate + overhead) • Rows per page: 8,060 / 19 = 424 rows • Pages per 1mm rows: 1,000,000 / 424 = 2357 pages

The Results • myBigTable consumed 3,225 pages • mySmallTable consumed 2,357 pages • mySmallTable used 868 less pages! • Space savings of 27% • A single IO can now return 114 more rows • Performance improvement of 37%

Non-Clustered Indexes • Text Book example: Clustered Index Non-Clustered Index Book Page Appendix Adams, Mike Carson, Amanda Edwards, Frank Johnson, Brian Roberts, Amanda Smith, John Thomson, Laura Amanda  2, 5 Brian  4 Frank  3 John  6 Laura  7 Mike  1

Creating a Non-Clustered Index • General syntax for creating a non-clustered index: CREATE [UNIQUE] NONCLUSTERED INDEX <index name> ON <table name>(<index keys>) INCLUDE (<column list>); i.e. CREATE NONCLUSTERED INDEX IX_employee_ssn ON dbo.employee(ssn) INCLUDE (lastName, firstName); • Too may indexes can make writes expensive

Included Columns Non-key columns on the leaf level of an index Not used as search criteria in an index SELECT employeeID, lastName, firstNameFROM dbo.employeeWHERE hireDate < ‘2009-01-01’ CREATE NONCLUSTERED INDEX IX_employee_hireDate ON dbo.employee(hireDate) INCLUDE (lastName, firstName); hireDate = index key employeeID = clustered key pointerlastName, firstName = included column

Half-Time Show! Want free hosting? First person to tweet… "@GoDaddyGuy I wantGoDaddy.com hosting!” …will win free hostingfor a year! 

What Is Fragmentation? • When the logical ordering of pages does not match the physical ordering of pages • Both clustered and non-clustered indexes can become fragmented • Fragmentation is caused by insert/update/delete operations

How Does Fragmentation Occur? • Page 4 • Page 5 • Page 6  • Page 3 • Page 4 • Page 5  • Page 3 • Page 4 • Page 742 • Page 4 • Page 742 • Page 5  • Page 742 • Page 5 • Page 6  Insert • Page 3 • Record 1 • Record 2 • Record 3 • Record 4 • Page 5  • Page 3 • Record 1 • Record 2 • Page 742 • Page 4 • Record 5 • Record 3 • Record 4 • Page 5  Insert Page 4 Page 4 Page 742

How Do I Find Fragmentation? • SQL Server 2000 • DBCC ShowContig • SQL Server 2005 / 2008 • Dynamic Management Views (DMV) • sys.dm_db_index_physical_stats • Execute after normal business hours or use “Limited” to avoid impacting the environment

How Do I Fix Fragmentation? • SQL Server 2000 • DBCC IndexDefrag - quicker • DBCC dbReindex - more thorough • SQL Server 2005 • Alter Index {Reorganize | Rebuild} • My Index Defrag Script: http://sqlfool.com/2009/03/automated-index-defrag-script/

Stored Procedures • Stored procedures offer: • enhanced security • better performance • easier maintenance • less network traffic • Best practice for accessing data in SQL Server

Tuning Stored Procs • Select only the columns you actually need • Try to use an existing covering index whenever possible, but… • Don’t create an index for every query! • Place Data Definition (DDL) statements first, then Data Modification (DML) • Specify the schema owner (i.e. “dbo”, “sys”) for every referenced object

Tuning Stored Procs • Consider MAXDOP restrictions • Consider NOLOCK hints (only if dirty reads are acceptable) • Use “SET NOCOUNT ON;” • Avoid nested proc calls • Keep transactions small

Tips for Developing Applications • Minimize database calls whenever possible • When working with static data, retrieve data once and store it in the app (i.e. hash table) • When developing high-volume applications: • Batches are key! More manageable updates • Use table-valued parameters (TVP) or XML to handle bulk operations • Staging tables vs single row updates

Where To Go When You Need Help • Books Online,http://msdn.microsoft.com/en-us/library/ms130214.aspx • SQLServerPedia, http://sqlserverpedia.com • SQL Server Performance, http://www.sql-server-performance.com • SQL Server Central, http://www.sqlservercentral.com/ • My blog! http://sqlfool.com

Other Resources • Batch Compiliation, Recompliation, and Plan Caching Issues in SQL Server 2005http://technet.microsoft.com/en-us/library/cc966425.aspx • SQL Server 2005 Waits & Queues (Google) • Grant Fritchey’s SQL Server Execution Plans • Kalen Delaney’s SQL Server 2008 Internals • Itzik Ben-Gan’s SQL Server 2008 T-SQL Fundamentals

Questions? Thank you for attending my presentation!  I can be reached at: Michelle Ufford michelle@sqlfool.com http://sqlfool.com

SQL Server for the .NET Developer