330 likes | 355 Views
Module 6: Planning Indexes. Overview. Introduction to Indexes Index Architecture How SQL Server Retrieves Stored Data How SQL Server Maintains Index and Heap Structures Deciding Which Columns to Index. Introduction to Indexes. How SQL Server Stores and Accesses Data
E N D
Overview • Introduction to Indexes • Index Architecture • How SQL Server Retrieves Stored Data • How SQL Server Maintains Index and Heap Structures • Deciding Which Columns to Index
Introduction to Indexes • How SQL Server Stores and Accesses Data • Whether to Create Indexes
Con ... ... ... ... ... Rudd Akhtar Smith Martin Ganio ... Funk ... ... ... ... ... White Funk Ota Phua Jones ... White ... ... ... ... ... Barr Smith Jones Jones Hall ... ... ... ... ... ... ... ... Martin ... Smith ... ... ... ... ... ... ... ... ... ... ... ... ... ... How SQL Server Stores and Accesses Data • How Data Is Stored • Rows are stored in data pages • Heaps are a collection of data pages for a table • How Data Is Accessed • Scanning all data pages in a table • Using an index that points to data on a page Data Pages Page 8 Page 9 Page 4 Page 5 Page 6 Page 7
Whether to Create Indexes • Why to Create an Index • Speeds up data access • Enforces uniqueness of rows • Why Not to Create an Index • Consumes disk space • Incurs overhead
Index Architecture • SQL Server Index Architecture • Using Heaps • Using Clustered Indexes • Using Nonclustered Indexes
Using Heaps SQL Server: • Uses Index Allocation Map Pages That: • Contain information on where the extents of a heap are stored • Navigate through the heap and find available space for new rows being inserted • Connect data pages • Reclaims Space for New Rows in the Heap When a Row Is Deleted
Using Clustered Indexes • Each Table Can Have Only One Clustered Index • The Physical Row Order of the Table and the Order of Rows in the Index Are the Same • Key Value Uniqueness Is Maintained Explicitly or Implicitly
Using Nonclustered Indexes • Nonclustered Indexes Are the SQL Server Default • Existing Nonclustered Indexes Are Automatically Rebuilt When: • An existing clustered index is dropped • A new clustered index is created • The DROP_EXISTING option is used to change which columns define the clustered index
How SQL Server Retrieves Stored Data • How SQL Server Uses the sysindexes Table • Finding Rows Without Indexes • Finding Rows in a Heap with a Nonclustered Index • Finding Rows in a Clustered Index • Finding Rows in a Clustered Index with a Nonclustered Index
indid Object Type 0 Heap 1 Clustered Index 2 to 250 Nonclustered Index 255 text, ntext, or image How SQL Server Uses the sysindexes Table • Describes the Indexes • Location of IAM, First, and Root of Indexes • Number of Pages and Rows • Distribution of Data
sysindexes id indid = 0 First IAM Extent Bit Map … 127 1 01 01 01 01 01 01 01 Dunn Graff Con Con Con Con Con … … … … … … … 128 1 02 02 02 02 02 02 02 Funk Funk Funk Funk Randall Bacon Funk … … … … … … … 129 0 03 03 03 03 03 03 03 Ota White White Koch White White White … … … … … … … 130 1 04 … … … … … 04 ... ... ... Durkin ... ... Slichter ... ... ... ... ... ... ... … 05 … 05 … … … … ... ... Lang LaBrie ... ... ... ... ... ... ... ... ... ... 01 Seattle … 02 Paris … 03 Tokyo … 01 01 01 01 01 01 01 01 Rudd Rudd Rudd Rudd Rudd Rudd Rudd Rudd … … … … … … … … 01 01 01 01 01 01 01 01 Akhtar Akhtar Akhtar Akhtar Akhtar Akhtar Akhtar Akhtar … … … … … … … … 01 01 01 01 01 01 01 01 Smith Smith Smith Smith Smith Smith Smith Smith … … … … … … … … 04 Atlanta ... 02 02 02 02 02 02 02 02 White White White White White White White White … … … … … … … … 02 02 02 02 02 02 02 02 Funk Funk Funk Funk Funk Funk Funk Funk … … … … … … … … 02 02 02 02 02 02 02 02 Ota Ota Ota Ota Ota Ota Ota Ota … … … … … … … … … ... ... 03 03 03 03 03 03 03 03 Barr Barr Barr Barr Barr Barr Barr Barr … … … … … … … … 03 03 03 03 03 03 03 03 Smith Smith Smith Smith Smith Smith Smith Smith … … … … … … … … 03 03 03 03 03 03 03 03 Jones Jones Jones Jones Jones Jones Jones Jones … … … … … … … … … … … … … … … … ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... 04 04 04 04 04 04 04 04 Martin Martin Martin Martin Martin Martin Martin Martin ... ... ... ... ... ... ... ... … … … … … … … … … … … … … … … … ... ... ... ... ... ... ... ... … … … … … … … … ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... … … … … … … … … ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... … … … … … … … … ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... Finding Rows Without Indexes IAM Heap Extent 128 Extent 129 Extent 130 Extent 127
id indid = 2 root Non clustered Index Non Clustered Index Non-Leaf Level Non-Leaf Level Akhtar Akhtar ... ... Martin Martin Martin Akhtar Akhtar Page 37 Page 37 Page 28 Page 28 Page 12 - Root Page 12 - Root Ganio Ganio Martin Martin Martin ... ... Smith Smith ... ... Leaf Level(Key Value) Leaf Level(Key Value) Page 41 Page 41 Page 51 Page 51 Page 61 Page 61 Page 71 Page 71 Akhtar Akhtar 4:706:01 4:706:01 Smith Ganio Ganio Smith 4:709:01 4:709:01 4:706:03 4:706:03 Barr Barr 4:705:03 4:705:03 Hall Hall Smith Smith 4:709:04 4:708:04 4:709:04 4:708:04 Martin Martin 4:708:01 4:708:01 Con Con 4:704:01 4:704:01 Smith Jones Smith Jones 4:709:02 4:707:01 4:707:01 4:709:02 Matey Matey Matey 4:706:04 4:706:04 4:706:04 Funk Funk 4:706:02 4:706:02 White Jones Jones White 4:704:03 4:704:03 4:708:03 4:708:03 Ota Ota 4:707:02 4:707:02 Funk Funk 4:704:02 4:704:02 White Jones White Jones 4:707:03 4:705:02 4:707:03 4:705:02 Ota Phua Phua Phua 4:707:02 4:708:02 4:708:02 4:708:02 Rudd Rudd Rudd 4:705:01 4:705:01 4:705:01 Heap Heap Page 704 Page 704 Page 705 Page 705 Page 706 Page 706 Page 707 Page 707 Page 708 Page 708 Page 709 Page 709 01 01 ... ... Akhtar Akhtar 01 01 ... ... Conn Conn 01 01 01 ... ... ... Rudd Rudd Rudd 01 01 ... ... Smith Smith 01 01 ... ... Martin Martin 01 01 ... ... Ganio Ganio 02 02 ... ... Funk Funk 02 02 ... ... Funk Funk 02 02 ... ... White White 02 02 ... ... Ota Ota 02 02 02 02 ... ... ... ... Phua Ota Phua Phua 02 02 ... ... Jones Jones 03 03 ... ... Smith Smith 03 03 ... ... White White 03 03 ... ... Barr Barr 03 03 ... ... Jones Jones 03 03 ... ... Jones Jones 03 03 ... ... Hall Hall 04 04 ... ... Matey Matey ... ... ... ... ... ... ... ... ... ... ... ... 04 ... Matey ... ... ... ... ... ... 04 04 ... ... Smith Smith ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... File ID #4 File ID #4 Finding Rows in a Heap with a Nonclustered Index sysindexes SELECT lastname, firstname FROM member WHERE lastname BETWEEN 'Masters' AND 'Rudd'
id indid = 1 root sysindexes Clustered Index Clustered Index Akhtar Akhtar … … Martin Martin Page 140 - Root Page 140 - Root Akhtar Akhtar Martin Martin Ganio Ganio Smith Smith … … … … Page 141 Page 141 Page 145 Page 145 Akhtar Akhtar 2334 2334 ... ... Ganio Ganio 7678 7678 ... ... Martin Martin 1234 1234 ... ... Smith Smith 1434 1434 ... ... Barr Barr 5678 5678 ... ... Hall Hall 8078 8078 ... ... Martin Martin 7778 7778 ... ... Smith Smith 5778 5778 ... ... Con Con 2534 2534 ... ... Jones Jones 2434 2434 ... ... Ota Ota 5878 5878 ... ... Smith Smith 7978 7978 ... ... Funk Funk 1334 1334 ... ... Jones Jones 5978 5978 ... ... Phua Phua 7878 7878 ... ... White White 2234 2234 ... ... ... ... ... ... ... ... ... ... Funk Funk 1534 1534 Jones Jones 2634 2634 Rudd Rudd 6078 6078 White White 1634 1634 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... Page 100 Page 100 Page 110 Page 110 Page 120 Page 120 Page 130 Page 130 Ota 5878 ... Finding Rows in a Clustered Index Martin SELECT lastname, firstname FROM member WHERE lastname = 'Ota' Martin
id indid = 2 root Nonclustered Index on First Name Nonclustered Index on First Name Non-Leaf Level Non-Leaf Level Aaron Aaron ... ... Jose Jose Aaron Aaron Jose Jose Deanna Deanna Nina Nina … … … … Leaf Level(Clustered Key Value) Leaf Level(Clustered Key Value) Aaron Aaron Con Con Jose Jose Lugo Lugo Deanna Deanna Daum Daum Adam Adam Barr Barr Judy Judy Kaethler Kaethler Don Don Hall Hall Amie Amie Baldwin Baldwin Mike Mike Mike Nash Nash Nash Doug Doug Hampton Hampton … … … … … … … … … … … … Barr Barr Clustered Index On Last Name Clustered Index On Last Name Kim Kim Nagata Nagata O’Melia O’Melia Barr Barr Adam Adam … … Kim Kim Shane Shane … … Nagata Nagata Susanne Susanne … … Cox Cox Arlette Arlette … … Kobara Kobara Linda Linda … … Nash Nash Nash Mike Mike Mike … … … Daum Daum Deanna Deanna … … LaBrie LaBrie Ryan Ryan … … Nixon Nixon Toby Toby … … … … … … … … … … … … … … … … … … … … Finding Rows in a Clustered Index with a Nonclustered Index sysindexes SELECT lastname, firstname, phone FROM member WHERE firstname = 'Mike' Nagata
How SQL Server Maintains Index and Heap Structures • Page Splits in an Index • Forwarding Pointer in a Heap • How SQL Server Updates Rows • How SQL Server Deletes Rows
Ganio … Hall … … … … … Leaf Level(Key Value) Hart … Jones … Jones … Jackson … Jackson … Akhtar Akhtar … … Ganio Jones … … Lang Lang … … Smith Smith … … … … … … Barr Barr … … Hall Jones … … Martin Martin … … Smith Smith … … Barr Barr … … Hart … Martin Martin … … Smith Smith Akhtar Lang … … Borm Borm … … Martin Martin … … Smith Smith … Smith … … Buhl Buhl … … Moris Moris … … Smith Smith Martin … … … Page Splits in an Index INSERT member (last name) VALUES lastname = ‘Jackson' Index Pages Non-Leaf Level Akhtar Ganio … Jackson Leaf Level(Key Value)
id indid = 2 root Non clustered Index Non Clustered Index Non-Leaf Level Non-Leaf Level Akhtar Akhtar ... ... Martin Martin Martin Akhtar Akhtar Page 37 Page 37 Page 28 Page 28 Page 12 - Root Page 12 - Root Ganio Ganio Martin Martin Martin ... ... Smith Smith ... ... Leaf Level(Key Value) Leaf Level(Key Value) Page 41 Page 41 Page 51 Page 51 Page 61 Page 61 Page 71 Page 71 Akhtar Akhtar 4:706:01 4:706:01 Ganio Smith Smith Ganio 4:706:03 4:709:01 4:709:01 4:706:03 Barr Barr 4:705:03 4:705:03 Smith Hall Smith Hall 4:709:04 4:708:04 4:709:04 4:708:04 Martin Martin 4:708:01 4:708:01 Ota 4:707:02 Con Con 4:704:01 4:704:01 Smith Smith Jones Jones 4:707:01 4:709:02 4:709:02 4:707:01 Martin Martin 4:706:04 4:706:04 Funk Funk 4:706:02 4:706:02 Jones Jones White White 4:708:03 4:704:03 4:704:03 4:708:03 Ota Ota 4:707:02 4:707:02 Funk Funk 4:704:02 4:704:02 White Jones White Jones 4:705:02 4:707:03 4:707:03 4:705:02 Phua Phua 4:708:02 4:708:02 Rudd Rudd 4:705:01 4:705:01 Heap Heap Page 704 Page 704 Page 705 Page 705 Page 706 Page 706 Page 707 Page 707 Page 708 Page 708 Page 709 Page 709 01 01 ... ... Akhtar Akhtar 01 01 ... ... Conn Conn 01 01 ... ... Rudd Rudd 01 01 ... ... Smith Smith 04 01 01 02 02 02 ... ... ... ... ... Martin Martin Ota Ota Ota Ota 01 01 ... ... Ganio Ganio 02 02 ... ... Funk Funk 02 02 ... ... Funk Funk 02 02 ... ... White White 02 02 ... ... Ota Ota 02 02 ... ... Phua Phua 02 02 ... ... Jones Jones 03 03 ... ... Smith Smith 03 03 ... ... White White 03 03 ... ... Barr Barr 03 03 ... ... Jones Jones 03 03 ... ... Jones Jones 03 03 ... ... Hall Hall 04 04 ... ... Martin Martin ... ... ... ... ... ... ... ... ... ... ... ... 04 04 ... ... Corets Corets 04 04 ... ... Smith Smith ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... 05 05 ... ... Nash Nash ... ... ... ... ... ... ... ... ... ... ... ... File ID #4 File ID #4 Forwarding Pointer in a Heap sysindexes UPDATE member SET Address = <something long> WHERE lastname = 'Ota'
How SQL Server Updates Rows • An Update Generally Does Not Cause a Row to Move • An Update Can Be a Delete Followed by an Insert • Batch Updates Touch Each Index Only Once
How SQL Server Deletes Rows • How Deletes Cause Ghost Records • How SQL Server Reclaims Space • How Files Can Shrink Clustered index pages move as a unit Heap records move individually
Deciding Which Columns to Index • Understanding the Data • Indexing Guidelines • Choosing the Appropriate Clustered Index • Indexing to Support Queries • Determining Selectivity • Determining Density • Determining Distribution of Data
Understanding the Data • Logical and Physical Design • Data Characteristics • How Data Is Used • The types of queries performed • The frequency of queries that are typically performed
Indexing Guidelines • Columns to Index • Primary and foreign keys • Those frequently searched in ranges • Those frequently accessed in sorted order • Those frequently grouped together during aggregation • Columns Not to Index • Those seldom referenced in queries • Those that contain few unique values • Those defined with text, ntext, or image data types
Choosing the Appropriate Clustered Index • Heavily Updated Tables • A clustered index with an identity column keeps updated pages in memory • Sorting • A clustered index keeps the data pre-sorted • Column Length and Data Type • Limit the number of columns • Reduce the number of characters • Use the smallest data type possible
Indexing to Support Queries • Using Search Arguments • Writing Good Search Arguments • Specify a WHERE clause in the query • Verify that the WHERE clause limits the number of rows • Verify that an expression exists for every table referenced in the query • Avoid using leading wildcards
Number of rows meeting criteriaTotal number of rows in table 100010000 = 10% = Number of rows meeting criteriaTotal number of rows in table 900010000 = 90% = Determining Selectivity High selectivity member_no last_name first_name 1 Randall Joshua 2 Flood Kathie . SELECT *FROM memberWHERE member_no > 8999 Good candidate for Index? . . 10000 Anderson Bill Low selectivity member_no last_name first_name 1 Randall Joshua 2 Flood Kathie . SELECT *FROM memberWHERE member_no < 9001 Good candidate for Index? . . 10000 Anderson Bill
last_name first_name Randall Joshua . . . Randall Cynthia Randall Tristan . . . Ota Lani . . . Determining Density High Density SELECT *FROM memberWHERE last_name = ‘Randall’ Low Density SELECT *FROM memberWHERE last_name = ‘Ota’
Standard Distribution of Values Number ofLast Names A - E F - J K - O P - U V - Z Last Name Even Distribution of Values Number ofLast Names A - B C - F G - K L - N O - Z Last Name Determining Distribution of Data
Create Indexes on Columns That Join Tables Drop Unused Indexes Avoid Long Clustering Keys Consider Using a Clustered Index to Support Sorting and Range Searches Create Indexes That Support Search Arguments Recommended Practices Use Indexes to Enforce Uniqueness
Review • Introduction to Indexes • Index Architecture • How SQL Server Retrieves Stored Data • How SQL Server Maintains Index and Heap Structures • Deciding Which Columns to Index