1 / 98

Understanding SQL Server: Internals, Queries, Data Storage

Delve into SQL Server internals to grasp page structure, record formats, and data storage methods. Learn about fixed and variable data types, row-overflow, and off-row storage techniques.

lanj
Download Presentation

Understanding SQL Server: Internals, Queries, Data Storage

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Mark S. Rasmussen – improve.dk Knowing the Internals, Who Needs SQL Server Anyway?

  2. Whoami • Tech Lead @ iPaper • Developer/DBA/Sysadmin/Project manager/* • Comp.Sci @ Aarhus University • Blogging at improve.dk • @improvedk • Author of the OrcaMDF project

  3. Disclaimer • Level 650 – meant to inspire, not teach! • Based on 2008 R2 • I have no idea... • Most of what I say is incorrect

  4. Background • Presentation at Miracle Open World • Formally started OrcaMDF

  5. Old School Querying • privatestaticvoid oldschool() • { • using (varconn=newSqlConnection("Data Source=.;Initial Catalog=QFD;…") • { • conn.Open(); • varcmd=newSqlCommand("SELECT * FROM Persons", conn); • var reader = cmd.ExecuteReader(); • while(reader.Read()) • Console.WriteLine(reader["ID"] +": "+ reader["Name"] +" ("+ reader["A • } • }

  6. OrcaMDF Querying • using (varmdf=newMdfFile(mdfPath)) • { • var scanner =new DataScanner(mdf); • var rows = scanner.ScanTable("Persons"); • EntityPrinter.Print(rows); • } • using (varmdf=newMdfFile(mdfPath)) • { • var scanner =new DataScanner(mdf); • var rows = scanner.ScanTable("Persons") • .Where(x => x.Field<short>(”Age”) < 40); • EntityPrinter.Print(rows); • }

  7. Pages The foundation of SQL Server storage

  8. What Is a Page? • 8192 bytes • Everything storedas pages

  9. Undocumented DBCC Commands & Flags • DBCC IND • DBCC PAGE • DBCC TRACEON (3604) • Documented in 6.5 & 7.0 • Unofficially documented

  10. Page Header • Absolutely no documentation • Absolutely necessary for parsing

  11. Reverse Engineering the Header • DEMO • OrcaMDF: PageHeader

  12. Slot Array • Points to beginningof records in body • Defines logicalorder of records

  13. Records • Data records • Stores table data • Index records • Stores nonclustered index data, as well as non-leaf level clustered index data • Stored in the FixedVar format

  14. FixedVar Record Format

  15. Status Bits A

  16. Status Bits B

  17. NULL Bitmap • Bitmap tracking whether columns are NULL • CEIL(#Cols / 8) bytes • Always present on data pages, except when it’s not • Only trust defined bits – rest may be garbage

  18. Variable Length Offset Array

  19. Example Record CREATE TABLE RecordTest ( A int, B int, C char(5), D varchar(10), E varchar(20) ) INSERT INTO RecordTest VALUES (25, 38, 'ABCD', 'Mark', 'Denmark')

  20. When Is Data Present? • Fixed length data always present • Even if null • Though not necessarily tail columns! • Variable length data only present when not null • Adding nullable columns is a metadata op • Denali default value columns is metadata too!

  21. Data Types How are data types stored within a record?

  22. Classifying Data Types • Fixed length data types • bit, char, int, decimal, date, datetime, float, etc. • Variable length data types • (n)varchar, varbinary, varchar(MAX), text, etc. • sql_variant • Please just stay away from it

  23. Variable Length Data Types • SLOBs • varchar(x), nvarchar(x), varbinary(x) • LOBs • text, ntext, image, varchar(MAX), nvarchar(MAX), varbinary(MAX), xml • vardecimal

  24. In-row (n)varchar(x) Storage • CREATE TABLE VarcharTest • ( • A varchar(4) • ) • INSERT INTO VarcharTest VALUES ('Mark')

  25. Complex Columns • DEMO • Identified using the sign bit • 0b1001001110010101 = 37.781 • 0b0001001110010101 = 5.013 • Use cases • Row-overflow/LOB pointers • Sparse vectors • Back pointers

  26. Off-row SLOB Storage • Varchar, nvarchar, varbinary • DEMO

  27. Off-row SLOB Storage • Column data moved to new page, pointer left behind

  28. Off-row SLOB Storage

  29. Off-row SLOB Storage B = [BLOB Inline Root] Slot 0 Column 2 Offset 0x11a1 Length (physical) 24 Level = 0 Unused = 0 UpdateSeq = 1 TimeStamp = 1298595840 Link 0 Size = 4500 RowId = (1:21:0) • timestamp =BitConverter.ToInt64(data, 8) <<16;

  30. BLOB_FRAGMENT Record Blob row at: Page (1:21) Slot 0 Length: 4514 Type: 3 (DATA) Blob Id:469368832 • Stored on shared (obj-level) TextMix pages

  31. Off-row SLOB Storage • Allways stored in-row if < 24 bytes • 24 byte [BLOB Inline Root] pointer • Data stored in BLOB_FRAGMENT on TextMix page • Timestamp == Blob ID • Performance prediction is tough

  32. (MAX) LOB Storage • varchar(MAX), nvarchar(MAX), varbinary(MAX) • The LOB that wanted to be a SLOB • Three scenarios • [BLOB Inline Data] • [BLOB Inline Root] • [Textpointer] • DEMO

  33. ”BLOB Inline Data” • Used when data fits in record • Not an official LOB structure Slot 0 Column 1 Offset 0x0 Length 0 Length (physical) 0 A = [NULL] B = [BLOB Inline Data] Slot 0 Column 2 Offset 0x1393 Length 4 Length (physical) 4 B = 0x41424344

  34. BLOB Inline Root • Can reference up to 5 pages – data, roots, trees, etc. • 12 byte header • Array of 12 byte references • Only used by SLOBs & (MAX) LOBs • Also not a LOB structure (by my definition)

  35. LOB Structure Records • Wrapped in a single-column ”meta” record

  36. DATA Blob row at: Page (1:176) Slot 0 Length: 8054 Type: 3 (DATA) Blob Id:1210253312 • Type 3 • Where data is actually stored • Size always > 64 bytes (SMALL_ROOT)

  37. DATA • How much data can we store in a DATA record? • 8096 – Page body size • 8080 (8094)– Theoretical max • 8040 (8054)- Reality

  38. INTERNAL Blob row at: Page (1:55) Slot 0 Length: 324 Type: 2 (INTERNAL) Blob Id: 1210253312 Level: 0 MaxLinks: 501 CurLinks: 19 Child 0 at Page (1:176) Slot 0 Size: 8040 Offset: 8040 Child 1 at Page (1:177) Slot 0 Size: 8040 Offset: 16080 • Type 2 • CurLinks = number of references • MaxLinks = ? • Level = tree level • Size = computed

  39. INTERNAL Total record size = 20 + X * 16

  40. Connecting the Dots

  41. The Tree Grows • In theory (INTERNAL)... • In reality... 500

  42. Connecting the Dots

  43. Connecting the Dots

  44. Two Levels Is All It Takes • 8,040,000,000 bytes / 7,48 GB • (MAX) limit is 231-1 • Many permutations

  45. Large Value Types Out of Row • sp_tableoption ‘MyTable’, ‘Option’, ‘ON/OFF’ • Even more permutations • “text in row” 24-7000, default 256

  46. Textpointer • Used for classic LOB types & MAX LOB types with ’large value types out of row’ ON • text, ntext, image • Complex column

  47. Classic Lob Structures • You thought (MAX) was complex? • Textpointer =

  48. SMALL_ROOT • Type = 0 • Used when data <= 64 bytes • Min size = 84 • Data > length = garbage

  49. LARGE_ROOT_YUKON • Type = 5 • Min size = 84 • Part of LOB tree

  50. Connecting the Dots

More Related