1 / 65

SQL Server 2008 Beyond Relational

SQL Server 2008 Beyond Relational. Yoni Okun DBA Team Manager SRL Group Yonio@srl.co.il. Data Storage Explosion!. New Digital Information Created. 55% in personal PCs 16% in corporate data warehouses Internet only 21 TB Email 500x more than Internet/year (~400TB). 2003: 7.0.

ling
Download Presentation

SQL Server 2008 Beyond Relational

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL Server 2008 Beyond Relational Yoni Okun DBA Team Manager SRL Group Yonio@srl.co.il

  2. Data Storage Explosion! New Digital Information Created • 55% in personal PCs • 16% in corporate data warehouses • Internet only 21 TB • Email 500x more than Internet/year (~400TB) 2003: 7.0 2002: 5.3 12Exabytes 2001: 3.8 2000: 2.9 1999: 2.2 All information before 1999 1999- 2003 (20 Ex)

  3. Data Storage Explosion! Storage of New Information • 1 Megabyte • 2 – 250 page novels • 1 Gigabyte: • a pickup truck filled with books • 2 Terabyte: • 100,000 trees made into paper and printed. • 2 Petabytes: • All U.S. academic research libraries • 2 Exabytes: • Total volume of information generated in 1999 MagneticDigital64% Hard Drive Prices MagneticAnalog28% Paper1% Film7% 1980… 40,000$!

  4. Data Storage Explosion!

  5. Relational and Non-Relational Data Relational Data Docs & Multimedia Applications XML Spatial

  6. Agenda • Spatial Data Types • Filestream Data Storage • HierarchyID Data Type • Sparse Design – Columns & Stats • Filtered Indexes • Integrated Full Text Search • Administrative Features

  7. Spatial Data Type • Answers to location-based queries • Which roads intersect campus? • Does my land claim overlap yours? • All Italian restaurants within 5 km • Does Your Database Include Address? • Where are all SQL Server Geeks? SELECT * FROM roads WHERE roads.geom.Intersects(@ms)=1

  8. Spatial Data Type • The Open Geospatial Consortium (OGC) defines a hierarchy of spatial data types

  9. Spatial Data Type Geometry Geography A geodetic model A planar model (Created by the Mercator projection)

  10. Spatial Data Type Supports two dimension Only • While Points can hold members: Z (for elevation) and M (for measure) They are Not used in any spatial method • Coordinate order is Latitude /Longitude • Longitude/Latitude In RTM

  11. Working With Spatial Data Type 1. Use with DDL CREATE TABLE Hotels (HotelNamenvarchar(200), HotelLocation ) geography 2. Insert Values INSERT INTO Hotels VALUES ('Dan', ), ('Herdos' ) geography::Point(29.548306566591655,34.96617794036865,4326) , geography::STGeomFromWKB(0x0101000000B699452D3C8C3D4000000082 877B4140,4326) 3. Use Methods & Properties to work with data DECLARE @x geography, @y geography select @x = HotelLocationfrom Hotels where HotelName= 'Dan' select @y = HotelLocationfrom Hotels where HotelName= 'Herdos' select @x.ToString(), @y.ToString() select @x.STDistance(@y)

  12. Spatial Data Type • Input: • Binary - ST[Type]FromWKB • Text - ST[Type]FromText • GML – GeomFromGml • Output: • Binary - STAsBinary • Text - STAsText • GML - AsGml • Text with Z and M – AsTextZM • SRID: • Represent different assumption around what is the earth elipse. • SRID 4326 – GPS • sys.spatial_reference_systems

  13. Spatial Data Type Demonstration • Using Spatial Viewer • Using Spatial Data Type 15

  14. Useful Methods / Properties • Descriptive • STArea • STLength • STCentroid • Relation between two instances • STIntersects • STDistance • Collections • STGeometryN - geography element in a GeometryCollection • STPointN – Nth point of a geometry

  15. Spatial Indexes • Spatial Is Indexed With An • Adaptive Multi- level Grid • Index Is Integrated Into The SQL • Server • Index Consists Of A Grid- based • Hierarchy • Each Level Subdivides The Grid • Sector That Is Defined In The • Level Above Spatial Index - Conceptual Model

  16. Spatial Indexes 1. Decomposing space into Grid Hierarchy: • Grids parameter determines density: • Low = 4x4 • Medium = 8x8 (Default) • High = 16x16 • 4x4x4 = 65,536 cells !! 2. Tessellation • Fitting Objects Into Grid (Touched Cells).

  17. Tessellation Process Level 3 & 4 Intersections 85 matching Cells Level 1 Intersections Level 2 Intersections Cells Per Object Stops Tessellation At Limit: (Per Object = 15, Cells = 13) Complete Match Cells Aren’t Broken To Lower Level (42 Cells)

  18. Creating Spatial Indexes CREATE SPATIAL INDEX Sindx_col2 ON SpatialTable (geometry_col) USING GEOMETRY_GRID WITH ( BOUNDING_BOX = ( xmin=0, ymin=0, xmax=500, ymax=200 ), GRIDS = (LOW, LOW, MEDIUM, HIGH), CELLS_PER_OBJECT = 64 ) • BOUNDING_BOX - GEOMETRY index only • GRIDS - 4 Grid Levels, Grid Densities For Level - Low, Medium, High • CELLS_PER_OBJECT - number of cells recorded for matching

  19. Spatial Data Type Demonstration • Integration With Maps • App Code by Dan Morgenstern • SRL Group 22

  20. Agenda • Spatial Data Types • FILESTREAM Data Storage • HierarchyID Data Type • Sparse Design – Columns & Stats • Filtered Indexes • Integrated Full Text Search • Administrative Features

  21. FILESTREAM Storage • To Blob Or Not To Blob? • Mgmt Complexities Vs. Streaming & Performance • Cons • LOBS take Memory buffers • Updating LOBS cause fragmentation • $ Per GB • However, File system "update" is delete and insert • Pros • Transactional consistency • Point-in-time backup & restore • Single storage and query vehicle ?

  22. FILESTREAM Storage Store BLOBs in Database Dedicated BLOB Store Use File Servers Application Application Application BLOBs BLOBs BLOBs DB DB DB • Low cost per GB • Streaming Performance • Lower cost per GB at scale • Scalability & Expandability • Integrated management • Data-level consistency Advantages • Complex application development & deployment • Separate data management • Enterprise-scales only • Complex application development & deployment • Integration with structured data Challenges • Poor data streaming support • File size limitations • Highest cost per GB • Windows File Servers • NetAppNetFiler • EMC Centera • Fujitsu Nearline • SQL Server VARBINARY(MAX) Example

  23. FILESTREAM Storage • FILESTREAM Combines The Best Of 2 Worlds • Integrates DB Engine With NTFS • Storing BLOB Data As Files • FEATURES: • Uses NT Cache For Caching File Data. • SQL Bpool Not Used And Is Available To Query Processing • Win 32 File System Interface Provide Streaming Access To Data • Compressed volumes are supported

  24. FILESTREAM Storage • It’s not only about Storing But About Working With BLOBS: • Image Analysis • Voice Interpretation & Scripting • Mixing Satellite Feeds & Spatial Data Type For • Weather Reports • Etc..

  25. FILESTREAM Implementation • At Database Level • Declare A Filegroup & Map To Directory • At Table Level • Define On VARBINARY(MAX) • Must Have UNIQUEIDENTIFIER Column • Integrated Security & Management: • Permissions On FILESTREAM Implied On Files. • Tools And Functions Work For Filestream Data. (Backup)

  26. FILESTREAM Programming Dual Programming Model • TSQL (Same as SQL BLOB) • Win32 Streaming File IO APIs • Begin a SQL Server Tran • Obtain a symbolic PATH NAME & TRANSACTION CONTEXT • Open a handle using sqlncli10.dll - OpenSqlFilestream • Use Handle Within System.IO Classes • Commit Tran

  27. FILESTREAM Programming // 1. Start up a database transaction - SqlTransactiontxn = cxn.BeginTransaction(); // 2. Insert a row to create a handle for streaming. newSqlCommand("INSERT <Table> VALUES ( @mediaId, @fileName, @contentType);",cxn, txn); // 3. Get a filestreamPathName & transaction context. newSqlCommand("SELECT PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() FROM <Table>", cxn, txn); // 4. Get a Win32 file handle using SQL Native Client call. SafeFileHandle handle = SqlNativeClient.OpenSqlFilestream(...); // 5. Open up a new stream to write the file to the blob. FileStreamdestBlob= newFileStream(handle, FileAccess.Write); // 6. Loop through source file and write to FileStream handle while ((bytesRead = sourceFile.Read(buffer, 0, buffer.Length)) > 0) {destBlob.Write(buffer, 0, bytesRead);} // 7. Commit transaction, cleanup connection. -txn.Commit();

  28. Initial FILESTREAM Limitations • Remote FILESTREAM storage - Not supported • DB Snapshot and Mirroring – Not supported • Features not integrated • SQL Encryption • Table Value Parameters

  29. FILESTREAM Storage Demonstration • Administrating FILESTREAM • Programming FILESTREAM 33

  30. Agenda • Spatial Data Types • FILESTREAM Data Storage • HierarchyID Data Type • Sparse Design – Columns & Stats • Filtered Indexes • Integrated Full Text Search • Administrative Features

  31. HierarchyID

  32. HierarchyID Scenarios: • List Forum Threads • Business Organization Charts • Product Categories • Files/Folders Management Features: • Compact - 100,000 Nodes, 6 Level ~ 5 Bytes / Node • Available To Clr Clients As The Sqlhierarchyid Data Type

  33. 2k5 Alternatives - Adjacency Model • Pros: • Understandable • 2k5 – Recursive CTE • Cons: • De-Normalized (Personnel+Chart) • Not Set-Based

  34. 2k5 Alternatives – Path Enumeration • Holds Path As A String Concatenation • Pros: • Logical Representation • Cons: • Searches Done With String Functions • And Predicates On Those Path Strings

  35. 2k5 Alternatives – Nested Sets • "Left" And "Right" Columns Represent Edges • Pros: • Predictable , Set Based Results • Cons: • Must Be Maintained Separately

  36. Demo Structure 1. Insert Root 2. Insert 1st Subordinate 3. Enter Rest of Tree 4. Query Hierarchical Data 5. Reparent Employee 6. Add Subordiante 7. Reparent Node

  37. HierarchyID Demonstration • Trees and Hierarchies 41

  38. HierarchyID Methods GetRoot() - root of hierarchy tree ToString() - Logical string representation of Value GetDescendant()- a child node x of this GetAncestor() - hierarchyid of the nth ancestor of this IsDescendantOf() - true if child is a descendant of this GetLevel() – integer representing depth of the node this GetReparentedValue() -path to newRoot, followed by oldRoot to this

  39. HierarchyID Indexes Depth-first Index Breadth-first Index Employees that Report Directly To The Same Manager Employees That Report Through A Manager

  40. Agenda • Spatial Data Types • FILESTREAM Data Storage • HierarchyID Data Type • Sparse Design – Columns & Stats • Filtered Indexes • Integrated Full Text Search • Administrative Features

  41. Sparse Properties How Can We Model property bags? • Products Catalog • Lab tests with different readings per test • Because they don't appear on each row they are difficult to model

  42. Modeling Sparse Properties Entity-Attribute-Value • Non Relational • Value Column Issues • Need PIVOT to Make Sparse • Xml • Non Relational • Complex Updates • Sparse Columns • Hit 1,024 Limit • Storage

  43. Sparse Columns • Efficient Way Of Managing Empty Data • Null Data  No Physical Space • 30,000 Column Limit (RTM) • 1024 For "Non-sparse" Columns • Column Set - Xml Of All Sparse Values • For Web Site That Needs To Show The Properties

  44. Sparse Columns • Sparse require more storage for nonnull values then regular Columns • NULL Percentage • percent of the data that must be NULL for savings 40% space

  45. Sparse Columns Demonstration • Build Sparse Design • Sparse Storage 50

  46. Agenda • Spatial Data Types • FILESTREAM Data Storage • HierarchyID Data Type • Sparse Design – Columns & Stats • Filtered Indexes • Integrated Full Text Search • Administrative Features

  47. Filtered Indexes • Dramatic Effect on Performance & Storage: • Smaller. Stats is more accurate • Reduced Management Costs (on Changes) • Different indexes on frequently \ infrequently changed columns. • Storage-wise is the same. • Do your queries relate to subsets of Data??

  48. Filtered Indexes • Candidate Columns: • Heterogeneous Categories Of Values • Columns With Distinct Ranges Of Values • Partitioned Tables • sparse columns • Can keep track of only non-null value distribution

  49. Filtered Indexes Create Index Ind1 on t(c1) where C1=‘A’ or C1 = ‘D’ Create index Ind2 on t(sc7) where sc7 is not null

More Related