410 likes | 419 Views
Creating High Performance Spatial Databases with SQL Server 2008. Alastair Aitchison. About Me. Consultant, Trainer, Author, and Housedad. Session Plan. Do you need geometry and geography ? Constructing a spatial index (the theory ) Filtering spatial query results (the practice )
E N D
Creating High Performance Spatial Databases with SQL Server 2008 Alastair Aitchison
About Me • Consultant, Trainer, Author, and Housedad
Session Plan • Do you need geometry and geography? • Constructing a spatial index (the theory) • Filtering spatial query results (the practice) • Optimising spatial queries
The “Two-Column” Model CREATE TABLE Customers ( Name varchar(32), Address varchar(255), Lat float, Long float );
Point-in-Polygon SELECT * FROM Customers WHERE Lat BETWEENLtMinANDLtMax AND Long BETWEENLnMinANDLnMax (LtMax, LnMax) (LtMin, LnMin)
Calculating Distance SELECT 3963.0 * ACOS( SIN(Lat1) * SIN(Lat2) + COS(Lat1) * COS(Lat2) * COS(Lon2 - Lon1) ) (Lat2, Lon2) (Lat1, Lon1)
“Two-Column” Model Limitations • Only stores points • Calculations on flat plane or perfect sphere • Limited range of methods
SQL Server 2008 • Points, Linestrings, Polygons • Accurate calculations • Ellipsoid model (geography) • Flat plane (geometry) • Full complement of spatial methods • Intersects, Contains, Crosses, Touches • Distance, Length, Area • DE-9IM
Do You Need geometry / geography? • Not all “spatial” apps need spatial datatypes • Example: Store locator
Session Plan a • Do you need geometry and geography? • Constructing a spatial index (the theory) • Filtering spatial query results (the practice) • Optimising spatial queries
Querying geometry and geography • SELECT * WHERE A.STIntersects(B) = 1 • Primary Filter (Based on Index) • Approximate • Fast • Superset of actual results • Secondary Filter (Based on Table) • Refine results of primary filter • Accurate
Assigning Order to Spatial Data • B-Tree indexing for linearly ordered data • decimal, float, money etc. – numeric order • char, varchar, nvarchar etc. – alphabetic order • datetime, date, time etc. – chronological order • How do we assign order to spatial data?
From Grid to Index • Covered, partially covered, or touched cells • Maximise accuracy - Minimise index size • Three Rules • Covering Rule • Deepest-Cell Rule • Cells Per Object Rule
Covering Rule “If a grid cell is completely covered by a geometry, don’t further subdivide that cell.”
Deepest-Cell Rule “Once a cell has been subdivided, only store the intersecting cell(s) at the deepest grid level.”
Cells Per Object Rule “If subdividing a cell would exceed the maximum allowed number of cells for each object, do not subdivide the cell.”
Session Plan a • Do you need geometry and geography? • Constructing a spatial index (the theory) • Filtering spatial query results (the practice) • Optimising spatial queries a
Creating a Spatial Index I CREATE TABLE Grid ( id char(1), shape geometry, CONSTRAINT [idxGridCluster] PRIMARY KEY CLUSTERED ( id ASC ) );
Add Some Points To The Table INSERT INTO Grid VALUES ('A', geometry::Point(0.5, 2.5, 0)), ('B', geometry::Point(2.5, 1.5, 0)), ('C', geometry::Point(3.25, 0.75, 0)), ('D', geometry::Point(3.75, 2.75, 0));
Creating a Spatial Index II CREATE SPATIAL INDEX idxGrid ON Grid(shape) USING GEOMETRY_GRID WITH ( BOUNDING_BOX = (0, 0, 4096, 4096), GRIDS = ( LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM), CELLS_PER_OBJECT = 16); -- Each L1 cell is 512 x 512-- Each L2 cell is 64 x 64-- Each L3 cell is 8 x 8-- Each L4 cell is 1 x 1
Grid Level 4 D A B C
Finding Intersecting Points DECLARE @Polygon geometry = 'POLYGON ((1.5 0.5, 3.5 0.5, 3.5 2.5, 1.5 2.5, 1.5 0.5)) '; SELECT * FROM Grid WHERE shape.STIntersects(@Polygon) = 1;
sp_help_spatial_geometry_index EXEC sp_help_spatial_geometry_index @tabname = Grid, @indexname = idxGrid, @verboseoutput = 1, @query_sample = 'POLYGON ((1.5 0.5, 3.5 0.5, 3.5 2.5, 1.5 2.5, 1.5 0.5))';
4 Number_Of_ObjectCells_In_Level4_In_Index D A B C
9 Number_Of_ObjectCells_In_Level4_For_QuerySample
Compare the Grid Cells D A B C
25% Percentage_Of_Rows_NotSelected_By_Primary_Filter D A B C
3 Number_Of_Rows_Selected_By_Primary_Filter D A B C
33 Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter D A B C
2 Number_Of_Times_Secondary_Filter_Is_Called D A B C
2 Number_Of_Rows_Output D A B C
66 Primary_Filter_Efficiency D A B C
50 Internal_Filter_Efficiency D A B C
Session Plan a • Do you need geometry and geography? • Constructing a spatial index (the theory) • Filtering spatial query results (the practice) • Optimising spatial queries a a
Making Sure the Index Is Used • Use a Supported Method • STIntersects() • STContains(), STWithin(), STTouches() • STDistance() • Filter() • Syntax must be A.STIntersects(B) = 1 • Upgrade to SP1 • Use a HINT where necessary
Making the Index Effective • Three possible outcomes: • Preselection (Internal Filter) • Discarding (Primary Filter) • Secondary Filter • Adjust Index Settings to fit data in the column and typical query samples
Improving Performance • Make bounding box as tight as possible • Grid Resolution ↑ ... Cells Per Object ↑ • Multiple Indexes (may need HINT) • Use non-spatial predicates • Reduce unnecessary detail • Experiment!
Want To Know More? Beginning Spatial with SQL Server 2008 MSDN Spatial Forum http://social.msdn.microsoft.com/ Forums/en-US/sqlspatial/threads alastair@beginningspatial.com
A Practical Demonstration • Geonames export • 6.9 million points • Search for those in Newport • Without Index: ~100 rows. 12,391,230 secs • With Index: ~100 rows. < 1 sec