340 likes | 533 Views
Putting the world in your Database: The Informix Spatial and Geodetic DataBlades. Robert Uleman Consulting IT Specialist Worldwide Information Management Sales Support – Spatiotemporal Technology. Agenda. Overview GIS: Geographic Information Systems Spatial data in an object-relational DBMS
E N D
Putting the world in your Database:The Informix Spatial and Geodetic DataBlades Robert Uleman Consulting IT Specialist Worldwide Information Management Sales Support – Spatiotemporal Technology
Agenda • Overview • GIS: Geographic Information Systems • Spatial data in an object-relational DBMS • The Spatial DataBlade • Competitive differences • Details • Spatial SQL syntax • Spatial indexing • Geodetic: round-earth spatial
Architectural Evolution of GIS Data Management Proprietary data format File System 1st Generation: Spatial Features GIS Application RDBMS SQL Attributes Proprietary GIS API GIS Data Engine RDBMS 2nd Generation: GIS Application SQL Spatial features, indexes in BLOBs Proprietary spatial structures Proprietary GIS API Open or proprietary Spatially enabled DBMS GIS Data Engine GIS Application SQL 3rd Generation: Open “Spatial” Application Spatial types functions indexes Spatial business logic
What’s Special about Spatial? • Traditionally not supported by relational databases • Requires new indexing techniques • Voluminous data • Individual values can get arbitrarily large: • Large, convoluted lines and polygons (e.g., a coastline) • Raster images • Lots of features • Maps can effectively represent lots of information • Much more than spreadsheets or reports • Queries often retrieve many more rows than “normal” queries • Individual operations may be computationally expensive • WHERE clause predicates • Transactions generally long, unlike OLTP • Resembles code revision control in software development
Spatial Geodetic Grid Your idea goes here Server Subsystems Connectivity Backup Restore Extender/DataBlade Introduction: Component Technology IDS
New Extender/ DataBlade Extender/DataBlade Elements Types Functions Casts Aggregates Indexes Tables Client Code
If Integer were not built in … Domain • Whole numbers, up to some maximum magnitude Data types • Smallint, Integer, Bigint • Representations: ASCII ([+|-]d..),binary (2’s complement, byte order) Functions and operators • Add(+), Subtract(-), Multiply(*), Abs, Mod, … • Equal(=), LessThan(<), GreaterThanOrEqual(>=), … Index support • B-Tree 0123456789
Integer not built in? Not so far-fetched • Illustra: Pure object-relational database • Commercialization of UC Berkeley Postgres project • Acquired by Informix in 1996, Informix acquired by IBM in 2001 • Postgres continues as open-source PostgreSQL • No built-in data types; everything is bound at runtime • Land Information New Zealand: Fraction data type “5/7” • Avoid roundoff in cumulative subdivision of property • Legally mandated improvement in area/tax calculation precision • Dates • Birthdate: understands that February 28 is a birthday for someone born on February 29 • Tradingdate: skips weekends, holidays
User-Defined Data Types Data Type: Descriptor assigned to a column or a variable name compensation location jobs_held image John T. Smith 349,876 yen (123 256) Clerk, Administrator, Manager CREATE TABLE employee ( name varchar(30), compensation salary_t, locationpoint, jobs_held set(varchar(30)), picture image );
Some SQL Queries • Location-Based Services: List Points of Interest • SELECT name, description, addressFROM restaurantsWHERE Overlaps(location, box(getGPS(), 2000, 2000))AND category = ‘chinese’AND docContains(menu, ‘Peking duck’); • List volcanic eruptions in a region of interest • SELECT name, year, mag, locationFROM volcanoWHERE ST_Within(location,'polygon((-125 43,-125 46,-120 46,-120 43,-125 43))')ORDER BY name, year; name year mag location HOOD MOUNT 1854 0 POINT (-121.69999 45.36000) HOOD MOUNT 1859 2 POINT (-121.69999 45.36000)
OpenGIS Standard Spatial Types and Functions Certified compliant with OpenGIS Simple Features Specification Geometric data types ST_Point, ST_Linestring, ST_Polygon, ST_Geometry, etc. Spatial functions ST_Distance, ST_Intersects, ST_Within, etc. Standard Data representations Well-Known Binary, Well-Known Text, ESRI Shape Tailored to ESRI’s ArcSDE 9.x (spatial database gateway) Additional functions, support for annotation, SDE format, etc. Based on ESRI’s geometry engine (Shape library) Consistent results of spatial operations in all software tiers: database, middle (ArcSDE), client (ArcGIS, ArcIMS) The Informix Spatial DataBlade
DB2 Spatial Extender, IDS Spatial DataBlade • Developed, supported and maintained by IBM • Wrapped around ESRI’s geometry engine (Shape library) • Spatial index: R-tree • Spatially aware optimizer • Recognizes spatial operators and index • Cost, selectivity provided by R-tree • Administration tools: Blade Manager • Utilities: Shape file import, export • Strategic alliance with ESRI • Close relationship in engineering, marketing, and sales • Software (Data, WebSphere), Hardware, Services
ST_Curve ST_Surface ST_Geom- Collection ST_LineString ST_Polygon ST_Multi-Surface ST_Multi-Curve ST_Multi-Point ST_Multi-Polygon ST_Multi-LineString Spatial/OpenGIS SQL Data Types ST_Geometry ST_Point “Abstract” Classes Instantiable Classes
ST_Intersects(geometry1,geometry2) ? OpenGIS Spatial SQL Functions
ArcSDE Architecture for Informix Client SQL Applications • ArcGIS Family • Open API Application ArcExplorer ESQL/CODBCJDBC SDE Client API ArcSDE Server JDBC TCP/IP ODBC Caching Compression Connection pooling(Projections) (Long transactions) (Raster support) OpenGISSpatialQueries Spatial DataBlade R-tree index Geodatabase, business rules, custom types IDS
ArcSDE Architecture for Others SQL Applications ArcSDE Server ArcSDE Server SQL Applications SQL Applications FAT ArcSDE Server OracleSpatialQueries OpenGISSpatialQueries NO SpatialQueries Oracle Spatial <nothing> Spatial DataBlade Index tables SQLServer, Oracle “binary” Oracle Informix
Standard, intuitive syntax Easy development, maintainable, fewer bugs Result of true Object-Relational extensibility OpenGIS Simple Features conformance for interoperability with other compliant systems Performance Spatial index and functions integrated into server at code interface level, not based on tables and SQL Tight cooperation and integration with ESRI The Informix Spatial Advantage
SQL Comparisons: Query DB2 Spatial SELECT A.Feature_ID FROM A WHEREST_Overlaps( A.shape, ST_GeomFromText( ‘ST_POLYGON( x1 y1, x2 y2, x3 y3, x4 y4 … )’, 5 -- OpenGIS requirement )) ; Oracle Spatial SELECT A.Feature_ID FROM TARGET A WHEREsdo_relate( A.shape, mdsys.sdo_geometry( 2003, NULL, NULL, mdsys.sdo_elem_info_array( 1,1003,1 ), mdsys.sdo_ordinate_array( x1,y1, x2,y2, x3,y3, x4,y4, … ) ), 'mask=anyinteract querytype=window‘) = 'TRUE‘ ;
Spatial queries • Compute the percentage overlap of imagery that covers part of a region of interest • ROI = Ontario Province, with 25 km buffer, but only over Canada • SELECT i.id AS image_id,ST_Area(ST_MultiPolygonST_Intersection( i.footprint, p.shape ) ::ST_Polygon ) / ST_Area( i.footprint ) * 100 AS "%overlap"FROM images i, provinces pWHERE ST_Overlaps( i.footprint,ST_Difference(ST_Buffer( p.shape, 25, 'KILOMETRE' ), (SELECT shape FROM countries WHERE name = 'USA')) AND p.name = 'Ontario'ORDER BY 2 DESC;
The B-tree index “Polygon” A through Z A - I S - Z J - R A - B C - E F - I J - L M - O P - R S - Sp Sq - U V - Z B-tree indexes rapidly reduce the number of items to search through in a selection process and are the industry standard for alpha-numeric data. But how can spatial data be sorted???
K8 L4 N2 K1 K9 L1 K10 K2 L5 K4 K11 K5 L2 L3 K3 N1 K6 K7 SearchObject L3 K1 K2 K3 K11 K4 L2 L5 N2 K6 K9 L1 K5 N1 L4 K8 K10 A Simple R-tree K = Key bounding box L = Leaf node bounding box N = Node (internal) bounding box Index structure Data space K3 K7
Planar Coordinates Northing Easting 5,000,000 500,000
+90(90° N) R latitude longitude 0 -90(90° W) Spherical Coordinates
+90 ? 0 ? UTM 32 UTM 33 ? ? -90 -180 0 +180 Flattening the Earth • Plane Geometry on lat-long • Singularities and scale distortion at and toward the poles • Wrap-aroundat 180º longitude • Poor location of lines, edges, intersections • Local/Regional Projections • Limited valid range • Map edge-matching problems • Non-uniform scale • Indexing: it gets worse! • Multiple “bounding boxes” or complete loss of selectivity
±180° 0° ±180° Single bounding box: high selectivity, low complexity +90 Y↑ 0 Split bounding boxes: high complexity Single bounding strip: low selectivity -90 →X -180 -90 0 +90 +180 ROUND FLAT
Geodetic DataBlade/Extender Latitude-longitude (‘geodetic’) coordinates, ellipsoidal datum Uniform accuracy and resolution around the globe (“world to cm”) No scale singularities and map edges Based on Hipparchus geometry engine by Geodyssey Ltd. Integrated time and floating-point dimensions for single-index searches and true spatio-temporal data management Powerful indexes for high performance: Voronoi Tessellation – adaptive space partitioning R-tree – self-tuning multidimensional index (up to 5 dimensions) Unique to IBM
Connect the dots... Ellipsoid: line segments connecting vertices are geodesics Flat plane: line segments connecting vertices are straight lines
Connect the dots, continued Add vertices if you want a line segment to follow a parallel (line of constant latitude) meridians parallels
Distances What is the distance from Anchorage to Tokyo? The shortest path is the shorter of thetwo possible geodesic paths:the thick part of the great circle
Polygons that straddle the 180th meridiansplit flat-plane representation into 2 or more pieces MULTIPOLYGON( ((-180 30,-165 30,-165 40, -180 40,-180 30)), ((180 30,180 40,165 40, 165 30, 180 30)) ) POLYGON( (165 30, -165 30, -165 40, 165 40) )
Polygons that enclose a pole POLYGON( (180 -60, -180 -60, -180 -90, 180 -90, -180 -60) ) POLYGON( ( 0 -60, -120 -60, 120 -60, 0 -60) ) extra edge extra vertex extra edge extra vertex
Hemispheres Western hemisphere, flat-earth representation: POLYGON((0 -90, 0 90, -180 90, 180 -90, 0 -90)) Western hemisphere, round-earth representation: POLYGON((0 -30, 0 90, 180 -30)) 2 3 2 1 3 1 4 Note that the same three points, specified in opposite order, define the eastern hemisphere