580 likes | 744 Views
Spatial Databases: Lecture 3. DT249 Semester 2 2013-14 Pat Browne. Outline. We will look in more detail on what happens when a spatial table is constructed in PostgreSQL/PostGIS. We will describe how construct a table for a sub-set of the historical data set 1 .
E N D
Spatial Databases: Lecture 3 DT249 Semester 2 2013-14 Pat Browne
Outline • We will look in more detail on what happens when a spatial table is constructed in PostgreSQL/PostGIS. • We will describe how construct a table for a sub-set of the historical data set1. • We take a closer look at a range of OGC queries that can be used in PostGIS2. • A reminder of viewing and querying using OpenJump • We will look at map accuracy.
PostgreSQL-PostGIS • An object-relational DBMS with PostGIS spatial extensions. • Is completely Open Source • Compliant with OGS’s Simple Features for SQL • Has a spaghetti-like spatial data model • Spatial indexing • Supports OGC types and PostgeSQL’s ‘native types’: point, line, box, path, polygon, and circle geometric types • Topology is under development. • Can perform overlay function • Simple features are based on 2D geometry with linear interpolation between vertices.
PostGIS S/W components • PostGiS provides Open Database Connectivity ODBC connectivity. PostGIS includes extensions to the underlying PostgreSQL ODBC drivers which allow transparent access to GIS objects from PostGIS via the ODBC protocol. ODBC connectivity is part of the OGC standard. • PostGIS also provides Java Database Connectivity (JDBC), which is not part of the OGC standard. • GiST (Generalized Search Tree) provides high speed spatial indexing. • PROJ.4 is an open source library that provides coordinate reprojection to convert between geographic coordinate systems. • GEOS (Geometry Engine, Open Source) is a library used by PostGIS to perform all the operations in the OpenGIS Simple Features for SQL Specification. The GEOS library is used to provide geometry tests (ST_Touches(), ST_Contains(), ST_Intersects()) and operations (ST_Buffer(), ST_Union(),ST_Intersection() ST_Difference()) within PostGIS.
PostGIS and OGC standard • PostGIS implements and is compliant with the OGC’s SimpleFeatures for SQL standard. • PostGIS supports all OGC types: Point, Line, Polygon, MultiPoint, MultiLine, MultiPolygon, GeometryCollection and operations on those types • PostGIS uses OGC well-known text format on the SQL command-line to represent GIS features.
Creating a table. • The basic steps to create a new spatially enabled table are: • Create a table with the desired non-spatial attributes. • Add a spatial column with as PostGIS/OGC extension AddGeometryColumn • Insert the geometry with a SQL insert & select statements.
Creating a spatial table • We can also create a spatial table from an existing table. • On the following slides we will describe how to make a table containing a subset of the historical data set from National Monuments Service1. We will make a table with the historical information for Dublin. We assume that the county2 table exists and that Dublin is a single region in the county table (Dublin consists of four regions). • Note a system generated identified (gid) is used as the primary key. • It is possible to include a geometry column at table creation time, but the system would not generate integrity constraints so we will stick with this method.
Creating a spatial table, step 1 CREATE TABLE "public"."dublin_historical“ (gid serial PRIMARY KEY, "rmp_prop" int8, "map_symbol" int8, "entity_id" varchar(7), "co_id" int8, "smr_val0" numeric, "nat_grid_e" numeric, "class_desc" varchar(255), "nat_grid_n" numeric, "objectid" int8, "townlands" varchar(255), "scope_n1" varchar(255), "smrs" varchar(255));
Creating a spatial table, step 1 • PostgreSQL/PostGIS will respond: • NOTICE: CREATE TABLE will create implicit sequence "dublin_historical_gid_seq" for serial column "dublin_historical.gid" • NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "dublin_historical_pkey" for table "dublin_historical"
Creating a spatial table, step 1 • Examine the table (\d): Table "public.dublin_historical" Column | Type | Modifiers --------+---------+----------------------- gid | integer | not null default nextval('dublin_historical_gid_seq'::regclass) Reset of data Indexes: "dublin_historical_pkey"PRIMARYKEY,btree (gid)
Creating a spatial table, step 2 • Here we add the geometry column, PostGIS will automatically generate integrity constraints. • SELECT AddGeometryColumn('public','dublin_historical','the_geom','29900','POINT',2); • Do not execute this command in the lab, this table already exists. • We use an version of the Irish National Grid with SRID = 29900. The type of the geometry to be stored in the column must be included, in this case it is POINT. • This command accesses geometry_columns system table (details later).
Creating a spatial table, step 2 • First system generated constraint ALTER TABLE dublin_historicalDROP CONSTRAINT enforce_dims_the_geom; ALTER TABLE dublin_historical ADD CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2);
Creating a spatial table, step 2 • Second system generated constraint ALTER TABLE dublin_historical ADD CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = ‘POINT'::text OR the_geom IS NULL);
Creating a spatial table, step 2 • Third system generated constraint ALTER TABLE dublin_historical ADD CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 29900);
Creating a spatial table, step 2 • The Primary Constraint was created in step1 CONSTRAINT dublin_historical_pkey PRIMARY KEY(gid);
Creating a spatial table, step 3 • Next we insert the data from the all Ireland historical table into the newly created dublin_historical table. • Only data is contained in Dublin is inserted into the new table. Check the OGC & PostGIS documentation on the contains predicate AKA ‘a spatial relationship function’.
Creating a spatial table, step 3 • INSERT INTO dublin_historical (rmp_prop,map_symbol, entity_id, co_id,smr_val0, nat_grid_e“ ,class_desc,nat_grid_n,objectid, townlands,scope_n1, smrs, the_geom) • SELECT rmp_prop, map_symbol, entity_id, co_id,smr_val0, nat_grid_e, class_desc, nat_grid_n, objectid, townlands, scope_n1, smrs, h.the_geom • FROM county c, historical h WHERE contains(c.the_geom,h.the_geom) AND • c.name = 'Dublin';
PostGIS system tables • The next few slides describe the built-in PostGIS meta-tables that provide the spatial functionality. We only outline the main features. For further details, please see the PostGIS manual.
geometry_columns table Column | Type |Modifiers -----------------------------+--------------------------------+---------- f_table_catalog | character varying(256) | not null f_table_schema | character varying(256) | not null f_table_name | character varying(256) | not null f_geometry_column | character varying(256) | not null coord_dimension | integer | not null srid | integer | not null type | character varying(30) | not null Indexes: "geometry_columns_pk" PRIMARY KEY, btree (f_table_catalog, f_table_schema, f _table_name, f_geometry_column) This table allows PostgreSQL/PostGIS to keep track of actual user spatial tables.
spatial_ref_sys table • Displaying a spherical earth on a flat surface requires a projection. This table uses a standard numbering, called the EPSG1, to describe various projections. Using PostgreSQL’s expanded display we can examine the details for a particular projection representing the Irish National Grid: \x select * from spatial_ref_sys where srid=29900;
spatial_ref_sys table \d spatial_ref_sys Column | Type | Modifiers -----------+--------------------------+----------- srid | integer | not null auth_name | character varying(256) | auth_srid | integer | srtext | character varying(2048) | proj4text | character varying(2048) | Indexes: "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)
Valid Geometry? select IsValid(the_geom) from lakes; NOTICE: Self-intersection at or near point 114275 271699 NOTICE: Self-intersection at or near point 124552 240642 NOTICE: Self-intersection at or near point 121283 305664 isvalid --------- t t f f f t t t
Finding the centre point of a county • To find the centre point of each county: • select name, asText(Centroid(the_geom)) from county;
Finding a bounding box • To find the bounding box of each county: • select name, asText(envelope(the_geom)) from county; • Or • select name,extent(the_geom) from county group by name; • Must use ‘group by’ which can handle multiple polygons, also extent returns the geometry as text, so no need for asText.
Finding the dimensions of a bounding box select name, extent(the_geom) from county where name= 'Dublin County Borough' group by name; • Returns Dublin County Borough | BOX(307445.9 228925.0 ,325443…) • Using the above, we can measure the dimensions of the bounding box: select distance( geomFromText(‘Point….)’), geomFromText(‘Point….)’));
Point in Bounding Box • We can use a non-OGC function && to find objects. • SELECT name FROM county where GeomFromText('POINT(309612.0 233192.0)', 29900) && the_geom; • See result on next slide. • When constructing a query it is important to remember that only the bounding-box-based operators such as && can take advantage of the PostgreSQL’s GiST spatial index.
Find things near a point • select townlands from dublin_historical WHERE distance(the_geom, GeomFromText('POINT(309612.0 233192.0)', 29900)) < 100; • When we submit such a query we should have some idea of the type of data in the result set. To find out the expected return type use PostGIS manual or the OGC standard for SQL.
Find things near a point • SELECT c.name, h.townlands FROM county AS c,dublin_historical AS h WHERE distance(h.the_geom, GeomFromText('POINT(317431.0 231704.0)', 29900)) < 1000 and c. Name = 'Dublin County Borough'; • Do we know the type of data in the result set here?
Find things near a point SELECT c.name, h.townlands FROM county AS c, dublin_historical AS h WHEREst_dwithin(h.the_geom, PointFromText('POINT(317431.0 231704.0)', 29900),1000) and c. Name = 'Dublin County Borough'; • Similar to previous query but using st_dwithin.
Find things near a point SELECT gid,townlands, distance(the_geom, GeomFromText('POINT(317431.0 231704.0)',29900)) FROM dublin_historical WHERE st_dwithin(the_geom, GeomFromText('POINT(317431.0 231704.0)',29900),500) ORDER BY distance(the_geom, GeomFromText('POINT(317431.0 231704.0)',29900)) LIMIT 100;
Finding the largest county in Ireland • SELECT name, area(the_geom)/10000 AS hectares FROM county ORDER BY hectares DESC LIMIT 1; • Leaving out Northern Ireland • SELECT name, area(the_geom)/10000 AS hectares FROM county • where name != 'Northern Ireland' • ORDER BY hectares DESC LIMIT 1;
Stored and calculated areas • Area in square KMs • SELECT name, area(the_geom)/1000000 AS Calculated, area_km2 AreaStored FROM county ;
What is the length of roads fully contained within Dublin County Borough? SELECT c.name, sum(length(r.the_geom))/1000 as roads_km FROM roads AS r, county AS c WHERE r.the_geom && r.the_geom AND contains(c.the_geom,r.the_geom) AND c.name = 'Dublin County Borough' GROUP BY c.name ORDER BY roads_km;
What historical objects are near Dublin roads? SELECT townlands FROM dublin_historical h, roads r WHERE distance(h.the_geom,r.the_geom) < 200;
Overlays • We should distinguish the overlay operation and tests for overlap. The OGCSFSQL use two similar keyword. Table-on-table overlays are possible with the ST_Intersection() function • ST_Intersects(a,b) returns BOOLEAN • ST_Intersection(a,b) returns GEOMETRY • ST_Intersects(a,b)=TRUE | FALSE • ST_Intersection()=GEOMETRY
Efficiency of Search1 It is expensive to process the exact geometry of an object. Therefore approximations such as bounding boxes (BB) or a convex s (CH) are used to help to examine candidate objects and decided whether a candidate fulfils the query or not. The are used as ‘geometric filter’
Network Queries • In order to execute network queries, we need to augment the spatial information used in the OGC Simple Features for SQL standard. We will use pgRouting1.
Network Queries1 Shortest Path from Dublin to Waterford
What two houses within 500 meters of the Chester Beatty Library have the most residents? • SELECT b1.residents • FROM buildings_geodir b1, buildings_geodir b2 • WHERE • b2.name = 'CHESTER BEATTY LIBRARY' and • ST_DWithin(b2.the_geom,b1.the_geom, 500) ORDER BY residents DESC LIMIT 2; • Note residents column will need to be added to the buildings_geodir table
Counties that have exactly 1 neighbour • SELECT c1.name • FROM county c1, county c2 • WHERE touches(c1.the_geom, c2.the_geom) = 'TRUE' • GROUP BY c1.name • HAVING count(c2.name) =1;
Proportions SELECT saps_label,primary_degree10_4,(male1_1 + female1_1) as Pop, ((cast (primary_degree10_4 as float)) / (cast ((male1_1 + female1_1) as float)) * 100) as Grad_percent FROM dublin_eds WHERE primary_degree10_4 IS NOT NULL ORDER BY Grad_percent DESC LIMIT 1; saps_label | primary_degree10_4 | pop | grad_percent ------------------------------+-----------------------------+------+------------------ 130 Pembroke West A | 455 | 4262 | 10.6757390896293 (1 row)
Accuracy • You should calculate the area of county Meath and compare the result to the area stored in the county table, you should have get two different figures for the area of Meath. Which area is correct? To answer this question we would need to know the accuracy of the area stored in the database and the accuracy the map that we used to calculate the area.
Accuracy • Obviously no spatial database or GIS can increase the positional accuracy of a spatial dataset. The accuracy does not change as the viewing scale changes. • So we need to know the scale of the original survey and the expected accuracy at that scale.
Accuracy • The accuracy of a map is dependent on the differences between the true position of features and their representative position in the map. • To find the true position requires highly accurate devices such as industrial grade Global Positioning Systems (GPS) and sophisticated mathematical software.
Accuracy • A common of way of defining positional accuracy for maps is to place limits on the root mean square error (RMSE) for individual position components (the X,Y and possibly Z i.e. height) • The RMSE is derived from the square root of the average of the squared discrepancies when compared to a higher level independent survey. The RMSE is normally defined in terms of ground scale errors (e.g. +- one metre).