Handling Large (Vector) Datasets with MapServer
MapServer User Meeting 2005
with your hosts, Schuyler Erle and Rich Gibson of Locative Technologies, Sebastopol, CA

Handling Large (Vector) Datasets
Getting MapServer to scale with large datasets is an art ...

Handling Large (Vector) Datasets
Fortunately, thanks to the miracle of modern technology, we can all be MapServer artists!

The Low Hanging Fruit
MAXSCALE is your friend – use it!

The Shape(file) of Things to Come
If your vector data is static, why not use ESRI Shapefiles?

The Shape(file) of Things to Come
* shptree builds quadtree indexes of Shapefiles on disk
* The indexes live in .qix files in your data directory
* shptree ships with MapServer
* shptree is your friend – use it!

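To make the idea behind a quadtree index concrete, here is a minimal in-memory sketch. It is not the .qix format and not how shptree is implemented; the class name `QuadTree`, the helper functions, and the `max_depth` parameter are all hypothetical, chosen only for illustration. Each feature is stored in the smallest quadrant that fully contains its bounding box, so a query can skip whole quadrants that do not touch the search window.

```python
def contains(outer, inner):
    """True if box `inner` lies entirely inside box `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def overlaps(a, b):
    """True if boxes a and b (minx, miny, maxx, maxy) intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class QuadTree:
    """Illustrative quadtree over feature bounding boxes (hypothetical API)."""

    def __init__(self, bounds, depth=0, max_depth=6):
        self.bounds = bounds
        self.depth = depth
        self.max_depth = max_depth
        self.items = []      # (feature_id, box) pairs kept at this node
        self.children = {}   # quadrant bounds -> QuadTree, created on demand

    def _quadrants(self):
        minx, miny, maxx, maxy = self.bounds
        midx, midy = (minx + maxx) / 2, (miny + maxy) / 2
        return [(minx, miny, midx, midy), (midx, miny, maxx, midy),
                (minx, midy, midx, maxy), (midx, midy, maxx, maxy)]

    def insert(self, feature_id, box):
        # Push the feature down into the first quadrant that fully contains it.
        if self.depth < self.max_depth:
            for quad in self._quadrants():
                if contains(quad, box):
                    child = self.children.get(quad)
                    if child is None:
                        child = QuadTree(quad, self.depth + 1, self.max_depth)
                        self.children[quad] = child
                    child.insert(feature_id, box)
                    return
        # The box straddles a quadrant boundary, or the depth limit was hit.
        self.items.append((feature_id, box))

    def query(self, window):
        """Return ids of features whose boxes overlap the search window."""
        found = [fid for fid, box in self.items if overlaps(box, window)]
        for quad, child in self.children.items():
            if overlaps(quad, window):   # prune quadrants outside the window
                found.extend(child.query(window))
        return found
```

A usage sketch: build the tree once over the extent of the data, insert every feature's bounding box, then call `query()` with the map window. Features in quadrants far from the window are never examined, which is where the speed-up comes from.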
The Shape(file) of Things to Come
* Stephen Woodbridge's shp2tile builds a quadtree index of a Shapefile – and then splits it into a number of smaller Shapefiles of about equal size
* Use the -q option to shp2tile for best results
* Use ogrtindex and TILEINDEX to stitch them back together in MapServer
* Get shp2tile from http://imaptools.org/
* shp2tile is your friend – use it!

Keep It Simple, Stupid
* Sometimes you just have too much data to show at once...
* Vector generalization (a.k.a. “line simplification”) is the answer
** Douglas-Peucker simplification
** Grid-based approximations
* There aren't really any good F/OSS implementations
* One could patch ogr2ogr to perform line simplification
* Keep an eye on http://mappinghacks.com/

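The two generalization approaches named above can be sketched in a few lines of Python. This is an illustrative sketch only, not a patch for ogr2ogr; the function names `douglas_peucker` and `snap_to_grid`, and the `tolerance` and `cell` parameters, are assumptions made for the example. Coordinates are treated as planar (x, y) tuples.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment, clamping to the endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, always keeping its first and last vertices."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the vertex farthest from the chord joining the endpoints.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Every interior vertex is close enough to the chord: drop them all.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


def snap_to_grid(points, cell):
    """Grid-based approximation: snap vertices to a grid, drop repeats."""
    snapped = []
    for x, y in points:
        q = (math.floor(x / cell + 0.5) * cell,
             math.floor(y / cell + 0.5) * cell)
        if not snapped or snapped[-1] != q:
            snapped.append(q)
    return snapped
```

In both cases the tolerance (or cell size) would presumably be tied to the map scale: the coarser the zoom level, the more vertices can be dropped without any visible change.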
The Check is in the Post(GIS)
What if your data needs to be readable and writeable?
Get PostGIS from http://postgis.refractions.net/
PostGIS is your friend – use it!

The Check is in the Post(GIS)
Adding an R-Tree index to a PostGIS spatial table is easy...
    CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometrycolumn] );
Then run SELECT UPDATE_GEOMETRY_STATS() or even just VACUUM ANALYZE
GiST indexes are your friend – use them!

Operator, Can You Help Me?
Actually using that index is a bit trickier...
* The following query....
    SELECT id, the_geom FROM some_table
    WHERE Contains(the_geom, GeomFromEWKT('POINT(0 51.5)'));
... doesn't actually use the index!

Operator, Can You Help Me?
What you really meant to say was...
    SELECT id, the_geom FROM some_table
    WHERE the_geom && GeomFromEWKT('POINT(0 51.5)')
    AND Contains(the_geom, GeomFromEWKT('POINT(0 51.5)'));

Operator, Can You Help Me?
Which means you can do nice things like...
    SELECT * FROM some_table
    WHERE the_geom && Expand(GeomFromEWKT('POINT(0 51.5)'), 100)
    AND Distance(GeomFromEWKT('POINT(0 51.5)'), the_geom) < 100;
Spatial operators are your friends – use them!

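The two-stage pattern in these queries (a cheap bounding-box test first, then the exact test on whatever survives) can be sketched outside the database too. The Python below is only an analogy under stated assumptions: the function names `expand`, `boxes_overlap`, and `within_distance` are hypothetical, the data is a plain dict of points, and stage 1 is a linear scan here, whereas in PostGIS it is the part the GiST index answers.

```python
import math


def expand(box, amount):
    """Grow a (minx, miny, maxx, maxy) box by `amount` on every side."""
    minx, miny, maxx, maxy = box
    return (minx - amount, miny - amount, maxx + amount, maxy + amount)


def boxes_overlap(a, b):
    """Bounding-box overlap test, the role the && operator plays."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def within_distance(points, center, radius):
    """Return ids of points closer than `radius` to `center`."""
    cx, cy = center
    search_box = expand((cx, cy, cx, cy), radius)
    hits = []
    for point_id, (x, y) in points.items():
        # Stage 1: cheap box test; rejects most candidates outright.
        if not boxes_overlap((x, y, x, y), search_box):
            continue
        # Stage 2: exact distance test, run only on the survivors.
        if math.hypot(x - cx, y - cy) < radius:
            hits.append(point_id)
    return hits
```

The box test alone is not enough: a point near a corner of the expanded box passes stage 1 but is farther than the radius from the center, which is why the exact `Distance(...) < 100` condition stays in the query.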
Query the Query Planner
EXPLAIN SELECT [...] can tell you when PostgreSQL is actually using your index and when it isn't
EXPLAIN is your friend – use it!

Cluster 'round, everyone!
* If your table is mostly read-only and relies primarily on a single spatial index, PostgreSQL can re-order the rows to make indexed queries even faster...
    CLUSTER my_geom_index ON my_table;
* The geometry column must be constrained to NOT NULL!

Finally...
The PostgreSQL docs are stellar – as are the PostGIS docs...
Refractions Research are your friends – thank them!