Handling Large (Vector) Datasets with MapServer

with your hosts, Schuyler Erle and Rich Gibson of Locative Technologies, Sebastopol, CA
MapServer User Meeting 2005

Getting MapServer to scale with large datasets is an art ...

Fortunately, thanks to the miracle of modern technology, we can all be MapServer artists!

The Low Hanging Fruit

MAXSCALE is your friend – use it!

The Shape(file) of Things to Come

If your vector data is static, why not use ESRI Shapefiles?

* shptree builds quadtree indexes of Shapefiles on disk
* The indexes live in .qix files in your data directory
* shptree ships with MapServer
* shptree is your friend – use it!

* Stephen Woodbridge's shp2tile builds a quadtree index of a Shapefile – and then splits it into a number of smaller Shapefiles of about equal size
* Use the -q option to shp2tile for best results
* Use ogrtindex and TILEINDEX to stitch them back together in MapServer
* Get shp2tile from http://imaptools.org/
* shp2tile is your friend – use it!

Keep It Simple, Stupid

* Sometimes you just have too much data to show at once...
* Vector generalization (a.k.a. “line simplification”) is the answer
** Douglas-Peucker simplification
** Grid-based approximations
* There aren't really any good F/OSS implementations
* One could patch ogr2ogr to perform line simplification
* Keep an eye on http://mappinghacks.com/
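The two generalization approaches named on this slide can be tried out in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not code from ogr2ogr, shp2tile, or MapServer: a line is assumed to be a list of (x, y) tuples, and the function names douglas_peucker and snap_to_grid are hypothetical names chosen for the example.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) tuples."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = -1.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


def snap_to_grid(points, cell):
    """Grid-based approximation: snap vertices to a grid of size `cell`
    and drop consecutive vertices that land on the same grid node."""
    out = []
    for x, y in points:
        snapped = (round(x / cell) * cell, round(y / cell) * cell)
        if not out or out[-1] != snapped:
            out.append(snapped)
    return out


if __name__ == "__main__":
    line = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
    print(douglas_peucker(line, 0.1))
    print(snap_to_grid(line, 2.0))
```

A larger tolerance (or cell size) removes more vertices, so one plausible use is to pre-compute a few simplified copies of a layer and show each only within a matching scale range.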

The Check is in the Post(GIS)

What if your data needs to be readable and writeable?
Get PostGIS from http://postgis.refractions.net/
PostGIS is your friend – use it!

Adding an R-Tree index to a PostGIS spatial table is easy...

    CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometrycolumn] );

Then run SELECT UPDATE_GEOMETRY_STATS() or even just VACUUM ANALYZE
GiST indexes are your friend – use them!

Operator, Can You Help Me?

Actually using that index is a bit trickier...
* The following query...

    SELECT id, the_geom FROM some_table
    WHERE Contains(the_geom, GeomFromEWKT('POINT(0 51.5)'));

... doesn't actually use the index!

What you really meant to say was...

    SELECT id, the_geom FROM some_table
    WHERE the_geom && GeomFromEWKT('POINT(0 51.5)')
    AND Contains(the_geom, GeomFromEWKT('POINT(0 51.5)'));

Which means you can do nice things like...

    SELECT * FROM some_table
    WHERE the_geom && Expand(GeomFromEWKT('POINT(0 51.5)'), 100)
    AND Distance(GeomFromEWKT('POINT(0 51.5)'), the_geom) < 100;

Spatial operators are your friends – use them!
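The pattern in these queries is filter-then-refine: a cheap bounding-box overlap test (the && step) narrows the candidates, and the exact test (Contains or Distance) runs only on what survives. The sketch below illustrates that logic in plain Python; it does not model the GiST index or call PostGIS. It assumes each geometry is a simple list of (x, y) points, and every function name in it (bbox, expand, boxes_overlap, bbox_candidates, within_distance) is hypothetical.

```python
import math


def bbox(points):
    """Return (minx, miny, maxx, maxy) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def expand(box, d):
    """Grow a bounding box by d in every direction."""
    minx, miny, maxx, maxy = box
    return (minx - d, miny - d, maxx + d, maxy + d)


def boxes_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def bbox_candidates(features, point, radius):
    """Step 1 (cheap): keep features whose box overlaps the search box."""
    search_box = expand(bbox([point]), radius)
    return [f for f in features if boxes_overlap(bbox(f["geom"]), search_box)]


def within_distance(features, point, radius):
    """Step 2 (exact): measure true distance only for the candidates."""
    px, py = point
    hits = []
    for f in bbox_candidates(features, point, radius):
        nearest = min(math.hypot(x - px, y - py) for x, y in f["geom"])
        if nearest < radius:
            hits.append(f["id"])
    return hits


if __name__ == "__main__":
    features = [
        {"id": 1, "geom": [(0, 51.5)]},
        {"id": 2, "geom": [(90, 51.5), (95, 60)]},
        {"id": 3, "geom": [(80, 120)]},    # inside the box, outside the circle
        {"id": 4, "geom": [(500, 500)]},   # rejected by the box test alone
    ]
    print([f["id"] for f in bbox_candidates(features, (0, 51.5), 100)])
    print(within_distance(features, (0, 51.5), 100))
```

The box test can admit features that the exact test later rejects (feature 3 in the sample data), which is why the queries keep both conditions rather than relying on the bounding-box step alone.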

Query the Query Planner

EXPLAIN SELECT [...] can tell you when PostgreSQL is actually using your index and when it isn't
EXPLAIN is your friend – use it!

Cluster 'round, everyone!

* If your table is mostly read-only and relies primarily on a single spatial index, PostgreSQL can re-order the rows to make indexed queries even faster...
* The geometry column must be constrained to NOT NULL!

    CLUSTER my_geom_index ON my_table;

Finally...

The PostgreSQL docs are stellar – as are the PostGIS docs...
Refractions Research are your friends – thank them!
