320 likes | 499 Views
Database Shootout: Benchmarking spatial DBMSs. Wim de Haas Wilko Quak. You’re in the eye of the storm!. Early morning : Brock Anderson about WMS/PostGIS/Shapefile performance This afternoon : Kevin Neufeld about tips for the PostGIS power user
E N D
Database Shootout: Benchmarking spatial DBMSs Wim de Haas Wilko Quak FOSS4G2007
FOSS4G2007 You’re in the eye of the storm! Early morning: Brock Anderson about WMS/PostGIS/Shapefile performance This afternoon: Kevin Neufeld about tips for the PostGIS power user Now: Reflect on the factor 10 and the framework for testing
FOSS4G2007 Overview • Introduction • What are the problems? • A classification of Spatial DBMS users • How can we help them • Benchmark proposal • First test results • Next steps
FOSS4G2007 Introducing the Ministry of Transport, Public Works and Water Management Our core tasks are: • to offer protection against floods • to guarantee safeand reliable connections over land, water and through the air • to ensure clean and sufficient water • Rijkswaterstaat (RWS) is the executive branche of the Ministry of Transport
FOSS4G2007 Business drivers • How to keep track of all these assets? • How to ensure consistency & coherence in operations and change of Rijkswaterstaat? • How to facilitate decisionmaking and communication • Enter the Digitaal Topografisch Bestand (DTB) • 3D • 1:1000 • EUR 60M
FOSS4G2007 DTB waterway and highway
FOSS4G2007 DTB Birds eye view
FOSS4G2007 DTB Amsterdam Airport
FOSS4G2007 Enter IVRI • The new system for data acquisition and maintenance for the DTB • Oracle 10g • ArcGIS 9.2 • Summit Evolution • Very complex project
FOSS4G2007 First comment on Murphy’s Law • Murphy was an optimist • Oracle and ESRI were pushed to the limits • Took extra time in the project • Triggered us to be less dependent on Oracle • Oracle Spatial is not cheap, so can we use PostGIS as the main datastore for Spatial data?
FOSS4G2007 Why bother … Stonebraker2007: • Where to find dramatic differences in Spatial DBMSs? We define “dramatically outperform” to mean at least a factor 10 advantage […then] customers will be inclined to try the new architecture
FOSS4G2007 Where to expect Dramatic differences? • Operating System (No) • MySQL Spatial Extension vs PostGIS (Yes) • Choice of FileSystem (Maybe) • Functionality Difference (Yes) • Choice of Parameters (Maybe)
FOSS4G2007 Problems with testing • DBMS vendors do not want published results • Oracle explicitly forbids publishing benchmark results • Hardware • Moore’s Law • I/O • Release Frequency of Software • Spatial testing cannot be done on synthetic data • Too many parameters Benchmark results are outdated before they are publised
flat query geometry FOSS4G2007 Benchmark consideration: Weird Cases department diagonal query geometry
FOSS4G2007 Benchmark consideration: Hot vs Cold
FOSS4G2007 Solution • Do not publish the result of the benchmark • Publish a framework that lets people do their own benchmarking • No “One size fits all”: Buyer’s guide • Help different users to find best DBMS
FOSS4G2007 Classification of spatial DBMS users Four classes: • Server Builders: publish spatial data via web services • GIS User: Load various datasets and perform complex analyses • Data Maintainer: Maintain one core dataset • Power Users: All of the above and more
FOSS4G2007 Class 1: Web Server Builders • You do not really need a DBMS for this (You use a fraction of DBMS functionality) • This maybe oversimplified, but is used here for the purpose of clarity • Only one query counts: Find everything within BBOX
FOSS4G2007 Class 2: GIS users • Main interest is functionality • Spend more time on loading data • Need a good query optimiser • Analysis
FOSS4G2007 Class 3: Dataset Maintainers • Limited number of queries • Transactions are an issue • Clustering of data after updates is interesting • More time to tweak • And after all, there are a lot of buttons to push
FOSS4G2007 Class 4: Power users • Do their own testing • Need a platform to discuss their findings
FOSS4G2007 Benchmark components • Functionality test • Literature review • Factual testing • Very simple performance test script with few parameters • BBOX Query • Fixed Dataset (Propasal OpenStreetMap dataset) • Configurable test suite • Full Suite that tests every corner of DBMS • For specialists only
FOSS4G2007 Configuration • HW • Compaq DL380 • OS • Linux RH • SW • MySQL 5.0 • PostgreSQL 8.2.4 • PostGIS 1.3.1 • Dataset is National Road Map
MySQL Almost all operations in MySQL return the same result as the corresponding MBR-based functions However, MySQL is making an effort complying to full OGC support PostGIS Full OGC support FOSS4G2007 Test 1 – Functionality:MySQL vs PostGIS Functionality of MySQL is only suited for simple WMS support and no spatial operations are done on geometry
FOSS4G2007 Test 2: simple BBOX select Write simple script that generates a lot of rectangle queries. Paremeter: • DBMS size • query box size • experiment length
FOSS4G2007 Test 2: grow DBMS size • Question: Does query response time depend on DBMS size or on core memory? • Experiment: Run same test on more then one copies of same database
FOSS4G2007 Test 2 – result: PostGIS vs MySQL
FOSS4G2007 Test 2 – result: Conclusions • As long as dataset fits in core memory differences are small • MySQL can do more with less memory • MySQL degrades faster if you run out of memory • Out of the box installation is bad PR for PostGIS • Maybe because MySQL leaves caching of disk-blocks to OS, while PostgreSQL is doing it otherwise
FOSS4G2007 Test 3: Comprehensive Test Suite • Create set of killer polygons so that every line of source code will be touched by running operations • Test Query optimizer • Test Join Operator • Must be done with Skewed Data
FOSS4G2007 Conclusions (overall) • This is a work in progress • We still miss polygons and spatial queries • The factor 10 is not within reach, yet • No dramatic differences
FOSS4G2007 How to proceed • Finish the work and publish this • Timeline OCT-NOV2007 • TU Delft wiki or osgeo.org wiki? • Start a Special Interest Group a.k.a. Committee?
FOSS4G2007 Questions wim.de.haas@rws.nl c.w.quak@tudelft.nl