1 / 32

Database Shootout: Benchmarking spatial DBMSs

Database Shootout: Benchmarking spatial DBMSs. Wim de Haas Wilko Quak. You’re in the eye of the storm!. Early morning : Brock Anderson about WMS/PostGIS/Shapefile performance This afternoon : Kevin Neufeld about tips for the PostGIS power user

aizza
Download Presentation

Database Shootout: Benchmarking spatial DBMSs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Shootout: Benchmarking spatial DBMSs Wim de Haas Wilko Quak FOSS4G2007

  2. FOSS4G2007 You’re in the eye of the storm! Early morning: Brock Anderson about WMS/PostGIS/Shapefile performance This afternoon: Kevin Neufeld about tips for the PostGIS power user Now: Reflect on the factor 10 and the framework for testing

  3. FOSS4G2007 Overview • Introduction • What are the problems? • A classification of Spatial DBMS users • How can we help them • Benchmark proposal • First test results • Next steps

  4. FOSS4G2007 Introducing the Ministry of Transport, Public Works and Water Management Our core tasks are: • to offer protection against floods • to guarantee safeand reliable connections over land, water and through the air • to ensure clean and sufficient water • Rijkswaterstaat (RWS) is the executive branche of the Ministry of Transport

  5. FOSS4G2007 Business drivers • How to keep track of all these assets? • How to ensure consistency & coherence in operations and change of Rijkswaterstaat? • How to facilitate decisionmaking and communication • Enter the Digitaal Topografisch Bestand (DTB) • 3D • 1:1000 • EUR 60M

  6. FOSS4G2007 DTB waterway and highway

  7. FOSS4G2007 DTB Birds eye view

  8. FOSS4G2007 DTB Amsterdam Airport

  9. FOSS4G2007 Enter IVRI • The new system for data acquisition and maintenance for the DTB • Oracle 10g • ArcGIS 9.2 • Summit Evolution • Very complex project

  10. FOSS4G2007 First comment on Murphy’s Law • Murphy was an optimist • Oracle and ESRI were pushed to the limits • Took extra time in the project • Triggered us to be less dependent on Oracle • Oracle Spatial is not cheap, so can we use PostGIS as the main datastore for Spatial data?

  11. FOSS4G2007 Why bother … Stonebraker2007: • Where to find dramatic differences in Spatial DBMSs? We define “dramatically outperform” to mean at least a factor 10 advantage […then] customers will be inclined to try the new architecture

  12. FOSS4G2007 Where to expect Dramatic differences? • Operating System (No) • MySQL Spatial Extension vs PostGIS (Yes) • Choice of FileSystem (Maybe) • Functionality Difference (Yes) • Choice of Parameters (Maybe)

  13. FOSS4G2007 Problems with testing • DBMS vendors do not want published results • Oracle explicitly forbids publishing benchmark results • Hardware • Moore’s Law • I/O • Release Frequency of Software • Spatial testing cannot be done on synthetic data • Too many parameters Benchmark results are outdated before they are publised

  14. flat query geometry FOSS4G2007 Benchmark consideration: Weird Cases department diagonal query geometry

  15. FOSS4G2007 Benchmark consideration: Hot vs Cold

  16. FOSS4G2007 Solution • Do not publish the result of the benchmark • Publish a framework that lets people do their own benchmarking • No “One size fits all”: Buyer’s guide • Help different users to find best DBMS

  17. FOSS4G2007 Classification of spatial DBMS users Four classes: • Server Builders: publish spatial data via web services • GIS User: Load various datasets and perform complex analyses • Data Maintainer: Maintain one core dataset • Power Users: All of the above and more

  18. FOSS4G2007 Class 1: Web Server Builders • You do not really need a DBMS for this (You use a fraction of DBMS functionality) • This maybe oversimplified, but is used here for the purpose of clarity • Only one query counts: Find everything within BBOX

  19. FOSS4G2007 Class 2: GIS users • Main interest is functionality • Spend more time on loading data • Need a good query optimiser • Analysis

  20. FOSS4G2007 Class 3: Dataset Maintainers • Limited number of queries • Transactions are an issue • Clustering of data after updates is interesting • More time to tweak • And after all, there are a lot of buttons to push

  21. FOSS4G2007 Class 4: Power users • Do their own testing • Need a platform to discuss their findings

  22. FOSS4G2007 Benchmark components • Functionality test • Literature review • Factual testing • Very simple performance test script with few parameters • BBOX Query • Fixed Dataset (Propasal OpenStreetMap dataset) • Configurable test suite • Full Suite that tests every corner of DBMS • For specialists only

  23. FOSS4G2007 Configuration • HW • Compaq DL380 • OS • Linux RH • SW • MySQL 5.0 • PostgreSQL 8.2.4 • PostGIS 1.3.1 • Dataset is National Road Map

  24. MySQL Almost all operations in MySQL return the same result as the corresponding MBR-based functions However, MySQL is making an effort complying to full OGC support PostGIS Full OGC support FOSS4G2007 Test 1 – Functionality:MySQL vs PostGIS Functionality of MySQL is only suited for simple WMS support and no spatial operations are done on geometry

  25. FOSS4G2007 Test 2: simple BBOX select Write simple script that generates a lot of rectangle queries. Paremeter: • DBMS size • query box size • experiment length

  26. FOSS4G2007 Test 2: grow DBMS size • Question: Does query response time depend on DBMS size or on core memory? • Experiment: Run same test on more then one copies of same database

  27. FOSS4G2007 Test 2 – result: PostGIS vs MySQL

  28. FOSS4G2007 Test 2 – result: Conclusions • As long as dataset fits in core memory differences are small • MySQL can do more with less memory • MySQL degrades faster if you run out of memory • Out of the box installation is bad PR for PostGIS • Maybe because MySQL leaves caching of disk-blocks to OS, while PostgreSQL is doing it otherwise

  29. FOSS4G2007 Test 3: Comprehensive Test Suite • Create set of killer polygons so that every line of source code will be touched by running operations • Test Query optimizer • Test Join Operator • Must be done with Skewed Data

  30. FOSS4G2007 Conclusions (overall) • This is a work in progress • We still miss polygons and spatial queries • The factor 10 is not within reach, yet • No dramatic differences

  31. FOSS4G2007 How to proceed • Finish the work and publish this • Timeline OCT-NOV2007 • TU Delft wiki or osgeo.org wiki? • Start a Special Interest Group a.k.a. Committee?

  32. FOSS4G2007 Questions wim.de.haas@rws.nl c.w.quak@tudelft.nl

More Related