GIS DATABASES an overview
Contents • the basics of data storage • overview of databases • the database approach • types of databases • databases in GIS • design considerations • development of an ARC/INFO database
Conceptual, logical and physical ... [Figure: three levels of design: conceptual, logical, physical]
A storage hierarchy ... • files/tables • records • fields (types …) • databases • information systems • decision support systems (DSS) • approaches to storage: application/file based; databases [Figure label: increasing complexity]
Application based approach [Figure: each application holds its own data: Tax/Rates Assessment → Assessment Data; Permits → Permit Data; Sewer Maintenance → Sewer Data] Applications using data stored as application-specific data
Database approach [Figure: Tax/Rates Assessment, Permits and Sewer Maintenance reach Assessment Data, Permit Data and Sewer Data through a Database Management System] Database approach and use of shared data - implications for GIS
Database … a definition • A collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal fashion. • A common and controlled approach is used in adding new data and in modifying and retrieving existing data within the database.
Databases… objectives/advantages • centralised data storage and management … global view of data … data dictionary • standardisation of all aspects of data management • reduced duplication • multiple access / retrieval flexibility • integrity constraints … validation enforced • ... • data base management system (DBMS)
Database/s… data dictionary • the most critical (?) element of a database • data about data… metadata • essential for system development • uses include • design - entities and data relationships • data capture - entry/validation • operations - program documentation • maintenance (impact assessment of proposed changes, estimation of effort, cost …)
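The data capture use listed above (entry/validation) can be sketched in a few lines. This is only an illustration of the idea, not how any particular DBMS stores its dictionary: the field names, types and rules below (parcel_id, zoning_code, area_m2, the allowed zoning codes) are all hypothetical.

```python
# Hypothetical data dictionary: every field name, type and rule here is
# an illustrative assumption, not taken from a real system.
DATA_DICTIONARY = {
    "parcel_id": {"type": int, "required": True,
                  "description": "Unique parcel identifier"},
    "zoning_code": {"type": str, "required": True,
                    "allowed": {"RES", "COM", "IND"},
                    "description": "Land-use zoning class"},
    "area_m2": {"type": float, "required": False, "min": 0.0,
                "description": "Parcel area in square metres"},
}


def validate(record):
    """Return a list of validation errors for one record (empty list = valid)."""
    errors = []
    for name, rule in DATA_DICTIONARY.items():
        if name not in record:
            if rule["required"]:
                errors.append(f"{name}: missing required field")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{name}: value {value!r} not in allowed set")
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum {rule['min']}")
    # Fields that the dictionary does not describe are rejected as well.
    for name in record:
        if name not in DATA_DICTIONARY:
            errors.append(f"{name}: not defined in data dictionary")
    return errors


good = {"parcel_id": 101, "zoning_code": "RES", "area_m2": 850.0}
bad = {"parcel_id": "101", "zoning_code": "XYZ"}
print(validate(good))
print(validate(bad))
```

Because the rules live in one structure and not in each application, a proposed change (say, a new zoning code) is made in one place, which is the maintenance benefit the slide points to.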
DBMS … key modules • a data description/definition module • defines/creates/restructures • enforces rules • a query module • retrieval for queries, ad-hoc queries, simple reports • a report writing program • a high level language interface • ...
Database… stages of development • information systems plan for organisation • system specification … user needs analysis • conceptual design … data modelling • hardware and software independent • physical design … database design • database implementation • monitoring/audit
Organisational strategy and IT: Land Information System (LIS) (i) • Problems/issues: • rationalisation of land related information in government agencies • the removal/reduction of duplication • introduction of economies in data capture, maintenance and storage • better (and wider) access to data • solutions ...
Organisational strategy and IT: Land Information System (LIS) (ii) • Solutions: • better data distribution mechanism (data format and location transparent to user) • knowledge of data distribution built into the data dictionary • reduction of data duplication • uniform query language (SQL) • coding and data interchange standardisation (… SDTS)
Database types - a history [Figure: evolution of database technology]
Database types - hierarchical (i) • lends itself to GIS use as data are often hierarchical in structure e.g. municipality x province x country • records divided into logically related fields … connected in a tree-like arrangement • master field in each group of records … pointers … updates require pointers to be modified • fast preset queries … ad hoc queries difficult or impossible
Database types - hierarchical (ii) [Figure: tree with levels COUNTRY (USA) → States → Counties → Boundaries → Nodes]
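A minimal sketch of the hierarchical idea, assuming a simple in-memory tree; it is not the storage layout of any real hierarchical DBMS. The Node class and the state and county names are hypothetical. It shows why a preset query that follows the tree is fast, while an ad hoc condition forces a visit to every record.

```python
class Node:
    """One record in the tree; each child keeps a pointer to its parent."""

    def __init__(self, name, level, parent=None):
        self.name = name
        self.level = level
        self.parent = parent
        self.children = {}
        if parent is not None:
            # Creating a record means updating the parent's pointers too.
            parent.children[name] = self


def preset_query(root, path):
    """Follow the predefined parent-to-child path; cost depends on depth only."""
    node = root
    for name in path:
        node = node.children[name]
    return node


def ad_hoc_query(root, predicate):
    """No access path exists for arbitrary conditions, so visit every record."""
    matches, stack = [], [root]
    while stack:
        node = stack.pop()
        if predicate(node):
            matches.append(node.name)
        stack.extend(node.children.values())
    return sorted(matches)


# Illustrative data only.
usa = Node("USA", "country")
state_a = Node("State A", "state", usa)
state_b = Node("State B", "state", usa)
Node("County A1", "county", state_a)
Node("County A2", "county", state_a)
Node("County B1", "county", state_b)

print(preset_query(usa, ["State A", "County A2"]).name)
print(ad_hoc_query(usa, lambda n: n.level == "county"))
```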
Database types - network (i) • similar to hierarchical but has multiple connections between files to accommodate many-to-many (M:M) relationships • access to a particular file without searching the entire hierarchy above that file • linked records … quick preset searches … large overhead in pointer management • modification after creation difficult
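The many-to-many case can be sketched with a boundary shared by two counties, which a strict tree cannot hold without duplicating the boundary. This is an assumed in-memory illustration (the NetworkStore class and all names are hypothetical); the point is that every update has to maintain pointers in both directions.

```python
from collections import defaultdict


class NetworkStore:
    """Records linked by explicit two-way pointers (a many-to-many set)."""

    def __init__(self):
        self.boundaries_of = defaultdict(set)  # county -> its boundaries
        self.counties_of = defaultdict(set)    # boundary -> counties it borders

    def link(self, county, boundary):
        # Both directions must be kept in step on every update.
        self.boundaries_of[county].add(boundary)
        self.counties_of[boundary].add(county)

    def unlink(self, county, boundary):
        self.boundaries_of[county].discard(boundary)
        self.counties_of[boundary].discard(county)

    def neighbours(self, county):
        """Counties sharing at least one boundary with the given county."""
        found = set()
        for boundary in self.boundaries_of[county]:
            found |= self.counties_of[boundary]
        found.discard(county)
        return sorted(found)


# Illustrative data only.
store = NetworkStore()
store.link("County A", "b1")
store.link("County B", "b1")
store.link("County B", "b2")
store.link("County C", "b2")

print(store.neighbours("County B"))
```

The neighbours search is quick because it only follows stored links, but link and unlink show the pointer-management overhead noted on the slide.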
Database types - relational (i) • model developed from mathematics • records and fields in a 2-dimensional table • no pointers etc … any field can be used to link one table to another • normalisation … redundancy/stable structure • ad hoc queries (SQL) … modifications easy • not very efficient for GIS … SQL3
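As a rough illustration of linking tables on a shared field and asking an ad hoc question in SQL, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, columns and rows are hypothetical.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zoning (zoning_code TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY,
                         zoning_code TEXT,
                         area_m2 REAL);
    INSERT INTO zoning VALUES ('RES', 'Residential'), ('COM', 'Commercial');
    INSERT INTO parcel VALUES (1, 'RES', 800.0),
                              (2, 'COM', 1500.0),
                              (3, 'RES', 1200.0);
""")

# Ad hoc query: no pointers were set up in advance; the two tables are
# linked at query time through the shared zoning_code field.
rows = conn.execute("""
    SELECT p.parcel_id, z.description
    FROM parcel AS p
    JOIN zoning AS z ON z.zoning_code = p.zoning_code
    WHERE p.area_m2 > ?
    ORDER BY p.parcel_id
""", (1000.0,)).fetchall()

print(rows)
```

Note that nothing here is spatial: area is just a stored number. Queries on geometry itself are where plain SQL becomes awkward, which is the limitation the last bullet refers to.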
[Figure: hierarchical structure, network structure, relational structure (part…)]
Centralised vs distributed • a database does not necessarily mean a centralised arrangement i.e. all data in one physical place
GIS and distributed databases ... • trend towards open systems ... • special hardware and software can be used widely … specific applications optimised • system/network communications are easier • modular implementation from an overall design … incremental change • unlimited capacity (nodes) … lower risks
Approaches to GIS system design • develop a proprietary system • develop a hybrid system: proprietary graphics + commercial DBMS for attribute data (e.g. ARC/INFO) • use commercial DBMS and develop spatial functions and graphics display used in geographic analysis (e.g. siroDBMS, System9) • develop a spatial DBMS from scratch
[Figure: (1) Separate: spatial and attribute data held apart and connected by software linkages (2) Integrated: spatial and attribute data held together]
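The separate (hybrid) arrangement can be sketched as two stores joined only by a shared feature id. This is an assumed toy model, not ARC/INFO's actual file layout; the feature ids, coordinates and zoning values are hypothetical.

```python
# Spatial store: feature id -> polygon vertices (illustrative coordinates).
geometry = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],
    2: [(4, 0), (6, 0), (6, 3), (4, 3)],
}

# Attribute store: feature id -> attribute record (illustrative values).
attributes = {
    1: {"zoning": "RES"},
    2: {"zoning": "COM"},
}


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def select_geometry(zoning):
    """Query the attribute store, then follow the shared id to the geometry."""
    return {fid: geometry[fid]
            for fid, record in attributes.items()
            if record["zoning"] == zoning}


def broken_links():
    """Feature ids present in one store but missing from the other."""
    return sorted(set(geometry) ^ set(attributes))


for fid, shape in select_geometry("RES").items():
    print(fid, polygon_area(shape))
print(broken_links())
```

The broken_links check is the price of separation: the software linkage has to be verified, because nothing in either store guarantees that the other holds a matching id.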
GIS databases … some problems (i) • centralised risk • centralisation demands better quality control, otherwise higher potential for disaster • cost • large DBMSs are expensive to design, implement and operate • piecemeal design is difficult • complexity • need to keep track of complex hardware and software • need to keep track of graphical as well as attribute data and the links
GIS databases … some problems (ii) [Figure: cascading effects of change in a GIS database (ESRI 1989)]
Objectives of design • a good design results in a database which: • contains necessary data but no redundant data • organises data so that different users access the same data • accommodates different views of the data • distinguishes applications which maintain data from those that use it • appropriately represents, codes and organises geographic features
Design methodology (for ARC/INFO) • conceptual model • model the users’ view • define entities and their relationships • logical model • identify representation of entities • match to ARC/INFO data model • organise into geographic data sets • physical model
Design methodology (for ARC/INFO) • 1. Model the users’ view • 2. Define entities and their relationships • 3. Identify representation of entities • 4. Match to ARC/INFO data model • 5. Organise into geographic data sets
1. Model the users’ view • create a model of work performed by users for which ‘location’ is a factor • identify organisational functions • identify the data which supports the functions • organise data into sets of geographic features • data function matrix • high level classification of data • interdependence of data and function • difference between users and creators of data
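A data function matrix can be sketched as a small lookup table. The function and data set names reuse those from the application-based example earlier in the deck, but the cell values ("C" for creates/maintains, "U" for uses) and the "Parcel Data" set are hypothetical.

```python
# Hypothetical data function matrix.
# "C" = the function creates/maintains the data set, "U" = it only uses it.
MATRIX = {
    "Tax/Rates Assessment": {"Assessment Data": "C", "Permit Data": "U"},
    "Permits": {"Permit Data": "C", "Assessment Data": "U", "Sewer Data": "U"},
    "Sewer Maintenance": {"Sewer Data": "C"},
}


def functions_with(data_set, role):
    """Functions that have the given role ('C' or 'U') for a data set."""
    return sorted(name for name, row in MATRIX.items()
                  if row.get(data_set) == role)


def unmaintained(data_sets):
    """Data sets that no function creates or maintains."""
    return [d for d in data_sets if not functions_with(d, "C")]


print(functions_with("Assessment Data", "C"))
print(functions_with("Assessment Data", "U"))
print(unmaintained(["Assessment Data", "Parcel Data"]))
```

Reading the matrix by column separates creators from users of each data set, and a column with no "C" flags data that nobody is responsible for maintaining.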
2. Define entities and their relationships • entities: distinguishable objects which have a common set of properties • identify and describe entities • identify and describe the relationships among these entities • document the process • diagrams • data dictionary • normalise the data
Normalisation • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF) [Figure legend: ASR - Assessor]
Underlying entities... • Parcel • Zoning • Owner • Ownership
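One way to picture how the four entities fall out of normalisation is to start from a flat, repetitive assessor-style listing and split it. The flat columns and all values below are hypothetical; only the entity names (Parcel, Zoning, Owner, Ownership) come from the slide.

```python
# Hypothetical flat records, one row per parcel/owner pair:
# (parcel_id, zoning_code, zoning_desc, owner_id, owner_name, share)
flat = [
    (1, "RES", "Residential", 10, "Owner X", 0.5),
    (1, "RES", "Residential", 11, "Owner Y", 0.5),
    (2, "COM", "Commercial", 10, "Owner X", 1.0),
]


def normalise(rows):
    """Split flat rows into Parcel, Zoning, Owner and Ownership tables."""
    parcel, zoning, owner, ownership = {}, {}, {}, []
    for parcel_id, z_code, z_desc, owner_id, owner_name, share in rows:
        zoning[z_code] = z_desc        # description depends on the code only
        parcel[parcel_id] = z_code     # zoning depends on the parcel only
        owner[owner_id] = owner_name   # name depends on the owner only
        # Ownership resolves the many-to-many link between parcels and owners.
        ownership.append((parcel_id, owner_id, share))
    return parcel, zoning, owner, ownership


parcel, zoning, owner, ownership = normalise(flat)
print(parcel)
print(zoning)
print(owner)
print(ownership)
```

After the split, each fact is stored once: renaming an owner or re-describing a zoning code touches a single entry, where the flat listing would need every matching row changed.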
3. Identify representation of entities • determine the most effective spatial representation for geographic features • consider whether: • a feature might be represented on a map • the shape of a feature might be significant in performing geographic analysis • the feature will have different representations at different map scales • textual attributes of the feature will be displayed on map products • ...