Geographic Information Systems (GIS), SGO1910 & SGO4930, Spring 2004. Lecturer: Karen O’Brien (karen.obrien@cicero.uio.no). Seminar leader: Gunnar Berglund (gunnarbe@student.sv.uio.no). Geographic Databases.
A GIS can answer the question: What is where? • WHAT: Characteristics of attributes or features. • WHERE: In geographic space.
A GIS links attribute and spatial data • Attribute data: flat file, relations • Map data: point file, line file, area file, topology
Flat File Database
          Attribute   Attribute   Attribute
Record    Value       Value       Value
Record    Value       Value       Value
Record    Value       Value       Value
Arc/node map data structure with files (Figure 3.4 Arc/Node Map Data Structure with Files) • Points File: points 1 to 13, each with x, y coordinates • Arcs File: arc 1 = points 1,2,3,4,5,6,7; arc 2 = points 1,8,9,10,11,12,13,7 • File of Arcs by Polygon: polygon “A” (area, attributes) = arcs 1,2
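As a minimal sketch of the arc/node idea, the arcs file and the file of arcs by polygon can be held as two lookups and chained into a closed ring. The arc and polygon contents follow the figure; the function name `polygon_ring` is hypothetical, and point coordinates are left out because the slide gives only "x y" placeholders.

```python
# Arcs file: arc id -> ordered list of point ids (as listed in the figure).
arcs = {
    1: [1, 2, 3, 4, 5, 6, 7],
    2: [1, 8, 9, 10, 11, 12, 13, 7],
}

# File of arcs by polygon: polygon id -> list of arc ids.
polygons = {"A": [1, 2]}


def polygon_ring(polygon_id):
    """Chain a polygon's arcs into one closed ring of point ids."""
    ring = []
    for arc_id in polygons[polygon_id]:
        pts = arcs[arc_id]
        # Reverse an arc that does not start where the ring currently ends.
        if ring and pts[0] != ring[-1]:
            pts = pts[::-1]
        # Skip the shared node when appending to an existing ring.
        ring.extend(pts if not ring else pts[1:])
    return ring


ring_a = polygon_ring("A")
print(ring_a)
```

Both arcs run from node 1 to node 7, so the second arc is traversed backwards to close the polygon. Storing the shared boundary once per arc, rather than once per polygon, is what lets neighbouring polygons reuse it.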
What is a Data Model? • A logical construct for the storage and retrieval of information. • Attribute data models are needed for the DBMS. • The origin of DBMS data models is in computer science.
Definitions • Database – an integrated set of data on a particular subject • Geographic (=spatial) database - database containing geographic data of a particular subject for a particular area • Database Management System (DBMS) – software to create, maintain and access databases
A DBMS contains: • Data definition language • Data dictionary • Data-entry module • Data update module • Report generator • Query language
Advantages of Databases • Avoids redundancy and duplication • Reduces data maintenance costs • Applications are separated from the data • Applications persist over time • Support multiple concurrent applications • Better data sharing • Security and standards can be defined and enforced
Disadvantages of Databases • Expense • Complexity • Performance – especially complex data types • Integration with other systems can be difficult
Characteristics of DBMS (1) • Data model support for multiple data types • e.g., MS Access supports Text, Memo, Number, Date/Time, Currency, AutoNumber, Yes/No, OLE Object, Hyperlink, Lookup Wizard • Load data from files, databases and other applications • Index for rapid retrieval
Characteristics of DBMS (2) • Query language – SQL • Security – controlled access to data • Multi-level groups • Controlled update using a transaction manager • Backup and recovery
Role of DBMS (tasks by system) • Geographic Information System: data load, editing, visualization, mapping, analysis • Database Management System: storage, indexing, security, query • Both operate on the same data
Retrieval • The ability of the DBMS or GIS to get back on demand data that were previously stored. • Geographic search is the secret to GIS data retrieval. • Many forms of data organization are incapable of geographic search. • GIS systems have embedded DBMSs, or link to a commercial DBMS.
Types of DBMS Model • Hierarchical • Network • Relational - RDBMS • Object-oriented - OODBMS • Object-relational - ORDBMS
Historically, databases were structured hierarchically in files... • Country: Norge (Norway) • Counties: Oppland, Akershus, Hordaland • Municipalities (under Akershus): Bærum, Asker, Ski
Relational DBMS • Data stored as tuples (tup-el), conceptualized as tables • Table – data about a class of objects • Two-dimensional list (array) • Rows = objects • Columns = object states (properties, attributes)
Relation Rules • Only one value in each cell (intersection of row and column) • All values in a column are about the same subject • Each row is unique • No significance in column sequence • No significance in row sequence
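Two of the relation rules above can be checked mechanically. The following is only a sketch: the function name `relation_problems` is hypothetical, the population figures are invented, and the "same subject" rule is not tested because it depends on meaning rather than form.

```python
def relation_problems(rows):
    """Report cells holding more than one value and rows that are not unique."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        for column, value in row.items():
            if isinstance(value, (list, tuple, set, dict)):
                problems.append(f"row {i}: column {column!r} holds more than one value")
        # Sort by column name so that column sequence carries no significance.
        signature = tuple(sorted((c, repr(v)) for c, v in row.items()))
        if signature in seen:
            problems.append(f"row {i}: duplicate row")
        seen.add(signature)
    return problems


good = [{"name": "Asker", "pop": 100}, {"pop": 50, "name": "Ski"}]
bad = [
    {"name": "Asker", "pop": [100, 101]},  # two values in one cell
    {"name": "Ski", "pop": 50},
    {"pop": 50, "name": "Ski"},            # same row, columns in another order
]
print(relation_problems(good))
print(relation_problems(bad))
```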
Table • Table = object class • Row = object • Column = property • Object classes with geometry are called feature classes
Relational Join • Fundamental query operation • Table joins use common keys (column values) • Table (attribute) join concept has been extended to geographic case
Relational Data Bases

Patient Record File
Key | Check-in | Check Out | Room No.
42 | 2/1/96 | 2/4/96 | N763
78 | 2/3/96 | 2/4/96 | N712

Purchase Record File
Item | Date | Price | Customer | Key
Skate Board | 2/1/96 | 49.95 | John Smith | 42
Baseball Bat | 2/1/96 | 17.99 | James Brown | 978

Accident Report File
Date | Injury | Name | Key | Location
2/1/96 | Broken Leg | John Smith | 42 | 75 Elm Street
2/2/96 | Concussion | Sylvia Jones | 654 | 12 State Street
2/2/96 | Cut on Ear | Robert Doe | 123 | 2323 Broad Street
Most DBMS are now relational databases. • Based on multiple flat files for records, with dissimilar attribute structures, connected by a common key attribute.
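The common-key idea can be sketched with Python's built-in `sqlite3` module, using the patient and accident records from the slide. The table and column names (`patient`, `accident`, `person_key`) are assumptions made for this example, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (person_key INTEGER, check_in TEXT, check_out TEXT, room TEXT)")
conn.execute("CREATE TABLE accident (date TEXT, injury TEXT, name TEXT, person_key INTEGER, location TEXT)")

conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?, ?)",
    [(42, "2/1/96", "2/4/96", "N763"), (78, "2/3/96", "2/4/96", "N712")],
)
conn.executemany(
    "INSERT INTO accident VALUES (?, ?, ?, ?, ?)",
    [
        ("2/1/96", "Broken Leg", "John Smith", 42, "75 Elm Street"),
        ("2/2/96", "Concussion", "Sylvia Jones", 654, "12 State Street"),
        ("2/2/96", "Cut on Ear", "Robert Doe", 123, "2323 Broad Street"),
    ],
)

# Join the two files on the shared key: who was injured, and which room did they get?
joined = conn.execute(
    """
    SELECT a.name, a.injury, p.room
    FROM accident AS a
    JOIN patient AS p ON a.person_key = p.person_key
    """
).fetchall()
print(joined)
```

Only key 42 appears in both tables, so the inner join returns a single row. The two tables keep their own, dissimilar attribute structures; the key is the only thing they need to share.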
Retrieval Operations • Searches by attribute: find and browse. • Data reorganization: select, renumber, and sort. • Compute allows the creation of new attributes based on calculated values.
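The retrieval operations above can be sketched on a small in-memory table. The records are the purchase records from the earlier slide; the helper names (`find`, `select`, `sort_by`, `compute`) and the derived field `price_cents` are hypothetical, and renumbering is left out for brevity.

```python
purchases = [
    {"item": "Skate Board", "date": "2/1/96", "price": 49.95, "customer": "John Smith", "key": 42},
    {"item": "Baseball Bat", "date": "2/1/96", "price": 17.99, "customer": "James Brown", "key": 978},
]


def find(records, field, value):
    """Find: return the first record whose field equals value, or None."""
    return next((r for r in records if r[field] == value), None)


def select(records, predicate):
    """Select: keep only the records that satisfy the predicate."""
    return [r for r in records if predicate(r)]


def sort_by(records, field):
    """Sort: return the records ordered by one attribute."""
    return sorted(records, key=lambda r: r[field])


def compute(records, new_field, func):
    """Compute: return copies of the records with a new, calculated attribute."""
    return [{**r, new_field: func(r)} for r in records]


found = find(purchases, "key", 42)
cheap = select(purchases, lambda r: r["price"] < 20)
by_price = sort_by(purchases, "price")
with_cents = compute(purchases, "price_cents", lambda r: round(r["price"] * 100))
```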
Spatial Retrieval Operations • Attribute queries are not very useful for geographic search. • In a map database the records are features. • The spatial equivalent of a find is locate: the GIS highlights the result. • Spatial equivalents of the DBMS queries result in locating sets of features or building new GIS layers.
The Retrieval User Interface • GIS query is usually by command line, batch, or macro. • Most GIS packages use the GUI of the computer’s operating system to support both a menu-type query interface and a macro or programming language. • SQL is a standard interface to relational databases and is supported by many GISs.
SQL • Structured (Standard) Query Language (pronounced SEQUEL) • Developed by IBM in the 1970s • Now the de facto and de jure standard for accessing relational databases • Three types of usage • Stand-alone queries • High-level programming • Embedded in other applications
Types of SQL Statements • Data Definition Language (DDL) • Create, alter and delete data structures (tables, indexes) • CREATE TABLE, CREATE INDEX • Data Manipulation Language (DML) • Retrieve and manipulate data • SELECT, UPDATE, DELETE, INSERT • Data Control Language (DCL) • Control security of data • GRANT, CREATE USER, DROP USER
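A short sketch of DDL and DML statements, run through Python's built-in `sqlite3` module. The table `municipality`, the index name and the population values are invented for illustration. DCL statements such as GRANT are not shown because SQLite does not implement them; they belong to multi-user server databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the structure.
cur.execute("CREATE TABLE municipality (name TEXT PRIMARY KEY, county TEXT, population INTEGER)")
cur.execute("CREATE INDEX idx_municipality_county ON municipality (county)")

# DML: insert, update, delete and retrieve data (population values are made up).
cur.executemany(
    "INSERT INTO municipality VALUES (?, ?, ?)",
    [("Asker", "Akershus", 100), ("Ski", "Akershus", 50), ("Bærum", "Akershus", 200)],
)
cur.execute("UPDATE municipality SET population = population + 10 WHERE name = ?", ("Ski",))
cur.execute("DELETE FROM municipality WHERE name = ?", ("Asker",))
remaining = cur.execute("SELECT name, population FROM municipality ORDER BY name").fetchall()
print(remaining)
```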
Spatial Relations • Equals – geometries are the same • Disjoint – geometries share no common point • Intersects – geometries share at least one point • Touches – geometries intersect only at a common boundary • Crosses – geometries partially overlap, sharing some but not all interior points • Within – geometry lies within another • Contains – geometry completely contains another • Overlaps – geometries of the same dimension overlap • Relate – tests the intersections between the interiors, boundaries and exteriors of two geometries
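Some of these relations can be illustrated on axis-aligned rectangles, which keeps the tests to simple coordinate comparisons. This is a deliberate simplification and an assumption of this sketch: a real GIS evaluates the relations on arbitrary points, lines and polygons. The `Rect` type and the function names are hypothetical.

```python
from collections import namedtuple

Rect = namedtuple("Rect", "xmin ymin xmax ymax")


def equals(a, b):
    return a == b


def disjoint(a, b):
    """True when the rectangles share no point at all."""
    return a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin


def intersects(a, b):
    return not disjoint(a, b)


def interiors_overlap(a, b):
    return a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax


def touches(a, b):
    """True when the rectangles meet only along their boundaries."""
    return intersects(a, b) and not interiors_overlap(a, b)


def within(a, b):
    return b.xmin <= a.xmin and a.xmax <= b.xmax and b.ymin <= a.ymin and a.ymax <= b.ymax


def contains(a, b):
    return within(b, a)


a = Rect(0, 0, 2, 2)
b = Rect(1, 1, 3, 3)      # overlaps a
c = Rect(2, 0, 4, 2)      # shares only an edge with a
d = Rect(5, 5, 6, 6)      # far away from a
e = Rect(0.5, 0.5, 1, 1)  # inside a
```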
Spatial Methods • Distance – shortest distance between two geometries • Buffer – geometric buffer around a geometry • ConvexHull – smallest convex polygon containing a geometry • Intersection – points common to two geometries • Union – all points in either geometry • Difference – points in one geometry that are not in the other • SymDifference – points in either, but not both, of the input geometries
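The set-based methods map directly onto set operations if a geometry is approximated as a set of grid cells. That approximation is an assumption of this sketch, and Buffer and ConvexHull are left out; the function name `shortest_distance` is hypothetical.

```python
import math

# Two geometries approximated as sets of (column, row) grid cells.
a = {(0, 0), (1, 0), (1, 1)}
b = {(1, 1), (2, 1)}

intersection = a & b     # cells common to both
union = a | b            # all cells in either
difference = a - b       # cells of a that are not in b
sym_difference = a ^ b   # cells in either, but not both


def shortest_distance(g1, g2):
    """Smallest cell-to-cell distance between two geometries."""
    return min(math.hypot(x1 - x2, y1 - y2) for (x1, y1) in g1 for (x2, y2) in g2)


far = {(4, 1)}
print(shortest_distance(a, b), shortest_distance(a, far))
```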
Spatial Search • Buffering is a spatial retrieval around points, lines, or areas based on distance. • Overlay is a spatial retrieval operation that is equivalent to an attribute join.
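Buffering as a retrieval operation can be sketched for the simplest case, a buffer around a point. The feature names, coordinates and the function `within_buffer` are invented for this example; buffers around lines and areas need proper geometry routines.

```python
import math

# Hypothetical point features: name -> (x, y).
features = {"school": (1, 1), "shop": (4, 5), "station": (2, 2)}


def within_buffer(features, center, radius):
    """Return the names of all features no farther than radius from center."""
    cx, cy = center
    return sorted(
        name
        for name, (x, y) in features.items()
        if math.hypot(x - cx, y - cy) <= radius
    )


nearby = within_buffer(features, center=(0, 0), radius=3)
print(nearby)
```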
[Figures: Recode; OR overlay]
Types of overlay operations • And • Or • Max • Min
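The four overlay operations can be sketched as cell-by-cell combinations of two raster layers of equal size. The grids and the function name `overlay` are made up for illustration.

```python
def overlay(op, grid1, grid2):
    """Combine two equally sized grids cell by cell with the given operation."""
    return [[op(x, y) for x, y in zip(row1, row2)] for row1, row2 in zip(grid1, grid2)]


# Binary (0/1) layers for the logical overlays.
a = [[1, 0], [1, 1]]
b = [[0, 0], [1, 0]]
and_layer = overlay(lambda x, y: int(bool(x) and bool(y)), a, b)
or_layer = overlay(lambda x, y: int(bool(x) or bool(y)), a, b)

# Value layers for the arithmetic overlays.
c = [[3, 5], [2, 8]]
d = [[4, 1], [2, 9]]
max_layer = overlay(max, c, d)
min_layer = overlay(min, c, d)
```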
[Figure: Buffer (raster), + 1]
Complex Retrieval: Map Algebra • Combinations of spatial and attribute queries can build some complex and powerful GIS operations, such as weighting.
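Weighting can be sketched as a weighted sum of raster layers followed by an attribute-style query on the result. The layer names, cell values, weights and threshold below are all invented.

```python
# Two hypothetical suitability layers on the same 2 x 2 grid.
slope = [[1, 2], [3, 4]]
landuse = [[4, 3], [2, 1]]
weights = {"slope": 0.75, "landuse": 0.25}

# Map algebra: weighted sum, cell by cell.
score = [
    [weights["slope"] * s + weights["landuse"] * l for s, l in zip(row_s, row_l)]
    for row_s, row_l in zip(slope, landuse)
]

# Query on the new layer: which cells (row, column) score above 2.5?
selected = [
    (r, c)
    for r, row in enumerate(score)
    for c, value in enumerate(row)
    if value > 2.5
]
print(score, selected)
```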
Summary • Database – an integrated set of data on a particular subject • Databases offer many advantages over files • Relational databases dominate
Issues to discuss • how attribute data is stored in a table of rows and columns • how attribute data is associated with features • tabular field types supported in ArcGIS • types of table relationships • how tables can be related to each other • how to join tables based on a common field
Review • A geographic database contains both spatial and tabular data. The spatial data contains feature shape and location information, while the tabular data contains the attributes for the features. Often, feature attributes are contained in multiple tables.
Anatomy of a Table • Each table in a database has the same basic format: an array of rows and columns. Rows are also called records, and columns are also called fields. • Some tables, like a feature class's default feature attribute table, have a preset number of columns. For instance, a polygon coverage's feature attribute table has four standard columns: Area, Perimeter, Coverage#, and Coverage-ID, while a line shapefile's feature attribute table has only one default column, named Shape. Other tables are completely user-defined.
The example table has three user-added columns: Name, Country, and Population. ArcMap automatically adds a fourth column (FID) for display purposes. The name of this column may be different depending on the type of data source. For example, it is called FID for a coverage or shapefile, OBJECTID for a geodatabase feature class, and Order_ID for a grid. • Because some databases and some operations do not support fields with blanks in their names, you should avoid creating fields that contain them. In addition, every column in a table should have a unique name, but columns in the same table can have a variety of formats. NOTE: The Norwegian letters “æ”, “å” and “ø” can also create problems, as can decimal formats (10,1 versus 10.1).
Tabular data field types • Tables can store date, number, and text values, but most tabular formats offer several different field types for this information. • Choosing the best field type for the values to be stored is an important consideration, and the available field types vary between tabular formats. Field types supported in ArcCatalog™ include short integer, long integer, float, double, text, date, object-id, and blob.
Information stored in tables is organized by fields and field types. When defining a table's fields, be aware that each database has its own rules defining what names and characters are permitted.
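Since naming rules differ between databases, a conservative pre-check can flag the two risks mentioned earlier, blanks and non-ASCII letters such as æ, å and ø, and normalise decimal commas. This is only a sketch: the function names are hypothetical, the checks are not the rules of any particular database, and thousands separators are not handled.

```python
def field_name_problems(name):
    """List reasons why a field name may cause trouble in some databases."""
    problems = []
    if " " in name:
        problems.append("contains a blank")
    if not name.isascii():
        problems.append("contains non-ASCII characters")
    return problems


def parse_decimal(text):
    """Read both '10,1' and '10.1' as the number 10.1."""
    return float(text.replace(",", "."))


print(field_name_problems("Population"))
print(field_name_problems("Land use"))
print(field_name_problems("Bærum_pop"))
print(parse_decimal("10,1"))
```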
ArcGIS Tabular Formats • ArcGIS supports the use of multiple formats for storing and managing tabular data. Each of ArcGIS software's primary spatial formats has its own native format. Coverages use INFO-formatted tables; shapefiles store their attributes in dBASE (.dbf) format; geodatabases rely on the format of their supporting RDBMS (Oracle, for example). • Deciding on the proper format in which to store attribute information is an important part of database design and can affect the efficiency with which you are able to access feature attributes. To facilitate sharing data that's in different formats, ArcCatalog and ArcToolbox contain tools to convert between the various tabular formats. In addition, some formats, such as the coverage, can link to independent tables regardless of their format.