Environmental GIS Nicholas A. Procopio, Ph.D, GISP nick@drexel.edu
Data Types • In GIS, there are three main types of data • Spatial • Attribute • Metadata Zygo, Lisa, Baylor University, Lecture Notes, 2002
Data Sources • Data Types • Primary – Measurements collected through first-hand observations • Secondary – Measurements obtained from a secondary source (e.g., neighborhood surveys)
Metadata • Data documentation • Data about the data • Explains the form, content, accuracy, precision, usability, creator, purpose, etc. • Metadata standards exist • Metadata is a part of geospatial data
Metadata • Metadata information includes • Identification – title, area, dates, owners, organizations, etc. • Data quality – attribute accuracy and spatial precision, consistency, sources of info, and methods of data production • Spatial data organization – raster-vector format and organization of features in the data set, data model • Spatial reference – map projections, datums, and coordinate system
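As a rough illustration of the four groups above, a metadata record can be thought of as a nested structure with one section per group. This is only a sketch: the key names, the sample values, and the `missing_sections` helper are invented here and do not follow the element names of any metadata standard.

```python
# Hypothetical metadata record with one section per group listed above.
# All keys and values are made up for illustration.
metadata = {
    "identification": {
        "title": "Example wetlands layer",
        "area": "Example County",
        "dates": "2001-2002",
        "owner": "Example agency",
    },
    "data_quality": {
        "attribute_accuracy": "field-checked sample",
        "sources": ["aerial photography"],
    },
    "spatial_data_organization": {
        "format": "vector",
        "feature_type": "polygon",
    },
    "spatial_reference": {
        "projection": "UTM zone 18N",
        "datum": "NAD83",
    },
}

REQUIRED_SECTIONS = (
    "identification",
    "data_quality",
    "spatial_data_organization",
    "spatial_reference",
)


def missing_sections(record):
    """Return the required sections that are absent from a record."""
    return [name for name in REQUIRED_SECTIONS if name not in record]


print(missing_sections(metadata))
```

A check like `missing_sections` is one simple way to flag an undocumented data set before it is shared.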
Metadata • Metadata is created to… • Protect investment in data • Staff turnover, memory loss • Makes it easier to reuse and update data • Provides documentation of sources, quality • Easier to share data • Helping the user understand the data • Provides consistent terminology • Focuses on key elements • Helps user determine fitness for use • Facilitates data transfer, interpretation by new users
Federal Geographic Data Committee http://www.fgdc.gov/ Under Executive Order No. 12906, all federal agencies and organizations must document their geospatial data using the FGDC Content Standard for Digital Geospatial Metadata
Federal Geographic Data Committee • Compliance with this executive order will… • Minimize duplication of data • Foster cooperative digital data collection activities • Establish a national framework of quality data
Metadata Use ArcCatalog to create and edit metadata
Database Models • Database – a collection of non-redundant data, which can be shared by different application systems • Geographic database – database linked to geographic data for a particular area and subject.
Attribute Data The “where” of GIS is determined by the spatial data The “what” is determined by the attribute data The attribute data is just as important as the spatial data
Databases Attribute data are stored in database tables
Databases • Advantages of a DBMS include • Reduced data redundancy • Various data access methods are possible (queries) • Data are stored independently of the applications that will use them • Access to data is controlled and data are centralized • Ease of updating and maintaining data
Creating a database • Consider the following… • Storage media • How will the database change over time? • What security is needed? • Should the database be distributed or centralized? • How should database creation be scheduled?
Codd’s Principles for Databases Only one value per cell All values in a column are about the same subject Each row is unique No significance to the sequence of columns No significance to the sequence of rows Keep your table simple!
Attribute Types • Qualitative • No measurement or magnitude • Non-numeric descriptions • No numeric meaning, even if shown as code numbers (e.g., 1 = category 1)
Attribute Types • Quantitative • Numeric values with mathematical meaning • Serve as measurements or magnitudes of the features they refer to • Example: city population
Types of Databases • Relational • Presents data organized in a series of two-dimensional tables, each containing records for one entity
Relational Database • Flexible approach to linkages between records comes closest to modeling the complexity of spatial relationships between objects • Links attributes contained in separate files with a key attribute • The key attribute is usually a non-redundant, unique identification number for each record • The most popular DBMS model for GIS
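The key-attribute linkage described above can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows are hypothetical; the point is only that two separate tables are linked through a shared key.

```python
import sqlite3

# In-memory database; tables and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One table per entity. landuse_code is the key attribute that links them.
cur.execute(
    "CREATE TABLE parcels ("
    "parcel_id INTEGER PRIMARY KEY, owner TEXT, landuse_code INTEGER)"
)
cur.execute(
    "CREATE TABLE landuse ("
    "landuse_code INTEGER PRIMARY KEY, description TEXT)"
)

cur.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(101, "Smith", 1), (102, "Jones", 2), (103, "Lee", 1)],
)
cur.executemany(
    "INSERT INTO landuse VALUES (?, ?)",
    [(1, "Residential"), (2, "Commercial")],
)

# Join the two tables on the key attribute.
rows = cur.execute(
    "SELECT p.parcel_id, p.owner, l.description "
    "FROM parcels AS p "
    "JOIN landuse AS l ON p.landuse_code = l.landuse_code "
    "ORDER BY p.parcel_id"
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Storing the land-use description once and referencing it by code is what keeps the attribute tables non-redundant.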
Data • Most data is input into a database by keycoding • Other data may be obtained through government sources • USGS • US Census • NOAA • State Agencies • Data may also be obtained from other projects
Methods of Spatial Data Entry • Manual “heads-up” digitizing • Scanners • Appropriate for encoding raster data since this is the output format for most scanners. • Problems may include • Scanning unwanted information • Optical distortion • The higher the resolution, the larger the volume of data produced
Methods of Spatial Data Entry • Electronic Data Transfers • Downloading data from the internet • Downloading data from a GPS unit • Consider when obtaining electronic data • What data is available • Cost • Media • Format
Sources of Electronic Data • United States Geological Survey (USGS) • Digital Line Graphs (DLG) • Digital Elevation Models (DEM) • Digital Orthophoto Quads (DOQ) • United States Census Bureau (USCB) • Topologically Integrated Geographic Encoding and Referencing (TIGER) system • First comprehensive GIS database at street level for the entire U.S. • National Oceanic and Atmospheric Administration (NOAA) • Satellite and radar images • Bathymetry maps
Other Sources of Spatial Data • Field Data • Global Positioning System (GPS) • Locating position from receiving a signal from orbiting satellites • Manual Input • Remote Sensing • Utilizing satellite images to develop a base view of area of interest
Spatial Databases Real world is infinitely complex Database size is limited Data model converts real world into elements that can be stored in a database
Toward Realism: Layers • A GIS breaks down reality into different layers (themes) • A layer can be composed of entities of the same type, such as the locational information for trees, manholes, buildings, etc. • Layers can be overlaid to show the spatial relationships between various entities • Layers can also represent different times
Spatial Databases • There are two primary models for spatial data in a GIS • Raster • a data structure or model based on grid cells • Vector • a data structure composed of nodes, vertices, and arcs or connected points
Raster Data Models • Individual cells are used as the building blocks for creating point, line, and polygon entities • Cell size is very important because it determines how entities are displayed (e.g., a greater number of smaller cells gives a more specific shape). • Each cell holds either an attribute value or a reference ID to a table of attributes
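A minimal sketch of the last point, where each cell stores a reference ID that points into a table of attributes. The grid values, land-cover names, and helper functions are invented for illustration.

```python
# 3 x 3 raster; each cell stores a reference ID, not the attribute itself.
grid = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]

# Attribute table keyed by the reference ID (hypothetical values).
attributes = {
    1: {"land_cover": "forest"},
    2: {"land_cover": "water"},
    3: {"land_cover": "urban"},
}


def describe(row, col):
    """Look up the land cover of one cell through its reference ID."""
    return attributes[grid[row][col]]["land_cover"]


def cells_with(land_cover):
    """List the (row, col) positions of all cells with a given land cover."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, cell_id in enumerate(row)
        if attributes[cell_id]["land_cover"] == land_cover
    ]


print(describe(1, 1))
print(cells_with("water"))
```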
Raster Data Model Raster data are ideal for continuous data such as air temperatures, water pH, etc. What happens when two categories occupy the same cell?
Raster Spatial Databases • Single objects displayed by shading individual cells • Linear features displayed by shading a sinuous series of connected cells • Polygon features displayed by shading a group of connected cells • Relief can be shown by assigning a certain value to each selected cell
Raster Data Models • Cells may be homogeneous (each cell contains a single feature) or heterogeneous (one cell contains several different features) • Heterogeneity may be resolved by • Simply looking for the presence or absence of features • Looking at the cell center to determine placement of index code • Dominant area analysis • Transition cells • Percentages
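Four of the rules above (presence/absence, cell center, dominant area, percentages) can be compared on a single heterogeneous cell. The sketch below assumes the cell has been sampled as a 3 x 3 block of finer observations; the block, the codes "F" (forest) and "W" (water), and the function names are all hypothetical.

```python
from collections import Counter

# Finer-resolution observations that all fall inside ONE raster cell.
cell = [
    ["F", "F", "F"],
    ["F", "W", "W"],
    ["F", "F", "W"],
]


def presence(block, feature):
    """Presence/absence rule: is the feature anywhere in the cell?"""
    return any(feature in row for row in block)


def center_value(block):
    """Cell-center rule: take whatever lies at the middle of the cell."""
    return block[len(block) // 2][len(block[0]) // 2]


def dominant(block):
    """Dominant-area rule: take the feature covering the most area."""
    counts = Counter(value for row in block for value in row)
    return counts.most_common(1)[0][0]


def percentages(block):
    """Percentage rule: keep the share of the cell covered by each feature."""
    counts = Counter(value for row in block for value in row)
    total = sum(counts.values())
    return {value: 100 * n / total for value, n in counts.items()}


print(presence(cell, "W"), center_value(cell), dominant(cell))
print(percentages(cell))
```

In this block the cell-center rule codes the cell as water while the dominant-area rule codes it as forest, which shows that the choice of rule changes the stored value.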
Spatial Databases • Advantages of Raster Format • Simple data structure • Compatible with remotely sensed or scanned data • Simple spatial analysis procedures
Spatial Databases • Disadvantages of Raster Format • Requires large storage space • Graphical output may be less pleasing (depending on resolution) • Projection transformations more difficult • Difficult to maintain topology
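The storage cost comes from simple arithmetic: the number of cells grows with the inverse square of the cell size. The sketch below assumes a hypothetical 1 km by 1 km study area; the function name and the chosen cell sizes are illustrative only.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover a rectangular area."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


# Halving the cell size quadruples the number of cells to store.
for size in (10, 5, 1):
    print(size, "m cells:", cell_count(1000, 1000, size))
```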
Vector Spatial Databases • Vector data models arose in the early 1960s in relation to the development of the hierarchical attribute data structure • The first generation were simply lines with arbitrary start and end points • Files would typically consist of a few long lines and many short lines • Often referred to as cartographic spaghetti
Vector Data Models • Uses two-dimensional Cartesian coordinates to store the shape of a spatial entity. • The point is the basic building block from which all other spatial entities are constructed. • Lines and areas are constructed by connecting a series of points (nodes and vertices)
Vector Spatial Databases • Advantages • Requires less storage space • Topology easily maintained • Graphical output usually more pleasing
Vector Spatial Databases • Disadvantages • More complex data structure • Not compatible with remotely sensed data • Spatial analysis operations more difficult • Selecting an appropriate number of points to represent a feature • Too few points would compromise shape or spatial properties (area, perimeter, etc.) • Too many points means possible data duplication and increased data storage costs
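The trade-off in point count can be made concrete by representing a circular feature of radius 1 with different numbers of points and measuring the resulting polygon. This is only a sketch: the circular feature and the function names are assumptions, and area is computed with the standard shoelace formula.

```python
import math


def ring(n, radius=1.0):
    """n evenly spaced points on a circle, in counter-clockwise order."""
    return [
        (radius * math.cos(2 * math.pi * i / n),
         radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def area(points):
    """Polygon area by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def perimeter(points):
    """Sum of the straight-line distances between consecutive points."""
    return sum(
        math.dist(p, q) for p, q in zip(points, points[1:] + points[:1])
    )


# True values for the circle are area = pi and perimeter = 2 * pi.
for n in (4, 8, 64):
    pts = ring(n)
    print(n, round(area(pts), 4), round(perimeter(pts), 4))
```

With 4 points the polygon is a square of area 2, well short of the circle's area of about 3.14; with 64 points the area is within 0.01 of the true value, at sixteen times the storage.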
Advancing Toward Topology • The arc/node model developed as a “hierarchy” for spatial data • Based on the principle that each type of structure consists of features built upon simpler features • Coordinates make up points • Connected points make lines • Connected lines make polygons • Allows the user to differentiate between points, lines, and polygons, but requires maintenance of links between features
Topologic Models • The new model allowed each line to be drawn only once • For example: • Without topology, if two polygons shared a side, that side had to be traced twice, once for each polygon • This allowed gaps or slivers to form between the two tracings (a topological error) • The new system avoided the error because the single shared arc records which polygon is on its left and which is on its right
Topological Terms • Nodes • Where a line begins, ends, or where two lines intersect • Vertices • Where a line bends • Arcs • Line segment between two nodes
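A minimal sketch of an arc/node structure, using two hypothetical unit squares, A and B, that share one side. The coordinates, the `Arc` class, the `OUTSIDE` label, and the helper functions are assumptions made for illustration, not the layout of any particular GIS file format.

```python
from dataclasses import dataclass

OUTSIDE = "outside"  # label for the area beyond all polygons (assumption)


@dataclass(frozen=True)
class Arc:
    arc_id: int
    point_ids: tuple  # first and last ids are nodes, the rest are vertices
    left: str         # polygon on the left when travelling along the arc
    right: str        # polygon on the right when travelling along the arc


# Points file: id -> (x, y)
points = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1), 5: (2, 0), 6: (2, 1)}

# Arcs file: the side shared by A and B (arc 1) is stored exactly once.
arcs = [
    Arc(1, (2, 3), left="A", right="B"),
    Arc(2, (3, 4, 1, 2), left="A", right=OUTSIDE),
    Arc(3, (2, 5, 6, 3), left="B", right=OUTSIDE),
]


def shared_arcs(poly_a, poly_b):
    """Arc ids that form the common boundary of two polygons."""
    return [a.arc_id for a in arcs if {a.left, a.right} == {poly_a, poly_b}]


def neighbours(poly):
    """Polygons that share at least one arc with the given polygon."""
    found = set()
    for a in arcs:
        if poly in (a.left, a.right):
            other = a.right if a.left == poly else a.left
            if other != OUTSIDE:
                found.add(other)
    return found


print(shared_arcs("A", "B"))
print(neighbours("A"))
```

Because the shared side exists only as arc 1, it cannot be digitized twice with slightly different coordinates, and adjacency can be read directly from the left and right labels.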
Topology Example • Points file: points 1 through 11, each stored with its x, y coordinates • Arcs file: arc 1: 1, 2, 3, 4, 5, 6, 7; arc 2: 1, 7, 8, 9, 10, 11 • File of arcs by polygon: A: 1, 2, Area, Attributes [Figure: polygon A with its numbered points and arcs]
Topology Example [Figure: two cases. Where a sliver appears between polygons, topology is not attained; where there is no sliver, topology is attained.]
Summary of Data Models [Figure: real-world windmills shown as a raster grid (0 = No Data, 1 = Windmill) and as vector points]
Summary of Data Models • Raster • Every location is given an object • Vector • Every object is given a location
Data Conversion • Data can be transformed from one of these data models to the other (rasterization: vector to raster; vectorization: raster to vector) • You always lose some information when going from one data format to the other
Rasterization (Vector Format to Raster Format) • Topological features are lost • Positional accuracy decreases Zygo, Lisa, Baylor University, Lecture Notes, 2002
Vectorization (Raster Format to Vector Format) • Features look “jagged” or “pixelated” in the vector representation • Topology is created Zygo, Lisa, Baylor University, Lecture Notes, 2002
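The information loss in a round trip can be sketched with point features. The example below rasterizes three hypothetical windmill locations onto a grid of 10-unit cells (1 = windmill, 0 = no data) and then vectorizes the grid back to points at the cell centers. The coordinates, grid size, and function names are assumptions, and row 0 is taken to be the row starting at y = 0.

```python
def rasterize(points, cell_size, n_rows, n_cols):
    """Mark each cell that contains at least one point with a 1."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = 1
    return grid


def vectorize(grid, cell_size):
    """Turn every cell holding a 1 into a point at the cell center."""
    return [
        ((c + 0.5) * cell_size, (r + 0.5) * cell_size)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value == 1
    ]


windmills = [(12.0, 7.0), (14.0, 8.5), (41.0, 33.0)]

grid = rasterize(windmills, cell_size=10, n_rows=5, n_cols=5)
recovered = vectorize(grid, cell_size=10)

print(len(windmills), "points in,", len(recovered), "points out")
print(recovered)
```

Two windmills fall in the same cell and come back as one point, and every recovered point has moved to a cell center, which illustrates the decrease in positional accuracy noted above.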