Introduction to GIS and Data. Francisco Olivera, Ph.D., P.E. Department of Civil Engineering Texas A&M University. Overview. GIS: Geographic Information Systems Geographic Information Systems: Database management systems in which the databases include geographic information.
Introduction to GIS and Data Francisco Olivera, Ph.D., P.E. Department of Civil Engineering Texas A&M University
Overview • GIS: Geographic Information Systems • Geographic Information Systems: Database management systems in which the databases include geographic information. • A key characteristic of GIS is the explicit linkage between the geographic features represented on a map and the attribute data that describe them.
Early GIS • The term GIS was first used by Roger Tomlinson in the 1960s during his work with the Canada Land Inventory. A GIS was developed to analyze the data collected and to support the development of land management plans for rural areas. • Work accomplished at the Harvard Laboratory for Computer Graphics and Spatial Analysis in the 1970s and early 1980s had a major influence on the development of GIS. • In 1969, the Environmental Systems Research Institute (ESRI) was founded by Jack Dangermond, a Harvard Lab graduate.
ESRI Software History • Toolbox GIS provides a command line interface, while desktop GIS provides a point-and-click graphical user interface (GUI). • ArcInfo up to 7.x was a toolbox GIS used for spatial data development and analysis. • ArcView 1.x was a desktop GIS used for displaying and printing data only. ArcView 2.x and 3.x, in contrast, had some limited data development, analysis and programming capabilities (compared to ArcInfo) without giving up their desktop character.
ESRI Software History • The ESRI software ArcInfo 8.x and ArcView 8.x are desktop GIS with strong data development, analysis and display capabilities. • Both ArcInfo 8.x and ArcView 8.x consist of three components: ArcMap, ArcCatalog and ArcToolbox, each of which performs specific functions. • The differences between ArcInfo 8.x and ArcView 8.x have to do with the number of commands available, but the interfaces are identical. • ArcInfo 8.x includes ArcInfo Workstation, which is identical to the toolbox GIS available in previous versions of ArcInfo.
Programming Languages • ArcInfo up to version 7.x and the current ArcInfo Workstation use Arc Macro Language (AML) as their programming language. • ArcView 3.x uses Avenue, an object-oriented programming language developed specifically for ArcView. • ArcInfo and ArcView 8.x use Visual Basic for Applications (VBA), a standard programming language in the Windows environment.
Transition • The transition from ArcInfo 7.x and ArcView 3.x to ArcInfo 8.x and ArcView 8.x is slower than observed for other software packages. • Lack of backward compatibility keeps users from running Avenue applications with ArcInfo 8.x and ArcView 8.x. • Lack of GIS applications in VBA for ArcInfo 8.x and ArcView 8.x also keeps users from switching to the new software.
Introduction to ArcGIS • ArcGIS is a software program used to create, display and analyze geospatial data. • It is developed by Environmental Systems Research Institute (ESRI) of Redlands, California.
Variants of ArcGIS • ArcGIS comes in three different versions based on the capabilities provided by the software: ArcView, ArcEditor and ArcInfo. • ArcView provides data visualization, query, analysis and integration capabilities along with the ability to create and edit simple geographic features. • ArcEditor includes all the functionalities of ArcView and extends these to a multi-user environment. • ArcInfo includes all the functionalities of ArcEditor and adds advanced geoprocessing capabilities.
Components of ArcGIS • ArcCatalog is used for browsing for maps and spatial data, managing spatial data, and viewing and creating metadata. • ArcMap is used for visualizing spatial data, performing spatial analysis and creating maps to show the results. • ArcToolbox is an interface for accessing the data conversion and analysis functions that come with ArcGIS.
Definitions • Digital Spatial Datasets: Synthesis – in electronic format – of geographic (map) and tabular (table) information. • Data models: Formats in which geographic data is stored and managed.
Data Models • Vector Data Models (Features) • Points • Lines • Polygons • Raster Data Models (Surfaces) • TIN Data Models (Surfaces)
Features • Geographic objects that have different shapes are represented as features
Features • A point is a pair of x, y coordinates. • One-to-one relation between features in the map and records in the table.
Features • Lines are sets of coordinates that define a shape. • One-to-one relation between features in the map and records in the table.
Features • Polygons are sets of coordinates defining boundaries that enclose areas. • One-to-one relation between features in the map and records in the table.
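The one-to-one relation between map features and table records can be illustrated with a small sketch. The `Feature` class, the `fid` key and the sample names below are hypothetical and only serve the illustration; they are not part of any ArcGIS API.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    fid: int      # identifier shared with the attribute table (hypothetical)
    kind: str     # "point", "line" or "polygon"
    coords: list  # list of (x, y) tuples


# Map side: one entry per feature.
features = [
    Feature(1, "point", [(2.0, 3.0)]),
    Feature(2, "line", [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]),
    Feature(3, "polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]),
]

# Table side: exactly one record per feature, keyed by the same identifier.
attributes = {
    1: {"name": "Well A"},
    2: {"name": "Creek B"},
    3: {"name": "Parcel C"},
}


def record_for(feature):
    """Look up the single table record that describes a feature."""
    return attributes[feature.fid]


# The relation is one-to-one when both sides hold the same set of identifiers.
is_one_to_one = {f.fid for f in features} == set(attributes)
```

Selecting a feature on the map then amounts to reading `record_for(feature)`, and selecting a record in the table amounts to finding the feature with the same `fid`.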
Vector Data Implementations • ArcGIS uses three different implementations of the vector data: • Coverages • Shapefiles • Geodatabases • These three different types of storage have to do with the type of data structure chosen to store the data. • Coverages and shapefiles are file-based models, whereas geodatabase models are database management system (DBMS) feature models.
Data Structures of Features • Topologic data structures: • Store (1) the geometry of the features and (2) the spatial relationship between connecting or adjacent features (i.e., topology) in tabular format. • Points do not coincide. • Lines are simple. • Polygons are simple and space-filling (i.e., no overlaps or empty spaces). • Shared polygon boundaries are stored only once. • Coverages have topologic data structures. • Cartographic data structures: • Store the geometry of the features. • Points can coincide. • Lines can be complex. • Polygons can be complex and not necessarily space-filling. • Shared polygon boundaries are stored as part of the definition of each of the adjacent polygons. • Shapefiles have cartographic data structures.
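The difference in how shared boundaries are stored can be sketched with two adjacent unit squares. The dictionaries below are a simplified illustration, not the actual coverage or shapefile file layout; all names are hypothetical.

```python
# Two adjacent unit squares, A and B, sharing the edge from (1, 0) to (1, 1).

# Cartographic structure (shapefile-like): each polygon carries its full ring,
# so the shared edge appears in both definitions.
cartographic = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
}

# Topologic structure (coverage-like): arcs are stored once and each polygon
# only lists the arcs that form its boundary.
arcs = {
    1: [(1, 0), (1, 1)],                  # shared boundary
    2: [(1, 1), (0, 1), (0, 0), (1, 0)],  # remainder of A
    3: [(1, 0), (2, 0), (2, 1), (1, 1)],  # remainder of B
}
polygon_arcs = {"A": [1, 2], "B": [1, 3]}


def segments(path):
    """Return the undirected segments of a coordinate sequence."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


shared_edge = frozenset({(1, 0), (1, 1)})

# Count how many stored geometries contain the shared edge.
cartographic_copies = sum(shared_edge in segments(r) for r in cartographic.values())
topologic_copies = sum(shared_edge in segments(a) for a in arcs.values())
```

In this sketch the shared edge is stored twice in the cartographic structure and once in the topologic one, which is why editing a shared boundary in a cartographic dataset can leave the two copies out of step.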
Data Structures of Features • A line is an open sequence of points in which the first and last points are called nodes, and the remaining intermediate points are called vertices. [Figure labels: Nodes, Vertices]
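A minimal sketch of the node and vertex definition, assuming a line is given as an ordered list of (x, y) tuples (the function name `split_line` is hypothetical):

```python
def split_line(points):
    """Split an open line into its nodes (end points) and vertices (the rest)."""
    if len(points) < 2:
        raise ValueError("a line needs at least two points")
    nodes = [points[0], points[-1]]
    vertices = points[1:-1]
    return nodes, vertices


nodes, vertices = split_line([(0, 0), (1, 2), (3, 2), (4, 0)])
```

A two-point line has two nodes and no vertices.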
Data Structures of Features • Complex lines • Simple lines
Data Structures of Features • Complex polygons • Simple polygons
Data Structures of Features • Not space-filling polygons • Space-filling polygons
Coverage Topology • The three major topological relationships that coverages maintain are connectivity, area definition and contiguity. Coverages explicitly store spatial relationships in special files. • Connectivity • Storing connectivity by recording the nodes that mark the end points of arcs (lines) is useful for modeling and tracing flows in linear arcs. • Arcs that share a node are connected. This is called Arc-Node topology. • Area Definition • Coverages define areas by keeping a list of connected arcs that form the boundaries of each polygon. This is called Polygon-Arc topology. • Contiguity • Coverages store contiguity by keeping a list of the polygons on the left and right side of each arc. [Figure panels: Connectivity, Area Definition, Contiguity]
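The three relationships can be sketched with a single arc table for two adjacent polygons. This is an illustration under stated assumptions: the field names, the node labels and the `OUTSIDE` marker for the area beyond the dataset are hypothetical and do not reproduce the actual coverage file format.

```python
OUTSIDE = "outside"  # hypothetical marker for the area beyond all polygons

# Each arc records its end nodes (connectivity) and the polygons on its
# left and right sides (contiguity).
arcs = {
    1: {"from": "n1", "to": "n2", "left": "A", "right": "B"},
    2: {"from": "n2", "to": "n1", "left": "A", "right": OUTSIDE},
    3: {"from": "n1", "to": "n2", "left": "B", "right": OUTSIDE},
}


def connected_arcs(arc_id):
    """Arc-node topology: arcs that share at least one node with arc_id."""
    nodes = {arcs[arc_id]["from"], arcs[arc_id]["to"]}
    return sorted(
        other for other, rec in arcs.items()
        if other != arc_id and nodes & {rec["from"], rec["to"]}
    )


def arcs_of_polygon(poly):
    """Polygon-arc topology: arcs that bound the polygon."""
    return sorted(a for a, rec in arcs.items() if poly in (rec["left"], rec["right"]))


def neighbors(poly):
    """Contiguity: polygons on the other side of the polygon's arcs."""
    result = set()
    for rec in arcs.values():
        if rec["left"] == poly:
            result.add(rec["right"])
        elif rec["right"] == poly:
            result.add(rec["left"])
    result.discard(OUTSIDE)
    return result
```

Here arc 1 is the shared boundary, so `neighbors("A")` is found from the table alone, without comparing any coordinates.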
Shapefile Topology • Shapefiles are simpler than coverages because they do not store full topological associations among features. • Each shapefile stores features of the same type. • Shapefiles have two types of point features: points and multipoints. • Line shapes can be simple continuous lines such as a fault line in a map. They can also be polylines that branch such as a river. • Polygon shapes can be simple areas such as a single island. They can also be multipart areas such as several islands that constitute a single state.
Geodatabase Topology • Each feature in a geodatabase contains its own shape (geometry) and can exist on its own, as opposed to the coverage data model that models the polygon as a collection of arcs and label points. • Topology enables GIS software to answer questions about adjacency, connectivity, proximity and coincidence. Until ArcGIS 8.3, topology was a feature only of the ArcInfo coverage data model. • Connectivity: Are all my road lines connected? • Adjacency: Are there gaps between my parcel polygons? • Coincidence: Are the coastlines and country boundaries coincident? • Proximity: Which road crosses my road line? • Topology is implemented as a set of integrity rules that define the behavior of spatially related geographic features and feature classes. • Topology rules can be defined for the features within a feature class or for the features between two or more feature classes. • E.g., polygons must not overlap, lines must not have dangles, lines must not intersect.
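As an illustration of an integrity rule, the sketch below checks a simplified version of "lines must not have dangles". It assumes a dangle is a line end point that no other line end point touches; this is a simplification for illustration and not the geodatabase implementation. The function name and sample data are hypothetical.

```python
from collections import Counter


def find_dangles(lines):
    """Return the end points that are used by only one line end."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1    # start point
        counts[line[-1]] += 1   # end point
    return sorted(pt for pt, n in counts.items() if n == 1)


roads = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(1, 1), (0, 0)],
    [(1, 0), (2, 0)],  # ends at (2, 0), where no other road arrives
]

dangles = find_dangles(roads)
```

A rule-based topology would report each returned point as an error for the user to fix or mark as an exception.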
Data Structures of Features Courtesy: ESRI
Data Structures of Surfaces • Grid datasets: • Cellular-based data structure composed of square cells of equal size arranged in rows and columns. • Grid definition requires: (1) the coordinates of the upper-left corner, (2) the cell size, (3) the number of rows, (4) the number of columns, and (5) the value at each cell. • Cells that do not store any value are called NODATA cells. [Figure labels: (x, y), Cell size, Number of rows, Number of columns]
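The five items of the grid definition are enough to find the cell that contains any location. The sketch below assumes rows are counted downward from the upper-left corner and uses a hypothetical `NODATA` sentinel value; the class and method names are illustrative only.

```python
import math
from dataclasses import dataclass

NODATA = -9999  # hypothetical sentinel for cells without a value


@dataclass
class Grid:
    x_ul: float       # x coordinate of the upper-left corner
    y_ul: float       # y coordinate of the upper-left corner
    cell_size: float
    n_rows: int
    n_cols: int
    values: list      # n_rows lists of n_cols values

    def cell_at(self, x, y):
        """Return (row, col) of the cell containing (x, y), or None if outside."""
        col = math.floor((x - self.x_ul) / self.cell_size)
        row = math.floor((self.y_ul - y) / self.cell_size)
        if 0 <= row < self.n_rows and 0 <= col < self.n_cols:
            return row, col
        return None

    def value_at(self, x, y):
        """Return the cell value at (x, y), or None for NODATA or outside."""
        cell = self.cell_at(x, y)
        if cell is None:
            return None
        value = self.values[cell[0]][cell[1]]
        return None if value == NODATA else value


grid = Grid(100.0, 200.0, 10.0, 2, 3, [[1, 2, 3], [4, NODATA, 6]])
```

Because no coordinates are stored per cell, the location of every value follows from the corner, the cell size and the row and column indices.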
Surfaces • Grid datasets
Surfaces • TIN Datasets
Surfaces • Image Datasets
Data Structures of Surfaces • Triangular Irregular Network (TIN) Datasets: • Dataset constructed by connecting points, for which the TIN parameter is known, to form triangles. • Triangle sides are constructed by connecting adjacent points so that the minimum angle of each triangle is maximized. • Triangle sides cannot cross breaklines. • The TIN format stores data efficiently because its resolution adjusts to the spatial variability of the parameter.
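The "maximize the minimum angle" criterion can be illustrated locally on four points: a convex quadrilateral can be split into two triangles along either diagonal, and the criterion prefers the split whose smallest angle is larger. This is only a local sketch of the idea, not the triangulation algorithm used by ArcGIS; all function names are hypothetical.

```python
import math


def triangle_angles(a, b, c):
    """Return the three interior angles of triangle abc, in degrees."""
    def angle_at(p, q, r):
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        norm = math.hypot(*v1) * math.hypot(*v2)
        # Clamp to [-1, 1] to guard against rounding error.
        return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return [angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)]


def min_angle(triangles):
    """Smallest interior angle found in a list of triangles."""
    return min(min(triangle_angles(*t)) for t in triangles)


def best_split(p0, p1, p2, p3):
    """Split the convex quadrilateral p0, p1, p2, p3 (given in order) along
    the diagonal that yields the larger minimum angle."""
    along_p0_p2 = [(p0, p1, p2), (p0, p2, p3)]
    along_p1_p3 = [(p0, p1, p3), (p1, p2, p3)]
    return max(along_p0_p2, along_p1_p3, key=min_angle)


# A long, thin rhombus: the short diagonal avoids sliver triangles.
p0, p1, p2, p3 = (0, 0), (5, -1), (10, 0), (5, 1)
chosen = best_split(p0, p1, p2, p3)
```

For this rhombus the split along the short diagonal (p1 to p3) is chosen, because the long diagonal would produce two sliver triangles with much smaller angles.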
Data Structures of Surfaces • Triangular Irregular Network (TIN) Datasets
Data Structures of Surfaces • Image datasets: • ARC Digitized Raster Graphics (ADRG) • Windows bitmap images (BMP) [.bmp] • Multiband (BSQ, BIL and BIP) and single band images [.bsq, .bil and .bip] • ERDAS [.lan and .gis] • ESRI Grid datasets • IMAGINE [.img] • IMPELL Bitmaps [.rlc] • Image catalogs • JPEG [.jpg] • MrSID [.sid] • National Imagery Transmission Format (NITF) • Sun rasterfiles [.rs, .ras and .sun] • Tag Image File Format (TIFF) [.tiff, .tif and .tff] • TIFF/LZW
Storing Datasets • Features: • Coverages are stored partially in their own folder and partially in the common INFO folder. • Shapefiles are stored in at least three files (with extensions .shp, .shx and .dbf) and up to seven files (adding the extensions .sbx, .sbn, .ain and .aih). • Geodatabase • It is an open storage structure for storing and managing GIS-related data (spatial geometry, tabular data and imagery) in a database management system (DBMS). • The geodatabase follows the fundamental relational data model in which each object and its attributes are stored as a row in a table. • Here “objects” refers to features (i.e., real-world entities). • Feature Class: A collection of similar features (objects), such as buildings or rivers, stored in a DBMS table. • Feature Dataset: A collection of similar feature classes that share the same spatial reference.
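The relational idea behind a feature class (one table, one row per feature, geometry and attributes side by side) can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the table layout, the column names, the sample rows and the choice to store the shape as well-known text are assumptions, not the actual geodatabase schema.

```python
import sqlite3

# An in-memory database stands in for the DBMS.
conn = sqlite3.connect(":memory:")

# Hypothetical "cities" feature class: each row is one feature.
conn.execute(
    "CREATE TABLE cities ("
    "object_id INTEGER PRIMARY KEY, "
    "name TEXT, "
    "population INTEGER, "
    "shape TEXT)"  # geometry kept as text here purely for readability
)
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?, ?)",
    [
        (1, "City A", 120000, "POINT (10 20)"),
        (2, "City B", 30000, "POINT (15 25)"),
        (3, "City C", 75000, "POINT (12 22)"),
    ],
)

# An attribute query returns features, because shape and attributes share a row.
large_cities = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM cities WHERE population > ? ORDER BY name", (50000,)
    )
]
```

Storing features as rows is what lets standard DBMS tools (queries, indexes, multi-user access) operate on spatial data.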
Storing Geodatabases • [Diagram: a workspace contains a geodatabase; the geodatabase contains feature datasets and feature classes, e.g., a Cities feature class and a States feature class.]
Storing Datasets • Surfaces: • Grid and TIN datasets are stored partially in their own folder and partially in the common INFO folder. • Image datasets are stored in different ways depending on the image format. • Structure of a folder containing different types of digital spatial data: Coverage, TIN, Info, Grid, Image.tif, Shapefile.shp, Shapefile.shx, Shapefile.dbf.
Managing Datasets • Renaming • Always use ArcGIS utilities to rename coverages, shapefiles, feature classes, grids and TINs because some information is internally stored with the dataset name. • Images can be renamed using the operating system utilities. • Copying and Moving • Always use ArcGIS utilities to copy and move coverages, grids and TINs to make sure the information stored in the INFO folder is included. • Shapefiles, geodatabases and images can be moved or copied using the operating system utilities, making sure all the files are included. • ArcGIS utilities should be used to copy and move feature classes or feature datasets from geodatabases.
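When a shapefile is moved or copied with operating system utilities, a quick check that no component file was left behind can help. The sketch below works on a plain list of file names; the function name and the sample file names are hypothetical, and the extension lists are those given earlier for shapefiles.

```python
from pathlib import PurePath

REQUIRED = {".shp", ".shx", ".dbf"}
OPTIONAL = {".sbx", ".sbn", ".ain", ".aih"}


def shapefile_parts(filenames, name):
    """Report which component files of shapefile `name` are present or missing."""
    extensions = {
        PurePath(f).suffix.lower()
        for f in filenames
        if PurePath(f).stem == name
    }
    return {
        "missing": sorted(REQUIRED - extensions),
        "optional_present": sorted(OPTIONAL & extensions),
    }


folder_listing = ["roads.shp", "roads.shx", "roads.sbn", "rivers.shp"]
report = shapefile_parts(folder_listing, "roads")
```

In this example the report shows that `roads.dbf` is missing, so the copy of the `roads` shapefile is incomplete.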
Sharing Datasets • Interchange files • Coverages, grids and TINs are shared as interchange files. • An interchange file is a single file – with extension E00 – that includes all information stored in the dataset folder and its share of information contained in the INFO folder. • If a limit is set on the size of the interchange file, then several smaller files (i.e., E00, E01, E02, …) are generated rather than one single file. This option was common when storage media had limited capacity. • An interchange file is obtained by exporting a coverage, grid or TIN. In turn, a coverage, grid or TIN is obtained by importing an interchange file. • Compressed (“zipped”) files • To make sure that all files are included, shapefiles and images can be shared as compressed files.
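Bundling a shapefile into one compressed file can be sketched with Python's standard `zipfile` module. The function name is hypothetical, and the sketch assumes every component file shares the shapefile's base name and sits in a single folder.

```python
import zipfile
from pathlib import Path


def zip_shapefile(folder, name, out_zip):
    """Bundle every file named `name.<ext>` in `folder` into one zip archive.

    Returns the list of file names that were added.
    """
    folder = Path(folder)
    parts = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.stem == name and p.suffix.lower() != ".zip"
    )
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in parts:
            # Store only the file name so the archive unpacks into any folder.
            archive.write(part, arcname=part.name)
    return [p.name for p in parts]
```

For example, `zip_shapefile("data", "roads", "roads_bundle.zip")` would collect `roads.shp`, `roads.shx`, `roads.dbf` and any other `roads.*` component found in the hypothetical `data` folder.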
Sharing Datasets • A geodatabase as a whole can be shared by using operating system utilities. • Feature classes stored in a geodatabase can be exported either as feature classes to a different geodatabase or as shapefiles to a workspace outside the geodatabase. • All the feature classes stored in a feature dataset can be exported at once as shapefiles by exporting the feature dataset.