Data Structures Laurie Garo Geog. 2103 - Introduction to GIScience & Technologies Spring 2012
Outline • What is a Database? • What are Data Structures? • Data Formats: • Descriptive: Attribute • Graphic: Vector vs Raster • Structuring Data in Different Formats
What is a Database? • Data = information about a particular subject, topic, place, person, etc. • Database = organized storage of data • Similar to a filing system • For a GIS, a database contains both Spatial (Graphical) and Descriptive (Attribute) data; these are linked to one another • Each is structured differently, i.e., stored in a different form (structure)
What are Data Structures? • The organization (structure) of the data within the database in computer-readable (digital) form • Data Structures provide the information that the computer requires to reconstruct the spatial data model in digital form for storage, and in graphic form for display (screen or print)
Data Structures, cont’d • Spatial and attribute data are structured differently • Vector and raster graphic (spatial) formats are structured differently • Within one data format there are different ways to structure the data • This is a major reason why data exchange between different GIS (and other) software can be a problem
Attribute Data Structure • Tables of Rows (Records) and Columns (Fields) • Records list the individual areas or items being mapped (e.g., states, cities, soil polygons, roads) • Fields list the relevant descriptions (e.g., city name, area, population, median income)
Attribute Data Structure, cont’d • Attribute data are stored in various formats: • ASCII (text-only) format - plain characters that the computer stores as binary codes (series of 0’s and 1’s) - universally readable • dBASE - early database software; files are saved with the .dbf extension and accepted by ArcGIS • Excel or other spreadsheet software formats - import into ArcGIS
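As a minimal sketch of the record/field layout in text-only form, the snippet below reads a small comma-separated table with Python's standard `csv` module. The field names and values are hypothetical, chosen only for illustration.

```python
import csv
import io

# Hypothetical attribute table in plain-text form.
# The first line names the fields (columns); each later line is one record (row).
text = "id,name,population\n1,Alpha,50000\n2,Beta,20000\n"

# DictReader turns each record into a mapping of field name -> value (as text).
rows = list(csv.DictReader(io.StringIO(text)))

# Text files carry no data types, so numeric fields must be converted explicitly.
populations = [int(row["population"]) for row in rows]
```

The explicit `int(...)` conversion reflects a general trade-off of text formats: nearly any software can read them, but field types have to be re-declared on import.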
Spatial Data Formats: Vector vs Raster • Vector • Points, e.g., Building, School, Quarry • Lines, e.g., Roads, Streams, Airline Routes • Areas or Polygons, e.g., Soils, Landuse, Census Tracts • Raster • Grid Cells (pixels) where the dominant vector feature is encoded per cell and represented as a number or a color/gray shade • Accuracy dependent on grid cell size
Structuring Vector Format Data • Coordinate Geometry: series of points stored as x, y coordinates for points, lines and polygons • Spaghetti Files are the simplest structure • no connectivity; like a bowl of spaghetti (lines) that have no connection • No information on polygons (points forming polygons, neighboring polygons, polygon identifiers for attaching attributes), thus limited use for mapping
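A spaghetti file can be sketched as nothing more than independent coordinate lists. In the hypothetical example below, two adjacent squares each store the shared edge separately, and the only way to discover that they touch is to compare coordinates, because the structure itself records no connection.

```python
# Spaghetti model: each feature is only its own list of (x, y) pairs.
# There are no identifiers and no links between features (illustrative data).
spaghetti = [
    [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],  # outline of one square
    [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],  # outline of the neighboring square
]

def segments(line):
    """Return the line's segments, each as a direction-independent point pair."""
    return [tuple(sorted(pair)) for pair in zip(line, line[1:])]

def duplicated_segments(lines):
    """Find segments stored more than once by brute-force coordinate comparison."""
    seen, dupes = set(), set()
    for line in lines:
        for seg in segments(line):
            if seg in seen:
                dupes.add(seg)
            else:
                seen.add(seg)
    return dupes
```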
Point Dictionary • Point Dictionary with Polygon File is somewhat more complex • Lists connection of points to form polygons • Identifies different polygons and points that form them • Still limited: duplication of common boundaries, no information on line linkages and connected polygons
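One way to picture the point dictionary approach: coordinates are stored once under point numbers, and a separate polygon file lists which points form each polygon. The point numbers and polygon names below are made up for illustration.

```python
# Point dictionary: each coordinate pair is stored once under a point number.
points = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1), 5: (2, 0), 6: (2, 1)}

# Polygon file: each polygon is an ordered list of point numbers.
polygons = {"A": [1, 2, 3, 4], "B": [2, 5, 6, 3]}

def polygon_coords(polygon_id):
    """Look up the coordinates that form a polygon."""
    return [points[p] for p in polygons[polygon_id]]

def shared_points(a, b):
    """Points listed by both polygons; the common boundary appears in each list."""
    return sorted(set(polygons[a]) & set(polygons[b]))
```

Points 2 and 3 appear in both polygon lists, so the common boundary is still recorded twice, and nothing states explicitly that A and B are neighbors.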
Topological Data Structures • Topological Data Structures are the most complex, and most useful for mapping. • Topology is concerned with connectivity between entities (points, lines, polygons) • Topology provides a set of instructions which inform the computer where one point, line or polygon is with respect to its neighbors.
Topological Data Structures • With topology, a point is geographically referenced with respect to other points, lines and polygons • Every line is an ordered set of points with start and end “nodes” and the direction in which the points were entered • Data about points and lines are used to show connectivity to form polygons, and to show neighboring connections.
Topological Data Structures • Various Topological structures exist in GIS. All ensure that: • no node or line segment is duplicated, thus no sliver polygons • line segments and nodes can be referenced to more than one polygon • all polygons have unique identifiers for attribute data linkage • island and hole polygons can be adequately represented.
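The properties listed above can be illustrated with a simplified arc-node sketch. It is not the format of any particular GIS; the arc, node and polygon names are hypothetical. Each arc is stored once, with its start and end nodes and the polygons on its left and right.

```python
from collections import namedtuple

# One record per arc: direction runs from from_node to to_node, and the
# polygons on either side are named relative to that direction.
Arc = namedtuple("Arc", "arc_id from_node to_node left_poly right_poly")

nodes = {"n1": (1, 0), "n2": (1, 1)}

arcs = [
    Arc(1, "n1", "n2", "A", "B"),        # shared boundary, stored only once
    Arc(2, "n2", "n1", "A", "outside"),  # rest of polygon A's outline
    Arc(3, "n1", "n2", "B", "outside"),  # rest of polygon B's outline
]

def boundary(poly):
    """Arc ids that bound a polygon."""
    return [a.arc_id for a in arcs if poly in (a.left_poly, a.right_poly)]

def neighbors(poly):
    """Polygons on the other side of each of the polygon's arcs."""
    result = set()
    for a in arcs:
        if a.left_poly == poly:
            result.add(a.right_poly)
        elif a.right_poly == poly:
            result.add(a.left_poly)
    return result
```

Adjacency is read directly from the arc records, with no coordinate comparison, which is what makes the structure useful for neighbor and connectivity queries.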
Relation: Spatial & Attribute • For Vector data: • Computer matches attributes with features by unique ID • Enables symbolization of features • Enables queries of features by their attributes • In ArcGIS, features can be isolated and converted to a new shapefile (containing graphic data layer and associated or linked attribute table)
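The ID-based link can be sketched with two lookups that share a key. The feature IDs, tract names and population values below are hypothetical; the selection function plays the role of an attribute query that isolates features.

```python
# Graphic data: feature ID -> geometry (polygon outlines, illustrative only).
features = {
    101: [(0, 0), (1, 0), (1, 1), (0, 1)],
    102: [(1, 0), (2, 0), (2, 1), (1, 1)],
    103: [(0, 1), (2, 1), (2, 2), (0, 2)],
}

# Attribute table: the same IDs -> descriptive fields.
attributes = {
    101: {"name": "Tract 1", "population": 1200},
    102: {"name": "Tract 2", "population": 800},
    103: {"name": "Tract 3", "population": 1500},
}

def select_features(predicate):
    """Return the geometries whose attribute record satisfies the query."""
    return {
        fid: features[fid]
        for fid, row in attributes.items()
        if fid in features and predicate(row)
    }

# Query by attribute, get features back.
selected = select_features(lambda row: row["population"] > 1000)
```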
Raster Data Structures • Raster structures are very simple • They consist of a series of grid cells with row and column identification • Each grid cell contains a data value (“Feature Coding”) that represents the feature which is located at that place in geographic space • A file header indicates the grid cell structure, followed by a list of codes per grid cell
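A header-plus-codes file might look like the sketch below. The header keywords follow the style of the Esri ASCII grid format, but the values, the feature codes and the `parse_grid` helper are assumptions made for illustration.

```python
# Hypothetical raster file: five header lines, then one row of codes per grid row.
text = """ncols 4
nrows 3
xllcorner 0
yllcorner 0
cellsize 10
1 1 2 2
1 3 3 2
3 3 3 2
"""

def parse_grid(text, header_lines=5):
    """Split the file into a header dictionary and a list of rows of codes."""
    lines = text.strip().splitlines()
    header = {}
    for line in lines[:header_lines]:
        key, value = line.split()
        header[key] = float(value)
    grid = [[int(code) for code in line.split()] for line in lines[header_lines:]]
    return header, grid

header, grid = parse_grid(text)

# A cell is addressed by row and column; its value is the feature code.
code_at_row1_col2 = grid[1][2]
```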
Raster Data Structure • Discrete: form distinct regions on a map, e.g., soil polygons (vector and raster) • Continuous: vary smoothly over a surface or range of values, e.g. elevations (raster only) • Layers by feature type • Each grid cell contains one attribute only, in contrast to many attributes per feature in vector format. • http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=What_is_raster_data%3F
Raster Data Structures • Raster formats simplify or generalize vector data, making it look pixelated • However, grid cell size or resolution influences detail and accuracy of representing real world features • http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=Cell_size_of_raster_data
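The effect of cell size can be shown with a rough sketch. It assumes a hypothetical 3 x 3 square feature inside an 8 x 8 extent and a simplified rule in which a cell is coded as the feature only when the cell's center falls inside it (real software may use a dominant-area rule instead).

```python
EXTENT = 8        # side length of the square study area (hypothetical units)
FEATURE_SIDE = 3  # the feature covers 0 <= x < 3 and 0 <= y < 3, true area 9

def rasterized_area(cell_size):
    """Area of the feature as represented by a grid with the given cell size."""
    n = EXTENT // cell_size
    count = 0
    for row in range(n):
        for col in range(n):
            # Coordinates of the cell center.
            cx = (col + 0.5) * cell_size
            cy = (row + 0.5) * cell_size
            if cx < FEATURE_SIDE and cy < FEATURE_SIDE:
                count += 1
    return count * cell_size * cell_size

areas = {size: rasterized_area(size) for size in (4, 2, 1)}
```

With these numbers the represented area is 16 for a cell size of 4, 4 for a cell size of 2, and 9 (the true area) for a cell size of 1, so coarse cells can overstate or understate a feature depending on where cell centers happen to fall.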
Quadtree Structure • Quadtree data structures provide a way to break grid cells into smaller units for more accurate representation of features • http://en.wikipedia.org/wiki/Quadtree • Raster structures are also used to construct Digital Terrain Models (3-D surfaces)
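A minimal region-quadtree sketch, assuming a square grid whose side is a power of two: a block is split into four quadrants only where it contains mixed values, so detail is kept where features meet and uniform areas collapse to a single leaf. The grid values are made up.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a single value for a uniform block, else a list of four
    subtrees in the order NW, NE, SW, SE."""
    if size is None:
        size = len(grid)
    values = {
        grid[r][c]
        for r in range(row, row + size)
        for c in range(col, col + size)
    }
    if len(values) == 1:
        return values.pop()
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]

def count_leaves(node):
    """Number of stored blocks in the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1

grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
tree = build_quadtree(grid)
```

Here the 16 cells reduce to 7 stored blocks, and only the mixed south-east quadrant is subdivided down to individual cells.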
Raster Data • For Raster data: A matrix (rows, columns) contains the code for the dominant feature within each grid cell. • Simple relation of grid cell with data value (code) and selected color or shade • Useful also for overlay analysis, map algebra, boolean overlay, distance, density functions, surface analysis, statistical analysis (ref. textbook)
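Because every layer is a matrix on the same grid, a boolean overlay reduces to a cell-by-cell comparison. The sketch below uses two hypothetical layers (land-use codes and a 0/1 slope-suitability mask) and an illustrative `local_op` helper.

```python
# Two raster layers on the same grid (values are illustrative).
landuse = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
slope_ok = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]

def local_op(a, b, fn):
    """Apply fn to each pair of cells that occupy the same row and column."""
    return [
        [fn(x, y) for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]

# Boolean overlay: 1 where land use is code 2 AND the slope mask is 1.
suitable = local_op(landuse, slope_ok, lambda lu, s: int(lu == 2 and s == 1))
```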