GIS UPDATE? Review Lab 1: Scale. Today’s Material: Data Models: Vector Data, Raster Data, TIN Data. Today’s Lab: Data Models. http://www.colorado.edu/geography/gcraft/notes/datacon/datacon_f.html
http://www.colorado.edu/geography/gcraft/notes/datacon/datacon_f.html
VECTOR DATA Feature classes are homogeneous collections of common features, each having the same spatial representation, such as points, lines, or polygons, and a common set of attribute columns, for example, a line feature class for representing road centerlines. The four most commonly used feature classes in the geodatabase are points, lines, polygons, and annotation (the geodatabase name for map text). In the illustration below, these are used to represent four datasets for the same area: (1) manhole cover locations as points, (2) sewer lines, (3) parcel polygons, and (4) street name annotation.
Generally, feature classes are thematic collections of points, lines, or polygons, but there are seven feature class types, six of which are described here: • Points: Features that are too small to represent as lines or polygons as well as point locations (such as GPS observations). • Lines: Represent the shape and location of geographic objects, such as street centerlines and streams, too narrow to depict as areas. Lines are also used to represent features that have length but no area such as contour lines and boundaries. • Polygons: A set of many-sided area features that represents the shape and location of homogeneous feature types such as states, counties, parcels, soil types, and land-use zones. • Annotation: Map text including properties for how the text is rendered. For example, in addition to the text string of each annotation, other properties are included such as the shape points for placing the text, its font and point size, and other display properties. Annotation can also be feature linked and can contain subclasses.
• Dimensions: A special kind of annotation that shows specific lengths or distances, for example, to indicate the length of a side of a building, a land parcel, or the distance between two features. Dimensions are heavily used in design, engineering, and facilities applications for GIS. • Multipoints: Features that are composed of more than one point. Multipoints are often used to manage arrays of very large point collections, such as lidar point clusters, which can contain billions of points. Using a single row for each point in such a collection is not feasible. Clustering the points into multipoint rows enables the geodatabase to handle massive point sets.
Single-part and multipart lines and polygons Line and polygon feature classes in the geodatabase can be composed of single parts or multiple parts. For example, a state can contain multiple parts (Hawaii's islands) but is considered to be a single state feature.
Vertices, segments, elevation, and measurements Feature geometry is primarily composed of coordinate vertices. Segments in lines and polygon features span vertices. Segments can be straight edges or parametrically defined curves. Vertices in features can also include z-values to represent elevation measures and m-values to represent measurements along line features.
Segment types in line and polygon features Lines and polygons are defined by two key elements: (1) an ordered list of vertices that define the shape of the line or polygon and (2) the types of line segments used between each pair of vertices. Each line and polygon can be thought of as an ordered set of vertices that can be connected to form the geometric shape. Another way to express each line and polygon is as an ordered series of connected segments where each segment has a type: straight line, circular arc, elliptical arc, or Bézier curve. The default segment type is a straight line between two vertices. However, when you need to define curves or parametric shapes, you have three additional segment types that can be defined: circular arcs, elliptical arcs, and Bézier curves. These shapes are often used for representing built environments such as parcel boundaries and roadways.
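For the default segment type (a straight line between two vertices), the ordered vertex list is all that is needed to work with a line. The sketch below is a minimal illustration, not geodatabase code: it walks an ordered list of x,y vertices and derives one m-value per vertex, assuming the measure is simply the distance travelled along the line. The function name `cumulative_measures` and the sample coordinates are hypothetical.

```python
import math

def cumulative_measures(vertices):
    """Return one m-value per vertex: distance along straight segments.

    Assumes every segment is a straight line between consecutive vertices
    (the default segment type); curved segments are not handled.
    """
    if not vertices:
        return []
    measures = [0.0]
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        measures.append(measures[-1] + math.hypot(x2 - x1, y2 - y1))
    return measures

# Hypothetical road centerline with three vertices (two segments).
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(cumulative_measures(road))
```

The last m-value is the total length of the line. Real m-values can hold any measurement system (for example, mileposts), so distance is only one possible choice.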
Vertical measurements using z-values Feature coordinates can include x,y and x,y,z vertices. Z-values are most commonly used to represent elevations, but they can represent other measurements such as annual rainfall or air quality.
X,Y tolerance When you create a new feature class, you will be asked to set the x,y tolerance. The x,y tolerance is used to set the minimum distance between coordinates in clustering operations, such as topology validation, buffer generation, and polygon overlay, as well as in some editing operations. Feature processing operations are influenced by the x,y tolerance, which determines the minimum distance separating all feature coordinates (nodes and vertices) during those operations. By definition, it also defines the distance a coordinate can move in x or y (or both) during clustering operations. The x,y tolerance is an extremely small distance (the default is 0.001 meters in on-the-ground units). It is used to resolve inexact intersection locations of coordinates during clustering operations. When processing feature classes using geometry operations, coordinates whose x distance and y distance are within the x,y tolerance of each other are considered to be coincident (in other words, share the same x,y location). Thus, the clustered coordinates are moved to a common location.
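The coincidence rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the clustering algorithm ArcGIS actually uses: the function names are hypothetical, and moving both coordinates to their midpoint is an assumption made only for the example.

```python
XY_TOLERANCE = 0.001  # meters; the default value quoted in the text

def are_coincident(a, b, tol=XY_TOLERANCE):
    """True when both the x distance and the y distance are within tol."""
    return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol

def snap_to_common(a, b, tol=XY_TOLERANCE):
    """Move two coincident coordinates to a shared location.

    Assumption: the shared location is the midpoint. Coordinates that are
    farther apart than the tolerance are returned unchanged.
    """
    if not are_coincident(a, b, tol):
        return a, b
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return mid, mid

p, q = snap_to_common((100.0, 200.0), (100.0005, 200.0004))
print(p == q)
```

Note that the test compares the x and y distances separately, as the text states, rather than the straight-line distance between the two coordinates.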
RASTER DATA In its simplest form, a raster consists of a matrix of cells (or pixels) organized into rows and columns (a grid), where each cell contains a value representing information, such as temperature. Examples of rasters include digital aerial photographs, imagery from satellites, digital pictures, and scanned maps.
Data stored in a raster format represents real-world phenomena: • Thematic data (also known as discrete) represents features such as land-use or soils data. • Continuous data represents phenomena such as temperature, elevation, or spectral data such as satellite images and aerial photographs. • Pictures include scanned maps or drawings and building photographs. • Thematic and continuous rasters may be displayed as data layers along with other geographic data on your map but are often used as the source data for spatial analysis with the ArcGIS Spatial Analyst extension. Picture rasters are often used as attributes in tables—they can be displayed with your geographic data and are used to convey additional information about map features.
Rasters as basemaps A common use of raster data in a GIS is as a background display for other feature layers. For example, orthophotographs displayed underneath other layers provide the map user with confidence that map layers are spatially aligned and represent real objects, as well as additional information. Three main sources of raster basemaps are orthophotos from aerial photography, satellite imagery, and scanned maps. Below is a raster used as a basemap for road data.
Rasters as surface maps Rasters are well suited for representing data that changes continuously across a landscape (surface). They provide an effective method of storing the continuity as a surface. They also provide a regularly spaced representation of surfaces. Elevation values measured from the earth's surface are the most common application of surface maps, but other values, such as rainfall, temperature, concentration, and population density, can also define surfaces that can be spatially analyzed. The raster below displays elevation—using green to show lower elevation and red, pink, and white cells to show higher elevations.
Rasters as thematic maps Rasters representing thematic data can be derived from analyzing other data. A common analysis application is classifying a satellite image by land-cover categories. Basically, this activity groups the values of multispectral data into classes (such as vegetation type) and assigns a categorical value. Thematic maps can also result from geoprocessing operations that combine data from various sources, such as vector, raster, and terrain data. For example, you can process data through a geoprocessing model to create a raster dataset that maps suitability for a specific activity. Below is an example of a classified raster dataset showing land use.
Rasters as attributes of a feature Rasters used as attributes of a feature may be digital photographs, scanned documents, or scanned drawings related to a geographic object or location. A parcel layer may have scanned legal documents identifying the latest transaction for that parcel, or a layer representing cave openings may have pictures of the actual cave openings associated with the point features. Below is a digital picture of a large, old tree that could be used as an attribute to a landscape layer that a city may maintain.
Why store data as a raster? The advantages of storing your data as a raster are as follows: • A simple data structure: a matrix of cells with values representing a coordinate and sometimes linked to an attribute table • A powerful format for advanced spatial and statistical analysis • The ability to represent continuous surfaces and perform surface analysis • The ability to uniformly store points, lines, polygons, and surfaces • The ability to perform fast overlays with complex datasets
There are other considerations for storing your data as a raster that may convince you to use a vector-based storage option. For example: • There can be spatial inaccuracies due to the limits imposed by the raster dataset cell dimensions. • Raster datasets are potentially very large. Resolution increases as the size of the cell decreases; however, normally cost also increases in both disk space and processing speeds. For a given area, changing cells to one-half the current size requires as much as four times the storage space, depending on the type of data and storage techniques used. • There is also a loss of precision that accompanies restructuring data to a regularly spaced raster-cell boundary.
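The storage claim above follows from simple arithmetic: halving the cell size doubles both the number of rows and the number of columns. The sketch below counts cells for a hypothetical 10 km by 10 km area; it assumes uncompressed storage with one value per cell, which is why the text says "as much as" four times.

```python
import math

def cell_count(width, height, cell_size):
    """Number of cells needed to cover a width x height area.

    All arguments use the same linear unit (for example, meters).
    """
    cols = math.ceil(width / cell_size)
    rows = math.ceil(height / cell_size)
    return rows * cols

coarse = cell_count(10_000, 10_000, 100)  # 100 m cells
fine = cell_count(10_000, 10_000, 50)     # 50 m cells
print(coarse, fine, fine / coarse)
```

Compression and the data type of the cell values can reduce the real difference on disk, so the ratio is an upper bound rather than a fixed rule.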
Rasters are stored as an ordered list of cell values, for example, 80, 74, 62, 45, 45, 34, and so on. The area (or surface) represented by each cell consists of the same width and height and is an equal portion of the entire surface represented by the raster. For example, a raster representing elevation (that is, digital elevation model) may cover an area of 100 square kilometers. If there were 100 cells in this raster, each cell would represent 1 square kilometer of equal width and height (that is, 1 km x 1 km).
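As a rough illustration of the two ideas above, the sketch below turns an ordered list of cell values into rows and derives the cell side length from the total area. It assumes the list is stored row by row (row-major order) and that cells are square; the function names are hypothetical.

```python
import math

def to_grid(values, n_cols):
    """Split an ordered list of cell values into rows of n_cols cells."""
    if n_cols <= 0 or len(values) % n_cols != 0:
        raise ValueError("value count must be a multiple of n_cols")
    return [values[i:i + n_cols] for i in range(0, len(values), n_cols)]

def cell_side(total_area, n_cells):
    """Side length of one square cell when n_cells cover total_area equally."""
    return math.sqrt(total_area / n_cells)

print(to_grid([80, 74, 62, 45, 45, 34], 3))
print(cell_side(100, 100))  # 100 square km split into 100 cells
```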
Discrete and continuous data Discrete data, which is sometimes called thematic, categorical, or discontinuous data, most often represents objects in both the feature (vector) and raster data storage systems. A discrete object has known and definable boundaries: it is easy to define precisely where the object begins and where it ends. A lake is a discrete object within the surrounding landscape. Where the water’s edge meets the land can be definitively established. Other examples of discrete objects include buildings, roads, and parcels. Discrete objects are usually nouns.
A continuous surface represents phenomena in which each location on the surface is a measure of the concentration level or its relationship from a fixed point in space or from an emitting source. Continuous data is also referred to as field, nondiscrete, or surface data. One type of continuous surface is derived from those characteristics that define a surface, in which each location is measured from a fixed registration point. These include elevation (the fixed point being sea level) and aspect (the fixed point being direction: north, east, south, and west).
Another type of continuous surface includes phenomena that progressively vary as they move across a surface from a source. Illustrations of progressively varying continuous data are fluid and air movement. These surfaces are characterized by the type or manner in which the phenomenon moves. The first type of movement is through diffusion or any other locomotion in which the phenomenon moves from areas with high concentration to those with less concentration until the concentration level evens out. Surface characteristics of this type of movement include salt concentration moving through either the ground or water, contamination level moving away from a hazardous spill or a nuclear reactor, and heat from a forest fire. In this type of continuous surface, there has to be a source. The concentration is always greater near the source and diminishes as a function of distance and the medium the substance is moving through.
The dimension of the cells can be as large or as small as needed to represent the surface conveyed by the raster dataset and the features within the surface, such as a square kilometer, square foot, or even square centimeter. The cell size determines how coarse or fine the patterns or features in the raster will appear. The smaller the cell size, the smoother or more detailed the raster will be. However, the greater the number of cells, the longer it will take to process, and it will increase the demand for storage space. If a cell size is too large, information may be lost or subtle patterns may be obscured. For example, if the cell size is larger than the width of a road, the road may not exist within the raster dataset. In the diagram below, you can see how this simple polygon feature will be represented by a raster dataset at various cell sizes.
The location of each cell is defined by the row and column where it is located within the raster matrix. Essentially, the matrix is represented by a Cartesian coordinate system, in which the rows of the matrix are parallel to the x-axis and the columns to the y-axis of the Cartesian plane. Row and column values begin with 0. In the example below, if the raster is in a Universal Transverse Mercator (UTM) projected coordinate system and has a cell size of 100, the cell location at 5,1 would be 300,500 East, 5,900,600 North. Often you need to specify the extent of a raster. The extent is defined by the top, bottom, left, and right coordinates of the rectangular area covered by a raster, as shown below.
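The relationship between row/column indices, cell size, and map coordinates can be sketched as follows. This is an illustration under stated assumptions, not the convention of the missing diagram: the origin is taken to be the upper-left corner of the raster, row 0 is the top row, and the returned coordinate is the cell center. The origin values used in the example are hypothetical.

```python
def cell_center(row, col, x_min, y_max, cell_size):
    """Map coordinate of a cell center (assumed upper-left origin, 0-based indices)."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y

def extent(x_min, y_max, n_rows, n_cols, cell_size):
    """Top, bottom, left, and right coordinates of the area covered by the raster."""
    return {
        "left": x_min,
        "right": x_min + n_cols * cell_size,
        "top": y_max,
        "bottom": y_max - n_rows * cell_size,
    }

# Hypothetical 10 x 10 raster with 100 m cells.
print(cell_center(0, 0, 300_000, 5_901_000, 100))
print(extent(300_000, 5_901_000, 10, 10, 100))
```

Other software may measure from the lower-left corner or report cell corners instead of centers, so the origin convention should always be checked before converting indices to coordinates.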
The level of detail (of features/phenomena) represented by a raster is often dependent on the cell (pixel) size, or spatial resolution, of the raster. The cell must be small enough to capture the required detail but large enough so computer storage and analysis can be performed efficiently. More features, smaller features, or a greater detail in the extents of features can be represented by a raster with a smaller cell size. However, more is not always better. Smaller cell sizes result in larger raster datasets to represent an entire surface; therefore, there is a need for greater storage space, which often results in longer processing time.
Choosing an appropriate cell size is not always simple. You must balance your application's need for spatial resolution with practical requirements for quick display, processing time, and storage. Essentially, in a GIS, your results will only be as accurate as your least accurate dataset. If you're using a classified dataset derived from 30-meter resolution Landsat imagery, then creating a digital elevation model (DEM) or other ancillary data at a higher resolution, such as 10 meters, may be unnecessary. The more homogeneous an area is for critical variables, such as topography and land use, the larger the cell size can be without affecting accuracy.
VECTOR AS RASTER http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/How_features_are_represented_in_a_raster/009t00000006000000/ Points A point is represented by an explicit x,y coordinate in vector format, but as a raster, it is represented as a single cell—the smallest unit of a raster. By definition, a point has no area but is converted to a cell representing area. Therefore, the smaller the cell size, the smaller the area and, thus, the closer the representation of the point feature. For example, it is assumed that a well, a telephone pole, or the location of an endangered plant occupies the entire area covered by a cell.
VECTOR AS RASTER Lines In vector format, a line is an ordered list of x,y coordinates, but in raster format, it is represented as a chain of spatially connected cells with the same value. When there is a break between the chain of same-valued cells, it represents a break in the line feature, which could represent different features such as two roads or two rivers that do not intersect.
VECTOR AS RASTER Polygons A vector polygon is an enclosed area defined by an ordered list of x,y coordinates in which the first and last coordinates are the same, thereby representing area. By contrast, a raster polygon is a group of contiguous cells with the same value that most accurately portray the shape of the area. Polygonal, or area, data is best represented by a series of connected cells. Examples of polygonal features include buildings, ponds, soils, forests, swamps, and fields. The accuracy of the raster representation below is dependent on the scale of the data and the size of the cell. The finer the cell resolution and the greater the number of cells that represent small areas, the more accurate the representation.
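The effect of cell size on a rasterized polygon can be demonstrated with a small sketch. It assumes one simple assignment rule, that a cell belongs to the polygon when the cell center falls inside it; real conversion tools may use other rules. The polygon, the 4 x 4 unit extent, and the function names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a closed list of (x, y) with first == last."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(ring, x_min, y_max, n_rows, n_cols, cell_size):
    """Return rows of 1/0 flags, testing each cell center (row 0 is the top row)."""
    grid = []
    for r in range(n_rows):
        cy = y_max - (r + 0.5) * cell_size
        grid.append([
            1 if point_in_polygon(x_min + (c + 0.5) * cell_size, cy, ring) else 0
            for c in range(n_cols)
        ])
    return grid

# Hypothetical rectangle with a true area of 2.6 x 1.4 = 3.64 square units.
ring = [(0.8, 0.8), (3.4, 0.8), (3.4, 2.2), (0.8, 2.2), (0.8, 0.8)]
for size, n in [(1, 4), (2, 2), (4, 1)]:
    grid = rasterize(ring, 0, 4, n, n, size)
    area = sum(map(sum, grid)) * size * size
    print(size, grid, area)
```

With these inputs the rasterized area is 2, 8, and 16 square units for cell sizes 1, 2, and 4, against a true area of 3.64. The error is largest at the coarsest cell size, which is the point the text makes about cell resolution.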
RASTER AS VECTOR: Converting raster data to polygons When you convert a raster dataset containing area features, each group of contiguous cells with the same values converts to a polygon. Arcs are created from cell borders in the raster. NoData cells in the input raster do not become polygons in the output. The following is a raster image (left) and what it looks like once converted to polygons (right):
RASTER AS VECTOR: Converting raster data to polylines When you convert a raster dataset containing linear features, a polyline is created from each cell in the input raster dataset. The polyline is positioned so that it passes through the center of each cell. NoData cells in the input raster dataset do not become features in the output. The following is a raster image (left) and what it looks like once converted to polyline features (right):
RASTER AS VECTOR: Converting raster data to points When you convert a raster dataset containing point features, each cell in the input raster dataset converts to a point in the output. Each new point is positioned at the center of the cell it represents. NoData cells do not convert to points. The following is a raster image (left) and what it looks like once converted to point features (right):
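The raster-to-point rule described above (one point per cell, placed at the cell center, NoData skipped) can be sketched as follows. The sketch assumes NoData is stored as `None`, an upper-left origin, and row 0 as the top row; none of these details come from the ArcGIS tool itself.

```python
NODATA = None  # assumption: NoData cells are stored as None

def raster_to_points(grid, x_min, y_max, cell_size):
    """Convert each non-NoData cell to an (x, y, value) point at the cell center."""
    points = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is NODATA:
                continue  # NoData cells do not become points
            x = x_min + (c + 0.5) * cell_size
            y = y_max - (r + 0.5) * cell_size
            points.append((x, y, value))
    return points

grid = [
    [NODATA, 7],
    [3, NODATA],
]
print(raster_to_points(grid, x_min=0, y_max=20, cell_size=10))
```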
Vector Data • Advantages: • Data can be represented at its original resolution and form without generalization. • Graphic output is usually more aesthetically pleasing (traditional cartographic representation). • Since most data, e.g. hard copy maps, is in vector form, no data conversion is required. • Accurate geographic location of data is maintained. • Allows for efficient encoding of topology, and as a result more efficient operations that require topological information, e.g. proximity, network analysis. • Disadvantages: • The location of each vertex needs to be stored explicitly. • For effective analysis, vector data must be converted into a topological structure. This is often processing intensive and usually requires extensive data cleaning. As well, topology is static, and any updating or editing of the vector data requires re-building of the topology. • Algorithms for manipulative and analysis functions are complex and may be processing intensive. Often, this inherently limits the functionality for large data sets, e.g. a large number of features. • Continuous data, such as elevation data, is not effectively represented in vector form. Usually substantial data generalization or interpolation is required for these data layers. • Spatial analysis and filtering within polygons is impossible.
Raster Data • Advantages: • The geographic location of each cell is implied by its position in the cell matrix. Accordingly, other than an origin point, e.g. bottom left corner, no geographic coordinates are stored. • Due to the nature of the data storage technique, data analysis is usually easy to program and quick to perform. • The inherent nature of raster maps, e.g. one attribute maps, is ideally suited for mathematical modeling and quantitative analysis. • Discrete data, e.g. forestry stands, is accommodated equally well as continuous data, e.g. elevation data, which facilitates integrating the two data types. • Grid-cell systems are very compatible with raster-based output devices, e.g. electrostatic plotters, graphic terminals. • Disadvantages: • The cell size determines the resolution at which the data is represented. • It is especially difficult to adequately represent linear features, depending on the cell resolution. Accordingly, network linkages are difficult to establish. • Processing of associated attribute data may be cumbersome if large amounts of data exist. Raster maps inherently reflect only one attribute or characteristic for an area. • Since most input data is in vector form, data must undergo vector-to-raster conversion. Besides increased processing requirements, this may introduce data integrity concerns due to generalization and choice of inappropriate cell size. • Most output maps from grid-cell systems do not conform to high-quality cartographic needs. http://bgis.sanbi.org/gis-primer/page_19.htm
Triangular irregular networks (TIN) have been used by the GIS community for many years and are a digital means to represent surface morphology. TINs are a form of vector-based digital geographic data and are constructed by triangulating a set of vertices (points). The vertices are connected with a series of edges to form a network of triangles. There are different methods of interpolation to form these triangles, such as Delaunay triangulation or distance ordering. The edges of TINs form contiguous, nonoverlapping triangular facets and can be used to capture the position of linear features that play an important role in a surface, such as ridgelines or stream courses. The graphics on the right show the nodes and edges of a TIN (left) and the nodes, edges, and faces of a TIN (right). http://resources.arcgis.com/en/help/main/10.1/index.html#//006000000001000000
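One common use of a TIN is to estimate elevation at a location that is not a node. The sketch below assumes each triangular face is treated as a flat plane, so the z-value inside a face is a weighted average of the three node elevations (barycentric interpolation). It handles a single triangle only and does not build the triangulation; the function name and the sample nodes are hypothetical.

```python
def interpolate_z(p, a, b, c):
    """Linearly interpolate z at p = (x, y) on the face with nodes a, b, c.

    Each node is (x, y, z). Returns None when p lies outside the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: nodes are collinear")
    # Barycentric weights: each is 1 at its own node and 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical face with node elevations of 10, 20, and 30.
a, b, c = (0, 0, 10), (10, 0, 20), (0, 10, 30)
print(interpolate_z((5, 0), a, b, c))    # halfway along edge a-b
print(interpolate_z((20, 20), a, b, c))  # outside the face
```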
http://upload.wikimedia.org/wikipedia/commons/9/97/Digitales_Gel%C3%A4ndemodell.png
[Figure: components of a TIN, labeled Contours, Edges, Faces, and Nodes. Image: http://resources.arcgis.com/en/help/main/10.1/0060/GUID-0684D3F4-B5BA-4C53-9613-BBC48BBDFC21-web.png]
Feature class storage in the geodatabase In the geodatabase, each feature class is managed in a single table. A Shape column in each row is used to hold the geometry or shape of each feature. In the feature class table, the following are true: • Each feature class is a table. • Individual features are held as rows. • Feature attributes are recorded in columns. • The Shape column holds each feature's geometry (point, line, polygon, and so forth). • The ObjectID column holds the unique identifier for each feature.
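The table layout listed above can be mimicked with a small in-memory structure. This is only a teaching sketch, not the geodatabase API: the class names `Feature` and `FeatureClass`, the `insert` method, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    """One row of the table: ObjectID, Shape, and attribute columns."""
    object_id: int
    shape: tuple  # (geometry type, coordinates)
    attributes: dict = field(default_factory=dict)

class FeatureClass:
    """A table whose rows all share one geometry type."""

    def __init__(self, name, geometry_type):
        self.name = name
        self.geometry_type = geometry_type
        self.rows = []
        self._next_id = 1

    def insert(self, coordinates, **attributes):
        """Add a feature as a new row and return its unique ObjectID."""
        row = Feature(self._next_id, (self.geometry_type, coordinates), attributes)
        self.rows.append(row)
        self._next_id += 1
        return row.object_id

manholes = FeatureClass("manhole_covers", "point")
manholes.insert((482_100.0, 4_430_250.0), material="cast iron")
manholes.insert((482_180.0, 4_430_310.0), material="composite")
for row in manholes.rows:
    print(row.object_id, row.shape, row.attributes)
```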