TERMS, CONCEPTS and DATA TYPES IN GIS
Orhan Gündüz
Data used in GIS systems are of two major types:
• Shape data
• Raster data
Shape data is further divided into three:
• Point data
• Line/Polyline data
• Polygon data
Point data:
• 0-D object
• A point is a combination of two numbers (X,Y)
• Represents well locations, crime scenes, cities…
Line/polyline data:
• 1-D object
• A line is the shortest distance between two points
• Has a beginning and an ending point
• Represents streams, boundaries, roads…
Polygon data:
• 2-D object
• A polygon is a set of points connected by line segments that close back to the first vertex
• Represents lakes, lots…
[Figure: POINT, LINE, POLYLINE and POLYGON drawn with their vertex coordinates (X,Y), (X1,Y1), (X2,Y2) … (Xn,Yn); a line has a left and a right side, a polygon has an inside and an outside.]
* Always follow the counter-clockwise direction when creating the polygon.
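The counter-clockwise rule can be checked numerically. The sketch below uses the shoelace (signed area) formula, which gives a positive area for a counter-clockwise ring and a negative one for a clockwise ring. The function names `signed_area` and `ensure_counter_clockwise` are illustrative and do not come from any particular GIS package.

```python
def signed_area(vertices):
    """Shoelace formula: positive for counter-clockwise rings, negative for clockwise."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_counter_clockwise(vertices):
    """Return the vertices in counter-clockwise order, reversing them if needed."""
    if signed_area(vertices) >= 0:
        return list(vertices)
    return list(reversed(vertices))


# A 2 x 2 square digitized clockwise (up, right, down).
square_cw = [(0, 0), (0, 2), (2, 2), (2, 0)]
print(signed_area(square_cw))                            # -4.0
print(signed_area(ensure_counter_clockwise(square_cw)))  # 4.0
```

The absolute value of the result is the polygon area, so the same function also covers a basic area calculation.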
Node: A special type of point where at least 3 line segments intersect. Defined by a pair of coordinates (X1,Y1).
Pixel: Smallest indivisible element of an image (e.g., a pixel in a digital picture)
Grid/Grid cell: 2-D object feature that represents a single element of a continuous surface (used in raster data)
Symbol: A graphic element that represents features or attributes on a map (e.g., Hospital, Airport)
Annotation: Text or label graphically pointing to a feature (e.g., ANKARA, Gediz River)
GIS Operations
• Forward data display (from data to map)
• Backward data display (from map to data)
• Point in polygon analysis
• Line in polygon analysis
• Polygon overlay
• Buffers
• Thematic mapping (data display and capture)
• Area/Distance calculation operations
• Geocoding/address matching
• Network analysis
• Surface modeling
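As an illustration of one operation from this list, point in polygon analysis is often done with a ray casting test: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as "inside". The sketch below is one common way to write it, not necessarily how a given GIS package implements it; the name `point_in_polygon` is hypothetical, and points lying exactly on the boundary are not handled specially.

```python
def point_in_polygon(x, y, vertices):
    """Ray casting: True if (x, y) lies inside the polygon given by its vertices."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # next vertex, wrapping to close the ring
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # X coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # the ray to the right crosses this edge
    return inside


# A rectangular lot and two wells, one inside and one outside.
lot = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(point_in_polygon(1, 1, lot))  # True
print(point_in_polygon(5, 1, lot))  # False
```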
Concept of metadata:
• “Data about data”
• An overall description of the contents of the database
• Documents the data
• Describes files, formats, locations, source …
• Very important
Types of Computerized Systems Used in GIS
• Standalone systems (single PC, local data storage and processing)
• Networked systems (NT, local processing, centralized data storage, requires authorization)
• Centralized systems (UNIX, centralized data storage/manipulation)
GIS
Vector-based GIS
• Objects stored as points, lines and polygons
• Data can be grouped
• All data have (X,Y) coordinates
• Thematic representation is possible
• Overlay operations are difficult
• Boundaries are easily defined
Raster-based GIS
• Objects stored as grids
• The higher the resolution, the better the data representation
• Poor in boundary definition
• Difficulty in defining vector-like objects (e.g., a road, a river, a fence)
• Best for overlay operations
• Powerful in modeling
Topology:
• The relationship between and among objects
• Topology is the branch of mathematics which concerns itself with the concepts of:
  • Direction
  • Connectivity
  • Adjacency or contiguity
  • Proximity
• This design feature allows the computer to know the actual relationship among its graphic parts
• Topological data structure is based on nodes and edges
• Commonly used in GIS operations
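A node-and-edge structure can be sketched with a plain adjacency map. The example below stores each line segment as an edge between two coordinate pairs and then reports the points where at least three segments meet. The names `build_adjacency` and `find_nodes` and the sample coordinates are made up for illustration. Real topological structures also store direction and left/right polygons, which are omitted here.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map every end point to the set of points it is directly connected to."""
    adjacency = defaultdict(set)
    for start, end in edges:
        adjacency[start].add(end)
        adjacency[end].add(start)
    return adjacency


def find_nodes(edges, min_degree=3):
    """Points where at least `min_degree` segments meet (connectivity)."""
    adjacency = build_adjacency(edges)
    return sorted(p for p, neighbours in adjacency.items()
                  if len(neighbours) >= min_degree)


# Three stream segments join at (1, 1); a fourth continues from (1, 3).
edges = [
    ((0, 0), (1, 1)),
    ((1, 1), (2, 0)),
    ((1, 1), (1, 3)),
    ((1, 3), (2, 4)),
]
print(find_nodes(edges))  # [(1, 1)]
```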
Data in GIS can be classified according to the following methods:
• Natural breaks
• Quantiles
• Equal area
• Equal interval
• Standard deviation
• Continuous / discontinuous
• Normalization
NATURAL BREAKS
• Data is listed from minimum to maximum
• A break is set wherever the data changes abruptly
• Data between breaks are grouped as a unit
• Statistical methods could also be used to set the breaks
• Variance minimization is an option
QUANTILES
• Data is broken into intervals with the same number of observations
• Mostly useful for linearly distributed data
• Otherwise can be misleading
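Both methods can be sketched in a few lines. For natural breaks, the code below uses a simplified heuristic that cuts the sorted data at its largest gaps; it is not the variance-minimizing approach mentioned above. For quantiles, it splits the sorted data into groups with equal counts. The function names and sample values are illustrative assumptions.

```python
def natural_breaks_by_gaps(values, n_classes):
    """Group sorted data by cutting at the (n_classes - 1) largest gaps."""
    data = sorted(values)
    # Size of each jump between neighbours, paired with its position.
    gaps = [(data[i + 1] - data[i], i) for i in range(len(data) - 1)]
    largest = sorted(gaps, reverse=True)[: n_classes - 1]
    cut_positions = sorted(i for _, i in largest)
    groups, start = [], 0
    for i in cut_positions:
        groups.append(data[start : i + 1])
        start = i + 1
    groups.append(data[start:])
    return groups


def quantile_groups(values, n_classes):
    """Split sorted data into classes holding (nearly) the same number of observations."""
    data = sorted(values)
    n = len(data)
    return [data[k * n // n_classes : (k + 1) * n // n_classes]
            for k in range(n_classes)]


clustered = [1, 2, 3, 10, 11, 12, 30, 31, 32]
print(natural_breaks_by_gaps(clustered, 3))  # [[1, 2, 3], [10, 11, 12], [30, 31, 32]]

skewed = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(quantile_groups(skewed, 3))            # [[1, 2, 3], [4, 5, 6], [7, 8, 100]]
```

The second result shows why quantiles can mislead on non-linear data: the outlier 100 ends up in the same class as 7 and 8.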
EQUAL AREA
• Used to classify polygon data
• Data is divided so that the polygons in each class cover about the same total area
EQUAL INTERVAL
• The data range is divided into intervals of equal width
• Ex: If the data range is (12…351), then the total range is 339. If divided into 3 intervals, this corresponds to equal intervals of 113. Thus, we obtain:
  • 12-125
  • 125-238
  • 238-351
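The equal interval example can be reproduced with a short sketch. The function names are illustrative. The rule that a value sitting exactly on a shared boundary (125 or 238) goes to the upper class is an assumption, since the slide does not say which class owns the boundary.

```python
def equal_interval_breaks(minimum, maximum, n_classes):
    """Return the n_classes + 1 boundaries of equally wide intervals."""
    width = (maximum - minimum) / n_classes
    return [minimum + k * width for k in range(n_classes + 1)]


def class_index(value, breaks):
    """Index of the interval holding `value`; boundary values go to the upper class."""
    for k in range(len(breaks) - 2):
        if value < breaks[k + 1]:
            return k
    return len(breaks) - 2  # the last class also includes the maximum


breaks = equal_interval_breaks(12, 351, 3)
print(breaks)                    # [12.0, 125.0, 238.0, 351.0]
print(class_index(100, breaks))  # 0
print(class_index(125, breaks))  # 1
print(class_index(351, breaks))  # 2
```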
STANDARD DEVIATION
• Mean of the data set is computed
• Interval breaks are placed below and above the mean, at steps of ¼, ½ or 1 standard deviation from the mean
• Suitable for presenting data that has density information such as population, traffic accidents
• As data accumulates around the mean and disperses from it according to the standard deviation, one can see the areas where data accumulates and disperses
CONTINUOUS / DISCONTINUOUS
• Using upper limits for continuous data
• Using both upper and lower limits for discontinuous data
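A minimal sketch of standard deviation breaks follows. It assumes the population standard deviation (`statistics.pstdev`) and a symmetric number of steps on each side of the mean. The slide fixes neither choice, and the name `std_dev_breaks` and its parameters are hypothetical.

```python
import statistics


def std_dev_breaks(values, step=1.0, n_steps=2):
    """Breaks at mean + k * step * sd for k = -n_steps .. n_steps."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (an assumption)
    return [mean + k * step * sd for k in range(-n_steps, n_steps + 1)]


# Sample data with mean 5 and standard deviation 2.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_dev_breaks(counts, step=0.5, n_steps=2))  # [3.0, 4.0, 5.0, 6.0, 7.0]
```

With `step=0.5` the breaks fall every half standard deviation; `step=0.25` or `step=1.0` gives the other two spacings named above.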
NORMALIZATION
• Instead of the data value itself, a normalized version is used in representation
• Normalization is generally done by dividing by the sum of all data or by the maximum value of the data
• Data value / sum(data) * 100
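Both variants can be written as one-line transformations. The function names are illustrative. Scaling the maximum-based version by 100 is an assumption made for symmetry, since the slide gives the factor only for the sum-based formula.

```python
def normalize_by_sum(values):
    """Each value as a percentage of the total: value / sum(data) * 100."""
    total = sum(values)
    return [v / total * 100 for v in values]


def normalize_by_max(values):
    """Each value as a percentage of the largest value (the factor 100 is assumed)."""
    peak = max(values)
    return [v / peak * 100 for v in values]


population = [50, 30, 20]
print(normalize_by_sum(population))  # shares of the total, summing to 100
print(normalize_by_max(population))  # the largest value maps to 100
```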