Geographical Data: types, relations, measures, classifications, dimension, aggregation
To be seen on maps • Topographic map: urban, grass, water, dike, text (name, elevation) • Classified isoline map
To be seen on maps • Choropleth map: map with administrative boundaries which shows per region a value by a color or shade • Example: use of pesticide 1_3_D per county
Maps show ... • Relation of place (geographic location) to a value (here 780 mm precipitation) or name (here is Minnesota) • An abstraction (model, simplification) of reality • A combination of themes (different sorts of data) • Connections (subway maps), e.g. the Tokyo subway map
Scales of measurement • Nominal scale • Ordinal scale • Interval scale • Ratio scale • (Angle/direction, vector, …) Classification of types of data by statistical properties (Stevens, 1946)
Nominal scale • Administrative map (names of the countries) • Land use map (names of land use: urban, grass, forest, water, …) • Geological map (names of soil types: sand, clay, rock, …) Finite number of classes, each with a name. Testing is possible for equivalence of name.
Ordinal scale • Dutch secondary school type (VMBO, HAVO, VWO) • Wind force on the Beaufort scale (0 = no wind, ..., 6 = heavy wind, ..., 9 = storm, ...) • Questionnaire answers (disagree, partly disagree, neutral, partly agree, agree) Finite number of classes, each with a name. Testing for equivalence of name and for order.
Interval scale • Temperature in degrees Celsius or Fahrenheit • Time/year on the Christian calendar Unbounded number of classes, each with a value. Testing for equivalence, for order and for difference (a unit distance exists).
Ratio scale • Measurements: concentration of lead in soil • Counts: population, number of airports • Percentages: unemployment percentage, percentage of land use type forest Unbounded number of classes, each with a value. Testing for equivalence, for order, for difference and for ratio (a natural zero exists).
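The tests named on the four scale slides can be collected in a small lookup table. This is only a sketch: the names `PERMITTED_TESTS` and `is_meaningful` are made up for illustration and do not come from the slides.

```python
# Tests that are meaningful on each scale of measurement,
# as listed on the nominal, ordinal, interval and ratio slides.
# The dictionary and function names are hypothetical.
PERMITTED_TESTS = {
    "nominal": {"equivalence"},
    "ordinal": {"equivalence", "order"},
    "interval": {"equivalence", "order", "difference"},
    "ratio": {"equivalence", "order", "difference", "ratio"},
}


def is_meaningful(scale: str, test: str) -> bool:
    """Return True if `test` may be applied to data on `scale`."""
    return test in PERMITTED_TESTS[scale]


# Temperature in degrees Celsius is on the interval scale:
# differences are meaningful, ratios are not.
celsius_difference_ok = is_meaningful("interval", "difference")
celsius_ratio_ok = is_meaningful("interval", "ratio")
```

Each scale allows every test of the scale before it plus one more, which is why the sets grow from nominal to ratio.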
Overview collection twodata
Other scales • Angle (wind direction, direction of spreading) • Vector: angle and value (primary wind direction and speed) • Categorical scales with partial membership (fuzzy sets; points on the indeterminate boundary between “plains” and “mountains”; location of the coastline: tide)
Classification schemes • Data on nominal scale: hierarchical classification schemes • Example, land use: urban (living: houses, flats; working), agriculture (cattle, fruit, plants), nature, water
Classification schemes • Data on interval and ratio scales, e.g. 4, 5, 5, 8, 12, 14, 17, 23, 27 • Fixed intervals: [1-10], [11-20], [21-30] • Fixed intervals based on spread: [4-11], [12-19], [20-27] • Quantiles (equal representatives): [4-5], [8-14], [17-27] • “Natural” boundaries: [4-5], [8-17], [23-27]
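As a rough sketch, the first three schemes can be reproduced on the nine example values as follows. The function names are invented here, the code assumes integer data, and “natural” boundaries are left out because the slide does not say how they were chosen.

```python
def fixed_intervals(values, start, width):
    """Group integer values into classes [start, start+width-1], and so on."""
    classes = {}
    for v in values:
        index = (v - start) // width
        low = start + index * width
        classes.setdefault((low, low + width - 1), []).append(v)
    return classes


def spread_intervals(values, k):
    """k equally wide classes that together span min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = -(-(hi - lo + 1) // k)  # ceiling division on integers
    return fixed_intervals(values, lo, width)


def quantile_classes(values, k):
    """k classes with (nearly) the same number of values in each."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


data = [4, 5, 5, 8, 12, 14, 17, 23, 27]
fixed = fixed_intervals(data, start=1, width=10)
spread = spread_intervals(data, k=3)
quantiles = quantile_classes(data, k=3)
```

With these inputs the class limits come out as on the slide: (1, 10), (11, 20), (21, 30) for fixed intervals, (4, 11), (12, 19), (20, 27) for intervals based on spread, and three values per class for the quantiles.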
Classification schemes, cont’d • Statistical boundaries: average μ, standard deviation σ, then e.g. boundaries μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ • Arbitrary
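A minimal sketch of statistical boundaries, reusing the nine example values from the previous slide. The slides do not say whether the population or the sample standard deviation is meant; the population version (`statistics.pstdev`) is an assumption here, and `statistical_boundaries` is a made-up name.

```python
from statistics import mean, pstdev


def statistical_boundaries(values):
    """Class boundaries at the average and at 1 and 2 standard deviations from it."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [mu - 2 * sigma, mu - sigma, mu, mu + sigma, mu + 2 * sigma]


data = [4, 5, 5, 8, 12, 14, 17, 23, 27]
boundaries = statistical_boundaries(data)
```

The five boundaries split the value range into six classes, with the two outer classes open-ended.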
Two classifications of the same data (counties of Arizona, total population): quartiles and four equal intervals
Why is choice of classification important? • Visualization often needs classification • Choice of class intervals influences interpretation. Think of a report on air pollution caused by a factory, written either by the board of the factory or by an environmental organization
Data: object and field view • Object view: discrete objects in the real world (road, telephone pole, lake) • Field view: geographic variable has a “value” at every location in the real world (elevation, temperature, soil type, land cover)
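One possible way to picture the two views in code is sketched below. The object view is a list of discrete objects with their own geometry; the field view is approximated by a raster grid that answers a value for any location it covers. All names, coordinates and elevations are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeoObject:
    """Object view: a discrete thing with a kind and a geometry."""
    kind: str
    geometry: list  # list of (x, y) coordinate pairs


objects = [
    GeoObject("telephone pole", [(2.0, 3.0)]),
    GeoObject("road", [(0.0, 0.0), (10.0, 4.0), (25.0, 4.0)]),
]

# Field view: a value at every location, stored here as a raster
# with square cells of CELL_SIZE units (made-up elevations in meters).
CELL_SIZE = 10.0
elevation = [
    [12.0, 12.5, 13.0],  # row 0: 0 <= y < 10
    [11.0, 11.5, 12.0],  # row 1: 10 <= y < 20
]


def elevation_at(x: float, y: float) -> float:
    """Look up the elevation of the raster cell that contains (x, y)."""
    col = int(x // CELL_SIZE)
    row = int(y // CELL_SIZE)
    return elevation[row][col]
```

Asking "what is here?" is natural in the field view (`elevation_at(25.0, 5.0)`), while asking "where is it?" is natural in the object view (`objects[0].geometry`).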
Reference system • Data according to the scales of measurement are attribute values in a reference system • A geographical reference system is spatial, temporal or both. Example: at 12 noon on August 26, 1999, a temperature of 17.6 degrees Celsius is measured at 5 degrees longitude and 53 degrees latitude
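The example measurement can be written down as a record that ties one attribute value to a spatial and a temporal reference. This is a sketch only: the `Observation` class and its field names are invented, and the time zone is not given on the slide, so the timestamp is left naive.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """An attribute value placed in a spatial and temporal reference system."""
    when: datetime        # temporal reference (time zone unknown)
    longitude_deg: float  # spatial reference
    latitude_deg: float
    attribute: str
    value: float
    unit: str


obs = Observation(
    when=datetime(1999, 8, 26, 12, 0),
    longitude_deg=5.0,
    latitude_deg=53.0,
    attribute="temperature",
    value=17.6,
    unit="degrees Celsius",
)
```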
Spatial objects • Points; 0-dimensional, e.g. measurement point • (Polygonal) line; 1-dimensional, e.g. border between Bolivia and Peru • Polygons; 2-dimensional, e.g. Switzerland • Sets of points, e.g. locations of accidents • Systems of lines (trees, graphs), e.g. street network • Sets of polygons, subdivisions, e.g. island group, provinces of the Netherlands
Dependency of dimension • Dimension of an object can be scale dependent: the Rhine river at scale 1:25,000 is 2-dim.; the Rhine at scale 1:1,000,000 is 1-dim. • Dimension of an object can be application dependent: the Rhine as transport route is 1-dim. (length is relevant, not the surface area); the Rhine as land cover in the Netherlands is 2-dim.
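One way to make the scale dependence concrete is to compute how wide a river would be drawn on paper at each scale. The river width of 300 m and the 0.5 mm drawing threshold below are made-up numbers for illustration, not values from the slides.

```python
def map_width_mm(real_width_m: float, scale_denominator: int) -> float:
    """Width on the map in millimeters for a scale of 1:scale_denominator."""
    return real_width_m * 1000.0 / scale_denominator


def drawn_dimension(real_width_m: float, scale_denominator: int,
                    min_area_width_mm: float = 0.5) -> int:
    """Return 2 if the object is wide enough to draw as an area, else 1 (a line).

    The 0.5 mm threshold is a hypothetical choice.
    """
    if map_width_mm(real_width_m, scale_denominator) >= min_area_width_mm:
        return 2
    return 1


RIVER_WIDTH_M = 300.0  # hypothetical river width

dim_large_scale = drawn_dimension(RIVER_WIDTH_M, 25_000)     # 12 mm wide on the map
dim_small_scale = drawn_dimension(RIVER_WIDTH_M, 1_000_000)  # 0.3 mm wide on the map
```

Under these assumptions the river is an area at 1:25,000 and collapses to a line at 1:1,000,000.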
The third dimension • Elevation can be considered an attribute on the ratio (!?) scale at (x,y)-coordinates • For civil engineering: a crossing of street and railroad can be at the same level, or one above the other • Data on subsurface layers and their thickness
The time component • Same region, same themes, different dates: Allows computation of change • Trajectories give the locations at certain times for moving objects
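Computing change from two dates can be as simple as comparing two grids cell by cell. The sketch below uses two tiny, invented land cover rasters of the same region; the years and cell values are hypothetical.

```python
# Same region, same theme (land cover), two dates. All values are made up.
landcover_1990 = [
    ["grass", "grass", "water"],
    ["urban", "grass", "water"],
]
landcover_2000 = [
    ["urban", "grass", "water"],
    ["urban", "urban", "water"],
]


def changed_cells(before, after):
    """List (row, col, old, new) for every cell whose class differs."""
    changes = []
    for r, (row_before, row_after) in enumerate(zip(before, after)):
        for c, (old, new) in enumerate(zip(row_before, row_after)):
            if old != new:
                changes.append((r, c, old, new))
    return changes


changes = changed_cells(landcover_1990, landcover_2000)
```

Here two cells change, both from grass to urban.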
Level of aggregation • Income of an individual • Average income in a municipality • Average income in a province • Average income in a country Each step down this list is a higher level of aggregation.
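The levels can be sketched by averaging the same individual incomes over larger and larger units. The municipalities "M1" to "M3", provinces "P1" and "P2", and all incomes are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# (municipality, province, income of an individual); hypothetical records.
people = [
    ("M1", "P1", 20000),
    ("M1", "P1", 40000),
    ("M2", "P1", 60000),
    ("M3", "P2", 30000),
    ("M3", "P2", 50000),
]


def average_by(records, key_index):
    """Average income per unit, where the unit is records[i][key_index]."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {unit: mean(incomes) for unit, incomes in groups.items()}


per_municipality = average_by(people, 0)
per_province = average_by(people, 1)
country_average = mean(income for _, _, income in people)
```

Each higher level is computed from the individual records rather than from the averages one level below, since units with different numbers of people would otherwise be weighted equally.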
Various aggregations in the Netherlands • Provinces (12) • Municipalities (441) • COROP regions (40) • Water districts (39) • Economic-geographic regions (129) • 2- and 4-digit postal codes • Macro-regions (4 or 5; provinces joined) • Labor exchange district (127), planning region (43), nodal region (80), ...
Aggregation: dangers • MAUP: modifiable areal unit problem • Aggregation boundaries have nothing to do with the mapped theme Map: located occurrences of a rare disease, counted per region (classes 0-1, 2-4, 5 or more); clustering?
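The effect can be sketched in one dimension: the same case locations are counted under two different sets of region boundaries. The case positions and both zonings are hypothetical, and `count_per_zone` is a made-up helper.

```python
from bisect import bisect_right


def count_per_zone(positions, boundaries):
    """Count positions per zone; zone i covers [boundaries[i], boundaries[i+1])."""
    counts = [0] * (len(boundaries) - 1)
    for x in positions:
        i = bisect_right(boundaries, x) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


# Hypothetical case locations along a line, with four cases close together.
cases = [3.0, 9.0, 9.5, 10.5, 11.0, 17.0]

zoning_a = [0, 10, 20]       # a boundary runs straight through the cluster
zoning_b = [-5, 5, 15, 25]   # one zone contains the whole cluster

counts_a = count_per_zone(cases, zoning_a)
counts_b = count_per_zone(cases, zoning_b)
```

With zoning A the cases look evenly spread (3 and 3); with zoning B the middle zone stands out (1, 4, 1). The data are identical, only the areal units differ.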
Aggregation: dangers • Not enough aggregation: privacy violations (e.g. AIDS cases with complete postal code) • Correction for population spread is necessary in case of data on people Map: located occurrences of a rare disease, counted per region (classes 0-1, 2-4, 5 or more); clustering?
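A common way to correct for population spread is to map a rate instead of a raw count. The sketch below uses cases per 100,000 inhabitants; the region names, case counts and populations are invented.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Number of cases per 100,000 inhabitants."""
    return cases * 100_000 / population


# Hypothetical regions: (cases, population).
regions = {
    "city": (5, 500_000),
    "village": (2, 20_000),
}

rates = {name: rate_per_100k(c, p) for name, (c, p) in regions.items()}
```

The city has more cases in absolute terms, but the village has the higher rate (10 against 1 per 100,000), so a map of raw counts and a map of rates would suggest different clusters.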
Huntington’s disease, 1800-1900
Summary • Data is geometry, attribute, and time • Data is coded in a reference system • Attribute data is usually on one of the standard scales of measurement • Classification of interval and ratio data is needed for mapping (isoline or choropleth) and histograms • The object view and field view exist • Geometric data has a dimension (point, line, area), but this may depend on scale and application • Data is often spatially aggregated