Exploratory Analysis of Spatial Data and Decision Making with the Use of Interactive Maps and Linked Dynamic Displays. Tutorial. N. Andrienko and G. Andrienko. SPADE: Spatial Decision Support Team, Fraunhofer AIS: Institute for Autonomous Intelligent Systems. http://www.ais.fraunhofer.de/SPADE
Contents • Introduction • Interactive maps • Dynamically linked views • Exploration of time-variant data • Spatial decision making • Conclusions
Introduction • Introduction • What is Exploratory Data Analysis • Spatially referenced data and cartographic visualisation • Major techniques for thematic mapping • Interactive maps • Dynamically linked views • Exploration of time-variant data • Spatial decision making • Conclusions
Exploratory Data Analysis Originator: John W. Tukey Main idea: represent data so as to facilitate understanding and prompt hypotheses “The greatest value of a picture is when it forces us to notice what we never expected to see.” John W. Tukey
Exploratory Spatial Analysis (example) Dr. John Snow: Investigation of deaths from cholera London, September 1854 death locations spatial cluster infected water pump? A good data representation is the key to solving the problem
Typical Tasks in Exploratory Spatial Analysis • Identify spatial patterns • Identify information relevant to explaining the patterns • Identify relationships between spatial phenomena
Tools for Exploratory Spatial Analysis: Thematic Maps General reference maps …show locations of objects Thematic maps …show spatial distribution of attributes
Some Techniques for Thematic Mapping (1) Choropleth maps: enumeration units coloured or shaded to represent different magnitudes of an attribute Colour scales: sequential (gradient) or diverging (double-ended) Classified: colours correspond to value intervals Unclassified: degrees of darkness are proportional to values
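The difference between the unclassified and classified variants can be sketched in a few lines of Python. This is only an illustration, not the tutorial tool's implementation: the function names, the sample values, and the convention that a value equal to a break falls into the lower class are all assumptions.

```python
from bisect import bisect_left

def unclassified_darkness(value, vmin, vmax):
    """Darkness in [0, 1], proportional to the value's position in [vmin, vmax]."""
    if vmax == vmin:
        return 0.0
    t = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, t))

def classified_index(value, breaks):
    """Index of the value interval (class); breaks must be sorted.

    Assumption: a value equal to a break belongs to the lower class.
    """
    return bisect_left(breaks, value)

# Made-up attribute values for three enumeration units.
values = {"A": 2.0, "B": 5.0, "C": 10.0}
lo, hi = min(values.values()), max(values.values())

darkness = {k: unclassified_darkness(v, lo, hi) for k, v in values.items()}
classes = {k: classified_index(v, [4.0, 8.0]) for k, v in values.items()}
```

In the unclassified map every distinct value gets its own shade, while the classified map gives one colour per interval, so units B and C above would differ in class only because they fall on different sides of the break at 8.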
Some Techniques for Thematic Mapping (2) Chart maps: sizes of chart segments are proportional to values of several attributes Pie charts: one slice per attribute, angle proportional to value; pie size (area) proportional to the sum of all values Bar charts: one bar per attribute, height proportional to value
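A minimal sketch of the pie geometry described on this slide, assuming the pie area (not the radius) is made proportional to the sum. The function name, the `area_per_unit` parameter, and the sample values are hypothetical.

```python
import math

def pie_geometry(values, area_per_unit=1.0):
    """Return (radius, slice_angles_in_degrees) for one pie chart."""
    total = sum(values)
    if total <= 0:
        return 0.0, [0.0 for _ in values]
    # Area = pi * r^2 is proportional to the total, so r grows with sqrt(total).
    radius = math.sqrt(total * area_per_unit / math.pi)
    # Each slice angle is proportional to its attribute's share of the total.
    angles = [360.0 * v / total for v in values]
    return radius, angles

# Made-up values of three attributes for one district.
radius, angles = pie_geometry([10.0, 30.0, 60.0])
```

Scaling the area rather than the radius keeps a district with four times the total from looking sixteen times larger.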
Current Exploratory Tools High interactivity: due to direct manipulation computer screens will play no less revolutionary role for data exploration than the invention of Cartesian coordinates (W. Cleveland 1993) Enabling multiple complementary views: allow the user ... to “see” data from multiple perspectives (A. MacEachren and M.-J. Kraak 1997)
Part 1: Interactive Maps • Introduction • Interactive maps • Manipulation of unclassified choropleth maps • Dynamic classification and cross-classification • Manipulation of chart maps • Dynamically linked views • Exploration of time-variant data • Spatial decision making • Conclusions
Interactive Maps Access data through a map: no need for value decoding Change representation parameters to make the analysis more convenient
Unclassified Choropleth Maps: Removing Outliers (1) Outlier: a very high (or very low) value, far apart from the others Interactive outlier removal Effect of outlier removal Figure callouts: almost equal darkness; maximal darkness; maximal darkness now corresponds to a lower value; no darkness for this value
Unclassified Choropleth Maps: Removing Outliers (2) more shades to perceive almost equal darkness After the removal of two outliers, the differences are better seen
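Why removing an outlier makes the remaining differences visible can be shown with a small sketch. It assumes darkness is linear between the current lower and upper limits; returning `None` for a removed value is a placeholder, since the slides do not say how the tool draws removed outliers. All numbers are made up.

```python
def scaled_darkness(value, lower, upper):
    """Darkness in [0, 1] within [lower, upper]; None marks a removed outlier."""
    if value < lower or value > upper:
        return None
    if upper == lower:
        return 0.0
    return (value - lower) / (upper - lower)

# Made-up values; 100.0 plays the role of the outlier.
values = [10.0, 12.0, 14.0, 100.0]

# Before removal the scale spans 10..100, so the first three look almost equal.
before = [scaled_darkness(v, min(values), max(values)) for v in values]

# After removal the scale spans 10..14 and uses the full range of shades.
kept = sorted(values)[:-1]
after = [scaled_darkness(v, min(kept), max(kept)) for v in values]
```

Here `before` gives the three ordinary values darkness below 0.05, while `after` spreads them over 0, 0.5 and 1.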
Removing Outliers: Exercise Find and remove the outlier on the map of “% female 1991” (Portugal)
Unclassified Choropleth Maps: Pattern Investigation (1) click here to transform the colour scale from sequential to diverging move the slider and observe how the map changes
Unclassified Choropleth Maps: Pattern Investigation (2) By moving the slider, we see more patterns and gain more understanding of value distribution Porto Lisboa Clusters of low values around Porto and Lisboa Clusters of high values in central-east One more cluster of low values Coast-inland contrast West-to-east increase
Pattern Investigation: Exercise By manipulating the colour scale, explore the distribution patterns of “% employed in industry 1991” (Portugal) must be “checked”
Unclassified Choropleth Maps: Object Comparison The diverging colour scale allows us to compare an object with all others: click on it (this option must be “checked”) After clicking on Beja: blue = lower values than in Beja, brown = higher values than in Beja
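The object comparison can be sketched as a diverging scale whose midpoint is set to the clicked object's value. The blue and brown hues follow the slide; the linear intensity, the function name, and the numbers (including the value given to Beja) are assumptions for illustration only.

```python
def diverging_colour(value, reference, vmin, vmax):
    """Return (hue, intensity): blue below the reference, brown above it."""
    if value < reference:
        span = reference - vmin
        intensity = (reference - value) / span if span > 0 else 0.0
        return ("blue", intensity)
    if value > reference:
        span = vmax - reference
        intensity = (value - reference) / span if span > 0 else 0.0
        return ("brown", intensity)
    return ("neutral", 0.0)

# Made-up attribute values; "X", "Y", "Z" are hypothetical districts.
values = {"Beja": 20.0, "X": 10.0, "Y": 35.0, "Z": 15.0}
reference = values["Beja"]  # the clicked object sets the midpoint
vmin, vmax = min(values.values()), max(values.values())

colours = {name: diverging_colour(v, reference, vmin, vmax)
           for name, v in values.items()}
```

Moving the slider on the earlier slides corresponds to changing `reference` by hand instead of taking it from a clicked district.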
Unclassified Choropleth Maps: Pattern Comparison (1) How similarly are these attributes distributed in space? Let us synchronously manipulate the colour scales of the two maps…
Unclassified Choropleth Maps: Pattern Comparison (2) Colour scale manipulation facilitates revealing differences in spatial distributions
Pattern Comparison: Exercise Compare the distribution of “% employed in industry 1981” to 1991
Dynamic Classification (by values of a single numeric attribute) attribute’s whole value range interval breaks: can be interactively moved; the map is dynamically updated
Dynamic Classification: Additional Analytical Facilities Cumulative Frequency Curve How it is built: X-axis: attribute’s value range, with the interval breaks marked Y-axis: number of objects or % of the whole set y is the number of objects (districts) with values less than or equal to x Also shown: number of objects (districts) in the corresponding classes
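The construction of the curve can be sketched as follows, using the definition from the slide (y is the number of objects with values less than or equal to x). The helper name and the sample data are hypothetical.

```python
from bisect import bisect_right

def cumulative_count(sorted_values, x):
    """Number of objects whose attribute value is <= x."""
    return bisect_right(sorted_values, x)

# Made-up attribute values for six districts, and two interval breaks.
values = sorted([3.1, 4.0, 4.0, 5.5, 7.2, 9.8])
breaks = [4.0, 7.0]

# Height of the curve at each break.
cum = [cumulative_count(values, b) for b in breaks]

# Class sizes are the differences between consecutive curve heights.
class_sizes = ([cum[0]]
               + [b - a for a, b in zip(cum, cum[1:])]
               + [len(values) - cum[-1]])
```

Reading the curve at the breaks therefore gives the class sizes directly, which is what makes it useful while the breaks are being moved.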
Cumulative Curve: an Extension Cumulative Population Curve What we can learn about the distribution of the population over the classes How it is built: X-axis: attribute’s value range Y-axis: number of objects or % of the whole set P-axis: population number or % of the whole country’s population p is the aggregate population of districts with values less than or equal to x …and any other quantitative (summable) attribute can be analogously considered
Using Cumulative Curves (1) Let us move the breaks so that the classes have approximately equal population: move this break to the right until this class has 33% of the population, then move the other break in the same way And here is the result
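What the manual break-moving achieves can be sketched as a search along the cumulative population curve. This is an illustration under assumptions: districts are given as hypothetical (attribute value, population) pairs with made-up numbers, and the break is taken at the smallest value where the cumulative share reaches the target.

```python
def cumulative_share(districts, x):
    """Share of the total population in districts with attribute value <= x."""
    total = sum(pop for _, pop in districts)
    return sum(pop for value, pop in districts if value <= x) / total

def break_for_share(districts, target):
    """Smallest attribute value whose cumulative population share reaches target."""
    total = sum(pop for _, pop in districts)
    running = 0
    for value, pop in sorted(districts):
        running += pop
        if running / total >= target:
            return value
    return max(value for value, _ in districts)

# Made-up (attribute value, population) pairs for six districts.
districts = [(2.0, 150), (3.0, 190), (4.5, 170),
             (6.0, 160), (7.5, 170), (9.0, 160)]

first_break = break_for_share(districts, 1 / 3)
second_break = break_for_share(districts, 2 / 3)
```

With real data the shares will rarely hit one third exactly, which is why the slide speaks of approximately equal population.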
Using Cumulative Curves (2) Some statistics about the result: In most of Portugal (coloured in blue) the proportion of people having high school education is below 4.67%. However, only one third of the country’s population lives in this large territory. In these areas over 7.82% of people have high school education; 33.1% of the country’s total population lives here.
Classification and Cumulative Curves: Exercise • Classify the districts of Portugal according to values of the attribute “% employed in agriculture 1991”. • Investigate how this is related to the distribution of the whole population (attribute “Total pop. 1991”) and people having high school education (attribute “N pop. with high school education 1991”). • Find the areas with the lowest % employed in agriculture where about one third of the whole population lives. • What part of the country’s population with high school education lives in these areas?
Cross-classification (1) Classification by values of 2 numeric attributes The default class breaks are the median values of the attributes Red: low employment in industry and high employment in agriculture Green: high employment in industry and low employment in agriculture Yellow: low employment in both industry and agriculture Brown: high employment in both industry and agriculture
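The default cross-classification can be sketched as below. The colour assignment follows the slide; treating "high" as strictly above the break, the function name, and the sample percentages are assumptions.

```python
from statistics import median

# Key: (industry is high, agriculture is high) -> colour from the slide.
COLOURS = {
    (False, True): "red",
    (True, False): "green",
    (False, False): "yellow",
    (True, True): "brown",
}

def cross_classify(industry, agriculture, ind_break=None, agr_break=None):
    """Assign each district a colour; breaks default to the attribute medians."""
    if ind_break is None:
        ind_break = median(industry.values())
    if agr_break is None:
        agr_break = median(agriculture.values())
    return {
        name: COLOURS[(industry[name] > ind_break, agriculture[name] > agr_break)]
        for name in industry
    }

# Made-up percentages for four hypothetical districts.
industry = {"A": 10.0, "B": 40.0, "C": 15.0, "D": 45.0}
agriculture = {"A": 50.0, "B": 5.0, "C": 8.0, "D": 30.0}

result = cross_classify(industry, agriculture)
```

Passing explicit `ind_break` and `agr_break` values corresponds to changing the classes interactively.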
Cross-classification (2) The classes can be changed: by entering break values in the text fields, or by clicking on the colour bars and moving the sliders
Cross-classification: Exercise • Apply cross-classification to attributes • “% employed in agriculture 1991” and “% pop. change from 1981 to 1991” • Divide into districts with population decrease (change ≤0) and increase (change >0) • Divide into districts with ≤20% and over 20% people employed in agriculture • How many districts are in each class? • What are the two largest classes? • How are these classes distributed over the territory? • What district has the most unusual combination of values of the two attributes? (Find it on the scatterplot and on the map)
Piechart Map Applicable to several attributes that together give some meaningful whole “Pie” size is proportional to the total (sum of the attribute values) The division into slices shows the proportion of each attribute in the total However, the map often looks like this: here the population is very small in comparison to the large cities, so the pies are too small to be seen
Piechart Map: Focusing Move this delimiter to the left The largest pies are gradually removed (replaced by hollow circles) The remaining pies become larger Now the maximum pie size corresponds to this value
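Focusing can be sketched as rescaling the pies against a movable upper limit. This is a hypothetical illustration: area-proportional sizing is taken from the earlier slide, while the function name, the radius used for hollow circles, and the totals are assumptions.

```python
import math

def focused_radii(totals, upper_limit, max_radius=20.0):
    """Map each district total to (radius, hollow) for a given upper limit."""
    result = {}
    for name, total in totals.items():
        if total > upper_limit:
            # Above the limit: drawn as a hollow circle (radius is an assumption).
            result[name] = (max_radius, True)
        else:
            # Area proportional to the total, so radius scales with the square root.
            result[name] = (max_radius * math.sqrt(total / upper_limit), False)
    return result

# Made-up totals for three hypothetical districts.
totals = {"big": 400.0, "mid": 100.0, "small": 25.0}

unfocused = focused_radii(totals, upper_limit=400.0)
focused = focused_radii(totals, upper_limit=100.0)
```

Lowering the limit from 400 to 100 turns the largest pie hollow and doubles the radius of the two remaining pies.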
Focusing and Data Investigation In districts with large populations, people work in industry (magenta) and services (cyan). Northwest: more industry Centre-west: more services At this stage, the agricultural part (green) becomes visible In districts with small populations, a considerable proportion of people work in agriculture, but services still prevail
Piechart Map: More Facilities (1) “Unchecking” makes all pies equal in size Dragging attribute names up and down changes the order of the slices Selecting this option results in a minimum circle being used for the minimum total sum (compare to the “strictly proportional” variant)
Piechart Map: More Facilities (2) In the “visual comparison” mode, each pie is surrounded by an outer circle showing the proportions in a selected reference object “Checking” this box allows us to compare any district to all others by just clicking on it on the map
Barchart Map Applicable to several comparable attributes; one bar per attribute Bar height is proportional to the value of the corresponding attribute Convenient for local comparisons, e.g. values in different years
Barchart Map: Focusing As with pies, we can apply focusing Greater values are represented by hollow bars Very small values! The maximum height now corresponds to a smaller value
Barchart Map: Outlier Removal The same technique is suitable for removing outliers: 86% female? This is an outlier (may also be an error)
Barchart Map: Comparison to a Number (1) We have chosen 50 as the reference value to compare with. Now each bar represents the difference between the respective attribute value (% female) and 50. For better visibility of the differences, we have also removed the outliers and focused on the value interval from 46.43 to 56.94. The reference value may also be changed by moving this slider
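The comparison to a number can be sketched as signed bar heights measured from the reference value. The reference (50) and the focus interval (46.43 to 56.94) come from the slide; the function name, the scaling rule, the treatment of out-of-interval values as hollow bars, and the two sample percentages are assumptions.

```python
def difference_bar(value, reference, focus_min, focus_max, max_height=30.0):
    """Return (signed_height, hollow) for one bar relative to the reference."""
    if value < focus_min or value > focus_max:
        # Outside the focus interval: drawn hollow, no meaningful height.
        return (None, True)
    # Largest deviation that can occur inside the focus interval.
    scale = max(reference - focus_min, focus_max - reference)
    if scale <= 0:
        return (0.0, False)
    return (max_height * (value - reference) / scale, False)

REFERENCE, FOCUS_MIN, FOCUS_MAX = 50.0, 46.43, 56.94

# Made-up % female values for one district in two years.
h_1981 = difference_bar(52.0, REFERENCE, FOCUS_MIN, FOCUS_MAX)
h_1991 = difference_bar(48.5, REFERENCE, FOCUS_MIN, FOCUS_MAX)
outlier = difference_bar(86.0, REFERENCE, FOCUS_MIN, FOCUS_MAX)
```

A positive height means the value is above the reference and a negative height means it is below, so a change of sign between years is immediately visible.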
Barchart Map: Comparison to a Number (2) Comparison of % female in 1981 and 1991 to 50%: Here % female was much higher than 50 in 1981 and even increased in 1991 Here % female was over 50 in 1981 but decreased in 1991 to below 50. Here % female was slightly below 50 in 1981 but increased in 1991 to over 50. Here % female was lower than 50 in 1981. It increased in 1991 but still was below 50.
Chart Maps: Exercise 1 • What type(s) of chart map is (are) suitable for each of the following attribute combinations: • % 0-14 years, % 15-24 years, % 25-64 years, and % 65 or more years (population by age) • % female among employed and % female among unemployed • N pop. no primary school education, N pop. with primary school education, N pop. with preparatory school education, N pop. with high school education • % employed in industry 1981 and % employed in industry 1991
Chart Maps: Exercise 2 • Build a chart map representing the attributes • N pop. no primary school education • N pop. with primary school education • N pop. with preparatory school education • N pop. with high school education • Using map manipulation controls, compare the population structures according to education in Evora and Beja to that in Lisboa (think what type of chart map is suitable for this task) Evora Lisboa Beja
Part 2: Dynamically Linked Views • Introduction • Interactive maps • Dynamically linked views • Linking by object highlighting • Dynamic query • Propagation of object classes • Exploration of time-variant data • Spatial decision making • Conclusions
Linked Displays We Already Used Map and dot plot: each district shown on the map is also represented by a dot Map and scatter plot: the same technique A district pointed at on the map with the mouse is simultaneously highlighted on the map and on the plot
Display Linking by Highlighting The same works for arbitrary displays. All of them are linked: An object pointed at on the map with the mouse is simultaneously highlighted here, and here, and here, but not here: this is an aggregated view that does not show individual objects
Display Linking by Selection Selection (durable highlighting) does not disappear after the mouse is moved away. One or more objects may be selected by clicking on them or drawing a frame around them. These black dots correspond to the selected objects These black lines correspond to the selected objects We have clicked on each of these 3 objects
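One common way to obtain this kind of linking is a shared selection model that notifies every display when the selection changes. The sketch below is a generic observer pattern, not a description of how the tutorial's software is built; all class, method, and object names are hypothetical.

```python
class SelectionModel:
    """Holds the set of selected object ids and notifies registered views."""

    def __init__(self):
        self._selected = set()
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def toggle(self, object_id):
        """Clicking an object selects it; clicking again deselects it."""
        self._selected ^= {object_id}
        self._notify()

    def select_many(self, object_ids):
        """Drawing a frame selects every object inside it."""
        self._selected |= set(object_ids)
        self._notify()

    def _notify(self):
        snapshot = frozenset(self._selected)
        for callback in self._listeners:
            callback(snapshot)


class View:
    """A display that marks whatever the shared model says is selected."""

    def __init__(self, name, model):
        self.name = name
        self.highlighted = frozenset()
        model.add_listener(self.on_selection)

    def on_selection(self, selected):
        self.highlighted = selected


model = SelectionModel()
map_view = View("map", model)
scatter_view = View("scatter plot", model)

# Click on three objects, as on the slide (ids are made up).
for district in ("d1", "d2", "d3"):
    model.toggle(district)
```

Because the selection lives in the model rather than in any one display, it persists after the mouse moves away and every registered view shows the same set.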
Using Display Linking (1) Let us examine characteristics of districts in this area: enclose the area in a frame (this box must be “checked”) Two distinct clusters in the value space of these two attributes The characteristics in terms of the upper 4 attributes are rather coherent The values of this attribute vary greatly The values of this attribute are split in two groups with a gap between them The districts fit in the left half of the histogram, mostly in bars 1 and 4
Using Display Linking (2) Let us look at the districts with high % employed in industry: select high values by drawing a frame The districts form 3 spatial clusters Low proportions of agricultural workers and people without primary school education The districts have average or high proportions of children and young people Population change: mostly between –0.1% and 12.4%
Using Display Linking (3) Let us look at the districts with the highest population growth: click on the rightmost bars in the histogram The districts form some spatial clusters The proportions of agricultural workers and people without primary school education are mostly low, but there is an outlier The districts have average or high proportions of children and young people and low proportions of old people