Maps as Data Visualization. Mills College 2013 Dan Ryan. Maps as Data Visualization Or A Picture’s Worth Megabytes. John Snow’s Cholera Map (1854). Minard’s Visualization of Napoleon's 1812 Russian Campaign (1869). Minard’s French Cattle Map (1858). What People are Reading.
Maps as Data Visualization Mills College 2013 Dan Ryan
Maps as Data Visualization Or A Picture’s Worth Megabytes
Minard’s Visualization of Napoleon's 1812 Russian Campaign (1869)
What is “Big Data”?
• Datasets too large for conventional tools
  • Hardware
  • Software
  • Mathematics
• Moving Target
  • Terabytes to petabytes
• Examples
  • Internet documents, call records, meteorology, genomics, medical records, photography archives, video archives, e-commerce
Data Exhaust
• Aka “tertiary data”
John McEnroe vs. Bjorn Borg, Wimbledon 1980
Roger Federer vs. Rafael Nadal, 2006
http://stoweboyd.com/post/1712873757/steven-johnson-what-a-hundred-million-calls-to
Distributions
• How are the values in a dataset spread out among all the possible values?
[Figure: values plotted along an axis running from 77 to 95. Values shown include: 89 95 78 89 91 95 78 89 91 95 95 91 78 90 89 77]
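One way to make the question on this slide concrete is to tally how often each value occurs. The sketch below is only illustrative: it assumes the sixteen values listed are one sample taken from the slide's figure, and the function names `distribution` and `text_histogram` are made up for this example.

```python
from collections import Counter

# Assumption: one sample of the values that appear in the slide's figure.
scores = [89, 95, 78, 89, 91, 95, 78, 89, 91, 95, 95, 91, 78, 90, 89, 77]


def distribution(values):
    """Return a {value: count} mapping ordered by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


def text_histogram(values):
    """Draw one row per possible value between the min and max observed."""
    dist = distribution(values)
    rows = []
    for v in range(min(dist), max(dist) + 1):
        # Values that never occur still get a row, so gaps stay visible.
        rows.append(f"{v:3d} " + "*" * dist.get(v, 0))
    return "\n".join(rows)


if __name__ == "__main__":
    print(text_histogram(scores))
```

Printing every possible value, including those with a count of zero, is what separates a distribution from a simple tally: the empty rows between 78 and 89 are part of the picture.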
Suppose that on Exam 2
• We get exactly the same distribution
• But some did better, some worse, some same
[Figure: the same values plotted along an axis running from 77 to 95. Values shown include: 89 95 78 89 91 95 78 89 91 95 95 91 78 90 89 77]
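The point of this slide can be checked in a few lines: two exams can share a distribution even though individual students moved. This is a minimal sketch with invented data; the student labels and scores are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical students and scores, chosen so that Exam 2 is a
# reshuffling of the Exam 1 scores.
exam1 = {"A": 89, "B": 95, "C": 78, "D": 91, "E": 77}
exam2 = {"A": 95, "B": 89, "C": 78, "D": 77, "E": 91}


def same_distribution(a, b):
    """True when both exams contain the same scores with the same counts."""
    return Counter(a.values()) == Counter(b.values())


def changes(a, b):
    """Per-student change in score from exam a to exam b."""
    return {student: b[student] - a[student] for student in a}


if __name__ == "__main__":
    print(same_distribution(exam1, exam2))
    print(changes(exam1, exam2))
```

A distribution summarizes the values only; it keeps no record of who holds which value, which is why the per-student changes need the paired data.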
Data and Color (after C. Brewer)
• Characterizing Distributions
  • Binary: yes/no, present/absent
  • Diverging: above/below a middle value
  • Qualitative: categorical or nominal data
  • Sequential: data ranges from low to high
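The four categories can be read as a rough decision procedure. The sketch below is a crude heuristic written for illustration, not Brewer's method: the function `suggest_scheme` and its `midpoint` parameter are hypothetical, and real choices depend on what the map is meant to communicate.

```python
def suggest_scheme(values, midpoint=None):
    """Suggest one of the four scheme types for a list of values.

    Heuristic assumptions (not from the slides):
    - exactly two distinct values are treated as binary
    - any non-numeric value makes the data qualitative
    - numeric data with a meaningful midpoint is diverging
    - all other numeric data is sequential
    """
    distinct = set(values)
    if len(distinct) == 2:
        return "binary"
    if not all(isinstance(v, (int, float)) for v in distinct):
        return "qualitative"
    if midpoint is not None:
        return "diverging"
    return "sequential"


if __name__ == "__main__":
    print(suggest_scheme(["yes", "no", "yes"]))
    print(suggest_scheme(["District 1", "District 2", "District 3"]))
    print(suggest_scheme([12.5, 40.0, 73.1]))
    print(suggest_scheme([12.5, 40.0, 73.1], midpoint=40.0))
```

Note that the same numeric data comes out as sequential or diverging depending only on whether a midpoint is supplied, which matches the idea that the scheme follows from the comparison being made.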
Review Quiz
• % non-white population by census tract
  ☐ Binary ☐ Diverging ☐ Qualitative ☐ Sequential
• District name
  ☐ Binary ☐ Diverging ☐ Qualitative ☐ Sequential
• Did state vote for presidential winner?
  ☐ Binary ☐ Diverging ☐ Qualitative ☐ Sequential
• Median Household Income by Tract
  ☐ Binary ☐ Diverging ☐ Qualitative ☐ Sequential
Review Quiz
• % non-white population by census tract
  Sequential is a reasonable first choice, since the data run from a minimum to a maximum value. But diverging might be better if we are implying a comparison within the city. The city has an overall average percentage non-white population, and our map would do well to show which tracts are above that average, which are below, and by how much.
• District name
  This is a purely nominal variable, so it calls for a qualitative scheme: a different, contrasting color for each district.
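The within-city comparison described for percent non-white can be sketched as a diverging classification around the city-wide figure. Everything here is assumed for illustration: the tract names, populations, and the `near` and `far` thresholds are hypothetical. The city percentage is computed from totals rather than by averaging tract percentages, so that large tracts count for more.

```python
# Hypothetical tracts: name -> (total population, non-white population).
tracts = {
    "T1": (1000, 200),
    "T2": (2000, 1200),
    "T3": (1000, 600),
    "T4": (1000, 500),
}


def city_percent(tracts):
    """City-wide percent non-white, weighted by tract population."""
    total = sum(pop for pop, _ in tracts.values())
    nonwhite = sum(nw for _, nw in tracts.values())
    return 100 * nonwhite / total


def diverging_class(pct, center, near=5, far=20):
    """Place a tract above or below the center, in percentage points.

    near and far are arbitrary illustrative thresholds.
    """
    diff = pct - center
    if abs(diff) <= near:
        return "near average"
    side = "above" if diff > 0 else "below"
    return f"well {side}" if abs(diff) >= far else side


def classify(tracts):
    """Map each tract name to its diverging class."""
    center = city_percent(tracts)
    return {
        name: diverging_class(100 * nw / pop, center)
        for name, (pop, nw) in tracts.items()
    }


if __name__ == "__main__":
    print(city_percent(tracts))
    print(classify(tracts))
```

Each class would then get a color from one of the two hues of a diverging scheme, with the "near average" class taking the neutral middle color.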
Review Quiz
• Did state vote for presidential winner?
  This is a strictly binary distribution: yes and no. We probably would not even use red and blue here, because we are not trying to communicate the party voted for, just the result.
• Median Household Income by Tract
  One could argue for either sequential or diverging in this case. It would be diverging if we were trying to show which tracts are above and below the city median or the national median household income. On the other hand, if we were trying to show the spread from low to high in this city, we would go with sequential.
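The two readings of median household income lead to two different classifications of the same numbers. The sketch below uses invented incomes; equal-interval classes stand in for the sequential case and the sample median stands in for the city median. Both are assumptions made for illustration, and other class-break methods (quantiles, natural breaks) would be equally valid.

```python
from statistics import median

# Hypothetical median household incomes for six tracts.
incomes = [30000, 45000, 52000, 61000, 98000, 120000]


def sequential_classes(values, k):
    """Assign each value to one of k equal-width classes, 0 = lowest."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    # The maximum value would land in class k, so clamp it to k - 1.
    return [min(int((v - lo) * k // (hi - lo)), k - 1) for v in values]


def diverging_sides(values, center=None):
    """Label each value as below, at, or above the center.

    If no center is given, the median of the values is used.
    """
    if center is None:
        center = median(values)
    sides = []
    for v in values:
        if v < center:
            sides.append("below")
        elif v > center:
            sides.append("above")
        else:
            sides.append("at")
    return sides


if __name__ == "__main__":
    print(sequential_classes(incomes, 3))
    print(diverging_sides(incomes))
```

Passing a national figure as `center` instead of the local median would give the other diverging comparison mentioned on the slide.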