Visualisation 2012 - 2013 Lecture 4

Visualisation 2012 - 2013 Lecture 4. Visualising Comparisons. Brian Mac Namee, Dublin Institute of Technology, Applied Intelligence Research Centre. Origins. This course is based heavily on a course developed by Colman McMahon (www.colmanmcmahon.com)

Presentation Transcript


  1. Visualisation 2012 - 2013 Lecture 4: Visualising Comparisons. Brian Mac Namee, Dublin Institute of Technology, Applied Intelligence Research Centre

  2. Origins • This course is based heavily on a course developed by Colman McMahon (www.colmanmcmahon.com) • Material from multiple other online and published sources is also used and when this is the case full citations will be given

  3. Visualization of the Week www.pinterest.com/brianmacnamee/great-visualisation-examples/

  4. (Un)Visualization of the Week www.pinterest.com/brianmacnamee/terrible-visualisation-examples/

  5. Agenda • This week we are going to look at means through which we can visualise comparisons between variable values • Single variable exploration • Simple comparisons • Multi distribution comparisons

  6. Single Variable Exploration

  7. Histogram “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  8. Histogram • A histogram gives us an in-depth view of the distribution of a single numeric variable • To construct a histogram: • Divide the data range into bins • Count how many data values fall into each bin • Normalise the frequency counts (for example, divide each count by the total number of values) • Plot a bar graph to show the normalised count for each bin “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
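The construction steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in "Visualize This": the function name `histogram`, the choice of equal-width bins, normalising by the total count, and the sample data are all assumptions made for the example.

```python
def histogram(values, num_bins):
    """Return (bin_edges, counts, normalised) using equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # fall back to 1.0 if all values are equal
    edges = [lo + i * width for i in range(num_bins + 1)]

    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a bin of its own.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1

    total = len(values)
    normalised = [c / total for c in counts]  # shares that sum to 1
    return edges, counts, normalised


# Hypothetical data, invented for illustration only.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges, counts, normalised = histogram(data, num_bins=4)

# Crude text rendering of the bar graph: one row per bin.
for left, right, share in zip(edges, edges[1:], normalised):
    print(f"{left:4.1f} to {right:4.1f}  {'#' * round(share * 20)}")
```

Changing `num_bins` changes the apparent shape of the distribution, which is worth trying on real data before settling on a bin count.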

  9. Histogram “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  10. Histogram Shapes

  11. Density Plot “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  12. Density Plot • Note that constructing a density plot requires estimating the probability density function underlying the data shown in the histogram, which takes a bit of work! • Common approaches include: • Parzen windows • Clustering • Mixture models
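As a rough illustration of the Parzen window approach named on this slide, the sketch below places a Gaussian bump on every sample and averages them. The Gaussian kernel, the fixed `bandwidth` of 0.5, the function name `parzen_density`, and the sample values are assumptions chosen for the example; in practice the bandwidth has to be tuned to the data.

```python
import math


def parzen_density(samples, x, bandwidth):
    """Estimate the density at x by averaging Gaussian kernels centred on each sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


# Hypothetical samples forming two clusters, invented for illustration only.
samples = [1.0, 1.5, 2.0, 2.2, 5.0, 5.5]

# Evaluate the estimate on a coarse grid from 0.0 to 7.0 and draw it as text.
grid = [i * 0.5 for i in range(15)]
density = [parzen_density(samples, x, bandwidth=0.5) for x in grid]
for x, d in zip(grid, density):
    print(f"{x:4.1f}  {'*' * round(d * 60)}")
```

A smaller bandwidth gives a spikier curve that follows individual samples; a larger one gives a smoother curve that can blur the two clusters together.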

  13. Density Plot From Wikipedia! http://en.wikipedia.org/wiki/Parzen_window

  14. Density Plot From Wikipedia! http://en.wikipedia.org/wiki/Parzen_window

  15. Density Plot “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  16. Histogram & Density Plot Combined “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  17. Histogram • The histogram is quite possibly your most important visual data exploration tool!!!

  18. Box Plot [Figure: box plot of a single variable; vertical axis from 0 to 50]

  19. Box Plot [Figure: annotated box plot of a single variable; vertical axis from 0 to 50] • VARIABLE VALUES: Values displayed for a single variable • OUTLIERS: Values that fall outside quartile ± 1.5*IQR • MAX: Max value below 3rd Q + 1.5*IQR • 3rd QUARTILE: The value for the 3rd quartile of the variable values • MEDIAN: The median value for the variable • 1st QUARTILE: The value for the 1st quartile of the variable values • MIN: Min value above 1st Q - 1.5*IQR

  20. Box Plot • The components of a box plot are: • A thick dark line at the median • A horizontal line at the 1st quartile • A horizontal line at the 3rd quartile • A whisker down to the low value • Multiply the IQR (3rd quartile minus 1st quartile) by 1.5 to calculate the step • The low value is the lowest value above the 1st quartile minus the step • A whisker up to the high value • The high value is the highest value below the 3rd quartile plus the step • Any values outside low and high are marked as outliers
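The box plot quantities can be computed directly, following the 1.5*IQR rule shown on the annotated slide. This is a sketch under stated assumptions: quartile conventions differ between tools, and the "inclusive" method of Python's `statistics.quantiles` is just one choice. The function name `box_plot_stats` and the sample data are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Compute the quantities needed to draw a box plot for one variable."""
    # Quartile convention is an assumption; other tools may give slightly different cuts.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    step = 1.5 * (q3 - q1)  # 1.5 times the interquartile range

    # Whiskers stop at the most extreme data values that are still inside the fences.
    low = min(v for v in values if v >= q1 - step)
    high = max(v for v in values if v <= q3 + step)
    outliers = sorted(v for v in values if v < low or v > high)

    return {"q1": q1, "median": median, "q3": q3,
            "low": low, "high": high, "outliers": outliers}


# Hypothetical data, invented for illustration only.
data = [12, 15, 18, 20, 22, 24, 25, 27, 30, 48]
stats = box_plot_stats(data)
for name, value in stats.items():
    print(f"{name:8s} {value}")
```

Note that the whiskers end at actual data values, not at the fences themselves, so a whisker can be much shorter than 1.5 times the IQR.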

  21. Box Plot • Some important points about a box plot: • 50% of the data occurs between the lower and upper edges of the box • The lower 50% of the data occurs below the median line in the box • The upper 50% of the data occurs above the median line in the box • The lower 25% of the data occurs between the bottom edge of the box and the bottom edge of the lower whisker (or beyond it, as outliers) • The upper 25% of the data occurs between the top edge of the box and the top edge of the upper whisker (or beyond it, as outliers)

  22. Bar Chart

  23. Bar Chart

  24. Bar Chart

  25. Box Plots & Density Functions From Wikipedia! http://en.wikipedia.org/wiki/Probability_density_function

  26. Simple Comparisons

  27. Simple Bar Graph [Figure: bar graph with categories A to F on the category axis; a value is displayed for each category] “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  28. Simple Bar Graph Average Score Rating Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007

  29. Simple Bar Graph Edward Tufte, “The Visual Display of Quantitative Information”, 2009

  30. Simple Bar Graph

  31. Simple Bar Graph

  32. Pie Chart “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  33. Pie Charts http://www.uh.edu/engines/epi1712.htm Google Analytics http://analytics.google.com

  34. Pie Charts http://www.uh.edu/engines/epi1712.htm Google Analytics http://analytics.google.com

  35. Pie Charts Google Analytics http://analytics.google.com

  36. Pie Charts http://www.uh.edu/engines/epi1712.htm Google Analytics http://analytics.google.com

  37. Pie Charts http://www.uh.edu/engines/epi1712.htm Google Analytics http://analytics.google.com

  38. Pie Chart

  39. Pie Chart William Playfair's “Statistical Breviary”, 1801 via The New York Times http://www.nytimes.com/2012/04/22/magazine/who-made-that-pie-chart.html?_r=0

  40. Pie Chart Florence Nightingale's Crimean War Death Charts via: http://www.uh.edu/engines/epi1712.htm

  41. Pie Charts • Pie charts are the subject of a lot of negative comment • http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00018S • http://www.juiceanalytics.com/writing/the-problem-with-pie-charts/ • The main reason is that their descriptive power depends on our ability to judge differences in angle, which we do less accurately than differences in length • Pie charts are useful when: • We have a small number of categories (< 8) • The values sum to a meaningful whole • The differences are coarse
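To see why fine differences are hard to read from a pie chart, it helps to work out the angles involved: each slice gets value / total * 360 degrees. The sketch below does that arithmetic for a hypothetical set of four shares; the function name `slice_angles` and the numbers are invented for illustration.

```python
def slice_angles(values):
    """Map each category to the angle (in degrees) of its pie slice."""
    total = sum(values.values())
    return {name: 360 * value / total for name, value in values.items()}


# Hypothetical shares that sum to a meaningful whole (100%).
shares = {"A": 30, "B": 28, "C": 22, "D": 20}
angles = slice_angles(shares)
for name, angle in angles.items():
    print(f"{name}: {shares[name]:3d}%  {angle:6.1f} degrees")
```

Here categories A and B differ by two percentage points, which is only about 7 degrees of arc. The same two values drawn as bars on a common baseline differ in length, which is easier to compare.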

  42. Doughnut Chart “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  43. Doughnut Chart

  44. Doughnut Chart

  45. Tree Map “Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

  46. Billion-Dollar-O-Gram www.informationisbeautiful.net/2009/the-billion-dollar-gram/

  47. Treemaps • Treemaps were originally designed to handle hierarchical structures, such as the contents of disk drives, but can also be used for non-hierarchical data • Treemaps rely on a tiling algorithm to figure out how to position the rectangles • We will come back to this! • TreeMap page by Ben Shneiderman (TreeMap pioneer): http://www.cs.umd.edu/hcil/treemap-history/index.shtml • Early paper on TreeMaps: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isNumber=4467&arNumber=175815&isnumber=4467&arnumber=175815
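One of the simplest tiling algorithms is "slice and dice": split the available rectangle among the children in proportion to their sizes, alternating between vertical and horizontal cuts at each level of the hierarchy. The sketch below is an illustration of that idea only, not necessarily the algorithm covered later in the course. The tuple-based tree format, the function names, and the disk example are assumptions, and all sizes are assumed to be positive.

```python
def size_of(node):
    """Total size of a node: a leaf is (name, number), a branch is (name, [children])."""
    _, payload = node
    if isinstance(payload, list):
        return sum(size_of(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, width, height, vertical=True, out=None):
    """Return a list of (name, x, y, width, height) rectangles, one per leaf."""
    if out is None:
        out = []
    name, payload = node
    if not isinstance(payload, list):
        out.append((name, x, y, width, height))
        return out

    total = size_of(node)
    offset = 0.0
    for child in payload:
        share = size_of(child) / total
        if vertical:
            # Cut the rectangle into side-by-side columns.
            slice_and_dice(child, x + offset, y, width * share, height, False, out)
            offset += width * share
        else:
            # Cut the rectangle into stacked rows.
            slice_and_dice(child, x, y + offset, width, height * share, True, out)
            offset += height * share
    return out


# Hypothetical disk contents (sizes in arbitrary units), invented for illustration.
disk = ("root", [("photos", 60),
                 ("docs", [("reports", 10), ("notes", 10)]),
                 ("music", 20)])
rects = slice_and_dice(disk, 0, 0, 100, 50)
for name, x, y, w, h in rects:
    print(f"{name:8s} x={x:5.1f} y={y:5.1f} w={w:5.1f} h={h:5.1f}")
```

Each leaf's area is proportional to its size. A known drawback of this simple scheme is that it tends to produce long thin rectangles, which is one reason other tiling algorithms exist.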

  48. Multi Distribution Comparisons

  49. Achtung! Average Score Rating Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007

  50. Achtung! Watch out for bar charts that show an average or other aggregate: these can hide a multitude of detail. Average Score Rating Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007
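A small numeric example shows how an average can hide detail. The two sets of ratings below are hypothetical, invented purely for illustration and not taken from the course evaluation cited on the slide: they have the same mean, so a bar chart of averages would draw two identical bars, yet the underlying distributions are very different.

```python
import statistics
from collections import Counter

# Hypothetical ratings on a 1 to 5 scale, invented for illustration only.
module_a = [3, 3, 3, 3, 3, 3]  # everyone is lukewarm
module_b = [1, 1, 1, 5, 5, 5]  # half love it, half hate it


def summarise(ratings):
    """Return the mean, the population standard deviation, and the count per rating."""
    return statistics.mean(ratings), statistics.pstdev(ratings), Counter(ratings)


for label, ratings in (("A", module_a), ("B", module_b)):
    mean, spread, counts = summarise(ratings)
    # One small text histogram per module: rating followed by one '#' per response.
    bars = "  ".join(f"{r}:{'#' * counts[r]}" for r in range(1, 6))
    print(f"module {label}: mean={mean}  spread={spread:.1f}  {bars}")
```

Plotting the full distribution for each group, for example with side-by-side histograms or box plots, reveals the difference that the averages conceal.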
