
Envisioning Information

Envisioning Information, Lecture 2: Simple Graphs and Charts. Ken Brodlie, School of Computing, University of Leeds.


Presentation Transcript


  1. Envisioning Information • Lecture 2: Simple Graphs and Charts • Ken Brodlie, School of Computing, University of Leeds • ENV 2006

  2. Lecture Outline • Preliminaries • Definitions • Datatypes • Simple Data Presentation • Graphs and charts

  3. Fundamentals • Basic datatypes correspond to different levels of measurement • Data can be categorical (labels) or numerical (numbers) • Categorical, nominal: no sense of order (apples, oranges, …) • Categorical, ordinal: ordered in sequence (January, February, …) • Numerical, continuous: real numbers (height of students in a class) • Numerical, discrete: typically whole numbers (marks in an exam)

  4. Question • Give an example for each class in which numbers are involved… • Categorical, nominal • Categorical, ordinal • Numerical, continuous • Numerical, discrete

  5. Exploratory Data Analysis • The pioneering figure is John Tukey • A new approach to data analysis, heavily based on visualization, as an alternative to classical data analysis (see Wikipedia) • Two-stage process: • Exploratory: search for evidence using all tools available • Confirmatory: evaluate the strength of the evidence using classical data analysis

  6. Simple Data Presentation

  7. Simple Data Presentation • Simple data tables are often presented as line graphs, bar graphs, pie charts, dot graphs, histograms… • Which should we use and when?

  8. Line Graph • Fundamental technique of data presentation • Used to compare two variables • X-axis is often the control variable • Y-axis is the response variable • Good at showing: specific values; trends; trends in groups (using multiple line graphs) • Examples: mobile phone use; students participating in sporting activities • Any critical comments here? • Note: graph labelling is fundamental

  9. Simple Representations: Bar Graph • A bar graph presents categorical variables • Height of bar indicates value • Double bar graph allows comparison • Note the spacing between bars • Can be horizontal (when would you use this?); note more space for labels • Examples: number of police officers; Internet use at a school

  10. Dot Graph • Very simple but effective… • Horizontal to give more space for labelling

  11. Pie Chart • A pie chart summarises a set of categorical/nominal data • But use with care: too many segments are harder to compare than in a bar chart • Examples: "Should we have a long lecture?"; favourite movie genres

  12. Histograms • Histograms summarise discrete or continuous data that are measured on an interval scale • No gaps between bars if the variable is continuous • Example: distribution of salaries in a company
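The binning step behind a histogram can be sketched in a few lines of Python. This is only an illustration: the salary figures, the range 20 to 70 and the choice of five equal-width bins are all assumptions made for the example, not data from the lecture.

```python
def histogram_counts(values, low, high, bins):
    """Count how many values fall into each of `bins` equal-width intervals on [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if v < low or v > high:
            continue  # values outside the chosen range are ignored
        # a value equal to `high` is placed in the last bin
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical salaries in thousands (illustrative data only)
salaries = [21, 24, 25, 28, 31, 33, 35, 38, 42, 47, 55, 68]
counts = histogram_counts(salaries, low=20, high=70, bins=5)

# Text rendering: adjacent intervals share an edge, so there are no gaps
for i, c in enumerate(counts):
    start = 20 + i * 10
    print(f"{start:>2}-{start + 10:<2} | {'#' * c}")
```

Because the variable is continuous, each interval starts exactly where the previous one ends, which is why the bars of a histogram touch.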

  13. Scatter Plot • Used to present measurements of two variables • Effective if a relationship exists between the two variables • Example taken from the NIST Handbook: evidence of strong positive correlation • Example: car ownership by household income
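The strength of the linear relationship shown in a scatter plot is commonly measured with the Pearson correlation coefficient. The sketch below computes it from first principles; the income and car-ownership numbers are invented for illustration and are not taken from the slide.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: household income (thousands) and cars per household
income = [20, 30, 40, 50, 60]
cars = [0.8, 1.1, 1.3, 1.8, 2.0]

r = pearson_r(income, cars)
print(f"r = {r:.2f}")  # a value close to +1 indicates strong positive correlation
```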

  14. Scatter Plots in Excel • The scatter plot is a fundamental tool in Excel • Chart type XY (Scatter) and subtype Unconnected Points • http://www2.ncsu.edu:8010/ncsu/chemistry/resource/excel/excel.html

  15. Regression Line • Excel allows you to add a linear regression line (trend line) • Remember: correlation does not imply causality, i.e. a relationship exists but one variable is not necessarily causing the other; there may be a third factor
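A linear trend line is usually an ordinary least-squares fit. As a minimal sketch of that calculation (assuming least squares is the method wanted; the function name `fit_line` and the data are hypothetical):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points scattered around a straight line
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
# Residuals are the vertical distances from each point to the fitted line
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(f"y = {slope:.2f} x + {intercept:.2f}")
```

The fitted line describes the relationship only; it says nothing about which variable, if either, is the cause.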

  16. Tukey Sum-Difference Plot • Better understanding of residuals …
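The slide gives no formula, so the following is an assumption about what is meant: in the usual construction, each pair (x, y) is replotted as (x + y, y - x), which amounts to rotating the scatter plot by 45 degrees so that departures from the line y = x appear as vertical distances from zero. A minimal sketch with hypothetical paired measurements:

```python
def sum_difference(xs, ys):
    """Transform paired values (x, y) into (x + y, y - x) plot coordinates."""
    return [(x + y, y - x) for x, y in zip(xs, ys)]


# Hypothetical paired measurements, e.g. the same items measured twice
first = [10, 12, 15, 18, 20]
second = [11, 12, 17, 17, 23]

points = sum_difference(first, second)
for total, diff in points:
    print(f"sum = {total:>2}, difference = {diff:+d}")
```

Plotting the differences against the sums makes small discrepancies easier to judge than reading them off the diagonal of the original scatter plot.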

  17. Box Plots • In some situations we have, not a single data value at a point, but a number of data values, or even a probability distribution • When might this occur? • Tukey proposed the idea of a boxplot to visualize the distribution of values • For explanation and some history, see: http://mathworld.wolfram.com/Box-and-WhiskerPlot.html and http://en.wikipedia.org/wiki/Box_plot • Darwin’s plant study: http://www.upscale.utoronto.ca/GeneralInterest/Harrison/Visualisation/Visualisation.html • M: median • Q1, Q3: quartiles • Whiskers: extend up to 1.5 * interquartile range beyond the quartiles • Dots: outliers
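The quantities in the box plot key can be computed directly. The sketch below uses Python's `statistics.quantiles` with the "inclusive" method; quartile conventions differ between tools, so other software may give slightly different Q1 and Q3. The data values are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Median, quartiles, whisker ends and outliers using the 1.5 * IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        # whiskers stop at the most extreme data values inside the fences
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # anything beyond the fences is drawn as a dot
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical measurements with one unusually large value
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 25]
summary = box_plot_summary(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the value 25 lies more than 1.5 interquartile ranges above Q3, so it would be drawn as a dot rather than included in the upper whisker.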

  18. Acknowledgement • Thanks to Statistics Canada, an excellent web site for simple data presentation • http://www.statcan.ca/english/edu/power/toc/contents.htm

  19. Exercise for next week • Understand a bit more about the merits of pie charts and bar graphs • Create a dataset with roughly equal numbers in each class • Which is best if the task is to discriminate?
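As a possible starting point for this exercise, the sketch below builds a small dataset with roughly equal class counts and works out what each chart type would have to display: a slice angle for the pie chart and a bar length for the bar graph. The class names and counts are invented for illustration.

```python
# Hypothetical dataset: five classes with roughly equal counts
counts = {"A": 21, "B": 19, "C": 20, "D": 22, "E": 18}


def pie_angles(counts):
    """Angle in degrees of each pie slice, proportional to its share of the total."""
    total = sum(counts.values())
    return {label: 360 * value / total for label, value in counts.items()}


angles = pie_angles(counts)
for label, value in counts.items():
    # the bar length is the count itself; the pie encodes the same value as an angle
    print(f"{label}: bar {'#' * value} ({value}), pie slice {angles[label]:.1f} degrees")
```

The text bars share a common baseline, whereas the pie slices differ by only a few degrees each; comparing the two renderings is the point of the exercise.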

  20. Exercise for next week • Over the next week look for examples of basic graphs • In newspapers, magazines or other print media • On news web sites or other electronic media • Analyse two examples • One should be an example where you think the use of graphics is good • One should be bad • Be ready next week to present these results to the class…

  21. Envisioning Information: Practical Work • Gnuplot • R • Excel
