
Visualising Variables – Validly!

Visualising Variables – Validly! 8 September 2010. Damien Jolley, School of Public Health & Preventive Medicine, Monash University. AHMRC Posters. Weather information, New York Times, September ‘08. Petrol prices, Melbourne, Aug-Sep 2010. Motivating examples.


Presentation Transcript


  1. Visualising Variables – Validly! 8 September 2010. Damien Jolley, School of Public Health & Preventive Medicine, Monash University. AHMRC Posters

  2. Motivating examples • Weather information, New York Times, September ‘08 • Petrol prices, Melbourne, Aug-Sep 2010. Note: There are 22 x 81 = 382 data points displayed in the NY Times weather chart

  3. Obvious fact #1: • Graphs can communicate data: • quickly • accurately • powerfully • efficiently

  4. “Only 50% of American 17-year-olds can identify information in a graph”* * US National Assessment of Educational Progress, June 1990. Source: Wainer H. Understanding graphs and tables. Educational Researcher 1992; 21:14-23

  5. Whose fault? Drawn using MS Excel ‘XY-chart’ “Like characterising someone’s ability to read by asking questions about a passage full of spelling and grammatical errors. What are we really testing?” Source: Wainer H. Understanding graphs and tables. Educational Researcher 1992; 21:14-23

  6. Survival from ovarian cancer. Image taken from www.healthlinx.com.au, marketers of OvPlex, a proposed screening test for ovarian cancer

  7. Obvious fact #2: • Bad graphs can hinder communication

  8. A new view of the world http://odtmaps.com

  9. Where in the world is diabetes? http://www.worldmapper.org

  10. http://www.gapminder.org

  11. http://www.safetyandquality.gov.au/ acknowledgement: UQ PhD scholar Megan Preece

  12. Less obvious facts #3, #4, #5: • What characterises a “good” graph? • What are the characteristics of a “bad” graph? • What software to use? How to use it?

  13. Howie’s Helpful Hints for bad graph displays • Ten useful pointers to help you create uninformative, difficult-to-read scientific graphs • Adapted from: Wainer H. (1997) Visual Revelations. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers

  14. Steps for better graphs • Identify direction of effect • In almost all cases, the cause or predictor variable should be horizontal (X) • Effect or outcome variable is best vertical (Y) • Identify the levels of measurement • Nominal, ordinal or quantitative are different! • Think of visual perception guides • Columns or dots? Lines or scatterplot? • Minimise guides and non-data • Grid lines, tick marks, legends are non-data

  15. Cause (X) and effect (Y). [Plots with axes labelled Time and Standard deviation.] Figure 16: Standard deviation of batting averages for all full-time players by year for the first 100 years of professional baseball. Note the regular decline.* Source: Gould, Stephen Jay. Full House: The Spread of Excellence from Plato to Darwin. Random House, 1997. Cited: http://www.math.yorku.ca/SCS/Gallery/, 24 Nov 2002. * My emphasis

  16. Source: Killias M. International correlations between gun ownership and rates of homicide and suicide. Can Med Assoc J 1993; 148: 1721-5

  17. [Scatterplot: rate of homicide with a gun (per million per year) against % of households owning guns. Countries plotted: USA, Belgium, Finland, Canada, Australia, France, Switzerland, Spain, Netherlands, Norway, West Germany, Scotland, England & Wales.] Drawn using S-plus

  18. Levels of Measurement. The right display for a variable depends on its level of measurement. For univariate graphs: • qualitative: barplot • ordinal: column chart • quantitative: boxplot or histogram. For bivariate graphs: • X ordinal, Y binary: connected percents • X & Y both quantitative: scatterplot • X categorical, Y quantitative: box plots. Levels: • Binary (eg gender, death, pregnant) • Categorical • Qualitative (eg race, political party, religion) • Diverging (eg change, -ve to +ve) • Ordinal (eg rating scale, skin type, colour) • Quantitative • Interval: only differences matter (eg BP, IQ) • Ratio: absolute zero, ratios matter (eg weight, height, volume)
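
The pairings on slide 18 amount to a lookup from level of measurement to display type. A minimal Python sketch of that lookup follows; the function name `recommend_display` and the lower-case level names used as keys are assumptions made for this illustration, and combinations the slide does not mention return `None` rather than a guess.

```python
# Lookup tables transcribed from slide 18. The key spellings and the
# function name are choices made for this sketch, not part of the talk.
UNIVARIATE = {
    "qualitative": "barplot",
    "ordinal": "column chart",
    "quantitative": "boxplot or histogram",
}

BIVARIATE = {
    # (level of X, level of Y): suggested display
    ("ordinal", "binary"): "connected percents",
    ("quantitative", "quantitative"): "scatterplot",
    ("categorical", "quantitative"): "box plots",
}


def recommend_display(x_level, y_level=None):
    """Return the display slide 18 suggests, or None if it suggests none."""
    if y_level is None:
        return UNIVARIATE.get(x_level)
    return BIVARIATE.get((x_level, y_level))


print(recommend_display("ordinal"))
print(recommend_display("categorical", "quantitative"))
```

Keeping the table as data rather than as a chain of `if` statements makes it easy to see which combinations the slide leaves open.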

  19. Outcome is COHb%, but drawn on X. Ordinal variable, but categories mixed. Source: Lewis S, Mason C, Srna J. Carbon monoxide exposure in blast furnace workers. Aust J Public Health. 1992 Sep;16(3):262-8.

  20. An alternative display . . . [Axis labels: Outcome variable, Predictor variable.] Area of circles proportional to n. Drawn using MS Excel ‘bubble plot’
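
Making the area of each circle proportional to n means the radius must grow with the square root of n, not with n itself. The sketch below shows that arithmetic in Python; `bubble_radius` and its `area_per_unit` scaling parameter are hypothetical names introduced here, not anything from the slide or from Excel.

```python
import math


def bubble_radius(n, area_per_unit=1.0):
    """Radius of a circle whose area equals area_per_unit * n.

    area_per_unit is an arbitrary scale factor (an assumption of this
    sketch) that sets how large one observation is drawn.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # area = pi * r**2, so r = sqrt(area / pi)
    return math.sqrt(area_per_unit * n / math.pi)


small = bubble_radius(100)
large = bubble_radius(400)
# Four times the sample size gives four times the area,
# which is only twice the radius.
print(large / small)
```

Scaling the radius directly by n would make a group four times as large look sixteen times as large. Plotting libraries differ in what their size argument means (matplotlib's `scatter` takes marker area in points squared, for instance), so it is worth checking before passing values in.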

  21. Principles of visual perception • WS Cleveland • much work in psycho-physics of human visual understanding Tells us: • hierarchy of visual quantitative perception • patterns and shade can cause vibration • graphs can shrink with almost no loss of information Source: Cleveland WS. The Elements of Graphing Data. Monterey: Wadsworth, 1985.

  22. Ubiquitous column charts. Source: Jamrozik K, Spencer CA, et al. Does the Mediterranean paradox extend to abdominal aortic aneurism? Int J Epidemiol 2001; 30(5): 1071

  23. A dotchart version… Drawn using S-plus “Trellis” graphics

  24. Moiré vibration is easy with a computer !!!

  25. Moiré vibration • Vibration is maximised with lines of equal separation • This is common in scientific column charts cited in Tufte E. The Visual Display of Quantitative Information.

  26. Minimise non-data ink • Non-data ink includes tick marks, grid lines, background, legend • Explanation of error bars, P-values can be included in caption or in text. [Chart: relative mortality rate (all causes) for Greeks in Greece, Anglo-Celts in Australia, Japanese in Japan, Swedes in Sweden and Greeks in Australia.] Note the exception for X-Y orientation, made because the predictor is qualitative (unordered)
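
A rough text-only Python sketch of this layout follows: one row per group, the quantitative outcome along the horizontal axis, and a single dot as the only ink per row. The function `dot_chart`, the group names and the values are all made up for illustration and are not the data from the slide; sorting rows by value is a choice of this sketch, which an unordered predictor permits.

```python
def dot_chart(values, width=40, lo=0.0, hi=1.0):
    """Return text rows: group label, then one dot placed by its value."""
    label_width = max(len(name) for name in values)
    lines = []
    # An unordered (qualitative) predictor has no natural order,
    # so rows are sorted by the outcome value.
    for name, value in sorted(values.items(), key=lambda item: item[1]):
        if not lo <= value <= hi:
            raise ValueError(f"{name}: {value} is outside [{lo}, {hi}]")
        position = round((value - lo) / (hi - lo) * (width - 1))
        row = [" "] * width  # no grid, no fill: the dot is the only mark
        row[position] = "o"
        lines.append(f"{name:<{label_width}} |{''.join(row)}|")
    return lines


# Hypothetical groups and values, for illustration only.
example = {"Group A": 0.50, "Group B": 1.00, "Group C": 0.25}
for line in dot_chart(example):
    print(line)
```

Putting the groups on rows leaves room for long labels to be read horizontally, which is hard to achieve under the columns of a column chart.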

  27. Software for scientific graphics. Dedicated programs – thousands! • Prism • ViSta • DeltaGraph • SigmaPlot. Business graphics: • MS Excel • Visio (MS Office) • many other spreadsheet programs. Graphics in statistical packages: • Stata (simple, powerful) • R (powerful, free) • StatsDirect (very like Excel) • SPSS (interactive graphics; easy, expensive) • Systat (good reputation) • SAS (expensive, powerful) • Minitab (popular, powerful). Advice: Avoid “default” choice in all programs (almost always wrong). Avoid programs with “Chart Type” menus – wrong approach.

  28. Death by Powerpoint • Powerpoint is power-ful for editing graphs, presenting • But… • Dependence on bullet points • Linear thinking • Presenters READING slide-after-slide • Many design gurus now reject the Powerpoint (keynote, etc) paradigm

  29. Graph formats. Object-oriented: • lines, shapes, etc can be identified within graph • each object has attributes (eg size, colour, font) • editable using selection and “grouping” • Common formats: Postscript (ps, eps), Windows metafile (wmf, emf). Bit-mapped image: • exists as a collection of pixels • each pixel is light or dark, coloured • can edit only pixels, not objects • often “compressed” to save disk space, bandwidth • Common formats: graphics interchange (gif), Windows bitmap (bmp), JPEG interchange (jpg). Advice: Use WMF format where possible. Paste WMF into PowerPoint, “ungroup”, then edit objects for publication quality.
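
The difference between the two families of format can be shown with a toy Python model. This is only an illustration of the idea: the dictionary layout and the `rasterise` function are assumptions of the sketch and do not correspond to how WMF, Postscript or any bitmap format actually stores data.

```python
def rasterise(rectangles, width, height):
    """Turn rectangle objects into a grid of pixels (1 = dark, 0 = light)."""
    grid = [[0] * width for _ in range(height)]
    for rect in rectangles:
        for y in range(rect["y"], rect["y"] + rect["h"]):
            for x in range(rect["x"], rect["x"] + rect["w"]):
                grid[y][x] = 1
    return grid


# Object-oriented form: each bar is an object with its own attributes.
bars = [
    {"x": 0, "y": 0, "w": 1, "h": 3},
    {"x": 2, "y": 1, "w": 1, "h": 2},
]

# Editing an object means changing one attribute of one shape.
bars[1]["h"] = 1

# Bit-mapped form: after rasterising, only pixels remain. The grid no
# longer records which pixels belonged to which bar.
bitmap = rasterise(bars, width=4, height=3)
for row in bitmap:
    print(row)
```

Resizing a bar in the object form is one assignment, whereas in the bitmap it would mean repainting individual pixels, which is the practical reason behind the slide's advice to prefer an object-based format for editing.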

  30. References, further reading • Cleveland WS. Visualizing Data. Summit, NJ: Hobart Press, 1993 • Wilkinson L. The Grammar of Graphics. New York: Springer Verlag, 1999 • Tufte ER. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 2001. www.edwardtufte.com • Wainer H. Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers, 1997. www.erlbaum.com

  31. Summary • Howie’s Helpful Hints for bad graphs: • Don’t show the data • Show the data inaccurately • Obfuscate the data • Steps for better graphs: • Identify direction of cause & effect • Exploit levels of measurement • Accommodate visual perception principles • Minimise non-data ink • Don’t use Excel unless you have to • And if you have to, don’t use the default chart!

  32. Finally, on a personal note, Thank you! To all my friends at Monash & SPHPM for their continuing support and understanding over the last 18 months, particularly Steve, John, Peter & Just
