Visualising Variables – Validly! November 2008. Damien Jolley. Monash Institute of Health Services Research, Monash University. AHMRC Posters. Weather information, New York Times, September ‘08. Obvious fact #1: Graphs can communicate data quickly, accurately, powerfully, efficiently.
Visualising Variables – Validly! November 2008. Damien Jolley, Monash Institute of Health Services Research, Monash University. AHMRC Posters
Obvious fact #1:
• Graphs can communicate data:
  • quickly
  • accurately
  • powerfully
  • efficiently
“Only 50% of American 17-year-olds can identify information in a graph”*
* US National Assessment of Educational Progress, June 1990
Source: Wainer H. Understanding graphs and tables. Educational Researcher 1992; 21:14-23
Whose fault? Drawn using MS Excel ‘XY-chart’ “Like characterising someone’s ability to read by asking questions about a passage full of spelling and grammatical errors. What are we really testing?” Source: Wainer H. Understanding graphs and tables. Educational Researcher 1992; 21:14-23
Obvious fact #2: • Bad graphs can hinder communication
Less obvious facts #3, #4, #5: • What characterises a “good” graph? • What are the characteristics of a “bad” graph? • What software to use? How to use it?
Howie’s Helpful Hints for bad graph displays
• Ten useful pointers to help you create uninformative, difficult-to-read scientific graphs
• Adapted from: Wainer H. (1997) Visual Revelations. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers
Steps for better graphs
• Identify direction of effect
  • In almost all cases, the cause or predictor variable should be horizontal (X)
  • Effect or outcome variable is best vertical (Y)
• Identify the levels of measurement
  • Nominal, ordinal or quantitative are different!
• Think of visual perception guides
  • Columns or dots? Lines or scatterplot?
• Minimise guides and non-data
  • Grid lines, tick marks, legends are non-data
Cause (X) and effect (Y)
(Figure; axis labels: Time, Standard deviation)
Figure 16. Standard deviation of batting averages for all full-time players by year for the first 100 years of professional baseball. Note the regular decline.*
Source: Gould, Stephen Jay. Full House: The Spread of Excellence from Plato to Darwin. Random House, 1997. Cited at: http://www.math.yorku.ca/SCS/Gallery/, 24 Nov 2002
* My emphasis
Source: Killias M. International correlations between gun ownership and rates of homicide and suicide. Can Med Assoc J 1993; 148: 1721-5
(Figure: rate of homicide with a gun (per million per year; axis ticks at 1, 5, 10, 50) against % of households owning guns (axis ticks at 10, 20, 30, 40). Countries labelled: USA, Belgium, Finland, Canada, Australia, France, Switzerland, Spain, Netherlands, Norway, West Germany, Scotland, England & Wales.)
Drawn using S-plus
Levels of Measurement
• Binary: eg gender, death, pregnant
• Categorical (qualitative): eg race, political party, religion
• Diverging: eg change (-ve to +ve)
• Ordinal: eg rating scale, skin type, colour
• Quantitative
  • Interval: only differences matter, eg BP, IQ
  • Ratio: absolute zero, ratios matter, eg weight, height, volume
The right display for a variable depends on its level of measurement
For univariate graphs:
• qualitative: barplot
• ordinal: column chart
• quantitative: boxplot or histogram
For bivariate graphs:
• X ordinal, Y binary: connected percents
• X & Y both quantitative: scatterplot
• X categorical, Y quantitative: box plots
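The pairings on this slide amount to a lookup from level of measurement to display type. A minimal sketch of that lookup follows; the function name `suggest_display` and the dictionary names are illustrative assumptions, and only the combinations listed on the slide are covered.

```python
# Lookup tables transcribed from the slide. The names UNIVARIATE, BIVARIATE
# and suggest_display are hypothetical, chosen for this illustration only.
UNIVARIATE = {
    "qualitative": "barplot",
    "ordinal": "column chart",
    "quantitative": "boxplot or histogram",
}

BIVARIATE = {
    ("ordinal", "binary"): "connected percents",
    ("quantitative", "quantitative"): "scatterplot",
    ("categorical", "quantitative"): "box plots",
}


def suggest_display(x_level, y_level=None):
    """Return the display the slide pairs with these levels of measurement.

    Returns None when the slide gives no recommendation for the combination.
    """
    if y_level is None:
        return UNIVARIATE.get(x_level)
    return BIVARIATE.get((x_level, y_level))
```

For example, `suggest_display("ordinal")` gives the univariate recommendation, while `suggest_display("categorical", "quantitative")` gives the bivariate one. Combinations the slide does not mention return `None` rather than a guess.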
Outcome is COHb%, but drawn on X
Ordinal variable, but categories mixed
Source: Lewis S, Mason C, Srna J. Carbon monoxide exposure in blast furnace workers. Aust J Public Health. 1992 Sep;16(3):262-8.
An alternative display . . .
(Figure; axis labels: Outcome variable, Predictor variable)
Area of circles proportional to n
Drawn using MS Excel ‘bubble plot’
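Making the area of a circle proportional to n means its radius must grow with the square root of n, not with n itself. The sketch below shows that scaling; the function name `bubble_radius` and the default `r_max` are assumptions for illustration, and nothing here describes how any particular charting program sizes its bubbles.

```python
import math


def bubble_radius(n, n_max, r_max=20.0):
    """Radius for a group of size n so that circle AREA is proportional to n.

    n_max is the largest group size; r_max (a hypothetical default) is the
    radius given to that largest group.
    """
    if n < 0 or n_max <= 0:
        raise ValueError("n must be non-negative and n_max positive")
    # Area = pi * r^2, so area proportional to n implies r proportional to sqrt(n).
    return r_max * math.sqrt(n / n_max)
```

With this scaling, a group a quarter the size of the largest gets half its radius and therefore a quarter of its area. Scaling the radius directly by n would shrink the smaller group's area to one sixteenth instead.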
Principles of visual perception
• WS Cleveland
  • much work in psycho-physics of human visual understanding
Tells us:
• hierarchy of visual quantitative perception
• patterns and shade can cause vibration
• graphs can shrink with almost no loss of information
Source: Cleveland WS. The Elements of Graphing Data. Monterey: Wadsworth, 1985.
Ubiquitous column charts
Source: Jamrozik K, Spencer CA, et al. Does the Mediterranean paradox extend to abdominal aortic aneurism? Int J Epidemiol 2001; 30(5): 1071
A dotchart version… Drawn using S-plus “Trellis” graphics
Moiré vibration is easy with a computer !!!
Moiré vibration
• Vibration is maximised with lines of equal separation
• This is common in scientific column charts
Cited in: Tufte E. The Visual Display of Quantitative Information.
Minimise non-data ink
• Non-data ink includes tick marks, grid lines, background, legend
• Explanation of error bars, P-values can be included in caption or in text
(Figure: relative mortality rate (all causes), axis ticks at 0.10, 0.25, 0.50, 0.75, 1.00, for Greeks in Greece, Anglo-Celts in Australia, Japanese in Japan, Swedes in Sweden, Greeks in Australia)
Note the exception for X-Y orientation: because predictor is qualitative (unordered)
Software for scientific graphics
Dedicated programs – thousands!
• DeltaGraph (SPSS)
• Prism
• ViSta
Business graphics
• MS Excel
• many other spreadsheet programs
Graphics in statistical packages
• Stata: simple, powerful
• S-Plus, R: powerful, difficult
• SPSS interactive graphics: easy, expensive
• Systat: good reputation
• SAS GRAPH language: expensive, powerful
Advice: Avoid “default” choice in all programs (almost always wrong). Avoid programs with “Chart Type” menus – wrong approach.
Graph formats
Object-oriented
• lines, shapes, etc can be identified within graph
• each object has attributes (eg size, colour, font)
• editable using selection and “grouping”
• Common formats: Postscript (ps, eps), Windows metafile (wmf, emf)
Bit-mapped image
• exists as a collection of pixels
• each pixel is light or dark, coloured
• can edit only pixels not objects
• often “compressed” to save disk space, bandwidth
• Common formats: graphics interchange (gif), Windows bitmap (bmp), JPEG interchange (jpg)
Advice: Use WMF format where possible. Paste WMF into PowerPoint, “ungroup”, then edit objects for publication quality.
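The difference between the two families can be illustrated with a toy model: an object keeps its attributes, while a bitmap keeps only pixels. This is a sketch under simplifying assumptions, not a description of any real file format. The `Segment` class, the `rasterise` function, and the single-character "colours" are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """A horizontal line stored as an object with editable attributes."""
    row: int
    col_start: int
    col_end: int
    colour: str  # a single character standing in for a colour


def rasterise(seg, width, height, background="."):
    """Render the object to a grid of pixels; the object itself is not kept."""
    grid = [[background] * width for _ in range(height)]
    for col in range(seg.col_start, seg.col_end + 1):
        grid[seg.row][col] = seg.colour
    return grid


line = Segment(row=1, col_start=1, col_end=4, colour="k")

# Object-oriented edit: change one attribute of the selected object.
red_line = replace(line, colour="r")

# Bitmap edit: the line no longer exists as an object, so every pixel
# that happens to carry its colour has to be found and changed.
pixels = rasterise(line, width=6, height=3)
recoloured = [["r" if p == "k" else p for p in row] for row in pixels]
```

Both routes end with the same picture here, but the bitmap route would also recolour any unrelated pixel that shared the old colour. That is the practical reason an object-based format is easier to edit for publication.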
References, further reading
• Cleveland WS. Visualizing Data. Summit, NJ: Hobart Press, 1993.
• Wilkinson L. The Grammar of Graphics. New York: Springer Verlag, 1999.
• Tufte ER. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 2001. www.edwardtufte.com
• Wainer H. Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers, 1997. www.erlbaum.com
Summary
• Howie’s Helpful Hints for bad graphs:
  • Don’t show the data
  • Show the data inaccurately
  • Obfuscate the data
• Steps for better graphs:
  • Identify direction of cause & effect
  • Exploit levels of measurement
  • Accommodate visual perception principles
  • Minimise non-data ink
• Don’t use Excel unless you have to
  • And if you have to, don’t use the default chart!