Presenting Multivariate Data Harry R. Erwin, PhD School of Computing and Technology University of Sunderland
Resources • Everitt, BS, and G Dunn (2001) Applied Multivariate Data Analysis, London: Arnold. • Everitt, BS (2005) An R and S-PLUS® Companion to Multivariate Analysis, London: Springer • Tukey’s seminal paper: • <http://www.edwardtufte.com/tufte/tukey> • Tufte’s work: • http://en.wikipedia.org/wiki/Edward_Tufte • Murrell, P, (2006) R Graphics, Florida: Chapman & Hall/CRC.
Do not lie about the data. • How do people lie? • By presenting data selectively • By distorting the visual representation of the data • By failing to extrapolate scales from one portion of the image to another. • By changing scales • By inflating vertical scales • By failing to show the zero point or 100% point of an axis. • By representing linear data by area • By representing areal data linearly • By omitting data
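As a small illustration of the point about zero points, the sketch below draws the same three bars twice, once with a truncated vertical axis and once with the axis starting at zero. The values are made up for the example and are not from the lecture.

```r
# Made-up values, for illustration only
values <- c(A = 96, B = 98, C = 100)

op <- par(mfrow = c(1, 2))          # two plots side by side

# Truncated axis: small differences look dramatic
barplot(values, ylim = c(95, 100), xpd = FALSE,
        main = "Axis starts at 95")

# Axis includes the zero point: differences shown in proportion
barplot(values, ylim = c(0, 100),
        main = "Axis starts at 0")

par(op)                             # restore the previous settings
```

The same data look very different in the two panels; only the second shows the bars in proportion to the values.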
Five Principles • Above all else, show the data • Maximize the data/ink ratio • Erase non-data ink • Erase redundant data ink • Revise and edit freely
Avoid Chartjunk • Avoid moiré effects—shimmering • Mute the grid • Dump the duck—avoid self-promoting graphics
A Laundry-List of Plotting Commands in R • Standard scatterplot commands • plot(dataset) # for two-column data • text(xval,yval,TextString) • barplot() produces barplots. • hist() produces histograms • boxplot() produces boxplots • pie() produces piecharts • pairs() produces a pairs diagramme
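A minimal sketch of the commands listed above, run on datasets that ship with R (cars, mtcars, faithful, InsectSprays, iris). The choice of datasets and labels is an assumption made for illustration, not part of the original slides.

```r
# Scatterplot of a two-column data frame, plus a text label
plot(cars)                                   # speed vs. stopping distance
text(cars$speed[50], cars$dist[50], "fastest car", pos = 2)

# Bar plot of counts
cyl_counts <- table(mtcars$cyl)
barplot(cyl_counts, xlab = "Cylinders", ylab = "Number of cars")

# Histogram of a numeric vector
hist(faithful$eruptions, main = "Old Faithful eruption times")

# Box plots of a response split by a factor
boxplot(count ~ spray, data = InsectSprays)

# Pie chart of counts
pie(table(mtcars$gear), main = "Gears")

# Scatterplot matrix of the four numeric iris columns
pairs(iris[, 1:4])
```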
What can you plot? • The first argument(s) to any plot command is (are) very flexible • A dataframe • A pair of vectors • A formula, e.g. plot(pressure~temperature, data=pressure) • A model
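The four kinds of first argument can be tried on the built-in pressure dataset. This is a sketch only; the log-linear model is a hypothetical choice made to have something to plot, not a recommendation from the slides.

```r
# 1. A data frame (two columns: temperature, pressure)
plot(pressure)

# 2. A pair of vectors
plot(pressure$temperature, pressure$pressure)

# 3. A formula, with the data frame supplying the variables
plot(pressure ~ temperature, data = pressure)

# 4. A fitted model: plot() dispatches to plot.lm() and draws diagnostics
fit <- lm(log(pressure) ~ temperature, data = pressure)
plot(fit, which = 1)                # residuals against fitted values
```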
Adding details to a plot • text(locx, locy, TextString) • points(vecx,vecy) to add points • lines(vecx,vecy) to draw connecting lines • matplot() • par() to customise the graphics • axis() to add an axis • grid() to add a grid • abline() to add a line to the plot • arrows() to add arrows to the plot • mtext() to add marginal text • title() to add a title • legend() to add a legend
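Several of these functions can be combined on a single plot. The sketch below builds up a scatterplot of the built-in cars data step by step; the labels, colours and coordinates are illustrative assumptions.

```r
op <- par(mar = c(5, 4, 4, 4) + 0.1)            # par(): widen the right margin

plot(dist ~ speed, data = cars, type = "n")     # empty frame with axes only
grid(col = "grey85")                            # a muted grid
points(cars$speed, cars$dist, pch = 16)         # add the points

fit <- lm(dist ~ speed, data = cars)
abline(fit, lty = 2)                            # straight-line fit
lines(lowess(cars$speed, cars$dist), col = "blue")  # smoothed curve

text(20, 110, "longest stop", pos = 2)          # annotation text
arrows(20, 110, 23.6, 119, length = 0.1)        # arrow towards (24, 120)

axis(4)                                         # extra axis on the right
mtext("Built-in cars data", side = 3, line = 0.3)   # marginal text
title(main = "Stopping distance against speed")
legend("topleft", legend = c("linear fit", "lowess"),
       lty = c(2, 1), col = c("black", "blue"))

par(op)                                         # restore the previous settings
```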
Trellis Graphics • library(lattice) # its plotting functions return objects of class "trellis" • Developed by Deepayan Sarkar • Generates complete plots • Operates much like traditional graphics, but with defaults optimised following Bill Cleveland’s recommendations. • Ensures accurate and faithful communication of information • Supports ‘multi-panel conditioning’, which will be very useful.
The Lattice Graphics Model • tplot<-xyplot(lat~long, data=quakes, pch=".") • print(tplot) • tplot2<-update(tplot, main="Earthquakes in the Pacific Ocean (since 1964)") • Use trellis.par.set() instead of par() to change graphics settings; trellis.device() opens a graphics device with lattice's settings
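A runnable sketch of this model: a lattice call returns an object, nothing is drawn until the object is printed, and update() returns a modified copy. The settings change at the end is an illustrative assumption about what one might want to alter.

```r
library(lattice)

# Build the plot object; at the console it is printed (drawn) automatically,
# but inside scripts and functions print() must be called explicitly
tplot <- xyplot(lat ~ long, data = quakes, pch = ".")
print(tplot)

# update() returns a new trellis object; tplot itself is unchanged
tplot2 <- update(tplot,
                 main = "Earthquakes in the Pacific Ocean (since 1964)")
print(tplot2)

# Lattice settings are read with trellis.par.get() and changed with
# trellis.par.set(), rather than with par()
old_symbol <- trellis.par.get("plot.symbol")
trellis.par.set(plot.symbol = list(col = "black"))
print(tplot2)
trellis.par.set(plot.symbol = old_symbol)       # restore the old setting
```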
Trellis Plots (Trellis function / standard equivalent) • barchart() / barplot() • bwplot() / boxplot() • densityplot() / none • dotplot() / dotchart() • histogram() / hist() • qqmath() / qqnorm() • stripplot() / stripchart()
More Trellis Plots (Trellis function / standard equivalent) • qq() / qqplot() • xyplot() / plot() • levelplot() / image() • contourplot() / contour() • cloud() / none • wireframe() / persp() • splom() / pairs() • parallel() (named parallelplot() in current lattice releases) / none
Multipanel Conditioning • depthgroup<-equal.count(quakes$depth, number=3, overlap=0) • xyplot(lat~long|depthgroup, data=quakes, pch=".")
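The conditioning example can be run as below. The second plot, which conditions on a factor built with cut(), is an added variation for comparison; its break points are assumptions chosen to cover the magnitudes in the quakes data.

```r
library(lattice)

# equal.count() builds a "shingle": here three non-overlapping depth ranges
# holding roughly equal numbers of earthquakes
depthgroup <- equal.count(quakes$depth, number = 3, overlap = 0)
print(levels(depthgroup))                       # the three depth intervals

# One panel per depth range, all sharing the same axes
print(xyplot(lat ~ long | depthgroup, data = quakes, pch = ".",
             layout = c(3, 1)))

# Conditioning also works with an ordinary factor
magband <- cut(quakes$mag, breaks = c(4, 4.5, 5, 6.5),
               include.lowest = TRUE)
print(xyplot(lat ~ long | magband, data = quakes, pch = ".",
             layout = c(3, 1)))
```

Because every panel uses the same scales, the positions of shallow and deep earthquakes can be compared directly across panels.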