Envisioning Information
Lecture 4 – Multivariate Data Exploration
Glyphs and other methods
Hierarchical approaches
Ken Brodlie
ENV 2006
Glyph Techniques
Glyph Techniques
• Map data values to the geometric and colour attributes of a glyph (or marker symbol)
• Very many types of glyph have been suggested: star glyphs, faces, arrows, sticks, shape coding
Glyph Layouts
• How do we place the glyphs on a chart?
• Sometimes there will be a natural location – for example?
• If not… two of the variates can be allocated to spatial position, and the remainder to the attributes of the glyph
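The allocation of variates to position and glyph attributes can be sketched as follows. This is a minimal illustration, not a prescribed method: the choice of size and grey level as the glyph attributes, the function names and the sample rows are all assumptions made for the example.

```python
def normalise(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [0.5 if span == 0 else (v - lo) / span for v in column]


def layout_glyphs(rows):
    """Map each 4-variate row to a glyph description.

    Variates 0 and 1 give the spatial position; variates 2 and 3
    drive the glyph attributes (size and grey level, chosen here
    purely for illustration).
    """
    columns = [normalise(list(col)) for col in zip(*rows)]
    glyphs = []
    for x, y, size, grey in zip(*columns):
        glyphs.append({"x": x, "y": y, "size": 5 + 20 * size, "grey": grey})
    return glyphs


# Made-up observations, four variates each.
rows = [(1.0, 10.0, 3.0, 0.2), (2.0, 30.0, 9.0, 0.4), (3.0, 20.0, 6.0, 0.8)]
glyphs = layout_glyphs(rows)
```

Each variate is normalised independently, so the choice of which two variates go to the axes changes the picture considerably; it is worth trying more than one allocation.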
Glyph Techniques – Star Plots
• Each observation is represented as a ‘star’
• Each spike represents a variable
• Length of spike indicates the value
Glyph Techniques – Star Plots: Crime in Detroit
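The star-plot construction described above can be sketched in a few lines. The sketch assumes non-negative values and scales each spike by the largest value of its variable; both the scaling rule and the function name `star_glyph` are assumptions for illustration.

```python
import math


def star_glyph(values, maxima, radius=1.0):
    """Return the (x, y) tip of each spike of a star glyph.

    values  -- one observation, one number per variable
    maxima  -- largest value of each variable, used to scale spikes
    """
    n = len(values)
    tips = []
    for k, (v, vmax) in enumerate(zip(values, maxima)):
        angle = 2 * math.pi * k / n          # spikes evenly spaced
        length = radius * v / vmax           # spike length encodes value
        tips.append((length * math.cos(angle), length * math.sin(angle)))
    return tips


# Made-up observation with four variables.
tips = star_glyph([5.0, 2.0, 10.0, 4.0], maxima=[10.0, 4.0, 10.0, 8.0])
```

Drawing a line from the glyph centre to each tip gives the spikes; joining the tips in order gives the closed outline often seen in star plots.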
Star Glyphs – Iris Data Set
Chernoff Faces
• Chernoff suggested the use of faces to encode a variety of variables
  - variables can be mapped to the size, shape and colour of facial features
  - the human brain rapidly recognises faces
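A minimal sketch of the mapping step, assuming the variables have already been normalised to the range 0 to 1. The three facial features, their ranges and the variable names are hypothetical and chosen only to show the idea; they are not taken from Chernoff's original scheme.

```python
# Hypothetical feature ranges: (minimum, maximum) for each facial feature.
FEATURE_RANGES = {
    "face_width": (0.6, 1.0),
    "eye_size": (0.05, 0.20),
    "mouth_curvature": (-1.0, 1.0),   # -1 = frown, +1 = smile
}


def face_parameters(observation, assignment):
    """Map normalised variables (0..1) to facial feature values.

    observation -- dict of variable name -> value in [0, 1]
    assignment  -- dict of facial feature -> variable name
    """
    face = {}
    for feature, variable in assignment.items():
        lo, hi = FEATURE_RANGES[feature]
        face[feature] = lo + observation[variable] * (hi - lo)
    return face


# Made-up observation and assignment of variables to features.
face = face_parameters(
    {"income": 0.5, "age": 1.0, "satisfaction": 0.0},
    {"face_width": "income", "eye_size": "age", "mouth_curvature": "satisfaction"},
)
```

Keeping the assignment separate from the data makes it easy to try a different pairing of variables and features, which matters because some features (such as the mouth) tend to draw more attention than others.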
Chernoff Faces
Here are some of the facial features you can use: http://www.bradandkathy.com/software/faces.html
Chernoff Faces
• Demonstration applet at: http://www.hesketh.com/schampeo/projects/Faces/
Chernoff’s Face
… and here is Chernoff’s face: http://www.fas.harvard.edu/~stats/People/Faculty/Herman_Chernoff/Herman_Chernoff_Index.html
Stick Figures
• Glyph is a matchstick figure, with variables mapped to the angle and length of its limbs
• As with Chernoff faces, two variables are mapped to the display axes
• Stick figures are useful for very large data sets: texture patterns emerge
• Idea due to RM Pickett & G Grinstein
• The different angles that may be varied are shown
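The geometry of one stick figure can be sketched as below. The five-segment layout (a body with two limbs at each end), the fixed limb length and the function name are assumptions for illustration; in a fuller version the limb lengths could also be driven by variables.

```python
import math


def stick_figure(x, y, body_angle, limb_angles, limb_length=1.0):
    """Return line segments ((x0, y0), (x1, y1)) for one stick figure.

    (x, y)      -- glyph position, from the two variables mapped to the axes
    body_angle  -- orientation of the body segment, in radians
    limb_angles -- four angles in radians: two limbs at each end of the body
    """
    def endpoint(px, py, angle):
        return (px + limb_length * math.cos(angle),
                py + limb_length * math.sin(angle))

    base = (x, y)
    top = endpoint(x, y, body_angle)
    segments = [(base, top)]
    for angle in limb_angles[:2]:            # limbs attached at the base
        segments.append((base, endpoint(*base, angle)))
    for angle in limb_angles[2:]:            # limbs attached at the top
        segments.append((top, endpoint(*top, angle)))
    return segments


# An upright body with horizontal limbs at both ends.
segments = stick_figure(0.0, 0.0, math.pi / 2, [math.pi, 0.0, math.pi, 0.0])
```

When many such figures are packed densely, regions with similar data values produce similar limb orientations, which is where the texture patterns come from.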
Stick Figures – 5D image data from the Great Lakes region
Shape Coding
• Suitable where a variable has a Boolean value, i.e. on/off
• A data item is represented as an array of elements, each element corresponding to a variable
  - shade in the box if the value of the corresponding variable is ‘on’
• Arrays are laid out in a line, or plane, as with other icon-based methods
[Figure: array of six boxes, numbered 1 to 6]
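A text-only sketch of one shape-coded icon, assuming six Boolean variables laid out three to a row. The characters used for shaded and empty boxes and the function name `shape_code` are arbitrary choices for the example.

```python
def shape_code(flags, columns=3):
    """Render Boolean variables as rows of shaded ('#') or empty ('.') boxes."""
    cells = ["#" if flag else "." for flag in flags]
    return ["".join(cells[i:i + columns]) for i in range(0, len(cells), columns)]


# Variables 1, 3 and 6 are 'on'.
icon = shape_code([True, False, True, False, False, True])
# icon == ["#.#", "..#"]
```

Placing one such icon per observation side by side gives the line layout mentioned above; placing them on a grid gives the plane layout.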
Shape Coding – Time series of NASA earth observation data
Daisy Charts
* variables and their values are placed around a circle
* lines connect the values for one observation
[Figure: daisy chart with the values Dry, Wet, Showery, Leeds, Sahara, Amazon, Saturday and Sunday placed around the circle; the highlighted item is { wet, Saturday, Amazon }]
http://www.daisy.co.uk
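The construction can be sketched as follows, using the value labels from the slide. Spacing the values evenly around the circle and joining every pair of values within an observation are assumptions made for this sketch; the commercial Daisy tool may lay out and connect values differently.

```python
import math
from itertools import combinations


def place_values(values):
    """Place each value label at an evenly spaced point on the unit circle."""
    n = len(values)
    return {v: (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k, v in enumerate(values)}


def observation_lines(observation, positions):
    """Return one line segment per pair of values in the observation."""
    return [(positions[a], positions[b]) for a, b in combinations(observation, 2)]


labels = ["Dry", "Wet", "Showery", "Leeds", "Sahara", "Amazon", "Saturday", "Sunday"]
positions = place_values(labels)
lines = observation_lines(["Wet", "Saturday", "Amazon"], positions)
```

Drawing the segments for every observation on the same circle makes frequently co-occurring values stand out as dense bundles of lines.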
Daisy Charts – Underground Problems
Daisy Charts – News Analysis
Four variates: day, source, search terms, keywords
Reducing Complexity in Multivariate Data Exploration
Clustering as a Solution
• Success has been achieved through clustering of observations
• Hierarchical parallel co-ordinates
  - cluster by similarity
  - display using translucency and proximity-based colour
http://davis.wpi.edu/~xmdv/docs/vis99_HPC.pdf
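A small sketch of the clustering idea: observations are merged bottom-up into clusters, and each cluster is then summarised by a mean line and a band (minimum to maximum per variable) that could be drawn translucently on parallel co-ordinates. Single-linkage merging, the function names and the sample points are assumptions for illustration and are not taken from the cited paper.

```python
import math


def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Single linkage (closest pair of members) is an assumption made here.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters


def summarise(cluster):
    """Per-variable mean, minimum and maximum of a cluster."""
    columns = list(zip(*cluster))
    return {
        "mean": [sum(c) / len(c) for c in columns],
        "low": [min(c) for c in columns],
        "high": [max(c) for c in columns],
    }


# Made-up observations, three variables each.
points = [(1.0, 1.0, 1.0), (1.0, 2.0, 1.0), (8.0, 8.0, 9.0), (9.0, 8.0, 9.0)]
summaries = [summarise(c) for c in agglomerate(points, k=2)]
```

Stopping the merging at different values of `k` gives coarser or finer views of the same data, which is the sense in which the display is hierarchical.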
Comparison – One of 3 clusters
Hierarchical Parallel Co-ordinates
Reduction of Dimensionality of Variable Space
• Reduce number of variables, preserve information
• Principal Component Analysis
  - transform to new co-ordinate system
  - hard to interpret
• Hierarchical reduction of variable space
  - cluster variables where distance between observations is typically small
  - choose representative for each cluster
• Subgroup has then been identified – showing what?
• 42 dimensions, 200 observations
http://davis.wpi.edu/%7Exmdv/docs/vhdr_vissym.pdf
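The idea of clustering variables and keeping one representative per cluster can be sketched as below. This uses a simple greedy threshold grouping in place of the hierarchical clustering of the cited work, and the distance measure (mean absolute difference of normalised values over the observations), the threshold and the sample data are all assumptions for illustration.

```python
def normalise(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]


def variable_distance(a, b):
    """Mean absolute difference between two normalised variables."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def reduce_variables(data, threshold):
    """Group similar variables and return one representative per group.

    data -- dict of variable name -> list of values (one per observation)
    """
    scaled = {name: normalise(col) for name, col in data.items()}
    groups = []                                  # each group: list of names
    for name in scaled:
        for group in groups:
            # Join the first group whose representative is close enough.
            if variable_distance(scaled[name], scaled[group[0]]) <= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return [group[0] for group in groups], groups


# Made-up data: two near-duplicate variables and one unrelated variable.
data = {
    "height_cm": [150, 160, 170, 180],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "income": [40, 10, 30, 20],
}
representatives, groups = reduce_variables(data, threshold=0.1)
```

Unlike Principal Component Analysis, the representatives are original variables, so the reduced display stays directly interpretable.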