A Rough Guide to Data Visualization VizNET 2007 Annual Event Ken Brodlie School of Computing University of Leeds
Data Visualization • Visualization now seen as key part of modern computing • High performance computing generates vast quantities of data ... • High resolution measurement technology likewise ... • microscopes, scanners, satellites • Information systems involve not only large data sets but also complex connections... • ... we need to harness our visual senses to help us understand the data
Data Visualization – What is it? [Diagram: Reality is captured by Observation or Simulation as Data, which Visualization turns into Images and animation]
Applications - Meteorology Pressure at levels in atmosphere - illustrated by contour lines in a slice plane Generated by the Vis5D system from University of Wisconsin (now Vis5d+) Vis5d: http://www.ssec.wisc.edu/~billh/vis5d.html Vis5d+ : http://vis5d.sourceforge.net
Applications - Medicine From scanner data, we can visualize 3D pictures of human anatomy, using volume rendering Generated by Anatomy.TV used by Leeds medical students to learn anatomy
Applications – Computational Fluid Dynamics • Rayleigh-Taylor instability: interface between immiscible fluids, e.g. oil / water • Loops and fingers arise when mixing starts • Simulated on ASCI Blue Pacific (Cook & Dimotakis, 2001) • Interface visualized using a density isosurface
Applications – Hierarchical Information Usenet news groups For history of treemaps see: www.cs.umd.edu/hcil/treemap-history Developed over many years by Ben Shneiderman and colleagues
Structure of Session Part 1 • Introduction • What is visualization and some examples • The humble graph • Much to learn • Scientific visualization • Understanding 2D and 3D data Part 2 • Exploratory data visualization • Finding relationships in tables of data • Visualizing structures • Information hierarchies • Interacting with visualizations • Focus and context
The First Visualization This picture is taken from Brian Collins ‘Data Visualization - Has it all been seen before?’ in ‘Animation and Scientific Visualization’, Academic Press
Simple Data Presentation Simple data tables are often presented as line graphs, bar graphs, pie charts, dot graphs, histograms… Which should we use and when?
Line Graph • Fundamental technique of data presentation • Used to compare two continuous variables • X-axis is often the control variable • Y-axis is the response variable • Good at: predicting values where data not given • Often (dubiously) used for trends when control is a categorical variable [Example figure: Students participating in sporting activities ?]
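The "predicting values where data not given" point can be sketched as linear interpolation between neighbouring measurements, which is what the straight segments of a line graph imply. This is only a minimal sketch: the function name `interpolate` and the sample numbers are illustrative assumptions, not part of the original slides.

```python
from bisect import bisect_right

def interpolate(xs, ys, x):
    """Estimate y at x by joining neighbouring samples with a straight line.

    xs must be sorted in increasing order (the continuous control variable);
    ys holds the measured response at each xs value.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the measured range")
    # Index of the segment [xs[i], xs[i + 1]] that contains x.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

# Hypothetical measurements, for illustration only.
control = [0, 10, 20]
response = [0, 5, 20]
print(interpolate(control, response, 5))   # halfway along the first segment
```

The same reasoning shows why a line graph is dubious for a categorical control: there is no meaningful "halfway" between two categories to interpolate over.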
Simple Representations – Bar Graph • Bar graph presents categorical variables • Height of bar indicates value • Double bar graph allows comparison • Note spacing between bars • Can be horizontal (when would you use this?) [Example figures: Number of police officers; Internet use at a school]
Dot Graph • Very simple but effective… • Horizontal to give more space for labelling
Pie Chart • Pie chart summarises a set of categorical/nominal data • Shows proportions • But use with care… too many segments are harder to compare than in a bar chart [Example figures: Should we have a long lecture?; Favourite movie genres]
Histograms • Histograms summarise discrete or continuous data that are measured on an interval scale • No gaps if variable is continuous [Example figure: Distribution of salaries in a company]
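A rough sketch of the aggregation behind a histogram: the range is cut into equal-width intervals and each value is counted in the interval it falls into. The function name `histogram` and the salary figures are made up for illustration; they are not taken from the slide's figure.

```python
def histogram(values, low, high, bins):
    """Count how many values fall into each of `bins` equal-width intervals.

    Intervals are half-open [a, b), except the last, which also includes `high`.
    Values outside [low, high] are ignored.
    """
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v <= high:
            # Clamp so that v == high lands in the final interval.
            i = min(int((v - low) / width), bins - 1)
            counts[i] += 1
    return counts

# Hypothetical salaries in thousands, for illustration only.
salaries = [18, 22, 25, 31, 34, 38, 41, 47, 55, 60]
print(histogram(salaries, low=10, high=60, bins=5))
```

Because adjacent intervals share their boundaries, the bars touch; this is the "no gaps" property for a continuous variable.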
Scatter Plot • Used to present measurements of two variables • Effective if a relationship exists between the two variables • Example taken from NIST Handbook – evidence of strong positive correlation [Example figure: Car ownership by household income]
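One common way to put a number on the "strong positive correlation" that a scatter plot suggests is the Pearson correlation coefficient. The slides do not name a specific measure, so this choice is an assumption, and `pearson_r` and the sample data are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired measurements xs and ys.

    Returns a value in [-1, 1]; values near +1 indicate a strong
    positive linear relationship.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data, for illustration only.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

A coefficient only captures linear relationships, which is one reason to look at the scatter plot itself rather than the number alone.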
A Visualization Guru Edward Tufte has written a series of books on the design of good visualizations Visit: http://www.edwardtufte.com/tufte/ Here are some of the things he teaches us…
Tufte Design Principles • “Give the viewer the greatest number of ideas in the shortest space of time using the least ink in the smallest space” • Try to maximize the data-ink ratio • Show data variation, not design variation • Tell the truth about the data
Data Ink • Data Ink Ratio = (data-ink) / (total ink to produce graphic) • = proportion of ink devoted to non-redundant display of information • = 1.0 – proportion of graphic that can be deleted without loss of data-information [Example figure: a low value of data ink ratio!]
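The two forms of the ratio above are equivalent, as a small worked example suggests. The "ink" amounts here are arbitrary illustrative units (say, pixels drawn), and the function name is an assumption.

```python
def data_ink_ratio(data_ink, total_ink):
    """Proportion of the graphic's ink that displays non-redundant data."""
    if total_ink <= 0 or not 0 <= data_ink <= total_ink:
        raise ValueError("need 0 <= data_ink <= total_ink and total_ink > 0")
    return data_ink / total_ink

# Hypothetical graphic: 120 units of ink in total, 30 of which carry data.
total = 120
data = 30
deletable = total - data            # ink that can go without losing information

ratio = data_ink_ratio(data, total)
same_ratio = 1.0 - deletable / total
print(ratio, same_ratio)
```

Removing decoration lowers the total ink while the data ink stays fixed, so the ratio moves toward 1.0.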
Exercise How much can be removed from this graphic? Answer at: http://home.ched.coventry.ac.uk/Volume/vol0/dataink.htm
Design Variation • Fundamental purpose of a graph is to show changes in the data • Design variation, where the same data is displayed differently for decoration, is to be avoided • Leads to ambiguity and deception [Example figure: What is wrong with this?]
Lie Factor • Lie Factor = (Size of effect on graph) / (Size of effect on data) [Example figure: Spot the lie!]
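A minimal sketch of the lie factor, assuming "size of effect" is measured as the relative change between two values (the slide does not spell this out). The function names and numbers are hypothetical.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)

def lie_factor(graph_first, graph_second, data_first, data_second):
    """Ratio of the change shown in the graphic to the change in the data.

    A value close to 1 means the graphic represents the data faithfully;
    values well above 1 exaggerate the effect, values below 1 understate it.
    """
    return effect_size(graph_first, graph_second) / effect_size(data_first, data_second)

# Hypothetical example: the data rise from 100 to 150 (a 50% increase),
# but the bars are drawn 2 cm and 6 cm tall (a 200% increase).
print(lie_factor(2, 6, 100, 150))
```

A truncated axis is a typical way to end up with a large lie factor: the drawn lengths change far more than the underlying values do.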
Summary • Use the correct type of graph • Line graph for response against continuous control • Bar chart when control is categorical • Pie chart when viewing as proportions • Histograms when aggregating over intervals • Scatter plots to see relationships between two variables • Remember Tufte’s principles when creating a graphic • Thanks to Statistics Canada – an excellent web site for simple data presentation • http://www.statcan.ca/english/edu/power/toc/contents.htm
Scientific Visualization Data defined over 2D regions and 3D volumes
Data over 2D Region - Contouring • In contouring we are extracting lines of constant ‘height’ from data defined over a 2D region… sometimes called isolines • What is the analogy for data defined over a 3D volume? [Figure: topographic map with isohypses of height (Wikipedia)]
Isosurfacing • The analogy for 3D data is the isosurface: points where the measurements have a constant value… • Here we see the surface of a brain extracted from a 3D medical dataset • What limitations do you notice compared with contours in 2D? [Figure source: http://www.csit.fsu.edu/~futch/iso/]
Marching Cubes • Famous isosurfacing algorithm is marching cubes • Each cube processed in turn • For zero isosurface, create surface separating positive and negative vertices of cube • After every cube has been processed we have a surface (or surfaces) separating all positive vertices from all negative ones
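The per-cube step can be sketched as two small pieces: classify the eight vertices as positive or negative to get one of 256 cases, and place surface points on edges whose endpoints have opposite signs by linear interpolation. This is not a full marching cubes implementation: the triangle lookup table is omitted, the vertex ordering is an arbitrary assumption, and the function names are illustrative.

```python
def cube_case(vertex_values, iso=0.0):
    """Return an 8-bit case index for one cube.

    Bit k is set when vertex k lies above the isovalue ("positive").
    Index 0 or 255 means the surface does not pass through this cube.
    """
    index = 0
    for bit, value in enumerate(vertex_values):
        if value > iso:
            index |= 1 << bit
    return index

def edge_crossing(p0, p1, v0, v1, iso=0.0):
    """Point on the edge p0-p1 where the value equals iso.

    Assumes v0 and v1 lie on opposite sides of iso and that the data
    varies linearly along the edge.
    """
    t = (iso - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))

# Hypothetical cube: only vertex 0 is positive, so the surface cuts off one corner.
values = [1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0]
print(cube_case(values))
print(edge_crossing((0, 0, 0), (1, 0, 0), -1.0, 3.0))
```

In a complete implementation the case index selects, from a precomputed table, which edges carry triangle vertices for that cube.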
Lobster – Increasing the Threshold Level From University of Bonn
Isosurfacing by Marching Cubes Algorithm • Advantages • isosurfaces good for extracting boundary layers • surface defined as triangles in 3D - well-known rendering techniques available for lighting, shading and viewing ... with hardware support • Disadvantages • shows only a slice of data
Example – mechanical engineering Isosurfacing can be applied to rendering of objects… here an engine (Computer Science, UC Davis)
Example – medical application Vertebrae… also from UC Davis
Image Presentation of Data over 2D Region Note here that in addition to the contour lines the height of each ‘dot’ is individually coloured – so there is a mapping from ‘height’ to colour … this is known as a transfer function. What is the analogy in 3D?
Volume Rendering • The analogy in 3D is known as volume rendering • To overcome the step to 3D, we transfer values to colour and opacity • Volume is a partially opaque gel material • By controlling the opacity, we can: • EITHER show surfaces through setting opacity to 0 everywhere except at a specific value where it is set to 1 • OR see both exterior and interior regions by grading the opacity from 0 to 1 [Note: opacity = 1 - transparency]
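The EITHER/OR choice above can be sketched as two opacity functions. The tolerance band in the surface case is an added assumption (with sampled data an exact match at a single value is rarely hit), and all names and numbers are illustrative.

```python
def surface_opacity(value, target, tolerance=0.5):
    """Opacity 1 near one specific value and 0 elsewhere: shows a surface."""
    return 1.0 if abs(value - target) <= tolerance else 0.0

def graded_opacity(value, vmin, vmax):
    """Opacity rising linearly from 0 at vmin to 1 at vmax.

    Low values stay nearly transparent, so interior regions remain visible
    through the exterior.
    """
    t = (value - vmin) / (vmax - vmin)
    return max(0.0, min(1.0, t))

# Hypothetical data values in the range 0..200, for illustration only.
for v in (0, 50, 100, 200):
    print(v, surface_opacity(v, target=100), graded_opacity(v, 0, 200))
```

Since opacity = 1 - transparency, a graded opacity of 0.25 lets three quarters of the light from behind pass through.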
Data Classification – Assigning Opacity to CT data • CT will identify fat, soft tissue and bone • Each will have known absorption levels, say f_fat, f_soft_tissue, f_bone • This transfer function will highlight soft tissue [Figure: opacity α (0 to 1) plotted against CT value, with f_soft_tissue marked]
Data Classification – Assigning Opacity to CT Data • To show all types of tissue, we assign opacities to each type and linearly interpolate between them • In practice, α is also increased in areas where data changes rapidly; this accentuates boundaries [Figure: opacity α (0 to 1) plotted against CT value, with f_fat, f_soft_tissue and f_bone marked]
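Assigning an opacity to each tissue type and interpolating linearly between them amounts to a piecewise-linear transfer function. In this sketch the control points (CT values and opacities for fat, soft tissue and bone) are hypothetical placeholders, not calibrated values, and the boundary-accentuating gradient term mentioned above is left out.

```python
def opacity(ct_value, control_points):
    """Piecewise-linear opacity transfer function.

    control_points is a list of (ct_value, opacity) pairs sorted by CT value.
    Values outside the covered range take the opacity of the nearest end point.
    """
    first_ct, first_alpha = control_points[0]
    last_ct, last_alpha = control_points[-1]
    if ct_value <= first_ct:
        return first_alpha
    if ct_value >= last_ct:
        return last_alpha
    for (f0, a0), (f1, a1) in zip(control_points, control_points[1:]):
        if f0 <= ct_value <= f1:
            t = (ct_value - f0) / (f1 - f0)
            return a0 + t * (a1 - a0)

# Hypothetical (CT value, opacity) pairs standing in for fat, soft tissue and bone.
tissue_points = [(-100, 0.0), (100, 0.5), (500, 1.0)]
for ct in (-200, 0, 100, 300, 800):
    print(ct, opacity(ct, tissue_points))
```

Setting every opacity except the soft-tissue one to 0 would recover a transfer function that highlights soft tissue alone.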
Data Classification – Constructing the Gel – CT Data • Colour classification is done similarly • Known as colour transfer function [Figure: colour (yellow, red, white) plotted against CT number, with Air, Fat, Soft Tissue and Bone marked]
Volume Rendering Cerebral aneurysm Marcelo Cohen
Volume Rendering Tooth, engine, woman – Marcelo Cohen
Isosurface and Volume Rendering Storm cloud data rendered by IRIS Explorer – Isosurface & volume rendering
Summary • Scientific visualization allows us to understand data defined over 2D and 3D regions • Traditional 2D methods have been generalised to 3D: • Contouring – isosurfacing • Image representation – volume rendering • Excellent new text book • Helen Wright • Introduction to Scientific Visualization – Springer Verlag