
A Rough Guide to Data Visualization



Presentation Transcript


  1. A Rough Guide to Data Visualization VizNET 2007 Annual Event Ken Brodlie School of Computing University of Leeds

  2. Data Visualization • Visualization now seen as key part of modern computing • High performance computing generates vast quantities of data ... • High resolution measurement technology likewise ... • microscopes, scanners, satellites • Information systems involve not only large data sets but also complex connections... • ... we need to harness our visual senses to help us understand the data

  3. Data Visualization – What is it? • [Diagram labels: Reality; Observation; Simulation; Data; Visualization; Images, animation]

  4. Applications - Meteorology • Pressure at levels in atmosphere, illustrated by contour lines in a slice plane • Generated by the Vis5D system from University of Wisconsin (now Vis5d+) • Vis5d: http://www.ssec.wisc.edu/~billh/vis5d.html • Vis5d+: http://vis5d.sourceforge.net

  5. Applications - Medicine • From scanner data, we can visualize 3D pictures of human anatomy, using volume rendering • Generated by Anatomy.TV, used by Leeds medical students to learn anatomy

  6. Applications – Computational Fluid Dynamics • Interface between immiscible fluids, e.g. oil / water • Loops and fingers arise when mixing starts • Rayleigh-Taylor instability • Simulated on ASCI Blue Pacific (Cook & Dimotakis, 2001) • Interface visualized using a density isosurface

  7. Applications – Hierarchical Information • Usenet news groups • For history of treemaps see: www.cs.umd.edu/hcil/treemap-history • Developed over many years by Ben Shneiderman and colleagues

  8. Structure of Session Part 1 • Introduction • What is visualization and some examples • The humble graph • Much to learn • Scientific visualization • Understanding 2D and 3D data Part 2 • Exploratory data visualization • Finding relationships in tables of data • Visualizing structures • Information hierarchies • Interacting with visualizations • Focus and context

  9. The Humble Graph

  10. The First Visualization This picture is taken from Brian Collins ‘Data Visualization - Has it all been seen before?’ in ‘Animation and Scientific Visualization’, Academic Press

  11. Simple Data Presentation • Simple data tables are often presented as line graphs, bar graphs, pie charts, dot graphs, histograms… • Which should we use and when?

  12. Line Graph • Fundamental technique of data presentation • Used to compare two continuous variables • X-axis is often the control variable • Y-axis is the response variable • Good at: predicting values where data are not given • Often (dubiously) used for trends when control is a categorical variable • [Example chart: Students participating in sporting activities]

  13. Simple Representations – Bar Graph • Bar graph presents categorical variables • Height of bar indicates value • Double bar graph allows comparison • Note spacing between bars • Can be horizontal (when would you use this?) • [Example charts: Number of police officers; Internet use at a school]

  14. Dot Graph • Very simple but effective… • Horizontal to give more space for labelling

  15. Pie Chart • Pie chart summarises a set of categorical/nominal data • Shows proportions • But use with care… too many segments are harder to compare than in a bar chart • [Example charts: Should we have a long lecture?; Favourite movie genres]

  16. Histograms • Histograms summarise discrete or continuous data that are measured on an interval scale • No gaps if variable is continuous • [Example chart: Distribution of salaries in a company]

  17. Scatter Plot • Used to present measurements of two variables • Effective if a relationship exists between the two variables • Example taken from NIST Handbook: evidence of strong positive correlation • [Example chart: Car ownership by household income]

  18. A Visualization Guru • Edward Tufte has written a series of books on the design of good visualizations • Visit: http://www.edwardtufte.com/tufte/ • Here are some of the things he teaches us…

  19. Tufte Design Principles • “Give the viewer the greatest number of ideas in the shortest space of time using the least ink in the smallest space” • Try to maximize the data-ink ratio • Show data variation, not design variation • Tell the truth about the data

  20. Data Ink • Data Ink Ratio = (data-ink) / (total ink to produce graphic) • = proportion of ink devoted to non-redundant display of information • = 1.0 – proportion of graphic that can be deleted without loss of data-information • [Example graphic: a low value of data ink ratio!]
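
As a rough worked example of the ratio on this slide, the sketch below treats "ink" as a count of drawn pixels. The function name and the pixel counts are hypothetical, chosen only to illustrate the arithmetic; they do not come from the talk.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Return data-ink / total ink, following the definition on the slide."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical chart: 1200 pixels draw the data itself, 4800 pixels are drawn
# in total (gridlines, frame, shading and decoration included).
ratio = data_ink_ratio(1200, 4800)
erasable = 1.0 - ratio  # proportion that could be deleted without losing data

print(f"data-ink ratio = {ratio:.2f}, erasable proportion = {erasable:.2f}")
```

A ratio close to 1.0 means almost every mark carries data; the hypothetical chart here devotes only a quarter of its ink to data.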

  21. Exercise • How much can be removed from this graphic? • [Graphic labels: 1 2 3 4 5 6] • Answer at: http://home.ched.coventry.ac.uk/Volume/vol0/dataink.htm

  22. Design Variation • Fundamental purpose of a graph is to show changes in the data • Design variation, where the same data is displayed differently for decoration, is to be avoided • Leads to ambiguity and deception • [Example graphic: What is wrong with this?]

  23. Lie Factor • Lie Factor = (Size of effect on graph) / (Size of effect on data) • [Example graphic: Spot the lie!]
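
The slide gives only the ratio. The sketch below assumes Tufte's usual convention that each "size of effect" is a relative change, |second - first| / |first|. The function names and the numbers are hypothetical.

```python
def relative_change(first: float, second: float) -> float:
    """Size of effect as a relative change (an assumed convention)."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graph_first: float, graph_second: float,
               data_first: float, data_second: float) -> float:
    """Lie factor = (size of effect on graph) / (size of effect on data)."""
    return (relative_change(graph_first, graph_second)
            / relative_change(data_first, data_second))


# Hypothetical graphic: the data rise from 10 to 15 (a 50% increase), but the
# bars that represent them grow from 2.0 cm to 5.0 cm (a 150% increase).
factor = lie_factor(2.0, 5.0, 10.0, 15.0)
print(f"lie factor = {factor:.1f}")
```

A factor near 1 means the graphic represents the change faithfully; here the drawn effect is three times the effect in the data.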

  24. Summary • Use the correct type of graph • Line graph for response against continuous control • Bar chart when control is categorical • Pie chart when viewing as proportions • Histograms when aggregating over intervals • Scatter plots to see relationships between two variables • Remember Tufte’s principles when creating a graphic • Thanks to Statistics Canada – an excellent web site for simple data presentation • http://www.statcan.ca/english/edu/power/toc/contents.htm

  25. Scientific Visualization Data defined over 2D regions and 3D volumes

  26. Data over 2D Region - Contouring • In contouring we are extracting lines of constant ‘height’ from data defined over a 2D region… sometimes called isolines • What is the analogy for data defined over a 3D volume? • [Image: topographic map with isohypses of height, from Wikipedia]
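
One common way to extract such a line from gridded data is to look at each grid cell and find where the contour level crosses its edges by linear interpolation. The slide does not say which method was used, so the sketch below is only an illustration; the cell heights and function name are hypothetical.

```python
def edge_crossing(p0, p1, h0, h1, level):
    """Point on the edge p0-p1 where the linearly interpolated height equals
    `level`, or None if both ends lie on the same side of the level."""
    if (h0 - level) * (h1 - level) > 0 or h0 == h1:
        # Same side of the level, or a flat edge (treated as no crossing here).
        return None
    t = (level - h0) / (h1 - h0)
    return (p0[0] + t * (p1[0] - p0[0]), p0[1] + t * (p1[1] - p0[1]))


# One unit grid cell, corners listed anticlockwise with hypothetical heights.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
heights = [10.0, 30.0, 40.0, 20.0]
level = 25.0

crossings = []
for i in range(4):
    j = (i + 1) % 4  # next corner, wrapping around to close the cell
    point = edge_crossing(corners[i], corners[j], heights[i], heights[j], level)
    if point is not None:
        crossings.append(point)

# Two crossings: joining them gives this cell's piece of the 25.0 contour.
print(crossings)
```

Repeating this for every cell and joining the pieces gives the full isoline; cells with four crossings are ambiguous and need an extra rule, which this sketch leaves out.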

  27. Isosurfacing • The analogy for 3D data is the isosurface: points where the measurements have a constant value… • Here we see surface of brain extracted from a 3D medical dataset • What limitations do you notice compared with contours in 2D? • http://www.csit.fsu.edu/~futch/iso/

  28. Marching Cubes • Famous isosurfacing algorithm is marching cubes • Each cube processed in turn • For zero isosurface, create surface separating positive and negative vertices of cube • Once every cube has been processed we have a surface (or surfaces) separating all positive vertices from all negative ones
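
The classification step of marching cubes can be sketched as below: each cube gets an 8-bit case index from the signs of its eight vertices, and only cubes with mixed signs contain part of the surface. This is a partial sketch under stated assumptions: the corner ordering, the function names and the spherical test field are hypothetical, and the 256-entry triangle lookup table that a full implementation uses to build the surface is omitted.

```python
from itertools import product

# Corner offsets of one cube; this ordering is an assumption of the sketch.
CORNERS = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
           (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]


def cube_case(values, level=0.0):
    """8-bit index: bit i is set when corner i lies below the level
    (i.e. is negative for the zero isosurface)."""
    index = 0
    for i, value in enumerate(values):
        if value < level:
            index |= 1 << i
    return index


def crossed_cubes(field, n, level=0.0):
    """Process each cube in turn; yield (origin, case) for every cube whose
    corners are not all on the same side of the level."""
    for i, j, k in product(range(n - 1), repeat=3):
        values = [field[i + di][j + dj][k + dk] for di, dj, dk in CORNERS]
        case = cube_case(values, level)
        if case not in (0, 255):
            yield (i, j, k), case


# Hypothetical test data: signed distance-like field for a sphere of radius
# 1.5 centred in a 5 x 5 x 5 grid (negative inside, positive outside).
n = 5
field = [[[(x - 2) ** 2 + (y - 2) ** 2 + (z - 2) ** 2 - 1.5 ** 2
           for z in range(n)] for y in range(n)] for x in range(n)]

crossed = dict(crossed_cubes(field, n))
print(f"{len(crossed)} of {(n - 1) ** 3} cubes contain part of the surface")
```

In a full implementation the case index selects which cube edges are cut, and the cut points are placed by linear interpolation along those edges before being joined into triangles.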

  29. Lobster – Increasing the Threshold Level • From University of Bonn

  30. Isosurfacing by Marching Cubes Algorithm • Advantages • isosurfaces good for extracting boundary layers • surface defined as triangles in 3D - well-known rendering techniques available for lighting, shading and viewing ... with hardware support • Disadvantages • shows only a slice of data

  31. Example – mechanical engineering • Isosurfacing can be applied to rendering of objects… here an engine • Computer Science, UC Davis

  32. Example – medical application • Vertebrae… also from UC Davis

  33. Example – Heart Modelling

  34. Image Presentation of Data over 2D Region • Note here that, in addition to the contour lines, each ‘dot’ is individually coloured according to its height, so there is a mapping from ‘height’ to colour; this is known as a transfer function • What is the analogy in 3D?

  35. Volume Rendering • The analogy in 3D is known as volume rendering • To overcome the step to 3D, we transfer values to colour and opacity • Volume is a partially opaque gel material • By controlling the opacity, we can: • EITHER show surfaces through setting opacity to 0 everywhere except at a specific value where it is set to 1 • OR see both exterior and interior regions by grading the opacity from 0 to 1 [Note: opacity = 1 - transparency]
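
The slide does not say how the "gel" is turned into an image. One standard approach, assumed here, is front-to-back alpha compositing of samples along each viewing ray. The sketch uses a single grey level per sample instead of full colour, and all sample values are hypothetical.

```python
def composite_ray(samples):
    """Front-to-back compositing of (grey_level, opacity) samples along a ray.

    Returns the accumulated grey level and accumulated opacity.
    """
    colour = 0.0
    alpha = 0.0
    for grey, opacity in samples:
        # Each sample is dimmed by whatever lies in front of it.
        colour += (1.0 - alpha) * opacity * grey
        alpha += (1.0 - alpha) * opacity
        if alpha >= 1.0:
            break  # fully opaque: nothing behind can be seen
    return colour, alpha


# EITHER: opacity 0 everywhere except at one value, where it is 1.
# The ray shows only that surface sample.
surface_like = composite_ray([(0.2, 0.0), (0.9, 1.0), (0.5, 0.0)])

# OR: graded opacity, so both the front and the interior contribute.
gel_like = composite_ray([(1.0, 0.5), (0.0, 0.5)])

print(surface_like, gel_like)
```

In the first case the result is exactly the grey level of the opaque sample; in the second, the front sample contributes half the final value and the ray remains partly transparent.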

  36. Data Classification – Assigning Opacity to CT Data • CT will identify fat, soft tissue and bone • Each will have known absorption levels, say f_fat, f_soft_tissue, f_bone • This transfer function will highlight soft tissue • [Graph: opacity α, from 0 to 1, against CT value, with f_soft_tissue marked]

  37. Data Classification – Assigning Opacity to CT Data • To show all types of tissue, we assign opacities to each type and linearly interpolate between them • In practice, α is also increased in areas where data changes rapidly; this accentuates boundaries • [Graph: opacity α, from 0 to 1, against CT value, with f_fat, f_soft_tissue and f_bone marked]
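
A piecewise-linear opacity transfer function of this kind can be sketched as follows. The CT values chosen for fat, soft tissue and bone and the opacities assigned to them are hypothetical placeholders, not calibrated values, and clamping outside the range is an assumption of the sketch. The gradient-based boost mentioned on the slide is not included.

```python
from bisect import bisect_right


def make_opacity_tf(control_points):
    """Build an opacity transfer function from (ct_value, opacity) pairs,
    linearly interpolating between neighbouring control points."""
    points = sorted(control_points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def opacity(f):
        # Outside the control points, hold the end values (an assumption).
        if f <= xs[0]:
            return ys[0]
        if f >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, f) - 1  # segment with xs[i] <= f < xs[i + 1]
        t = (f - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return opacity


# Hypothetical absorption levels and opacities for the three tissue types.
F_FAT, F_SOFT_TISSUE, F_BONE = 100.0, 200.0, 400.0
opacity = make_opacity_tf([(F_FAT, 0.2), (F_SOFT_TISSUE, 0.4), (F_BONE, 1.0)])

for value in (50.0, 150.0, 200.0, 300.0, 500.0):
    print(value, round(opacity(value), 3))
```

Setting the fat and bone opacities to 0 while keeping a non-zero opacity at the soft-tissue value would instead give a function that highlights soft tissue only.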

  38. Data Classification – Constructing the Gel – CT Data • Colour classification is done similarly • Known as colour transfer function • [Graph labels: colours white, red, yellow against CT number; tissue types Air, Fat, Soft Tissue, Bone]

  39. Volume Rendering • Cerebral aneurysm • Marcelo Cohen

  40. Volume Rendering • Tooth, engine, woman • Marcelo Cohen

  41. Isosurface and Volume Rendering • Storm cloud data rendered by IRIS Explorer, using isosurface and volume rendering

  42. Summary • Scientific visualization allows us to understand data defined over 2D and 3D regions • Traditional 2D methods have been generalised to 3D: • Contouring – isosurfacing • Image representation – volume rendering • Excellent new text book • Helen Wright • Introduction to Scientific Visualization – Springer Verlag
