Information Visualization

Information Visualization Yaji Sripada

In this lecture you learn
• What is information visualization?
• What are the different techniques used for visualizing different types of data?
• Designing visualizations
• Basic concepts in Java 2D
Dept. of Computing Science, University of Aberdeen

Introduction
• Visualization is the use of graphical techniques to communicate information and support reasoning or analysis
• Visualizations are cost-effective because they exploit
  • powerful human visual processing capabilities and
  • high-quality graphics created at low cost
• Two kinds of visualization
  • Scientific Visualization
  • Information Visualization

Scientific Visualization
• Visual modelling of scientific data using computer graphics
• Examples
  • In our department
    • The HEX program develops visualizations of protein docking
    • http://www.csd.abdn.ac.uk/hex/gallery/
    • Molecular structural data is hard to understand without visualization
  • Laboratory of Neuro Imaging, UCLA
    • Visualization of brain models
    • http://www.loni.ucla.edu/SVG/
• The focus is
  • on modelling the input data visually, as close to reality as possible
  • not on presenting abstractions or relationships from the input data

Information Visualization (IV)
• Visual presentation of abstractions or relationships underlying input data
• IV has two goals
  • Communication
    • to communicate a rich message
  • Problem solving / reasoning / analysis
    • to display a large amount of information so that reasoning can uncover new facts or relationships
• Limited screen sizes pose a serious challenge for using IV on very large data sets
• Therefore the main task is to pack a large amount of information into a simple graphic
  • highlighting all the required (important) information
  • Creative art?

Message Communication - Example
• Napoleon’s 1812 campaign in Russia
• Input data
  • Size of the army
    • at the start of the campaign = 422,000
    • at the end of the campaign = 10,000
  • Location of the army (2 dimensions)
  • Direction of the army’s movement
  • Temperature
  • Time

Minard’s Drawing
• Created in 1869 by the French engineer Charles Joseph Minard

Minard’s Drawing (2)
• Often described as one of the best statistical graphics ever produced
• An inspiration for modern IV researchers
• Plots the data for all six input variables
• Clearly shows the message underlying the input data
  • Gradual reduction in the size of the army
  • Linked to the gradual fall in temperature
• The input data is complex
  • Yet the most important information is abstracted out and presented in a simple graphic

Problem Solving - Example
• London cholera epidemic of 1854
• At that time there were two hypotheses about the cause of cholera:
  • Cholera is related to miasmas concentrated in the swampy areas of the city
  • Cholera is related to ingestion of contaminated water
• Input data
  • Locations of deaths due to cholera
  • Locations of water pumps

Dr Snow’s Cholera Map
• Dots locate deaths due to cholera
• Crosses locate water pumps

Dr Snow’s Cholera Map (2)
• Plotting the input data on the map helped Dr Snow
  • to detect the epicentre of the epidemic
  • close to a pump on Broad Street
• Considered a classic case of visualization helping reasoning with data

Design & Technology
• There are two requirements for developing visualizations
  • Graphic design
    • Mapping information (raw or filtered) into a graphic
    • Mapping data/information to display variables
      • Position, orientation, size, motion, colour etc.
  • Technology
    • Achieving the design programmatically
    • Graphics programming, Flash programming etc.

Graphic Design
• Mapping
  • data to some graphical element
    • such as a cross
  • data attributes to the attributes of the graphical element
    • such as colour, size, shape etc.
• Order of priority for representing quantitative data
  • Position
  • Length
  • Orientation
  • Size
  • Colour
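
The highest-priority mapping above, data value to position, can be sketched as a linear scale. This is a minimal illustration under stated assumptions: the class name LinearScale and its parameters are hypothetical and do not come from the lecture.

```java
/**
 * Hypothetical helper: maps a quantitative data value onto a display
 * variable (here a pixel position) by linear interpolation.
 */
class LinearScale {
    private final double dataMin, dataMax, pixelMin, pixelMax;

    LinearScale(double dataMin, double dataMax, double pixelMin, double pixelMax) {
        if (dataMin == dataMax) {
            throw new IllegalArgumentException("data range must not be empty");
        }
        this.dataMin = dataMin;
        this.dataMax = dataMax;
        this.pixelMin = pixelMin;
        this.pixelMax = pixelMax;
    }

    /** Returns the pixel position that represents the given data value. */
    double toPixel(double value) {
        double t = (value - dataMin) / (dataMax - dataMin); // 0..1 inside the data range
        return pixelMin + t * (pixelMax - pixelMin);
    }
}
```

The same interpolation could drive length or size instead of position; only the meaning of the output range changes.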

Inputs to the design process
• Data - size and data type
• User task
• User characteristics
• System resources - PC vs graphics workstation
• Standards/guidelines

Designing Information Visualizations
• Gospel-like guidelines
  • If the underlying data is simple, keep the graphic simple
  • If the underlying data is complex, make the graphic look simple (e.g., Minard’s graphic)
  • Always tell the truth - do not distort the data
  • Maximize the data-ink ratio (Edward Tufte, www.edwardtufte.com)
    • Data-ink ratio = data-ink / total ink used on the graphic

Visual Information Seeking Mantra
• Modern visualizations are highly interactive
  • Users wish to seek information visually and interactively
• The Visual Information Seeking Mantra recommends designing interfaces using the following guideline
  “Overview first, zoom and filter, then details on demand”
• Details of the mantra are given in the Task by Data Type Taxonomy (TTT) proposed by Prof. Shneiderman, HCI Lab, University of Maryland (UMD)
• TTT is a framework for organizing visualizations. It involves
  • 7 tasks and
  • 7 data types

7 Tasks
• The 7 interactive tasks users wish to perform:
  • Overview: Gain an overview of the entire collection.
  • Zoom: Zoom in on items of interest.
  • Filter: Filter out uninteresting items.
  • Details-on-demand: Select an item or group and get details when needed.
  • Relate: View relationships among items.
  • History: Keep a history of actions to support undo, replay, and progressive refinement.
  • Extract: Allow extraction of sub-collections and of the query parameters.

7 Data Types
• 1D Linear
• 2D Map
• 3D World
• Multi-dimensional
• Temporal
• Tree
• Network

Visualization of Linear Data
• Long lists of items
  • E.g. long lists of menu items and
  • software code listings etc.
• Bifocal (or fisheye) displays
  • E.g. Fisheye menus developed by HCI Lab, UMD
  • http://www.cs.umd.edu/hcil/fisheyemenu/
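
The fisheye idea for a long list can be illustrated with a simple distortion function: items near the focus are shown large, and items further away shrink down to a minimum size. The sketch below is an assumption made for illustration only; it is not the algorithm used by the UMD Fisheye menus, and the class name, the linear fall-off and the step of 2 points per item are all hypothetical.

```java
/** Hypothetical fisheye sizing for a one-dimensional list of items. */
class FisheyeMenu {

    /** Font size for one item: largest at the focus, shrinking with distance. */
    static int fontSize(int index, int focus, int maxSize, int minSize) {
        int distance = Math.abs(index - focus);
        return Math.max(minSize, maxSize - 2 * distance); // assumed linear fall-off
    }

    /** Font sizes for every item in a list of the given length. */
    static int[] sizes(int itemCount, int focus, int maxSize, int minSize) {
        int[] result = new int[itemCount];
        for (int i = 0; i < itemCount; i++) {
            result[i] = fontSize(i, focus, maxSize, minSize);
        }
        return result;
    }
}
```

Moving the focus with the mouse and recomputing the sizes gives the characteristic effect: the whole list stays on screen, but only the neighbourhood of the focus is readable in detail.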

SeeSoft display of software code

Visualizing Map Data
• Geographic information systems (GIS) are used to visualize map data
  • E.g. Google Maps or Google Earth
  • http://maps.google.com/
• A GIS presents layers of information on a geographic map
• A GIS supports
  • spatial querying and
  • spatial data analysis

Visualizing 3D Data
• Complex trees and networks are visualized using 3D graphics
• Largely used in scientific visualization

Visualizing Temporal Data
• Traditionally, time series are visualized using trend graphs and seasonality graphs
• A time series can be expressed in terms of its trend and seasonality components
  • Data = trend + seasonal + remainder
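
The decomposition Data = trend + seasonal + remainder can be sketched with a classical additive decomposition: a centred moving average estimates the trend, and the average detrended value at each position in the cycle estimates the seasonal component. This is one common approach, not necessarily the method behind the lecture's graphs; the class and method names are hypothetical, and the sketch assumes an odd window equal to the period.

```java
import java.util.Arrays;

/** Hypothetical sketch of an additive time series decomposition. */
class AdditiveDecomposition {

    /** Centred moving average; positions where the window does not fit are NaN. */
    static double[] trend(double[] data, int window) {
        double[] t = new double[data.length];
        Arrays.fill(t, Double.NaN);
        int half = window / 2; // assumes an odd window
        for (int i = half; i < data.length - half; i++) {
            double sum = 0;
            for (int j = i - half; j <= i + half; j++) {
                sum += data[j];
            }
            t[i] = sum / window;
        }
        return t;
    }

    /** Mean detrended value per position in the cycle, centred to sum to zero. */
    static double[] seasonal(double[] data, double[] trend, int period) {
        double[] sums = new double[period];
        int[] counts = new int[period];
        for (int i = 0; i < data.length; i++) {
            if (!Double.isNaN(trend[i])) {
                sums[i % period] += data[i] - trend[i];
                counts[i % period]++;
            }
        }
        double[] means = new double[period];
        double total = 0;
        for (int p = 0; p < period; p++) {
            means[p] = counts[p] == 0 ? 0 : sums[p] / counts[p];
            total += means[p];
        }
        double[] s = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            s[i] = means[i % period] - total / period;
        }
        return s;
    }

    /** Whatever is left: data - trend - seasonal (NaN where the trend is NaN). */
    static double[] remainder(double[] data, double[] trend, double[] seasonal) {
        double[] r = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            r[i] = data[i] - trend[i] - seasonal[i];
        }
        return r;
    }
}
```

Plotting the three returned arrays one above the other gives the familiar trend, seasonality and remainder panels.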

Trend and Seasonality in Time Series

LifeLines
• Visualization of computerised medical records
• For a patient
  • Horizontal lines (time lines) represent medical problems, hospitalizations and medications
  • Icons on these lines represent events such as tests and physician consultations
• All the patient information is fitted into one screen

Screenshot of LifeLines

Multi-dimensional data
• Example - records in a relational database
• Two solutions
  • Plot all possible pairs of variables as 2D scatter plots
    • Simple, but does not help to visualize the data as a whole
  • Parallel coordinates
    • A novel way of plotting multi-dimensional data proposed by Alfred Inselberg

Parallel Coordinates
• One vertical axis per dimension, with all axes drawn in parallel
• Each data point (record) is represented by a polyline that connects its values on successive axes
• Revolutionary representation for multi-dimensional data
• But users may need a long time to learn how to read the graphs
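
To make the construction concrete, the sketch below computes the polyline for a single record: the k-th vertex lies on the k-th axis, at a height given by the record's normalised value in that dimension. The class name, the even axis spacing and the pixel convention (y grows downwards) are assumptions made for this illustration; it also assumes at least two dimensions and max > min in every dimension.

```java
/** Hypothetical sketch: vertices of one record's polyline in parallel coordinates. */
class ParallelCoordinates {

    /**
     * Returns one {x, y} vertex per dimension.
     * Axes are spread evenly over [0, width]; min[k] maps to the bottom
     * (y = height) and max[k] to the top (y = 0) of axis k.
     */
    static double[][] polyline(double[] record, double[] min, double[] max,
                               double width, double height) {
        int d = record.length;
        if (d < 2) {
            throw new IllegalArgumentException("need at least two dimensions");
        }
        double[][] points = new double[d][2];
        for (int k = 0; k < d; k++) {
            double normalised = (record[k] - min[k]) / (max[k] - min[k]); // 0..1
            points[k][0] = k * width / (d - 1);          // x position of axis k
            points[k][1] = height * (1.0 - normalised);  // y position on axis k
        }
        return points;
    }
}
```

Drawing one such polyline per database record, all over the same set of axes, produces the full parallel coordinates plot.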

Parallel Coordinates

Visualizing Tree Data
• Examples: family trees and file system directories
• Not only the data but also the structure of the data needs to be displayed
• Data sizes grow rapidly as the height of the tree increases
• Windows Explorer offers a visualization of the directory structure in a file system
  • Not all the items in the file system are visible without scrolling and expanding nodes
• TreeMaps display tree data with the help of nested rectangles
  • Child items are rectangles nested inside parent rectangles
  • All the tree data is visible on the screen without scrolling or node expansion
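
The nesting of child rectangles inside parent rectangles can be sketched with the slice-and-dice layout, one of several treemap layout strategies: each node's rectangle is divided among its children in proportion to their sizes, alternating between horizontal and vertical cuts at each level. The class, field and method names below are hypothetical, and the lecture does not say which layout its TreeMap figure uses.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a slice-and-dice treemap layout. */
class SliceAndDiceTreeMap {

    static class Node {
        final String name;
        final double size;                       // used for leaves only
        final List<Node> children = new ArrayList<>();
        Rectangle2D.Double bounds;               // filled in by layout()

        Node(String name, double size) {
            this.name = name;
            this.size = size;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        /** Size of a leaf, or the sum of the sizes below an inner node. */
        double total() {
            if (children.isEmpty()) {
                return size;
            }
            double sum = 0;
            for (Node c : children) {
                sum += c.total();
            }
            return sum;
        }
    }

    /** Assigns a rectangle to the node and, recursively, to all its descendants. */
    static void layout(Node node, Rectangle2D.Double area, boolean horizontal) {
        node.bounds = area;
        double total = node.total();
        double offset = 0;
        for (Node child : node.children) {
            double share = total == 0 ? 0 : child.total() / total;
            Rectangle2D.Double r = horizontal
                    ? new Rectangle2D.Double(area.x + offset, area.y, area.width * share, area.height)
                    : new Rectangle2D.Double(area.x, area.y + offset, area.width, area.height * share);
            offset += horizontal ? r.width : r.height;
            layout(child, r, !horizontal);       // alternate the cut direction
        }
    }
}
```

Because every leaf receives a rectangle inside the starting area, the whole tree fits on the screen at once, which is the property the slide highlights.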

TreeMap of one million items

Visualizing Network Data
• Examples: Internet, Web, roads etc.
• Network = nodes + links
• Issues with network visualization are similar to issues with trees
  • Both have data + structure
• Layout design should ensure
  • minimum link crossings
  • minimum link lengths and
  • minimum link bends
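
Two of the layout criteria above, link crossings and link lengths, can be measured for a given layout as sketched below. The class and method names are hypothetical; links are assumed to be straight lines between node positions, and links that share a node are not counted as crossing.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Point2D;

/** Hypothetical quality measures for a straight-line network layout. */
class LayoutQuality {

    /** Counts pairs of links that intersect without sharing an end node. */
    static int countCrossings(Point2D[] nodes, int[][] links) {
        int crossings = 0;
        for (int i = 0; i < links.length; i++) {
            for (int j = i + 1; j < links.length; j++) {
                int[] a = links[i];
                int[] b = links[j];
                boolean shareNode = a[0] == b[0] || a[0] == b[1] || a[1] == b[0] || a[1] == b[1];
                if (shareNode) {
                    continue; // meeting at a common node is not a crossing
                }
                if (Line2D.linesIntersect(
                        nodes[a[0]].getX(), nodes[a[0]].getY(), nodes[a[1]].getX(), nodes[a[1]].getY(),
                        nodes[b[0]].getX(), nodes[b[0]].getY(), nodes[b[1]].getX(), nodes[b[1]].getY())) {
                    crossings++;
                }
            }
        }
        return crossings;
    }

    /** Sum of the straight-line lengths of all links. */
    static double totalLength(Point2D[] nodes, int[][] links) {
        double sum = 0;
        for (int[] link : links) {
            sum += nodes[link[0]].distance(nodes[link[1]]);
        }
        return sum;
    }
}
```

A layout algorithm could use such measures as a cost function and prefer node positions that lower both numbers.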

Network Visualization

Evaluation
• Graphics are cool!
  • Because visualizations use graphics, they are often inappropriately judged by their ‘coolness’ factor
• Evaluation of a visualization should be based on
  • Controlled user experiments
  • Real-world user experiments
    • Can the users achieve their intended tasks? (performance)
    • What are the error rates in the user performance tests?
    • How long does it take users to achieve their intended task?
    • How long does it take to learn to read the graphic?
    • How much information can the user retain over longer periods?

Evaluation (2)
• User-based evaluations are expensive
  • A cheaper alternative is to evaluate visualizations using standard data sets with known patterns or messages
• No visualization is perfect for all contexts and tasks
  • Evaluation should uncover the conditions under which a visualization works

Accessibility
• Visualizations are of little use to visually impaired users
• Audio-based interfaces are required for blind information seekers
  • iSonic project at HCI Lab, UMD
    • Uses non-speech sounds + speech to help blind users obtain data trends in geo-referenced data
• Atlas.txt project in our department
  • Textual summaries of geo-referenced data for visually impaired users
  • www.csd.abdn.ac.uk/research/atlas
• More efforts are required!

Summary
• Humans can handle large amounts of visual data
• Visualizations exploit computer graphics + the human perceptual system to communicate a message about the underlying information
• Most modern visualizations are interactive
  • offering an overview of the entire data set
  • allowing the user to explore details selectively
• Issues
  • Evaluation
    • Users may require long learning times with many modern complex visualizations
    • A visualization should be effective for some user task, not just look cool
  • Accessibility is restricted to sighted users only

Graphics

Simple 2D Graphics
• Draw a 2D (not photo-realistic) picture
  • Simplest type of graphic
  • Similar to drawing with Paint
• Use the Java 2D API (Graphics2D and the java.awt.geom package)
  • Not terribly well designed
  • Like a lot of Java…

Drawing with Paint
• Create canvas
• Select parameters
  • Edge colour, fill colour
  • Line thickness, (line type)
• Draw shape
  • Rectangle, Ellipse, Line, …
  • Filled or outline
• Draw text (font)
• Transform (flip, rotate, etc.)

Drawing with Java 2D
• Create canvas widget in GUI
  • Usually a new subclass of JPanel
• Set parameters
  • Line thickness, type (e.g., dashed), colour
  • Fill colour, pattern
  • Compositing, clip
• Draw shapes
  • Rectangle, Ellipse, Line, …
  • Filled or outline
• Draw text (font, colour, angle, …)
• Transform (before drawing, not after)

Creating a canvas widget
• Define a subclass of JPanel
  • Directly, or use the JPanel template
• In this subclass, include a paintComponent(Graphics g) method
• In the paintComponent method, include code for drawing graphics
  • Cast the parameter to a Graphics2D object to use the Java 2D API

Draw a square
    import java.awt.*;
    import java.awt.geom.*;
    import javax.swing.*;

    public class gPanel extends JPanel {
        // Called by Swing whenever the panel needs to be drawn
        @Override
        public void paintComponent(Graphics g) {
            super.paintComponent(g);          // paint the background first
            Graphics2D g2 = (Graphics2D) g;   // cast to use the Java 2D API
            // filled 30x30 square with its top-left corner at (20,20)
            g2.fill(new Rectangle2D.Double(20, 20, 30, 30));
        }
    }

Creating in GUI
• In the main GUI, add a panel, then add your graphics panel to this panel (in the constructor, after initComponents()), then call pack()
    public graphicdemo() {
        initComponents();
        jPanel1.add(new gPanel());   // class name must match the panel class exactly
        pack();
    }
• Won’t look right in Design View
• Set the layout of jPanel1 to Border Layout if there are problems
• Easier to create with an IDE such as NetBeans
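
For readers without a GUI builder, the same wiring can be sketched by hand. This is a minimal self-contained illustration, not the lecture's NetBeans project: the names SquareDemo and SquarePanel, the window title, the blue colour and the preferred size of 200x200 are all assumptions.

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

class SquareDemo {

    /** Canvas panel that draws one filled square (hypothetical example class). */
    static class SquarePanel extends JPanel {
        SquarePanel() {
            // gives pack() a size to work with
            setPreferredSize(new Dimension(200, 200));
        }

        @Override
        public void paintComponent(Graphics g) {
            super.paintComponent(g);             // paint the background first
            Graphics2D g2 = (Graphics2D) g;
            g2.setColor(Color.BLUE);
            g2.fill(new Rectangle2D.Double(20, 20, 30, 30));
        }
    }

    public static void main(String[] args) {
        // Swing components should be created on the event dispatch thread
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Graphics demo");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(new SquarePanel());        // the frame's default layout is BorderLayout
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

Running main opens a window with a blue square near the top-left corner; the panel itself never calls paintComponent, Swing does.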

paintComponent
• This routine is called whenever
  • the panel first appears
  • the panel re-appears
    • de-iconified
    • no longer hidden
  • repaint() is called by code
• Code should call repaint() whenever the graphics change
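
The last rule above can be sketched as follows: the panel keeps the state that determines the picture, and every method that changes this state calls repaint(). The class name ResizableSquarePanel and its side field are hypothetical, chosen only for this illustration.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import javax.swing.JPanel;

/** Hypothetical panel whose picture depends on a changeable side length. */
class ResizableSquarePanel extends JPanel {
    private double side = 30; // current side length in pixels

    /** Changes what should be drawn, then asks Swing to schedule a redraw. */
    public void setSide(double side) {
        this.side = side;
        repaint(); // Swing calls paintComponent later; do not call it directly here
    }

    public double getSide() {
        return side;
    }

    @Override
    public void paintComponent(Graphics g) {
        super.paintComponent(g);
        Graphics2D g2 = (Graphics2D) g;
        g2.setColor(Color.RED);
        g2.fill(new Rectangle2D.Double(20, 20, side, side)); // always draws the current state
    }
}
```

A button listener, for instance, would only call setSide(...); the drawing code stays in paintComponent, so the picture is also correct when the window is uncovered or de-iconified.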

Coordinate system
• By default, coordinates and sizes are specified in pixels
• Can set up a “user” coordinate system that is independent of the actual size
  • Always goes from 0 to 100 (for example), regardless of the actual size in pixels
  • Use the scale method

Scale example
    // Alternative 1: establish a 0..100, 0..100 coordinate system
    // with (0,0) at the top-left
    g2.scale(getWidth()/100.0, getHeight()/100.0);

    // Alternative 2: establish a 0..100, 0..100 coordinate system
    // with (0,0) at the bottom-left
    g2.scale(getWidth()/100.0, -getHeight()/100.0);
    g2.translate(0.0, -100.0);
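
The bottom-left variant can be checked without opening a window by building the same transform with AffineTransform, whose scale and translate methods concatenate in the same order as the Graphics2D calls. The class and method names below are hypothetical; the panel size is passed in as parameters instead of getWidth() and getHeight().

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

/** Hypothetical helper that mirrors the bottom-left scale example. */
class UserCoordinates {

    /** Transform from a 0..100 user system with (0,0) at the bottom-left to pixels. */
    static AffineTransform bottomLeftOrigin(int widthPx, int heightPx) {
        AffineTransform t = new AffineTransform();
        t.scale(widthPx / 100.0, -heightPx / 100.0); // flip the y axis
        t.translate(0.0, -100.0);                    // move the origin to the bottom edge
        return t;
    }

    /** Converts one user-space point to pixel coordinates. */
    static Point2D toPixels(AffineTransform t, double x, double y) {
        return t.transform(new Point2D.Double(x, y), null);
    }
}
```

For a 400x300 panel, user point (0,0) lands at pixel (0,300) and user point (100,100) at pixel (400,0). Note that the scale applies to everything drawn afterwards, so line widths are scaled too and text drawn under the flipped y axis appears upside down.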

Coordinate Systems
• Major difference between MS Paint and drawing with Java code
• Becomes more important with more complex graphics
  • 3-D graphics needs a 3-D coordinate system
  • Photo-realism requires specification of the light source and viewer position in the coordinate system
  • Won’t do this here (adds complexity)

Drawing Shapes
• Set parameters
  • Colour, stroke (line thickness, style), etc.
• g2.draw(Shape) -- outline
• g2.fill(Shape) -- filled
• Many shapes (see java.awt.geom)
  • Rectangle2D, Ellipse2D, Line2D, Arc2D, RoundRectangle2D, CubicCurve2D, …
  • These classes are abstract: instantiate the nested .Double (or .Float) variant (strange design)
    • e.g. Rectangle2D.Double
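
The difference between draw and fill can be sketched by rendering into an off-screen BufferedImage, which needs no window. The class name ShapeDemo, the image size, the colours and the stroke width of 4 are assumptions made for this illustration.

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.Ellipse2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

/** Hypothetical demo: one outlined shape and one filled shape. */
class ShapeDemo {

    static BufferedImage render() {
        BufferedImage img = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = img.createGraphics();

        g2.setColor(Color.WHITE);                         // white background
        g2.fillRect(0, 0, 200, 100);

        g2.setColor(Color.BLUE);
        g2.setStroke(new BasicStroke(4f));                // 4-pixel-wide outline
        g2.draw(new Rectangle2D.Double(10, 10, 60, 60));  // outline only

        g2.setColor(Color.RED);
        g2.fill(new Ellipse2D.Double(110, 10, 60, 60));   // filled interior

        g2.dispose();
        return img;
    }
}
```

Inside paintComponent the same draw and fill calls would be made on the Graphics2D object obtained from the cast, instead of on an image.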

Parameters
• setColor -- drawing colour
• setStroke -- line type
• setFont -- font for text
• setComposite -- how overwriting is done
• setTransform -- transformation
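
Several of these setters can be sketched together on an off-screen image. The class name ParameterDemo and all sizes, colours and the dash pattern are assumptions made for this illustration; setFont is left out so that the sketch runs on systems without installed fonts. The sketch uses setTransform only to restore a saved transform, which is the usage the Graphics2D documentation recommends; new transformations are added with rotate, scale or translate.

```java
import java.awt.AlphaComposite;
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Composite;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Line2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

/** Hypothetical demo of colour, stroke, composite and transform settings. */
class ParameterDemo {

    static BufferedImage render() {
        BufferedImage img = new BufferedImage(200, 120, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = img.createGraphics();
        g2.setColor(Color.WHITE);
        g2.fillRect(0, 0, 200, 120);

        // setColor + setStroke: dashed line, 8 pixels on and 4 pixels off
        g2.setColor(new Color(0, 128, 0));
        g2.setStroke(new BasicStroke(3f, BasicStroke.CAP_BUTT, BasicStroke.JOIN_MITER,
                10f, new float[] {8f, 4f}, 0f));
        g2.draw(new Line2D.Double(10, 20, 190, 20));

        // setComposite: 50% transparent red blended over the white background
        Composite savedComposite = g2.getComposite();
        g2.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_OVER, 0.5f));
        g2.setColor(Color.RED);
        g2.fill(new Rectangle2D.Double(120, 60, 60, 40));
        g2.setComposite(savedComposite);

        // transform before drawing, then restore the saved transform
        AffineTransform savedTransform = g2.getTransform();
        g2.rotate(Math.toRadians(45), 40, 90);  // rotate about the point (40,90)
        g2.setColor(Color.BLUE);
        g2.fill(new Rectangle2D.Double(30, 80, 20, 20));
        g2.setTransform(savedTransform);

        g2.dispose();
        return img;
    }
}
```

Saving and restoring the composite and the transform keeps one drawing step from affecting the next, which matters once paintComponent draws more than a couple of shapes.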
