Lecture 09: Data Types II

Lecture 09: Data Types II February 19, 2013 COMP 150-2 Visualization

Admin
• A3 naming convention change
• Questions about A3?
• A4 (Squarified Treemap) out

Visualization Process

Data Definition
• A typical dataset in visualization consists of n records
  • (r_1, r_2, r_3, …, r_n)
• Each record r_i consists of m (m ≥ 1) observations or variables
  • (v_1, v_2, v_3, …, v_m)
• A variable may be either independent or dependent
  • An independent variable (iv) is not controlled or affected by another variable
    • For example, time in a time-series dataset
  • A dependent variable (dv) is affected by variation in one or more associated independent variables
    • For example, temperature in a region
• Formal definition:
  • r_i = (iv_1, iv_2, iv_3, …, iv_mi, dv_1, dv_2, dv_3, …, dv_md)
  • where m = mi + md (mi independent variables and md dependent variables)
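The record definition can be sketched in Python. This is a minimal illustration only: the class name `Record`, its field names, and the sample time/temperature values are assumptions made for the example, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Record:
    """One record r_i: independent variables followed by dependent variables."""
    iv: Tuple[float, ...]  # independent variables (iv_1 .. iv_mi)
    dv: Tuple[float, ...]  # dependent variables (dv_1 .. dv_md)

    @property
    def m(self) -> int:
        # m = mi + md
        return len(self.iv) + len(self.dv)


# Hypothetical time-series dataset: time (hours) is independent,
# temperature (degrees C) is dependent.
dataset = [
    Record(iv=(0.0,), dv=(12.5,)),
    Record(iv=(1.0,), dv=(13.1,)),
    Record(iv=(2.0,), dv=(14.0,)),
]

n = len(dataset)   # number of records
m = dataset[0].m   # variables per record
```

Here n is 3 and m is 2 (mi = 1, md = 1).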

Basic Data Types: Nominal
Def: A set of unordered, non-numeric values
For example:
• Categorical (finite) data: {apple, orange, pear}, {red, green, blue}
• Arbitrary (infinite) data: {“12 Main St. Boston MA”, “45 Wall St. New York NY”, …}, {“John Smith”, “Jane Doe”, …}
Types covered: • Nominal • Ordinal • Scale / Quantitative (Interval, Ratio)

Basic Data Types: Ordinal
Def: A tuple (an ordered set)
For example:
• Numeric: <2, 4, 6, 8>
• Binary: <0, 1>
• Non-numeric: <G, PG, PG-13, R>

Basic Data Types: Scale / Quantitative
Def: A numeric range
• Interval: ordered numeric elements on a scale that can be mathematically manipulated, but cannot be compared as ratios
  • For example: date, current time (Sept 14, 2010 cannot be described as a ratio of Jan 1, 2011)
• Ratio: a scale where there exists an “absolute zero”
  • For example: height, weight
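The interval/ratio distinction can be seen in Python's standard `datetime` module, which allows subtracting two dates but not dividing them. The snippet below is a small sketch; the height values are made up for illustration.

```python
from datetime import date

# Interval data: differences are meaningful.
gap = date(2011, 1, 1) - date(2010, 9, 14)
gap_days = gap.days

# Ratios are not meaningful: Python refuses to divide one date by another.
try:
    date(2011, 1, 1) / date(2010, 9, 14)
    ratio_defined = True
except TypeError:
    ratio_defined = False

# Ratio data: an absolute zero makes "twice as tall" meaningful.
height_a_cm = 180.0  # hypothetical values
height_b_cm = 90.0
height_ratio = height_a_cm / height_b_cm
```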

Basic Data Types (Formal)
• Nominal (N) {…}
• Ordinal (O) <…>
• Scale / Quantitative (Q) […]
• Q → O
  • [0, 100] → <F, D, C, B, A>
• O → N
  • <F, D, C, B, A> → {C, B, F, D, A}
• N → O (??)
  • {John, Mike, Bob} → <Bob, John, Mike>
  • {red, green, blue} → <blue, green, red>??
• O → Q (??)
  • Hashing?
  • Bob + John = ??
Readings in Information Visualization: Using Vision to Think. Card, Mackinlay, Shneiderman, 1999
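The conversions on this slide can be sketched as small Python functions. The grade cutoffs (60, 70, 80, 90) and all function names are assumptions for illustration; the lecture gives only the ranges and the resulting sets.

```python
# Ordinal scale from the slide, lowest to highest.
GRADES = ("F", "D", "C", "B", "A")

# Assumed lower bound for each grade; the lecture does not give cutoffs.
CUTOFFS = (0, 60, 70, 80, 90)


def q_to_o(score: float) -> str:
    """Q -> O: bin a score in [0, 100] into an ordered grade."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie in [0, 100]")
    grade = GRADES[0]
    for bound, label in zip(CUTOFFS, GRADES):
        if score >= bound:
            grade = label
    return grade


def o_to_n(ordered: tuple) -> frozenset:
    """O -> N: forget the order, keep only membership."""
    return frozenset(ordered)


def n_to_o_alphabetical(names: set) -> tuple:
    """N -> O (??): alphabetical order is one arbitrary choice of ordering."""
    return tuple(sorted(names))
```

Q → O and O → N each discard information, so they are always possible. N → O needs an ordering supplied from outside the data; alphabetical order works mechanically for {red, green, blue} but carries no meaning, which is the point of the "??" on the slide.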

Operations on Basic Data Types
• What are the operations that we can perform on these data types?
  • Nominal (N)
    • = and ≠
  • Ordinal (O)
    • >, <, ≥, ≤
  • Scale / Quantitative (Q)
    • everything else (+, -, *, /, etc.)
• Consider a distance function
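One way to think about the distance-function prompt is to ask what distance each type can support using only its permitted operations. The functions below are a hedged sketch; their names are illustrative, and the ordinal version relies on an extra assumption that is flagged in its docstring.

```python
GRADES = ("F", "D", "C", "B", "A")  # ordinal scale, lowest to highest


def nominal_distance(a, b) -> int:
    """Nominal values support only = and ≠, so distance is 0 or 1."""
    return 0 if a == b else 1


def ordinal_distance(a, b, scale=GRADES) -> int:
    """Ordinal values can be ranked; count the steps between ranks.

    Treating ranks as evenly spaced is an assumption, not a property
    of ordinal data.
    """
    return abs(scale.index(a) - scale.index(b))


def quantitative_distance(a: float, b: float) -> float:
    """Quantitative values support arithmetic, so subtraction is valid."""
    return abs(a - b)
```

Only the quantitative case yields a distance that follows directly from the data type's own operations.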

Questions?

Connecting Data To Visualization
• Data have attributes (dimensions)
• Visualizations have attributes (dimensions)
• Can the two map to each other?
• Jacques Bertin, Sémiologie Graphique (Semiology of Graphics), 1967.

Elements of Visualization • Images are composed of marks: “ink”, graphical primitives Slide courtesy of Sara Su

Visual Channels

Elements of Visualization Slide courtesy of Sara Su

Value (Intensity) Discrete or Continuous? Slide courtesy of Sara Su

Color (Hue) Discrete or Continuous? Slide courtesy of Sara Su

Visual Variables Slide courtesy of Sara Su

Card, Mackinlay (1997)
Symbol : Meaning
D : Data Type ::= N (Nominal), O (Ordinal), Q (Quantitative), QX (Intrinsically spatial), Qlon (Geographical), NxN (Set mapped to itself - graphs)
F : Function for recoding data ::= f (unspecified), > (filter), s (sorting), mds (multidimensional scaling), ↑ (interactive input to a function)
D’ : Recoded Data Type (see D)
CP : Control Processing tx (text)
M : Mark types ::= P (Point), L (Line), S (Surface), A (Area), V (Volume)
R : Retinal (mark) properties ::= C (Color), S (Size), — (Connection), [] (Enclosure)
XYZT : Position in space time ::= N, O, Q, * (non-semantic use of space-time)
V : View transformation ::= hb (hyperbolic mapping)
W : Widget ::= sl (slider), rb (radio buttons)

Example 1: Ozone Mapping

Example 1: Ozone Mapping The rows of the table describe the variables with the case variable (“Samples”) at the top and the value variables below. The nominal (N) set of Samples is mapped to point marks (P in column M), which have their retinal property of color (C in column R) mapped to the Ozone variable. The ozone mapping includes a function (f) that converts the quantitative (Q) ozone measurements to an ordinal (O) set that can be easily mapped to a set of colors. The quantitative (Q) variables of Longitude, Latitude, and Height are mapped to the positions X, Y, and Z, which determine the position of the point marks. The Date variable is mapped to time (T), which creates an animated visualization. Table 1 makes it clear that Figure 1 is a 3D animated visualization involving colored points.
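The mapping described above can be written down as data, using the symbols from the Card and Mackinlay notation. This is a sketch reconstructed from the prose only: the `Row` layout and field names are assumptions, and the data type of Date is left as `None` because the text does not state it.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    name: str                # variable name
    d: Optional[str]         # D: data type (N, O, Q), None if not stated
    f: Optional[str]         # F: recoding function
    d_prime: Optional[str]   # D': recoded data type
    target: str              # column receiving the mapping: M, R, or XYZT
    code: str                # symbol placed in that column


ozone_table = [
    Row("Samples",   "N",  None, None, "M",    "P"),
    Row("Ozone",     "Q",  "f",  "O",  "R",    "C"),
    Row("Longitude", "Q",  None, None, "XYZT", "X"),
    Row("Latitude",  "Q",  None, None, "XYZT", "Y"),
    Row("Height",    "Q",  None, None, "XYZT", "Z"),
    Row("Date",      None, None, None, "XYZT", "T"),  # data type not stated in the text
]

# Summarise the visualization from the table alone.
axes = [r.code for r in ozone_table if r.target == "XYZT"]
is_3d = {"X", "Y", "Z"} <= set(axes)
is_animated = "T" in axes
marks = [r.code for r in ozone_table if r.target == "M"]
```

Reading the rows back gives the same summary as the slide: point marks, three spatial axes, and a time axis.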

Example 2: GIS

Example 2: GIS Table 2 describes the map part of Figure 2. The Offices variable is mapped to line marks (L). The Profit variable is mapped to the size of these lines (Sz in the R column). Profits are also mapped to the Z-axis and, via a function (f), to a nominal set indicating the sign of the profits. This nominal set is mapped to the color of the lines (C in the R column). Table 2 clearly reveals that multiple graphical techniques are used to describe the Profit variable in order to enhance the perception of this important data variable.

Other Examples

Treemap Example

Treemap Example The problem is that the same variable is mapped onto two different position presentations, each half of the time: Q → X (half of the time) and Q → Y (half of the time). This gives an inconsistent mapping and prevents the user from forming an easy image. What the user should be able to take from the image is essentially Retinal: Size coding, but the same Size can have many different visual manifestations, each with a different aspect ratio. Thus the space-filling property of the visualization comes at a perceptual cost, which is clearly shown in Table 9.
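The alternating mapping can be illustrated with a slice-and-dice layout, assuming that is the treemap variant under discussion. The function, the tree, and the canvas size below are all hypothetical; the sketch only shows how equal sizes end up with different aspect ratios.

```python
def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out a tree in the rectangle (x, y, w, h).

    A node is (name, size, children). At even depths size is mapped to
    width (Q -> X); at odd depths it is mapped to height (Q -> Y).
    """
    if out is None:
        out = {}
    name, size, children = node
    out[name] = (x, y, w, h)
    total = sum(child[1] for child in children)
    offset = 0.0
    for child in children:
        share = child[1] / total
        if depth % 2 == 0:   # split along X
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                # split along Y
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical tree: leaves "a" and "c" both have size 2.
tree = ("root", 8, [
    ("a", 2, []),
    ("b", 6, [("c", 2, []), ("d", 4, [])]),
])

rects = slice_and_dice(tree, 0.0, 0.0, 8.0, 4.0)


def area(rect):
    return rect[2] * rect[3]


def aspect(rect):
    return rect[2] / rect[3]
```

Leaves "a" and "c" receive the same area, but "a" is a tall rectangle (width/height 0.5) while "c" is a wide one (about 4.5), so the viewer has to compare areas across very different shapes.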

Questions?

Using Visualization to Influence? Image courtesy of http://sambbiblog.spaces.live.com

Appropriateness? • Which data dimension should be mapped to what visual variable?

Appropriateness?

Structure and Form Image courtesy of Barbara Tversky

Visual Metaphors Image courtesy Caroline Ziemkiewicz

Visual Metaphors

fNIRS and Vis

Unintended Consequences

Interaction Effects An example of interference between icon spacing (representing a linear variable) and icon brightness (representing a more general scalar field). Areas of high brightness create false lower-spacing regions. Acevedo, Laidlaw. “Subjective Quantification of Perceptual Interactions among some 2D Scientific Visualization Methods”, TVCG 2006.

Interaction Effects Process for creating the stimuli for the data resolution identification task. (a) Shows a vertical sine-wave dataset. (b) Shows the same dataset with amplitude values a decreasing linearly from left to right. (c) Shows the final appearance of the datasets used for this task, where we also linearly move the zero value of the sine-wave from a/2 at the top of the image to 1−a/2 at the bottom. (d) Shows how subjects would mark the area where they perceive the sine-wave pattern.
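Steps (a) to (c) can be sketched as a small generator. This is one reading of the caption, not the authors' code: it assumes a is the peak-to-peak amplitude, and the grid size, number of cycles, and amplitude bounds are illustrative defaults.

```python
import math


def stimulus(width=64, height=64, cycles=8, a_max=0.5, a_min=0.0):
    """Return a height x width grid of values in [0, 1].

    cycles, a_max, and a_min are illustrative defaults, not values
    taken from the paper.
    """
    grid = []
    for row in range(height):
        v = row / (height - 1)          # 0 at the top, 1 at the bottom
        line = []
        for col in range(width):
            u = col / (width - 1)       # 0 at the left, 1 at the right
            # (b) amplitude a decreases linearly from left to right
            a = a_max + (a_min - a_max) * u
            # (c) zero level moves from a/2 (top) to 1 - a/2 (bottom)
            zero = a / 2 + v * (1 - a)
            # (a) vertical sine wave: varies with horizontal position only
            line.append(zero + (a / 2) * math.sin(2 * math.pi * cycles * u))
        grid.append(line)
    return grid


grid = stimulus()
```

Shifting the zero level by a/2 at the top and 1−a/2 at the bottom keeps every value inside [0, 1] regardless of the local amplitude.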

Feature Detection
