
Multivariate Data & Representations

CS 7450 - Information Visualization, Jan. 20, 2005, John Stasko.


Presentation Transcript


  1. Multivariate Data & Representations CS 7450 - Information Visualization Jan. 20, 2005 John Stasko

  2. Agenda • Data forms and representations • Basic representation techniques • Multivariate (>3) techniques

  3. Data Sets • Data comes in many different forms • Typically, not in the way you want it • How is it stored (in the raw)?

  4. Example • Cars • make • model • year • miles per gallon • cost • number of cylinders • weight • ...

  5. Example • Web pages

  6. Data Tables • Often, we take raw data and transform it into a form that is more workable • Main idea: • Individual items are called cases • Cases have variables (attributes)

  7. Data Table Format
       Dimensions   Case1     Case2     Case3     ...
       Variable1    Value11   Value21   Value31
       Variable2    Value12   Value22   Value32
       Variable3    Value13   Value23   Value33
       ...
     Think of it as a function: f(case1) = <Val11, Val12, …>

  8. Example: People in class
                Mary    Jim     Sally    Mitch   ...
       SSN      145     294     563      823
       Age      23      17      47       29
       Hair     brown   black   blonde   red
       GPA      2.9     3.7     3.4      2.1
       ...
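The "People in class" table and the idea of a table as a function from case to values can be sketched in a few lines of Python. This is only an illustration: the names `VARIABLES`, `PEOPLE`, `f`, and `column` are made up here, and the assignment of values to people assumes the flattened slide text reads one variable at a time across Mary, Jim, Sally, Mitch.

```python
# Hypothetical encoding of the "People in class" table:
# one entry per case, one tuple position per variable.
VARIABLES = ("SSN", "Age", "Hair", "GPA")

PEOPLE = {
    "Mary":  (145, 23, "brown", 2.9),
    "Jim":   (294, 17, "black", 3.7),
    "Sally": (563, 47, "blonde", 3.4),
    "Mitch": (823, 29, "red", 2.1),
}


def f(case):
    """Return the tuple of variable values for one case."""
    return PEOPLE[case]


def column(variable):
    """Return one variable's values across all cases, in table order."""
    index = VARIABLES.index(variable)
    return [values[index] for values in PEOPLE.values()]


print(f("Mary"))
print(column("Age"))
```

Looking up a case gives its full row of values, while `column` slices the same table by variable, which is the view most of the later representations start from.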

  9. Example • Baseball statistics

  10. Variable Types • Three main types of variables • N - Nominal (equal or not equal to other values) • Example: gender • O - Ordinal (obeys < relation, ordered set) • Example: fr, so, jr, sr • Q - Quantitative (can do math on them) • Example: age
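One way to make the N/O/Q distinction concrete is to record which operations are meaningful for each type. The sketch below is an assumption-laden illustration, not part of the lecture: `VarType`, `SUPPORTED`, `supports`, and `ordinal_less_than` are hypothetical names, and the operation labels are chosen here for readability.

```python
from enum import Enum


class VarType(Enum):
    NOMINAL = "N"
    ORDINAL = "O"
    QUANTITATIVE = "Q"


# Operations that are meaningful for each variable type.
SUPPORTED = {
    VarType.NOMINAL: {"equals"},
    VarType.ORDINAL: {"equals", "less_than"},
    VarType.QUANTITATIVE: {"equals", "less_than", "arithmetic"},
}

# The ordinal example from the slide: class year as an ordered set.
CLASS_YEAR = ["fr", "so", "jr", "sr"]


def supports(var_type, operation):
    """Return True if the operation is meaningful for this variable type."""
    return operation in SUPPORTED[var_type]


def ordinal_less_than(a, b, order=CLASS_YEAR):
    """Compare two ordinal values by their position in the ordered set."""
    return order.index(a) < order.index(b)


print(supports(VarType.NOMINAL, "less_than"))
print(ordinal_less_than("fr", "jr"))
```

Each type inherits the operations of the one before it, which is why a quantitative variable can always be treated as ordinal or nominal but not the other way around.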

  11. Metadata • Descriptive information about the data • Might be something as simple as the type of a variable, or could be more complex • For times when the table itself just isn’t enough • Example: if variable1 is “l”, then variable3 can only be 3, 7 or 16
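The constraint in the metadata example can be expressed as a small check that sits next to the table. This is a minimal sketch under assumptions: cases are represented as dictionaries, and `ALLOWED_VARIABLE3` and `satisfies_metadata` are hypothetical names. Only the rule for “l” comes from the slide.

```python
# Hypothetical metadata: for a given value of variable1, the set of values
# that variable3 may take. Only the "l" rule is taken from the slide.
ALLOWED_VARIABLE3 = {"l": {3, 7, 16}}


def satisfies_metadata(case):
    """Check one case (a dict of variable name -> value) against the rule."""
    allowed = ALLOWED_VARIABLE3.get(case["variable1"])
    # No rule for this variable1 value means variable3 is unconstrained.
    return allowed is None or case["variable3"] in allowed


print(satisfies_metadata({"variable1": "l", "variable3": 7}))
print(satisfies_metadata({"variable1": "l", "variable3": 4}))
```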

  12. How Many Variables? • Data sets of dimensions 1, 2, 3 are common • Number of variables per class • 1 - Univariate data • 2 - Bivariate data • 3 - Trivariate data • >3 - Hypervariate data
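The naming scheme above is a simple lookup on the number of variables. A minimal sketch, with the hypothetical function name `variate_class`:

```python
def variate_class(num_variables):
    """Name a data set by its number of variables, following the slide."""
    if num_variables < 1:
        raise ValueError("a data set needs at least one variable")
    names = {1: "univariate", 2: "bivariate", 3: "trivariate"}
    # Anything above three variables is called hypervariate.
    return names.get(num_variables, "hypervariate")


print(variate_class(2))
print(variate_class(7))
```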

  13. Representation • What’s a common way of visually representing multivariate data sets? • Graphs!

  14. Good Example www.nationmaster.com

  15. Basic Symbolic Displays • Graphs • Charts • Maps • Diagrams From: S. Kosslyn, “Understanding charts and graphs”, Applied Cognitive Psychology, 1989.

  16. 1. Graph • Showing the relationships between variables’ values in a data table

  17. Properties • Graph • Visual display that illustrates one or more relationships among entities • Shorthand way to present information • Allows a trend, pattern or comparison to be easily comprehended

  18. Issues • Critical to remain task-centric • Why do you need a graph? • What questions are being answered? • What data is needed to answer those questions? • Who is the audience? [Figure labels: money, time]

  19. Graph Components • Framework • Measurement types, scale • Content • Marks, lines, points • Labels • Title, axes, ticks

  20. Other Symbolic Displays (Aside) • Chart • Map • Diagram

  21. 2. Chart • Structure is important, relates entities to each other • Primarily uses lines, enclosure, position to link entities • Examples: flowchart, family tree, org chart, ...

  22. 3. Map • Representation of spatial relations • Locations identified by labels

  23. Choropleth Map • Areas are filled and colored differently to indicate some attribute of that region

  24. Cartography • Cartographers and map-makers have a wealth of knowledge about the design and creation of visual information artifacts • Labeling, color, layout, … • Information visualization researchers should learn from this older, existing area

  25. 4. Diagram • Schematic picture of object or entity • Parts are symbolic • Examples: figures, steps in a manual, illustrations, ...

  26. Details • What are the constituent pieces of these four symbolic displays? • What are the building blocks?

  27. Visual Structures • Composed of • Spatial substrate • Marks • Graphical properties of marks

  28. Space • Visually dominant • Often put axes on space to assist • Use techniques of composition, alignment, folding, recursion, overloading to 1) increase use of space 2) do data encodings

  29. Marks • Things that occur in space • Points • Lines • Areas • Volumes

  30. Graphical Properties • Size, shape, color, orientation...
                                Spatial properties   Object properties
        Expressing extent       Position, Size       Grayscale
        Differentiating marks   Orientation          Color, Shape, Texture

  31. Intermission • Getting slides • Getting papers • Photos

  32. Back to Data • What were the different types of data sets? • Number of variables per class • 1 - Univariate data • 2 - Bivariate data • 3 - Trivariate data • >3 - Hypervariate data

  33. Univariate Data • Representations [Figures: values plotted for one case, Bill (axis ticks 7, 5, 3, 1); Tukey box plot on a scale from 0 to 20, labeled low, high, Middle 50%, and Mean]
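The numbers behind a box plot can be computed directly from the values. The sketch below is a simplified summary, not necessarily what the slide's figure draws: it assumes the box spans the first to third quartile (the "middle 50%"), the whiskers run to the minimum and maximum with no outlier rule, and quartiles use the `inclusive` method of Python's `statistics.quantiles` (Python 3.8+). It reports both the median and the mean, since the slide labels the mean. The function name `box_plot_summary` is hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Return the values needed to draw a simple box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "low": min(values),    # end of lower whisker (no outlier rule)
        "q1": q1,              # bottom of the box
        "median": median,
        "q3": q3,              # top of the box
        "high": max(values),   # end of upper whisker
        "mean": statistics.fmean(values),
    }


summary = box_plot_summary([1, 2, 3, 4, 5])
print(summary)
```

Other quartile conventions exist and give slightly different box edges on small samples, so the method is worth stating whenever a box plot is shown.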

  34. What goes where • In univariate representations, we often think of the data case as being shown along one dimension, and the value in another • Bar graph: Y-axis is quantitative variable; compare relative point values • Line graph: Y-axis is quantitative variable; see changes over consecutive values

  35. Alternative View • We may think of a graph as representing independent (data case) and dependent (value) variables • Guideline: • Independent vs. dependent variables • Put independent on x-axis • See resultant dependent variables along y-axis

  36. Bivariate Data • Representations • Scatter plot is common • Two variables, want to see relationship • Is there a linear, curved or random pattern? • Each mark is now a data case [Figure: scatter plot with axes price and mileage]

  37. Trivariate Data • Representations • 3D scatter plot is possible [Figure: 3D scatter plot with axes price, horsepower, mileage]

  38. Alternative Representation • Still use 2D but have mark property represent third variable
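As one possible instance of this idea, the third variable can drive the size of each mark in a 2D scatter plot. The sketch below makes several assumptions: marks are circles, the circle's area (not its radius) is made proportional to the value, and the car data are invented for illustration. `mark_radius`, `cars`, and `marks` are hypothetical names.

```python
import math


def mark_radius(value, max_value, max_radius=12.0):
    """Radius of a circular mark whose area is proportional to value."""
    if max_value <= 0 or value < 0:
        raise ValueError("expected value >= 0 and max_value > 0")
    # Area scales with radius squared, so take the square root.
    return max_radius * math.sqrt(value / max_value)


# Made-up cars: (mileage, price, horsepower).
cars = [(30, 18000, 100), (22, 32000, 225), (18, 45000, 400)]
max_hp = max(hp for _, _, hp in cars)

# Each mark keeps its 2D position (mileage, price) and gains a radius.
marks = [(mileage, price, mark_radius(hp, max_hp)) for mileage, price, hp in cars]
print(marks)
```

Color, grayscale, or shape could carry the third variable instead; size is used here only because it maps naturally onto a quantitative value.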

  39. Alternative Representation • Represent each variable in its own explicit way

  40. Hypervariate Data • Ahhh, the tough one • A number of well-known visualization techniques exist for data sets of 1-3 dimensions • line graphs, bar graphs, scatter plots OK • We see a 3-D world (4-D with time) • What about data sets with more than 3 variables? • Often the interesting, challenging ones

  41. Multiple Views • Give each variable its own display
             A   B   C   D   E
         1   4   1   8   3   5
         2   6   3   4   2   1
         3   5   7   2   4   3
         4   2   6   3   1   5
       [Figure: separate displays labeled A, B, C, D, E]

  42. Scatterplot Matrix • Represent each possible pair of variables in its own 2-D scatterplot • Useful for what? • Misses what?
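To get a feel for how many plots a scatterplot matrix involves, the pairs can be enumerated. This sketch counts each unordered pair once, giving n(n-1)/2 plots for n variables; a full square matrix would show each pair twice with the axes swapped. The variable names and the function name `scatterplot_pairs` are illustrative.

```python
from itertools import combinations


def scatterplot_pairs(variables):
    """List every unordered pair of variables, one per 2-D scatterplot."""
    return list(combinations(variables, 2))


pairs = scatterplot_pairs(["A", "B", "C", "D", "E"])
print(len(pairs))
print(pairs)
```

The count grows quadratically, and every plot still shows only two variables at a time, which bears on both questions the slide asks.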

  43. Chernoff Faces • Encode different variables’ values in characteristics of a human face • Cute applets: http://www.cs.uchicago.edu/~wiseman/chernoff/ http://hesketh.com/schampeo/projects/Faces/chernoff.html

  44. Star Plots • Space out the n variables at equal angles around a circle • Each “spoke” encodes a variable’s value [Figure: star plot with spokes Var 1 to Var 5; spoke length shows Value]
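The geometry of a star plot follows directly from that description. The sketch below computes the endpoint of each spoke for one case, assuming the first variable points straight up, the remaining variables follow clockwise, and each value is scaled by a per-variable maximum so the longest possible spoke has length 1. The function name `star_plot_vertices` and the sample values are made up.

```python
import math


def star_plot_vertices(values, max_values):
    """Return the (x, y) endpoint of each spoke for one data case."""
    n = len(values)
    vertices = []
    for i, (value, max_value) in enumerate(zip(values, max_values)):
        # Equal angular spacing: start at the top and move clockwise.
        angle = math.pi / 2 - 2 * math.pi * i / n
        length = value / max_value  # spoke length encodes the value
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices


vertices = star_plot_vertices([4, 1, 8, 3, 5], [8, 8, 8, 8, 8])
for x, y in vertices:
    print(round(x, 3), round(y, 3))
```

Connecting the endpoints in order gives the star outline for the case; drawing one star per case lets shapes be compared side by side.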

  45. Star Plot examples http://seamonkey.ed.asu.edu/~behrens/asu/reports/compre/comp1.html

  46. Star Coordinates • E. Kandogan, “Star Coordinates: A Multi-dimensional Visualization Technique with Uniform Treatment of Dimensions”, InfoVis 2000 Late-Breaking Hot Topics, Oct. 2000 • Demo

  47. Parallel Coordinates • What are they? • Explain…

  48. Parallel Coordinates • Encode variables along a horizontal row • Vertical line specifies values [Figure: five vertical axes labeled V1, V2, V3, V4, V5]
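A parallel coordinates display turns each data case into a polyline that crosses one vertical axis per variable. The sketch below computes those polylines under stated assumptions: axes are placed at x = 0, 1, 2, ... from left to right, and each variable is rescaled to the range 0 to 1 using its own minimum and maximum. The function name `parallel_coordinates` and the sample data are illustrative.

```python
def parallel_coordinates(cases):
    """Return one polyline (list of (x, y) points) per data case."""
    n_vars = len(cases[0])
    lows = [min(case[j] for case in cases) for j in range(n_vars)]
    highs = [max(case[j] for case in cases) for j in range(n_vars)]

    polylines = []
    for case in cases:
        line = []
        for j, value in enumerate(case):
            span = highs[j] - lows[j]
            # Min-max scale per axis; a constant variable sits mid-axis.
            y = (value - lows[j]) / span if span else 0.5
            line.append((j, y))
        polylines.append(line)
    return polylines


# Made-up data: four cases, five variables (V1..V5).
cases = [(4, 1, 8, 3, 5), (6, 3, 4, 2, 1), (5, 7, 2, 4, 3), (2, 6, 3, 1, 5)]
polylines = parallel_coordinates(cases)
print(polylines[0])
```

Because every axis is scaled independently, the display can hold as many variables as there is horizontal room, at the cost of showing relationships most clearly only between adjacent axes.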

  49. Parallel Coords Example [Figure panels: Basic, Grayscale, Color]

  50. Application • System that uses parallel coordinates for information analysis and discovery • Interactive tool • Can focus on certain data items • Color Taken from: A. Inselberg, “Multidimensional Detective”, InfoVis ‘97, 1997.
