Data Visualization

Prepared for IST597C. Ziming Zhuang, School of Information Sciences and Technology, The Pennsylvania State University. November 2005.

Presentation Transcript


  1. Data Visualization. Prepared for IST597C. Ziming Zhuang, School of Information Sciences and Technology, The Pennsylvania State University. November 2005

  2. Outline • What … • Why … • Concepts • How to … - Texts - Web - Images • Challenges and to-dos • References Zhuang 11/2005

  3. Visualization is … • an interdisciplinary area built upon databases, human-computer interaction, cognitive science, … • the use of interactive visual representations of abstract data to amplify cognition [Shneiderman et al. ’99] • the selection, transformation, and representation of abstract data in a form that facilitates human interaction for exploration and understanding [Cugini et al. ’96]

  4. Why visualize? • Early endeavor: scientific visualization [McCormick et al. ’87], “to improve the ability of people to understand the data they work with.” • Representation: “A good picture's worth a thousand words.” • Graphical manipulation: graphical query languages, …

  5. Why visualize? (cont.) • Visualization helps users comprehend large quantities of data. • Visual attributes can present abstract representations of data. • Relationships among displayed entities become apparent. • Graphical techniques allow more direct, intuitive interactions with the entities of interest. - [Cugini et al. ’96]

  6. Data Dimensions • Assume that for every x and y we have a temperature t and a pressure p. We can then define: f(x, y) -> (t, p); f1(x, y) -> t, f2(x, y) -> p; f3(x, y, t) -> 0 or 1, f4(x, y, p) -> 0 or 1; f5(x, y, t, p) -> 0 or 1. • The key is that the mapping must go to a single value (or vector). For example, f(x, t) maps to zero or more elements with position x and temperature t, so reducing it to a single value loses information (e.g. hidden surfaces in a projection).
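
The mappings on this slide can be sketched in a few lines of Python. This is only an illustration: the `samples` table, its values, and the helper name `project_x_t` are hypothetical and do not come from the slides. The last function shows the case the slide warns about, where dropping y means a query can return several elements or none.

```python
# Hypothetical sample grid: every (x, y) position has a temperature t and a pressure p.
samples = {
    (0, 0): (20.0, 101.3),
    (0, 1): (20.0, 100.9),
    (1, 0): (22.5, 101.1),
    (1, 1): (19.0, 100.7),
}

def f(x, y):
    """f(x, y) -> (t, p): exactly one vector value per position."""
    return samples[(x, y)]

def f1(x, y):
    """f1(x, y) -> t: the temperature component only."""
    return f(x, y)[0]

def f2(x, y):
    """f2(x, y) -> p: the pressure component only."""
    return f(x, y)[1]

def f3(x, y, t):
    """f3(x, y, t) -> 0 or 1: does position (x, y) have temperature t?"""
    return 1 if f1(x, y) == t else 0

def f5(x, y, t, p):
    """f5(x, y, t, p) -> 0 or 1: does (x, y) have exactly this (t, p) pair?"""
    return 1 if f(x, y) == (t, p) else 0

def project_x_t(x, t):
    """f(x, t) -> zero or more elements: all y values that share x and t."""
    return [y for (xi, y), (ti, _) in samples.items() if xi == x and ti == t]
```

Here `project_x_t(0, 20.0)` matches two elements and `project_x_t(1, 20.0)` matches none, so any display that must show one value per (x, t) has to hide or merge something.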

  7. Graph Entities and Attributes • Entities: point, line, polyline, glyph, surface, solid, image, text • Attributes: color/intensity, location, style, size, relative position/motion

  8. Rationale of visualization • The gigabit bandwidth of the visual cortex permits much faster perception of geometric and spatial relationships than any other mode [Consens et al. 1994] • Human eyes are more sensitive to such intuitive representations

  9. Rationale of visualization • Our sensitivity varies with context • Visual encodings in order of decreasing accuracy [Ward et al.]: 1. Position along a common scale 2. Position along identical, non-aligned scales 3. Length 4. Angle/slope 5. Area 6. Volume 7. Hue/saturation/intensity (informally derived)

  10. Basic Visualization Methods • Rendering – what to show in a plot • Manipulation – what to do within plots • Linking – what information to share between plots [Sutherland et al. 2000]

  11. Methods – data rendering • Use scaling and offset to fit in range • Use derived values (residuals, logs) to emphasize changes • Use projections etc. to compress information • Use random jiggling to separate overlaps • Use multiple views to handle hidden relations or high dimensions • Use effective grids, keys and labels to aid understanding
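
Three of the rendering steps above (scaling and offset, derived log values, random jiggling) are simple enough to sketch with the Python standard library. The function names, default ranges, and jitter amount below are illustrative assumptions, not part of the original slides.

```python
import math
import random

def scale_to_range(values, lo=0.0, hi=1.0):
    """Scale and offset values linearly so they span [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values are equal: place them in the middle of the range.
        return [(lo + hi) / 2.0 for _ in values]
    factor = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * factor for v in values]

def log_transform(values):
    """Derived values: base-10 logs make multiplicative changes evenly spaced."""
    return [math.log10(v) for v in values]

def jitter(values, amount=0.05, seed=0):
    """Add a small uniform random offset so overlapping points separate."""
    rng = random.Random(seed)  # fixed seed keeps the plot reproducible
    return [v + rng.uniform(-amount, amount) for v in values]
```

For example, `scale_to_range([10, 20, 30], 0, 100)` spreads the three values across a 100-pixel axis, and `jitter([1.0, 1.0, 1.0])` moves three coincident points slightly apart while keeping each within 0.05 of its true value.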

  12. The Pipeline Method • Proposed in [Buja et al. 1988] • Diagram labels, in order: Data Model Visualization Geometry Render

  13. Methods – data manipulation • Dynamically adjust mapping • Tour data by varying views • Deleting to de-cluster / eliminate clusters • Brushing/Highlighting to see correspondence in multiple views • Zoom in to focus attention; zoom out to show context • Panning / spinning to explore neighborhoods
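
Brushing across multiple views can be sketched as a shared set of selected record ids that every view consults when drawing. The records, field names, and the functions `brush`, `render_view`, and `zoom` are hypothetical names chosen for this sketch; no particular toolkit is implied.

```python
# Hypothetical records; each one appears in every view under the same id.
records = [
    {"id": 0, "x": 1.0, "y": 5.0, "t": 20.0, "p": 101.3},
    {"id": 1, "x": 2.0, "y": 3.0, "t": 25.0, "p": 100.2},
    {"id": 2, "x": 3.0, "y": 4.0, "t": 30.0, "p": 99.8},
    {"id": 3, "x": 4.0, "y": 1.0, "t": 22.0, "p": 100.9},
]

def brush(records, field, lo, hi):
    """Return the ids of records whose `field` lies inside the brushed interval."""
    return {r["id"] for r in records if lo <= r[field] <= hi}

def render_view(records, x_field, y_field, highlighted):
    """Return (x, y, is_highlighted) triples for one plot."""
    return [(r[x_field], r[y_field], r["id"] in highlighted) for r in records]

def zoom(records, field, lo, hi):
    """Zoom in by keeping only the records inside the visible interval."""
    return [r for r in records if lo <= r[field] <= hi]

# Brush on temperature in one view, then highlight the same records elsewhere.
selected = brush(records, "t", 24.0, 31.0)
xy_view = render_view(records, "x", "y", selected)
tp_view = render_view(records, "t", "p", selected)
```

Because both views read the same `selected` set, a record brushed in the temperature plot is highlighted at its own position in the x/y plot, which is the correspondence the slide refers to.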

  14. Visualizing text databases • Motivation: very large corpora; more sensitive to structure, similarity, and connectivity relationships; provides necessary context. • Methods: graphical browsing/query interfaces

  15. Visualizing text db • InfoGrid [Rao et al ’92]

  16. The Hy+ System • [Consens et al. 1994] Hy+ supports the visual presentation of structured data in the form of hygraphs; supports a visual query language GraphLog; supports filtering of data to reduce visual complexity and building new relationships among the data similar to creating new db views.

  17. The Hy+ System • Two fundamental capabilities - define new relationships using queries (the derived data or view can be visually presented): define queries - selective data visualization (filter relevant data and control the level of details): filter queries

  18. Visualizing Text Search • [Baeza-Yates ’96] Goal: visualizing a large number of answers in a text db. - visual browsing - document visualization

  19. Visualizing Text Search • [Baeza-Yates ’96] - visualizing query - visualizing answers

  20. The VIBE System • [Olsen et al. ’93] The location of a document icon is determined by the ratio of the similarities between the document and the POIs (points of interest). s_i is the similarity between a given document and POI i; p_i is the position vector for POI i.
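
The placement formula itself did not survive in this transcript. One reading that is consistent with the description, and an assumption here rather than a quotation of the slide, is a similarity-weighted average of the POI positions, with each weight s_i divided by the sum of all similarities. The function name and sample coordinates below are hypothetical.

```python
def document_position(similarities, poi_positions):
    """Place a document at the similarity-weighted average of the POI positions.

    similarities[i] is s_i, the similarity between the document and POI i.
    poi_positions[i] is p_i, the position vector of POI i.
    Assumption: position = sum(s_i * p_i) / sum(s_i).
    """
    total = sum(similarities)
    if total == 0:
        raise ValueError("document has no similarity to any POI")
    dims = len(poi_positions[0])
    return tuple(
        sum(s * p[d] for s, p in zip(similarities, poi_positions)) / total
        for d in range(dims)
    )

# Three hypothetical POIs placed at the corners of a triangle.
pois = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
```

Under this assumption only the ratio of the similarities matters: a document with similarities `[2, 1, 1]` and one with `[4, 2, 2]` land on the same point, and a document equally similar to the first two POIs and unrelated to the third sits halfway between them.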

  21. Visualizing Search Results • [Nowell et al. '97] - Goal: to allow users to explore patterns in a large collection of search results. - two-dimensional patterns (x- and y-axes)

  22. Visualizing the Web • [Hasan et al. ’96] Maintained a connectivity db; the Hy+ system was interfaced with a web browser. The GraphLog query language restricted the set of docs displayed in the view by considering doc properties, matches in url / anchor text, etc. History was stored (“history graph”).

  23. Visualizing the Web • Applications - TouchGraph - NIRVE (The NIST Information Retrieval Visualization Engine) [Cugini et al. ’97]: concept mapping, document space - PRIZE [Cugini et al. ’96]: spiral view, axis view, nearest-neighbor

  24. Visualizing Image Retrieval • Related to our project! • Problems for general Web image search engines (adapted from [Upstill et al. ’01]): - heterogeneity: inconsistency in results - no transparency: unclear why images are retrieved; hard to refine queries - no relationships: grid layout is meaningless - coarse-grained interaction: search again, or find similar

  25. Example of the problems • This example image grid is generated for the query “clown, circus, tent”. • Similar images are not adjacent in the grid. • The vector evidence is lost when compressing the ranking into a grid.

  26. Visualizing Image Retrieval • The VISR system, based on the spring model [Olsen et al. ’93]

  27. Visualizing Image Retrieval

  28. Visualizing conventional db • Polaris Project - initially developed at Stanford; now a commercial tool - Goals: interactive analysis and exploration; simple and consistent interface

  29. Visualizing conventional db • Polaris Project – multiscale visualization using data cubes [Stolte ’02]

  30. Visualizing conventional db • Data Cube – data abstraction • Combine with visual abstraction to achieve multiscale visualization.
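
The data abstraction a data cube provides can be sketched as a roll-up: summing a measure over every dimension that the current zoom level does not show. The rows, dimension names, and `roll_up` function below are hypothetical and do not reproduce the Polaris implementation.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, region, sales).
rows = [
    (2004, 1, "East", 10),
    (2004, 1, "West", 7),
    (2004, 2, "East", 5),
    (2005, 1, "East", 8),
    (2005, 1, "West", 4),
]
DIMENSIONS = ("year", "month", "region")

def roll_up(rows, keep):
    """Sum the sales measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)  # coordinates at this level of detail
        totals[key] += row[-1]            # last column is the measure
    return dict(totals)

coarse = roll_up(rows, ("year",))          # one mark per year
finer = roll_up(rows, ("year", "region"))  # one mark per year and region
```

A multiscale display could draw `coarse` when zoomed out and switch to `finer` as the user zooms in, so the amount of data on screen follows the visual scale.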

  31. Visualizing conventional db • OpenDX - initially developed at IBM • Chernoff Faces [Chernoff, 1973] - a technique to illustrate trends in multi-dimensional data - different data dimensions are mapped to different facial features - especially effective because the data is tied to facial features, which we are used to telling apart

  32. Challenges & “to-do”s • Standardized metrics? Only “visualization entropy” and “visualization precision” appear, in Olsen’s paper; text-based metrics and/or user studies are still used. • Cognitive modeling • High-dimensional representation and manipulation require faster processors and more RAM.

  33. References • McCormick et al. Visualization in Scientific Computing. SIGGRAPH Computer Graphics 21:6, 1987. • Rao et al. The Information Grid: A framework for information retrieval and retrieval-centered applications. UIST ’92. • Hasan et al. Applying database visualization to the World Wide Web. SIGMOD Record 25:4, 1996. • Consens et al. Architecture and Applications of the Hy+ Visualization System. IBM Systems Journal 33:3, 1994. • Baeza-Yates. Visualization of Large Answers in Text Databases. • Cugini, Piatko, Laskowski. Interactive 3D Visualization for Document Retrieval. Proceedings of the Workshop on New Paradigms in Information Visualization and Manipulation, CIKM '96, November 1996. • Cugini, Laskowski, Piatko. Document Clustering in Concept Space: The NIST Information Retrieval Visualization Engine (NIRVE). CODATA Euro-American Workshop on Visualization of Information and Data, Paris, France, June 1997.

  34. References • Nowell et al. Exploring Search Results with Envision. CHI ’97. • Olsen et al. Visualization of a document collection: The VIBE system. Information Processing and Management, 29:1, 1993. • Upstill et al. Visual clustering of image search results. In Proc. SPIE Vol. 4302, 2001. • Olsen, K., Korfhage, R., Spring, M., Sochats, K., and Williams, J. Visualization of a Document Collection with Implicit and Explicit Links: The VIBE System. The Scandinavian Journal of Information Systems, August 1993. • Stolte, C. Multiscale Visualization Using Data Cubes. The Eighth IEEE Symposium on Information Visualization, October 2002. • Buja, A., Asimov, D., Hurley, C., and McDonald, J. A. Elements of a Viewing Pipeline for Data Analysis. In W. S. Cleveland and M. E. McGill, eds., Dynamic Graphics for Statistics, Wadsworth, Monterey, CA, pp. 277-308, 1988. • Sutherland, P., Rossini, A., Lumley, T., Lewin-Koh, N., Cook, D., Cox, Z. ORCA: A Visualization Toolkit for High-Dimensional Data. NRCSE Technical Report Series No. 046, May 18, 2000. • Chernoff, H. The use of faces to represent points in k-dimensional space graphically. Journal of the American Statistical Association, v68, 361-368, 1973.

  35. Thank You • Questions and comments?
