Dimensions / Depth James Slack CPSC 533C February 10, 2003
Overview • Linear data sources • Information processing • Aggregate visualization methods • Embedding semantics of information • Repetition and other patterns • Examples in InfoVis
Linear Data Sources • Univariate data arranged spatially or temporally • Complexity issues: • Patterns in text are cognitively hard to find • Text input could be viewed spatially • Cognition from visual abstractions of text is becoming more relevant
Information Processing • Why do we need information? • Technical aspects • Characterizing text by language semantics • Browsing versus querying • Interfacing with text visualization
Considering Visualization? The technical considerations: • Define what needs to be visualized • Transform input; must be possible! • Analyze to suit the input • Technique & derivative data storage
Text Features • 3 general types of features • Frequency based • Statistics on words or other tokens • Semantic features
Text Features • Frequency based text features: • Statistics on presence and count of unique words • Feature sets are word statistics
Text Features • Statistics on words or other tokens • Occurrence, frequency, and context of individual tokens define feature set • Sets can be explicitly specified or deterministically partitioned
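A minimal sketch of a frequency-based feature set, assuming plain English text and a simple regular-expression tokenizer. The function name `token_features` and the tokenization rule are illustrative assumptions, not something the slides specify.

```python
import re
from collections import Counter


def token_features(text):
    """Return a Counter mapping each lowercase word token to its count.

    Assumption: a token is a run of letters or apostrophes; real systems
    usually add stop-word removal and stemming.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


features = token_features("the cat saw the dog; the dog saw nothing")
# Presence is `token in features`; frequency is `features[token]`.
```

A Counter returns 0 for tokens it has never seen, so presence and count can be read from the same structure.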
Text Features • Semantic features • Natural groups of similar topics • Knowledge of language • Words have semantic meaning
Characterizing Text • Feature sets of text • A shorthand description of the original • Reduction in length, not in meaning • Semantics are often important, although not always necessary • Represented for efficient computation
Browsing vs. Querying • Querying is more precise • Specific results discarded or retained • The most specific features are important • Popularity of a query is relative; a closeness ratio compares potential matches • Similarity of results appears
Browsing vs. Querying • Browsing is more general • Choose similarity over exactness • The most common features are important • Clustering is a natural partition • Similarity of clusters appears • Analytical information processing
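One way to "choose similarity over exactness" is to score documents by how close their feature sets are rather than by exact match. The sketch below uses cosine similarity over word counts; the slides do not name a measure, so the choice of cosine and the name `cosine_similarity` are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (Counters)."""
    shared = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


doc1 = Counter("time series data".split())
doc2 = Counter("time series clusters".split())
doc3 = Counter("arc diagrams".split())
score_near = cosine_similarity(doc1, doc2)
score_far = cosine_similarity(doc1, doc3)
```

A query would keep or discard results by thresholding such a score; browsing would instead group documents whose scores are high with respect to each other.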
Interfacing With Visualizations • Spatial representations enhance cognition • Clusters can be viewed with browsing • A global overview of data is important • Techniques to visit clusters • Too many data points? • Display cluster centroids instead
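Displaying centroids instead of every data point can be sketched as follows, assuming the points are already projected to 2D and already carry a cluster label. The function name `cluster_centroids` and the input layout are hypothetical.

```python
def cluster_centroids(points, labels):
    """Return {label: (mean_x, mean_y)} for 2D points grouped by label."""
    sums = {}
    counts = {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }


points = [(0, 0), (2, 2), (10, 10)]
labels = ["a", "a", "b"]
centroids = cluster_centroids(points, labels)
```

Plotting only `centroids` gives the global overview; the individual points of a cluster can be drawn once the user visits it.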
Assisting Perception • Interface should provide: • Preconscious visual form for information • Interactions to sustain, enrich process of knowledge building • Fluid environment for reflective cognition • Framework for temporal knowledge building
Aggregate Visualization • Information overloads cognitive abilities • Understanding global, not local contexts • Visualize abstract representations of complex underlying structure • What can we gain from global context?
Embedding Semantics • Are some visualizations without meaning? • Galaxies, ThemeScapes highlight semantic meaning with relevant labels • Cluster viewer uses calendar to highlight temporal univariate patterns • Dot plots, arc diagrams use connectivity of similar input strings independent of semantics
Repetition and Patterns • How can you show something is repeated? • Place two occurrences close together • Colour two occurrences similarly • Connect two occurrences with a line • Each method has merits • No method works in all cases • We want to keep spatial/temporal information
InfoVis Examples • SPIRE • Galaxies and ThemeScapes • Calendar Based Visualization • Dot Plots • Arc Diagrams
From SPIRE • Spatial Paradigm for Information Retrieval and Exploration • Galaxies cluster docupoints • ThemeScapes model landscape
Galaxies • Projection of clustering results into 2D • Galaxies are clusters of related data • Proximity of galaxies is relevant • Designed to add temporal patterns to clustering
ThemeScape • Abstract 3D landscape of information • Reduce cognitive load using terrain • Elevation, colour encode theme strength redundantly • Landscape metaphor translates well • Peaks are easy to recognize • Interesting characteristics include ridges and valleys
Calendar Based Visualization • Time is linear, monotonic, scalar • Prediction is a useful side effect of visualizing the past • Time series data is often univariate • Periodic patterns emerge in time series data
Calendar Based Visualization • How about using 3 dimensions? • X-axis: Time of day • Y-axis: Days of data period • Z-axis: Univariate data samples
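The three axes above amount to folding a flat time series into a day-by-time-of-day grid. A minimal sketch, assuming evenly spaced samples and a whole number of days; `to_day_matrix` and `samples_per_day` are illustrative names.

```python
def to_day_matrix(samples, samples_per_day):
    """Fold a flat series into rows of one day each.

    matrix[day][slot] is the Z value at (X = slot, Y = day).
    """
    if samples_per_day <= 0 or len(samples) % samples_per_day != 0:
        raise ValueError("series must contain a whole number of days")
    return [
        samples[start:start + samples_per_day]
        for start in range(0, len(samples), samples_per_day)
    ]


# Two days with four samples each (hypothetical values).
matrix = to_day_matrix([1, 5, 6, 2, 1, 7, 8, 2], samples_per_day=4)
```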
Calendar Based Visualization • Weekly variation obscured by pretty graphics • Where are the trends? • Is colour necessary for this? • Is colour sufficient for this? • Can everything be shown without overload?
Calendar Based Visualization • A more natural way: use a calendar • Cluster data into meaningful groups • Decide what the groups mean later? • Simple formulae are sufficient for clustering • Use robust statistical techniques • Generate binary clustering trees • Select desired clusters to visualize • Show clusters on calendar layout, simple graphs coloured appropriately
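The clustering steps above can be sketched as a bottom-up merge of daily patterns into a binary tree. This is a simplified illustration, not the paper's implementation: Euclidean distance, merging by size-weighted mean pattern, and the name `cluster_days` are all assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two daily patterns of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_days(day_patterns):
    """Return a binary tree: leaves are day indices, nodes are (left, right)."""
    if not day_patterns:
        raise ValueError("need at least one day")
    # Each cluster is (tree, mean_pattern, number_of_days).
    clusters = [(i, list(p), 1) for i, p in enumerate(day_patterns)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        tree_i, mean_i, n_i = clusters[i]
        tree_j, mean_j, n_j = clusters[j]
        merged_mean = [
            (a * n_i + b * n_j) / (n_i + n_j) for a, b in zip(mean_i, mean_j)
        ]
        merged = ((tree_i, tree_j), merged_mean, n_i + n_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Days 0 and 1 are quiet, days 2 and 3 are busy (hypothetical data).
tree = cluster_days([[1, 1], [1, 2], [9, 9], [9, 8]])
```

Cutting the tree at a chosen depth selects the clusters to show; each selected cluster then gets one colour on the calendar and one averaged graph.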
Visualizing Structure in Strings • M. Wattenberg: Arc diagrams • Summarize long strings, indicate repetition
Dot Plots • Finds structure in string data • Correlation matrix • Diagonal symmetry • Redundant information • Interesting repetitions can be confusing
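A dot plot of a string against itself can be sketched as a boolean matrix with a dot wherever two positions hold the same token. The names `dot_plot` and `render` are illustrative; the sketch also makes the diagonal symmetry and the redundancy visible.

```python
def dot_plot(tokens):
    """matrix[i][j] is True when tokens[i] equals tokens[j]."""
    n = len(tokens)
    return [[tokens[i] == tokens[j] for j in range(n)] for i in range(n)]


def render(matrix):
    """Draw the matrix as text rows, '#' for a match and '.' otherwise."""
    return ["".join("#" if cell else "." for cell in row) for row in matrix]


matrix = dot_plot("abab")
rows = render(matrix)
```

The main diagonal is always filled, and the half above it mirrors the half below it, so half of the display carries no new information. Repeated substrings appear as diagonal runs parallel to the main diagonal.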
Arc Diagrams • Finds structure in string data • Cognitive improvement over dot plots • Adaptable to reduce noise in data • Applications are varied: • Music • Text • Compiled code • Nucleotide sequences
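The idea of connecting repeated occurrences with arcs can be sketched as below. This is a deliberate simplification: it links consecutive, non-overlapping occurrences of substrings of one fixed length, whereas Wattenberg's paper selects a more restricted set of matching pairs. The name `repeat_arcs` and the `length` parameter are assumptions.

```python
def repeat_arcs(sequence, length):
    """Return (start_a, start_b, piece) for consecutive repeats of a piece.

    `sequence` should be a string or tuple so that slices are hashable.
    """
    last_seen = {}
    arcs = []
    for start in range(len(sequence) - length + 1):
        piece = sequence[start:start + length]
        if piece not in last_seen:
            last_seen[piece] = start
        elif start >= last_seen[piece] + length:
            # Non-overlapping repeat: draw an arc from the previous occurrence.
            arcs.append((last_seen[piece], start, piece))
            last_seen[piece] = start
    return arcs


arcs = repeat_arcs("abcXabcYabc", length=3)
```

Each arc would be drawn as a semicircle over the sequence from `start_a` to `start_b`, with thickness proportional to `length`. Raising `length` is one way to suppress short, noisy repeats.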
Arc Diagrams • Interactive demonstration: • http://www.turbulence.org/Works/song/mono.html
Alternate Ending • Something went wrong with the demo, so here is a synopsis of arc diagrams
Paper References • Wise, J.A.; Thomas, J.J.; Pennock, K.; Lantrip, D.; Pottier, M.; Schur, A.; Crow, V. Visualizing the non-visual: spatial analysis and interaction with information from text documents. Proc InfoVis 1995. • van Wijk, J.J.; van Selow, E.R. Cluster and Calendar based Visualization of Time Series Data. Proc InfoVis 1999. • Wattenberg, M. Arc Diagrams: Visualizing Structure in Strings. Proc InfoVis 2002.