Visual Analytics Research at WPI
Dr. Matthew Ward and Dr. Elke Rundensteiner
Computer Science Department

What is Visual Analytics?
• “The science of analytical reasoning facilitated by interactive visual interfaces,” from Illuminating the Path: The Research and Development Agenda for Visual Analytics, J. Thomas and K. Cook (eds.), 2005
• More than information visualization or visual data mining, it involves technology to support all aspects of the analysis and reasoning processes.

An Overview of VA at WPI
Stages: Data Sources, Transforms, Abstractions, Visual Representations, Interaction Spaces, Discovery & Reasoning
Diagram items (in slide order): Files, Databases, Numeric, Nominal, Clustering, Sampling, Nominal to ordinal, Dimension reduction, Data (multiple), Statistics, Structure (hierarchy), Data, Structure (hierarchy), Clusters, Associations, Nuggets, Outliers, Spatial, Temporal, Quality, Quality, Uncertainty, Missing values, Data quality, Abstraction quality, Anomalies, Events, Trends, Hypotheses, Clutter reduction, Streaming, Evidence
Key: Past Work, Recent Work, Planned Work

Multiresolution Visualization
• For large datasets, visualizations quickly become cluttered
• We have extended all of our visualizations to work at multiple resolutions
• Hierarchical clustering generates many levels of detail
• Users can select areas of interest to view at full resolution, while the rest of the data is shown via cluster centers and extents (drawn as bands of variable opacity)
This work was funded by NSF grant IIS-9732897
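
The focus-plus-summary idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the clusters have already been computed, and the names `summarize_cluster` and `multires_view` are hypothetical. Clusters in the user's focus are emitted record by record; every other cluster is reduced to a center (per-dimension mean) and an extent (per-dimension minimum and maximum).

```python
from statistics import mean


def summarize_cluster(records):
    """Return (center, extent) for a non-empty list of equal-length records.

    center: per-dimension mean; extent: per-dimension (min, max) pair.
    """
    dims = list(zip(*records))  # transpose: one tuple of values per dimension
    center = [mean(d) for d in dims]
    extent = [(min(d), max(d)) for d in dims]
    return center, extent


def multires_view(clusters, focus):
    """Build a display list mixing full-resolution records and summaries.

    clusters: dict mapping a cluster name to its records (assumed precomputed).
    focus: set of cluster names the user wants at full resolution.
    """
    view = []
    for name, records in clusters.items():
        if name in focus:
            # Areas of interest: show every record.
            view.extend(("record", name, r) for r in records)
        else:
            # Everything else: show only the cluster center and extent.
            view.append(("summary", name, summarize_cluster(records)))
    return view


clusters = {"a": [(1.0, 2.0), (3.0, 4.0)], "b": [(10.0, 10.0)]}
view = multires_view(clusters, focus={"b"})
```

A renderer could then draw each summary as a band spanning its extent around the center line, which is one way to read the "bands of variable opacity" mentioned above.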

Dimension Reduction
• Dimensions are hierarchically clustered based on similarity measures
• The hierarchy is displayed using InterRing
• Users select clusters of dimensions or representative dimensions for detailed analysis
Example: 42-dimension census dataset
This work was funded by NSF grant IIS-0119276
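
As a rough sketch of hierarchically clustering dimensions, the Python below merges the two most similar groups of dimensions until one tree remains. The slide does not say which similarity measure is used; absolute Pearson correlation with average linkage is an assumption made here for illustration, and `pearson` and `build_dimension_hierarchy` are hypothetical names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_dimension_hierarchy(columns):
    """Agglomeratively cluster dimensions; columns maps name -> list of values.

    Assumes at least one dimension. Each node is (member_names, children),
    where children is None for a leaf or a pair of child nodes.
    """
    nodes = [((name,), None) for name in columns]

    def similarity(a, b):
        # Average absolute correlation over all cross-pairs (assumed measure).
        pairs = [(p, q) for p in a[0] for q in b[0]]
        return sum(abs(pearson(columns[p], columns[q])) for p, q in pairs) / len(pairs)

    while len(nodes) > 1:
        candidates = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))]
        i, j = max(candidates, key=lambda ij: similarity(nodes[ij[0]], nodes[ij[1]]))
        merged = (nodes[i][0] + nodes[j][0], (nodes[i], nodes[j]))
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.5], "c": [4, 1, 3, 2]}
root = build_dimension_hierarchy(columns)
```

In this toy data, `a` and `b` are nearly proportional, so they merge first and `c` joins at the root. Any node's member list can serve as a cluster of dimensions, and one member can be picked as its representative.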

Linking Spatial and Non-Spatial
• The diagonal plots of a scatterplot matrix can have numerous uses
• We've implemented histograms, line plots, and 2-D options
• The example shows multispectral remote sensing data, with one layer per diagonal plot
• Users can select in either 2-D or parameter space and see the corresponding elements in the other views

Layout Strategies
• Different layout strategies can reveal different patterns in the data
• Detecting, classifying, and measuring trends, outliers, repeated patterns, clusters, and correlations can be facilitated via specific layouts
Figure labels: Cyclic Data Driven Principal Components Order Driven

Visualizing Data with Nominal Fields
• Arbitrary assignment of non-numeric fields to numbers can lead to misinterpretation and lost patterns
• By looking at similarities in distributions across all dimensions, we can group values of a nominal variable with similar global characteristics
• Assignments are used to convey order and relative distance
Figure labels: Original Assignment; Assignment after Correspondence Analysis
This work was funded by NSF grant IIS-0119276 and funds from the NSA
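
To make the idea of distance-preserving numeric assignments concrete, here is a small pure-Python sketch. It does not reproduce the authors' procedure: it uses reciprocal averaging, an iterative scheme commonly described as converging to the first correspondence-analysis axis, as a stand-in. The function name, the toy contingency table, and the assumption that every row and column has a nonzero total are all illustrative.

```python
def reciprocal_averaging(table, iterations=50):
    """Assign one numeric score per nominal value.

    table maps each nominal value to its counts across the categories of
    another variable (one row of a contingency table). Assumes every row and
    column total is nonzero. Values with similar count profiles get nearby scores.
    """
    rows = list(table)
    n_cols = len(table[rows[0]])
    row_tot = {r: sum(table[r]) for r in rows}
    col_tot = [sum(table[r][j] for r in rows) for j in range(n_cols)]
    total = sum(col_tot)

    y = [float(j) for j in range(n_cols)]  # arbitrary starting column scores
    for _ in range(iterations):
        # Row scores are the weighted averages of the column scores ...
        x = {r: sum(n * s for n, s in zip(table[r], y)) / row_tot[r] for r in rows}
        # ... and column scores are the weighted averages of the row scores.
        y = [sum(table[r][j] * x[r] for r in rows) / col_tot[j] for j in range(n_cols)]
        # Center and rescale (weighted by column totals) to drop the trivial solution.
        mean = sum(c * s for c, s in zip(col_tot, y)) / total
        y = [s - mean for s in y]
        norm = (sum(c * s * s for c, s in zip(col_tot, y)) / total) ** 0.5
        if norm == 0:
            break
        y = [s / norm for s in y]
    return {r: sum(n * s for n, s in zip(table[r], y)) / row_tot[r] for r in rows}


# Hypothetical counts of vehicle type by two usage categories.
table = {"sedan": [10, 1], "coupe": [9, 2], "truck": [1, 10]}
scores = reciprocal_averaging(table)
```

Here `sedan` and `coupe` have similar profiles and land close together, while `truck` is placed far away; the sign of the axis is arbitrary, so only order and relative spacing are meaningful.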

Visual Clutter Reduction
• In scenes with thousands of moving objects, there is a need to reduce clutter
• We've explored and developed many strategies, including:
  • Information-preserving
  • Information-reducing
  • Visual remapping
This work was funded by a grant from the AFRL

Data Quality Visual Encoding
• Data quality refers to the degree of uncertainty of data
• Quality measures are visually encoded into existing visualizations
• This helps users focus on high-quality data to draw reliable conclusions
This work was funded by NSF grant IIS-0414380
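
One simple way to encode quality into an existing view is to map it to a visual attribute. The sketch below assumes quality is a number in [0, 1] and chooses opacity as the attribute; both the choice of opacity and the name `quality_to_opacity` are assumptions for illustration, since the slide does not state which encodings are used.

```python
def quality_to_opacity(quality, min_opacity=0.15):
    """Map a quality score in [0, 1] to an opacity in [min_opacity, 1].

    Out-of-range scores are clamped. Low-quality items stay faintly visible
    rather than disappearing, so the user can still see that they exist.
    """
    q = min(1.0, max(0.0, quality))
    return min_opacity + (1.0 - min_opacity) * q


# Hypothetical per-record quality scores.
opacities = [quality_to_opacity(q) for q in (0.0, 0.5, 1.0)]
```

High-quality records are drawn nearly opaque and therefore dominate the display, which matches the goal of steering attention toward reliable data.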

Quality Space Visualization
• Quality space is visualized separately to convey patterns in the data quality measures
• Records or dimensions can be ordered by quality to reveal structure and relations
• The stripe view shows the quality of individual data values; the histogram view shows summarization and distribution
Figure labels: StripeQualityMap; HistogramQualityMap
This work was funded by NSF grant IIS-0414380
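
The two operations named on this slide, ordering by quality and summarizing a quality distribution, can be sketched as follows. The sketch assumes one quality score in [0, 1] per data value and uses the mean score as a record's overall quality; that aggregate, the bin count, and the function names are illustrative assumptions.

```python
def order_records_by_quality(quality_rows):
    """Return record indices sorted from highest to lowest mean quality."""
    means = [sum(row) / len(row) for row in quality_rows]
    return sorted(range(len(quality_rows)), key=lambda i: means[i], reverse=True)


def quality_histogram(quality_rows, bins=5):
    """Count individual value-quality scores in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for row in quality_rows:
        for q in row:
            # Clamp so that q == 1.0 falls in the last bin and q < 0 in the first.
            idx = max(0, min(int(q * bins), bins - 1))
            counts[idx] += 1
    return counts


# Hypothetical quality scores: one row per record, one score per data value.
quality_rows = [[0.2, 0.4], [0.9, 1.0], [0.5, 0.5]]
order = order_records_by_quality(quality_rows)
hist = quality_histogram(quality_rows)
```

The ordered indices could drive the row order of a stripe-style view, and the bin counts correspond to the kind of summary a histogram-style view conveys.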

Interactions between Data Space and Quality Space
• Linking brush: when users select a subset in one space, the corresponding subset in the other space is highlighted.
• Sample figures: the data points with high values in the third dimension are highlighted in the data space, and the distribution of quality measures for this subset is then rendered in the quality map.
Figure labels: Data space with highlighting; Linked quality space
This work was funded by NSF grant IIS-0414380
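
A linking brush boils down to sharing one selection, expressed as record indices, between the two spaces. The sketch below mirrors the sample figures: it brushes records with a high third-dimension value in data space and then pulls the matching rows from quality space. The data, the threshold, and the function names are hypothetical.

```python
def brush(data_rows, predicate):
    """Return the set of record indices whose data values satisfy predicate."""
    return {i for i, row in enumerate(data_rows) if predicate(row)}


def linked_quality(quality_rows, selected):
    """Return the quality rows of the selected records, in record order."""
    return [quality_rows[i] for i in sorted(selected)]


# Hypothetical data space and quality space, aligned by record index.
data_rows = [(1, 2, 9), (3, 4, 1), (5, 6, 8)]
quality_rows = [[0.9, 0.8, 0.7], [0.5, 0.5, 0.5], [0.2, 0.3, 0.4]]

selected = brush(data_rows, lambda row: row[2] >= 8)  # high third dimension
subset_quality = linked_quality(quality_rows, selected)
```

Because the selection is just a set of indices, the same mechanism works in the opposite direction: brushing low-quality rows in quality space yields indices to highlight in data space.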

Nugget Management System (NMS)
• Nuggets are patterns, clusters, anomalies, or other features of a data set that have been visually or computationally isolated.
• NMS helps users extract, consolidate, and manage nuggets during their visual exploration.
• NMS eventually builds a hypothesis view based on the nugget space to support or refute users' hypotheses.
Figure labels: Nugget Space; Hypothesis View
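
The slide does not describe how NMS consolidates nuggets. As a purely illustrative sketch, the Python below treats each nugget as the set of record identifiers it covers and merges nuggets whose Jaccard overlap reaches a threshold; the representation, the overlap measure, the threshold, and the function names are all assumptions rather than the NMS design.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def consolidate(nuggets, threshold=0.5):
    """Greedily merge each nugget into the first sufficiently similar group.

    nuggets: iterable of sets of record ids (assumed representation).
    Returns a list of merged sets in order of first appearance.
    """
    merged = []
    for nugget in nuggets:
        for k, existing in enumerate(merged):
            if jaccard(nugget, existing) >= threshold:
                merged[k] = existing | nugget  # absorb the overlapping nugget
                break
        else:
            merged.append(set(nugget))  # no similar group: start a new one
    return merged


# Hypothetical nuggets found by three separate selections.
nuggets = [{1, 2, 3}, {2, 3, 4}, {10, 11}]
consolidated = consolidate(nuggets)
```

With the default threshold, the first two selections share half of their combined records and collapse into one nugget, while the third stays separate.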

Common Themes and Strategies
• Provide data and attributes in multiple, linked spaces
• Use automated and interactive tools for controlling and optimizing views
• Measure quality at all stages of the pipeline and convey it to the user for decision support
• Assess quality measures by comparing them to user responses
• Manage scale via abstractions such as sampling and clustering, but communicate information loss to the analyst to allow trade-offs
• Perform usability testing with all visualizations and interactive tools
• Release code to the public domain for widest possible impact

Some References
• Hierarchical Parallel Coordinates:
  • Fua, Y.-H., Ward, M. O., and Rundensteiner, E. A., "Hierarchical Parallel Coordinates for Visualizing Large Multivariate Data Sets", IEEE Conf. on Visualization '99, Oct. 1999.
• Hierarchical Dimension Management:
  • Jing Yang, Matthew O. Ward, Elke A. Rundensteiner and Shiping Huang, "Visual Hierarchical Dimension Reduction for Exploration of High Dimensional Datasets", Proc. VisSym 2003.
  • Jing Yang, Wei Peng, Matthew O. Ward and Elke A. Rundensteiner, "Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration of High Dimensional Datasets", IEEE Symposium on Information Visualization 2003 (InfoVis 2003), pp 105-112, October 2003.
• Visual Clutter Measurement and Reduction:
  • Wei Peng, Matthew O. Ward and Elke A. Rundensteiner, "Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering", IEEE Symposium on Information Visualization 2004 (InfoVis 2004), pp 89-96, October 2004.
• Glyph Layout:
  • Matthew O. Ward, "A taxonomy of glyph placement strategies for multidimensional data visualization", Information Visualization, Vol 1, pp 194-210, 2002.
• Nominal Data Visualization:
  • Geraldine E. Rosario, Elke A. Rundensteiner, David C. Brown, Matthew O. Ward and Shiping Huang, "Mapping Nominal Values to Numbers for Effective Visualization", Information Visualization Journal, Vol 3, pp 80-95, 2004.
• Data Quality Visualization:
  • Z. Xie, S. Huang, M. Ward, and E. Rundensteiner, "Exploratory Visualization of Multivariate Data with Variable Quality", Proc. IEEE Symposium on Visual Analytics Science and Technology, pp 183-190, 2006.
  • Zaixian Xie, Matthew O. Ward, Elke A. Rundensteiner, Shiping Huang, "Integrating Data and Quality Space Interactions in Exploratory Visualizations", The Fifth International Conference on Coordinated & Multiple Views in Exploratory Visualization (CMV 2007), pp 47-60, July 2007.
• Discovery Management:
  • Di Yang, Elke A. Rundensteiner, Matthew O. Ward, "Nugget Discovery in Visual Exploration Environments by Query Consolidation", ACM CIKM 2007, November 2007.
  • Di Yang, Elke A. Rundensteiner, Matthew O. Ward, "Analysis Guided Visual Exploration to Multivariate Data", IEEE Symposium on Visual Analytics Science and Technology, October 2007.