A Rank-by-Feature Framework for Interactive Multi-dimensional Data Exploration

Jinwook Seo and Ben Shneiderman, Human-Computer Interaction Lab. & Department of Computer Science, University of Maryland, College Park. Hierarchical Clustering Explorer (HCE).

Presentation Transcript


  1. A Rank-by-Feature Framework for Interactive Multi-dimensional Data Exploration Jinwook Seo and Ben Shneiderman Human-Computer Interaction Lab. & Department of Computer Science University of Maryland, College Park

  2. Hierarchical Clustering Explorer (HCE)

  3. Hierarchical Clustering Explorer (HCE) “HCE enabled us to find important clusters that we don’t know about yet.”

  4. Goal: Find Interesting Features in Multidimensional Data • Finding correlations, clusters, outliers, gaps, … is difficult in multidimensional data • Cognitive difficulties in >3D • Therefore utilize low-dimensional projections • Perceptual efficiency in 1D and 2D • Use Rank-by-Feature Framework to guide discovery

  5. Do you see anything interesting?

  6. Do you see any interesting feature?

  7. Correlation…What else?

  8. Outliers: He, Rn

  9. Demo Demonstration • Breakfast Cereals • 77 cereals • 8 dimensions (or variables): sugar, potassium, fiber, protein, etc. • US counties census data • 3138 counties • 14 dimensions: population density, poverty level, unemployment, etc.

  10. Low-dimensional Projections • Techniques • General: combination of variables for an axis • Axis parallel: a variable for an axis • Number of projections • Interface for Exploration [Figure axis labels: X1, X3, -2X1+X2, X1+2X2]
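
The distinction on slide 10 between general and axis-parallel projections, and the "number of projections" point, can be illustrated with a minimal Python sketch. The function names and the placeholder variable names X1..X14 are assumptions made for illustration; they are not taken from HCE.

```python
from itertools import combinations

def axis_parallel_views(names):
    """Return the 1D and 2D axis-parallel projections for the given variables."""
    one_d = list(names)                    # one histogram per variable
    two_d = list(combinations(names, 2))   # one scatterplot per unordered pair
    return one_d, two_d

def general_projection(row, weights):
    """Project one data row onto an axis defined by a weighted sum of variables."""
    return sum(w * x for w, x in zip(weights, row))

# 14 placeholder variable names (hypothetical, matching only the dimension count)
names = [f"X{i}" for i in range(1, 15)]
one_d, two_d = axis_parallel_views(names)
print(len(one_d), len(two_d))

# The two general axes named in the slide's figure labels, applied to a toy row
row = (1.0, 3.0)                           # (X1, X2), made-up values
print(general_projection(row, (-2, 1)))    # -2*X1 + X2
print(general_projection(row, (1, 2)))     # X1 + 2*X2
```

With n variables there are n one-dimensional and n(n-1)/2 two-dimensional axis-parallel projections, so a 14-dimensional data set already yields 91 scatterplots to browse.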

  11. Exploration by Projections • XGobi, GGobi – Scatterplot Browsing www.research.att.com/areas/stat/xgobi/ www.ggobi.org

  12. Exploration by Projections • Spotfire DecisionSite – Scatterplots www.spotfire.com

  13. Exploration by Projections • XGobi, GGobi – Grand Tour

  14. Exploration by Projections • XmdvTool – Scatterplot Matrix Worcester Polytechnic Institute

  15. Square Matrix Display Dimension selection tool Corrgram by Michael Friendly in GeoVISTA studio by Alan M. MacEachren

  16. Exploration by Projections • Spotfire DecisionSite – View Tip orders scatterplots

  17. Design Considerations • Hard to interpret arbitrary linear projections → Axis-parallel projections • Interestingness depends on applications → Incorporate users’ interest • Overview of all possible projections • Rapid change of axis

  18. Demo Demonstration • Breakfast Cereals • 77 cereals • 11 dimensions (or variables): sugar, potassium, fiber, protein, etc. • US counties census data • 3138 counties • 14 dimensions: population density, poverty level, unemployment, etc.

  19. Rank-by-Feature Framework: 1D • Ranking Criterion • Rank-by-Feature Prism • Score List • Manual Projection Browser

  20. Rank-by-Feature Framework: 2D • Ranking Criterion • Rank-by-Feature Prism • Score List • Manual Projection Browser

  21. A Ranking Example • 3138 U.S. counties with 17 attributes • Ranking Criterion: Uniformity (entropy): 6.7, 6.1, 4.5, 1.5 • Ranking Criterion: Pearson correlation: 0.996, 0.31, 0.01, -0.69
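
The two ranking criteria named on slide 21 can be sketched as follows: every 1D projection is scored by histogram entropy and every 2D projection by the Pearson correlation coefficient, and each list is then sorted by score. This is a sketch under assumptions, not HCE's implementation: the bin count, the base-2 logarithm, the function names, and the toy columns are all chosen here for illustration.

```python
import math
from itertools import combinations

def histogram_entropy(values, bins=8):
    """Entropy in bits of an equal-width histogram; higher means more uniform."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # every value falls into a single bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant column; treated as 0 here
    return cov / (sx * sy)

def rank_1d(columns, score):
    """Score every variable and return (score, name) tuples, best first."""
    return sorted(((score(v), name) for name, v in columns.items()), reverse=True)

def rank_2d(columns, score):
    """Score every pair of variables and return (score, a, b) tuples, best first."""
    pairs = combinations(columns, 2)
    return sorted(((score(columns[a], columns[b]), a, b) for a, b in pairs), reverse=True)

# Toy data with hypothetical column names; not the census data from the slides
columns = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0],
    "c": [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "d": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 9.0],
}

for score, name in rank_1d(columns, histogram_entropy):
    print(f"{name}: entropy {score:.2f}")
for score, a, b in rank_2d(columns, pearson):
    print(f"{a} vs {b}: r = {score:.3f}")
```

In this toy data the evenly spread columns score highest on uniformity and the skewed column "d" lands at the bottom, while the pair ("a", "b") tops the correlation ranking. Sorting by the signed coefficient places strong negative correlations last; ranking by absolute value would be an alternative design.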

  22. Ongoing and Future Work • Identify & implement more ranking criteria • Gaps, outliers, etc. • Ranking based on users’ selection of items • Separability of the selected items • Ranking by using only the selected items • Scalability Issue • How to handle a large number of dimensions • Grouping by clustering dimensions • Filtering uninteresting entries in the prism
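
Slide 22 lists outliers as a ranking criterion still to be implemented. One possible shape for such a criterion is sketched below: a 1D projection is scored by the number of values that fall outside the common 1.5 × IQR fences. The fence rule and the function name are assumptions made for illustration; the slides do not say how outliers would be defined.

```python
import statistics

def outlier_count(values, k=1.5):
    """Count values outside the [Q1 - k*IQR, Q3 + k*IQR] fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)

# Made-up columns: one with a single extreme value, one without
with_outlier = [1, 2, 3, 4, 5, 6, 7, 50]
without_outlier = [1, 2, 3, 4, 5, 6, 7, 8]
print(outlier_count(with_outlier), outlier_count(without_outlier))
```

A score of this kind plugs into the same pattern as the other criteria: compute it for every projection, then sort.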

  23. More about HCE • In collaboration with and sponsored by Eric Hoffman: Children’s National Medical Center • Freely downloadable at www.cs.umd.edu/hcil/hce • Version 3.0 beta, May 2004 • About 2,000 downloads since April 2002 • Licensing to ViaLactia Biosciences (NZ) Ltd.

  24. More Applications? • Try HCE and the Rank-by-Feature Framework with your problems and data • Join the case studies on the use of HCE and the Rank-by-Feature Framework • Welcome suggestions and comments

  25. Thank you!