Structuring Interactive Cluster Analysis
Wayne Oldford, University of Waterloo
Dept. of Computer Science, Memorial University of Newfoundland
Overview
Argument:
• ill-defined problem
• high-interaction desirable
• explore partitions
• recast algorithms
Content by example:
• problems
• resources
• interactive clustering
• partition moves
• implications
Problem … geometric/visual structure
Problem … context matters
Problem … structure in context … segmentation in MRI
Problem … context specific structure
Problem … some specific some not
Problem
• Find groups in data
• Similar objects are together
• Groups are separated
• What do you mean similar?
• Problem is ill defined:
• E.g. what is contiguous structure?
• When are groups separate?
• Can we believe it?
Computational resources
1. Processing
2. Memory
3. Display
Balance and integrate
High interaction
• multiple displays
• integrate computational resources
• software design?
Example: image analysis
Example: context and function plots
Example: mutual support and shapes
Example: exploratory data analysis
Interactive clustering
• visual grouping
• location, motion, shape, texture, ...
• linking across displays
• manual
• selection
• cases, variates, groups, ...
• colouring
• focus
• immediate and incremental
• context can be used to form groups
• multiple partitions
Automated clustering: typical software
• resources dedicated to numerical computation
• teletype interaction
• runs to completion
• graphical “output”
• doesn’t always work so well (no universal solution)
• confirm via exploratory data analysis
Must be integrated with interactive methods
Example: K-means clustering
Example: VERI (Visual Empirical Regions of Influence)
Join two points if no third point falls in this region
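The joining rule on this slide can be sketched in a few lines. The actual VERI region was shown as a figure and is not recoverable from the text, so the sketch below substitutes a hypothetical stand-in: a disc centred on the midpoint of the two points, inflated by an assumed factor of 1.5. The names `in_stand_in_region`, `region_graph` and `components` are illustrative only. Groups are read off as connected components of the resulting graph.

```python
from math import dist

def in_stand_in_region(a, b, c, scale=1.5):
    """ASSUMPTION: stand-in region, NOT the empirical VERI shape.
    True if c lies inside a disc centred midway between a and b."""
    mid = tuple((u + v) / 2 for u, v in zip(a, b))
    return dist(c, mid) < scale * dist(a, b) / 2

def region_graph(points, in_region=in_stand_in_region):
    """Join points i and j if no third point falls in their region."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            blocked = any(in_region(points[i], points[j], points[k])
                          for k in range(n) if k != i and k != j)
            if not blocked:
                edges.append((i, j))
    return edges

def components(n, edges):
    """Group labels from connected components (simple union-find)."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i, j in edges:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

points = [(0, 0), (1, 0), (10, 0), (11, 0)]
edges = region_graph(points)
groups = components(len(points), edges)
```

Passing a different `in_region` predicate swaps in another region shape without touching the rest of the sketch.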
Example: VERI
Integrating automatic methods:
Move about the space of partitions: Pa --> Pb --> Pc --> ….
Which operators f, with f(Pa) --> Pb, are of interest?
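One way to picture operators on the space of partitions is to store a partition as a list of group labels and a move as a function from one partition to the next. This is a minimal sketch under that assumption; `relabel`, `merge`, `split_off` and `walk` are hypothetical names, not taken from the talk or from Quail.

```python
from typing import Callable, List

Partition = List[int]                  # Partition[i] is the group of case i
Move = Callable[[Partition], Partition]

def relabel(p: Partition) -> Partition:
    """Canonical labels 0, 1, 2, ... in order of first appearance."""
    seen = {}
    return [seen.setdefault(g, len(seen)) for g in p]

def merge(a: int, b: int) -> Move:
    """Hypothetical reduce-style move: join groups a and b."""
    def move(p: Partition) -> Partition:
        return relabel([a if g == b else g for g in p])
    return move

def split_off(case: int) -> Move:
    """Hypothetical refine-style move: give one case its own group."""
    def move(p: Partition) -> Partition:
        q = list(p)
        q[case] = max(p) + 1
        return relabel(q)
    return move

def walk(p: Partition, moves: List[Move]) -> List[Partition]:
    """Apply moves in turn, keeping the whole path Pa, Pb, Pc, ..."""
    path = [p]
    for f in moves:
        path.append(f(path[-1]))
    return path

path = walk([0, 0, 1, 1, 2], [merge(1, 2), split_off(0)])
# path[0] is Pa, path[1] is Pb, path[2] is Pc
```

Keeping the whole path, rather than only the latest partition, is what allows stepping back or replaying a sequence of moves.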
Refine, Reduce
Reassign
Refinement sequence: 1 -> 2 -> 3 -> 4 -> 5
Reassign, reduce sequence: 5 -> 5
Reassign, reduce sequence: 5 -> 5 -> 4 -> 3 -> 2
Moves (with examples):
• refine (Pold) --> Pnew: break minimal spanning tree
• reduce (Pold) --> Pnew: join near centres
• reassign (Pold) --> Pnew: k-means, maximize F
• partition (graphic) --> Pnew: colours from point cloud
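The first three moves listed above can be sketched as plain functions on a label list. This is an illustration under stated assumptions, not the talk's implementation: refine cuts the single longest minimal-spanning-tree edge, reduce joins the two groups with the nearest centres, and reassign is one nearest-centre step of k-means. The criterion F is not defined in the slide text and is not modelled. All function names are hypothetical.

```python
from math import dist

def centres(points, p):
    """Mean of each group, keyed by group label."""
    groups = {}
    for x, g in zip(points, p):
        groups.setdefault(g, []).append(x)
    return {g: tuple(sum(c) / len(xs) for c in zip(*xs))
            for g, xs in groups.items()}

def mst_edges(points, idx):
    """Naive Prim's algorithm on the cases in idx; edges are (length, i, j)."""
    if len(idx) < 2:
        return []
    in_tree, edges = {idx[0]}, []
    while len(in_tree) < len(idx):
        d, i, j = min((dist(points[i], points[j]), i, j)
                      for i in in_tree for j in idx if j not in in_tree)
        edges.append((d, i, j))
        in_tree.add(j)
    return edges

def refine_move(points, p):
    """Split one group by cutting the longest edge of its spanning tree."""
    best = None
    for g in sorted(set(p)):
        edges = mst_edges(points, [i for i, h in enumerate(p) if h == g])
        if edges and (best is None or max(edges)[0] > best[0][0]):
            best = (max(edges), edges)
    if best is None:
        return list(p)
    cut, edges = best
    adj = {}
    for e in edges:
        if e != cut:
            adj.setdefault(e[1], []).append(e[2])
            adj.setdefault(e[2], []).append(e[1])
    comp, stack = {cut[2]}, [cut[2]]   # cases on the far side of the cut
    while stack:
        for j in adj.get(stack.pop(), []):
            if j not in comp:
                comp.add(j)
                stack.append(j)
    new = max(p) + 1
    return [new if i in comp else g for i, g in enumerate(p)]

def reduce_move(points, p):
    """Join the two groups whose centres are nearest."""
    c = centres(points, p)
    labels = sorted(c)
    if len(labels) < 2:
        return list(p)
    a, b = min(((a, b) for i, a in enumerate(labels) for b in labels[i + 1:]),
               key=lambda ab: dist(c[ab[0]], c[ab[1]]))
    return [a if g == b else g for g in p]

def reassign_move(points, p):
    """One k-means step: send each case to its nearest current centre."""
    c = centres(points, p)
    return [min(sorted(c), key=lambda g: dist(x, c[g])) for x in points]

points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1)]
refined = refine_move(points, [0, 0, 0, 0, 0, 0])
reassigned = reassign_move(points, [0, 0, 0, 0, 1, 1])
reduced = reduce_move(points, [0, 0, 1, 2, 2, 2])
```

Each function takes a partition and returns a new one, so the moves can be chained in any order, for example repeated refine followed by reassign.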
Challenges:
• varying focus
• subsets (selected manually and at random)
• merging new data into partition
• exploring multiple partitions
• interactive display and comparison
• resolving many to one
• interface design
• control panels, options
• interaction
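Comparing two partitions of the same cases, and resolving them into one, can be illustrated with standard tools. The talk does not say which comparison it uses; the sketch below assumes a cross-tabulation, the Rand index, and the common refinement (cases grouped together only if both partitions agree) as one possible resolution. Function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cross_tab(p, q):
    """Counts of cases for each (group in p, group in q) pair."""
    return Counter(zip(p, q))

def rand_index(p, q):
    """Fraction of case pairs on which p and q agree (together or apart)."""
    pairs = list(combinations(range(len(p)), 2))
    agree = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agree / len(pairs)

def common_refinement(p, q):
    """One way to resolve two partitions into one: intersect their groups."""
    seen = {}
    return [seen.setdefault(pair, len(seen)) for pair in zip(p, q)]

p = [0, 0, 1, 1]
q = [0, 0, 0, 1]
table = cross_tab(p, q)
score = rand_index(p, q)
resolved = common_refinement(p, q)
```

The cross-tabulation is the natural thing to display interactively, since each cell corresponds to a selectable subset of cases.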
Interface - reduce
Interface - refine
Interface - reassign
Interaction
Interaction - refine 2
Interaction - refine 3
Interaction - save partition movie
Interaction - refine 4
Interaction - refine 5
Interaction - refine 5 dendrogram
Interaction - reassign
Interaction - cluster plot movie
Implications:
• Algorithms (re)cast in terms of moves:
• refine, reduce
• reassign
• partition, partition-path
• easily understandable (e.g. geometric structures)
• specify required data structures
• e.g. ms tree, triangulation, var-cov matrix, …
New problems:
• interface design
• multiple partitions
• comparison and/or resolution
• multiple display
• inference
Summary
• Cluster analysis is naturally exploratory and needs integration with modern interactive data analysis.
• Enlarging the problem to partitions:
• simplifies and gives structure
• encourages exploratory approach
• integrates naturally
• introduces new possibilities (analysis and research)
Acknowledgements:
• Catherine Hurley, Erin McLeish, Rayan Yahfoufi, Natasha Wiebe
• U(W) students in statistical computing
• Quail: Quantitative Analysis in Lisp http://www.stats.uwaterloo.ca/Quail