Overlapping Matrix Pattern Visualization: a Hypergraph Approach

This research focuses on reordering rows and columns of a data matrix to display submatrices effectively. Explore the visualization cost, hypergraph ordering problem, and leveraging MLA for optimal visualization results.

Presentation Transcript


  1. Overlapping Matrix Pattern Visualization: a Hypergraph Approach • Ruoming Jin, Kent State University • Joint work with Yang Xiang, David Fuhry, and Feodor F. Dragan (KSU)

  2. The Problem • Given a set of discovered submatrices, how can we reorder the rows and columns of the data matrix to best display these submatrices and their relationship?

  3. Motivation: Overlapping Bicluster Visualization • Gene expression profiles (rows: genes, columns: conditions, matrix entries: expression levels) • Biclustering: homogeneous submatrices (genes × conditions) • Biclustering visualization problem [GMM06, KG07]

  4. Motivation: Transactional Data Visualization • Shopping-basket data (rows: transactions, columns: items, binary matrix) • Transactional data summarization using a set of dense submatrices [CK07, WK06, XJFD08] • Example submatrices: {t1,t2,t7,t8}×{i1,i2,i8,i9}, {t4,t5}×{i4,i5,i6}, {t2,t3,t6,t7}×{i2,i3,i7,i8} • Summarization cost = 8+8+5 = 21

  5. Roadmap • Problem Definition • Visualization cost • Hardness of the visualization problem • Hypergraph ordering problem • Minimum linear arrangement (MLA) • Algorithm • Leveraging MLA and local convergence • Experimental Results

  6. Submatrix Visualization Cost • Given a display of the matrix (a fixed row order and column order), how can we measure the goodness of the “visualization” of a submatrix? • Example submatrix: {t1,t2,t7,t8}×{i1,i2,i8,i9} • First display: rows t1 t2 t3 t4 t5 t6 t7 t8, columns i1 i2 i3 i4 i5 i6 i7 i8 i9 • Second display: rows t1 t8 t2 t7 t3 t6 t4 t5, columns i1 i2 i8 i9 i3 i7 i4 i5 i6 • Why is the second display intuitively better than the first?

  7. Submatrix Visualization Cost • Example submatrix: {t1,t2,t7,t8}×{i1,i2,i8,i9} • First display: rows t1 t2 t3 t4 t5 t6 t7 t8, columns i1 i2 i3 i4 i5 i6 i7 i8 i9 • Second display: rows t1 t8 t2 t7 t3 t6 t4 t5, columns i1 i2 i8 i9 i3 i7 i4 i5 i6 • Area: 8x8, 6x6, 4x4, 4x4 • Perimeter: 8+8, 6+6, 4+4, 4+4 • Given a row order and a column order, the visualization cost of a submatrix is the sum of (1) the difference between its first and last row w.r.t. the row order and (2) the difference between its first and last column w.r.t. the column order

  8. Matrix Visualization Cost • Given a row order, a column order, and a set of submatrices, the matrix visualization cost is the sum of the visualization costs of these submatrices. • Matrix Optimal Visualization Problem: find the row order and column order that minimize the matrix visualization cost.
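The cost definitions on slides 7 and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`span`, `submatrix_cost`, `matrix_cost`) are hypothetical, the "difference between first and last" is taken as last position minus first position, and the three submatrices of slide 4 are combined with the two displays of slide 6 purely as sample data.

```python
def span(items, order):
    """Last position minus first position of `items` within `order`."""
    pos = {x: k for k, x in enumerate(order)}
    hits = [pos[x] for x in items]
    return max(hits) - min(hits)

def submatrix_cost(rows, cols, row_order, col_order):
    """Row span plus column span of one submatrix (slide 7)."""
    return span(rows, row_order) + span(cols, col_order)

def matrix_cost(submatrices, row_order, col_order):
    """Sum of the submatrix costs (slide 8)."""
    return sum(submatrix_cost(r, c, row_order, col_order) for r, c in submatrices)

# Sample data: submatrices from slide 4, displays from slide 6.
submatrices = [
    ({"t1", "t2", "t7", "t8"}, {"i1", "i2", "i8", "i9"}),
    ({"t4", "t5"}, {"i4", "i5", "i6"}),
    ({"t2", "t3", "t6", "t7"}, {"i2", "i3", "i7", "i8"}),
]
rows = [f"t{k}" for k in range(1, 9)]
cols = [f"i{k}" for k in range(1, 10)]
new_rows = ["t1", "t8", "t2", "t7", "t3", "t6", "t4", "t5"]
new_cols = ["i1", "i2", "i8", "i9", "i3", "i7", "i4", "i5", "i6"]

before = matrix_cost(submatrices, rows, cols)
after = matrix_cost(submatrices, new_rows, new_cols)
print(before, after)
```

Under these assumptions the reordered display has the lower total cost, which matches the intuition behind slide 6.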

  9. Roadmap • Problem Definition • Visualization cost • Hardness of the visualization problem • Hypergraph ordering problem • Minimal linear arrangement (MLA) • Algorithm • Leveraging MLA and Local convergence • Experimental Results

  10. Hypergraph Ordering • Hypergraph HG=(V,X) • V is the set of vertices • X={x1,x2,…} is the set of hyperedges, where each hyperedge is a set of vertices • Hyperedge cost and hypergraph cost • Hypergraph Ordering Problem • Example (vertex order 0 1 2 3 4 5 6): hyperedge {0,2,3,4} cost = 4; hyperedge {1,3,5} cost = 4; hypergraph cost = 16
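A minimal sketch of the hyperedge and hypergraph costs, assuming (consistently with the two example hyperedges on slide 10) that a hyperedge's cost is the distance between its first and last vertex in the order. The function names are hypothetical, and only the two hyperedges named on the slide are used, so the total below covers those two alone rather than the full example hypergraph.

```python
def hyperedge_cost(hyperedge, order):
    """Last position minus first position of the hyperedge's vertices."""
    pos = {v: k for k, v in enumerate(order)}
    hits = [pos[v] for v in hyperedge]
    return max(hits) - min(hits)

def hypergraph_cost(hyperedges, order):
    """Sum of the hyperedge costs under the given vertex order."""
    return sum(hyperedge_cost(x, order) for x in hyperedges)

order = [0, 1, 2, 3, 4, 5, 6]
hyperedges = [{0, 2, 3, 4}, {1, 3, 5}]  # the two hyperedges named on the slide

costs = [hyperedge_cost(x, order) for x in hyperedges]
total = hypergraph_cost(hyperedges, order)
print(costs, total)
```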

  11. The Link between Matrix Visualization and Hypergraph Ordering • HG1: hypergraph over the columns i1 ... i9 • HG2: hypergraph over the rows t1 ... t8 • Relationship between matrix visualization cost and hypergraph cost • Finding the minimum visualization (or hypergraph) cost is NP-hard
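One way to read the link on slide 11, sketched under assumptions: if each submatrix contributes its column set as a hyperedge of HG1 and its row set as a hyperedge of HG2, then the matrix visualization cost splits into the two hypergraph costs, so rows and columns can be ordered independently. The function names and the sample data (submatrices from slide 4, second display from slide 6) are illustrative choices, not taken from the slide itself.

```python
def span(items, order):
    """Last position minus first position of `items` within `order`."""
    pos = {x: k for k, x in enumerate(order)}
    hits = [pos[x] for x in items]
    return max(hits) - min(hits)

def hypergraph_cost(hyperedges, order):
    return sum(span(x, order) for x in hyperedges)

def matrix_cost(submatrices, row_order, col_order):
    return sum(span(r, row_order) + span(c, col_order) for r, c in submatrices)

submatrices = [
    ({"t1", "t2", "t7", "t8"}, {"i1", "i2", "i8", "i9"}),
    ({"t4", "t5"}, {"i4", "i5", "i6"}),
    ({"t2", "t3", "t6", "t7"}, {"i2", "i3", "i7", "i8"}),
]
hg1 = [cols for _, cols in submatrices]  # column hypergraph (one hyperedge per submatrix)
hg2 = [rows for rows, _ in submatrices]  # row hypergraph

row_order = ["t1", "t8", "t2", "t7", "t3", "t6", "t4", "t5"]
col_order = ["i1", "i2", "i8", "i9", "i3", "i7", "i4", "i5", "i6"]

total = matrix_cost(submatrices, row_order, col_order)
split = hypergraph_cost(hg2, row_order) + hypergraph_cost(hg1, col_order)
print(total, split)
```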

  12. Graph cost w.r.t. a vertex order • MLA (Minimal Linear Arrangement): find an optimal vertex ordering to minimize the graph cost • The Hypergraph Ordering Problem is a generalization of MLA • Order 0 1 2 3 4 5 6: graph cost = 2+2+2*1+1+4+3+2 = 16 • Order 0 1 2 5 4 3 6: graph cost = 2+4+2*3+4+2+1+1 = 18
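To make the graph cost concrete, here is a small sketch that sums the edge lengths |pos(u) - pos(v)| under a vertex order and finds a minimum arrangement by exhaustive search. The edge list is a made-up toy graph (the slide's own graph is not recoverable from the transcript), and brute force is only a stand-in that works for a handful of vertices; it is not one of the MLA heuristics the talk refers to.

```python
from itertools import permutations

def graph_cost(edges, order):
    """Sum of |pos(u) - pos(v)| over all edges; repeated edges count repeatedly."""
    pos = {v: k for k, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def brute_force_mla(vertices, edges):
    """Try every vertex order and keep one with minimum cost (tiny graphs only)."""
    return min(permutations(vertices), key=lambda order: graph_cost(edges, order))

vertices = [0, 1, 2, 3, 4]
edges = [(0, 2), (2, 4), (1, 3)]  # hypothetical toy graph

identity_cost = graph_cost(edges, vertices)
best = brute_force_mla(vertices, edges)
best_cost = graph_cost(edges, best)
print(identity_cost, best_cost)
```

Every edge has length at least 1, so a cost equal to the number of edges cannot be improved; the toy graph reaches that bound once 0, 2, 4 and 1, 3 are placed contiguously.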

  13. Roadmap • Problem Definition • Visualization cost • Hardness of the visualization problem • Hypergraph ordering problem • Minimal linear arrangement • Algorithm • Leveraging MLA and Local convergence • Experimental Results

  14. Basic Idea for Hypergraph Ordering • Much existing work addresses the MLA problem (heuristics or bounded approximations) • Instead of working from scratch on the hypergraph ordering problem, can we somehow leverage the MLA algorithms? • The answer is YES!

  15. Basic Procedure • Given the hypergraph HG=(V,X), start with a random vertex order π: • Step 1: Transform the hypergraph HG into a graph G=(V,E) based on the vertex order π, so that cost(HG, π) = cost(G, π) • Step 2: Run an MLA algorithm on graph G to produce a new optimal vertex order π’, so that cost(G, π) ≥ cost(G, π’) • Step 3: If the new order improves the hypergraph cost, i.e. cost(HG, π) > cost(HG, π’), then use π’ as the new order (π = π’) and repeat Steps 1 and 2 • Because cost(G, π’) ≥ cost(HG, π’), the chain is: cost(HG, π) = cost(G, π) ≥ cost(G, π’) ≥ cost(HG, π’)
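The three steps can be sketched as a loop. This is a hedged illustration, not the authors' implementation: every function name is hypothetical, the hyperedge-to-path conversion follows slide 16, and the MLA step uses exhaustive search as a placeholder for whatever MLA heuristic would be plugged in for realistically sized inputs.

```python
from itertools import permutations

def positions(order):
    return {v: k for k, v in enumerate(order)}

def hypergraph_cost(hyperedges, order):
    pos = positions(order)
    return sum(max(pos[v] for v in x) - min(pos[v] for v in x) for x in hyperedges)

def to_graph(hyperedges, order):
    """Step 1: replace each hyperedge by a path through its vertices in order."""
    pos = positions(order)
    edges = []
    for x in hyperedges:
        chain = sorted(x, key=pos.__getitem__)
        edges.extend(zip(chain, chain[1:]))
    return edges

def graph_cost(edges, order):
    pos = positions(order)
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def brute_force_mla(vertices, edges):
    """Placeholder MLA solver: exhaustive search, feasible only for tiny graphs."""
    return list(min(permutations(vertices), key=lambda o: graph_cost(edges, o)))

def order_hypergraph(vertices, hyperedges, start, mla=brute_force_mla):
    order, cost = list(start), hypergraph_cost(hyperedges, start)
    while True:
        graph = to_graph(hyperedges, order)           # Step 1
        candidate = mla(vertices, graph)              # Step 2
        candidate_cost = hypergraph_cost(hyperedges, candidate)
        if candidate_cost >= cost:                    # Step 3: no improvement, stop
            return order, cost
        order, cost = candidate, candidate_cost

vertices = list(range(7))
hyperedges = [{0, 2, 3, 4}, {1, 3, 5}]  # sample hyperedges from slide 10
start_cost = hypergraph_cost(hyperedges, vertices)
final_order, final_cost = order_hypergraph(vertices, hyperedges, vertices)
print(start_cost, final_order, final_cost)
```

The loop terminates because the hypergraph cost is a non-negative integer that strictly decreases on every accepted step; the result is a local optimum, not necessarily a global one.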

  16. (Step 1) Transformation: Hyperedge->Path • Vertex order: 0 1 2 3 4 5 6 • Hyperedge cost = path cost!
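A short sketch of why the path conversion preserves cost: if a hyperedge's vertices are chained in the current order, the consecutive gaps telescope to last position minus first position. Under a different order the same path can only cost as much or more than the hyperedge. The function names are hypothetical; the hyperedges are the two named on slide 10 and the second order is the one shown on slide 17.

```python
def hyperedge_to_path(hyperedge, order):
    """Chain the hyperedge's vertices in the sequence they appear in `order`."""
    pos = {v: k for k, v in enumerate(order)}
    chain = sorted(hyperedge, key=pos.__getitem__)
    return list(zip(chain, chain[1:]))

def path_cost(edges, order):
    pos = {v: k for k, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def hyperedge_cost(hyperedge, order):
    pos = {v: k for k, v in enumerate(order)}
    return max(pos[v] for v in hyperedge) - min(pos[v] for v in hyperedge)

order = [0, 1, 2, 3, 4, 5, 6]
new_order = [0, 2, 3, 5, 6, 4, 1]

for x in ({0, 2, 3, 4}, {1, 3, 5}):
    path = hyperedge_to_path(x, order)  # path built under the original order
    print(sorted(x), path,
          hyperedge_cost(x, order), path_cost(path, order),          # equal
          hyperedge_cost(x, new_order), path_cost(path, new_order))  # hyperedge <= path
```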

  17. Step 1->Step 2 • Step 1 (Hypergraph->Graph), order 0 1 2 3 4 5 6: cost(G, π)=2+2+2*1+1+4+3+2=16=cost(HG, π) • Step 2 (MLA), order 0 2 3 5 6 4 1: cost(G, π’)=1+2+2*1+2+1+2+3=13<cost(G, π)

  18. Step 1->Step 2->Step 3 • Step 1 (Hypergraph->Graph), order 0 1 2 3 4 5 6: cost(G, π)=cost(HG, π)=16 • Step 2 (MLA), order 0 2 3 5 6 4 1: cost(G, π’)=13<cost(G, π) • With the new ordering, hyperedge cost ≤ path cost!

  19. Step 1->Step 2->Step 3 • Step 1 (Hypergraph->Graph), order 0 1 2 3 4 5 6: cost(G, π)=cost(HG, π)=16 • Step 2 (MLA), order 0 2 3 5 6 4 1: cost(G, π’)=13<cost(G, π) • Step 3: cost(HG, π’)=10<cost(G, π’)=13 • cost(HG, π)=cost(G, π)>cost(G, π’)>cost(HG, π’)

  20. Run Iteratively and Local Convergence

  21. Other conversions of a hyperedge • Converting a hyperedge to a cycle • Converting a hyperedge to multicycles

  22. Roadmap • Problem Definition • Visualization cost • Hardness of the visualization problem • Hypergraph ordering • Algorithm • Minimum linear arrangement (MLA) • Leveraging MLA and local convergence • Experimental Results

  23. Visualization effects

  24. Visualization effects (continued)

  25. Visualization effects (continued)

  26. Cost and running time

  27. Conclusion • We found an interesting link from the matrix visualization problem to a well-known graph-theoretical problem: the minimal linear arrangement (MLA) problem. • Theoretically, we introduce a generalization of the MLA problem to hypergraphs and develop a novel local convergence algorithm. • Our method can be incorporated into an interactive visualization environment to allow users to focus on different parts of the data and patterns.

  28. Thanks!!
