
Enhancing Set-Analysis through Scalable Visualizations



Presentation Transcript


  1. Enhancing Set-Analysis through Scalable Visualizations Presented by: Hamid Haidarian Shahri (hamid@cs.umd.edu) Mudit Agrawal (mudit@cs.umd.edu)

  2. Content • Problem Definition • Motivation • Dataset • Architecture • Visualization Methods • Interaction Tools • Demo • Future Work CMSC 838S    Information Visualization Spring 2006

  3. Problem Definition • Analysis of sets by • representing the clusters graphically • depicting their internal and external links • Scaling visualization

  4. Motivation • Sets are encountered in various domains • websites • commodities • publications • anything that has attributes • Visualizing sets to aid human perception is still an unsolved problem • there are no direct relations between sets (or their elements) in the spatial domain • sets can be grouped based on various attributes

  5. Dataset • 2700 law cases • Each case is identified by a numerical id ranging from 1000 to 3718 • Each tuple in the dataset represents a reference from one case to another • The relation is unidirectional and not symmetric (the referencing also implies a temporal constraint on the cases)

  6. Snapshot of the data First 50 links (approximately 0.1 percent of whole dataset): (1001,1105,'100 S.Ct. 318'),(1001,1612,'101 S.Ct. 2352'),(1001,1018,'107 S.Ct. 1232'),(1001,1016,'112 S.Ct. 2886'),(1001,2923,'113 S.Ct. 2264'),(1001,1016,'120 L.Ed.2d 798'),(1001,2923,'124 L.Ed.2d 539'),(1001,2286,'138 F.3d 1036'),(1001,2396,'238 F.3d 382'),(1001,3410,'438 U.S. 104'),(1001,1105,'444 U.S. 51'),(1001,1612,'452 U.S. 264'),(1001,1018,'480 U.S. 470'),(1001,1016,'505 U.S. 1003'),(1001,2923,'508 U.S. 602'),(1001,3410,'57 L.Ed.2d 631'),(1001,1105,'62 L.Ed.2d 210'),(1001,1612,'69 L.Ed.2d 1'),(1001,1789,'926 F.2d 1169'),(1001,1018,'94 L.Ed.2d 472'),(1001,3410,'98 S.Ct. 2646'),(1002,1276,'100 S.Ct. 2138'),(1002,1101,'105 S.Ct. 3108'),(1002,1018,'107 S.Ct. 1232'),(1002,1098,'107 S.Ct. 2378'),(1002,1016,'112 S.Ct. 2886'),(1002,1015,'114 S.Ct. 2309'),(1002,1016,'120 L.Ed.2d 798'),(1002,1013,'121 S.Ct. 2448'),(1002,1012,'122 S.Ct. 1465'),(1002,1015,'129 L.Ed.2d 304'),(1002,2316,'142 F.3d 1319'),(1002,1013,'150 L.Ed.2d 592'),(1002,1012,'152 L.Ed.2d 517'),(1002,1121,'266 F.3d 487'),(1002,3028,'306 F.3d 113'),(1002,3410,'438 U.S. 104'),(1002,1276,'447 U.S. 255'),(1002,1101,'473 U.S. 172'),(1002,1018,'480 U.S. 470'),(1002,1098,'482 U.S. 304'),(1002,1016,'505 U.S. 1003'),(1002,1015,'512 U.S. 374'),(1002,1013,'533 U.S. 606'),(1002,1012,'535 U.S. 302'),(1002,3410,'57 L.Ed.2d 631'),(1002,2091,'59 F.3d 852'),(1002,1276,'65 L.Ed.2d 106'),(1002,1889,'746 F.2d 135'),(1002,1101,'87 L.Ed.2d 126'),(1002,1018,'94 L.Ed.2d 472'),(1002,2319,'953 F.2d 1299'),(1002,1098,'96 L.Ed.2d 250'),(1002,3410,'98 S.Ct. 2646'),(1002,1022,'980 F.2d 84'),(1002,2670,'989 F.2d 362'),(1003,1104,'100 S.Ct. 383'),(1003,1611,'104 S.Ct. 2862'),(1003,1100,'106 S.Ct. 1018'),(1003,1099,'107 S.Ct. 2076'),(1003,1016,'112 S.Ct. 2886'),(1003,3110,'116 S.Ct. 2432'),(1003,1016,'120 L.Ed.2d 798'),(1003,1012,'122 S.Ct. 1465'),(1003,1881,'13 F.3d 1192'),(1003,3054,'133 F.3d 893'),(1003,3110,'135 L.Ed.2d 964'),(1003,1012,'152 L.Ed.2d 517'),(1003,1047,'18 F.3d 1560'),(1003,1886,'265 F.3d 1237'),(1003,2689,'271 F.3d 1090'),(1003,1358,'271 F.3d 1327'),(1003,1149,'28 F.3d 1171'),(1003,1040,'331 F.3d 891') Example tuple: (1001,1105,'100 S.Ct. 318')
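A minimal sketch of how tuples in this format might be loaded into per-case reference sets. It assumes each tuple reads (citing case id, cited case id, citation string); the slides state only that a tuple implies a reference, so the direction is an assumption. The names `TUPLE_RE` and `build_reference_sets` are hypothetical and not part of the original system.

```python
import re
from collections import defaultdict

# Matches tuples such as (1001,1105,'100 S.Ct. 318').
TUPLE_RE = re.compile(r"\((\d+),(\d+),'([^']*)'\)")


def build_reference_sets(raw):
    """Map each citing case id to the set of case ids it references.

    Assumption: the first id is the citing case, the second the cited case.
    """
    refs = defaultdict(set)
    for citing, cited, _citation in TUPLE_RE.findall(raw):
        # The same pair can occur several times (parallel citations);
        # storing ids in a set keeps each referenced case once.
        refs[int(citing)].add(int(cited))
    return dict(refs)


# Four tuples taken from the snapshot above.
sample = (
    "(1001,1105,'100 S.Ct. 318'),(1001,1612,'101 S.Ct. 2352'),"
    "(1001,1105,'444 U.S. 51'),(1002,1276,'100 S.Ct. 2138')"
)
reference_sets = build_reference_sets(sample)
```

Representing each case by the set of cases it references is what makes set-based similarity measures applicable later in the pipeline.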

  7. Architecture [Diagram components: Data, Similarity Metric, Clustering Module, Clustered Data, Visualization Module]

  8. Routine K-Means Clustering • Data points are assumed to lie in a vector space • The data point x and the cluster centroid are vectors • This assumption does not hold for cases represented as sets • Centroids are not simple geometric means • In fact, a mean is not meaningful for sets

  9. Routine Self Organizing Map • Wv and D are assumed to be vectors • Wv(t + 1) = Wv(t) + Θ(t) α(t) [D(t) - Wv(t)] • This assumption does not hold for sets
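To make the vector assumption concrete, here is a small sketch of one step of the update rule quoted on the slide, applied to ordinary numeric vectors. The function name `som_update` and the sample values are illustrative only.

```python
def som_update(weight, sample, neighborhood, learning_rate):
    """One step of Wv(t+1) = Wv(t) + Θ(t) α(t) [D(t) - Wv(t)].

    `weight` (Wv) and `sample` (D) are equal-length lists of numbers;
    `neighborhood` is Θ(t) and `learning_rate` is α(t).
    """
    return [
        w + neighborhood * learning_rate * (d - w)
        for w, d in zip(weight, sample)
    ]


# The weight vector moves halfway toward the input vector.
updated = som_update([0.0, 0.0], [1.0, 2.0], neighborhood=1.0, learning_rate=0.5)
```

The step relies on component-wise subtraction and scaling (`d - w`, multiplied by a scalar). Neither operation has an obvious counterpart when `weight` and `sample` are sets of case ids, which is the difficulty the slide points out.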

  10. Similarity Measures • Jaccard similarity • Reference-based similarity • Weighted reference-based similarity
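Of the three measures, Jaccard similarity has a standard definition: the size of the intersection divided by the size of the union. A minimal sketch follows; the function name, the convention of returning 0.0 for two empty sets, and the abbreviated reference sets are assumptions made for illustration. The two reference-based measures are not defined in the slides, so they are not sketched here.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections treated as sets.

    Assumption: two empty sets are given similarity 0.0.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Illustrative subsets of the cases referenced by 1001 and 1002.
case_1001 = {1105, 1612, 1018, 1016}
case_1002 = {1276, 1018, 1016, 1015}
similarity = jaccard_similarity(case_1001, case_1002)  # 2 shared / 6 total
```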

  11. Contribution to clustering • Applying K-means and SOM to produce better visualizations • It is not apparent at first glance, but these algorithms cannot be applied directly to set visualization • They assume a 2-D or n-D (vector) representation for each data point (i.e., law case); more specifically, the attributes must form a vector space • This assumption does not hold: the dataset has no clear geometric attributes
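One common way to cluster objects that have a similarity measure but no vector representation is a k-medoids-style loop, in which each cluster is represented by one of its own members instead of a mean. The slides do not say how K-means was adapted, so the sketch below is an assumed stand-in and not the presenters' method; all names and the toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def medoid(members, sets):
    """Member with the highest total similarity to all cluster members."""
    return max(members, key=lambda m: sum(jaccard(sets[m], sets[o]) for o in members))


def k_medoids(sets, initial_medoids, max_iter=20):
    """Cluster set-valued items; returns {medoid id: [member ids]}."""
    medoids = list(initial_medoids)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each item joins its most similar medoid.
        clusters = {m: [] for m in medoids}
        for item in sets:
            nearest = max(medoids, key=lambda m: jaccard(sets[item], sets[m]))
            clusters[nearest].append(item)
        # Update step: pick a representative member, since no mean exists.
        new_medoids = [
            medoid(members, sets) if members else m
            for m, members in clusters.items()
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return clusters


# Toy data: item id -> set of referenced ids.
toy = {
    1: {10, 11, 12}, 2: {10, 11}, 3: {10, 11, 12, 13},
    4: {20, 21}, 5: {20, 21, 22}, 6: {21, 22},
}
clusters = k_medoids(toy, initial_medoids=[1, 4])
```

The update step replaces the geometric mean with a representative element, so only pairwise similarities are ever needed.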

  12. Similarity Metrics → Geometric Metrics • 1-D Partitioning • 2-D Partitioning • Sequential arrangement • Distance-based arrangement

  13. K-Means

  14. K-Means

  15. SOM after K-Means

  16. Various Interactive Tools • Referencing pattern (activating all links) • Local referencing • Density map • Representative element • Tool tip • Link follow-up • Search

  17. Referencing Pattern

  18. Local Referencing

  19. Local Referencing

  20. Density Map

  21. Density Map

  22. Representative Element

  23. Link Follow-up

  24. Link Follow-up

  25. Link Follow-up

  26. Link Follow-up

  27. Link Follow-up

  28. Link Follow-up

  29. Link Follow-up

  30. Link Follow-up

  31. DEMO

  32. Future Work • Other clustering algorithms can be explored: • Spectral • Fuzzy C-means • More similarity functions • Better initial posting of data • Zooming and Panning


  34. Thanks!
