The Fiedler Vector and Graph Partitioning Barbara Ball baljmb@aol.com Clare Rodgers clarerodgers@hotmail.com College of Charleston Graduate Math Department Research Under Dr. Amy Langville
Outline General Field of Data Clustering • Motivation • Importance • Previous Work • Laplacian Method • Fiedler Vector • Limitations • Handling the Limitations
Outline Our Contributions • Experiments • Sorting eigenvectors • Testing Non-symmetric Matrices • Hypotheses • Implications Future Work • Non-square matrices • Proofs References
Understanding Graph Theory [Figure: a graph on nodes 1-10] Given this graph, there are no apparent clusters.
Understanding Graph Theory [Figure: the same graph redrawn with the two clusters grouped] Although the clusters are now apparent, we need a better method.
Finding the Laplacian Matrix • A = adjacency matrix • D = degree matrix (diagonal) • Find the Laplacian matrix, L = D - A • Rows of L sum to zero
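The construction on this slide can be sketched in a few lines of NumPy; the 4-node adjacency matrix here is a hypothetical stand-in for the 10-node graph pictured on the slides.

```python
import numpy as np

# Hypothetical small undirected graph: triangle 0-1-2 with node 3
# hanging off node 2 (stand-in for the slide's 10-node example).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

D = np.diag(A.sum(axis=1))   # degree matrix: row sums of A on the diagonal
L = D - A                    # Laplacian matrix

print(L.sum(axis=1))         # every row of L sums to zero
```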
Behind the Scenes of the Laplacian Matrix • Rayleigh Quotient Theorem: seeks to minimize the off-diagonal elements of the matrix, or equivalently to minimize the cutset of the edges between the clusters. [Figure: two example matrices, one with apparent clusters and one not easily clustered]
Behind the Scenes of the Laplacian Matrix • Rayleigh Quotient Theorem solution: λ1 = 0, the smallest eigenvalue of the symmetric matrix L • λ1 corresponds to the trivial right-hand eigenvector v1 = e = [1, 1, ..., 1]^T • Courant-Fischer Theorem: also based on the symmetric matrix L, searches for the eigenvector v2 that is furthest away from e.
Using the Laplacian Matrix • v2 gives relational information about the nodes. • This relation is usually decided by separating the values across zero. • A theoretical justification was given by Miroslav Fiedler; hence v2 is called the Fiedler vector.
Using the Fiedler Vector • v2 is used to recursively partition the graph by separating the components into negative and positive values. Entire graph: sign(v2) = [-, -, -, +, +, +, -, -, -, +] Reds: sign(v2) = [-, +, +, +, -, -]
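One level of the sign-based partition described on this slide can be sketched as follows; the 6-node graph (two triangles joined by a single edge) is a hypothetical example, not the graph from the slides.

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order for a symmetric matrix,
# so column 1 holds the Fiedler vector (second-smallest eigenvalue).
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]

# Partition across zero: negative entries form one cluster.
neg = {i for i in range(n) if fiedler[i] < 0}
pos = set(range(n)) - neg
print(sorted(neg), sorted(pos))   # one triangle per cluster
```

Recursing on each cluster's subgraph repeats the same computation on a smaller Laplacian.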
Problems With Laplacian Method • The Laplacian method requires the use of: • an undirected graph • a structurally symmetric matrix • square matrices • Zero may not always be the best choice for partitioning the eigenvector values of v2 (Gleich) • Recursive algorithms are expensive
Current Clustering Method • Monika Henzinger, Director of Google Research in 2003, cited generalizing directed graphs as one of the top six algorithmic challenges in web search engines.
How Are These Problems Currently Being Solved? • Forcing symmetry for non-square matrices: • Suppose A is an (ad x term) non-square matrix. • B imposes symmetry on the information. [Example matrices shown on slide]
How Are These Problems Currently Being Solved? • Forcing symmetry in square matrices: • Suppose C represents a directed graph. • D imposes bidirectional information by finding the nearest symmetric matrix: D = C + C^T. [Example matrices shown on slide]
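Both symmetrization tricks can be sketched in NumPy. The matrices C and A below are hypothetical stand-ins for the slides' examples, and the bipartite form B = [[0, A], [A^T, 0]] is the standard construction assumed for the non-square case (the slides do not spell it out).

```python
import numpy as np

# Square case: a directed 3-cycle, recorded as C[i, j] = 1 for edge i -> j.
C = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
D = C + C.T                     # nearest symmetric matrix: every edge
assert (D == D.T).all()         # becomes bidirectional

# Non-square case: a hypothetical 2-ad x 3-term matrix embedded in the
# symmetric bipartite form B = [[0, A], [A^T, 0]].
A = np.array([[1, 0, 2],
              [0, 3, 0]])
B = np.block([[np.zeros((2, 2)), A],
              [A.T, np.zeros((3, 3))]])
assert (B == B.T).all()         # B is square (5 x 5) and symmetric
```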
How Are These Problems Currently Being Solved? • Graphically adding data: [Figure: before/after graphs on nodes 1, 2, 3 with an edge added to force symmetry]
How Are These Problems Currently Being Solved? • Graphically deleting data: [Figure: before/after graphs on nodes 1, 2, 3 with a one-way edge deleted to force symmetry]
Our Wish: • Use Markov chains and the subdominant right-hand eigenvector (the Ball-Rodgers vector) to cluster asymmetric matrices or directed graphs.
Where Did We Get the Idea? Stewart, in Introduction to the Numerical Solution of Markov Chains, suggests the subdominant right-hand eigenvector (the Ball-Rodgers vector) may indicate clustering.
Different Matrices and Eigenvectors • A (connectivity matrix): 2nd largest and 2nd smallest eigenvectors of A • L = D - A (Laplacian matrix, rows sum to 0): 2nd smallest eigenvector of L (the Fiedler vector) • P (probability Markov matrix, rows sum to 1): 2nd largest eigenvector of P (the Ball-Rodgers vector) • Q = I - P (transition rate matrix, rows sum to 0): 2nd smallest eigenvector of Q
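The four matrices compared on this slide can all be built from one adjacency matrix; the 4-node graph below is a hypothetical example, and the row sums confirm the stated properties.

```python
import numpy as np

# Hypothetical undirected graph (triangle 0-1-2 plus pendant node 3).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

L = np.diag(A.sum(axis=1)) - A          # Laplacian: rows sum to 0
P = A / A.sum(axis=1, keepdims=True)    # row-stochastic: rows sum to 1
Q = np.eye(4) - P                       # transition rate form: rows sum to 0

print(L.sum(axis=1), P.sum(axis=1), Q.sum(axis=1))
```

For a directed graph, the same row normalization of A still yields a valid P, which is what lets the Markov approach handle asymmetric matrices.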
Graph 1: Eigenvector Value Plots [Figure: four sorted-value plots for the 10-node graph: second largest eigenvector of A, Fiedler vector, Ball-Rodgers vector, and second smallest eigenvector of Q; node indices appear along each horizontal axis]
Graph 1: Banding Using Eigenvector Values [Figure: spy plots of banded A, L, P, and Q] Reorders just by using the indices of the sorted eigenvector, with no recursion.
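The banding step on this slide is a single permutation by the sort order of an eigenvector. A minimal sketch, using a hypothetical two-triangle graph rather than the slides' Graph 1:

```python
import numpy as np

# Triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A
fiedler = np.linalg.eigh(L)[1][:, 1]    # 2nd-smallest eigenvector of L

# Band by the indices of the sorted eigenvector: one pass, no recursion.
perm = np.argsort(fiedler)
A_banded = A[np.ix_(perm, perm)]        # symmetric row/column reordering
print(A_banded)                         # nonzeros gather near the diagonal
```

Because the permutation keeps each cluster's nodes contiguous, the block structure becomes visible without any recursive bisection.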
Graph 1: Reordering Using Laplacian Method [Figure: spy plots of A and the reordered L]
Graph 1: Reordering Using Markov Method [Figure: spy plots of A, the reordered P, and the reordered Q]
Graph 2: Eigenvector Value Plots [Figure: four sorted-value plots for the 23-node graph: second largest eigenvector of A, Fiedler vector, Ball-Rodgers vector, and second smallest eigenvector of Q; node indices appear along each horizontal axis]
Graph 2: Banding Using Eigenvector Values [Figure: spy plots of banded A, L, P, and Q] Nicely banded, but no apparent blocks.
Graph 2: Reordering Using Laplacian Method [Figure: spy plots of A and the reordered L]
Graph 2: Reordering Using Markov Method [Figure: spy plots of A, the reordered P, and the reordered Q]
Directed Graph 1 [Figure: directed graph on nodes 1-10] Although it is directed, the Fiedler vector still works.
Directed Graph 1 v2 = [-0.5783, -0.2312, -0.0388, 0.1140, 0.1255, 0.1099, -0.1513, -0.5783, -0.4536, 0.0821]
Directed Graph 1: Reordering Using Laplacian Method [Figure: spy plots of A and the reordered L]
Directed Graph 1: Reordering Using Markov Method [Figure: spy plots of A and the reordered P]
Directed Graph 1-B [Figure: directed graph on nodes 1-10; one edge that was bi-directional is now one-way]
Directed Graph 1-B • The Laplacian method no longer works on this graph. • Certain edges must be bi-directional in order to make the matrix irreducible. • Currently, to deal with this problem, a small number (here .01) is added to each element of the matrix.
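The small-perturbation fix can be sketched as follows. The 3-node matrix is an illustrative assumption, and the renormalization step is one reasonable way to keep P stochastic after the .01 is added (the slides do not show this detail).

```python
import numpy as np

# Reducible directed chain 0 -> 1 -> 2: node 2 is a dead end,
# so the Markov chain on this graph is not irreducible.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)

eps = 0.01                           # the small number from the slide
M = A + eps                          # every state can now reach every other
P = M / M.sum(axis=1, keepdims=True) # renormalize rows to sum to 1

print(P.sum(axis=1))                 # stochastic, with all entries positive
```

All-positive entries guarantee irreducibility, so the subdominant eigenvector of P is well defined.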
Directed Graph 1-B: Reordering Using Markov Method [Figure: spy plots of A and the reordered P]
Directed Graph 2: Reordering Using Markov Method [Figure: spy plots of A and the reordered P]
Directed Graph 3: Reordering Using Markov Method [Figure: spy plots of A and the reordered P]
Directed Graph 4: Reordering Using Markov Method [Figure: spy plots of A and the reordered P; 10% anti-block elements]
Directed Graph 5: Reordering Using Markov Method [Figure: spy plots of A and the reordered P; 30% anti-block elements] Only the first partition is shown.
Hypotheses • Plotting the eigenvector values gives better estimates of the number of clusters. • Sometimes, sorting the eigenvector values clusters the matrix without any type of recursive process. • The stochastic matrix P can cluster asymmetric matrices or directed graphs. • A number other than zero may be used to partition the eigenvector values. Implications • Recursive methods are time-consuming; the eigenvector plot takes virtually no time at all and requires very little programming or storage! • Non-symmetric matrices (or directed graphs) can be clustered without altering data!
Future Work • Experiments on Large Non-Symmetric Matrices • Non-square matrices • Clustering eigenvector values to avoid recursive programming • Proofs Questions
References • Friedberg, S., Insel, A., and Spence, L. Linear Algebra, Fourth Edition. Prentice-Hall, Upper Saddle River, New Jersey, 2003. • Gleich, David. Spectral Graph Partitioning and the Laplacian with Matlab. January 16, 2006. http://www.stanford.edu/~dgleich/demos/matlab/spectral/spectral.html • Godsil, Chris and Royle, Gordon. Algebraic Graph Theory. Springer-Verlag New York, Inc., New York, 2001. • Karypis, George. http://glaros.dtc.umn.edu/gkhome/node • Langville, Amy. The Linear Algebra Behind Search Engines. The Mathematical Association of America - Online. http://www.joma.org. December, 2005. • Aldenderfer, Mark S. and Roger K. Blashfield. Cluster Analysis. Sage University Paper Series: Quantitative Applications in the Social Sciences, 1984. • Moler, Cleve B. Numerical Computing with MATLAB. The Society for Industrial and Applied Mathematics, Philadelphia, 2004. • Roiger, Richard J. and Michael W. Geatz. Data Mining: A Tutorial-Based Primer. Addison-Wesley, 2003. • Vascellaro, Jessica E. "The Next Big Thing in Searching." Wall Street Journal. January 24, 2006.
References • Zhukov, Leonid. Technical Report: Spectral Clustering of Large Advertiser Datasets Part I. April 10, 2003. • Learning MATLAB 7. 2005. www.mathworks.com • www.Mathworld.com • www.en.wikipedia.org/ • http://www.resample.com/xlminer/help/HClst/HClst_intro.htm • http://comp9.psych.cornell.edu/Darlington/factor.htm • www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Markov.html • http://leto.cs.uiuc.edu/~spiros/publications/ACMSRC.pdf • http://www.lifl.fr/~iri-bn/talks/SIG/higham.pdf • http://www.epcc.ed.ac.uk/computing/training/document_archive/meshdecomp-slides/MeshDecomp-70.html • http://www.cs.berkeley.edu/~demmel/cs267/lecture20.html • http://www.maths.strath.ac.uk/~aas96106/rep02_2004.pdf
Eigenvector Example [worked example shown on slide]
Theory Behind the Laplacian • Minimize the edges between the clusters
Theory Behind the Laplacian • Minimizing edges between clusters is the same as minimizing the off-diagonal elements in the Laplacian matrix: • min p^T L p, where p_i ∈ {-1, 1} for each node i. • p represents the separation of the nodes into positives and negatives. • p^T L p = p^T (D - A) p = p^T D p - p^T A p • However, p^T D p is the sum across the diagonal, so it is a constant, and constants do not change the outcome of optimization problems.
Theory Behind the Laplacian • With the constant term dropped, the problem is still min p^T L p (equivalently, max p^T A p). • This is an integer nonlinear program. • It can be changed to a continuous program by Lagrange relaxation, allowing p to take any value from -1 to 1. We rename this vector x and let its squared magnitude be N, so x^T x = N: • min x^T L x - λ(x^T x - N) • This can be rewritten as the Rayleigh quotient: min x^T L x / x^T x = λ1
Theory Behind the Laplacian • λ1 = 0 and corresponds to the trivial eigenvector v1 = e. • The Courant-Fischer Theorem seeks the next best solution by adding the extra constraint x ⊥ e. • The minimizer is the subdominant eigenvector v2, known as the Fiedler vector.
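The two claims above can be checked numerically. A minimal sketch on a hypothetical 4-node connected graph: the smallest Laplacian eigenvalue is 0 with eigenvector e, and the Fiedler vector returned by a symmetric eigensolver is orthogonal to e.

```python
import numpy as np

# Hypothetical connected graph: 4-cycle 0-1-2-3 plus chord 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
A = np.zeros((4, 4))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

vals, vecs = np.linalg.eigh(L)       # ascending eigenvalues
e = np.ones(4)

assert abs(vals[0]) < 1e-10          # lambda_1 = 0
assert np.allclose(L @ e, 0)         # e is the trivial eigenvector
assert abs(vecs[:, 1] @ e) < 1e-10   # Fiedler vector is orthogonal to e
```

Orthogonality to e falls out of eigh because L is symmetric, so its eigenvectors are mutually orthogonal; this is exactly the Courant-Fischer constraint x ⊥ e.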
Theory Behind the Laplacian • Our Questions: • The symmetry requirement is needed for the matrix diagonalization of D. Why is D important, since it is irrelevant for the minimization problem? • If diagonalization is important, could the SVD be used instead?