





Presentation Transcript


  1. NIPS 2005 Review: Diffusion Maps, Spectral Clustering, and Eigenfunctions of Fokker-Planck Operators Boaz Nadler, Stéphane Lafon, Ronald R. Coifman, Ioannis G. Kevrekidis Presented by: Jonathan Huang (jch1@cs.cmu.edu) Advisor: Carlos Guestrin 1/24/2006

  2. Main Idea • A diffusion-based interpretation of clustering and dimensionality reduction methods that use the spectrum of the normalized graph Laplacian.

  3. The Normalized Graph Laplacian • Given a point cloud x1, x2, …, xn, first form a weight matrix based on the heat kernel: Wij = exp(−‖xi − xj‖² / 2ε) • Let D be the diagonal matrix with Dii = Σj Wij • An algorithm for spectral clustering or dimensionality reduction might at this point find the first few eigenvectors of M = D⁻¹W (or the last eigenvectors of D − W).
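
A minimal NumPy sketch of this construction (not from the slides); the Gaussian toy data and the bandwidth eps = 0.5 are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # toy point cloud x_1, ..., x_n

eps = 0.5                                  # kernel bandwidth (assumed value)
sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)  # pairwise squared distances
W = np.exp(-sq / (2 * eps))                # heat-kernel weights W_ij
d = W.sum(axis=1)                          # degrees D_ii = sum_j W_ij
M = W / d[:, None]                         # M = D^{-1} W

# First few eigenvectors of M, as a spectral method would use them.
eigvals, eigvecs = np.linalg.eig(M)
order = np.argsort(-eigvals.real)
top = eigvecs[:, order[:4]].real
```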

  4. Random Walks on a Graph • Notice that M is a stochastic matrix! (Dividing by D normalizes each row to sum to one.) • We can view M as the transition matrix for a random walk on a graph, where the transition probabilities make it easy to jump to nearby points and difficult to jump to faraway points.
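
A quick numerical check of both claims, reusing X and M from the slide-3 sketch above:

```python
# Rows of M sum to one, so M is a valid transition matrix.
assert np.allclose(M.sum(axis=1), 1.0)

p = np.zeros(len(X))
p[0] = 1.0                                 # walker starts at x_0
p = p @ M                                  # one step of the walk
# p[j] = M[0, j]: large for points near x_0, small for distant ones,
# because the heat-kernel weights decay with squared distance.
```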

  5. The Eigenvalue Connection • Let p(t, xj | xi) be the probability of being at point xj at time t given that we started at point xi. What does this distribution look like as t → ∞? (What does xMᵗ tend to?) • Answer: p(t, xj | xi) → φ0(xj), independently of the starting point xi. • The eigenvalues of M satisfy: λ0 = 1 > λ1 ≥ λ2 ≥ … ≥ λn−1 ≥ 0 • This means that no matter where we begin the random walk, we always converge to the principal (left) eigenvector of M: the stationary distribution φ0.
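
A sketch of these spectral claims, again reusing M and d from the slide-3 code. That the stationary distribution of M = D⁻¹W is proportional to the degrees d is a standard fact assumed here, not stated on the slide:

```python
lam = np.sort(np.linalg.eigvals(M).real)[::-1]
assert np.isclose(lam[0], 1.0)             # lambda_0 = 1

pi = d / d.sum()                           # stationary distribution phi_0
p_t = np.linalg.matrix_power(M, 1000)[0]   # p(t, . | x_0) for t = 1000
print(np.abs(p_t - pi).max())              # shrinks toward 0 as t grows
```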

  6. Main Results • Define the diffusion distance as: Dt²(x, y) = Σz (p(t, z | x) − p(t, z | y))² / φ0(z) • Define the diffusion map Ψt(x) as the mapping from the original space onto the space spanned by the first k eigenvectors: Ψt(x) = (λ1ᵗψ1(x), …, λkᵗψk(x)) • Theorem: Diffusion distances in the original space are the same as Euclidean distances in the image of the diffusion map. • This theorem justifies using Euclidean distances in the diffusion-map space for clustering and dimensionality reduction! • At large enough times t, the diffusion distance is well approximated using only the first few eigenvectors.
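
A self-contained sketch that checks the theorem numerically on toy data (with all n − 1 nontrivial eigenvectors, so the equality is exact). The eigenvector normalization (ψm scaled so that Σx φ0(x)ψm(x)² = 1) is my assumption of the usual diffusion-maps convention; the slide does not specify it:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 2))
sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
W = np.exp(-sq / (2 * 0.5))
d = W.sum(axis=1)
M = W / d[:, None]                         # M = D^{-1} W

# Eigendecompose the symmetric conjugate S = D^{-1/2} W D^{-1/2};
# the right eigenvectors of M are psi_m = D^{-1/2} v_m.
S = W / np.sqrt(np.outer(d, d))
lam, V = np.linalg.eigh(S)
lam, V = lam[::-1], V[:, ::-1]             # descending: lam[0] = 1
pi = d / d.sum()                           # stationary distribution phi_0
psi = V / np.sqrt(d)[:, None] * np.sqrt(d.sum())  # pi-normalized eigenvectors

t, i, j = 3, 0, 1
Mt = np.linalg.matrix_power(M, t)          # row i of M^t is p(t, . | x_i)
diff_dist_sq = np.sum((Mt[i] - Mt[j]) ** 2 / pi)   # diffusion distance^2
embed_dist_sq = np.sum(lam[1:] ** (2 * t) * (psi[i, 1:] - psi[j, 1:]) ** 2)
assert np.isclose(diff_dist_sq, embed_dist_sq)     # the theorem holds
```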

  7. More Results • It was previously known that M converges (as the kernel bandwidth ε → 0, which plays the role of a time step, and the number of points n → ∞) to a certain Fokker–Planck operator, which corresponds to a diffusion PDE that is continuous in time and space. • Boundary Conditions: • Assumption: the point cloud is obtained by sampling from a probability density confined to a compact connected set Ω with smooth boundary ∂Ω. • Result: In the limit ε → 0, we have reflecting (Neumann) boundary conditions on ∂Ω: ∂ψ/∂n = 0.

  8. Thank you.
