Presentation Transcript


  1. Local and Global Structures Preserving Projection • Hao Cheng, Kien A. Hua, and Khanh Vu, University of Central Florida • ICTAI '07

  2. Overview • Introduction • Proposed Algorithm • Experiments • Conclusions

  3. Introduction • Data usually reside in a high-dimensional space. • The intrinsic dimensionality of the data is much lower. • Manifold learning • finds a low-dimensional embedding of the raw data that preserves the intrinsic structures of the data well; • has recently become a popular research topic.

  4. Related Work • Principal Component Analysis (PCA) • Locality Preserving Projection (LPP) • Many others…

  5. PCA • Principal Component Analysis (PCA) • PCA projects the data along the set of axes that exhibit the greatest variance; • PCA minimizes the distortion of all pairwise distances of the data after the reduction. • PCA preserves the global structure of the data well.
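To make the description concrete, here is a minimal NumPy sketch of PCA via eigendecomposition of the sample covariance matrix (the function name and signature are illustrative, not from the paper):

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal axes."""
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(Xc) - 1)         # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    axes = eigvecs[:, ::-1][:, :k]          # k axes of greatest variance
    return Xc @ axes                        # low-dimensional embedding
```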

  6. LPP • Locality Preserving Projection (LPP) • LPP constructs a similarity matrix W: • If point i is among the K nearest neighbors of point j, then W(i,j) = W(j,i) = 1; otherwise W(i,j) = 0. • W encodes local neighborhood information. • LPP finds a set of axes that minimize the weighted pairwise distances of the data (weights given by W). • LPP preserves local neighborhoods well.
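A sketch of LPP as summarized above, following the standard formulation (minimize the W-weighted pairwise distances, which reduces to a generalized eigenvalue problem on the graph Laplacian). It assumes X^T D X is nonsingular; in practice a PCA preprocessing step is common. All names are illustrative:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import NearestNeighbors

def lpp(X, k_neighbors, dim):
    """Locality Preserving Projection of X (n_samples x n_features) to dim axes."""
    n = len(X)
    # Binary KNN similarity matrix W, symmetrized as on the slide.
    _, idx = NearestNeighbors(n_neighbors=k_neighbors + 1).fit(X).kneighbors(X)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i, 1:]] = 1.0          # idx[i, 0] is point i itself
    W = np.maximum(W, W.T)              # W(i,j) = W(j,i) = 1
    D = np.diag(W.sum(axis=1))          # degree matrix
    L = D - W                           # graph Laplacian
    # min_p sum_ij W_ij (p^T x_i - p^T x_j)^2  =>  X^T L X p = lambda X^T D X p
    vals, vecs = eigh(X.T @ L @ X, X.T @ D @ X)  # ascending eigenvalues
    return X @ vecs[:, :dim]            # axes with the smallest eigenvalues
```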

  7. Nonlinear Methods • Both PCA and LPP are linear methods. • Nonlinear methods: • ISOMAP, Locally Linear Embedding (LLE), Hessian LLE (HLLE), Local Tangent Space Alignment (LTSA), Diffusion Maps (DM). • Problems: • Computationally intensive. • Do not scale well. • Performance is not very robust.

  8. Motivation • PCA: global structure • LPP: local structure • Both global and local structures are important, and should be properly preserved! • Look at the toy examples.

  9. Toy Example 1 • Two classes of data • [Figure: PCA and LPP projections]

  10. Toy Example 2 • Two classes of data • [Figure: LPP and PCA projections] • Neither of them does well!

  11. LGSPP • Local and Global Structure Preserving Projection (LGSPP): • Extracts local and global structures; • Derives an embedding that preserves these structures with minimal distortion.

  12. Local Structure • For each data point x, • S(x) is the set of points comprising x itself and its Ks nearest neighbors (Ks is a system parameter). • S(x) is the local neighborhood around the point x.
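A minimal sketch of computing S(x) for every point with a KNN query (Ks is the slide's parameter; everything else is illustrative):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_sets(X, Ks):
    """Row i holds the indices of S(x_i): x_i itself plus its Ks nearest neighbors."""
    _, idx = NearestNeighbors(n_neighbors=Ks + 1).fit(X).kneighbors(X)
    return idx                          # column 0 is the point itself
```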

  13. Global Structure • For each data point x, • D(x) is the set of Kd points that are far from point x and also far from each other (Kd is another parameter). • For example: blue dot x; red/green dots in D(x). Point x and the points in D(x) come from different dense regions. • Preserving these distances (the black dotted lines in the figure) prevents the space from collapsing!

  14. Extraction Algorithm • Select a random sample set. • Pick the sample point farthest from point x, denoted d1. • Pick the sample point farthest from both x and d1, denoted d2. • Continue until Kd points have been found.
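A sketch of this extraction step, interpreting "farthest from x and d1" as greedy farthest-first (max-min) selection over the random sample; sample_size and all names are assumptions, not the paper's:

```python
import numpy as np

def global_set(X, x_index, Kd, sample_size=100, seed=0):
    """D(x): Kd points far from point x and far from each other."""
    rng = np.random.default_rng(seed)
    candidates = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
    chosen = [x_index]                  # distances are measured from x and prior picks
    D = []
    for _ in range(Kd):
        # Each candidate's distance to its nearest already-chosen point;
        # take the candidate that maximizes it (farthest-first).
        diffs = X[candidates][:, None, :] - X[chosen][None, :, :]
        dists = np.linalg.norm(diffs, axis=2).min(axis=1)
        best = int(candidates[np.argmax(dists)])
        D.append(best)
        chosen.append(best)
    return D
```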

  15. S(x) and D(x) • S(x): the local neighborhood of x. • D(x): point x and the points in D(x) are highly likely to come from different dense regions of the dataset. • Local and global structures: • S(x) and D(x) for each point x.

  16. Embedding • Goals of embedding: • Keep the points in S(x) close to each other in the reduced space: minimize the pairwise distances within S(x). • Keep the points in D(x) far from those in S(x) in the reduced space: maximize the pairwise distances between S(x) and D(x).

  17. Optimization • Find a set of projection axes p_i: [objective shown as an image] • Equivalent to: [formula shown as an image]

  18. Rewrite • Equivalent to: [matrix form shown as an image; a hedged reconstruction follows below] • Generalized eigenvalue problem.
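The formulas on slides 17 and 18 appeared as images and did not survive extraction. The following is a plausible reconstruction under the stated embedding goals, assuming an LPP-style formulation over the pairs defined by S(x) and D(x); the symbols W_S, W_D, L_S, L_D are my labels, not necessarily the paper's notation:

```latex
% Data matrix X = [x_1, ..., x_n] (columns are points);
% W_S(i,j) = 1 iff x_j in S(x_i), W_D(i,j) = 1 iff x_j in D(x_i), symmetrized.
\[
\min_{p}\ \sum_{i,j} W_S(i,j)\,\bigl(p^{\top}x_i - p^{\top}x_j\bigr)^2
\qquad\text{while}\qquad
\max_{p}\ \sum_{i,j} W_D(i,j)\,\bigl(p^{\top}x_i - p^{\top}x_j\bigr)^2 .
\]
% With graph Laplacians L_S = D_S - W_S and L_D = D_D - W_D, a combined
% Rayleigh-quotient form and its equivalent generalized eigenvalue problem:
\[
\max_{p}\ \frac{p^{\top} X L_D X^{\top} p}{p^{\top} X L_S X^{\top} p}
\quad\Longleftrightarrow\quad
X L_D X^{\top} p = \lambda\, X L_S X^{\top} p .
\]
```

Under this reading, the top generalized eigenvectors give the projection axes p_i.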

  19. Toy Examples Revisited • [Figure: LGSPP projections of Toy Examples 1 and 2]

  20. Synthetic Datasets • 2-dimensional data. • Generated from a free variable ranging from -1 to 1. • 1st dimension: [formula shown as an image] • 2nd dimension: [formula shown as an image]

  21. More Datasets • [Figure: LGSPP results on additional datasets]

  22. Conclusions • LGSPP: • Extracts local and global structures. • Computes a salient embedding. • Addresses the limitations of PCA and LPP. • Linear, fast, robust. • Works well on both synthetic and real-world examples.

  23. Questions?
