Manifold learning

Manifold learning Xin Yang Data Mining Course

Outline
• Manifold and Manifold Learning
• Classical Dimensionality Reduction
• Semi-Supervised Nonlinear Dimensionality Reduction
• Experiment Results
• Conclusions

What is a manifold?

Examples: sphere and torus

Why do we need manifolds?

Manifold learning
• The raw format of natural data is often high-dimensional, but in many cases the data are the outcome of a process with only a few degrees of freedom.
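To make the "few degrees of freedom" idea concrete, here is a minimal sketch that generates 50-dimensional samples driven by a single hidden parameter t. The function name, the sine feature map, and all sizes are assumptions chosen for illustration; they are not taken from the slides.

```python
import numpy as np

def make_high_dim_curve(n_points=200, ambient_dim=50, seed=0):
    """Sample points that live in ambient_dim dimensions but depend on ONE parameter t."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 1.0, size=n_points)   # the single degree of freedom
    freqs = np.arange(1, ambient_dim + 1)      # hypothetical feature map: sin(k * t)
    X = np.sin(np.outer(t, freqs))             # shape (n_points, ambient_dim)
    return t, X

t, X = make_high_dim_curve()
print(X.shape)  # every row has 50 coordinates, all determined by one number t
```

Each row of X looks high-dimensional, yet the whole data set traces a one-dimensional curve; recovering a coordinate like t is the goal of manifold learning.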

Manifold learning
• Intrinsic Dimensionality Estimation
• Dimensionality Reduction

Dimensionality Reduction
• Classical methods:
  Linear: MDS & PCA (Hastie et al., 2001)
  Nonlinear: LLE (Roweis & Saul, 2000), ISOMAP (Tenenbaum et al., 2000), LTSA (Zhang & Zha, 2004)
  -- in general, the low-dimensional coordinates they produce lack physical meaning
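The two linear methods can be sketched in a few lines of NumPy. This is an illustrative sketch, not the course's implementation: the function names and the random test data are assumptions.

```python
import numpy as np

def pairwise_distances(A):
    """Euclidean distance matrix between the rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def pca(X, d):
    """Project the rows of X onto the top-d principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def classical_mds(D, d):
    """Embed n points in d dimensions from their distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of the centered points
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:d]            # indices of the d largest eigenvalues
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))                 # made-up data for the demo
Y_pca = pca(X, 2)
Y_mds = classical_mds(pairwise_distances(X), 2)
```

When the input distances are Euclidean, the two embeddings agree up to a sign flip of each axis, so their pairwise distances match. Neither axis carries a physical meaning, which is the limitation the slide points out.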

Semi-supervised NDR
• Prior information: can be obtained from experts or by performing experiments, e.g., in moving-object tracking

Semi-supervised NDR
• Assumption: if the prior information has a physical meaning, then the global low-dimensional coordinates bear the same physical meaning.

Basic LLE

Basic LTSA
• Characterizes the local geometry of the manifold by computing an approximate tangent space at each data point
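One common way to approximate a tangent space is to run an SVD on the centered k nearest neighbours of a point. The sketch below illustrates that step only, under assumed names and parameters (`local_tangent_basis`, `k`, `d`); it is not a full LTSA implementation.

```python
import numpy as np

def local_tangent_basis(X, i, k, d):
    """Orthonormal basis (D x d) of the approximate tangent space at X[i]."""
    dists = np.linalg.norm(X - X[i], axis=1)
    nbrs = np.argsort(dists)[:k]                   # k nearest points, including X[i]
    centered = X[nbrs] - X[nbrs].mean(axis=0)      # remove the neighbourhood mean
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:d].T                                # top-d right singular vectors

# Demo on a unit circle: the tangent at (1, 0) should point along the y axis.
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
Q = local_tangent_basis(circle, i=0, k=7, d=1)
```

LTSA then aligns these local coordinate systems into one global embedding; that alignment step is not shown here.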

SS-LLE & SS-LTSA
• Given the exact mappings of m data points.
• Partition Y as
• Our problem:

SS-LLE & SS-LTSA
• To solve this minimization problem, partition M as:
• Then the minimization problem can be written as

SS-LLE & SS-LTSA
• Or equivalently
• Solving it by setting its gradient to zero, we get:
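A minimal sketch of this step, under stated assumptions: the objective is taken to be tr(Y M Y^T) for a symmetric alignment matrix M (as LLE and LTSA produce), Y = [Y1 Y2] with Y1 holding the m known coordinates, and M split into blocks M11, M12, M22 to match. Setting the gradient with respect to Y2 to zero then gives the linear system M22 Y2^T = -M12^T Y1^T. The notation, the function name, and the random test matrix are assumptions for illustration, not the slides' own formulas.

```python
import numpy as np

def solve_unknown_coords(M, Y1):
    """Minimize trace([Y1 Y2] M [Y1 Y2]^T) over Y2, with Y1 held fixed.

    M  : (n, n) symmetric matrix whose block M22 is positive definite.
    Y1 : (d, m) known low-dimensional coordinates of the first m points.
    Returns Y2 with shape (d, n - m).
    """
    m = Y1.shape[1]
    M12 = M[:m, m:]
    M22 = M[m:, m:]
    # Zero gradient with respect to Y2:  M22 @ Y2.T = -M12.T @ Y1.T
    Y2T = np.linalg.solve(M22, -M12.T @ Y1.T)
    return Y2T.T

# Demo with a random symmetric positive definite stand-in for M (an assumption).
rng = np.random.default_rng(0)
B = rng.normal(size=(8, 8))
M = B @ B.T + np.eye(8)
Y1 = rng.normal(size=(2, 3))        # 3 prior points in 2 dimensions
Y2 = solve_unknown_coords(M, Y1)    # coordinates of the remaining 5 points
```

Because the known points are fixed to their given coordinates, the recovered coordinates inherit the scale and orientation of the prior information.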

Sensitivity Analysis
• As the number of prior points increases, the condition number of the coefficient matrix decreases, so the computed solution becomes less sensitive to the noise in and

Sensitivity Analysis
• The sensitivity of the solution depends on the condition number of the matrix
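The effect of prior points on conditioning can be illustrated numerically. In this sketch the coefficient matrix is assumed to be the block of M that belongs to the unknown points, and a path-graph Laplacian stands in for M; both choices, and the function names, are assumptions for illustration.

```python
import numpy as np

def path_laplacian(n):
    """Laplacian of a path graph on n nodes (a simple stand-in for M)."""
    L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1.0
    return L

def cond_unknown_block(M, prior_idx):
    """Condition number of the block of M restricted to the non-prior points."""
    unknown = np.setdiff1d(np.arange(M.shape[0]), prior_idx)
    M22 = M[np.ix_(unknown, unknown)]
    return np.linalg.cond(M22)

M = path_laplacian(60)
prior_sets = [[0], [0, 59], [0, 30, 59], [0, 15, 30, 45, 59]]
conds = [cond_unknown_block(M, np.array(p)) for p in prior_sets]
for p, c in zip(prior_sets, conds):
    print(len(p), "prior points -> condition number", c)
```

Each prior set contains the previous one, so each block is a principal submatrix of the one before it. For a symmetric positive definite matrix, eigenvalue interlacing then guarantees that the condition number cannot increase.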

Inexact Prior Information
• Add a regularization term, weighted with a parameter

Inexact Prior Information
• Its minimizer can be computed by solving the following linear system:
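A minimal sketch of one way to write this, under stated assumptions: the regularized objective is taken to be tr(Y M Y^T) + beta * ||Y1 - Y1_hat||_F^2, where Y1_hat holds the noisy prior coordinates of the first m points and beta is the weighting parameter. Setting the gradient to zero gives (M + beta * E) Y^T = beta * [Y1_hat^T; 0], with E the identity on the first m rows and zero elsewhere. The symbols, the function name, and the random test matrix are assumptions for illustration.

```python
import numpy as np

def solve_with_inexact_prior(M, Y1_hat, beta):
    """Minimize trace(Y M Y^T) + beta * ||Y[:, :m] - Y1_hat||_F^2 over all of Y.

    M      : (n, n) symmetric matrix.
    Y1_hat : (d, m) noisy prior coordinates of the first m points.
    beta   : positive weight of the regularization term.
    Returns Y with shape (d, n).
    """
    d, m = Y1_hat.shape
    n = M.shape[0]
    A = M.astype(float).copy()
    A[:m, :m] += beta * np.eye(m)      # add beta * I on the prior block
    rhs = np.zeros((n, d))
    rhs[:m] = beta * Y1_hat.T          # right-hand side is zero for unknown points
    return np.linalg.solve(A, rhs).T

# Demo with a random symmetric positive definite stand-in for M (an assumption).
rng = np.random.default_rng(1)
B = rng.normal(size=(8, 8))
M = B @ B.T + np.eye(8)
Y1_hat = rng.normal(size=(2, 3))
Y_soft = solve_with_inexact_prior(M, Y1_hat, beta=1.0)   # prior only loosely enforced
Y_hard = solve_with_inexact_prior(M, Y1_hat, beta=1e8)   # prior almost exactly enforced
```

A small beta lets the solution drift away from unreliable prior coordinates, while a very large beta approaches the exact-prior case.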

Experiment Results
• “Incomplete tire”
  -- compare with basic LLE and LTSA
  -- test with different numbers of prior points
• Upper-body tracking
  -- use SS-LTSA
  -- test the inexact prior information algorithm

Incomplete Tire

Relative error with different numbers of prior points

Upper-body tracking

Results of SS-LTSA

Results of the inexact prior information algorithm

Conclusions
• Manifold and manifold learning
• Semi-supervised manifold learning
• Future work

Thank you!
