Maximum likelihood estimation of intrinsic dimension
Authors: Elizaveta Levina & Peter J. Bickel
Presented by: Ligen Wang
Plan
• Problem
• Some popular methods
• MLE approach
• Statistical behaviors
• Evaluation
• Conclusions
Problem
• Facts:
  • Much real-life high-dimensional data is not truly high-dimensional
  • It can often be summarized effectively in a space of much lower dimension
• Why discover this low-dimensional structure?
  • It helps improve performance in classification and other applications
• Our target:
  • What exactly is this lower dimension, i.e., the intrinsic dimension?
• Why this dimension matters:
  • If our estimate is too low, distinct features are collapsed onto the same dimension
  • If it is too high, the projection becomes noisy and unstable
Some popular methods
• PCA
  • The user decides the dimension by choosing how much variance to preserve (see the sketch below)
• LLE
  • The user provides the manifold dimension
• ISOMAP
  • Provides error curves that can be ‘eyeballed’ to estimate the dimension
• Etc.
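As a minimal illustration of the PCA convention mentioned above: pick the smallest number of components whose cumulative explained variance reaches a user-chosen threshold. This sketch is not from the paper; the function name and the 0.95 threshold are assumptions for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_dimension(X, variance_threshold=0.95):
    """Smallest number of principal components whose cumulative
    explained variance reaches the (user-chosen) threshold."""
    pca = PCA().fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumulative, variance_threshold) + 1)

# Example: a noisy 2-D plane embedded in 10-D space
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(1000, 10))
print(pca_dimension(X))  # typically prints 2
```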
Conclusions
• MLE produces good results on a range of simulated (both noise-free and noisy) and real datasets
• It outperforms the two other methods it was compared against
• It suffers from a negative bias in high dimensions (see the sketch below)
  • Reason: the approximation assumes observations falling in a small sphere, which requires a very large sample size when the dimension is high
• Good news: in practice, the intrinsic dimension is low for most interesting applications
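The slides do not reproduce the estimator itself. As a rough sketch of the Levina–Bickel k-nearest-neighbor MLE (my paraphrase of the paper, with hypothetical function names and the choice k=10 as an assumption): for each point, invert the average log-ratio of the k-th neighbor distance to the closer neighbor distances, then average the per-point estimates.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dimension(X, k=10):
    """Per-point estimate m_hat(x) = 1 / mean_j log(T_k(x) / T_j(x)),
    j = 1..k-1, where T_j(x) is the distance to the j-th nearest
    neighbor; the final estimate averages m_hat over all points."""
    # k+1 neighbors because each point is returned as its own nearest neighbor
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)
    dist = dist[:, 1:]                            # drop the zero self-distance
    log_ratios = np.log(dist[:, -1:] / dist[:, :-1])
    m_hat = 1.0 / np.mean(log_ratios, axis=1)     # per-point estimates
    return float(np.mean(m_hat))                  # average over points

# Example: a 5-D uniform cube embedded in 20-D by an orthonormal map
rng = np.random.default_rng(0)
latent = rng.uniform(size=(2000, 5))
Q, _ = np.linalg.qr(rng.normal(size=(20, 20)))
X = latent @ Q[:5, :]
print(mle_intrinsic_dimension(X))  # close to 5, with a slight negative bias
```

The example also illustrates the concluding point: the small-sphere approximation behind the estimator leads to a mild underestimate, which grows more noticeable as the true dimension increases.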