Fast Krylov Methods for N-Body Learning
Maryam Mahdaviani, Nando de Freitas, Yang Wang and Dustin Lang, University of British Columbia

We present a family of low-storage, fast algorithms for:
• Spectral clustering
• Dimensionality reduction
• Gaussian processes
• Kernel methods

The inverse problem (e.g. in GPs) and the eigenvalue problem (e.g. in spectral clustering) have a computational cost of O(N^3) and a storage requirement of O(N^2) in the number of data points N.

Step 1: Krylov subspace iteration
Arnoldi-based GMRES solves Ax = b using only matrix-vector products, reducing the cost from O(N^3) to O(N^2 x number of iterations). Other similar Krylov methods: Lanczos, conjugate gradients, MINRES.

Step 2: Fast N-body methods
Each matrix-vector product is a kernel sum between a target point x_i and the source points x_j, which fast multipole methods (Greengard et al.) accelerate:
• Fast Gauss Transform (FGT)
• Improved FGT (IFGT)
• Dual trees (Gray & Moore)
These work in low dimensions but reduce each product to O(N), bringing the overall cost down from O(N^2 x number of iterations) to O(N log N x number of iterations); the storage is now O(N) instead of O(N^2).

Does Step 1 still work with Step 2? In other words, how do the errors of the approximate matrix-vector products accumulate over successive iterations? In the paper we prove that:
1) The deviation in residuals is upper bounded.
2) The orthogonality of the Krylov subspace can be preserved.
These upper bounds are stated in terms of the measured residuals, which will enable us to design adaptive algorithms. Minimal code sketches of the main ideas follow the experimental results below.

Experimental results
• Spectral clustering and image segmentation: a generalized eigenvalue problem.
• Gaussian processes with high-dimensional features: using GPs to predict the labels of 128-dimensional SIFT features for object class detection and localization. Thousands of features per image imply a very large covariance matrix. Methods compared: original (exact), IFGT, dual tree, Nyström.
• Stochastic neighbor embedding (Hinton and Roweis): projecting to low dimensions. The expensive step requires solving two kernel estimates. Embeddings on the S-curve and Swiss-roll datasets. [Figure panels: true manifold, sampled data, embedding of SNE, embedding of SNE+IFGT.]
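To make Steps 1 and 2 concrete, here is a minimal sketch (not the paper's implementation) of a Krylov solve in which the matrix is never formed: SciPy's GMRES is driven by a matrix-vector product that is exactly the naive O(N^2) Gauss transform a fast N-body method would replace. The GP-style system (K + noise^2 I)x = b, the bandwidth, and the noise level are illustrative assumptions.

```python
# A minimal sketch, NOT the paper's code: GMRES (Step 1) solving a GP-style
# system (K + noise^2 I) x = b, where K is an N x N Gaussian kernel matrix
# that is never stored. Each Krylov iteration calls gauss_transform, the
# naive O(N^2)-time, O(N)-storage kernel sum that an FGT/IFGT/dual-tree
# method (Step 2) would approximate in roughly O(N) time.
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(0)
N = 500
X = rng.standard_normal((N, 3))        # N data points in a low dimension
b = rng.standard_normal(N)
bandwidth, noise = 1.0, 0.1            # illustrative values

def gauss_transform(q):
    """(K q)_i = sum_j exp(-||x_i - x_j||^2 / h^2) q_j, computed row by row."""
    out = np.empty_like(q)
    for i in range(N):
        d2 = ((X - X[i]) ** 2).sum(axis=1)
        out[i] = np.exp(-d2 / bandwidth ** 2) @ q
    return out

A = LinearOperator((N, N), matvec=lambda q: gauss_transform(q) + noise ** 2 * q,
                   dtype=float)
x, info = gmres(A, b)                  # info == 0 signals convergence
```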
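A similar sketch for the spectral-clustering experiment, assuming the standard normalized-cut formulation: the generalized eigenvalue problem (D - W)v = lambda D v can be recast as finding the top eigenvectors of D^{-1/2} W D^{-1/2}, which Lanczos (SciPy's eigsh) computes from matrix-vector products alone, so the same kernel-sum acceleration applies. The bandwidth h and the random data are placeholders.

```python
# A minimal sketch, assuming the normalized-cut recasting of
# (D - W) v = lambda D v as a top-eigenvector problem for D^{-1/2} W D^{-1/2}.
# Lanczos (eigsh) only needs the product W q, i.e. one kernel sum per
# iteration, so a fast N-body method slots in directly.
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

rng = np.random.default_rng(1)
N = 400
X = rng.standard_normal((N, 2))
h = 0.5                                 # illustrative kernel bandwidth

def affinity_matvec(q):
    """Naive O(N^2) product W q with W_ij = exp(-||x_i - x_j||^2 / h^2)."""
    out = np.empty_like(q)
    for i in range(N):
        d2 = ((X - X[i]) ** 2).sum(axis=1)
        out[i] = np.exp(-d2 / h ** 2) @ q
    return out

deg = affinity_matvec(np.ones(N))       # degrees: D = diag(W 1)
scale = 1.0 / np.sqrt(deg)

A = LinearOperator((N, N), matvec=lambda q: scale * affinity_matvec(scale * q),
                   dtype=float)
vals, vecs = eigsh(A, k=2, which='LM')  # one kernel sum per Lanczos step
```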
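Finally, a sketch of why SNE's expensive step is two kernel estimates: each gradient evaluation needs Gaussian affinities over the inputs (P) and over the current embedding (Q), both naive O(N^2) sums of exactly the kind the IFGT accelerates. The gradient follows Hinton and Roweis; the single shared bandwidths hx and hy (in place of per-point variances) and all variable names are illustrative simplifications.

```python
# A minimal sketch of the SNE gradient: the cost sum_i KL(P_i || Q_i) has
# gradient 2 sum_j (p_{j|i} - q_{j|i} + p_{i|j} - q_{i|j}) (y_i - y_j).
# P and Q are the "two kernel estimates": Gaussian affinities over the inputs
# X and over the embedding Y. Both are naive O(N^2) sums here.
import numpy as np

def cond_probs(Z, h):
    """Row-normalized Gaussian affinities p_{j|i}, with self-affinity zeroed."""
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    P = np.exp(-d2 / h ** 2)
    np.fill_diagonal(P, 0.0)
    return P / P.sum(axis=1, keepdims=True)

def sne_gradient(X, Y, hx=1.0, hy=1.0):
    P = cond_probs(X, hx)               # kernel estimate in the input space
    Q = cond_probs(Y, hy)               # kernel estimate in the embedding space
    M = (P - Q) + (P - Q).T             # symmetrized mismatch
    # grad_i = 2 * sum_j M_ij (y_i - y_j)
    return 2.0 * (M.sum(axis=1, keepdims=True) * Y - M @ Y)
```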