Feature Generation: Linear Transforms By Zhang Hongxin State Key Lab of CAD&CG 2004-03-24
Outline • Introduction • PCA and SVD • ICA • Other transforms
Introduction • Goal: choose suitable transforms so as to obtain high “information packing”: raw data -> meaningful features. • Unsupervised/automatic methods. • The transform exploits and removes information redundancies in the data.
Basis Vectors and Images • Input samples: N-dimensional vectors x. • A unitary NxN matrix A gives the transformed vector y = A^H x. • Basis vector representation: x = A y = Σ_i y(i) a_i, where the a_i are the columns of A.
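As an illustration (not part of the original slides), here is a minimal numpy sketch of the unitary transform y = A^H x and its basis-vector expansion; the matrix A and the sample x are arbitrary stand-ins:

```python
import numpy as np

# Minimal sketch: a unitary transform y = A^H x and the reconstruction of x
# as a weighted sum of the basis vectors a_i (the columns of A).
rng = np.random.default_rng(0)
N = 4
A, _ = np.linalg.qr(rng.standard_normal((N, N)))  # any orthogonal/unitary A works

x = rng.standard_normal(N)                     # input sample
y = A.conj().T @ x                             # y = A^H x

x_rec = sum(y[i] * A[:, i] for i in range(N))  # x = A y = sum_i y(i) a_i
assert np.allclose(x, x_rec)
```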
Basis Vectors and Images (cont.) • When X is an N x N image, the corresponding transform matrix A becomes prohibitively large (N² x N² entries). • An alternative possibility: let U and V be two unitary matrices and set Y = U^H X V. • U and V can be chosen so that Y is diagonal, leaving only N numbers to carry the information.
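A hedged numpy sketch of this alternative: the SVD (covered later in this deck) supplies exactly such a pair of unitary matrices for any X:

```python
import numpy as np

# The SVD yields unitary U, V with Y = U^H X V diagonal, so only N values
# (instead of N^2) carry the information of the transformed image.
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 8))        # stand-in for an N x N image

U, s, Vh = np.linalg.svd(X)            # X = U diag(s) V^H
Y = U.conj().T @ X @ Vh.conj().T       # V = Vh^H, hence Y = U^H X V

assert np.allclose(Y, np.diag(s))      # Y is diagonal
```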
The Karhunen-Loeve Transform • Goal: to generate features that are optimally uncorrelated, that is, E[y(i)y(j)] = 0 for i ≠ j. • Correlation matrix: R_x = E[x x^T]. • R_x is symmetric, so A can be chosen so that its columns are the orthonormal eigenvectors of R_x; then R_y = A^T R_x A is diagonal, with the eigenvalues λ_i of R_x on its diagonal.
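A minimal numpy sketch of the KL transform on synthetic data (the mixing matrix is an arbitrary stand-in), checking that the transformed components come out uncorrelated:

```python
import numpy as np

# KL transform: estimate R_x from samples, take its orthonormal eigenvectors
# as the columns of A, and verify that R_y = A^T R_x A is diagonal.
rng = np.random.default_rng(2)
n, N = 10_000, 3
X = rng.standard_normal((n, N)) @ rng.standard_normal((N, N)).T  # rows: samples

Rx = X.T @ X / n                       # correlation matrix R_x = E[x x^T]
lam, A = np.linalg.eigh(Rx)            # columns of A: eigenvectors of R_x

Y = X @ A                              # y = A^T x for every sample
Ry = Y.T @ Y / n
assert np.allclose(Ry, np.diag(lam))   # E[y(i)y(j)] = 0 for i != j
```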
Properties of the KL transform • Mean-square-error approximation: keep only m of the N components and reconstruct x̂ = Σ_{i<m} y(i) a_i. • Error estimation: E[||x − x̂||²] = Σ_{i=m..N−1} λ_i, the sum of the eigenvalues of the discarded components (an approximation, not an exact reconstruction!).
Principal Component Analysis • Choose the eigenvectors corresponding to the m largest eigenvalues of the correlation matrix to obtain the minimal error. • This is also the minimum MSE compared with any other approximation of x by an m-dimensional vector. • A different form: compute A from the eigenvectors of the covariance matrix (i.e., subtract the mean first) instead of the correlation matrix.
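A short numpy check of both properties on synthetic data: projecting onto the m principal eigenvectors, and verifying that the mean-square error equals the sum of the discarded eigenvalues (as stated on the previous slide):

```python
import numpy as np

# PCA as a truncated KL expansion: keep the m eigenvectors with the largest
# eigenvalues; the MSE then equals the sum of the discarded eigenvalues.
rng = np.random.default_rng(3)
n, N, m = 20_000, 5, 2
X = rng.standard_normal((n, N)) @ rng.standard_normal((N, N)).T

Rx = X.T @ X / n
lam, V = np.linalg.eigh(Rx)            # eigenvalues in ascending order
lam, V = lam[::-1], V[:, ::-1]         # re-sort in descending order

Am = V[:, :m]                          # m principal eigenvectors
X_hat = (X @ Am) @ Am.T                # project onto the principal subspace

mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
assert np.isclose(mse, lam[m:].sum())  # error = sum of dropped eigenvalues
```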
Remarks on PCA • Total variance: among all possible sets of m features obtained via any orthogonal linear transformation on x, the KL features have the largest sum of variances, Σ_{i<m} λ_i. • Entropy: H_y = −E[ln p_y(y)] measures the randomness of the feature vector; for zero-mean Gaussian y, H_y = (1/2) ln|R_y| + (N/2) ln(2πe).
Geometric interpretation • If the data points form an ellipsoidally shaped cloud, the eigenvectors are the principal axes of this hyper-ellipsoid. • The first principal axis is the line that passes through its greatest dimension.
Singular value decomposition • SVD of X: X = U Σ V^H. • Σ is diagonal and holds the singular values of X, the square roots of the eigenvalues of X X^H. • U and V are unitary matrices (their columns are the eigenvectors of X X^H and X^H X, respectively).
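A minimal numpy sketch of the decomposition and of the best rank-k approximation it yields (the data are an arbitrary stand-in):

```python
import numpy as np

# SVD: X = U diag(s) V^H with unitary U, V. Truncating to the k largest
# singular values gives the best rank-k approximation in the Frobenius norm.
rng = np.random.default_rng(4)
X = rng.standard_normal((6, 4))

U, s, Vh = np.linalg.svd(X, full_matrices=False)
assert np.allclose(X, U @ np.diag(s) @ Vh)        # exact reconstruction

k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]       # rank-k approximation
err = np.linalg.norm(X - X_k)                     # sqrt of sum of dropped s_i^2
assert np.isclose(err, np.sqrt((s[k:] ** 2).sum()))
```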
An example: Eigenfaces • G. D. Finlayson, B. Schiele & J. Crowley. Comprehensive colour image normalisation. ECCV 98, pp. 475-490.
Independent component analysis • Goal: seek independence rather than mere uncorrelatedness of the data (the sketch below shows why the two differ). • Given the set of input samples X, determine an NxN invertible matrix W such that the entries y(i) of the transformed vector y = W x are mutually independent. • ICA is meaningful only when the involved random variables are non-Gaussian: for Gaussian variables, uncorrelatedness already implies independence.
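A tiny numpy example of why the distinction matters: two variables can be exactly uncorrelated while one is a deterministic function of the other:

```python
import numpy as np

# s is symmetric around zero, so s and s**2 are uncorrelated (E[s^3] = 0),
# yet s**2 is fully determined by s: uncorrelated does not imply independent.
rng = np.random.default_rng(5)
s = rng.uniform(-1, 1, 100_000)
t = s ** 2

print(np.corrcoef(s, t)[0, 1])   # close to 0.0 despite complete dependence
```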
ICA based on Second- and Fourth-order Cumulants • Hint: make the second- and fourth-order cross-cumulants of the output zero. • Step 1: perform a PCA on the input data, y = A^T x; this zeroes the second-order cross-cumulants. • Step 2: compute another unitary matrix Â so that the fourth-order cross-cumulants of the components of ŷ = Â^T y are zero; this is equivalent to a matrix diagonalization problem. • Finally, the independent components are given by the combined transform ŷ = (A Â)^T x.
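A hedged 2-D sketch of the two-step procedure on synthetic data; for simplicity, a grid search over rotation angles stands in for the matrix-diagonalization step (in 2-D, every real unitary matrix is a rotation, possibly composed with a reflection):

```python
import numpy as np

# Step 1 whitens the mixtures with PCA; Step 2 searches for the rotation whose
# output zeroes the fourth-order cross-cumulant cum(y1, y1, y2, y2).
rng = np.random.default_rng(6)
n = 50_000
S = rng.uniform(-1, 1, (n, 2))                  # non-Gaussian independent sources
X = S @ np.array([[1.0, 0.5], [0.3, 1.0]]).T    # observed mixtures

# Step 1: PCA whitening (all second-order cross-cumulants become zero).
Rx = X.T @ X / n
lam, E = np.linalg.eigh(Rx)
Z = X @ E / np.sqrt(lam)                        # whitened: identity correlation

def cross_cum(Y):
    # cum(y1, y1, y2, y2) = E[y1^2 y2^2] - E[y1^2]E[y2^2] - 2 E[y1 y2]^2,
    # evaluated for whitened data (unit variances).
    y1, y2 = Y[:, 0], Y[:, 1]
    return np.mean(y1**2 * y2**2) - 1.0 - 2.0 * np.mean(y1 * y2) ** 2

def rot(a):
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

# Step 2: choose the rotation minimizing the fourth-order cross-cumulant.
best = min(np.linspace(0, np.pi / 2, 181),
           key=lambda a: abs(cross_cum(Z @ rot(a))))
Y = Z @ rot(best)                               # approximately independent components
```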
ICA based on mutual information • Minimize the mutual information of the transformed variables y(i), which vanishes exactly when the components are independent; this leads to an iterative (gradient-based) method for estimating W.
Other transforms • Discrete Fourier Transform (DFT) • Discrete Wavelet Transform (DWT) • Think about the relationship among these linear transforms: unlike KL/PCA, the DFT and DWT use fixed, data-independent basis vectors (see the sketch below).
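To make the relationship concrete, here is a minimal numpy sketch: the DFT is a unitary linear transform exactly like those above, but its basis is fixed in advance rather than derived from the data:

```python
import numpy as np

# The unitary DFT matrix: a fixed (data-independent) orthonormal basis,
# in contrast with the data-dependent KL/PCA eigenvector basis.
N = 8
k = np.arange(N)
A = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

assert np.allclose(A.conj().T @ A, np.eye(N))             # A is unitary
x = np.random.default_rng(7).standard_normal(N)
assert np.allclose(A @ x, np.fft.fft(x) / np.sqrt(N))     # matches numpy's FFT
```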