Image Compression by Learning Matrix Ortho-normal Bases • Karthik Gurumoorthy • Ajit Rajwade • Arunava Banerjee • Anand Rangarajan • Department of CISE, University of Florida
Overview • A new approach to lossy image compression based on machine learning. • Key idea: learning matrix ortho-normal bases from training data to code images efficiently. • Applied to the compression of well-known face databases such as ORL and Yale. • Competitive with JPEG.
Background: Images as Vectors • Conventional learning methods in vision, such as PCA and ICA, treat an image as a vector.
Background: Images as Matrices • Our approach: the image is treated as a matrix, following Rangarajan [EMMCVPR-2001] and Ye [JMLR-2004].
Image Patches • An image is divided into N patches of a fixed size, each treated as a matrix.
SVD of a Patch • $P = U S V^T$ • U and V: ortho-normal matrices • S: diagonal matrix of singular values
The singular values of a patch decrease roughly exponentially, so a few of them capture most of the patch. This is useful for compression (e.g., SSVD [Ranade et al., IVC 2007]).
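As a rough illustration of why fast-decaying singular values help, the sketch below keeps only the k largest singular values of a patch and measures the reconstruction error. The function name `truncated_svd`, the synthetic 8x8 patch and the choice k = 2 are assumptions made for this example and do not come from the slides.

```python
import numpy as np

def truncated_svd(P, k):
    """Rank-k approximation of patch P built from its k largest singular values."""
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    # Scale the first k columns of U by the singular values, then project back.
    P_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return P_k, s

# Hypothetical 8x8 patch: a smooth pattern plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)
patch = np.outer(np.sin(2 * x), np.cos(x)) + 0.01 * rng.normal(size=(8, 8))

approx, sing = truncated_svd(patch, k=2)
# Squared Frobenius error equals the energy of the discarded singular values.
err = np.linalg.norm(patch - approx) ** 2
discarded = np.sum(sing[2:] ** 2)
```

Storing a rank-k approximation needs k singular values plus k columns each of U and V, which is the storage cost the next slides try to avoid paying per patch.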
SVD for Ensemble of Patches? • Consider a set of N image patches: $\{P_i\}_{i=1}^{N}$. • SVD of each patch gives: $P_i = U_i S_i V_i^T$. • Costly in terms of storage, as we need to store N ortho-normal basis pairs $(U_i, V_i)$.
Common Ortho-normal Basis-pairs? • Produce K << N ortho-normal basis-pairs, common to all N patches. • Since K << N, storing the basis pairs is not expensive.
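Away from SVD • With a common basis pair $(U, V)$, the projection matrix $S = U^T P V$ of a patch $P$ is in general: • Non-diagonal • Non-sparse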
Away from SVD • What sparse matrix $W$ will optimally reconstruct a patch $P$ from a basis pair $(U, V)$, i.e. $P \approx U W V^T$? • Optimally = least error: $\| P - U W V^T \|_F^2$. • Sparse = the matrix has at most $T$ non-zero elements.
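Away from SVD • We have a simple, provably optimal greedy method to compute such a sparse matrix $W$ for a patch $P$ and basis pair $(U, V)$. • Compute the matrix $S = U^T P V$. • In matrix $S$, nullify all except the $T$ largest elements (in absolute value), where $T$ is the allowed number of non-zero elements, to produce $W$.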
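The greedy step can be sketched in a few lines of NumPy. The function name `sparse_projection`, the symbols W and T, and the random test data are assumptions introduced for this illustration. Because U and V are ortho-normal, the squared reconstruction error equals the energy of the coefficients that were zeroed, which is why keeping the largest-magnitude entries is optimal.

```python
import numpy as np

def sparse_projection(P, U, V, T):
    """Project P onto the basis pair (U, V) and keep the T largest-magnitude coefficients."""
    S = U.T @ P @ V                      # full (dense) projection matrix
    flat = S.ravel()
    W = np.zeros_like(flat)
    if T > 0:
        keep = np.argsort(np.abs(flat))[-T:]   # indices of the T largest |coefficients|
        W[keep] = flat[keep]
    return W.reshape(S.shape)

# Hypothetical data: a random 8x8 patch and random ortho-normal bases from QR.
rng = np.random.default_rng(0)
P = rng.normal(size=(8, 8))
U, _ = np.linalg.qr(rng.normal(size=(8, 8)))
V, _ = np.linalg.qr(rng.normal(size=(8, 8)))

W = sparse_projection(P, U, V, T=10)
err = np.linalg.norm(P - U @ W @ V.T) ** 2
S_full = U.T @ P @ V
dropped_energy = np.sum(S_full ** 2) - np.sum(W ** 2)
```

Here `err` and `dropped_energy` agree up to rounding, so minimizing the error is the same as maximizing the energy of the retained coefficients.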
Learning algorithm • Given: a set of N image patches. • Learn: K << N ortho-normal basis pairs, together with the projection matrices of the patches and the memberships of the patches in the basis pairs.
Summary of Training Algorithm • Input: N image patches of a fixed size. • Output: K pairs of ortho-normal bases, called the dictionary.
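The slides do not give the update rules, so the following is only a sketch of one plausible way to learn such a dictionary, not the authors' algorithm. It assumes hard memberships (each patch belongs to the basis pair that reconstructs it best), a fixed sparsity T, and basis updates through the orthogonal Procrustes solution. All names (`train`, `sparsify`, `polar`) and the random training data are hypothetical.

```python
import numpy as np

def sparsify(S, T):
    """Keep the T largest-magnitude entries of S and zero the rest."""
    flat = S.ravel()
    W = np.zeros_like(flat)
    if T > 0:
        keep = np.argsort(np.abs(flat))[-T:]
        W[keep] = flat[keep]
    return W.reshape(S.shape)

def polar(Z):
    """Orthogonal Procrustes solution: the orthogonal Q that maximizes trace(Q^T Z)."""
    A, _, Bt = np.linalg.svd(Z)
    return A @ Bt

def train(patches, K, T, n_iter=5, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed initialization: SVD bases of K randomly chosen patches.
    bases = []
    for i in rng.choice(len(patches), size=K, replace=False):
        U, _, Vt = np.linalg.svd(patches[i])
        bases.append((U, Vt.T))
    history = []
    for _ in range(n_iter):
        # Projection step: best T-sparse coefficients of every patch in every basis pair.
        S = [[sparsify(U.T @ P @ V, T) for (U, V) in bases] for P in patches]
        err = np.array([[np.sum((P - U @ S[i][a] @ V.T) ** 2)
                         for a, (U, V) in enumerate(bases)]
                        for i, P in enumerate(patches)])
        # Membership step: assign each patch to its best basis pair.
        member = err.argmin(axis=1)
        history.append(err.min(axis=1).sum())
        # Basis step: update each pair from its member patches, coefficients held fixed.
        for a in range(K):
            idx = np.flatnonzero(member == a)
            if idx.size == 0:
                continue  # leave unused pairs unchanged
            U, V = bases[a]
            U = polar(sum(patches[i] @ V @ S[i][a].T for i in idx))
            V = polar(sum(patches[i].T @ U @ S[i][a] for i in idx))
            bases[a] = (U, V)
    # member and history describe the state before the final basis update.
    return bases, member, history

# Hypothetical training set: 30 random 8x8 patches.
data_rng = np.random.default_rng(1)
patches = [data_rng.normal(size=(8, 8)) for _ in range(30)]
bases, member, history = train(patches, K=3, T=8)
```

Each of the three steps can only lower (or keep) the total reconstruction error, so `history` is non-increasing. Real training would use image patches and a stopping criterion rather than a fixed iteration count.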
Testing phase • Divide each test image into patches of the same size as the training patches. • Fix the per-pixel average error (say e), similar to the “quality” user-parameter in JPEG.
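One way to read the testing phase is sketched below: for each patch, try every basis pair in the dictionary, drop as many small coefficients as the error budget allows, and keep the pair that needs the fewest coefficients. Two assumptions are made here that the slides do not state: the error e is interpreted as mean squared error per pixel, and the basis pair is selected by smallest coefficient count. The function name `encode_patch` and the toy dictionary are hypothetical.

```python
import numpy as np

def encode_patch(P, dictionary, e):
    """Return (basis index, number of coefficients, sparse matrix) meeting the error budget e."""
    best = None
    for a, (U, V) in enumerate(dictionary):
        S = U.T @ P @ V
        flat = S.ravel()
        order = np.argsort(flat ** 2)            # smallest-energy coefficients first
        dropped = np.cumsum(flat[order] ** 2)    # squared error after dropping j+1 of them
        # Largest number of coefficients we can drop with total error <= e * (pixel count).
        n_drop = int(np.searchsorted(dropped, e * P.size, side="right"))
        T = flat.size - n_drop
        if best is None or T < best[1]:
            W = flat.copy()
            W[order[:n_drop]] = 0.0
            best = (a, T, W.reshape(S.shape))
    return best

# Toy dictionary: the identity pair and the SVD bases of a rank-1 patch.
P = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0])
U_svd, _, Vt_svd = np.linalg.svd(P)
dictionary = [(np.eye(4), np.eye(4)), (U_svd, Vt_svd.T)]

a, T, W = encode_patch(P, dictionary, e=1e-12)
U, V = dictionary[a]
mse = np.mean((P - U @ W @ V.T) ** 2)
```

In this toy case the SVD pair represents the rank-1 patch with a single coefficient, while the identity pair would need all 16, so the second pair is selected.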
Results: ROC Curve (ORL Database) RPP = number of bits per pixel
Sample Reconstructions • [Figure: reconstructed images, panels labelled 0.92 bits, 1.36 bits, 0.5 bits, 1.78 bits and 3.023 bits]
Results: ORL Database • Size of original database is 3.46 MB. • Size of dictionary of 50 ortho-normal basis pairs is 56 KB = 0.05 MB. • Size of database after compression and coding with our method at e = 0.0001 is 1.3 MB. • Total compression rate achieved is 61%.
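The 61% figure can be checked from the sizes quoted on the slide. The short calculation below assumes that "compression rate" means the fraction of storage saved and that the dictionary is counted as part of the compressed output. Both readings are assumptions, but they reproduce the quoted number.

```python
# Sizes in MB, as quoted on the slide.
original_mb = 3.46
dictionary_mb = 0.05
compressed_mb = 1.3

# Total storage after compression includes the learned dictionary.
total_mb = compressed_mb + dictionary_mb
saving = 1.0 - total_mb / original_mb

print(f"stored: {total_mb:.2f} MB, saving: {saving:.0%}")
```

Leaving the dictionary out would give a saving of about 62%, so the slide's 61% is consistent with the dictionary being included.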
Results: ROC Curve (Yale Database) RPP = number of bits per pixel
Conclusions • New lossy image compression method using machine learning. • Key idea 1: matrix-based image representation. • Key idea 2: learning a small set of matrix ortho-normal basis pairs tuned to a database. • Results competitive with the JPEG standard. • Future extensions: video compression.
References • A. Rangarajan. Learning matrix space image representations. Energy Minimizing Methods in Computer Vision and Pattern Recognition, 2001. • J. Ye. Generalized low rank approximation of matrices. Journal of Machine Learning Research, 2004. • M. Aharon, M. Elad and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 2006. • A. Ranade, S. Mahabalarao and S. Kale. A variation on SVD based image compression. Image and Vision Computing, 2007.
Questions ???