Extensions of Non-Negative Matrix Factorization (NMF) to Higher Order Data. Morten Mørup, Department of Signal Processing, Informatics and Mathematical Modeling, Technical University of Denmark, mm@imm.dtu.dk, webpage: www.imm.dtu.dk/~mm.
Increasing attention has lately been given to Non-negative Matrix Factorization (NMF) due to its part-based representation and ease of algorithmic implementation (Lee & Seung, 1999 & 2001). However, NMF is not in general unique; it is unique only when the data adequately spans the positive orthant (Donoho and Stodden, 2004). Consequently, constraints in the form of sparsity are useful to achieve unique decompositions (Hoyer 2002, 2004; Eggert & Körner 2004). As a result, algorithms for sparse coding using multiplicative updates have been derived (Eggert & Körner 2004; Mørup & Schmidt 2006b).

Mathematical notation:

Title of Nature article on NMF from 1999

NMF is based on gradient descent: Each component is updated by a step in the negative gradient direction.

NMF uses the concept of multiplicative updates: The derivative of the cost function can be split into a positive part $(\partial C/\partial W_{i,d})^{+}$ and a negative part $(\partial C/\partial W_{i,d})^{-}$. Choosing the step size as the ratio of $W_{i,d}$ to the positive part of the derivative, $(\partial C/\partial W_{i,d})^{+}$, yields multiplicative updates, since the gradient step then cancels the $W_{i,d}$ term in the gradient-based update.

The resulting NMF updates: The least squares (LS) and Kullback-Leibler (KL) divergence updates derived from the multiplicative update approach (Lee & Seung, 2001).

Sparse Coding NMF: Sparse Coding NMF regularizes H while keeping W normalized, so that the regularization is not simply achieved by letting H go to zero while W goes to infinity (Eggert and Körner, 2004; Mørup & Schmidt 2006b). $C_{\text{sparse}}(H)$ can be any function with positive derivative; a frequently used function is the 1-norm.

NMF not in general unique: If the data does not adequately span the positive orthant, no unique solution can be obtained. Here the red and green vectors both perfectly span the data points.
However, the green vectors represent the sparsest solution.

NTF (Non-negative Tensor Factorization)
HONMF (Higher Order Non-negative Matrix Factorization)
NTF2D/SNTF2D ((Sparse) Non-negative Tensor Factor 2D Deconvolution)

Model: NTF is based on the PARAFAC model (Harshman 1970; Carroll & Chang 1970; FitzGerald et al., 2005).

Model: The NTF2D is a PARAFAC model convolutive in 2 dimensions (Mørup & Schmidt 2006c).

Model: The HONMF is based on the Tucker model (Tucker, 1966), where non-negativity is imposed on all modalities (Mørup et al. 2006e).

Algorithms

The PARAFAC model is a generalization of factor analysis to higher orders, where the data is explained by a sum of outer products of factor effects pertaining to each modality. To the right is given the general expression of the PARAFAC model for N-order tensors.

Three equivalent ways of stating the Tucker model. The Tucker model accounts for all possible linear interactions between the factor effects pertaining to each modality.

Algorithms

Table giving how to update when imposing sparseness on, or normalizing, the various modalities of the model.

Updates for the NTF2D. By including the updates marked in gray, sparseness is imposed on H, forming the SNTF2D.

Data results: The algorithms were used on a dataset containing the inter-trial phase coherence (ITPC) of wavelet-transformed EEG data. Briefly stated, the data consist of 14 subjects recorded during a proprioceptive stimulus consisting of a weight change of the left hand during odd trials and of the right hand during even trials, giving a total of 14·2 = 28 trials. Consequently, the data has the form $X^{\text{Channel} \times \text{Time-Frequency} \times \text{Trials}}$ (Mørup et al. 2006a).

Data results: The algorithms were used to analyze the absolute value of the log spectrogram of stereo recordings of music, i.e. the data had the form $X^{\text{Channel} \times \text{Log-Frequency} \times \text{Time}}$.

Data results: The algorithms were tested on a dataset of flow injection analysis (Nørgaard, 1994; Smilde, 1999), i.e.
$X^{\text{Spectrum} \times \text{Time} \times \text{Batch number}}$, and also on the inter-trial phase coherence (ITPC) of EEG data (see the section on NTF for dataset details).

The HONMF with sparseness imposed on the core and the third modality resulted in a very consistent decomposition of the flow injection data, capturing, without supervision, the true concentrations present in each batch (given by modality 3).

True stereo music: Decomposition result of a real stereo recording of music consisting of a flute and a harp playing "The Fog is Lifting" by Carl Nielsen. Scores are given at the top. Clearly, the SNTF2D separates the log-spectrogram into two components pertaining to the harp and the flute respectively. By spectral masking of the log-spectrograms the two components are reconstructed, revealing that one component indeed pertains to the harp whereas the other pertains to the flute.

Synthetic data: Result obtained by the SNTF2D algorithm (bottom panel) when decomposing the log-spectrogram of synthetically generated stereo music (middle panel) generated from the true components given in the top panel.

While the HONMF is not unique when no sparseness is imposed, it becomes unique when sparseness is imposed on the core. Here, this reveals that the appropriate model for the data is a PARAFAC model (Mørup et al., 2006e). Furthermore, the HONMF decomposition gives a more part-based representation that is easier to interpret than the solution found by HOSVD (Lathauwer et al., 2000).

The NTF decomposition reveals a right parietal activity mainly present during odd trials, corresponding to left hand stimuli, as well as a more frontal and a higher-frequency central parietal activity.

References:

Carroll, J. D. and Chang, J. J. Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart-Young" decomposition. Psychometrika 35, 1970, 283-319

Eggert, J. and Körner, E. Sparse coding and NMF. In Neural Networks, volume 4, pages 2529-2533, 2004

Eggert, J. et al. Transformation-invariant representation and NMF. In Neural Networks, volume 4, pages 2535-2539, 2004

FitzGerald, D. et al. Non-negative tensor factorization for sound source separation. In Proceedings of Irish Signals and Systems Conference, 2005

FitzGerald, D. and Coyle, E. C. Sound source separation using shifted non-negative tensor factorization. In ICASSP2006, 2006

FitzGerald, D. et al. Shifted non-negative matrix factorization for sound source separation. In Proceedings of the IEEE conference on Statistics in Signal Processing, 2005

Harshman, R. A. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics 16, 1970, 1-84

Lathauwer, Lieven De and Moor, Bart De and Vandewalle, Joos. Multilinear singular value decomposition. SIAM J. Matrix Anal. Appl. 21, 2000, 1253-1278

Lee, D.D. and Seung, H.S. Algorithms for non-negative matrix factorization. In NIPS, pages 556-562, 2000

Lee, D.D. and Seung, H.S. Learning the parts of objects by non-negative matrix factorization. Nature, 1999

Mørup, M., Hansen, L.K. and Arnfred, S.M. Decomposing the time-frequency representation of EEG using nonnegative matrix and multi-way factorization. Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006a

Mørup, M. and Schmidt, M.N. Sparse non-negative matrix factor 2-D deconvolution. Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006b

Mørup, M. and Schmidt, M.N. Non-negative Tensor Factor 2D Deconvolution for multi-channel time-frequency analysis. Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006c

Schmidt, M.N. and Mørup, M. Non-negative matrix factor 2D deconvolution for blind single channel source separation. In ICA2006, pages 700-707, 2006d

Mørup, M., Hansen, L.K. and Arnfred, S.M. Algorithms for Sparse Higher Order Non-negative Matrix Factorization (HONMF). Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006e

Nørgaard, L. and Ridder, C. Rank annihilation factor analysis applied to flow injection analysis with photodiode-array detection. Chemometrics and Intelligent Laboratory Systems 23, 1994, 107-114

Schmidt, M.N. and Mørup, M. Sparse Non-negative Matrix Factor 2-D Deconvolution for Automatic Transcription of Polyphonic Music. Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2005

Smaragdis, P. Non-negative Matrix Factor deconvolution; Extraction of multiple sound sources from monophonic inputs. International Symposium on Independent Component Analysis and Blind Source Separation (ICA)

Smilde, Age K. and Tauler, Roma and Saurina, Javier and Bro, Rasmus. Calibration methods for complex second-order data. Analytica Chimica Acta, 1999, 237-251

Kolda, Tamara G. Multilinear operators for higher-order decompositions. Technical report SAND2006-2081, Sandia National Laboratories, 2006

Tucker, L. R. Some mathematical notes on three-mode factor analysis. Psychometrika 31, 1966, 279-311

Welling, M. and Weber, M. Positive tensor factorization. Pattern Recogn. Lett., 2001

Parts of the above work were done in collaboration with (see also references):

Sidse M. Arnfred, Dr. Med., PhD, Cognitive Research Unit, Hvidovre Hospital, University Hospital of Copenhagen

Mikkel N. Schmidt, Stud. PhD, Department of Signal Processing, Informatics and Mathematical Modeling, Technical University of Denmark

Lars Kai Hansen, Professor, Department of Signal Processing, Informatics and Mathematical Modeling, Technical University of Denmark
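
Illustrative sketch: As an illustration of the multiplicative least-squares updates summarized in the NMF part of this poster, the following is a minimal NumPy sketch and not the author's implementation. The function name nmf_ls, the parameters n_iter, lam and eps, and the random initialization are assumptions made for this example. Setting lam > 0 adds a 1-norm penalty on H and rescales the columns of W to unit length after each step; this is a simplified stand-in for, not a reproduction of, the sparse coding updates of Eggert & Körner (2004).

```python
import numpy as np


def nmf_ls(V, D, n_iter=200, lam=0.0, eps=1e-9, seed=0):
    """Least-squares NMF, V ~ W @ H, using multiplicative updates.

    V      : non-negative data matrix of shape (I, J)
    D      : number of components
    lam    : weight of a 1-norm penalty on H (0 gives plain LS NMF)
    eps    : small constant guarding against division by zero
    Returns W (I, D), H (D, J) and the LS error after each iteration.
    """
    rng = np.random.default_rng(seed)
    I, J = V.shape
    W = rng.random((I, D)) + eps
    H = rng.random((D, J)) + eps
    errors = []
    for _ in range(n_iter):
        # Each factor is multiplied by the ratio of the negative part of
        # the gradient to its positive part, so entries stay non-negative.
        # For H the positive part is W^T W H (+ lam), the negative W^T V.
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        # For W the positive part is W H H^T, the negative part V H^T.
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        if lam > 0:
            # Assumed simplification: fix the scale of W so the penalty on
            # H cannot be evaded by shrinking H while W grows. The product
            # W @ H is left unchanged by this rescaling.
            norms = np.linalg.norm(W, axis=0) + eps
            W /= norms
            H *= norms[:, None]
        errors.append(0.5 * np.sum((V - W @ H) ** 2))
    return W, H, errors
```

Calling W, H, errors = nmf_ls(V, D=3) on a non-negative matrix V returns non-negative factors, and the recorded least-squares error can be inspected to judge convergence.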