Realization, identification and filtering for hidden Markov models using matrix factorization techniques. Bart Vanluyten.
1. INTRODUCTION: Modeling - HMMs - Finite valued process - Open problems - Relation to LSM
Mathematical modeling
• Example: Bel-20 [Figure: Bel-20 index from 04/’06 to 04/’08]
• Process with finite valued output: { ↑, ↓, = }
Hidden Markov model
• Example: Bel-20
• Output process {up, down, unchanged}
• State process {bull market, bear market, stable market}
• State process has Markov property and is hidden
[Figure: state diagram with the states bull market, bear market and stable market, transition percentages on the arrows, and per state the probabilities of BEL20 ↑, BEL20 ↓ and BEL20 =. Portrait: Andrey Markov (1856 - 1922)]
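To make the generating mechanism concrete, here is a minimal simulation sketch of a three-state HMM with the state and output sets named on the slide. The transition matrix `A` and emission matrix `B` are hypothetical placeholders (the percentages in the slide figure cannot be assigned to arrows reliably), and the function name `simulate` is an assumption of this sketch.

```python
import numpy as np

STATES = ["bull market", "bear market", "stable market"]
OUTPUTS = ["up", "down", "unchanged"]

# Hypothetical numbers, NOT the values from the slide; each row sums to 1.
A = np.array([[0.6, 0.2, 0.2],    # transition probabilities between hidden states
              [0.3, 0.5, 0.2],
              [0.3, 0.3, 0.4]])
B = np.array([[0.7, 0.1, 0.2],    # output probabilities given the hidden state
              [0.1, 0.6, 0.3],
              [0.3, 0.3, 0.4]])

def simulate(n_steps, seed=0):
    """Draw a hidden state path and the corresponding observed outputs."""
    rng = np.random.default_rng(seed)
    state = rng.integers(len(STATES))          # arbitrary initial state
    states, outputs = [], []
    for _ in range(n_steps):
        states.append(STATES[state])
        outputs.append(OUTPUTS[rng.choice(len(OUTPUTS), p=B[state])])
        state = rng.choice(len(STATES), p=A[state])
    return states, outputs

states, outputs = simulate(10)
print(outputs)   # only this sequence is visible to an observer; `states` is hidden
```

An observer sees only `outputs`; the realization, identification and estimation problems all start from that restriction.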
Finite-valued processes
• Coin flipping - dice-tossing (with memory): { head, tail }, { 1, 2, ..., 6 }
• Bio-informatics: { A, C, G, T }, e.g. TGGAGCCAACGTGGAATGTCACTAGCTAGCTTAGATGGCTAAACGTAGGAATAC
• Economics: { ↑, ↓, = } [Figure: BEL20 index, 4-06 to 4-08, vertical axis 3.600 to 4.800]
• Speech recognition: { i:, e, æ, a:, ai, ..., z }
Open problems for HMMs
• Realization problem (obtain model from data). Given: string prob’s. Find: HMM generating string prob’s
• Identification problem (obtain model from data). Given: output sequence. Find: HMM that models the sequence
• Estimation problem (use model for estimation). Given: output sequence. Find: state distribution at a given time
Relation to linear stochastic model (LSM)
• Mathematical model for stochastic processes
• Output process: continuous range of values
• State process: continuous range of values
[Figure: block diagram in which noise is added to the state and to the output]
Relation to linear stochastic model
• Hidden Markov model: realization, identification, estimation. Matrix tool: nonnegative matrix factorization
• Linear stochastic model: realization, identification, estimation. Matrix tool: singular value decomposition
Outline
• Matrix factorizations (2nd objective). Given: matrix. Find: low rank approximation of the matrix
• Realization, identification and estimation for HMMs (1st objective)
• Realization problem. Given: string prob’s. Find: HMM generating string prob’s
• Identification problem. Given: output sequence. Find: HMM that models the sequence
• Estimation problem. Given: output sequence. Find: state distribution at a given time
2. MATRIX FACTORIZATIONS: Introduction - Existing factorizations - Structured NMF - NMF without nonneg. factors
Matrix – Decomposition – Rank: example
• Matrix
• Matrix decomposition
• Matrix rank: minimal inner dimension of an exact decomposition
Low rank matrix approximation
• Low rank approximation of a given matrix
• Singular value decomposition (SVD), with orthogonal factors
• SVD yields the (globally) optimal low rank approximation in Frobenius distance
[Portrait: James Sylvester (1814 - 1897)]
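The optimality statement can be checked numerically. The sketch below truncates the SVD to the k largest singular values and compares the Frobenius error with the norm of the discarded singular values; the function name `best_rank_k` and the random test matrix are assumptions of this example.

```python
import numpy as np

def best_rank_k(M, k):
    """Rank-k approximation of M obtained by truncating its SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
M = rng.random((6, 5))          # arbitrary test matrix
M2 = best_rank_k(M, 2)

# The Frobenius error equals the 2-norm of the discarded singular values.
s = np.linalg.svd(M, compute_uv=False)
err = np.linalg.norm(M - M2, "fro")
print(err, np.sqrt(np.sum(s[2:] ** 2)))
```

Note that the truncated factors generally contain negative entries even when `M` is nonnegative, which is the motivation for the nonnegative factorizations that follow.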
Nonnegative matrix factorization
• In some applications the matrix is nonnegative and the factors need to be nonnegative too
• Nonnegative matrix factorization (NMF): a nonnegative matrix is approximated by the product of two nonnegative factors
• Algorithm (Kullback-Leibler divergence) [Lee, Seung]
• This thesis: 2 modifications to NMF
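For reference, a minimal sketch of the multiplicative update rules of Lee and Seung for the Kullback-Leibler divergence. The names `nmf_kl` and `kl_divergence`, the random initialization, the fixed iteration count and the small `eps` guard against division by zero are choices of this sketch, not details taken from the slides.

```python
import numpy as np

def kl_divergence(V, WH, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(V || WH)."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def nmf_kl(V, r, n_iter=200, seed=0, eps=1e-12):
    """Approximate nonnegative V (m x n) by W (m x r) times H (r x n), W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    history = []
    for _ in range(n_iter):
        R = V / (W @ H + eps)                               # elementwise ratio
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)     # update H
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1)[None, :] + eps)     # update W
        history.append(kl_divergence(V, W @ H))
    return W, H, history

V = np.random.default_rng(1).random((8, 6))
W, H, history = nmf_kl(V, r=3)
print(history[0], history[-1])   # the divergence should decrease over the iterations
```

Because the updates only multiply by nonnegative quantities, the factors stay nonnegative without any explicit projection step.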
Structured NMF
• Structured nonnegative matrix factorization: the matrix and all of its factors are nonnegative
• Algorithm (Kullback-Leibler divergence)
• Convergence to stationary point of divergence
Structured NMF: application
• Applications apart from HMMs: clustering data points
• Given: petal width, petal length, sepal width and sepal length of 150 iris flowers
• Asked: Divide 150 flowers into clusters (Setosa, Versicolor, Virginica)
Structured NMF: application
• Computing distance matrix between points
• Applying structured nonnegative matrix factorization on distance matrix
• Clustering obtained from the factorization
[Figure: scatter plots of cluster 1, cluster 2 and cluster 3 for pairs of the features petal length, petal width, sepal length and sepal width]
NMF without nonnegativity of the factors
• NMF without nonnegativity constraints on the factors: the given matrix and the approximating product are nonnegative, the factors themselves are unconstrained
• Example
• We provide an algorithm (Kullback-Leibler divergence)
• The problem formulation makes it easy to deal with upper bounds
NMF without nonnegativity of the factors
• Applications apart from HMMs: database compression
• Given: Database containing 1000 facial images of size 19 x 19 = 361 pixels
• Asked: Compression of database using matrix factorization techniques (361 x 1000 matrix, inner dimension 20)
[Figure: an original image next to its reconstructions by NMF, NMF without nonneg. factors and upperbounded NMF without nonneg. factors, with a marker for values > 1. Kullback-Leibler divergences shown: 564, 339, 383]
3. REALIZATION: Introduction - Realization - Quasi realization - Approx. realization - Modeling DNA
Hidden Markov models: Moore - Mealy
• Moore HMM: nonnegative system matrices
• Mealy HMM: nonnegative system matrices
• Order of the model
Realization problem
• String from the output alphabet
• String probabilities
• String probabilities generated by Mealy HMM
• Positive realization: find a Mealy HMM with nonnegative system matrices such that it generates the given string probabilities
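As an illustration of how a Mealy HMM generates string probabilities, the sketch below uses the common matrix-product form P(y1 ... yt) = pi · Pi(y1) · ... · Pi(yt) · e, with pi a stationary row vector and e a column of ones. This notation, the function name `string_probability` and the numbers of the 2-state model are assumptions of the example; the formula on the original slide did not survive extraction.

```python
import itertools
import numpy as np

# Hypothetical 2-state Mealy HMM over the alphabet {0, 1}.
# PI[y][i, j] = P(next state j, output y | current state i); all entries nonnegative.
PI = {0: np.array([[0.5, 0.1], [0.2, 0.1]]),
      1: np.array([[0.1, 0.3], [0.3, 0.4]])}
pi = np.array([5 / 9, 4 / 9])    # stationary distribution of PI[0] + PI[1]
e = np.ones(2)

def string_probability(u):
    """Probability of observing the output string u (a tuple of symbols)."""
    row = pi
    for y in u:
        row = row @ PI[y]
    return float(row @ e)

# Probabilities of all strings of a fixed length sum to one.
total = sum(string_probability(u) for u in itertools.product(PI, repeat=3))
print(string_probability((0, 1)), total)
```

The realization problem runs in the opposite direction: given only the values returned by such a function, recover matrices that reproduce them.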
Realization problem: importance
• Theoretical importance: transform ‘external’ model into ‘internal’ model
• Realization can be used to identify model from data
[Figure: positive realization, with nonnegative system matrices]
Realizability problem
• Generalized Hankel matrix [Portrait: Hermann Hankel (1839 - 1873)]
• Necessary condition for realizability: Hankel matrix has finite rank
• No necessary and sufficient conditions for realizability are known
• No procedure to compute minimal HMM from string probabilities
• This thesis: two relaxations to positive realization problem
  • Quasi realization problem
  • Approximate positive realization problem
Quasi realization problem
• Quasi realization: system matrices that generate the string probabilities, with no nonnegativity constraints
• Finiteness of rank of Hankel matrix = necessary and sufficient condition for quasi realizability
• Rank of Hankel matrix = minimal order of exact quasi realization
• Quasi realization is easier to compute than positive realization
• Quasi realization typically has lower order than positive realization
• Negative probabilities can occur in the model
• No disadvantage in several estimation applications
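The link between the Hankel rank and a quasi realization can be sketched with a rank factorization of a finite Hankel block. The code below is one standard construction (factor the block via the SVD, then solve for the system matrices with pseudo-inverses); it is not claimed to be the algorithm of the thesis. The 2-state model used to produce the string probabilities and all function names are assumptions of the example.

```python
import itertools
import numpy as np

ALPHABET = (0, 1)
# Hypothetical Mealy HMM, used here only as a source of exact string probabilities.
PI = {0: np.array([[0.5, 0.1], [0.2, 0.1]]),
      1: np.array([[0.1, 0.3], [0.3, 0.4]])}
pi, e = np.array([5 / 9, 4 / 9]), np.ones(2)

def prob(u):
    row = pi
    for y in u:
        row = row @ PI[y]
    return float(row @ e)

def strings_up_to(k):
    return [s for n in range(k + 1) for s in itertools.product(ALPHABET, repeat=n)]

def quasi_realization(prob, k=2, tol=1e-10):
    """Return (pi_q, A_q, e_q, order) reproducing the string probabilities."""
    idx = strings_up_to(k)                      # idx[0] is the empty string
    H = np.array([[prob(u + v) for v in idx] for u in idx])   # Hankel block
    U, s, Vt = np.linalg.svd(H)
    n = int(np.sum(s > tol * s[0]))             # numerical rank = order
    O, C = U[:, :n] * s[:n], Vt[:n, :]          # rank factorization H = O C
    O_pinv, C_pinv = np.linalg.pinv(O), np.linalg.pinv(C)
    A_q = {}
    for y in ALPHABET:
        Hy = np.array([[prob(u + (y,) + v) for v in idx] for u in idx])
        A_q[y] = O_pinv @ Hy @ C_pinv           # may contain negative entries
    return O[0], A_q, C[:, 0], n

def quasi_prob(u, pi_q, A_q, e_q):
    row = pi_q
    for y in u:
        row = row @ A_q[y]
    return float(row @ e_q)

pi_q, A_q, e_q, order = quasi_realization(prob)
print(order, prob((0, 1, 1)), quasi_prob((0, 1, 1), pi_q, A_q, e_q))
```

The recovered matrices are only determined up to a similarity transform, so they need not be nonnegative even though the string probabilities they produce are.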
Partial quasi realization problem
• Given: String probabilities of strings up to length t
• Asked: Quasi HMM that generates the string probabilities
• This thesis:
  • Partial quasi realization problem always has a solution
  • Minimal partial quasi realization is obtained with the quasi realization algorithm if a rank condition on the Hankel matrix holds
  • Minimal partial quasi realization problem has a unique solution (up to similarity transform) if this rank condition holds
Approximate quasi realization problem
• Given: String probabilities of strings up to length t
• Asked: Quasi HMM that approximately generates the string probabilities
• This thesis: algorithm
  • Compute low rank approximation of largest Hankel block subject to consistency and stationarity constraints (upperbounded NMF without nonnegativity of the factors, with additional constraints)
  • Reconstruct Hankel matrix from largest block (we prove that the rank does not increase in this step)
  • Apply partial quasi realization algorithm
Approximate positive realization problem
• Given: String probabilities of strings up to length t
• Asked: Positive HMM that approximately generates the string probabilities
• Approximate positive realization: find a HMM with nonnegative system matrices such that it approximately generates the given string probabilities
Approximate positive realization problem
• Moore, t = 2: define a matrix from the string probabilities; if the string probabilities are generated by a Moore HMM, this matrix has a structured nonnegative matrix factorization
• Mealy, general t: generalize the approach for Moore, t = 2
Modeling DNA sequences
• DNA
  TGGAGCCAACGTGGAATGTCACTAGCTAGCTTAGATGGCTAAACGTAGGAATACCCT
  ACGTGGAATATCGAATCGTTAGCTTAGCGCCTCGACCTAGATCGAGCCGATCGGTCT
  ACTAGCTAGCTCGCTAGAAGCACCTAGAAGCTTAGACGTGGAAATTGCTTAATCTAG
• 40 sequences of length 200
• String probabilities of strings up to length 4 stacked in Hankel matrix
[Figure: singular values of the Hankel matrix; Kullback-Leibler divergence]
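The string probabilities that fill the Hankel matrix have to be estimated from the sequences first. A minimal sketch, assuming simple relative frequencies of substrings pooled over all sequences; the slides do not state which estimator was used, and the function name and toy input are hypothetical.

```python
from collections import Counter

def empirical_string_probabilities(sequences, max_len=4):
    """Relative frequency of every substring of length 1..max_len."""
    probs = {}
    for n in range(1, max_len + 1):
        counts = Counter(seq[i:i + n]
                         for seq in sequences
                         for i in range(len(seq) - n + 1))
        total = sum(counts.values())
        probs.update({s: c / total for s, c in counts.items()})
    return probs

# Toy input; the study on the slide used 40 sequences of length 200.
probs = empirical_string_probabilities(["ACGT", "AACC"], max_len=2)
print(probs["A"], probs["AC"])
```

Entry (u, v) of the Hankel matrix is then the estimated probability of the concatenated string uv, for u and v whose combined length does not exceed the maximum.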
4. IDENTIFICATION: Introduction - Subspace inspired identification - HIV modeling
Identification problem
• Given: Output sequence of length T
• Asked: (Quasi) HMM that models the sequence (positive HMM: nonnegative system matrices; quasi HMM: no nonnegativity constraints)
• Approach
  • Linear stochastic models: prediction error identification; subspace based identification (SVD)
  • Hidden Markov models: Baum-Welch identification; subspace inspired identification (NMF)
Identification problem
• Baum-Welch: output sequence → system matrices → state sequence
• Subspace inspired: output sequence → state sequence → system matrices
Subspace inspired identification
• Estimate the (quasi) state distribution. We have shown that:
  • a quasi state predictor can be built from data using upperbounded NMF without nonnegativity of the factors
  • a state predictor can be built from data using NMF
• Compute the system matrices: least squares problem (for the quasi HMM and for the positive HMM)
Modeling sequences from HIV genome
• HIV virus [Figure: virus structure with envelope, core and matrix]
• Mutation
• 25 mutated sequences of length 222 from the part of the HIV1 genome that codes for the envelope protein [NCBI database]
• Training set
• Test set
• HMM model using Baum-Welch and subspace inspired identification
Modeling sequences from HIV genome
• Kullback-Leibler divergence (string probabilities of length-4 strings)
• Mean likelihood of the given sequences
• Likelihood under the third order subspace inspired model
• Model can be used to predict new viral strains and to distinguish between different HIV subtypes
5. ESTIMATION: Estimation for HMMs - Application
Estimation for HMMs
• Filtering – smoothing – prediction
• State estimation – output estimation
• We derive recursive formulas to solve state and output filtering, prediction and smoothing problems
[Figure: time axes for filtering, smoothing and prediction, each showing the span of available measurements relative to time t]
Estimation for HMMs
• Example: recursive algorithm to compute the estimate
• Recursive output estimation algorithms are effective with a quasi HMM
• Finiteness of rank of Hankel matrix = necessary and sufficient condition for quasi realizability
• Rank of Hankel matrix = minimal order of exact quasi realization
• Quasi realization is easier to compute than positive realization
• Quasi realization typically has lower order than positive realization
• Negative probabilities can occur in the model
• No disadvantage in output estimation problems
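A small sketch of why output estimation still works with a quasi HMM: the one-step output predictor below only uses matrix products, so any similarity transform of the model, including one that introduces negative entries, returns the same predictions. The recursion is the usual normalized forward recursion; the function name, the 2-state model and the transform `T` are assumptions of this example, not the formulas derived in the thesis.

```python
import numpy as np

def output_predictor(pi, PI, e, observed):
    """Return P(next output = y | observed outputs) for every symbol y."""
    alpha = np.asarray(pi, dtype=float)
    for y in observed:
        alpha = alpha @ PI[y]          # propagate through the observed symbol
        alpha = alpha / (alpha @ e)    # normalize so that alpha @ e == 1
    return {y: float(alpha @ M @ e) for y, M in PI.items()}

# Hypothetical positive Mealy HMM.
PI = {0: np.array([[0.5, 0.1], [0.2, 0.1]]),
      1: np.array([[0.1, 0.3], [0.3, 0.4]])}
pi, e = np.array([5 / 9, 4 / 9]), np.ones(2)

# Equivalent quasi HMM: a similarity transform that creates negative entries.
T = np.array([[1.0, 2.0], [0.0, -1.0]])
T_inv = np.linalg.inv(T)
PI_q = {y: T_inv @ M @ T for y, M in PI.items()}
pi_q, e_q = pi @ T, T_inv @ e

observed = (0, 1, 1, 0)
print(output_predictor(pi, PI, e, observed))
print(output_predictor(pi_q, PI_q, e_q, observed))   # same predictions
```

State estimates, by contrast, are expressed in the coordinates of the state vector and lose their interpretation as probabilities after such a transform.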
Finding motifs in DNA sequences
• Find motifs in muscle specific regulatory regions [Zhou, Wong]: Mef-2, Myf, Sp-1, SRF, TEF
• Make motif model
• Make quasi background model (see the realization section)
• Build joint HMM
• Perform output estimation
• Results (compared to results from Motifscanner [Aerts et al.])
[Figure: two panels of motif probability versus position]
6. CONCLUSIONS: Conclusions - Further research - List of publications
Conclusions
• Two modifications to the nonnegative matrix factorization
  • Structured nonnegative matrix factorization
  • Nonnegative matrix factorization without nonnegativity of the factors
• Two relaxations to the positive realization problem for HMMs
  • Quasi realization problem
  • Approximate positive realization problem
  • Both methods were applied to modeling DNA sequences
• We derive equivalence conditions for HMMs
• We propose a new identification method for HMMs
  • Method was applied to modeling DNA sequences of HIV virus
• Quasi realizations suffice for several estimation problems
  • Quasi estimation methods were applied to finding motifs in DNA sequences
Further research
Matrix factorizations
• Develop nonnegative matrix factorization with nesting property (cf. SVD)
Hidden Markov models
• Investigate Markov models (special case of hidden Markov models)
• Develop realization and identification methods that allow prior knowledge to be incorporated in the Markov chain
• Method to estimate minimal order of positive HMM from string probabilities
• Canonical forms of hidden Markov models
• Model reduction for hidden Markov models
• System theory for hidden Markov models with external inputs
• ...
List of publications
Journal papers
• B. Vanluyten, J.C. Willems and B. De Moor. Recursive Filtering using Quasi-Realizations. Lecture Notes in Control and Information Sciences, 341, 367–374, 2006.
• B. Vanluyten, J.C. Willems and B. De Moor. Equivalence of State Representations for Hidden Markov Models. Systems and Control Letters, 57(5), 410–419, 2008.
• B. Vanluyten, J.C. Willems and B. De Moor. Structured Nonnegative Matrix Factorization with Applications to Hidden Markov Realization and Filtering. Accepted for publication in Linear Algebra and its Applications, 2008.
• B. Vanluyten, J.C. Willems and B. De Moor. Nonnegative Matrix Factorization without Nonnegativity Constraints on the Factors. Submitted for publication.
• B. Vanluyten, J.C. Willems and B. De Moor. Approximate Realization and Estimation for Quasi hidden Markov models. Submitted for publication.
International conference papers
• I. Goethals, B. Vanluyten, B. De Moor. Reliable spurious mode rejection using self learning algorithms. In Proc. of the International Conference on Modal Analysis Noise and Vibration Engineering (ISMA 2004), Leuven, Belgium, pages 991–1003, 2004.
• B. Vanluyten, J.C. Willems and B. De Moor. Model Reduction of Systems with Symmetries. In Proc. of the 44th IEEE Conference on Decision and Control (CDC 2005), Seville, Spain, pages 826–831, 2005.
• B. Vanluyten, J.C. Willems and B. De Moor. Matrix Factorization and Stochastic State Representations. In Proc. of the 45th IEEE Conference on Decision and Control (CDC 2006), San Diego, California, pages 4188–4193, 2006.
• I. Markovsky, J. Boets, B. Vanluyten, K. De Cock, B. De Moor. When is a pole spurious? In Proc. of the International Conference on Noise and Vibration Engineering (ISMA 2007), Leuven, Belgium, pages 1615–1626, 2007.
• B. Vanluyten, J.C. Willems and B. De Moor. Equivalence of State Representations for Hidden Markov Models. In Proc. of the European Control Conference 2007 (ECC 2007), Kos, Greece, 2007.
• B. Vanluyten, J.C. Willems and B. De Moor. A new Approach for the Identification of Hidden Markov Models. In Proc. of the 46th IEEE Conference on Decision and Control (CDC 2007), New Orleans, Louisiana, 2007.