Extended Baum-Welch algorithm

  1. Extended Baum-Welch algorithm Presented by Shih-Hung Liu, 2006/01/21

  2. References • A generalization of the Baum algorithm to rational objective functions - [Gopalakrishnan et al.], IEEE ICASSP 1989 • An inequality for rational functions with applications to some statistical estimation problems - [Gopalakrishnan et al.], IEEE Transactions on Information Theory 1991 • HMMs, MMIE, and the speech recognition problem - [Normandin 1991], PhD dissertation • Function maximization - [Povey 2004], PhD thesis, chapter 4.5

  3. Outline • Introduction • Extended Baum-Welch algorithm [Gopalakrishnan et al.] • EBW from discrete to continuous [Normandin] • EBW for discrete HMMs [Povey] • Example of function optimization [Gopalakrishnan et al.] • Conclusion

  4. Introduction • The well-known Baum-Eagon inequality provides an effective iterative scheme for finding a local maximum of homogeneous polynomials with positive coefficients over a domain of probability values • However, we are interested in maximizing a general rational function, so we extend the Baum-Eagon inequality to rational functions

  5. Extended Baum-Welch algorithm (1/6) [Gopalakrishnan 1989] • Let $P(x)$ be an arbitrary homogeneous polynomial with nonnegative coefficients of degree $d$ in variables $x = \{x_{ij}\}$. Assuming that this polynomial is defined over a domain of probability values $\mathcal{D} = \{x : x_{ij} \ge 0,\ \sum_j x_{ij} = 1\}$, they show how to construct a transformation $T: \mathcal{D} \to \mathcal{D}$ with the following property: Property A: $P(T(x)) > P(x)$ for any $x \in \mathcal{D}$, unless $T(x) = x$

  6. Extended Baum-Welch algorithm (2/6) [Gopalakrishnan 1989] • $R(x) = N(x)/D(x)$ is a ratio of two polynomials in variables $x$ defined over a domain $\mathcal{D}$; we are looking for a growth transformation $T$ such that $R(T(x)) > R(x)$ for any $x \in \mathcal{D}$, unless $T(x) = x$ • A reduction of the case of rational functions to polynomials: we reduce the problem of finding a growth transformation for a rational function to that of finding one for a specially formed polynomial • reduce to a non-homogeneous polynomial with nonnegative coefficients • extend the Baum-Eagon inequality to non-homogeneous polynomials with nonnegative coefficients

  7. Extended Baum-Welch algorithm (3/6) [Gopalakrishnan 1989] • Step 1: given the current point $x' \in \mathcal{D}$, form the polynomial $$Q_{x'}(x) = N(x) - R(x')\,D(x)$$ Since $D(x) > 0$ on $\mathcal{D}$ and $Q_{x'}(x') = 0$, any $x$ with $Q_{x'}(x) > Q_{x'}(x')$ also satisfies $R(x) > R(x')$, so a growth transformation for $Q_{x'}$ at $x'$ is a growth transformation for $R$

  8. Extended Baum-Welch algorithm (4/6) [Gopalakrishnan 1989] • Step 2: $Q_{x'}(x)$ may have negative coefficients, so add a constant $C$ times a polynomial that is constant over $\mathcal{D}$, e.g. $$Q_{x'}(x) + C\Big(\sum_{i,j} x_{ij}\Big)^{d}$$ On $\mathcal{D}$ the added term is constant ($\sum_{i,j} x_{ij}$ equals the number of rows), so the growth property is unchanged, and for $C$ large enough every monomial coefficient becomes nonnegative

  9. Extended Baum-Welch algorithm (5/6) [Gopalakrishnan 1989] • Step 3: finding a growth transformation for a polynomial with nonnegative coefficients can be reduced to the same problem for a homogeneous polynomial with nonnegative coefficients: multiply each monomial of degree $e < d$ by $\big(\tfrac{1}{q}\sum_{i,j} x_{ij}\big)^{d-e}$, where $q$ is the number of rows; this factor equals 1 on $\mathcal{D}$, so values on the domain are unchanged

  10. Extended Baum-Welch algorithm (6/6) [Gopalakrishnan 1989] • Baum-Eagon inequality: for a homogeneous polynomial $P$ with nonnegative coefficients, the transformation $$T(x)_{ij} = \frac{x_{ij}\,\frac{\partial P}{\partial x_{ij}}(x)}{\sum_{j'} x_{ij'}\,\frac{\partial P}{\partial x_{ij'}}(x)}$$ satisfies $P(T(x)) > P(x)$ unless $T(x) = x$. Combining the three steps gives the extended update $$\hat{x}_{ij} = \frac{x_{ij}\big(\frac{\partial R}{\partial x_{ij}}(x) + C\big)}{\sum_{j'} x_{ij'}\big(\frac{\partial R}{\partial x_{ij'}}(x) + C\big)}$$
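To make the transformation concrete, here is a minimal Python sketch of one EBW-style growth step; the toy polynomial and the constant C = 1 are illustrative choices, not taken from the paper:

```python
import numpy as np

def P(x):
    # toy polynomial with nonnegative coefficients (an assumed example)
    return x[0, 0] ** 2 * x[1, 1] + x[0, 1] * x[1, 0]

def grad_P(x):
    g = np.zeros_like(x)
    g[0, 0] = 2 * x[0, 0] * x[1, 1]
    g[0, 1] = x[1, 0]
    g[1, 0] = x[0, 1]
    g[1, 1] = x[0, 0] ** 2
    return g

def ebw_update(x, grad, C):
    # x_ij (dP/dx_ij + C), renormalized so each row sums to one
    w = x * (grad + C)
    return w / w.sum(axis=1, keepdims=True)

x = np.array([[0.5, 0.5], [0.5, 0.5]])  # row-stochastic start point
for _ in range(20):
    x = ebw_update(x, grad_P(x), C=1.0)
print(x, P(x))  # P(x) should grow monotonically across iterations
```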

  11. EBW for CDHMM - from discrete to continuous (1/3) [Normandin 1991] • Discrete case for emission probability update: $$\hat{b}_j(k) = \frac{c^{\text{num}}_j(k) - c^{\text{den}}_j(k) + D\,b_j(k)}{\sum_{k'}\big(c^{\text{num}}_j(k') - c^{\text{den}}_j(k')\big) + D}$$ where $c^{\text{num}}_j(k)$ and $c^{\text{den}}_j(k)$ are the expected counts of emitting symbol $k$ in state $j$ under the numerator and denominator models, and $D$ is a smoothing constant chosen large enough to keep all updated probabilities positive
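A minimal sketch of this update in Python; array names and shape conventions are my own, not Normandin's:

```python
import numpy as np

def ebw_discrete_emission(b, c_num, c_den, D):
    """b[j, k]: emission probabilities; c_num/c_den[j, k]: expected
    counts of symbol k in state j under the numerator / denominator
    models; D: smoothing constant (must be large enough that every
    entry of the numerator stays positive)."""
    num = c_num - c_den + D * b
    den = (c_num - c_den).sum(axis=1, keepdims=True) + D
    return num / den  # rows again sum to one
```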

  12. EBW for CDHMM - from discrete to continuous (2/3) [Normandin 1991] • To pass to continuous densities, partition the real line into $M$ subintervals $I_k$ of width $\Delta$ and approximate the Gaussian emission density of state $j$ by a discrete distribution, $b_j(k) \approx \mathcal{N}(x_k; \mu_j, \sigma_j^2)\,\Delta$ for $x_k \in I_k$; the discrete update is applied to these probabilities and the limit $\Delta \to 0$ (with $M \to \infty$) is taken

  13. EBW for CDHMM - from discrete to continuous (3/3) [Normandin 1991] • In the limit, the EBW re-estimation formulas for the Gaussian parameters are $$\hat{\mu}_j = \frac{\sum_t \big(\gamma^{\text{num}}_j(t) - \gamma^{\text{den}}_j(t)\big)\,x_t + D\,\mu_j}{\sum_t \big(\gamma^{\text{num}}_j(t) - \gamma^{\text{den}}_j(t)\big) + D}$$ $$\hat{\sigma}_j^2 = \frac{\sum_t \big(\gamma^{\text{num}}_j(t) - \gamma^{\text{den}}_j(t)\big)\,x_t^2 + D\,(\sigma_j^2 + \mu_j^2)}{\sum_t \big(\gamma^{\text{num}}_j(t) - \gamma^{\text{den}}_j(t)\big) + D} - \hat{\mu}_j^2$$
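The same formulas as a short Python sketch; names and shapes are assumptions, and in practice D is set per Gaussian:

```python
import numpy as np

def ebw_gaussian(x, g_num, g_den, mu, var, D):
    """x: (T,) observations; g_num/g_den: (T,) state occupancies from
    the numerator and denominator passes; mu, var: current parameters;
    D: smoothing constant, chosen large enough that var_new > 0."""
    g = g_num - g_den
    denom = g.sum() + D
    mu_new = (np.dot(g, x) + D * mu) / denom
    var_new = (np.dot(g, x ** 2) + D * (var + mu ** 2)) / denom - mu_new ** 2
    return mu_new, var_new
```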

  14. EBW for discrete HMMs (1/6) [Povey 2004] • The Baum-Eagon inequality is formulated for the case where the variables $x_{ij}$ form a matrix whose rows obey a sum-to-one constraint $\sum_j x_{ij} = 1$, and we are maximizing a sum of polynomial terms in $x_{ij}$ with nonnegative coefficients • For ML training, we can find an auxiliary function $\sum_{i,j} \gamma_{ij} \log x_{ij}$ (with occupation counts $\gamma_{ij}$) and optimize it • Finding the maximum of the auxiliary function (e.g. using a Lagrange multiplier) leads to the following update, which is a growth transformation for the polynomial: $$\hat{x}_{ij} = \frac{\gamma_{ij}}{\sum_{j'} \gamma_{ij'}} = \frac{x_{ij}\,\partial P/\partial x_{ij}}{\sum_{j'} x_{ij'}\,\partial P/\partial x_{ij'}}$$
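A minimal sketch of the resulting M-step, assuming gamma holds expected occupation counts (the example counts are made up):

```python
import numpy as np

def baum_welch_m_step(gamma):
    """Maximizing sum_ij gamma_ij * log(x_ij), with each row of x
    constrained to sum to 1 via a Lagrange multiplier per row,
    gives x_ij = gamma_ij / sum_j' gamma_ij'."""
    return gamma / gamma.sum(axis=1, keepdims=True)

# usage with made-up transition counts for a 2-state chain
gamma = np.array([[8.0, 2.0], [3.0, 7.0]])
print(baum_welch_m_step(gamma))  # [[0.8, 0.2], [0.3, 0.7]]
```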

  15. EBW for discrete HMMs (2/6) [Povey 2004] • The Baum-Welch update is an update procedure for HMMs which uses this growth transformation together with the forward-backward algorithm, which computes the relevant differentials (the occupation counts) efficiently

  16. EBW for discrete HMMs (3/6) [Povey 2004] • An update rule as convenient and provably correct as the Baum-Welch update is not available for discriminative training of HMMs, which is a harder optimization problem • The Extended Baum-Welch update equations as originally derived are applicable to rational functions of parameters which are subject to sum-to-one constraints • The MMI objective function for discrete-probability HMMs is an example of such a function

  17. EBW for discrete HMMs (4/6) [Povey 2004] Two essential points are used to derive the EBW update for MMI: 1. Instead of maximizing $F(\lambda) = N(\lambda)/D(\lambda)$ for positive $N$ and $D$, we can instead maximize $$G(\lambda) = N(\lambda) - \frac{N(\bar\lambda)}{D(\bar\lambda)}\,D(\lambda)$$ where $\bar\lambda$ is the value from the previous iteration; increasing $G$ will cause $F$ to increase, because $G$ is a strong-sense auxiliary function for $F$ around $\bar\lambda$ 2. If some terms in the resulting polynomial are negative, we can add to the expression a constant $C$ times a further polynomial which is constrained to be a constant over the domain (e.g. $C\,(\sum_j x_{ij})^n$), so as to ensure that no product of terms in the final expression has a negative coefficient

  18. EBW for discrete HMMs (5/6) [Povey 2004] By applying these two ideas, the EBW update for discrete MMI training follows: $$\hat{x}_{ij} = \frac{\gamma^{\text{num}}_{ij} - \gamma^{\text{den}}_{ij} + C\,x_{ij}}{\sum_{j'}\big(\gamma^{\text{num}}_{ij'} - \gamma^{\text{den}}_{ij'}\big) + C}$$ where $\gamma^{\text{num}}$ and $\gamma^{\text{den}}$ are occupation counts from the numerator (reference) and denominator (recognition) passes
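A sketch of this update in Python; setting C per row as E times the denominator occupancy (with E = 2) is a common heuristic in the discriminative-training literature, assumed here rather than taken from the slides:

```python
import numpy as np

def ebw_mmi_update(x, gamma_num, gamma_den, E=2.0, floor=1e-8):
    """EBW update for discrete MMI-trained parameters x[i, j]."""
    diff = gamma_num - gamma_den
    # heuristic per-row smoothing constant ...
    C = E * gamma_den.sum(axis=1, keepdims=True)
    # ... raised where some updated probability would otherwise go
    # non-positive (C must exceed max_j of -diff_ij / x_ij)
    C = np.maximum(
        C, (-diff / np.maximum(x, floor)).max(axis=1, keepdims=True) + 1.0
    )
    num = diff + C * x
    den = diff.sum(axis=1, keepdims=True) + C
    return num / den  # rows sum to one by construction
```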

  19. EBW equivalent smooth function (6/6) [Povey 2004]

  20. Example • consider C

  21. Example
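As an illustration of function optimization with the extended update, here is a small self-contained Python run on a toy rational objective; the objective and the constant C = 2 are my own choices, not the example from Gopalakrishnan et al.:

```python
import numpy as np

# toy rational objective R(x) = N(x) / D(x) over one probability
# row (x1, x2) with x1 + x2 = 1 (an assumed example)
def N(x): return x[0] * x[1] + 0.1
def Dpoly(x): return x[0] ** 2 + 0.5
def R(x): return N(x) / Dpoly(x)

def grad_R(x):
    # analytic partial derivatives of R via the quotient rule
    n, d = N(x), Dpoly(x)
    gn = np.array([x[1], x[0]])      # dN/dx
    gd = np.array([2 * x[0], 0.0])   # dD/dx
    return (gn * d - n * gd) / d ** 2

x = np.array([0.9, 0.1])
for _ in range(50):
    w = x * (grad_R(x) + 2.0)        # EBW step: x_i (dR/dx_i + C)
    x = w / w.sum()
print(x, R(x))  # R increases at each step for sufficiently large C
```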

  22. Conclusion • Presented an algorithm for the maximization of certain rational functions defined over domains of probability values • This algorithm is very useful in practical situations for training HMM parameters

  23. MPE: Final Auxiliary Function [diagram labels: weak-sense auxiliary function; strong-sense auxiliary function; smoothing function; overall weak-sense auxiliary function]

  24. EBW derived from auxiliary function

  25. EBW derived from auxiliary function
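A minimal LaTeX sketch of the derivation these slides refer to, assuming Povey's formulation of the smoothed auxiliary function (the notation $D_j$, $\mathcal{Q}^{\mathrm{num}}$, $\mathcal{Q}^{\mathrm{den}}$ is an assumption, not copied from the slides):

```latex
% Weak-sense auxiliary function for the MMI objective around the
% current parameters \bar\lambda, with a smoothing term that is
% maximized at \bar\lambda:
\begin{align*}
\mathcal{Q}(\lambda,\bar\lambda)
  &= \mathcal{Q}^{\mathrm{num}}(\lambda,\bar\lambda)
   - \mathcal{Q}^{\mathrm{den}}(\lambda,\bar\lambda)
   + \mathcal{Q}^{\mathrm{sm}}(\lambda,\bar\lambda), \\[4pt]
\mathcal{Q}^{\mathrm{sm}}(\lambda,\bar\lambda)
  &= \sum_j D_j \int \mathcal{N}(x;\bar\mu_j,\bar\sigma_j^2)\,
     \log \mathcal{N}(x;\mu_j,\sigma_j^2)\,dx.
\end{align*}
% Setting the derivatives of Q with respect to \mu_j and \sigma_j^2
% to zero recovers the EBW mean and variance updates of slide 13,
% with D_j as the per-Gaussian smoothing constant.
```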
