
A Generalization of Forward-backward Algorithm

Ai Azuma, Yuji Matsumoto (Nara Institute of Science and Technology). The forward-backward algorithm allows efficient calculation of sums (e.g. expectation, ...) over all paths in a trellis and plays an important role in sequence modeling.


Presentation Transcript


  1. A Generalization of Forward-backward Algorithm • Ai Azuma, Yuji Matsumoto • Nara Institute of Science and Technology

  2. Forward-backward algorithm • Allows efficient calculation of sums (e.g. expectation, ...) over all paths in a trellis. • Plays an important role in sequence modeling: HMMs (Hidden Markov Models), CRFs (Conditional Random Fields) [Lafferty et al., 2001], ...

  3. A sequential labeling example: part-of-speech tagging of “Time flies like an arrow” • [Figure: trellis between SOURCE and SINK; each word (Time, flies, like, an, arrow) has one node per candidate tag: noun, verb, prep., indef. art.] • In CRFs and HMMs, we need to compute the "sum" of the probabilities (or scores) of all paths.

  4. The forward-backward algorithm efficiently computes sums over all paths in the trellis with dynamic programming • It is intractable to enumerate all paths in the trellis because their number is enormous • The forward-backward algorithm recursively computes the sum from source to sink (and from sink to source), keeping intermediate results on each node and arc
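
The dynamic-programming idea on slide 4 can be illustrated with a minimal sketch. The trellis encoding is an assumption made for this example, not taken from the slides: `node[t][k]` is a hypothetical score for label `k` at position `t`, `trans[j][k]` a hypothetical score for moving from label `j` to label `k`, and the weight of a path is the product of the scores along it. The forward pass keeps one intermediate sum per node and is checked against explicit enumeration, which is only feasible for tiny trellises.

```python
from itertools import product


def path_weight(labels, node, trans):
    """Weight of one path: product of its node and transition scores."""
    w = node[0][labels[0]]
    for t in range(1, len(labels)):
        w *= trans[labels[t - 1]][labels[t]] * node[t][labels[t]]
    return w


def brute_force_sum(node, trans):
    """Sum of path weights by enumerating all K**T paths (exponential)."""
    T, K = len(node), len(node[0])
    return sum(path_weight(y, node, trans) for y in product(range(K), repeat=T))


def forward_sum(node, trans):
    """Same sum by dynamic programming, one pass over positions and arcs."""
    K = len(node[0])
    # alpha[k]: total weight of all partial paths ending in label k
    alpha = list(node[0])
    for t in range(1, len(node)):
        alpha = [
            node[t][k] * sum(alpha[j] * trans[j][k] for j in range(K))
            for k in range(K)
        ]
    return sum(alpha)


# Toy trellis with 3 positions and 2 labels (illustrative integer scores).
node = [[1, 2], [3, 1], [2, 2]]
trans = [[1, 2], [3, 1]]
assert forward_sum(node, trans) == brute_force_sum(node, trans)
```

Integer scores are used so that the two results can be compared exactly; with real-valued scores a tolerance-based comparison would be needed.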

  5. Forward-backward algorithm is applicable to [formula omitted in transcript], where the symbols denote: • the type of a node/node pair • the set of paths • the set of nodes and arcs (cliques) in a path • the k-th feature

  6. Type of sums computable with forward-backward algorithm: [formula omitted in transcript], where the symbols denote: • the set of paths • the set of nodes and arcs (cliques) in a path
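
Since the formula on this slide did not survive, the following sketch shows only one standard example of a sum that ordinary forward-backward handles: the expectation of a feature that decomposes over arcs. Everything named here is an assumption for illustration: `node`, `trans` (integer scores), and the arc feature `feat(t, j, k)`; a path's probability is taken to be its weight divided by the sum of all path weights.

```python
from fractions import Fraction
from itertools import product


def expected_feature(node, trans, feat):
    """E[sum of arc features] via forward (alpha) and backward (beta) sums."""
    T, K = len(node), len(node[0])
    alpha = [list(node[0])]
    for t in range(1, T):
        alpha.append([
            node[t][k] * sum(alpha[-1][j] * trans[j][k] for j in range(K))
            for k in range(K)
        ])
    beta = [[1] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(trans[j][k] * node[t + 1][k] * beta[t + 1][k] for k in range(K))
            for j in range(K)
        ]
    z = sum(alpha[-1])  # sum of all path weights
    total = 0
    for t in range(1, T):
        for j in range(K):
            for k in range(K):
                # weight of all paths through arc (j -> k) at position t
                through = alpha[t - 1][j] * trans[j][k] * node[t][k] * beta[t][k]
                total += through * feat(t, j, k)
    return Fraction(total, z)


def brute_force_expectation(node, trans, feat):
    """Same expectation by enumerating every path (for checking only)."""
    T, K = len(node), len(node[0])
    z = total = 0
    for y in product(range(K), repeat=T):
        w = node[0][y[0]]
        for t in range(1, T):
            w *= trans[y[t - 1]][y[t]] * node[t][y[t]]
        z += w
        total += w * sum(feat(t, y[t - 1], y[t]) for t in range(1, T))
    return Fraction(total, z)


def stay(t, j, k):
    """Hypothetical feature: 1 when the label does not change on the arc."""
    return 1 if j == k else 0


node = [[1, 2], [3, 1], [2, 2]]
trans = [[1, 2], [3, 1]]
assert expected_feature(node, trans, stay) == brute_force_expectation(node, trans, stay)
```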

  7. But sometimes we need higher-order multivariate moments... • To name a few examples: • Correlation between features • Objectives more complex than log-likelihood • Derivatives of these with respect to the parameters • ...

  8. Our goal: To generalize the forward-backward algorithm for higher-order multivariate moments!

  9. Can we derive dynamic programming for this formula? [formula omitted in transcript] • Answer: record multiple forward/backward variables for each clique, and combine all the previously calculated values by the binomial theorem

  10. [Figure: trellis from SOURCE to a node u] • A set of paths from SOURCE to u

  11. [Figure: trellis from SOURCE to a node u] • A set of paths from SOURCE to u • Ordinary forward-backward records only this variable

  12. [Figure: SOURCE, the nodes u, and a node v] • Direct ancestors of v

  13. [Figure: SOURCE, the nodes u, and a node v] • Direct ancestors of v • These are derived from the binomial theorem

  14. [Figure: SOURCE, intermediate nodes, and SINK] • Direct ancestors of SINK • Desired values

  15. Summary of Our Ideas • Multiple variables for each clique • Dependency between variables in a step, which is derived from the binomial theorem • [Figure: SOURCE, a node u, and its successor v]
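
The idea summarized on slides 9 to 15 can be sketched for the simplest case: one feature, forward pass only. This is an illustration under stated assumptions, not the authors' exact formulation. Assume each arc carries a hypothetical feature value `feat(t, j, k)`, let F be the sum of those values along a path, and keep one forward variable per order n holding the sum of weight × F^n over partial paths. Extending a path by an arc with value f turns F^n into (F + f)^n, which the binomial theorem expands into the lower-order variables already stored.

```python
from itertools import product
from math import comb


def moment_forward(node, trans, feat, order):
    """Return [sum over paths of weight * F**n for n in 0..order]."""
    K = len(node[0])
    # alpha[n][k]: sum of weight * F**n over partial paths ending in label k.
    # Before any arc is taken F == 0, so only the order-0 variable is nonzero.
    alpha = [[node[0][k] if n == 0 else 0 for k in range(K)]
             for n in range(order + 1)]
    for t in range(1, len(node)):
        new = [[0] * K for _ in range(order + 1)]
        for j in range(K):
            for k in range(K):
                w = trans[j][k] * node[t][k]
                f = feat(t, j, k)
                for n in range(order + 1):
                    # (F + f)**n = sum_m C(n, m) * f**(n - m) * F**m
                    new[n][k] += w * sum(
                        comb(n, m) * f ** (n - m) * alpha[m][j]
                        for m in range(n + 1)
                    )
        alpha = new
    return [sum(alpha[n]) for n in range(order + 1)]


def brute_force_moment(node, trans, feat, n):
    """Same quantity for a single order n by enumerating every path."""
    T, K = len(node), len(node[0])
    total = 0
    for y in product(range(K), repeat=T):
        w = node[0][y[0]]
        for t in range(1, T):
            w *= trans[y[t - 1]][y[t]] * node[t][y[t]]
        total += w * sum(feat(t, y[t - 1], y[t]) for t in range(1, T)) ** n
    return total


def stay(t, j, k):
    """Hypothetical feature: 1 when the label does not change on the arc."""
    return 1 if j == k else 0


node = [[1, 2], [3, 1], [2, 2]]
trans = [[1, 2], [3, 1]]
moments = moment_forward(node, trans, stay, 3)
assert moments == [brute_force_moment(node, trans, stay, n) for n in range(4)]
```

Dividing `moments[n]` by `moments[0]` gives the n-th moment of F under the path distribution; the order-0 entry is the quantity that ordinary forward-backward already records.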

  16. For multivariate cases, forward/backward variables have multiple indices • [Figure: node u with its forward variables]
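
A minimal sketch of the multivariate case with two features, again forward pass only and under assumptions made for this example: `feat_f` and `feat_g` are hypothetical arc features with path totals F and G, and the forward variables are indexed by a pair of orders (a, b), holding sums of weight × F^a × G^b. The update applies the binomial theorem once per feature.

```python
from itertools import product
from math import comb


def mixed_moment_forward(node, trans, feat_f, feat_g, a_max, b_max):
    """Return {(a, b): sum over paths of weight * F**a * G**b}."""
    K = len(node[0])
    orders = [(a, b) for a in range(a_max + 1) for b in range(b_max + 1)]
    alpha = {ab: [node[0][k] if ab == (0, 0) else 0 for k in range(K)]
             for ab in orders}
    for t in range(1, len(node)):
        new = {ab: [0] * K for ab in orders}
        for j in range(K):
            for k in range(K):
                w = trans[j][k] * node[t][k]
                f, g = feat_f(t, j, k), feat_g(t, j, k)
                for a, b in orders:
                    # expand (F + f)**a * (G + g)**b over stored lower orders
                    new[(a, b)][k] += w * sum(
                        comb(a, c) * f ** (a - c)
                        * comb(b, d) * g ** (b - d)
                        * alpha[(c, d)][j]
                        for c in range(a + 1) for d in range(b + 1)
                    )
        alpha = new
    return {ab: sum(alpha[ab]) for ab in orders}


def brute_force_mixed(node, trans, feat_f, feat_g, a, b):
    """Same quantity for one pair (a, b) by enumerating every path."""
    T, K = len(node), len(node[0])
    total = 0
    for y in product(range(K), repeat=T):
        w = node[0][y[0]]
        for t in range(1, T):
            w *= trans[y[t - 1]][y[t]] * node[t][y[t]]
        big_f = sum(feat_f(t, y[t - 1], y[t]) for t in range(1, T))
        big_g = sum(feat_g(t, y[t - 1], y[t]) for t in range(1, T))
        total += w * big_f ** a * big_g ** b
    return total


def stay(t, j, k):
    """Hypothetical feature: 1 when the label does not change on the arc."""
    return 1 if j == k else 0


def enters_one(t, j, k):
    """Hypothetical feature: 1 when the arc ends in label 1."""
    return 1 if k == 1 else 0


node = [[1, 2], [3, 1], [2, 2]]
trans = [[1, 2], [3, 1]]
mixed = mixed_moment_forward(node, trans, stay, enters_one, 1, 1)
assert all(
    mixed[(a, b)] == brute_force_mixed(node, trans, stay, enters_one, a, b)
    for (a, b) in mixed
)
```

From these values, the covariance of F and G follows as `mixed[(1, 1)] / z - (mixed[(1, 0)] / z) * (mixed[(0, 1)] / z)` with `z = mixed[(0, 0)]`, which is one way to obtain the "correlation between features" mentioned on slide 7.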

  17. Computational cost is only linear in the number of nodes and arcs in the trellis (linear in |V| and |E|) • To calculate sums of the following form [formula omitted in transcript], the computational cost of the generalized forward-backward is proportional to [expression omitted in transcript]

  18. Merits of the generalized forward-backward algorithm • The generalized forward-backward subsumes many existing task-specific algorithms • For some tasks, it leads to a solution more efficient than the existing ones

  19. Merit 1. The generalized forward-backward subsumes many existing task-specific algorithms:

  20. Merit 1. The generalized forward-backward subsumes many existing task-specific algorithms: All these formulas have a form computable with our proposed method.

  21. The previously proposed algorithms for these tasks are task-specific • The generalized forward-backward is a task-independent algorithm applicable to formulae of the form [formula omitted in transcript] • If a problem involves this form, it immediately offers an efficient solution

  22. Merits of the generalized forward-backward algorithm • The generalized forward-backward subsumes many existing task-specific algorithms • For some tasks, it leads to a solution more efficient than the existing ones

  23. Merit 2. Efficient optimization procedure with respect to Generalized Expectation Criteria for CRFs [Mann et al., 2008] • Algorithm proposed in [Mann et al., 2008]: computational cost is proportional to [expression omitted in transcript] • By a specialization of the generalization: computational cost is proportional to [expression omitted in transcript] • (L = # of nodes labeled as answers) • [Figure: trellis with the nodes labeled as answers marked]

  24. Future tasks • Explore other tasks to which our generalized forward-backward algorithm is applicable • Extend the generalized forward-backward to trees and general graphs containing cycles

  25. Summary • We have generalized the forward-backward algorithm to allow for higher-order multivariate moments • The generalization offers an efficient way to compute complex models of sequences that involve higher-order multivariate moments • Many existing task-specific algorithms are instances of this generalization • It leads to a faster algorithm for computing Generalized Expectation Criteria for CRFs

  26. Thank you for your attention!
