Predictive State Representations

This presentation covers the advantages of predictive state representations (PSRs), what a PSR is, how to learn a PSR model, and the main conclusions. A PSR is expressed entirely in terms of observable quantities, avoids the local minima and saddle points that arise when learning a POMDP model, and is at least as general and compact as a POMDP. The slides introduce histories and tests, the system-dynamics matrix, its relation to POMDPs, and the definition of the linear PSR model. Learning a PSR model involves discovering the core tests and learning the model parameters; the slides present an analytical discovery and learning (ADL) algorithm for both.


Presentation Transcript


  1. Predictive State Representations Hui Li July 7, 2006

  2. Outline • What are the advantages of predictive state representations? • What is a predictive state representation (PSR)? • How to learn a PSR model • Conclusions

  3. What are the advantages of PSR • PSRs are expressed entirely in terms of observable quantities • PSRs avoid the problems of local minima and saddle points that arise when learning a POMDP model • PSRs attain generality and compactness at least equal to those of POMDPs

  4. What are predictive state representations (1/9) Two notations in PSR • History (h): the sequence of action-observation (ao) pairs that the agent has already experienced, beginning at the first time step • Test (t): a sequence of ao pairs that begins immediately after a history

  5. What are predictive state representations (2/9) [Figure: a history a1 o1 a2 o2 … ak ok followed by a test a1 o1 a2 o2 a3 o3 … aj oj] Prediction of a test p(t|h)
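
For reference, a prediction is usually defined in the PSR literature as shown below. The superscript indexing of the test's steps is an assumed notation, not taken from the slide.

```latex
% Test t = a^1 o^1 a^2 o^2 ... a^j o^j, executed immediately after history h
p(t \mid h) = \Pr\left(o^1, o^2, \dots, o^j \mid h,\, a^1, a^2, \dots, a^j\right)
```

In words: the probability of seeing the test's observations, given that history h has occurred and the test's actions are then taken.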

  6. What are predictive state representations (3/9) System-dynamics matrix D

  7. What are predictive state representations (4/9) • Ordering of all possible tests in D • Properties of the predictions in each row of D (the row of history hi)

  8. What are predictive state representations (5/9) Relation between PSR and POMDP • The belief state is updated according to Bayes' rule • Constructing D from a POMDP
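
A minimal sketch of how one entry of D could be computed from a POMDP, assuming a hypothetical two-state model. The matrices `T`, `O`, the initial belief `b0`, and all function names are invented for illustration and do not come from the slides.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP (numbers are made up).
# T[a][i, j] = Pr(next state j | state i, action a)
T = {
    0: np.array([[0.9, 0.1], [0.2, 0.8]]),
    1: np.array([[0.5, 0.5], [0.5, 0.5]]),
}
# O[a][j, o] = Pr(observation o | next state j, action a)
O = {
    0: np.array([[0.8, 0.2], [0.3, 0.7]]),
    1: np.array([[0.6, 0.4], [0.1, 0.9]]),
}
b0 = np.array([0.5, 0.5])  # initial belief (assumed)


def step(b, a, o):
    """Unnormalized belief after taking action a and observing o."""
    return (b @ T[a]) * O[a][:, o]


def belief_update(b, a, o):
    """Bayes-rule belief update b(h) -> b(hao)."""
    u = step(b, a, o)
    return u / u.sum()


def history_belief(history):
    """Belief state reached after a history [(a1, o1), ...] from b0."""
    b = b0
    for a, o in history:
        b = belief_update(b, a, o)
    return b


def predict(b, test):
    """Entry of D: p(t|h) for test [(a1, o1), ...] given b = b(h)."""
    u = b
    for a, o in test:
        u = step(u, a, o)
    return u.sum()
```

For example, `predict(history_belief([(0, 1)]), [(1, 0), (0, 0)])` gives the entry of D in the row of history a=0, o=1 and the column of the two-step test.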

  9. What are predictive state representations (6/9)

  10. What are predictive state representations (7/9) Since the rank of D is at most k, there exist at most k linearly independent columns or rows in D. • Core tests QT: the tests corresponding to the linearly independent columns are called core tests. • Core histories Qh: the histories corresponding to the linearly independent rows are called core histories.
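
The selection of core tests and core histories can be sketched as a greedy search for linearly independent columns and rows. The small matrix `D_sub` below is a made-up rank-2 example, not data from the slides, and `independent_columns` is a hypothetical helper name.

```python
import numpy as np


def independent_columns(M, tol=1e-9):
    """Greedily pick indices of a maximal set of linearly independent columns."""
    chosen = []
    for j in range(M.shape[1]):
        candidate = M[:, chosen + [j]]
        # Keep column j only if it raises the rank of the chosen set.
        if np.linalg.matrix_rank(candidate, tol=tol) == len(chosen) + 1:
            chosen.append(j)
    return chosen


# Made-up submatrix of D: rows are histories, columns are tests.
# Column 2 is the average of columns 0 and 1; column 3 is 0.4 * column 0.
D_sub = np.array([
    [0.5, 0.5, 0.5, 0.20],
    [0.2, 0.8, 0.5, 0.08],
    [0.9, 0.1, 0.5, 0.36],
])

core_test_idx = independent_columns(D_sub)       # columns -> core tests
core_history_idx = independent_columns(D_sub.T)  # rows -> core histories
```

The greedy scan keeps the first independent columns it meets, so the chosen core set depends on the column ordering; any other maximal independent set would serve equally well.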

  11. What are predictive state representations (8/9)

  12. What are predictive state representations (9/9) Linear PSR model • Definition • D(Q) is a linear sufficient statistic of the histories, since every column of D is a linear combination of the columns in D(Q) • PSR state update
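
A minimal sketch of the linear PSR state update in its usual form, p(Q|hao) = p(Q|h)ᵀ M_ao / p(Q|h)ᵀ m_ao. The toy system is hypothetical: one action `a`, observations that strictly alternate between 0 and 1, and core tests a0 and a1. The names `psr_update`, `m_a0`, `M_a0`, and so on are illustrative only.

```python
import numpy as np


def psr_update(p_q, m_ao, M_ao):
    """Linear PSR state update.

    p_q  : shape (k,), predictions p(Q|h) of the k core tests
    m_ao : shape (k,), weights such that p(ao|h) = p_q @ m_ao
    M_ao : shape (k, k), column i holds the weights of the test ao q_i
    Returns p(Q|hao).
    """
    denom = p_q @ m_ao
    if denom <= 0:
        raise ValueError("history has zero probability under the model")
    return (p_q @ M_ao) / denom


# Toy alternating system (assumed): state = [p(a0|h), p(a1|h)].
m_a0 = np.array([1.0, 0.0])        # p(a0|h) is the first core prediction
m_a1 = np.array([0.0, 1.0])        # p(a1|h) is the second core prediction
M_a0 = np.array([[0.0, 1.0],       # p(a0 a0|h) = 0, p(a0 a1|h) = p(a0|h)
                 [0.0, 0.0]])
M_a1 = np.array([[0.0, 0.0],       # p(a1 a0|h) = p(a1|h), p(a1 a1|h) = 0
                 [1.0, 0.0]])

state = np.array([0.3, 0.7])            # uncertain about the first observation
state = psr_update(state, m_a0, M_a0)   # after observing 0, the next must be 1
```

The update uses only predictions of observable tests; no hidden state appears anywhere in the computation.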

  13. How to learn PSR model (1/6) Two subproblems in learning a PSR model • Discovery: find the core tests QT whose predictions constitute the state (a sufficient statistic) • Learning: learn the parameters m_aot that define the system dynamics

  14. How to learn PSR model (2/6) The sets of tests and histories corresponding to a set of linearly independent columns and rows of any submatrix of D are subsets of the core tests and core histories, respectively. [Figure: the infinite matrix D and a finite, small submatrix of it]

  15. How to learn PSR model (3/6) Analytical Discovery and Learning Algorithm (ADL) • Assumption: the exact D is obtained • Analytical discovery algorithm (AD) • Analytical learning algorithm (AL)

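  16. How to learn PSR model (4/6) 1. Analytical discovery algorithm (AD) [Figure: start from the submatrix of D indexed by all tests up to length 1 and all histories up to length 1; find the linearly independent tests T1 and histories H1; extend them by one step; repeat until convergence]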
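
The iteration on this slide can be sketched as follows, under the ADL assumption that exact entries of D are available. Here a hypothetical two-state POMDP acts as the oracle for D; the model, the stopping rule (stop when the rank no longer grows), and all names are assumptions made for illustration, and the published algorithm may differ in detail.

```python
import numpy as np

# Hypothetical 2-state POMDP used only as an oracle for exact entries of D.
T = {0: np.array([[0.9, 0.1], [0.2, 0.8]]),   # T[a][i, j] = Pr(s'=j | s=i, a)
     1: np.array([[0.5, 0.5], [0.5, 0.5]])}
O = {0: np.array([[0.8, 0.2], [0.3, 0.7]]),   # O[a][j, o] = Pr(o | s'=j, a)
     1: np.array([[0.6, 0.4], [0.1, 0.9]])}
b0 = np.array([0.5, 0.5])
PAIRS = [(a, o) for a in (0, 1) for o in (0, 1)]  # all action-observation pairs


def seq_prob(seq):
    """Probability of the observations in seq given its actions, from b0."""
    u = b0
    for a, o in seq:
        u = (u @ T[a]) * O[a][:, o]
    return u.sum()


def p(test, history):
    """Exact prediction p(t|h) = Pr(ht) / Pr(h)."""
    ph = seq_prob(history)
    return seq_prob(history + test) / ph if ph > 0 else 0.0


def independent_columns(M, tol=1e-9):
    """Greedily pick indices of linearly independent columns of M."""
    chosen = []
    for j in range(M.shape[1]):
        if np.linalg.matrix_rank(M[:, chosen + [j]], tol=tol) == len(chosen) + 1:
            chosen.append(j)
    return chosen


def analytical_discovery(max_iters=20):
    tests = [(ao,) for ao in PAIRS]           # all tests of length 1
    hists = [()] + [(ao,) for ao in PAIRS]    # all histories up to length 1
    rank = 0
    for _ in range(max_iters):
        D = np.array([[p(t, h) for t in tests] for h in hists])
        core_t = [tests[j] for j in independent_columns(D)]
        core_h = [hists[i] for i in independent_columns(D.T)]
        if len(core_t) == rank:               # rank stopped growing
            break
        rank = len(core_t)
        # Extend the current core sets by one step in every possible way.
        tests = core_t + [(ao,) + t for ao in PAIRS for t in core_t]
        hists = core_h + [h + (ao,) for ao in PAIRS for h in core_h]
    return core_t, core_h


core_tests, core_histories = analytical_discovery()
```

Tests are extended by prepending an action-observation pair and histories by appending one, so each candidate is one step longer than a core test or core history already found.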

  17. How to learn PSR model (5/6) 2. Analytical learning algorithm (AL) [Equations: the parameters m_aot are derived from the entries of D at the core histories and core tests]
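
The derivation behind analytical learning is usually written as below. The matrix notation p(Q_T | Q_H) for the submatrix of D at the core histories and core tests is an assumption about the slide's notation.

```latex
% Q_T: core tests, Q_H: core histories, k = |Q_T| = |Q_H|
% p(Q_T | Q_H): k x k submatrix of D (rows Q_H, columns Q_T), full rank by construction
% p(aot | Q_H): k-vector of predictions of the test aot from each core history
p(aot \mid Q_H) = p(Q_T \mid Q_H)\, m_{aot}
\quad\Longrightarrow\quad
m_{aot} = p(Q_T \mid Q_H)^{-1}\, p(aot \mid Q_H)
```

The inverse exists because the core histories and core tests were chosen so that their rows and columns are linearly independent.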

  18. How to learn PSR model (6/6) Estimate the system-dynamics matrix D
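
One simple way to estimate an entry of D is by counting over sampled trajectories. The sketch below assumes a system with reset (every trajectory starts from the initial condition) and actions chosen independently of the observations; the function name and the sample data are hypothetical.

```python
def estimate_prediction(trajectories, history, test):
    """Monte Carlo estimate of p(t|h).

    trajectories : list of trajectories, each a list of (action, observation)
                   pairs recorded from a reset
    history, test: lists of (action, observation) pairs
    Returns None if the history followed by the test's actions never occurred.
    """
    n_h = len(history)
    end = n_h + len(test)
    tried = succeeded = 0
    for traj in trajectories:
        if len(traj) < end or traj[:n_h] != history:
            continue
        window = traj[n_h:end]
        # Count the trajectory only if the test's actions were actually taken.
        if all(a == ta for (a, _), (ta, _) in zip(window, test)):
            tried += 1
            if window == test:
                succeeded += 1
    return succeeded / tried if tried else None


# Made-up sample data: four short trajectories.
trajs = [
    [(0, 1), (0, 0)],
    [(0, 1), (0, 1)],
    [(0, 0), (0, 0)],
    [(0, 1), (1, 0)],
]
p_hat = estimate_prediction(trajs, [(0, 1)], [(0, 0)])
```

Here two trajectories start with the history and then take action 0, and one of them produces observation 0, so the estimate is 1/2.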

  19. Conclusions • A new model of dynamical systems, the predictive state representation (PSR), is introduced; it is grounded in actions and observations. • An algorithm, analytical discovery and learning (ADL), is introduced to learn the PSR model.

  20. References • James, M. R., & Singh, S. (2004). Learning and discovery of predictive state representations in dynamical systems with reset. Proceedings of the 21st International Conference on Machine Learning (ICML) (pp. 719–726). • Littman, M., Sutton, R. S., & Singh, S. (2002). Predictive representations of state. Advances in Neural Information Processing Systems 14 (NIPS) (pp. 1555–1561). MIT Press. • McCracken, P., & Bowling, M. (2006). Online learning of predictive state representations. Advances in Neural Information Processing Systems 18 (NIPS). MIT Press. To appear. • Singh, S., James, M. R., & Rudary, M. R. (2004). Predictive state representations: A new theory for modeling dynamical systems. Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI) (pp. 512–519). • Singh, S., Littman, M., Jong, N., Pardoe, D., & Stone, P. (2003). Learning predictive state representations. Proceedings of the Twentieth International Conference on Machine Learning (ICML) (pp. 712–719). • Wiewiora, E. (2005). Learning predictive representations from a history. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 969–976). • Wolfe, B., James, M. R., & Singh, S. (2005). Learning predictive state representations in dynamical systems without reset. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 985–992). • Bowling, M., McCracken, P., James, M., Neufeld, J., & Wilkinson, D. (2006). Learning predictive state representations using non-blind policies. Proceedings of the 23rd International Conference on Machine Learning (ICML).
