Predictive State Representations

Hui Li, July 7, 2006

Outline
• What are the advantages of predictive state representations?
• What is a predictive state representation (PSR)?
• How to learn a PSR model
• Conclusions

What are the advantages of PSR
• PSRs are expressed entirely in terms of observable quantities
• PSRs avoid the local minima and saddle points that arise when learning a POMDP model
• PSRs attain generality and compactness at least equal to those of POMDPs

What are predictive state representations (1/9) Two notions in PSR
• History (h): the sequence of action-observation (ao) pairs that the agent has already experienced, beginning at the first time step
• Test (t): a sequence of ao pairs that begins immediately after a history

What are predictive state representations (2/9) Prediction of a test p(t|h)
[Figure: timeline of a history a1 o1 a2 o2 … ak ok followed immediately by a test a1 o1 a2 o2 a3 o3 … aj oj]
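For reference, the prediction of a test is usually defined in the PSR literature as the probability of seeing the test's observations given that its actions are taken right after the history. The block below writes this out; the superscript indexing is a notational assumption and is not taken from the slide.

```latex
% For a test t = a^1 o^1 \dots a^j o^j taken after history h:
p(t \mid h) = \Pr\left(o^1, \dots, o^j \,\middle|\, h,\; a^1, \dots, a^j\right)
```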

What are predictive state representations (3/9) System-dynamics matrix D

What are predictive state representations (4/9)
• Order of all possible tests in D
• Properties of the predictions in each row of D (one row per history hi)
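One common way to order the columns of D is by increasing test length, lexicographically within each length. The sketch below assumes that ordering; the function name `enumerate_tests` and the action and observation labels are hypothetical.

```python
from itertools import product

def enumerate_tests(actions, observations, max_len):
    """Yield tests as tuples of (action, observation) pairs, ordered by
    increasing length and lexicographically within each length."""
    pairs = [(a, o) for a in actions for o in observations]
    for length in range(1, max_len + 1):
        for test in product(pairs, repeat=length):
            yield test

# Hypothetical system with two actions and two observations.
tests = list(enumerate_tests(["a1", "a2"], ["o1", "o2"], max_len=2))
print(len(tests))   # 4 one-step tests plus 16 two-step tests
print(tests[0])
```

The full matrix D has infinitely many columns, so any concrete enumeration has to stop at some maximum length.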

What are predictive state representations (5/9) Relation between PSR and POMDP
• The belief state is updated according to Bayes' rule
• Constructing D from a POMDP

What are predictive state representations (6/9)

What are predictive state representations (7/9) Since the rank of D is at most k, there exist at most k linearly independent columns or rows in D.
• Core tests QT: the tests corresponding to the k linearly independent columns are called core tests.
• Core histories Qh: the histories corresponding to the k linearly independent rows are called core histories.

What are predictive state representations (8/9)

What are predictive state representations (9/9) Linear PSR model
• Definition: D(Q) is a linear sufficient statistic of the histories, since every column of D is a linear combination of the columns in D(Q).
• PSR state update
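The linear PSR state update is usually written as p(qi|hao) = p(Q|h)·m_aoqi / p(Q|h)·m_ao. The sketch below applies that formula and checks it against a toy hidden-state model. The model, its numbers, and the helper names `outcome` and `weights` are assumptions made for illustration; with a single action, tests are written as tuples of observations only.

```python
import numpy as np

def psr_update(state, m_ao, M_ao):
    """state = p(Q|h), shape (k,). m_ao: weights of the one-step test ao.
    M_ao: (k, k) matrix whose i-th column holds the weights of test a o q_i."""
    return (state @ M_ao) / (state @ m_ao)

# Toy system (hypothetical numbers): 2 hidden states, 1 action, 2 observations.
T = np.array([[0.9, 0.1], [0.2, 0.8]])
O = {0: np.diag([0.8, 0.3]), 1: np.diag([0.2, 0.7])}

def outcome(test):
    """u_t[s] = Pr(observations of the test | start in hidden state s)."""
    u = np.ones(2)
    for o in reversed(test):
        u = T @ O[o] @ u
    return u

core_tests = [(0,), (1,)]
U = np.column_stack([outcome(q) for q in core_tests])

def weights(test):
    """Weight vector m_t such that p(t|h) = p(Q|h) @ m_t."""
    return np.linalg.solve(U, outcome(test))

b = np.array([0.5, 0.5])
state = b @ U                      # p(Q | empty history)
o = 0                              # observation received after the action
m_ao = weights((o,))
M_ao = np.column_stack([weights((o,) + q) for q in core_tests])
new_state = psr_update(state, m_ao, M_ao)
print(new_state)
```

The update never touches the hidden state: it only needs the current predictions of the core tests and the weight vectors.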

How to learn PSR model (1/6) Two subproblems in learning a PSR model
• Discovery: find the core tests QT whose predictions constitute the state (a sufficient statistic)
• Learning: learn the parameters maot that define the system dynamics

How to learn PSR model (2/6) The sets of tests and histories corresponding to a set of linearly independent columns and rows of any submatrix of D are subsets of the core tests and core histories, respectively.
[Figure: the infinite matrix D reduced to a finite, small submatrix]

How to learn PSR model (3/6) Analytical Discovery and Learning Algorithm (ADL) • Assumption: the exact D is obtained • Analytical discovery algorithm (AD) • Analytical learning algorithm (AL)

How to learn PSR model (4/6)
1. Analytical discovery algorithm (AD)
• Start from the submatrix of D formed by all tests and all histories up to length 1
• Find the linearly independent tests T1 and histories H1
• Extend T1 and H1 by one step and repeat until convergence
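The loop above can be sketched as follows, assuming an oracle `predict(history, test)` that returns exact entries of D. The oracle, the greedy rank-based selection, the stopping rule (rank stops growing), and all names are illustrative assumptions rather than the authors' implementation. With one action, tests and histories are tuples of observations.

```python
import numpy as np

def independent_indices(M, tol=1e-9):
    """Greedily pick indices of linearly independent columns of M."""
    chosen = []
    for j in range(M.shape[1]):
        if np.linalg.matrix_rank(M[:, chosen + [j]], tol=tol) > len(chosen):
            chosen.append(j)
    return chosen

def analytical_discovery(predict, symbols, max_iters=20):
    tests = [(s,) for s in symbols]            # all tests of length 1
    hists = [()] + [(s,) for s in symbols]     # all histories up to length 1
    rank = 0
    for _ in range(max_iters):
        D = np.array([[predict(h, t) for t in tests] for h in hists])
        core_t = [tests[j] for j in independent_indices(D)]
        core_h = [hists[i] for i in independent_indices(D.T)]
        if len(core_t) == rank:                # rank stopped growing
            break
        rank = len(core_t)
        # One-step extensions: prepend a symbol to tests, append to histories.
        tests = list(dict.fromkeys(
            core_t + [(s,) + t for s in symbols for t in core_t]))
        hists = list(dict.fromkeys(
            core_h + [h + (s,) for s in symbols for h in core_h]))
    return core_t, core_h

# Toy oracle (hypothetical numbers): 2 hidden states, 1 action, 2 observations.
T = np.array([[0.9, 0.1], [0.2, 0.8]])
O = {0: np.diag([0.8, 0.3]), 1: np.diag([0.2, 0.7])}

def predict(history, test):
    b = np.array([0.5, 0.5])
    for o in history:
        b = b @ T @ O[o]
        b = b / b.sum()
    for o in test:
        b = b @ T @ O[o]
    return float(b.sum())

core_t, core_h = analytical_discovery(predict, symbols=[0, 1])
print(core_t, core_h)
```

The tolerance passed to `matrix_rank` matters once D is estimated from data rather than known exactly, because noise makes every submatrix look full rank.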

How to learn PSR model (5/6) 2. Analytical learning algorithm (AL) Since … then …
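One way to read the learning step is as a linear solve. The predictions of a test aot at the core histories equal D(Qh, QT) times maot, so maot follows by inverting the core submatrix. The sketch below assumes that reading and an exact oracle `predict`; the function name, the toy model, and the core sets are hypothetical.

```python
import numpy as np

def analytical_learning(predict, core_h, core_t, symbol, test=()):
    """Solve D(Qh, QT) m = p(aot | Qh) for the weight vector m of test aot."""
    D_core = np.array([[predict(h, q) for q in core_t] for h in core_h])
    target = np.array([predict(h, (symbol,) + test) for h in core_h])
    return np.linalg.solve(D_core, target)

# Toy oracle (hypothetical numbers): 2 hidden states, 1 action, 2 observations.
T = np.array([[0.9, 0.1], [0.2, 0.8]])
O = {0: np.diag([0.8, 0.3]), 1: np.diag([0.2, 0.7])}

def predict(history, test):
    b = np.array([0.5, 0.5])
    for o in history:
        b = b @ T @ O[o]
        b = b / b.sum()
    for o in test:
        b = b @ T @ O[o]
    return float(b.sum())

core_h = [(), (0,)]          # assumed core histories
core_t = [(0,), (1,)]        # assumed core tests
m = analytical_learning(predict, core_h, core_t, symbol=0, test=(1,))

# The learned weights should predict the same test after any other history.
h = (1, 1)
state = np.array([predict(h, q) for q in core_t])
print(state @ m, predict(h, (0, 1)))
```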

How to learn PSR model (6/6) Estimate the system-dynamics matrix D
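When D is not known exactly, its entries can be estimated by counting. The sketch below assumes episodes that each start from a reset and actions chosen independently of past observations. It estimates p(t|h) as the fraction of episodes that begin with h and then execute the test's actions in which the test's observations also occur. The function name and the data are hypothetical.

```python
def estimate_prediction(trajectories, history, test):
    """Count-based estimate of p(t | h) from episodes that start at reset.
    Each trajectory, history and test is a sequence of (action, observation)."""
    n_h, n_t = len(history), len(test)
    attempts = successes = 0
    for traj in trajectories:
        if len(traj) < n_h + n_t or tuple(traj[:n_h]) != tuple(history):
            continue
        window = traj[n_h:n_h + n_t]
        if [a for a, _ in window] != [a for a, _ in test]:
            continue                      # the test's actions were not taken
        attempts += 1
        successes += all(o == o_t for (_, o), (_, o_t) in zip(window, test))
    return successes / attempts if attempts else float("nan")

# Hypothetical data: four short episodes.
trajs = [
    [("a", "x"), ("a", "y")],
    [("a", "x"), ("a", "x")],
    [("a", "y"), ("a", "x")],
    [("a", "x"), ("b", "y")],
]
p1 = estimate_prediction(trajs, history=(("a", "x"),), test=(("a", "y"),))
p2 = estimate_prediction(trajs, history=(), test=(("a", "x"),))
print(p1, p2)
```

Entries for which no episode matches are returned as NaN, so they can be told apart from a genuine estimate of zero.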

Conclusions
• A new class of dynamical-system models, predictive state representations (PSRs), is introduced; it is grounded in actions and observations.
• An algorithm, analytical discovery and learning (ADL), is introduced to learn the PSR model.

References
• James, M. R., & Singh, S. (2004). Learning and discovery of predictive state representations in dynamical systems with reset. Proceedings of the 21st International Conference on Machine Learning (ICML) (pp. 719–726).
• Littman, M., Sutton, R. S., & Singh, S. (2002). Predictive representations of state. Advances in Neural Information Processing Systems 14 (NIPS) (pp. 1555–1561). MIT Press.
• McCracken, P., & Bowling, M. (2006). Online learning of predictive state representations. Advances in Neural Information Processing Systems 18 (NIPS). MIT Press. To appear.
• Singh, S., James, M. R., & Rudary, M. R. (2004). Predictive state representations: A new theory for modeling dynamical systems. Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI) (pp. 512–519).
• Singh, S., Littman, M., Jong, N., Pardoe, D., & Stone, P. (2003). Learning predictive state representations. Proceedings of the Twentieth International Conference on Machine Learning (ICML) (pp. 712–719).
• Wiewiora, E. (2005). Learning predictive representations from a history. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 969–976).
• Wolfe, B., James, M. R., & Singh, S. (2005). Learning predictive state representations in dynamical systems without reset. Proceedings of the 22nd International Conference on Machine Learning (ICML) (pp. 985–992).
• Bowling, M., McCracken, P., James, M., Neufeld, J., & Wilkinson, D. (2006). Learning predictive state representations using non-blind policies. ICML 2006.
