Predictive State Representations Hui Li July 7, 2006
Outline • What are the advantages of predictive state representation • What’s predictive state representation (PSR) • How to learn PSR model • Conclusions
What are the advantages of PSR • PSRs are expressed entirely in terms of observable quantities • PSRs avoid the problems of local minima and saddle points that arise when learning a POMDP model • PSRs attain generality and compactness at least equal to those of POMDPs
What are predictive state representations (1/9) Two notions in PSR • History (h) • A history is the sequence of action-observation (ao) pairs that the agent has already experienced, beginning at the first time step • Test (t) • A test is a sequence of ao pairs that begins immediately after a history
What are predictive state representations (2/9) Prediction of a test p(t|h) [Figure: timeline showing a history a1 o1 a2 o2 … ak ok followed by a test a1 o1 a2 o2 a3 o3 … aj oj]
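For a history h of length k and a test t = a^1 o^1 … a^j o^j, the prediction p(t|h) is usually defined in the PSR literature as the probability of seeing the test's observations given that the test's actions are taken after h. A sketch of that definition follows; the superscript indexing of positions within the test is an assumed notation, not taken from the slide.

```latex
p(t \mid h) \;=\; \Pr\!\left(o_{k+1}=o^{1},\,\dots,\,o_{k+j}=o^{j} \;\middle|\; h,\; a_{k+1}=a^{1},\,\dots,\,a_{k+j}=a^{j}\right)
```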
What are predictive state representations (3/9) System-dynamics matrix D
What are predictive state representations (4/9) • Order of all possible tests in D • Properties of the predictions in each row h_i of D
What are predictive state representations (5/9) Relation between PSR and POMDP • The belief state is updated according to Bayes' rule • Constructing D from a POMDP
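The two points above can be sketched in a few lines of Python. The two-state POMDP, its numbers, and the function names (`step`, `belief_update`, `predict`, `d_entry`) are hypothetical and chosen only for illustration. Assuming the usual POMDP conventions, the belief is updated with Bayes' rule after each action-observation pair, and an entry p(t|h) of D is obtained by propagating the belief reached after h through the test.

```python
# Hypothetical two-state POMDP with one action "a" and observations 0 and 1.
T = {"a": [[0.9, 0.1], [0.2, 0.8]]}  # T[a][i][j] = Pr(s'=j | s=i, a)
O = {"a": [[0.8, 0.2], [0.3, 0.7]]}  # O[a][j][o] = Pr(o | s'=j, a)


def step(b, a, o):
    """Unnormalised propagation of a state vector through one (a, o) pair."""
    n = len(b)
    return [sum(b[i] * T[a][i][j] for i in range(n)) * O[a][j][o]
            for j in range(n)]


def belief_update(b, a, o):
    """Bayes-rule belief update; assumes Pr(o | b, a) > 0."""
    v = step(b, a, o)
    z = sum(v)  # z = Pr(o | b, a)
    return [x / z for x in v]


def predict(b, test):
    """p(test | belief b): probability of the test's observations
    given that the test's actions are taken."""
    v = list(b)
    for a, o in test:
        v = step(v, a, o)
    return sum(v)


def d_entry(b0, history, test):
    """One entry p(t | h) of the system-dynamics matrix D."""
    b = list(b0)
    for a, o in history:
        b = belief_update(b, a, o)
    return predict(b, test)
```

Filling a finite block of D then amounts to calling `d_entry` for each chosen history and test. A history with zero probability would cause a division by zero in `belief_update`; the sketch does not handle that case.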
What are predictive state representations (7/9) Since the rank of D is at most k, there exist at most k linearly independent columns or rows in D. • Core tests QT • The tests corresponding to the k linearly independent columns are called core tests. • Core histories Qh • The histories corresponding to the k linearly independent rows are called core histories.
What are predictive state representations (9/9) Linear PSR model • Definition: D(Q) is a linear sufficient statistic of the histories, since all the columns of D are linear combinations of the columns in D(Q). • PSR state update
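In a linear PSR the state is the vector of core-test predictions p(Q|h), and after taking action a and seeing observation o each entry is commonly updated as p(q_i|hao) = p(Q|h)·m_aoq_i / p(Q|h)·m_ao. The following is a minimal sketch of that update; the function names and the toy parameters (a memoryless system with a single core test) are assumptions made for illustration.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def psr_update(state, m_ao, m_aoq):
    """Return p(Q | h a o) given state = p(Q | h).

    m_ao  : weight vector with p(a o | h)     = state . m_ao
    m_aoq : one weight vector per core test q_i,
            with p(a o q_i | h) = state . m_aoq[i]
    """
    denom = dot(state, m_ao)
    if denom == 0:
        raise ValueError("observation has zero predicted probability")
    return [dot(state, m) / denom for m in m_aoq]


# Hypothetical toy system: one action "a"; observation 0 occurs with
# probability 0.3 and observation 1 with probability 0.7, independent of
# the past. One core test q = a0 suffices, so the state is p(a0 | h).
state = [0.3]
params = {
    ("a", 0): {"m_ao": [1.0], "m_aoq": [[0.3]]},      # p(a0 a0 | h) = 0.09
    ("a", 1): {"m_ao": [7.0 / 3], "m_aoq": [[0.7]]},  # p(a1 a0 | h) = 0.21
}
for a, o in [("a", 0), ("a", 1), ("a", 1)]:
    p = params[(a, o)]
    state = psr_update(state, p["m_ao"], p["m_aoq"])
```

Because the toy system has no memory, the state stays at 0.3 after every update, which makes the sketch easy to check by hand.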
How to learn PSR model (1/6) Two subproblems in learning a PSR model • Discovery: find the core tests QT whose predictions constitute the state (a sufficient statistic) • Learning: learn the parameters m_aot that define the system dynamics.
How to learn PSR model (2/6) The sets of tests and histories corresponding to a set of linearly independent columns and rows of any submatrix of D are subsets of the core tests and core histories, respectively. [Figure: the infinite matrix D and a finite, small submatrix]
How to learn PSR model (3/6) Analytical Discovery and Learning Algorithm (ADL) • Assumption: the exact D is obtained • Analytical discovery algorithm (AD) • Analytical learning algorithm (AL)
How to learn PSR model (4/6) • Analytical discovery algorithm (AD) [Figure: flowchart. Start with T1 (all tests up to length 1) and H1 (all histories up to length 1); find the linearly independent tests and histories; extend them one step; repeat until convergence.]
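The loop in the flowchart can be sketched as follows. This is a simplified illustration in the spirit of AD, not a faithful reproduction of the published algorithm: it assumes exact entries of D are available from an oracle (here a hypothetical two-state POMDP evaluated with exact fractions), and the names `propagate`, `D`, `independent`, and `discover` are made up for the example.

```python
from fractions import Fraction as F

# Hypothetical two-state POMDP: one action "a", observations 0 and 1.
ACTIONS, OBS = ["a"], [0, 1]
T = {"a": [[F(9, 10), F(1, 10)], [F(2, 10), F(8, 10)]]}  # Pr(s'=j | s=i, a)
O = {"a": [[F(8, 10), F(2, 10)], [F(3, 10), F(7, 10)]]}  # Pr(o | s'=j, a)
B0 = [F(1, 2), F(1, 2)]                                  # initial belief


def propagate(v, seq):
    """Unnormalised propagation of a state vector through (a, o) pairs."""
    for a, o in seq:
        v = [sum(v[i] * T[a][i][j] for i in range(len(v))) * O[a][j][o]
             for j in range(len(v))]
    return v


def D(h, t):
    """Exact oracle for the entry p(t | h) of the system-dynamics matrix."""
    vh = propagate(B0, h)
    ph = sum(vh)
    return sum(propagate(vh, t)) / ph if ph else F(0)


def independent(vectors):
    """Indices of a maximal linearly independent subset, chosen greedily
    by exact Gaussian elimination."""
    basis, chosen = [], []
    for idx, vec in enumerate(vectors):
        v = list(vec)
        for piv, b in basis:
            if v[piv] != 0:
                f = v[piv] / b[piv]
                v = [x - f * y for x, y in zip(v, b)]
        piv = next((i for i, x in enumerate(v) if x != 0), None)
        if piv is not None:
            basis.append((piv, v))
            chosen.append(idx)
    return chosen


def discover(max_iters=10):
    """Grow tests and histories one step at a time until the rank stops
    increasing; returns (core tests, core histories)."""
    one_step = [((a, o),) for a in ACTIONS for o in OBS]
    tests, hists, rank = list(one_step), [()] + one_step, 0
    for _ in range(max_iters):
        cols = [[D(h, t) for h in hists] for t in tests]
        qt = [tests[i] for i in independent(cols)]
        rows = [[D(h, t) for t in qt] for h in hists]
        qh = [hists[i] for i in independent(rows)]
        if len(qt) == rank:
            break
        rank = len(qt)
        # Extend the independent tests and histories by one (a, o) pair.
        tests = list(dict.fromkeys(qt + [s + q for s in one_step for q in qt]))
        hists = list(dict.fromkeys(qh + [h + s for h in qh for s in one_step]))
    return qt, qh


core_tests, core_histories = discover()
```

Exact fractions are used so that the rank test is not affected by floating-point error; with estimated entries of D a tolerance-based rank estimate would be needed instead.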
How to learn PSR model (5/6) 2. Analytical learning algorithm (AL) Since Then
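A plausible reading of the "Since … Then …" step, stated here as an assumption rather than as the slide's own formulas: because every column of D is a linear combination of the core-test columns, restricting to the core histories gives a square k × k system that can be inverted. Writing p(Q_T | Q_h) for the submatrix of D indexed by core histories and core tests, and p(t | Q_h) for the column of test t restricted to the core histories:

```latex
p(t \mid Q_h) = p(Q_T \mid Q_h)\, m_t
\quad\Longrightarrow\quad
m_t = p(Q_T \mid Q_h)^{-1}\, p(t \mid Q_h)
```

Under this reading, the parameters needed for the state update are obtained by taking t = ao and t = a o q_i for each core test q_i; the inverse exists because the core tests and core histories are linearly independent.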
How to learn PSR model (6/6) Estimate the system-dynamic matrix D
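When D is not known exactly, its entries can be estimated from experience. A minimal sketch of a counting estimator follows, under two assumptions that are not stated on the slide: the system can be reset, so every trajectory starts from the same initial condition, and actions are chosen without looking at observations. The estimate of p(t|h) is the fraction of trajectories that begin with h and then execute the actions of t in which the observations of t were also seen. The function name and the data are hypothetical.

```python
def estimate(trajectories, h, t):
    """Estimate p(t | h) by counting; returns None when no trajectory
    starts with h and then executes the actions of t."""
    k, j = len(h), len(t)
    test_actions = [a for a, _ in t]
    tried = succeeded = 0
    for traj in trajectories:
        if len(traj) < k + j or tuple(traj[:k]) != tuple(h):
            continue
        window = traj[k:k + j]
        if [a for a, _ in window] != test_actions:
            continue  # the test's actions were not taken after h
        tried += 1
        if tuple(window) == tuple(t):
            succeeded += 1
    return succeeded / tried if tried else None


# Hypothetical data: four trajectories collected after resets.
trajectories = [
    [("a", 0), ("a", 1)],
    [("a", 0), ("a", 0)],
    [("a", 1), ("a", 0)],
    [("a", 0), ("a", 1)],
]
p_a0 = estimate(trajectories, (), (("a", 0),))
p_a1_after_a0 = estimate(trajectories, (("a", 0),), (("a", 1),))
```

Entries estimated this way are noisy, so any rank or independence test applied to them needs a tolerance rather than an exact comparison.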
Conclusions • A new class of dynamical-system models, predictive state representations (PSRs), is introduced; it is grounded in actions and observations. • An algorithm, analytical discovery and learning (ADL), is introduced to learn the PSR model.