Practical Reinforcement Learning in Continuous Space • William D. Smart, Brown University • Leslie Pack Kaelbling, MIT • Presented by: David LeRoux
Goals of Paper • Practical RL approach • Handles continuous state and action spaces • Safely approximates value function • On-line learning bootstrapped with human-provided data
Approaches to Continuous State or Action Space • Discretize • If too coarse, hidden-state problems arise • If too fine, the learner cannot generalize • Curse of dimensionality • Function Approximators • Used to estimate the value function • Errors tend to propagate • Tendency to over-estimate (hidden extrapolation)
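To make the discretization trade-off concrete, here is a minimal Python sketch of the curse of dimensionality noted above; the 10-bins-per-dimension figure is illustrative, not from the paper.

```python
# Minimal sketch: cell count for a uniformly discretized state space.
# The 10-bins-per-dimension figure is illustrative, not from the paper.
def table_size(bins_per_dim: int, n_dims: int) -> int:
    """Number of cells in a uniform grid over an n_dims-dimensional space."""
    return bins_per_dim ** n_dims

for dims in (1, 2, 4, 8):
    # Even a modest resolution explodes as dimensions are added.
    print(f"{dims} dims x 10 bins -> {table_size(10, dims):,} cells")
```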
Proposed Approach – Hedger • Instance-Based Approach • To predict Q(s,a): • Find the neighborhood of (s,a) in the corpus • Calculate kernel weights for the neighbors • Do locally weighted regression (LWR) to estimate Q(s,a) • If there are not enough points in the neighborhood, or (s,a) is not within the independent variable hull (IVH), return a conservative default value for Q(s,a)
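A minimal sketch of this prediction step, assuming the corpus is stored as NumPy arrays. The neighbor count k, bandwidth h, and default value are illustrative parameters, and the nearest-distance check is a crude stand-in for the paper's IVH test, not an implementation of it.

```python
import numpy as np

def predict_q(query, X, y, k=10, h=0.5, q_default=0.0):
    """Locally weighted regression (LWR) estimate of Q at query = concat(s, a).

    X: (n, d) corpus of (state, action) points; y: (n,) stored Q values.
    k, h, and q_default are illustrative; the nearest-distance check is a
    crude stand-in for the paper's independent variable hull (IVH) test.
    """
    if len(X) < k:
        return q_default                          # too little data: be safe
    d2 = np.sum((X - query) ** 2, axis=1)         # squared distances to corpus
    idx = np.argsort(d2)[:k]                      # k nearest neighbors
    if np.sqrt(d2[idx[0]]) > h:                   # query far from all data
        return q_default
    w = np.exp(-d2[idx] / (2 * h ** 2))           # Gaussian kernel weights
    sw = np.sqrt(w)                               # sqrt-weights for least squares
    A = np.hstack([np.ones((k, 1)), X[idx]])      # local linear model [1, x]
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y[idx] * sw, rcond=None)
    return float(np.hstack([1.0, query]) @ beta)  # evaluate local fit at query
```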
Hedger Training – given an observation (s, a, r, s′) • qnew ← qold + α(r + γ·qnext − qold), where • qold ← Qpredict(s, a) • qnext ← max over a′ of Qpredict(s′, a′) • Use this to update Q(s,a) • Use the updated value of Q(s,a) to update Q(si, ai) in the neighborhood of (s,a) • May be used in batch or on-line mode
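A sketch of one training step built on the predict_q function above. The constants ALPHA, GAMMA, and SMOOTH, the finite candidate_actions sample used to approximate the max over a′, and the kernel-weighted neighbor pull are all assumptions for illustration, not details taken from the paper.

```python
import numpy as np

GAMMA, ALPHA = 0.99, 0.2    # discount and learning rate (illustrative values)
SMOOTH = 0.5                # strength of the neighbor update (an assumption)

def hedger_update(s, a, r, s_next, candidate_actions, X, y, k=10, h=0.5):
    """One Hedger-style Q-learning backup against an instance corpus.

    Relies on predict_q from the previous sketch. candidate_actions is a
    finite sample of actions used to approximate max over a' of Q(s', a');
    the kernel-weighted neighbor pull is a simplified reading of the paper.
    """
    query = np.concatenate([s, a])
    q_old = predict_q(query, X, y, k=k, h=h)
    q_next = max(predict_q(np.concatenate([s_next, a2]), X, y, k=k, h=h)
                 for a2 in candidate_actions)
    q_new = q_old + ALPHA * (r + GAMMA * q_next - q_old)

    # Store the new point, then pull neighboring stored values toward it.
    X = np.vstack([X, query])
    y = np.append(y, q_new)
    d2 = np.sum((X - query) ** 2, axis=1)
    near = d2 < h ** 2                            # neighborhood of (s, a)
    w = np.exp(-d2[near] / (2 * h ** 2))          # kernel weights of neighbors
    y[near] += SMOOTH * w * (q_new - y[near])     # weighted pull toward q_new
    return X, y
```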
Potential Problems using Instance-Based Reinforcement Learning • Determining an appropriate distance metric • Obtaining training paths that achieve rewards • Keeping the size of the corpus manageable • Finding neighbors efficiently • See: Representations for Learning Control Policies – Forbes & Andre
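On the neighbor-lookup point, one standard remedy is a k-d tree. The sketch below uses SciPy's cKDTree on a made-up random corpus; the paper itself does not prescribe this data structure.

```python
import numpy as np
from scipy.spatial import cKDTree

# Made-up corpus of (state, action) points standing in for stored experience.
rng = np.random.default_rng(0)
corpus = rng.uniform(-1.0, 1.0, size=(5000, 3))

tree = cKDTree(corpus)                  # build once, query many times
query = np.array([0.1, -0.2, 0.3])
dists, idx = tree.query(query, k=10)    # k nearest neighbors, fast per query
print("nearest indices:", idx[:3], "distances:", np.round(dists[:3], 3))
```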