390 likes | 589 Views
Introduction. Batch policy evaluation algorithms take a dataset and produce an approximate value functionTwo common linear least-squares algorithms for performing approximate policy evaluationThese two algorithms behave very differently. By studying this space of algorithms, can we gain a bette
E N D