An Accelerated Gradient Method for Multi-Agent Planning in Factored MDPs
Sue Ann Hong, Geoff Gordon
Carnegie Mellon University
Multi-agent planning
• Optimize: individual objective
• Individual constraints
• Shared constraints (resources)
Factored MDPs [Guestrin et al., 2002]
• Want: an efficient, distributed solver
• Shared constraints: piecewise-linear constraints on shared resources
• Individual problem (constraints and objective): an MDP that maximizes a linear reward
• Fast individual solver: value iteration
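Distributed optimization: Lagrangian relaxation
• Put a price on each shared resource (e.g., Resource 1 @ $100), then solve the individual problems in a distributed fashion.
• How to set the prices? Gradient-based methods.
[Figure: agents 1 and 2 responding to posted resource prices; labels: $300, $50, $80, $200, $100; NO, NO]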
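The price-setting loop above can be sketched in a few lines. This is a minimal illustration, not the method from the talk: it uses a plain projected subgradient update on a single price (not the accelerated method), and the names `agent_best_response` and `price_by_subgradient`, the step-size schedule, and the toy agents are all assumptions made for the example. Each agent is reduced to a list of options with a reward and a resource usage, standing in for a full MDP planner.

```python
import numpy as np

def agent_best_response(rewards, usage, price):
    """Hypothetical individual planner: pick the option maximizing reward - price * usage."""
    return int(np.argmax(rewards - price * usage))

def price_by_subgradient(agents, capacity, step=0.5, iters=200):
    """Projected subgradient ascent on the dual price of one shared resource.

    agents: list of (rewards, usage) array pairs, one pair per agent (toy stand-ins for MDPs).
    capacity: total amount of the shared resource available.
    """
    price = 0.0
    choices = []
    for t in range(1, iters + 1):
        # Each agent responds to the posted price independently (distributed step).
        choices = [agent_best_response(r, u, price) for r, u in agents]
        demand = sum(u[c] for (r, u), c in zip(agents, choices))
        # Raise the price when the resource is over-subscribed, lower it otherwise;
        # the max(0, .) keeps the price nonnegative. Diminishing step size is an assumption.
        price = max(0.0, price + (step / np.sqrt(t)) * (demand - capacity))
    return price, choices

# Toy usage: two agents compete for one unit of a resource.
agents = [
    (np.array([0.0, 3.0]), np.array([0.0, 1.0])),  # agent 1 values the resource at 3
    (np.array([0.0, 1.0]), np.array([0.0, 1.0])),  # agent 2 values the resource at 1
]
price, choices = price_by_subgradient(agents, capacity=1.0)
```

In this toy instance the price rises until only the agent that values the resource more still asks for it. Plain subgradient updates of this kind converge slowly in general, which is the motivation for an accelerated method.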
FISTA for factored MDPs
• The objective is linear: augment it with a strongly convex function, causal entropy [Ziebart et al., 2010]
• Usually acts as regularization toward a more uniform policy
• Retains a fast individual planner (softmax value iteration)
• Introduces smoothing error (relative to the linear objective)
• We show that the gain in convergence can outweigh the approximation (smoothing) error.
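To make the "fast individual planner" concrete, here is a hedged sketch of softmax value iteration for a small finite-horizon tabular MDP. It is an illustration under stated assumptions, not the authors' implementation: the function name `softmax_value_iteration`, the array layout of `P` and `R`, and the `temperature` parameter (the weight on the entropy term) are choices made for this example. The only change from ordinary value iteration is that the hard max over actions becomes a log-sum-exp.

```python
import numpy as np

def softmax_value_iteration(P, R, horizon, temperature=1.0):
    """Finite-horizon soft value iteration (assumes horizon >= 1).

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    R: array of shape (S, A), reward for taking action a in state s.
    Returns the soft values V (shape S) and a stochastic policy (shape S x A).
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(horizon):
        # Q[s, a] = R[s, a] + sum_t P[a, s, t] * V[t]
        Q = R + np.einsum("ast,t->sa", P, V)
        # Soft maximum over actions: temperature * log sum_a exp(Q / temperature),
        # computed with the max subtracted for numerical stability.
        m = Q.max(axis=1, keepdims=True)
        V = (m + temperature * np.log(np.exp((Q - m) / temperature).sum(axis=1, keepdims=True))).ravel()
    # Softmax policy for the first decision step; each row sums to 1.
    policy = np.exp((Q - V[:, None]) / temperature)
    return V, policy

# Toy usage: action 0 always leads to state 0, action 1 always leads to state 1.
P = np.array([[[1.0, 0.0], [1.0, 0.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V, policy = softmax_value_iteration(P, R, horizon=5, temperature=0.5)
```

Each soft backup exceeds the hard max by at most `temperature * log(A)`, so over `horizon` steps the soft values overestimate the hard values by at most `horizon * temperature * log(A)`. That gap is one way to see the smoothing error mentioned above: a lower temperature shrinks the error but makes the smoothed problem closer to the original non-smooth one.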