Oracular POMDPs: A Very Special Case
Nicholas Armstrong-Crews and Manuela Veloso
Carnegie Mellon University
• There is an “oracle,” such as a human or accurate robot sensor, that will give perfect state information
• The oracle is an action and has a cost
• At each timestep, the agent can consult the oracle for information or take a state-altering action
• There are no other observations in the environment
• OPOMDPs are “between” MDPs and POMDPs
• We can factorize actions into those that provide information and those that change state
• We can define the value of information
• JIV algorithm: J-MDP Information Value
• Choose between Q-MDP action (from underlying MDP) or oracle
• Greedy heuristic, but works well due to factorization of actions
• Solves problems with hundreds of thousands of action-states
• Provably polynomial time complexity
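The choice between a Q-MDP action and the oracle can be illustrated with a small sketch. This is not the authors' JIV algorithm, whose details the slide does not give; it is one plausible greedy rule under stated assumptions: the value of information is taken as a myopic one-step quantity (expected best Q-value if the state were known, minus the best Q-MDP value under the current belief), and the oracle is consulted when that quantity exceeds its cost. All names (`value_iteration`, `qmdp_values`, `value_of_information`, `choose_action`, `predict_belief`, `ORACLE`) and the two-state toy problem are hypothetical.

```python
# Hedged sketch: greedy "oracle or Q-MDP action" rule for an oracular POMDP.
# ASSUMPTION: the value-of-information formula below is a myopic stand-in,
# not the definition used by the JIV algorithm in the source.

ORACLE = -1  # hypothetical sentinel meaning "consult the oracle"


def value_iteration(T, R, gamma=0.9, iters=500):
    """Solve the underlying MDP. T[a][s][s2] are transition
    probabilities, R[s][a] are rewards. Returns Q[s][a]."""
    n_actions, n_states = len(T), len(T[0])
    V = [0.0] * n_states
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        for s in range(n_states):
            for a in range(n_actions):
                Q[s][a] = R[s][a] + gamma * sum(
                    T[a][s][s2] * V[s2] for s2 in range(n_states)
                )
        V = [max(row) for row in Q]
    return Q


def qmdp_values(belief, Q):
    """Q-MDP score of each action: belief-weighted average of Q[s][a]."""
    return [
        sum(b * Q[s][a] for s, b in enumerate(belief))
        for a in range(len(Q[0]))
    ]


def value_of_information(belief, Q):
    """Myopic gain from knowing the state before acting (assumed form)."""
    informed = sum(b * max(Q[s]) for s, b in enumerate(belief))
    return informed - max(qmdp_values(belief, Q))


def choose_action(belief, Q, oracle_cost):
    """Consult the oracle if the assumed information value exceeds its
    cost; otherwise take the best Q-MDP action."""
    if value_of_information(belief, Q) > oracle_cost:
        return ORACLE
    vals = qmdp_values(belief, Q)
    return vals.index(max(vals))


def predict_belief(belief, T, a):
    """Belief after a state-altering action. With no observations in the
    environment, the update is a pure prediction step."""
    n = len(belief)
    return [sum(belief[s] * T[a][s][s2] for s in range(n)) for s2 in range(n)]


# Toy problem (hypothetical): two static states, action i pays +1 in
# state i and -1 otherwise.
T = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
R = [[1.0, -1.0], [-1.0, 1.0]]
Q = value_iteration(T, R)
uniform_choice = choose_action([0.5, 0.5], Q, oracle_cost=0.5)
certain_choice = choose_action([1.0, 0.0], Q, oracle_cost=0.5)
```

In this toy problem the rule asks the oracle when the belief is uniform and the oracle is cheap, and acts directly once the state is known. The sketch ignores the fact that consulting the oracle itself consumes a timestep, so it should be read as an illustration of the action factorization rather than a faithful cost model.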