Markov Decision Process (MDP)
S: a set of states
A: a set of actions
Pr(s'|s,a): transition model (also written M^a_{s,s'})
C(s,a,s'): cost model
G: a set of goals
s_0: start state
γ: discount factor
R(s,a,s'): reward model
Value function: expected long-term reward from
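The components listed above can be written down directly as data. The following is a minimal sketch, not taken from the slides: the two-state MDP, its probabilities, and its rewards are invented for illustration, and the names `STATES`, `ACTIONS`, `P`, `reward`, and `value_iteration` are hypothetical. It also assumes the usual reading of the value function as the expected discounted reward obtained from a state when acting optimally, and it computes that value with value iteration, a standard technique the summary itself does not name.

```python
from typing import Dict, List, Tuple

State = str
Action = str

# Hypothetical two-state MDP, invented purely for illustration.
STATES: List[State] = ["s0", "goal"]   # S, with s0 as the start state
ACTIONS: List[Action] = ["stay", "go"]  # A
GAMMA = 0.9                             # discount factor

# Transition model Pr(s'|s,a): (s, a) -> list of (s', probability).
P: Dict[Tuple[State, Action], List[Tuple[State, float]]] = {
    ("s0", "stay"): [("s0", 1.0)],
    ("s0", "go"): [("goal", 0.8), ("s0", 0.2)],
    ("goal", "stay"): [("goal", 1.0)],
    ("goal", "go"): [("goal", 1.0)],
}


def reward(s: State, a: Action, s_next: State) -> float:
    """Reward model R(s,a,s'): 1 for entering the goal, 0 otherwise."""
    return 1.0 if s != "goal" and s_next == "goal" else 0.0


def value_iteration(tol: float = 1e-9, max_iters: int = 10_000) -> Dict[State, float]:
    """Estimate V(s), the expected discounted reward from s under optimal actions."""
    V = {s: 0.0 for s in STATES}
    for _ in range(max_iters):
        delta = 0.0
        for s in STATES:
            # Bellman backup: best expected one-step reward plus discounted future value.
            best = max(
                sum(p * (reward(s, a, s2) + GAMMA * V[s2]) for s2, p in P[(s, a)])
                for a in ACTIONS
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return V


if __name__ == "__main__":
    for state, value in value_iteration().items():
        print(f"V({state}) = {value:.4f}")
```

In this toy problem the goal state is absorbing and yields no further reward, so its value is 0, while the start state's value solves V = 0.8 + 0.18 V. A cost model C(s,a,s') could be handled the same way by minimising expected cost instead of maximising expected reward.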