Math 419/519 Prof. Andrew Ross Markov Decision Processes
Highway Pavement Maintenance • Thanks to Pablo Durango-Cohen for this example, though I have made up the numbers. • Classify highway pavement condition as: • Good • Fair • Poor • Can do 4 kinds of repairs: • Expensive • Moderate • Cheap • Nothing
Timeline • April: check condition of road, decide on action. • Summer: repair road as decided. • Fall/Winter: road might deteriorate. • April: check condition again, etc.
Markov Assumptions • How we got to the current condition does not matter. • Future deterioration depends only on the present condition and action. • When choosing an action, we will look only at the present condition, not the past. • This is a policy decision, not a statement about road physics. We could change this policy, but it would make the problem bigger.
If we do Nothing • Road deteriorates according to this transition matrix: • Do the zeros make sense? • Does it make sense that the probabilities decrease from right to left?
If we do Cheap repairs • Road improves/deteriorates according to this transition matrix:
If we do Moderate repairs • Road improves/deteriorates according to this transition matrix:
If we do Expensive repairs • Road improves/deteriorates according to this transition matrix:
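The four transition matrices appeared as images on the original slides, so the actual entries are not recoverable here. Below is a minimal numerical sketch in Python with numbers invented purely for illustration (the lecture's own numbers are also made up); they are chosen to respect the sanity checks above: under Nothing the road never improves, and more expensive repairs shift probability toward Good.

```python
import numpy as np

STATES = ["Good", "Fair", "Poor"]
ACTIONS = ["Nothing", "Cheap", "Moderate", "Expensive"]

# Hypothetical transition matrices, one per action.
# Rows = condition this April, columns = condition next April.
P_by_action = {
    "Nothing": np.array([[0.7, 0.2, 0.1],    # Good: stay, or decay
                         [0.0, 0.6, 0.4],    # Fair: zero chance of improving --
                         [0.0, 0.0, 1.0]]),  # Poor: a road never fixes itself
    "Cheap": np.array([[0.8, 0.15, 0.05],
                       [0.3, 0.5,  0.2],
                       [0.0, 0.4,  0.6]]),
    "Moderate": np.array([[0.9, 0.08, 0.02],
                          [0.5, 0.4,  0.1],
                          [0.2, 0.5,  0.3]]),
    "Expensive": np.array([[0.95, 0.04, 0.01],
                           [0.80, 0.15, 0.05],
                           [0.70, 0.20, 0.10]]),
}

# Every row of every matrix must be a probability distribution.
for A in P_by_action.values():
    assert np.allclose(A.sum(axis=1), 1.0)
```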
Repair Policy • Natural to say: “If it's in Good condition, do Nothing. If it's in Fair condition, do ___. If it's in Poor condition, do ___.” • Rather than if/then, let's make a Policy Matrix:
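The policy matrix itself was an image on the slide; here is one hypothetical way to fill in the blanks, continuing the Python sketch above. Rows are conditions, columns are actions, and each entry is the probability of choosing that action in that condition. A plain if/then rule gives one-hot rows:

```python
# Hypothetical deterministic policy:
# Good -> Nothing, Fair -> Moderate, Poor -> Expensive.
# Columns follow the ACTIONS order above.
policy = np.array([
    [1.0, 0.0, 0.0, 0.0],  # Good: do Nothing
    [0.0, 0.0, 1.0, 0.0],  # Fair: Moderate repair
    [0.0, 0.0, 0.0, 1.0],  # Poor: Expensive repair
])
assert np.allclose(policy.sum(axis=1), 1.0)  # each row is a distribution
```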
Mixed Policies? • Maybe we can't afford to do Expensive repairs every time the road becomes Poor, and can only manage them 30% of the time? Etc.
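A mixed policy just relaxes the one-hot rows. Continuing the sketch, if we do the Expensive repair only 30% of the time a road turns Poor (a made-up split) and fall back to Moderate otherwise, only the Poor row changes:

```python
# Hypothetical mixed policy: when Poor, do Expensive 30% of the time
# and Moderate the other 70%.
mixed_policy = policy.copy()
mixed_policy[2] = [0.0, 0.0, 0.7, 0.3]  # Poor row: [Nothing, Cheap, Moderate, Expensive]
assert np.isclose(mixed_policy[2].sum(), 1.0)
```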
“The” transition matrix? • Changes when you change your policy matrix. • Pr(Good next | Fair now) =
Pr(Good next | Fair now, do Nothing) · Pr(Nothing | Fair)
+ Pr(Good next | Fair now, do Cheap) · Pr(Cheap | Fair)
+ Pr(Good next | Fair now, do Moderate) · Pr(Moderate | Fair)
+ Pr(Good next | Fair now, do Expensive) · Pr(Expensive | Fair)
• And that's just one of 9 entries in the 3×3 matrix!
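This is the law of total probability applied row by row: each state's row of the induced transition matrix is the policy-weighted average of that state's rows under the four actions. A sketch, continuing the code above:

```python
def induced_transition_matrix(policy, P_by_action):
    """Average the action matrices row by row, weighting each action
    by how often the policy chooses it in that state."""
    n = policy.shape[0]
    P = np.zeros((n, n))
    for a, action in enumerate(ACTIONS):
        # policy[:, [a]] is a column of weights; broadcasting scales
        # row i of this action's matrix by Pr(action | state i).
        P += policy[:, [a]] * P_by_action[action]
    return P

P = induced_transition_matrix(mixed_policy, P_by_action)
print(P[1, 0])  # Pr(Good next | Fair now) under this policy
```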
Overall Cost • Given a policy matrix, find the transition matrix. • Then find the steady-state distribution. • Then find how often we do each action. • Then account for the cost of each action. • Then change the policy matrix a little and try to find a cheaper overall cost. • See the book for the math notation.
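Continuing the sketch, here is that pipeline in code: steady-state distribution of the induced chain, long-run action frequencies, and average cost per year. The per-action repair costs are invented for illustration.

```python
# Hypothetical repair costs (arbitrary units per mile, made up).
repair_cost = {"Nothing": 0.0, "Cheap": 2.0, "Moderate": 5.0, "Expensive": 10.0}

def steady_state(P):
    """Solve pi @ P = pi with pi summing to 1 (least squares)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

pi = steady_state(P)             # long-run fraction of time in each condition
action_freq = pi @ mixed_policy  # long-run fraction of time doing each action
avg_cost = sum(action_freq[a] * repair_cost[act]
               for a, act in enumerate(ACTIONS))
print(pi, action_freq, avg_cost)
```

Rerunning this with a tweaked policy matrix and comparing avg_cost is exactly the "change the policy a little" search described above.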
Other Thoughts • Can find optimal policy through: • “Policy Iteration” • “Value Iteration” • Related to Dynamic Programming
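For a flavor of value iteration, here is a minimal discounted-cost sketch, reusing the repair costs above. To make the trade-off non-trivial it also assumes a hypothetical "condition cost" for each year the road spends in a given state; both cost tables are invented for illustration.

```python
# Hypothetical yearly cost of leaving the road in each condition.
condition_cost = {"Good": 0.0, "Fair": 3.0, "Poor": 8.0}

def value_iteration(P_by_action, gamma=0.9, tol=1e-8):
    """Find the policy minimizing expected discounted cost."""
    n = len(STATES)
    V = np.zeros(n)
    while True:
        # Q[i, a] = immediate cost + discounted expected future cost.
        Q = np.array([[condition_cost[STATES[i]] + repair_cost[a]
                       + gamma * P_by_action[a][i] @ V
                       for a in ACTIONS] for i in range(n)])
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, [ACTIONS[k] for k in Q.argmin(axis=1)]
        V = V_new

V, best = value_iteration(P_by_action)
print(dict(zip(STATES, best)))  # optimal action in each condition
```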
References • Wayne L. Winston, Introduction to Operations Research (textbook). • Ronald A. Howard, “Comments on the Origin and Application of Markov Decision Processes,” Operations Research, Vol. 50, No. 1, 2002.