Dynamic Programming

Dynamic Programming • Study Guide for ES205 • Yu-Chi Ho and Jonathan T. Lee • Jan. 11, 2001

Outline • Sample Problem • General Formulation • Linear-Quadratic Problem • General Problems

Path-Cost Problem • Find the path with minimal cost • [Figure: network of nodes with numeric costs; the layout is not recoverable from the extracted text]

Principle of Optimality • “An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.”

Path-Cost Problem (cont.) • [Figures: three successive slides showing the same network with additional numeric annotations; the values and layout are not recoverable from the extracted text]
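The path-cost problem can be solved by computing, for every node, the minimal cost from that node to the terminal node, working backward from the end. The sketch below illustrates this on a small hypothetical network; the node names, edge costs, and the helper names `EDGES`, `cost_to_go`, and `optimal_path` are assumptions made for illustration and are not the network shown on the slides.

```python
from functools import lru_cache

# Hypothetical network (NOT the one on the slides):
# EDGES[node] maps each successor node to the cost of the connecting edge.
EDGES = {
    "A": {"B": 5, "C": 2},
    "B": {"D": 1, "E": 6},
    "C": {"D": 4, "E": 3},
    "D": {"F": 8},
    "E": {"F": 2},
    "F": {},
}
TERMINAL = "F"


@lru_cache(maxsize=None)
def cost_to_go(node):
    """Minimal cost from `node` to TERMINAL (principle of optimality)."""
    if node == TERMINAL:
        return 0
    # Best choice = edge cost now + optimal cost from the resulting node.
    return min(cost + cost_to_go(nxt) for nxt, cost in EDGES[node].items())


def optimal_path(start):
    """Follow the minimizing successor at each node to recover one optimal path."""
    path = [start]
    while path[-1] != TERMINAL:
        node = path[-1]
        best_next = min(
            EDGES[node], key=lambda nxt: EDGES[node][nxt] + cost_to_go(nxt)
        )
        path.append(best_next)
    return path


if __name__ == "__main__":
    print(cost_to_go("A"), optimal_path("A"))
```

Memoizing `cost_to_go` means each node's value is computed once, so the work grows with the number of edges rather than with the number of distinct paths.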

Formulation for Path-Cost Problem • [equation]

Formulation (cont.) • More generally, [equation]

General Formulation • Multistage optimization problem: [equation] • The cost-to-go [equation] with initial condition [equation]
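For reference, one standard way to write a multistage problem and its cost-to-go recursion is sketched below. The symbols $L_i$ (stage cost), $f_i$ (dynamics), $\phi$ (terminal cost), and $V_i$ (cost-to-go) are notation assumed here for illustration and may differ from the authors' own.

```latex
% Multistage optimization problem (assumed notation)
\min_{u(0),\dots,u(N-1)} \; J = \phi\bigl(x(N)\bigr) + \sum_{i=0}^{N-1} L_i\bigl(x(i), u(i)\bigr)
\quad \text{subject to} \quad x(i+1) = f_i\bigl(x(i), u(i)\bigr), \qquad x(0) \text{ given}.

% Cost-to-go recursion, solved backward for i = N-1, \dots, 0
V_i(x) = \min_{u} \Bigl[ L_i(x, u) + V_{i+1}\bigl(f_i(x, u)\bigr) \Bigr],
\qquad V_N(x) = \phi(x).
```

The condition on $V_N$ starts the backward recursion, and the optimal cost of the whole problem is $V_0\bigl(x(0)\bigr)$.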

Multistage or Optimal Control Problem • Can be approached as a static optimization problem with specialized equality (staircase) constraints • See study guides titled “Dynamic Systems” • The equivalence of the two approaches will be made clear below in the solution of a specific class of problems

Linear-Quadratic Problem • [equation] subject to linear system dynamics [equation], given the initial state x(0), where • x(i) is the state variable at time i • u(i) is the control variable at time i • a(i) and b(i) are the cost factors at time i

LQ Problem (cont.) • [equation]

LQ Problem (cont.) • Substitute [equation] into [equation] • Set [equation] • We get [equation]

LQ Problem (cont.) • With some work, we have [equation] • Let [equation]. Then, we have [equation]

LQ Problem (cont.) • With [equation]

LQ Problem (cont.) • Substitute the optimal u(N-1); then we have [equation] • Define [equation]

LQ Problem (cont.) • [equation]

LQ Problem (cont.) • By induction, the optimal solution is [equation], where [equation], with boundary condition [equation]
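The backward induction for the LQ problem can be illustrated numerically. The sketch below assumes a scalar system x(i+1) = f·x(i) + g·u(i) and a cost equal to the sum of a(i)·x(i)² + b(i)·u(i)² over the stages plus a terminal term a(N)·x(N)². The constants `f` and `g`, the terminal term, and the function names are assumptions made for illustration, so the authors' exact formulation may differ. Under these assumptions the cost-to-go is p(i)·x², and the optimal control is the linear feedback u(i) = -k(i)·x(i).

```python
def lq_backward(a, b, f=1.0, g=1.0):
    """Backward recursion for the assumed scalar LQ problem.

    a: list of N+1 state cost factors (a[N] weights the terminal state).
    b: list of N control cost factors (each must be positive).
    Returns (p, k): cost-to-go coefficients p[0..N] and gains k[0..N-1].
    """
    n_stages = len(b)
    p = [0.0] * (n_stages + 1)
    k = [0.0] * n_stages
    p[n_stages] = a[n_stages]  # boundary condition at the final time
    for i in range(n_stages - 1, -1, -1):
        denom = b[i] + p[i + 1] * g * g
        # Gain from setting d/du [b u^2 + p(i+1) (f x + g u)^2] = 0.
        k[i] = p[i + 1] * g * f / denom
        # Cost-to-go coefficient after substituting the optimal control.
        p[i] = a[i] + p[i + 1] * f * f * b[i] / denom
    return p, k


def simulate_cost(x0, k, a, b, f=1.0, g=1.0):
    """Total cost of applying the feedback u(i) = -k[i] * x(i) from x0."""
    x, cost = x0, 0.0
    for i in range(len(k)):
        u = -k[i] * x
        cost += a[i] * x * x + b[i] * u * u
        x = f * x + g * u
    return cost + a[len(k)] * x * x


if __name__ == "__main__":
    a, b = [1.0, 1.0, 1.0], [1.0, 1.0]
    p, k = lq_backward(a, b)
    print(p, k, simulate_cost(2.0, k, a, b))
```

A useful consistency check is that the simulated cost of the feedback policy from x(0) equals p(0)·x(0)², the value predicted by the recursion.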

General Problems • Stochastic problems • Combinatorial problems • Variable termination time • Constraints in the problem

Stochastic Problem • The cost-to-go [equation] with initial condition [equation]
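In the stochastic case the cost-to-go is an expected value: at each stage the control is chosen to minimize the stage cost plus the expected cost-to-go of the random next state. The sketch below shows this on a hypothetical two-state example; the states, controls, costs, transition probabilities, and all function names are assumptions made for illustration and do not come from the slides.

```python
def stochastic_dp(states, controls, horizon, stage_cost, transition, terminal_cost):
    """Finite-horizon stochastic DP by backward recursion.

    transition(x, u) returns a dict {next_state: probability}.
    Returns (J0, policy): the expected cost-to-go at stage 0 and a list
    of per-stage dicts mapping each state to its minimizing control.
    """
    J = {x: terminal_cost(x) for x in states}  # condition at the final stage
    policy = [None] * horizon
    for i in range(horizon - 1, -1, -1):
        J_i, mu_i = {}, {}
        for x in states:
            best_u, best_q = None, float("inf")
            for u in controls:
                # Expected cost-to-go over the random next state.
                expected_future = sum(
                    prob * J[x_next] for x_next, prob in transition(x, u).items()
                )
                q = stage_cost(x, u) + expected_future
                if q < best_q:
                    best_u, best_q = u, q
            J_i[x], mu_i[x] = best_q, best_u
        J, policy[i] = J_i, mu_i
    return J, policy


# Hypothetical example: state 1 = "failed" (costs 4 per stage), state 0 = "working".
# Control 1 = "repair" (costs 1, next state is working for sure), control 0 = "wait".
def example_stage_cost(x, u):
    return 4 * x + u


def example_transition(x, u):
    if u == 1:
        return {0: 1.0}
    return {0: 0.5, 1: 0.5} if x == 0 else {1: 1.0}


if __name__ == "__main__":
    J0, policy = stochastic_dp(
        [0, 1], [0, 1], 2, example_stage_cost, example_transition, lambda x: 0.0
    )
    print(J0, policy)
```

In this example the policy repairs at the first stage and waits at the last one, because a failure at the final stage incurs no further cost under the assumed zero terminal cost.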

Combinatorial Problem • The cost-to-go [equation] with initial condition [equation]

References:
• Bellman, R., Dynamic Programming, Princeton University Press, 1957.
• Bryson, Jr., A. E. and Y.-C. Ho, Applied Optimal Control: Optimization, Estimation, and Control, Taylor & Francis, 1975.
• Dreyfus, S. E. and A. M. Law, The Art and Theory of Dynamic Programming, Academic Press, 1977.
• Ho, Y.-C., Lecture Notes, Harvard University, 1997.
• National Institute of Standards and Technology, Dictionary of Algorithms, Data Structures, and Problems, http://hissa.nist.gov/dads/HTML/principle.html
• Ortega, A. and K. Ramchandran, “Rate-Distortion Methods for Image and Video Compression: An Overview,” IEEE Signal Processing Magazine, Nov. 1998. http://sipi.usc.edu/~ortega/RD_Examples/boxDP.html
