Section 8: Markov Decision Process

The Clinic’s Problem
• You are a doctor running a clinic that is open 2 hours a day.
• Each hour you see one patient: 80% of the time the patient has flu, 20% of the time Ebola.
• You can send each patient either home or to the hospital.
• It is better for a patient to survive at home than to survive at the hospital. But we definitely don’t want them to die!
• A patient with the flu has a 90% chance of surviving at home and will survive at the hospital for sure.
• A patient with Ebola has a 50% chance of surviving at home and will survive at the hospital for sure.

Home or Hospital?
• For each patient who visits your clinic in a day, should you send the patient home or to the hospital?

Modeling as an MDP
• What are the states?
  - S = {s1, s2, …, sn}
• What are the actions?
  - A = {a1, a2, …, am}
• What is the transition model?
  - T: S × A × S → [0, 1]
• What is the reward model?
  - R: S × A → ℝ

Model (1)
• States: a four-element tuple [N, C, PA, PS]
  - N: number of patients seen so far
  - C: condition of the current patient
  - PA: the action taken on the previous patient
  - PS: whether the previous patient survived (1) or not (0)
• Actions:
  - send home, send to hospital, null action
• Reward model (s(X) denotes component X of state s):
  - If s(PA) = home and s(PS) = 1, R(s, a) = 2
  - If s(PA) = home and s(PS) = 0, R(s, a) = 0
  - If s(PA) = hospital and s(PS) = 1, R(s, a) = 1

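Model (2)
[Figure: state-transition diagram of the clinic MDP. Details recoverable from the slide:]
• Start states: [0, F, n/a, n/a] and [0, E, n/a, n/a]
• Action home from [0, F, n/a, n/a] leads to:
  - [1, F, home, 1] with probability 0.8*0.9 = 0.72
  - [1, E, home, 1] with probability 0.2*0.9 = 0.18
  - [1, F, home, 0] with probability 0.8*0.1 = 0.08
  - [1, E, home, 0] with probability 0.2*0.1 = 0.02
• For a flu patient in the second hour:
  - home leads to [2, n/a, home, 1] with probability 0.9 and to [2, n/a, home, 0] with probability 0.1
  - hospital leads to [2, n/a, hospital, 1] with probability 1 and to [2, n/a, hospital, 0] with probability 0
• Reward labels: Reward=2 (sent home, survived), Reward=1 (sent to hospital, survived), Reward=0 (sent home, died)
• The null action from the states with N = 2 leads to Finish.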
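The clinic model can be checked numerically. The sketch below is a minimal illustration rather than part of the original slides. It assumes no discounting, and it relies on the fact that the next patient's condition does not depend on the action taken, so the day's expected reward splits into one independent decision per patient. The function names and the reward for a death at the hospital (an event with probability 0) are assumptions made for illustration.

```python
# Probabilities and rewards as stated on the slides.
P_CONDITION = {"flu": 0.8, "ebola": 0.2}
P_SURVIVE = {
    ("flu", "home"): 0.9,
    ("flu", "hospital"): 1.0,
    ("ebola", "home"): 0.5,
    ("ebola", "hospital"): 1.0,
}
# Reward keyed by (action, survived).
REWARD = {
    ("home", True): 2.0,
    ("home", False): 0.0,
    ("hospital", True): 1.0,
    ("hospital", False): 0.0,  # ASSUMPTION: not given on the slide, never occurs
}
ACTIONS = ("home", "hospital")


def expected_reward(condition, action):
    """Expected reward of one action for one patient."""
    p = P_SURVIVE[(condition, action)]
    return p * REWARD[(action, True)] + (1.0 - p) * REWARD[(action, False)]


def best_actions(condition, tol=1e-9):
    """Return (list of optimal actions, their expected reward); ties are kept."""
    values = {a: expected_reward(condition, a) for a in ACTIONS}
    top = max(values.values())
    return [a for a in ACTIONS if top - values[a] <= tol], top


def expected_daily_reward(hours=2):
    """Expected total reward per day when every patient gets an optimal action."""
    per_patient = sum(P_CONDITION[c] * best_actions(c)[1] for c in P_CONDITION)
    return hours * per_patient


if __name__ == "__main__":
    for condition in P_CONDITION:
        print(condition, best_actions(condition))
    print("expected daily reward:", expected_daily_reward())
```

Under these rewards, sending a flu patient home has the higher expected reward (0.9 × 2 = 1.8 against 1), while for an Ebola patient the two actions tie (0.5 × 2 = 1 against 1), so the stated rewards alone do not break the tie in favor of the hospital.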

Eat ice-cream or take vitamins???
• States: Healthy (H) or Sick (S)
• Actions: Ice-cream (IC) or Vitamins (V)
• Transitions: [tables for Ice cream and Vitamins not preserved in the transcript]
• Rewards: [table not preserved in the transcript]

Value Iteration
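The body of this slide did not survive in the transcript. As a reminder, the standard value-iteration update is sketched below in the notation used elsewhere in this section (T, R, Qk, Vk, π*k); initializing V_0 to zero is an assumption that matches the k = 0 row of the example table.

```latex
\begin{align*}
V_0(s)     &= 0 \\
Q_k(s, a)  &= R(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \, V_{k-1}(s') \\
V_k(s)     &= \max_{a \in A} Q_k(s, a) \\
\pi^*_k(s) &= \arg\max_{a \in A} Q_k(s, a)
\end{align*}
```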

Example (γ = 0.9)

k   Qk(H,IC)  Qk(H,V)  Qk(S,IC)  Qk(S,V)  π*k(H)  π*k(S)  Vk(H)  Vk(S)
0   -         -        -         -        -       -       0      0
1   10        0        5         0        IC      IC      10     5
2   17.65     9        9.5       8.55     IC      IC      17.65  9.5
3   23.68     15.88    13.55     15.15    IC      V       23.68  15.15
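The table values can be reproduced with a short value-iteration sketch. The transition probabilities and rewards in the code are assumptions: the original tables are missing from the transcript, and the numbers here were back-derived so that they match the Q-values in the example (for instance, Q1 equals the reward because V0 = 0). All names are hypothetical.

```python
GAMMA = 0.9
STATES = ("H", "S")
ACTIONS = ("IC", "V")

# ASSUMPTION: rewards back-derived from the k = 1 row of the example table.
R = {("H", "IC"): 10.0, ("H", "V"): 0.0, ("S", "IC"): 5.0, ("S", "V"): 0.0}

# ASSUMPTION: transition probabilities back-derived from the k = 2 row.
T = {
    ("H", "IC"): {"H": 0.7, "S": 0.3},
    ("H", "V"): {"H": 1.0, "S": 0.0},
    ("S", "IC"): {"H": 0.0, "S": 1.0},
    ("S", "V"): {"H": 0.9, "S": 0.1},
}


def q_values(v):
    """One Bellman backup: Q(s, a) = R(s, a) + gamma * sum_s' T(s, a, s') * V(s')."""
    return {
        (s, a): R[(s, a)] + GAMMA * sum(p * v[s2] for s2, p in T[(s, a)].items())
        for s in STATES
        for a in ACTIONS
    }


def value_iteration(n_iters):
    """Run n_iters sweeps from V0 = 0; return a list of (k, Q, policy, V) tuples."""
    v = {s: 0.0 for s in STATES}
    history = []
    for k in range(1, n_iters + 1):
        q = q_values(v)
        policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
        v = {s: q[(s, policy[s])] for s in STATES}
        history.append((k, q, policy, v))
    return history


if __name__ == "__main__":
    for k, q, policy, v in value_iteration(3):
        rounded_q = {key: round(val, 2) for key, val in q.items()}
        rounded_v = {key: round(val, 2) for key, val in v.items()}
        print(k, rounded_q, policy, rounded_v)
```

With these assumed parameters the greedy policy for the Sick state switches from IC to V at k = 3, as in the example, because the discounted value of becoming Healthy again starts to outweigh the immediate reward of ice cream.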
