
Section 8: Markov Decision Process


Presentation Transcript


  1. Section 8: Markov Decision Process

  2. The Clinic’s Problem
  • You are a doctor and run a clinic that is open 2 hours a day.
  • Each hour you see one patient: 80% of the time (s)he has the flu, 20% of the time (s)he has Ebola.
  • You can either send the patient home or to the hospital.
  • It is better for a patient to survive at home than to survive at the hospital, but we definitely don’t want anyone to die!
  • A patient with the flu has a 90% chance of surviving at home and will survive at the hospital for sure.
  • A patient with Ebola has a 50% chance of surviving at home and will survive at the hospital for sure.

  3. Home or Hospital?
  • For each patient who visits your clinic in a day, should you send him/her home or to the hospital?

  4. Modeling in MDP
  • What are the states? - S = {s1, s2, …, sn}
  • What are the actions? - A = {a1, a2, …, am}
  • What is the transition model? - T: S × A × S → [0, 1]
  • What is the reward model? - R: S × A → ℝ
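One minimal way to hold these four ingredients in code is sketched below. This is my own illustration, not from the slides; the class and field names (MDP, states, transition, …) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Hashable

State = Hashable
Action = Hashable

@dataclass
class MDP:
    """A finite MDP as the four ingredients listed above (a sketch).

    transition(s, a, s2) should return T(s, a, s2) = P(s2 | s, a);
    reward(s, a) should return the immediate reward R(s, a).
    """
    states: list                                          # S = {s1, ..., sn}
    actions: list                                         # A = {a1, ..., am}
    transition: Callable[[State, Action, State], float]   # T: S x A x S -> [0, 1]
    reward: Callable[[State, Action], float]              # R: S x A -> reals
```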

  5. Model (1)
  • States: a four-element tuple [N, C, PA, PS]
    - N: number of patients visited so far
    - C: condition of the current patient
    - PA: the action taken on the previous patient
    - PS: whether the previous patient survived
  • Actions: send home, send to hospital, null action
  • Reward model:
    - If s(PA) = home and s(PS) = 1, R(s, a) = 2
    - If s(PA) = home and s(PS) = 0, R(s, a) = 0
    - If s(PA) = hospital and s(PS) = 1, R(s, a) = 1
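For concreteness, a sketch of this reward model in code follows; the function name is my own, and the catch-all return of 0 (e.g. for the null action before any patient has been treated) is an assumption the slide leaves implicit.

```python
def clinic_reward(state, action):
    """Reward model from Model (1); depends only on the previous patient's outcome."""
    N, C, PA, PS = state              # patients seen, condition, previous action, previous survival
    if PA == "home" and PS == 1:
        return 2                      # previous patient survived at home: best outcome
    if PA == "home" and PS == 0:
        return 0                      # previous patient died at home: worst outcome
    if PA == "hospital" and PS == 1:
        return 1                      # previous patient survived at the hospital
    return 0                          # assumption: every other case gives no reward
```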

  6. Model (2)
  [State-transition diagram. From the start state [0, F, n/a, n/a], the action "home" leads to [1, F, home, 1], [1, E, home, 1], [1, F, home, 0], [1, E, home, 0] with probabilities 0.8 × 0.9 = 0.72, 0.2 × 0.9 = 0.18, 0.8 × 0.1 = 0.08, 0.2 × 0.1 = 0.02 (next patient’s condition × current patient’s survival at home); the start state [0, E, n/a, n/a] and the "hospital" action branch similarly. From the N = 1 states, "home" (survival 0.9 / 0.1 for flu) and "hospital" (survival 1) lead to terminal states such as [2, n/a, home, 1], [2, n/a, home, 0], [2, n/a, hospital, 1], after which the null action finishes the episode. Rewards: 2 if the previous patient survived at home, 1 if (s)he survived at the hospital, 0 otherwise.]
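As a quick sanity check on the model (my own illustrative computation, not part of the deck): since the state includes the current patient's condition, the expected reward earned for that patient can be compared directly between the two actions, using the survival probabilities from slide 2 and the rewards from slide 5.

```python
# Expected reward for the current patient under each action.
p_survive_home = {"flu": 0.9, "ebola": 0.5}                # survival probability at home
reward = {("home", 1): 2, ("home", 0): 0, ("hospital", 1): 1}

for condition in ("flu", "ebola"):
    p = p_survive_home[condition]
    exp_home = p * reward[("home", 1)] + (1 - p) * reward[("home", 0)]
    exp_hospital = 1.0 * reward[("hospital", 1)]           # hospital survival is certain
    print(condition, "-> home:", exp_home, " hospital:", exp_hospital)

# flu   -> home: 1.8  hospital: 1.0   (send home)
# ebola -> home: 1.0  hospital: 1.0   (indifferent under this reward model)
```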

  7. Eat ice-cream or take vitamins?
  • States: Healthy (H) or Sick (S)
  • Actions: Ice-cream (IC) or Vitamins (V)
  • Transitions: [shown on the slide as two tables, one for Ice cream and one for Vitamins]
  • Rewards: [shown on the slide as a table]

  8. Value Iteration
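The slide's pseudocode is not reproduced in the transcript, so here is a minimal value-iteration sketch (names and structure are my own) implementing the update used in the next slide: Q_{k+1}(s, a) = R(s, a) + γ Σ_{s'} T(s, a, s') V_k(s'), with V_k(s) = max_a Q_k(s, a).

```python
def value_iteration(states, actions, T, R, gamma, iterations):
    """Plain finite-horizon value iteration (assumes iterations >= 1).

    T(s, a, s2) and R(s, a) are callables; returns V_k, Q_k and the greedy
    policy after the given number of iterations.
    """
    V = {s: 0.0 for s in states}                                     # V_0 = 0
    for _ in range(iterations):
        Q = {(s, a): R(s, a) + gamma * sum(T(s, a, s2) * V[s2] for s2 in states)
             for s in states for a in actions}                       # Q_{k+1}
        V = {s: max(Q[(s, a)] for a in actions) for s in states}     # V_{k+1}
    policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}
    return V, Q, policy
```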

  9. Example (γ = 0.9)

  k | Qk(H,IC) | Qk(H,V) | Qk(S,IC) | Qk(S,V) | π*k(H) | π*k(S) | Vk(H) | Vk(S)
  --|----------|---------|----------|---------|--------|--------|-------|------
  0 |          |         |          |         |        |        |  0    |  0
  1 |  10      |  0      |  5       |  0      |  IC    |  IC    | 10    |  5
  2 |  17.65   |  9      |  9.5     |  8.55   |  IC    |  IC    | 17.65 |  9.5
  3 |  23.68   |  15.88  |  13.55   |  15.15  |  IC    |  V     | 23.68 | 15.15
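The transition and reward tables from slide 7 are not reproduced above; the values below are one choice that is consistent with every Q and V entry in this table (up to rounding), but treat them as assumptions rather than the deck's own numbers. Plugging them into the value_iteration sketch from slide 8 reproduces the k = 3 row.

```python
states = ["H", "S"]
actions = ["IC", "V"]

# Assumed transition/reward tables, inferred so that they match the table above.
T_table = {
    ("H", "IC"): {"H": 0.7, "S": 0.3},   # ice cream while healthy: may get sick
    ("H", "V"):  {"H": 1.0, "S": 0.0},   # vitamins while healthy: stay healthy
    ("S", "IC"): {"H": 0.0, "S": 1.0},   # ice cream while sick: stay sick
    ("S", "V"):  {"H": 0.9, "S": 0.1},   # vitamins while sick: likely recover
}
R_table = {("H", "IC"): 10, ("H", "V"): 0, ("S", "IC"): 5, ("S", "V"): 0}

T = lambda s, a, s2: T_table[(s, a)][s2]
R = lambda s, a: R_table[(s, a)]

V, Q, policy = value_iteration(states, actions, T, R, gamma=0.9, iterations=3)
print(V)       # {'H': 23.68..., 'S': 15.15...}
print(policy)  # {'H': 'IC', 'S': 'V'} -- matches the k = 3 columns π*k(H), π*k(S)
```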
