A Reinforcement Learning Approach to Dynamic Resource Allocation

hewitt

Presentation Transcript


  1. A Reinforcement Learning Approach to Dynamic Resource Allocation

  2. Introduction • Dynamic resource allocation among multiple entities sharing a common set of resources • Results of earlier work (UP) • Improvement (RL for U)

  3. Problem formulation • Resource migrations take a non-negligible amount of time • Algorithm for reassigning multiple resource units

  4. Solution Methodology • Learning U in the dynamic resource allocation setting • Predecessors: (1) [8] uses a single state per project; (2) [11] uses a two-value state but transfers only a single resource type • Improvement: extend [11] by considering transfers of multiple resource types • $dU_i/dr_i = dU_j/dr_j$ • With n resource types, s is an n-dimensional vector • Rule base: advantage; disadvantage • DRA-FRL
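The condition $dU_i/dr_i = dU_j/dr_j$ on this slide can be motivated by a short derivation. The following is a sketch under assumptions the transcript does not state: that the goal is to maximize the sum of the utilities of $m$ entities, that a fixed total $R$ of one resource type is split among them, that each $U_i$ is differentiable and concave, and that the optimum is interior. The symbols $m$, $R$, and $\lambda$ are introduced here for illustration only.

```latex
% Assumed objective: maximize total utility subject to a fixed resource budget.
\max_{r_1,\dots,r_m} \; \sum_{i=1}^{m} U_i(r_i)
\quad \text{subject to} \quad \sum_{i=1}^{m} r_i = R

% Lagrangian with multiplier \lambda for the budget constraint.
\mathcal{L}(r_1,\dots,r_m,\lambda)
  = \sum_{i=1}^{m} U_i(r_i) - \lambda \Bigl( \sum_{i=1}^{m} r_i - R \Bigr)

% First-order condition for every entity i.
\frac{\partial \mathcal{L}}{\partial r_i}
  = \frac{dU_i}{dr_i} - \lambda = 0
\quad \Longrightarrow \quad
\frac{dU_i}{dr_i} = \lambda = \frac{dU_j}{dr_j}
\quad \text{for all } i, j
```

Read this way, a transfer from entity j to entity i is worthwhile as long as the marginal utility of i exceeds that of j, and the transfers stop paying off once the two are equal.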

  5. Solution Methodology • Fuzzy rulebase (FRB): each parameter p gives the output value of the FRB when the input vector belongs to the categories A of rule i
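One plausible reading of this slide is a rulebase whose output is a weighted average of the per-rule parameters p, where each rule's weight measures how strongly the input belongs to that rule's categories A. The sketch below assumes that form, with the weight taken as the product of membership degrees; the slide does not confirm either choice. The membership functions `small` and `large` and the function `frb_output` are hypothetical names.

```python
from typing import Callable, List, Sequence

Membership = Callable[[float], float]


def small(x: float) -> float:
    """Hypothetical membership function: degree to which x in [0, 1] is 'small'."""
    return min(1.0, max(0.0, 1.0 - x))


def large(x: float) -> float:
    """Hypothetical membership function: degree to which x in [0, 1] is 'large'."""
    return min(1.0, max(0.0, x))


def frb_output(
    x: Sequence[float],
    rules: Sequence[Sequence[Membership]],
    p: Sequence[float],
) -> float:
    """Weighted average of rule parameters p (assumed rulebase form).

    rules[i][j] is the category A of rule i for input component j.
    The weight of rule i is the product of its membership degrees (assumption).
    """
    weights: List[float] = []
    for categories in rules:
        w = 1.0
        for x_j, mu in zip(x, categories):
            w *= mu(x_j)
        weights.append(w)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("input does not activate any rule")
    return sum(w_i * p_i for w_i, p_i in zip(weights, p)) / total


# One input variable, two rules: "x is small -> p = 0" and "x is large -> p = 10".
rules = [(small,), (large,)]
p = [0.0, 10.0]
value = frb_output([0.25], rules, p)
```

When the input lies fully inside the categories of a single rule, that rule's weight is 1 and all others are 0, so the output equals that rule's parameter p, which matches the statement on the slide.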

  6. Solution Methodology

  7. Solution Methodology • Reinforcement Learning Algorithm • A finite set of states S • A finite set of actions A • A reward function $r: S \times A \times S \to \mathbb{R}$ • A state transition function $T: S \times A \to PD(S)$, where PD(S) is the set of probability distributions over S • $r(s, a, s')$
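The four components listed on this slide map directly onto a small data structure. The sketch below is a toy example, not the resource allocation problem from the presentation: the state labels, action labels, probabilities, and reward values are all made up for illustration.

```python
import random
from typing import Dict, Tuple

State = str
Action = str

STATES: Tuple[State, ...] = ("low", "high")    # finite set of states S (toy labels)
ACTIONS: Tuple[Action, ...] = ("stay", "move")  # finite set of actions A (toy labels)

# T[(s, a)] is a probability distribution over next states, i.e. an element of PD(S).
T: Dict[Tuple[State, Action], Dict[State, float]] = {
    ("low", "stay"): {"low": 0.9, "high": 0.1},
    ("low", "move"): {"low": 0.2, "high": 0.8},
    ("high", "stay"): {"low": 0.1, "high": 0.9},
    ("high", "move"): {"low": 0.8, "high": 0.2},
}


def reward(s: State, a: Action, s_next: State) -> float:
    """Toy reward r(s, a, s'): landing in 'high' pays 1, moving costs 0.5."""
    cost = 0.5 if a == "move" else 0.0
    return (1.0 if s_next == "high" else 0.0) - cost


def step(s: State, a: Action, rng: random.Random) -> Tuple[State, float]:
    """Sample s' from T(s, a) and return (s', r(s, a, s'))."""
    dist = T[(s, a)]
    next_states = list(dist)
    weights = [dist[s2] for s2 in next_states]
    s_next = rng.choices(next_states, weights=weights)[0]
    return s_next, reward(s, a, s_next)
```

The reward takes the next state as its third argument, which is why the slide types it as a function on S × A × S rather than on S × A.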

  8. Solution Methodology • Temporal difference (TD) learning: TD(0)
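The slide names TD(0) without showing the update. For reference, the standard tabular rule is V(s) ← V(s) + α [r + γ V(s') − V(s)]. The sketch below implements that textbook rule; the function name `td0_update` and the default step size and discount factor are assumptions, and the presentation may use a different variant.

```python
from typing import Dict, Hashable


def td0_update(
    V: Dict[Hashable, float],
    s: Hashable,
    r: float,
    s_next: Hashable,
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """Apply one tabular TD(0) update to V in place and return the TD error.

    Unvisited states are treated as having value 0 (an assumption of this sketch).
    """
    td_error = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * td_error
    return td_error


# Usage: one observed transition s="a" -> s'="b" with reward 1.
V: Dict[Hashable, float] = {}
delta = td0_update(V, "a", 1.0, "b", alpha=0.5, gamma=0.9)
```

The update only needs the observed transition (s, r, s'), so it can be applied online after every step without a model of T.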

  9. Solution Methodology • Greedy policy
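A greedy policy picks, in each state, the action that looks best under the current value estimates. The sketch below assumes a known transition model and a one-step lookahead, neither of which is stated on the slide; `greedy_action` and the tiny model in the usage lines are hypothetical.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


def greedy_action(
    s: State,
    actions: Sequence[Action],
    T: Dict[Tuple[State, Action], Dict[State, float]],
    r: Callable[[State, Action, State], float],
    V: Dict[State, float],
    gamma: float = 0.9,
) -> Action:
    """Return the action maximizing the expected one-step return from state s."""

    def expected_return(a: Action) -> float:
        # Expectation over s' of r(s, a, s') + gamma * V(s'); unseen states count as 0.
        return sum(
            prob * (r(s, a, s_next) + gamma * V.get(s_next, 0.0))
            for s_next, prob in T[(s, a)].items()
        )

    return max(actions, key=expected_return)


# Usage with a made-up deterministic model: two actions lead to two different states.
toy_T = {("s", "left"): {"l": 1.0}, ("s", "right"): {"r": 1.0}}
toy_V = {"l": 1.0, "r": 2.0}
best = greedy_action("s", ["left", "right"], toy_T, lambda s, a, s2: 0.0, toy_V)
```

Ties are broken in favor of the first action in `actions`, because `max` returns the first maximal element.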

  10. Solution Methodology • [9]: TDL with function approximation
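With function approximation, the value table is replaced by a parameterized function. A common case is a linear approximator V(s) = θ · φ(s), for which the TD(0) update becomes θ ← θ + α δ φ(s) with δ = r + γ θ · φ(s') − θ · φ(s). The sketch below shows that standard linear rule; whether [9] or the presentation uses exactly this form is not stated in the transcript, and the names `td0_linear_update`, `theta`, and `phi_s` are assumptions.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def td0_linear_update(
    theta: Sequence[float],
    phi_s: Sequence[float],
    r: float,
    phi_next: Sequence[float],
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> Tuple[List[float], float]:
    """One TD(0) step for a linear value function V(s) = theta . phi(s).

    Returns the updated parameter vector and the TD error.
    """
    delta = r + gamma * dot(theta, phi_next) - dot(theta, phi_s)
    # For a linear approximator the gradient of V(s) w.r.t. theta is phi(s).
    new_theta = [t + alpha * delta * f for t, f in zip(theta, phi_s)]
    return new_theta, delta


# Usage: two features, one observed transition with reward 1.
theta, delta = td0_linear_update([0.0, 0.0], [1.0, 0.0], 1.0, [0.0, 1.0], alpha=0.5)
```

If the fuzzy rulebase output is a weighted average of its parameters p, it is linear in p, so normalized rule weights could serve as the features φ(s); that pairing is an assumption of this note.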

  11. Experimental Setup and Results • Queuing theory: expected queue length of an M/D/n queue
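The transcript does not show which formula the authors use for the M/D/n queue. As a rough sketch, the expected queue length can be approximated by halving the M/M/n value from the Erlang C formula (the Allen-Cunneen approximation with deterministic service); for n = 1 this coincides with the exact Pollaczek-Khinchine result for M/D/1. All function and parameter names below are assumptions.

```python
from math import factorial


def erlang_c(n: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/n queue (Erlang C)."""
    rho = offered_load / n
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue is unstable: utilization must be in [0, 1)")
    tail = offered_load ** n / factorial(n) / (1.0 - rho)
    head = sum(offered_load ** k / factorial(k) for k in range(n))
    return tail / (head + tail)


def mmn_queue_length(arrival_rate: float, service_rate: float, n: int) -> float:
    """Expected number of waiting jobs in an M/M/n queue."""
    offered_load = arrival_rate / service_rate
    rho = offered_load / n
    return erlang_c(n, offered_load) * rho / (1.0 - rho)


def mdn_queue_length_approx(arrival_rate: float, service_rate: float, n: int) -> float:
    """Approximate expected queue length of an M/D/n queue.

    Uses Lq(M/D/n) ~ 0.5 * Lq(M/M/n); exact only for n = 1 (an approximation otherwise).
    """
    return 0.5 * mmn_queue_length(arrival_rate, service_rate, n)


# Usage: single server at 50% utilization.
lq = mdn_queue_length_approx(arrival_rate=0.5, service_rate=1.0, n=1)
```

Evaluating this function for different n shows how the expected backlog shrinks as servers are added, which is the kind of quantity a fixed-allocation baseline can be tuned against.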

  12. Experimental Setup and Results • Optimal fixed resource allocation, derived from queuing theory • Reactive policy: balances resource utilization • Utility-based policy: cost function:

  13. Experimental Setup and Results • Step 1: Use the “reactive” policy as the initial policy • Step 2: DRA-FRL

  14. Experimental Setup and Results • Results
