1 / 15

Iroko: Reinforcement Learning for Data Center Optimization

Iroko is an open reinforcement learning platform tailored for data center scenarios, overcoming challenges faced by operators such as fragile stability and limited reproducibility. With the ability to scale, define rewards, and model various data center conditions, Iroko aims to revolutionize decision-making processes in this critical environment. Experiments show potential to outperform traditional TCP algorithms, signaling a shift towards more robust RL solutions. Join the data center evolution with Iroko on GitHub.

gracieb
Download Presentation

Iroko: Reinforcement Learning for Data Center Optimization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Iroko A Data Center Emulator for Reinforcement Learning Fabian Ruffy, Michael Przystupa, Ivan Beschastnikh University of British Columbia https://github.com/dcgym/iroko

  2. Reinforcement Learning and Networking

  3. DC challenges are optimization problems Traffic control Resource management Routing Operators have complete control Automation possible Lots of data can be collected The Data Center: A perfect use case Cho, Inho, Keon Jang, and Dongsu Han. "Credit-scheduled delay-bounded congestion control for datacenters.“ SIGCOMM 2017

  4. Typical reinforcement learning is not viable for data center operators! Fragile stability Questionable reproducibility Unknown generalizability Prototyping RL is complicated Cannot interfere with live production traffic Offline traces are limited in expressivity Deployment is tedious and slow Two problems…

  5. Iroko: open reinforcement learning gym for data center scenarios Inspired by the Pantheon* for WAN congestion control Deployable on a local Linux machine Can scale to topologies with many hosts Approximates real data center conditions Allows arbitrary definition of Reward State Actions Our work: A platform for RL in Data Centers  *Yan, Francis Y., et al. "Pantheon: the training ground for Internet congestion-control research.“ ATC 2018

  6. Iroko in one slide Policy OpenAI Gym Reward Model State Model Data Collectors Traffic Pattern Action Model Topology Fat-Tree Dumbbell Rack

  7. Ideal data center should have: Low latency, high utilization No packet loss or queuing delay Fairness CC variations draw from the reactive TCP Queueing latency dominates Frequent retransmits reduce goodput Data center performance may be unstable Use Case: Congestion Control

  8. Predicting Networking Traffic Bandwidth Allocation Flow Pattern 10 Data Collection Policy 10 10 10 Switch 10 10 10

  9. Predicting Networking Traffic Bandwidth Allocation Flow Pattern 10 Data Collection Policy 3.3 10 10 10 3.4 3.3 Switch 10 10 10

  10. Predicting Networking Traffic Bandwidth Allocation Flow Pattern 10 Data Collection Policy 10 10 3.3 Switch 3.4 3.3 10

  11. Two environments: env_iroko: centralized rate limiting arbiter Agent can set the sending rate of hosts PPO, DDPG, REINFORCE env_tcp: raw TCP Contains implementations of TCP algorithms TCP Cubic, TCP New Vegas, DCTCP Goal: Avoid congestion Can we learn to allocate traffic fairly?

  12. 50000 timesteps Linux default UDP as base transport 5 runs (~7 hours per run) Bottleneck at central link Experiment Setup

  13. Results – Dumbbell UDP

  14. Challenging real-time environment Noisy observation Exhibits strong credit assignment problem RL algorithms show expected behavior for our gym Achieve better performance than TCP New Vegas More robust algorithms required to learn good policy DDPG and PPO achieve near optimum REINFORCE fails to learn good policy Results - Takeaways

  15. Data center reinforcement learning is gaining traction …but it is difficult to prototype and evaluate Iroko is a platform to experiment with RL for data centers intended to train on live traffic early stage work but experiments are promising available on Github: https://github.com/dcgym/iroko Contributions

More Related