1 / 19

A Case for Fine G rained Traffic Engineering in Data Centers

A Case for Fine G rained Traffic Engineering in Data Centers. Theophilus Benson *, Ashok Anand*, Aditya Akella*, Ming Zhang + *University of Wisconsin, Madison + Microsoft Research. Why are Data Centers Important?. Congestion == bad app. performance

zuri
Download Presentation

A Case for Fine G rained Traffic Engineering in Data Centers

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Case for Fine Grained Traffic Engineering in Data Centers Theophilus Benson*, Ashok Anand*, Aditya Akella*, Ming Zhang+ *University of Wisconsin, Madison + Microsoft Research

  2. Why are Data Centers Important? • Congestion == bad app. performance • Bad app performance == user dissatisfaction • User dissatisfaction == loss of revenue • Traffic engineering is crucial • IM: low B/W, loose latency • Multimedia: low B/W, strict latency • Games: high B/W, strict latency

  3. Outline • Background • Traffic Engineering in data centers • Design goals for ideal TE • MicroTE • Conclusion

  4. Options for TE in Data Centers? • Current supported techniques • Equal Cost MultiPath (ECMP) • Spanning Tree Protocol (STP) • Proposed (ECMP based) • Fat-Tree, VL2 • Other existing • TEXCP, COPE,…, OSPF link tuning

  5. Properties of Data Center Traffic • Flows are small and short-lived [Kandula et. al, 2009] • Traffic is bursty[Benson et. al, 2009] • Traffic is unpredictable at 100 secs[Maltz et. al, 2009]

  6. How do we evaluate TE? • Data center traces • Cloud data center • Map-reduce app • ~1500 servers, • ~80 switches • 1 sec snapshots for 24 hours • Simulator • Input: • Traffic matrix, Topology ,Traffic Engineering • Output: • link utilization ….

  7. Draw Backs of Existing TE • STP does not use multiple path • ECMP does not adapt to burstiness

  8. Draw Backs of Proposed TE • Fat-Tree • Rehash flows • Local opt. != global opt. • VL2 • Coarse grained flow assignment • VL2 & Fat-Tree do not adapt to burstiness

  9. Draw Backs of Other Approaches • TEXCP, COPE ….OSPF link tuning Egress Ingress x • Unable to react fast enough (below 100 secs)

  10. Design Requirements for TE • Calculate paths & reconfigure network • Use all network paths • Use global view • Must react quickly • How predictable is traffic? ….

  11. Is Data Center Traffic Predictable? • YES! 33% of traffic is predictable

  12. How Long is Traffic Predictable? • TE must react in under 2 seconds

  13. MicroTE: Architecture Monitoring Component Routing Component …. Network Controller • Based on OpenFlow framework • Global view: • created by network controller • React to predictable traffic: • routing component tracks demand history • All N/W paths: • routing component creates routes using all paths

  14. Routing Component • Step 1: Determine predictable traffic • Step 2: Route along rarely utilized paths • Currently use LP • Faster Algorithm == future work • Step 3: Set ECMP for other traffic • Step 4: Return routes

  15. Routing Component New Global View Determine Predictable ToRs Now: Use LP Future: Use heuristic Calculate Network Routes for predictable traffic Set ECMP for unpredictable traffic Add Network View to History Yes No Significant Change in Routes? Return Nothing Return Calculated Routes

  16. Tradeoffs: Monitoring Component Routing Component Monitoring Component • Switch based • Low complexity • High overhead • End-host based • Low overhead • High complexity …. Network Controller

  17. Preliminary Evaluation • Outperforms ECMP • Slightly worse than optimal

  18. Conclusion • Study existing TE • Found them lacking (15-20%) • Study data center traffic • Discovered traffic predictability (33% for 2 secs) • Guidelines for ideal TE • MicroTE • Implementation of ideal TE • Preliminary evaluation

  19. Thank You • Questions?

More Related