
About the paper

“ElasticTree: Saving energy in data center networks” by Brandon Heller, Seetharaman, Mahadevan, Yiakoumis, Sharma, Banerjee, McKeown; presented by Nicoara Talpes, Kenneth Wade. Published in April 2010 at Networked Systems Design & Implementation (NSDI).


Presentation Transcript


  1. “ElasticTree: Saving energy in data center networks” by Brandon Heller, Seetharaman, Mahadevan, Yiakoumis, Sharma, Banerjee, McKeown; presented by Nicoara Talpes, Kenneth Wade

  2. About the paper Published in April 2010 at Networked Systems Design & Implementation (NSDI)

  3. Motivation 1 • Efforts so far have been spent on servers and cooling; our focus is on the network (10-20% of total power) • Environmental Protection Agency estimate: in 2011, networks in data centers will consume 12 billion kWh • This is 6,542,640 tons of CO2 [1]

  4. Motivation 2 Goal: energy proportionality

  5. Motivation 3 • Cannot get to the green line using the hardware • The common network goal is to balance traffic evenly among all links, so power is constant regardless of load • ‘Data centers provisioned to run at peak workload, below capacity most of the time’ • Today’s network elements are not energy proportional: switches and transceivers waste power at low loads • Switches consume 70% of full power when idle

  6. Existing networks 2N: fault tolerant

  7. Wasted power • Servers draw constant power independent of traffic • Time-varying demands, provisioned for peak

  8. ElasticTree • Goal: build a network that is energy proportional even if its switches are not • Achieved through traffic management and control of switches: simply turning a switch on consumes most of its power, and going from zero to full traffic accounts for only 8%, so turning a switch off saves the most power • Careful: minimize effects on performance and fault tolerance • Has to work at scale to make an impact • With ET, we do the opposite of balanced networks: use only a few links, and so draw less power, at low loads (e.g., the middle of the night)

  9. Existing networks: scale-out Ex: Fat-tree; incremental degradation

  10. Existing networks: scale-out

  11. Implementation 1 • Optimizer: finds the minimum-power network subset that satisfies current traffic. Inputs: topology, traffic matrix, switch power models, fault tolerance constraints. Output: new topology • Continually recomputes the subset as traffic changes • Power control: toggles power states of ports, linecards, and entire switches • Routing: chooses paths for all flows and pushes routes into the network

  12. Implementation

  13. Optimizer methods: formal model • Outputs the subset and flow assignments • Evaluates the solution quality of the other optimizers • Optimal • (con) Solution time scales as (number of hosts)^3.5
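
To make the scaling drawback on the slide above concrete, here is a minimal sketch that assumes solve time grows exactly as (number of hosts)^3.5 with no constant overhead. The function name and the baseline size are illustrative choices, not taken from the paper; the host count k^3/4 is the standard size of a k-ary fat tree.

```python
# Illustrative only: assumes solve time is proportional to hosts ** 3.5,
# the exponent quoted on the slide. Constant factors are ignored.
SCALING_EXPONENT = 3.5


def relative_solve_time(hosts: int, baseline_hosts: int) -> float:
    """Return how many times longer a solve takes than at the baseline size."""
    return (hosts / baseline_hosts) ** SCALING_EXPONENT


if __name__ == "__main__":
    baseline = 16  # hosts in a k=4 fat tree
    for k in (4, 8, 16, 32):
        hosts = k ** 3 // 4  # a k-ary fat tree supports k^3/4 hosts
        factor = relative_solve_time(hosts, baseline)
        print(f"k={k:2d} hosts={hosts:5d} relative solve time={factor:.3g}")
```

Under this assumption, doubling the number of hosts multiplies solve time by about 11, which is why the formal model serves mainly as a quality baseline for the faster optimizers.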

  14. Optimizer methods: formal model • Doesn’t scale

  15. Optimizer methods: Greedy-Bin packing

  16. Optimizer methods: Greedy-Bin packing

  17. Optimizer methods: Greedy-Bin packing

  18. Optimizer methods: Greedy-Bin packing

  19. Power savings: data centers • 30% of traffic inside the data center, greedy bin-packing optimizer, scaled: reductions of 25-60%. Energy elastic!

  20. Need for redundancy • Nice property: cost drops as network size increases, since the MST (minimum spanning tree) is a smaller fraction of the network

  21. Optimizer methods: Greedy-Bin packing • Scales better, but an optimal solution is not guaranteed and not all flows can always be assigned • Used to understand power savings for larger topologies
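
The trade-offs on the slide above can be illustrated with a minimal first-fit sketch. It assumes each flow comes with a list of candidate paths ordered by preference (for example, leftmost first) and takes the first path whose links all have enough spare capacity. The names `greedy_assign`, `LEFT`, and `RIGHT`, and the toy two-path topology, are hypothetical and not taken from the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]
Path = List[Link]


def greedy_assign(
    flows: List[Tuple[str, float]],
    candidate_paths: Dict[str, List[Path]],
    capacity: Dict[Link, float],
) -> Tuple[Dict[str, Path], List[str], Set[Link]]:
    """First-fit assignment: each flow takes its first path with spare capacity."""
    residual = dict(capacity)  # remaining capacity per link
    assignment: Dict[str, Path] = {}
    unassigned: List[str] = []
    for flow_id, demand in flows:
        for path in candidate_paths[flow_id]:
            if all(residual[link] >= demand for link in path):
                for link in path:
                    residual[link] -= demand
                assignment[flow_id] = path
                break
        else:
            # No candidate path had room: the greedy pass gives up on this flow.
            unassigned.append(flow_id)
    # Only links carrying at least one flow need to stay powered on.
    active_links = {link for path in assignment.values() for link in path}
    return assignment, unassigned, active_links


# Toy topology (assumed): two edge switches joined through a1 (left) or a2 (right).
LEFT: Path = [("e1", "a1"), ("a1", "e2")]
RIGHT: Path = [("e1", "a2"), ("a2", "e2")]
capacity = {link: 1.0 for link in LEFT + RIGHT}
flows = [("f1", 0.4), ("f2", 0.5), ("f3", 0.3), ("f4", 0.9)]
paths = {name: [LEFT, RIGHT] for name, _ in flows}

if __name__ == "__main__":
    assignment, unassigned, active = greedy_assign(flows, paths, capacity)
    print("assigned:", sorted(assignment))
    print("unassigned:", unassigned)
    print("active links:", len(active))
```

In this toy run the last flow finds no path with enough room, which mirrors the slide's caveat that not all flows can be assigned. With only the first two flows, everything packs onto the left path and the right path could be switched off.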

  22. Optimizer methods: topology aware heuristic • Quickly finds subsets in networks with regular structure (fat tree) • Requires less information: only needs the cross-layer totals, not the full traffic matrix • Routing independent: does not compute the set of flow routes; (con) assumes divisible flows • Can be applied with any fat-tree routing algorithm (PortLand) and to any full-bisection-bandwidth topology with any number of layers (e.g., 1 Gb at the edge, 10 Gb in the core) • Simple additions to this lead to quality solutions in a fraction of the time
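
As a rough sketch of how cross-layer totals alone can size the active subset, the function below computes how many uplinks a switch keeps powered, assuming divisible flows as the slide notes. The formula, the `utilization` safety factor, and the `min_uplinks` floor are assumptions made for illustration; the slide does not give the authors' exact equations.

```python
import math


def active_uplinks(
    cross_layer_gbps: float,
    link_rate_gbps: float,
    max_uplinks: int,
    utilization: float = 1.0,
    min_uplinks: int = 1,
) -> int:
    """Number of uplinks to keep on for a given total of cross-layer traffic.

    utilization < 1.0 leaves headroom on each link (assumed safety margin);
    min_uplinks > 1 keeps spare links on for redundancy (assumed knob).
    """
    needed = math.ceil(cross_layer_gbps / (utilization * link_rate_gbps))
    return min(max_uplinks, max(min_uplinks, needed))


if __name__ == "__main__":
    # Edge switches in a k=4 fat tree have k/2 = 2 uplinks of 1 Gb/s each.
    for demand in (0.2, 0.9, 1.5):
        plain = active_uplinks(demand, 1.0, max_uplinks=2)
        padded = active_uplinks(demand, 1.0, max_uplinks=2, utilization=0.8)
        print(f"demand={demand} Gb/s uplinks={plain} with 80% cap={padded}")
```

Because only per-switch totals are needed, a calculation of this kind does not depend on which routing algorithm later places the individual flows.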

  23. Optimizer methods: topology aware heuristic

  24. Optimizer comparison

  25. Optimizer comparison • Formal model intractable for large topologies greedy • Unoptimized single-core Python implementation: 20 s

  26. Control software • ET requires traffic data and control over flow paths • We use OpenFlow to obtain traffic data and to push application-level flow routes to switches

  27. Implementation

  28. Implementation 2 • OpenFlow: measures the traffic matrix, controls flow routing • OpenFlow is vendor neutral, so there is no need to change code when using HP/ECR switches • Experiments show savings of 25-40% are feasible: 1 billion kWh in annual savings, plus a proportional reduction in cooling costs

  29. Experiments • Topologies: two 3-layer k=4 fat trees; one 3-layer k=6 fat tree • Measurements: NetFPGA traffic generators, each emulating four servers • Latency monitor
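
For a sense of the size of these testbeds, the sketch below counts switches and hosts in a 3-layer k-ary fat tree. It assumes the standard construction (k pods, each with k/2 edge and k/2 aggregation switches, plus (k/2)^2 core switches); the function name `fat_tree_size` is illustrative.

```python
from typing import Dict


def fat_tree_size(k: int) -> Dict[str, int]:
    """Component counts for a standard 3-layer k-ary fat tree (k must be even)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    sizes = {
        "pods": k,
        "edge": k * half,          # k/2 edge switches per pod
        "aggregation": k * half,   # k/2 aggregation switches per pod
        "core": half * half,
        "hosts": k * half * half,  # k/2 hosts per edge switch
    }
    sizes["switches"] = sizes["edge"] + sizes["aggregation"] + sizes["core"]
    return sizes


if __name__ == "__main__":
    for k in (4, 6):
        print(k, fat_tree_size(k))
```

Under that construction, k=4 gives 20 switches and 16 hosts, and k=6 gives 45 switches and 54 hosts.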

  30. Experiments

  31. 3 Power savings results • Formal method: savings depend on network utilization • Traffic all inside; near traffic at low utilization: 60% reduction

  32. Power savings: sine-wave demand • Reduction up to 64%

  33. Robustness: safety margins • MST disadvantages: gives up path redundancy and fault tolerance • Added cost of fault tolerance is insignificant for large networks

  34. Performance • Uniform traffic shows spikes, large delays for packets

  35. Safety margins • safety margins defer points of loss, degrade latency • Margins are adjustable

  36. Topology aware optimizer • Better robustness through tweaks: setting the link rate utilization in the equations to absorb overloads and reduce delay • Setting the switch degree to add redundancy and improve fault tolerance • Solves constraints: response times dominated by switch boot time (30 sec - 3 min) • Fault tolerance: move the topology-aware optimizer to a separate host to prevent crashes from affecting routing • Traffic prediction experiments encouraging; can use greedy algorithm

  37. 7 Discussion • During low to mid-utilization, it respects the constraints while lowering the costs

  38. References • Some images borrowed from the author’s presentation, available at his website or online at http://www.usenix.org/events/nsdi10/tech/ • [1] http://www.nef.org.uk/greencompany/co2calculator.htm
