
Finishing Flows Quickly with Preemptive Scheduling

Finishing Flows Quickly with Preemptive Scheduling. Speaker: Yu, Ye. The slides are based on the paper by Chi-Yao Hong et al., SIGCOMM 2012. Some figures originate from their presentation. Outline: Motivation, PDQ solution to flow scheduling, Evaluation, Discussion. Datacenter Networks.


Presentation Transcript


  1. Finishing Flows Quickly with Preemptive Scheduling • Speaker: Yu, Ye • The slides are based on the paper by Chi-Yao Hong et al., SIGCOMM 2012. Some figures originate from their presentation.

  2. outline • Motivation • PDQ solution to flow scheduling • Evaluation • Discussion

  3. Datacenter Networks

  4. Latency • 1 ms latency = 1% reduction in sales • 100 ms latency = 0.2% fewer searches • 2.2 seconds faster per page = 60 M more downloads per year

  5. Low latency • Low-latency datacenters? • Finish flows earlier • The last flow to finish determines the final result • Meet flow deadlines • User-facing applications have latency goals

  6. Partition-aggregate model • Associated component deadlines are shown in parentheses.

  7. Today’s transport protocols • TCP / RCP / ICTCP / DCTCP: • Fair sharing • Divide link bandwidth equally. • Fail to reduce flow completion time.

  8. What is TCP • TCP slow start (figure: Host A sends one, then two, then four segments to Host B, one batch per RTT) • TCP fast recovery • Additive increase • Multiplicative decrease

  9. What is RCP • Rate Control Protocol • RCP is an adaptive algorithm that emulates Processor Sharing: a router divides outgoing link bandwidth equally • The rate is picked by the routers based on queue size and aggregate traffic • A router assigns a single rate to all flows • Requires no per-flow state or per-packet calculation

  10. Fairness damages completion time • Flows Fa, Fb, Fc arrive at the same time, with size = 1, 2, 3 and deadline = 1, 4, 6 • Fair share: FC time = (3+5+6)/3 = 4.67 • D3 for order B, A, C: FC time = (2+4+6)/3 = 4 • Shortest Job First / Earliest Deadline First: FC time = (1+3+6)/3 = 3.33
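The fair-share and SJF averages on slide 10 can be reproduced with a short simulation. This is a minimal sketch, assuming a single bottleneck link of capacity 1 and flows that all start at time 0; the function names are illustrative and do not come from the paper.

```python
def fair_share_completion(sizes, capacity=1.0):
    """Completion time of each flow when active flows split the link equally."""
    remaining = dict(enumerate(sizes))
    finish = [0.0] * len(sizes)
    now = 0.0
    while remaining:
        rate = capacity / len(remaining)      # equal share per active flow
        smallest = min(remaining.values())    # work left in the next flow to finish
        now += smallest / rate
        for i in list(remaining):
            remaining[i] -= smallest
            if remaining[i] <= 1e-12:
                finish[i] = now
                del remaining[i]
    return finish


def serial_completion(sizes, order, capacity=1.0):
    """Completion time of each flow when flows run one at a time in `order`."""
    finish = [0.0] * len(sizes)
    now = 0.0
    for i in order:
        now += sizes[i] / capacity
        finish[i] = now
    return finish


def mean(values):
    return sum(values) / len(values)


sizes = [1, 2, 3]                             # Fa, Fb, Fc from slide 10
fair = fair_share_completion(sizes)
sjf_order = sorted(range(len(sizes)), key=lambda i: sizes[i])
sjf = serial_completion(sizes, sjf_order)
print(round(mean(fair), 2), round(mean(sjf), 2))
```

With these inputs the fair-share schedule finishes the flows at times 3, 5 and 6, while running them shortest-first finishes them at 1, 3 and 6, which matches the two averages on the slide.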

  11. D3 depends on flow order • D3 satisfies as many flows as possible in the order of their arrival. • Request rate = flow size / time until deadline. • Requests are satisfied in arrival order.
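The order dependence can be illustrated with the three flows from slide 10. The sketch below models a single allocation round only, assuming a link of capacity 1 and a greedy first-come-first-served grant; it ignores how rates are renegotiated over time, and `d3_allocate` is a hypothetical name.

```python
def d3_allocate(flows, capacity=1.0):
    """Grant rate requests in arrival order until the link is full.

    flows: list of (name, size, time_until_deadline) in arrival order.
    Returns {name: (requested_rate, granted_rate)}.
    """
    available = capacity
    result = {}
    for name, size, time_left in flows:
        requested = size / time_left          # request rate = size / time until deadline
        granted = min(requested, available)   # earlier arrivals are served first
        available -= granted
        result[name] = (requested, granted)
    return result


# Arrival order B, A, C with (size, deadline) = (2, 4), (1, 1), (3, 6).
alloc = d3_allocate([("B", 2, 4), ("A", 1, 1), ("C", 3, 6)])
for name, (requested, granted) in alloc.items():
    print(name, requested, granted)
```

In this round B reserves half the link before A is considered, so A is granted less than the full rate it needs to meet its deadline of 1, even though A is the most urgent flow.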

  12. The PDQ Solution • Preemptive Distributed Quick • Schedule by flow criticality. • Criticality: relative priority of flows. • Scheduling discipline. • Preemptive: relating to the purchase of goods or shares by one person or party before the opportunity is offered to others.

  13. PDQ’s scheduling disciplines • EDF: earliest deadline first • Optimal for flow deadlines. • SJF: shortest job first • Optimal for mean flow finish time. • EDF+SJF: • Give preference to deadline flows. • Policy based: • Manually allocate flow priorities.

  14. Challenges • Decentralizing the scheduling discipline • More mice than elephants. • Switching between flows seamlessly • Hard to fully utilize bandwidth • Prioritizing flows using FIFO tail-drop queues • FIFO queue length is limited

  15. outline • Motivation • PDQ solution to flow scheduling • Evaluation • Discussion

  16. PDQ protocol - overview

  17. PDQ protocol-PDQ sender-1 • SYN / TERM packets for initialization and termination. • Resend after timeout. • The sender maintains info for in-flight packets: • Current sending rate (Rs) • ID of the switch that paused it (Ps) • Deadline (Ds) • Expected flow transmission time (Ts) • Inter-probing time (Is) • Measured RTT (RTTs)

  18. PDQ protocol-sender-2 • The sender sends packets at rate Rs • If Rs = 0, it periodically sends a probe packet, like a heartbeat (scheduling header without data) • When an ACK arrives, update Rs (ACK info: accept/pause)

  19. PDQ protocol-sender-early-termination • The sender TERMINATES a flow when it cannot meet its deadline, that is, whenever: • The deadline has passed. • Remaining flow transmission time + current time > deadline • The flow is paused, and current time + RTT > deadline
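The three early-termination conditions on slide 19 translate directly into a predicate. This is a minimal sketch, assuming absolute timestamps in seconds and treating a sending rate of 0 as "paused"; the `SenderState` class and its field names are illustrative, chosen to mirror the sender variables Ds, Ts, RTTs and Rs.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    deadline: float          # Ds: absolute deadline
    expected_tx_time: float  # Ts: remaining transmission time
    rtt: float               # RTTs: measured round-trip time
    rate: float              # Rs: current sending rate, 0 means paused


def should_terminate(state, now):
    """Return True if the flow can no longer meet its deadline."""
    if now > state.deadline:                            # deadline has passed
        return True
    if now + state.expected_tx_time > state.deadline:   # cannot finish in time
        return True
    if state.rate == 0 and now + state.rtt > state.deadline:
        return True                                     # paused, no time left to resume
    return False


on_track = SenderState(deadline=0.020, expected_tx_time=0.005, rtt=0.0002, rate=1e9)
too_slow = SenderState(deadline=0.020, expected_tx_time=0.015, rtt=0.0002, rate=1e9)
print(should_terminate(on_track, now=0.010), should_terminate(too_slow, now=0.010))
```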

  20. PDQ protocol-switch • Let the most critical flow complete as soon as possible. • Critical flows preempt others to achieve the highest possible sending rate • 1) Maintain state about each flow • 2) Compute rate feedback • a) Flow controller decides which flows to send • b) Rate controller determines the rate

  21. PDQ protocol-switch-state • Maintains flow states on each link • <Rate, P, Deadline, expected Time, RTT> • Pi: ID of the switch that paused flow i • Stores the 2K most critical flows, where K is the number of currently sending flows.

  22. PDQ protocol-flow control • Whenever a switch receives an ACK or data packet, it ACCEPTs or PAUSEs the flow • Pause: inform other switches that flow f is paused. • A switch that receives ACK-Pause for flow i removes i from its own state • Accept: calculate available bandwidth • Another switch that receives ACK-Accept for flow i updates its state for i

  23. Algorithm: on receiving data/ACK for flow f • 1) If f is paused by another switch, remove it from my list. • 2) If f is not in my list: • Try to add f to my list; if that is not possible, pause f • 3) If w = min(availableBW, Rf) > 0: • Accept f • Otherwise pause f
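The three steps on slide 23 can be sketched as a small per-link handler. Several details below are assumptions made for illustration and are not stated on the slides: criticality is ordered by deadline, then expected time, then flow ID; the list holds at most `max_flows` entries and evicts its least critical entry to make room; and the bandwidth available to a flow is the link capacity minus the rates of more critical flows in the list. `FlowHeader` and `SwitchLink` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class FlowHeader:
    flow_id: str
    rate: float                      # Rf: rate the sender asks for
    deadline: float                  # earlier deadline = more critical (EDF)
    expected_time: float             # tie-breaker: shorter = more critical (SJF)
    paused_by: Optional[str] = None  # ID of the switch that paused the flow


class SwitchLink:
    def __init__(self, switch_id, capacity, max_flows):
        self.switch_id = switch_id
        self.capacity = capacity
        self.max_flows = max_flows   # assumed to be at least 1
        self.flows = {}              # flow_id -> last header stored for the flow

    @staticmethod
    def criticality(h):
        return (h.deadline, h.expected_time, h.flow_id)  # smaller sorts first

    def _pause(self, h):
        return replace(h, rate=0.0, paused_by=self.switch_id)

    def on_packet(self, h):
        # 1) Paused by another switch: forget the flow here.
        if h.paused_by is not None and h.paused_by != self.switch_id:
            self.flows.pop(h.flow_id, None)
            return h
        # 2) Unknown flow and the list is full: evict the least critical
        #    entry, or pause f if f itself is the least critical.
        if h.flow_id not in self.flows and len(self.flows) >= self.max_flows:
            worst = max(self.flows.values(), key=self.criticality)
            if self.criticality(h) > self.criticality(worst):
                return self._pause(h)
            del self.flows[worst.flow_id]
        # 3) Accept with w = min(availableBW, Rf) if positive, else pause.
        used = sum(o.rate for o in self.flows.values()
                   if o.flow_id != h.flow_id
                   and self.criticality(o) < self.criticality(h))
        w = min(self.capacity - used, h.rate)
        out = replace(h, rate=w, paused_by=None) if w > 0 else self._pause(h)
        self.flows[h.flow_id] = out
        return out


link = SwitchLink("s1", capacity=1.0, max_flows=2)
a = link.on_packet(FlowHeader("A", 1.0, deadline=10, expected_time=1))
b = link.on_packet(FlowHeader("B", 1.0, deadline=20, expected_time=1))
c = link.on_packet(FlowHeader("C", 1.0, deadline=5, expected_time=1))
a_again = link.on_packet(FlowHeader("A", 1.0, deadline=10, expected_time=1))
```

In this run A is accepted at full rate, B is paused because A already uses the link, and the more critical C is accepted on arrival. A is then paused when its next packet reaches the switch, which is the preemption behaviour the slides describe.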

  24. Flow control: 3 optimizations • Dampening • If a switch has accepted a flow, then for a short period of time it cannot accept other new flows. • Early starting • Suppressed probing

  25. Early start: seamless schedule

  26. Suppressed probing • A sender may send probe packets too often. • Flow info If: tells the sender of f to send a probe every If*RTT. • If is maintained by switches, calculated from the average finish time of all flows and the rank of f

  27. PDQ protocol-rate control • Controls the total sending rate of the accepted flows. • Maintains variable C to compute the range of rates. • Reserves BW for early-started flows • C = Full_BW - Queue_size/(K*RTT)
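The formula for C on slide 27 can be evaluated numerically to see how a standing queue shrinks the rate budget. This is a minimal sketch that takes K as a plain parameter, since the slide does not define it at this point. The clamp at zero and the units (bytes and seconds) are assumptions added here, and `rate_budget` is a hypothetical name.

```python
def rate_budget(full_bw, queue_size, k, rtt):
    """C = Full_BW - Queue_size / (K * RTT), clamped at zero (assumption).

    full_bw: link bandwidth in bytes per second
    queue_size: current queue occupancy in bytes
    k: the K from the slide, taken here as a plain parameter
    rtt: round-trip time in seconds
    """
    return max(0.0, full_bw - queue_size / (k * rtt))


# Illustrative numbers only: 1 Gbps link, 30 KB queued, K = 2, RTT = 200 us.
c_empty = rate_budget(125_000_000, 0, k=2, rtt=0.0002)
c_queued = rate_budget(125_000_000, 30_000, k=2, rtt=0.0002)
print(c_empty, c_queued)
```

A larger queue lowers C, so the switch hands out less rate until the queue drains. That is how rate control leaves room for early-started flows.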

  28. outline • Motivation • PDQ solution to flow scheduling • Evaluation • Discussion

  29. Evaluation setting: traffic • Deadline-constrained flows: • Time sensitive: ~20 ms • Short messages: 2 KB~200 KB • Goal: application throughput = percentage of flows that meet their deadlines • Deadline-unconstrained flows: • 100~1000 KB • Goal: average flow completion time

  30. Evaluation SETTING: topology

  31. Query aggregation • All senders start transmitting at the same time to the same receiver. • Optimal: one scheduler controls all transmissions with no delay. • To maximize application throughput: sort by EDF, and then use dynamic programming

  32. The Deadline-unconstrained case

  33. Seamless flow switching • Five flows (~1 MB) arrive at the same time

  34. An elephant flow and 50 short flows starting at 10 ms

  35. Impact of network scale

  36. outline • Motivation • PDQ solution to flow scheduling • Evaluation • Discussion

  37. Fairness?

  38. Other concerns • Does it require rewriting applications? • A paused PDQ flow looks like a slow TCP connection. • The transport connection stays open. • Deployment? • Hosts: between the IP and transport layers • Switch: modify hardware/software, O(k)

  39. Thank you! Q&A
