Finishing Flows Quickly with Preemptive Scheduling. Speaker: Yu, Ye. The slides are based on the paper by Chi-Yao Hong et al., SIGCOMM 2012. Some figures originate from their presentation. Outline: Motivation, PDQ solution to flow scheduling, Evaluation, Discussion. Datacenter Networks.
Finishing Flows Quickly with Preemptive Scheduling
Speaker: Yu, Ye
The slides are based on the paper by Chi-Yao Hong et al., SIGCOMM 2012. Some figures originate from their presentation.
Outline
• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion
Latency
• 1 ms of latency = 1% reduction in sales
• 100 ms of latency = 0.2% fewer searches
• 2.2 seconds faster per page = 60 million more downloads per year
Low latency
• Low-latency datacenters?
• Finish flows earlier: the last flow to finish determines when the final result is ready.
• Meet flow deadlines: user-facing applications have latency goals.
Partition-aggregate model
[Figure: partition-aggregate tree; the associated component deadlines are shown in parentheses]
Today’s transport protocols
• TCP / RCP / ICTCP / DCTCP: fair sharing
• They divide link bandwidth equally among flows.
• They fail to minimize flow completion time.
What is TCP
• TCP slow start
• TCP fast recovery
• Additive increase
• Multiplicative decrease
[Figure: TCP slow start between Host A and Host B over time: one segment, then two, then four segments per RTT]
What is RCP
• Rate Control Protocol
• RCP is an adaptive algorithm that emulates processor sharing: a router divides outgoing link bandwidth equally among flows.
• Routers pick the rate based on queue size and aggregate traffic.
• A router assigns a single rate to all flows.
• Requires no per-flow state or per-packet calculation.
Fairness damages completion time
• Flows Fa, Fb, Fc arrive at the same time, with sizes 1, 2, 3 and deadlines 1, 4, 6.
• Fair sharing: mean flow completion time = (3+5+6)/3 = 4.67
• D3 with arrival order B, A, C: mean flow completion time = (2+4+6)/3 = 4
• Shortest Job First / Earliest Deadline First: mean flow completion time = (1+3+6)/3 = 3.33
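The fair-sharing and SJF numbers on this slide can be checked with a short simulation. This is a minimal sketch that assumes a single bottleneck link of capacity 1 and flows that all start at time 0; the function names are illustrative and do not come from the paper.

```python
def fair_share_completion_times(sizes, capacity=1.0):
    """Processor sharing: all unfinished flows split the link equally."""
    remaining = dict(enumerate(sizes))
    finish = [0.0] * len(sizes)
    now = 0.0
    while remaining:
        # The flow with the least data left finishes next.
        smallest = min(remaining.values())
        # Each flow gets capacity / n, so draining `smallest` takes this long.
        now += smallest * len(remaining) / capacity
        for i in list(remaining):
            remaining[i] -= smallest
            if remaining[i] <= 1e-12:
                finish[i] = now
                del remaining[i]
    return finish


def sjf_completion_times(sizes, capacity=1.0):
    """Shortest job first: flows use the whole link one at a time."""
    finish = [0.0] * len(sizes)
    now = 0.0
    for i in sorted(range(len(sizes)), key=lambda i: sizes[i]):
        now += sizes[i] / capacity
        finish[i] = now
    return finish


def mean(values):
    return sum(values) / len(values)


sizes = [1, 2, 3]  # Fa, Fb, Fc from the slide
fair = fair_share_completion_times(sizes)
sjf = sjf_completion_times(sizes)
print(fair, round(mean(fair), 2))  # [3.0, 5.0, 6.0] 4.67
print(sjf, round(mean(sjf), 2))    # [1.0, 3.0, 6.0] 3.33
```

Under fair sharing the smallest flow has to wait while it shares the link with two larger ones, which is where the extra completion time comes from.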
D3 depends on flow order
• D3 satisfies as many flows as possible, in order of arrival.
• Request rate = flow size / time until deadline.
• Requests are satisfied in arrival order.
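The order dependence can be illustrated with one round of first-come-first-reserved allocation. This is a simplified sketch, not D3 itself: it assumes a single link of capacity 1, grants each request greedily in arrival order, and ignores how D3 redistributes leftover bandwidth. The function name and tuple layout are hypothetical.

```python
def arrival_order_allocation(requests, capacity=1.0):
    """Grant rate requests greedily in arrival order.

    requests: list of (flow_id, size, time_until_deadline), in arrival order.
    Returns a dict mapping flow_id to the granted rate.
    """
    left = capacity
    granted = {}
    for flow_id, size, time_until_deadline in requests:
        desired = size / time_until_deadline  # request rate from the slide
        rate = min(desired, left)             # cannot grant more than is left
        granted[flow_id] = rate
        left -= rate
    return granted


# Flows from the earlier slide, arriving in the order B, A, C.
order_bac = [("B", 2, 4), ("A", 1, 1), ("C", 3, 6)]
granted = arrival_order_allocation(order_bac)
print(granted)  # {'B': 0.5, 'A': 0.5, 'C': 0.0}
```

Flow A needs the full link to meet its deadline of 1, but B has already reserved half of it, so A receives only 0.5 even though it has the most urgent deadline.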
The PDQ solution
• PDQ: Preemptive Distributed Quick
• Schedule flows by criticality.
• Criticality: the relative priority of a flow.
• Scheduling discipline
• Preemptive (dictionary definition): relating to the purchase of goods or shares by one person or party before the opportunity is offered to others.
PDQ’s scheduling disciplines
• EDF (earliest deadline first): optimal for meeting flow deadlines.
• SJF (shortest job first): optimal for mean flow completion time.
• EDF+SJF: gives preference to deadline flows.
• Policy-based: flow priorities are allocated manually.
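One way to express the combined EDF+SJF discipline is as a sort key. The sketch below assumes that flows without a deadline are treated as having an infinite deadline, and that ties are broken by flow ID; the `Flow` class and its field names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Flow:
    flow_id: str
    remaining_time: float        # expected remaining transmission time
    deadline: float = math.inf   # math.inf stands for "no deadline" (assumption)


def criticality_key(flow):
    """Smaller key = more critical: EDF first, then SJF, then flow ID."""
    return (flow.deadline, flow.remaining_time, flow.flow_id)


flows = [
    Flow("a", 5.0),
    Flow("b", 1.0),
    Flow("c", 9.0, deadline=20.0),
    Flow("d", 2.0, deadline=10.0),
]
order = [f.flow_id for f in sorted(flows, key=criticality_key)]
print(order)  # ['d', 'c', 'b', 'a']
```

Deadline flows `d` and `c` come first in deadline order, and the deadline-free flows `b` and `a` follow in shortest-first order.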
Challenges
• Decentralizing the scheduling discipline: there are more mice than elephants.
• Switching between flows seamlessly: it is hard to fully utilize the bandwidth.
• Prioritizing flows using FIFO tail-drop queues: FIFO queue length is limited.
PDQ protocol: sender (1)
• SYN / TERM packets are used for flow initialization and termination; they are resent after a timeout.
• The sender maintains state for in-flight packets:
• Current sending rate (Rs)
• ID of the switch that paused the flow (Ps)
• Deadline (Ds)
• Expected flow transmission time (Ts)
• Inter-probing time (Is)
• Measured RTT (RTTs)
PDQ protocol: sender (2)
• The sender sends packets at rate Rs.
• If Rs = 0, it periodically sends a probe packet as a heartbeat (a scheduling header without data).
• When an ACK arrives, it updates Rs (ACK info: accept/pause).
PDQ protocol: sender early termination
• The sender terminates a flow when it cannot meet its deadline, that is, whenever:
• The deadline has passed.
• Current time + remaining flow transmission time > deadline.
• The flow is paused, and current time + RTT > deadline.
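The three early-termination conditions translate directly into a predicate. This is a minimal sketch; the function and parameter names are illustrative, and all times are assumed to be in the same unit.

```python
def should_terminate(now, deadline, remaining_tx_time, paused, rtt):
    """Return True if the flow can no longer meet its deadline."""
    if now > deadline:
        return True   # the deadline has passed
    if now + remaining_tx_time > deadline:
        return True   # not enough time left, even at the current rate
    if paused and now + rtt > deadline:
        return True   # paused, and resuming takes at least one RTT
    return False


# A paused flow with 0.5 ms to go and a 1 ms RTT cannot finish in time.
print(should_terminate(now=9.5, deadline=10.0, remaining_tx_time=0.2,
                       paused=True, rtt=1.0))   # True
# The same flow, if not paused, is still on track.
print(should_terminate(now=9.5, deadline=10.0, remaining_tx_time=0.2,
                       paused=False, rtt=1.0))  # False
```

Terminating such flows early frees bandwidth for flows that can still meet their deadlines.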
PDQ protocol: switch
• Let the most critical flow complete as soon as possible.
• Critical flows preempt others to achieve the highest possible sending rate.
• 1) Maintain state about each flow.
• 2) Compute rate feedback:
• a) a flow controller decides which flows to send;
• b) a rate controller determines the rate.
PDQ protocol: switch state
• The switch maintains flow state on each link: <Rate, P, Deadline, expected Time, RTT>
• Pi: the switch that paused flow i.
• It stores only the 2K most critical flows, where K is the number of flows currently sending.
PDQ protocol: flow control
• Whenever a switch receives an ACK or data packet, it accepts or pauses the flow.
• Pause: inform the other switches that flow f is paused.
• A switch that receives an ACK marked "pause" for flow i removes i from its own state.
• Accept: calculate the available bandwidth.
• Other switches that receive an ACK marked "accept" for flow i update their state for i.
Algorithm: on receiving a data/ACK packet of flow f
• 1) If f is paused by another switch, remove it from my list.
• 2) If f is not in my list, try to add f to my list; if that is not possible, pause f.
• 3) Let W = min(availableBW, Rf). If W > 0, accept f; otherwise pause f.
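The three steps above can be sketched as a small flow table. Several details are assumptions made for illustration and are not stated on the slide: criticality is a sortable key where smaller means more critical, a full table evicts its least critical entry to make room for a more critical flow, available bandwidth is the link capacity minus the rates of more critical flows, and a flow paused elsewhere is simply forgotten (the method returns `None`). The class and method names are hypothetical.

```python
import math


class SwitchFlowTable:
    def __init__(self, capacity, max_flows):
        self.capacity = capacity    # link capacity
        self.max_flows = max_flows  # table size, assumed to be at least 1
        self.flows = {}             # flow_id -> {"key": ..., "rate": ...}

    def on_packet(self, flow_id, key, requested_rate, paused_by_other):
        # 1) Paused by another switch: drop it from my list.
        if paused_by_other:
            self.flows.pop(flow_id, None)
            return None
        # 2) Not in my list: try to add it, otherwise pause it.
        if flow_id not in self.flows:
            if len(self.flows) >= self.max_flows:
                worst = max(self.flows, key=lambda i: self.flows[i]["key"])
                if self.flows[worst]["key"] <= key:
                    return ("pause", 0.0)
                del self.flows[worst]
            self.flows[flow_id] = {"key": key, "rate": 0.0}
        else:
            self.flows[flow_id]["key"] = key
        # 3) Bandwidth not taken by more critical flows is available to f.
        used = sum(s["rate"] for s in self.flows.values() if s["key"] < key)
        w = min(self.capacity - used, requested_rate)
        if w > 0:
            self.flows[flow_id]["rate"] = w
            return ("accept", w)
        self.flows[flow_id]["rate"] = 0.0
        return ("pause", 0.0)


table = SwitchFlowTable(capacity=1.0, max_flows=2)
first = table.on_packet("long", (math.inf, 5.0), 1.0, False)
second = table.on_packet("short", (math.inf, 1.0), 1.0, False)
third = table.on_packet("long", (math.inf, 5.0), 1.0, False)
print(first, second, third)
# ('accept', 1.0) ('accept', 1.0) ('pause', 0.0)
```

In this usage the shorter flow is accepted at full rate as soon as it shows up, and the longer flow is paused the next time one of its packets passes through the switch. That is the preemption behaviour the protocol is named for.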
Flow control: optimizations
• Dampening: after a switch accepts a flow, it cannot accept other new flows for a short period of time.
• Early start
• Suppressed probing
Suppressed probing
• A sender may send probe packets too often.
• Flow info If: tells the sender of flow f to send a probe once every If × RTT.
• If is maintained by the switches, which compute it from the average finish time of all flows and the rank of f.
PDQ protocol: rate control
• Controls the total sending rate of the switch's accepted flows.
• Maintains a variable C that bounds the rate the switch hands out.
• Reserves bandwidth for early-started flows.
• C = Full_BW - Queue_size/(K*RTT)
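The formula for C can be written as a one-line function. This is a sketch under stated assumptions: bandwidth is in bytes per second, queue size in bytes, RTT in seconds, K is passed in as given on the slide, and the clamp at zero is added here so the budget never goes negative. The function name is hypothetical.

```python
import math


def rate_budget(full_bw, queue_size, k, rtt):
    """C = Full_BW - Queue_size / (K * RTT), clamped at zero (assumption)."""
    return max(0.0, full_bw - queue_size / (k * rtt))


# 1 Gbit/s link (125e6 bytes/s), 15 kB queued, K = 2, RTT = 250 microseconds.
c = rate_budget(full_bw=125e6, queue_size=15_000, k=2, rtt=250e-6)
print(math.isclose(c, 95e6))  # True: about 95 MB/s remain to hand out

# An empty queue leaves the full link bandwidth available.
print(rate_budget(125e6, 0, 2, 250e-6) == 125e6)  # True
```

The subtracted term is the rate needed to drain the current queue within K RTTs, so a growing queue automatically shrinks the rate the switch allocates.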
Evaluation setting: traffic
• Deadline-constrained flows:
• Time sensitive: ~20 ms
• Short messages: 2 KB to 200 KB
• Goal: application throughput = percentage of flows that meet their deadlines
• Deadline-unconstrained flows:
• 100 KB to 1000 KB
• Goal: average flow completion time
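Both evaluation metrics are simple to compute from per-flow results. The sketch below uses hypothetical function names and takes a list of (finish time, deadline) pairs; the sample values reuse the three-flow example from the earlier slide (deadlines 1, 4, 6).

```python
def application_throughput(results):
    """Percentage of flows that finish by their deadline.

    results: list of (finish_time, deadline) pairs.
    """
    if not results:
        return 0.0
    met = sum(1 for finish, deadline in results if finish <= deadline)
    return 100.0 * met / len(results)


def mean_completion_time(finish_times):
    """Average flow completion time, for deadline-unconstrained flows."""
    return sum(finish_times) / len(finish_times)


edf_results = [(1, 1), (3, 4), (6, 6)]   # EDF/SJF finish times 1, 3, 6
fair_results = [(3, 1), (5, 4), (6, 6)]  # fair-sharing finish times 3, 5, 6
print(application_throughput(edf_results))             # 100.0
print(round(application_throughput(fair_results), 1))  # 33.3
print(round(mean_completion_time([1, 3, 6]), 2))       # 3.33
```

With the same three flows, fair sharing meets only one deadline out of three, while EDF meets all of them.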
Query aggregation
• All senders initiate flows at the same time to the same receiver.
• Optimal: one scheduler controls all transmissions with no delay.
• To maximize application throughput: sort by EDF, and then use dynamic programming.
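The slide does not spell out the dynamic program. As a stand-in, the sketch below uses the classic Moore-Hodgson greedy, which also starts from EDF order and maximizes the number of on-time jobs on a single resource; it is a swapped-in technique and may differ from what the authors used. It assumes a single link, flows that all start at time 0, and a hypothetical (flow_id, size, deadline) tuple layout.

```python
import heapq


def max_on_time_flows(flows, capacity=1.0):
    """Return the IDs of a largest set of flows that can all meet deadlines.

    flows: list of (flow_id, size, deadline), all released at time 0.
    """
    kept = []      # max-heap on transmission time, stored as (-tx_time, id)
    elapsed = 0.0  # total transmission time of the flows currently kept
    for flow_id, size, deadline in sorted(flows, key=lambda f: f[2]):
        tx_time = size / capacity
        heapq.heappush(kept, (-tx_time, flow_id))
        elapsed += tx_time
        if elapsed > deadline:
            # Too late: discard the longest flow kept so far.
            neg_tx_time, _ = heapq.heappop(kept)
            elapsed += neg_tx_time  # neg_tx_time is negative
    return sorted(flow_id for _, flow_id in kept)


print(max_on_time_flows([("a", 1, 1), ("b", 2, 4), ("c", 3, 6)]))
# ['a', 'b', 'c']
print(max_on_time_flows([("x", 2, 2), ("y", 1, 2.5), ("z", 1, 3)]))
# ['y', 'z']
```

In the second call, keeping `x` would block both shorter flows, so dropping the longest flow lets two flows meet their deadlines instead of one.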
Seamless flow switching
• Five flows (~1 MB) arrive at the same time.
Other concerns
• Does it require rewriting applications?
• To the application, a paused PDQ flow looks like a slow TCP connection.
• The transport connection stays open.
• Deployment?
• Hosts: PDQ sits between the IP and transport layers.
• Switches: modify hardware/software, O(k)
Thank you! Q&A