150 likes | 307 Views
Enabling Flow-level Latency Measurements across Routers in Data Centers. Parmjeet Singh, Myungjin Lee Sagar Kumar, Ramana Rao Kompella. Latency-critical applications in data centers. Guaranteeing low end-to-end latency is important Web search (e.g., Google’s instant search service)
E N D
Enabling Flow-level Latency Measurements across Routers in Data Centers Parmjeet Singh, MyungjinLee SagarKumar, RamanaRaoKompella
Latency-critical applications in data centers • Guaranteeing low end-to-end latency is important • Web search (e.g., Google’s instant search service) • Retail advertising • Recommendation systems • High-frequency trading in financial data centers • Operators want to troubleshoot latency anomalies • End-host latencies can be monitored locally • Detection, diagnosis and localization through a network: no native support of latency measurements in a router/switch
Prior solutions • Lossy Difference Aggregator (LDA) • Kompella et al. [SIGCOMM ’09] • Aggregate latency statistics • Reference Latency Interpolation (RLI) • Lee et al. [SIGCOMM ’10] • Per-flow latency measurements More suitable due to more fine-grained measurements
Deployment scenario of RLI • Upgrading all switches/routers in a data center network • Pros • Provide finest granularity of latency anomaly localization • Cons • Significant deployment cost • Possible downtime of entire production data centers • In this work, we are considering partial deployment of RLI • Our approach: RLI across Routers (RLIR)
Overview of RLI architecture • Goal • Latency statistics on a per-flow basis between interfaces • Problem setting • No storing timestamp for each packet at ingress and egress due to high storage and communication cost • Regular packets do not carry timestamps Ingress I Router Egress E
Overview of RLI architecture Linear interpolation line • Premise of RLI: delay locality • Approach 1) The injector sends reference packets regularly 2) Reference packet carries ingress timestamp 3) Linear interpolation: compute per-packet latency estimates at the latency estimator 4) Per-flow estimates by aggregating per-packet estimates L Reference Packet Injector Ingress I Egress E 2 1 2 1 Delay 1 L R R 2 Interpolated delay Latency Estimator Time
Full vs. Partial deployment • Full deployment: 16 RLI sender-receiver pairs • Partial deployment: 4 RLI senders + 2 RLI receivers • 81.25 % deployment cost reduction Switch 1 Switch 3 Switch 5 Switch 2 Switch 4 Switch 6 RLI Sender (Reference Packet Injector) RLI Receiver (Latency Estimator)
Case 1: Presence of cross traffic • Issue: Inaccurate link utilization estimation at the sender leads to high reference packet injection rate • Approach • Not actively addressing the issue • Evaluation shows no much impact on packet loss rate increase • Details in the paper Switch 1 Switch 3 Switch 5 Switch 2 Switch 4 Switch 6 Link utilization estimation on Switch 1 Cross Traffic Bottleneck Link RLI Sender (Reference Packet Injector) RLI Receiver (Latency Estimator)
Case 2: RLI Sender side • Issue: Traffic may take different routes at an intermediate switch • Approach: Sender sends reference packets to all receivers Switch 1 Switch 3 Switch 5 Switch 2 Switch 4 Switch 6 RLI Sender (Reference Packet Injector) RLI Receiver (Latency Estimator)
Case 3: RLI Receiver side • Issue: Hard to associate reference packets and regular packets that traversed the same path • Approaches • Packet marking: requires native support from routers • Reverse ECMP computation: ‘reverse’ engineer intermediate routes using ECMP hash function • IP prefix matching at limited situation Switch 1 Switch 3 Switch 5 Switch 2 Switch 4 Switch 6 RLI Sender (Reference Packet Injector) RLI Receiver (Latency Estimator)
Deployment example in fat-tree topology RLI Sender (Reference Packet Injector) Reverse ECMP computation / IP prefix matching IP prefix matching RLI Receiver (Latency Estimator)
Evaluation • Simulation setup • Trace: regular traffic (22.4M pkts) + cross traffic (70M pkts) • Simulator • Results • Accuracy of per-flow latency estimates 10% / 1% injection rate RLI Sender RLI Receiver Regular Traffic Reference packets Packet Trace Traffic Divider Switch1 Switch2 Cross Traffic Injector Cross Traffic
Accuracy of per-flow latency estimates Bottleneck link utilization: 93% 67% 10% injection 10% injection 1% injection 1% injection CDF Relative error 1.2% 4.5% 18% 31%
Summary • Low latency applications in data centers • Localization of latency anomaly is important • RLI provides flow-level latency statistics, but full deployment (i.e., all routers/switches) cost is expensive • Proposed a solution enabling partial deployment of RLI • No too much loss in localization granularity (i.e., every other router)