DiffProbe: Detecting ISP Service Discrimination

Partha Kanuparthy, Constantine Dovrolis

Net Neutrality • Recent FCC-ISP debates • Comcast throttling dispute, etc. • FCC broadband mapping framework • Tools to estimate performance • $350m stimulus funds

What is Service Discrimination? • ISPs can classify certain apps as low-priority and service them accordingly • Discrimination can manifest as (relatively): • high delays • high loss rates • ISPs can also do shaping: leads to low throughput (=> both delay and loss) • ShaperProbe: first step

Goals • Problem: Is an application's traffic being classified low-priority by an ISP? • Is the ISP doing loss or delay discrimination, or both? • Can we identify the scheduler type? • Solution: compare performance of normal and application traffic sent simultaneously • Identifying discrimination is not easy: • Congestion events can be short-lived (µs-ms scales) • Bad idea: compare delays/loss rates from different times • Customer may see the same performance if there is no cross-traffic • Bad idea: call this no-discrimination [Figure: DiffProbe server]

Delay Discrimination: Practice • Non-discriminatory schedulers (single queue): • First-Come-First-Serve (FCFS) • Discriminatory schedulers (multiple classes): • Strict Priority (SP) • Weighted Fair Queuing (WFQ) • Delay discrimination creates a difference in delay distributions [Figure label: WRR]

Loss Discrimination: Practice • Non-discriminatory buffer managers: • DropTail (DT) • Random Early Detect (RED) • Discriminatory buffer managers: • Weighted RED (WRED) • Drop-from-Longest-Queue • Loss discrimination creates a difference in loss rates

Rest of the talk… • High level design • Detecting delay discrimination • Detecting loss discrimination • The DiffProbe tool • ShaperProbe

High-level design • Send normal (P) and application (A) traffic simultaneously • Measure one-way delays (OWDs) and lost packets for each flow [Figure: application traffic (A) and normal traffic (P) exchanged with the DiffProbe server]

Avoiding Classification • A flow: an application packet trace • P flow has to be: • sufficiently different from A to avoid classification • Ex: alter payload, ports, gaps • sufficiently similar to A to observe the same network performance as A when there is no discrimination • same packet size distribution between A and P • send a P packet at about the same time as A

Probing Patterns • Create two probing structures using A and P: • Balanced Load Period (BLP): send both flows at their normal rates • Load Increase Period (LIP): scale up P flow's rate • Why create LIP? To maximize chances of queuing in the ISP network

Discrimination Identifiability • The user does not always “see” discrimination • no high-priority backlog => low-priority gets the link capacity • We use BLP to detect unidentifiable conditions for delay discrimination • Identifiable when P delays created during LIP are larger than during BLP: 90th percentile of P's delays during LIP > median of P's delays during BLP
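
The identifiability condition on this slide can be sketched in a few lines. This is a minimal illustration, not the tool's code: the function names, the nearest-rank percentile definition, and the reading of the inequality as "the trial is identifiable" are assumptions.

```python
import math
import statistics

def percentile(samples, q):
    """Nearest-rank percentile; q is an integer in 1..100 (assumed definition)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def is_identifiable(p_delays_blp, p_delays_lip):
    """True when the LIP created more queuing for P than the BLP did.

    Implements: 90th percentile of P's LIP delays > median of P's BLP delays.
    """
    return percentile(p_delays_lip, 90) > statistics.median(p_delays_blp)
```

For example, `is_identifiable([1, 2, 3, 4, 5], list(range(1, 11)))` holds because the LIP delays reach well above the BLP median, while a LIP that adds no queuing (no cross-traffic, no backlog) fails the check and the trial would be reported as unidentifiable rather than as no-discrimination.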

Overview • High level design • Detecting delay discrimination • Detecting loss discrimination • The DiffProbe tool

Detecting Delay Discrimination • We observe the empirical delay distributions of the A and P flows during LIP • No delay discrimination: the two distributions are equal (figure: FCFS, Comcast) • Delay discrimination: the two distributions differ (figure: WRR 1:3, emulated)

Detecting Delay Discrimination (2) • Pre-processing: • Pairing: consider only those (A,P) sample pairs which were sent within an MTU-transmission time, τ • Discard delay values in the τ-neighborhood of the estimated propagation delay • such samples don't see queuing • Subtract the propagation delay estimate from the samples
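
The pre-processing steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: the sample layout `(send_time, one_way_delay)`, the nearest-neighbor pairing rule, using the minimum observed OWD as the propagation delay estimate, and the name `pair_and_clean` are all hypothetical choices, not taken from the slides.

```python
import bisect

def pair_and_clean(a_samples, p_samples, tau):
    """Pair A and P delay samples, then keep only samples that saw queuing.

    a_samples, p_samples: lists of (send_time, one_way_delay) in seconds,
    sorted by send_time. tau: MTU transmission time in seconds.
    Returns (a_queuing, p_queuing): queuing delays of the retained samples.
    """
    # Assumption: propagation delay is estimated as the minimum OWD observed.
    prop = min(owd for _, owd in a_samples + p_samples)
    p_times = [t for t, _ in p_samples]
    used = set()
    a_paired, p_paired = [], []
    for t_a, d_a in a_samples:
        # Look at the P packets sent just before and just after this A packet.
        i = bisect.bisect_left(p_times, t_a)
        near = [j for j in (i - 1, i) if 0 <= j < len(p_times) and j not in used]
        if not near:
            continue
        j = min(near, key=lambda k: abs(p_times[k] - t_a))
        if abs(p_times[j] - t_a) <= tau:  # sent within one MTU time of each other
            used.add(j)
            a_paired.append(d_a)
            p_paired.append(p_samples[j][1])

    def clean(delays):
        # Drop samples within tau of the propagation delay, subtract it from the rest.
        return [d - prop for d in delays if d - prop > tau]

    return clean(a_paired), clean(p_paired)
```

As a point of reference, a 1500-byte MTU on a 12 Mbps path gives τ = 1500 × 8 / 12e6 = 1 ms. The filtering here is applied per sample rather than per pair; the slides do not say which is intended.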

Detecting Delay Discrimination (3) • Hypothesis test for equality of the A and P delay distributions: (1) Null hypothesis: equal distributions (2) Compute the Kullback-Leibler (KL) divergence of the pre-processed samples (3) Compute KL divergences of uniform random partitions of the samples (4) Is (2) > (3)? • Test for the direction of discrimination: • Compare all higher percentiles (50th - 90th) of A and P delay distributions • Redo the test, swapping A and P as inputs • If this test fails, we state that delay discrimination is unknown

Delay Discrimination: Accuracy • Evaluate using simulations: • Discrimination using SP and WFQ • Skype iSAC packet trace as A flow • Cross-traffic: interactive TCP sessions (200 users) • Half of user traffic classified low-priority • BLP, LIP durations: 30 s • 90+% accuracy among detectable trials • WFQ 1:1.5 is similar to FCFS • 95% confidence, 2% error margin [Figure: accuracy for FCFS, SP, and WFQ across WFQ weights]

SP or WFQ? • SP-like and WFQ-like scheduling create different delays • Idea: some P packets serviced just after A would: • see only A's non-preemption delay (if any) in SP • but see A's queuing delays in WFQ • Method: choose a subset of P samples: • received very close to, but after, an A packet [Figure: distribution of the P subset under SP (non-preemption) and WFQ 1:2 (queuing)]

Detecting Loss Discrimination • Estimate the loss rates of the A and P flows during LIP as the fraction of packets lost • No loss discrimination: the two loss rates are equal • Loss discrimination: the two loss rates differ [Figure: WRR 1:3, Drop-Longest-Queue (emulated)]

Detecting Loss Discrimination (2) • Pre-processing: to estimate the loss rates of A and P • Pairing: same as that for delay discrimination • ensure the A and P flows sample the same congestion events if DropTail/RED • Use the Two-Proportion Test on the two loss rates • Unidentifiability: fewer than 10 dropped packets in each flow
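
A two-proportion test on the loss counts can be sketched with the standard pooled z-statistic. The details below are assumptions, since the slide does not give them: the test is two-sided, the significance level is 0.05, "fewer than 10 dropped packets in each flow" is read as both flows being below 10, and the function name is hypothetical.

```python
import math

def loss_discrimination_test(lost_a, sent_a, lost_p, sent_p, alpha=0.05):
    """Two-proportion z-test on the loss rates of the A and P flows.

    Returns None when the trial is unidentifiable, otherwise a dict with the
    z statistic, the two-sided p-value, and the decision at level alpha.
    """
    # Assumed reading of the slide: too few drops in both flows to decide.
    if lost_a < 10 and lost_p < 10:
        return None
    rate_a, rate_p = lost_a / sent_a, lost_p / sent_p
    pooled = (lost_a + lost_p) / (sent_a + sent_p)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_p))
    if se == 0:  # degenerate case: every packet lost in both flows
        return None
    z = (rate_a - rate_p) / se
    # Two-sided p-value from the standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"z": z, "p_value": p_value, "discrimination": p_value < alpha}
```

With 100 of 1000 A packets lost against 20 of 1000 P packets, the test reports discrimination; with 50 of 1000 lost in each flow it does not. A positive `z` means A saw the higher loss rate.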

Loss Discrimination: Accuracy • Buffer sizes according to bandwidth-delay product • 90+% accuracy for discriminating configurations • WFQ 1:1.5 is similar to DT (similar loss rates) [Figures: WRED accuracy vs. f (min queue threshold of normal flows); Drop-Longest-Queue (WFQ) vs. DT]

Implementing DiffProbe • DiffProbe runs as client-server (~7500 LoCs) • Classifier types: port, payload • A flow: Skype and Vonage voice traces • P flow: randomize payload, port of A flow • LIP, BLP durations: 30s each • Pre-probing: estimate path capacity using packet trains

Experiments • Emulations: • discriminating link configured using tc • Pareto cross-traffic • SP, WRR, and Drop-Longest-Queue discriminators • No false positives or false negatives • Real-world experiments (Skype and Vonage): • We do not have ground truth • A high p-value of the KL-test is a good “indicator” of no-discrimination • One ISP showed multi-path routing, which created different delays [Figure: KL-test p-values for access ISP runs]

Validation • ISPs have so far not disclosed details of application discrimination practices (if any) • No ground truth! • Discrimination: significant difference in delays and/or losses of A and P • Why? Controlled environment trials! • Validation ideas?

ShaperProbe • A pre-probing module of DiffProbe to answer: • Can we detect traffic shaping by ISPs? • What is the shaping configuration? • Key idea: probe and detect level shifts in rate • the token bucket signature [Figure: upload, 7 Mbps -> 2 Mbps in 8 s]
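
To make the "level shift" idea concrete, here is a minimal sketch that scans a series of per-interval received rates for a single downward shift. It is not ShaperProbe's detection algorithm: the split rule, the `min_ratio` threshold, and the function name are assumptions. The bucket depth estimate uses the textbook token bucket relation (depth = (peak rate - token rate) × burst duration), assuming the bucket starts full.

```python
import statistics

def detect_level_shift(rates, interval_s, min_ratio=1.1):
    """Find one downward level shift in a received-rate time series.

    rates: received rate per measurement interval (e.g. in Mbps).
    interval_s: length of each interval in seconds.
    Returns None if no shift is found, otherwise the estimated shaping parameters.
    """
    best_k, best_gap = None, 0.0
    # Require at least two samples on each side of the candidate split point.
    for k in range(2, len(rates) - 1):
        gap = min(rates[:k]) - max(rates[k:])  # > 0 only if the levels separate cleanly
        if gap > best_gap:
            best_k, best_gap = k, gap
    if best_k is None:
        return None
    peak = statistics.median(rates[:best_k])
    sustained = statistics.median(rates[best_k:])
    if peak < min_ratio * sustained:
        return None
    burst_s = best_k * interval_s
    return {
        "peak": peak,                                # rate before the shift
        "sustained": sustained,                      # token generation rate
        "burst_s": burst_s,                          # time until the bucket empties
        "bucket_depth": (peak - sustained) * burst_s,  # rate units x seconds
    }
```

Applied to the example on the slide (7 Mbps for 8 s, then 2 Mbps, sampled once per second), this yields a peak of 7, a sustained rate of 2, and an implied bucket depth of (7 - 2) × 8 = 40 megabits.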

ShaperProbe (contd.) • Deployed at Google M-Lab • 60,000+ runs so far • Who shapes traffic? ...among 700+ other ASes.

Thank You! partha @ cc.gatech.edu

Detecting Delay Discrimination (3) • Hypothesis test for equality of the A and P delay distributions: • Null hypothesis: equal distributions • Compute the Kullback-Leibler (KL) divergence of the pre-processed samples (the observed KL) • Bootstrap: compute KL divergences of uniform random partitions of the samples • this gives us a KL distribution • Reject the null hypothesis if the p-value is < 0.05
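
The KL-based test above can be sketched as a permutation-style bootstrap. Several details here are assumptions because the slide's symbols are lost: the partitions are drawn from the pooled A and P samples, the distributions are estimated with equal-width histograms with add-one smoothing, and the p-value is the fraction of partition KL values at least as large as the observed one. Function names are hypothetical.

```python
import math
import random

def histogram(samples, lo, width, bins):
    """Smoothed, normalized histogram (add-one smoothing avoids log(0))."""
    counts = [1.0] * bins
    for s in samples:
        idx = min(int((s - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(x, y, bins=10):
    """KL divergence between the empirical distributions of x and y."""
    both = list(x) + list(y)
    lo, hi = min(both), max(both)
    width = (hi - lo) / bins or 1.0  # guard against all-equal samples
    px = histogram(x, lo, width, bins)
    py = histogram(y, lo, width, bins)
    return sum(a * math.log(a / b) for a, b in zip(px, py))

def kl_test(a, p, rounds=200, seed=0):
    """Return (observed KL, p-value) under the null of equal distributions."""
    rng = random.Random(seed)
    observed = kl_divergence(a, p)
    pooled = list(a) + list(p)
    at_least = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # uniform random partition of the pooled samples
        if kl_divergence(pooled[:len(a)], pooled[len(a):]) >= observed:
            at_least += 1
    return observed, at_least / rounds
```

A call such as `observed, p_value = kl_test(a_delays, p_delays)` followed by `p_value < 0.05` mirrors the rejection rule on the slide. The bin count and number of rounds are arbitrary illustration values.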

Detecting Delay Discrimination (4) • Test for the direction of discrimination (if the KL-test rejects the null hypothesis) • Compare higher percentiles of A and P delay distributions • Redo the test, swapping A and P as inputs • If this test fails, we state that delay discrimination is unknown
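
One way to read the percentile comparison is sketched below. The exact rule is an assumption: a flow is said to see higher delays only if every one of its 50th to 90th percentiles (in steps of 10) exceeds the other flow's, and the result is "unknown" if neither ordering holds. The names and return strings are hypothetical.

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile; q is an integer in 1..100 (assumed definition)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def higher_delays(x, y, quantiles=(50, 60, 70, 80, 90)):
    """True if x exceeds y at every compared percentile."""
    return all(percentile(x, q) > percentile(y, q) for q in quantiles)

def delay_verdict(a, p):
    """Run the comparison, then redo it with A and P swapped."""
    if higher_delays(a, p):
        return "A delayed more than P"
    if higher_delays(p, a):
        return "P delayed more than A"
    return "unknown"
```

When the two distributions cross, for example A is lower at the median but higher in the tail, neither ordering holds and the verdict is "unknown".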

SP or WFQ? (2) • For the distribution of this subset of P samples: • SP if: 95th percentile P delay ≈ 5th percentile • WFQ-like, otherwise [Figures: distribution of P subset under SP; WFQ-SP accuracy]
