A Measurement Framework for Pin-Pointing Routing Changes Renata Teixeira (UC San Diego) http://www-cse.ucsd.edu/~teixeira with Jennifer Rexford (AT&T) NetTs’04 – Portland, OR
Why understand routing changes?
• Routing changes cause service disruptions
• Convergence delay
• Traffic shift
• Change in path properties
• RTT, available bandwidth, or lost connectivity
• Operators need to know
• Why: For diagnosing and fixing problems
• Where: For accountability
• Need to guarantee service-level agreements
What can be done with active measurements?
• Active measurements: traceroute-like tools
• Can’t probe in the past
• Shows the effect, not the cause
[Figure: AS-level topology with User (s), Web Server (d), and AS 1 through AS 4]
Can we use passive measurements?
• Passive measurements: public BGP data
[Figure: BGP update feeds → Data Collection (RouteViews, RIPE) → Data Correlation → root cause]
Why Is Public BGP Data Not Enough?
Myth: The BGP updates from a single router accurately represent the AS.
The measurement system needs to capture the BGP routing changes from all border routers.
[Figure: routers A, B, C, D across AS 1 and AS 2, destination dst, numeric labels 7, 6, 10, 12; BGP data collection point with the annotation "No change"]
Why Is Public BGP Data Not Enough?
Myth: Routing changes visible in eBGP have greater end-to-end impact than changes with local scope.
The measurement system needs to capture internal changes inside an AS.
[Figure: routers A, B, C, D across AS 1 and AS 2, destination dst, numeric labels 5, 7, 6, 10, 12; BGP data collection point]
Why Is Public BGP Data Not Enough?
Myth: BGP data from a router accurately represents changes on that router.
The measurement system needs to know all routes the router knows.
[Figure: router A with routes for 12.1.1.0/24 and 12.1.0.0/16; BGP data collection point]
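One reading of the two prefixes on this slide is that a router forwards on the most specific matching route, so an update for 12.1.0.0/16 says nothing about traffic covered by 12.1.1.0/24 unless the observer also knows the /24 exists. The sketch below illustrates that reading only. The next-hop names and the helper functions are hypothetical and do not come from the talk.

```python
import ipaddress


def longest_prefix_match(table, address):
    """Return the most specific prefix in `table` covering `address`, or None."""
    addr = ipaddress.ip_address(address)
    matches = [prefix for prefix in table if addr in prefix]
    return max(matches, key=lambda p: p.prefixlen) if matches else None


def forwarding_changed(before, after, address):
    """True if the (prefix, next hop) used for `address` differs between tables."""
    b = longest_prefix_match(before, address)
    a = longest_prefix_match(after, address)
    return (b, before.get(b)) != (a, after.get(a))


# Prefixes are from the slide; next hops are made-up placeholders.
routes = {
    ipaddress.ip_network("12.1.0.0/16"): "neighbor-X",
    ipaddress.ip_network("12.1.1.0/24"): "neighbor-Y",
}

# A BGP update changes the route for the /16 only.
updated = dict(routes)
updated[ipaddress.ip_network("12.1.0.0/16")] = "neighbor-Z"

# An address inside the /24 keeps its forwarding path...
print(forwarding_changed(routes, updated, "12.1.1.7"))
# ...while an address covered only by the /16 does not.
print(forwarding_changed(routes, updated, "12.1.2.7"))
```

An observer that saw only the /16 update would wrongly conclude that traffic to 12.1.1.7 moved, which is why the slide asks for all routes the router knows.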
Misleading BGP Changes
Myth: The AS responsible for the change appears in the old or the new AS path.
old: 1,2,8,9,10
new: 1,4,5,6,7,10
Accurate troubleshooting may require measurement data from each AS.
[Figure: AS-level topology with ASes 1 through 11; BGP data collection point]
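To make the limitation concrete, the following sketch compares the old and new AS paths from this slide and lists which ASes an observer can see at all. The function name and the returned dictionary layout are illustrative choices, and the set of ASes 1 through 11 is taken from the figure labels.

```python
def path_change_summary(old_path, new_path):
    """Split two AS paths into shared head, differing middles, and shared tail."""
    shortest = min(len(old_path), len(new_path))

    # Length of the common prefix.
    i = 0
    while i < shortest and old_path[i] == new_path[i]:
        i += 1

    # Length of the common suffix, not overlapping the prefix.
    j = 0
    while j < shortest - i and old_path[-1 - j] == new_path[-1 - j]:
        j += 1

    return {
        "shared_head": old_path[:i],
        "old_only": old_path[i:len(old_path) - j],
        "new_only": new_path[i:len(new_path) - j],
        "shared_tail": old_path[len(old_path) - j:],
        "visible": sorted(set(old_path) | set(new_path)),
    }


old = [1, 2, 8, 9, 10]
new = [1, 4, 5, 6, 7, 10]
summary = path_change_summary(old, new)

# ASes drawn in the figure that appear in neither path.
figure_ases = range(1, 12)
unseen = [asn for asn in figure_ases if asn not in summary["visible"]]

print(summary["old_only"], summary["new_only"], unseen)
```

The comparison narrows attention to ASes 2, 8, 9 and 4, 5, 6, 7, but ASes 3 and 11 never appear in either path, so a change originating there cannot be identified from this view alone.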
Misleading BGP Changes
Myth: Looking at routing changes across prefixes resolves
ASes involved in the change need to cooperate to pin-point the reason for the change.
[Figure: routers A, B, C; AS 1, AS 2, AS 3; destinations d1, d2, d3; numeric labels 7, 10, 12; BGP data collection point. Caption: Changes for d2, but not for d1 and d3]
Strawman Proposal: Omni Server
• Creating an AS-level view
• BGP feeds from all border routers
• Inject all routes known in each router
• Internal routing data
• Archive log of routing changes
• Responding to queries
• Local cause: responds directly
• No local change: query neighbor AS
• Local change from downstream cause: query old and/or new neighbor AS
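The three query-handling rules above can be sketched as a small recursive lookup. This is a minimal illustration under stated assumptions, not the authors' design: the `Change` and `Omni` classes, the time window used to match log entries, and the example scenario (AS 1 switching from AS 3 to AS 2 after a failure of link (3,4)) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Change:
    """One archived routing change (hypothetical record layout)."""
    time: float
    dst: str
    cause: Optional[str]              # set only when the cause is local
    old_next_as: Optional[int] = None
    new_next_as: Optional[int] = None


@dataclass
class Omni:
    """Per-AS server answering 'why did the route to dst change near time t?'."""
    asn: int
    next_as: dict = field(default_factory=dict)   # dst -> current next-hop AS
    log: list = field(default_factory=list)

    def query(self, dst, time, servers, window=5.0, seen=None):
        seen = set() if seen is None else seen
        if self.asn in seen:          # avoid query loops between ASes
            return None
        seen.add(self.asn)

        change = next(
            (c for c in self.log
             if c.dst == dst and abs(c.time - time) <= window),
            None,
        )
        if change is None:
            # No local change: ask the neighbor AS currently used for dst.
            candidates = [self.next_as.get(dst)]
        elif change.cause is not None:
            # Local cause: respond directly.
            return (self.asn, change.cause)
        else:
            # Local change from a downstream cause: ask old and new neighbors.
            candidates = [change.old_next_as, change.new_next_as]

        for asn in candidates:
            if asn is not None and asn in servers:
                answer = servers[asn].query(dst, time, servers, window, seen)
                if answer is not None:
                    return answer
        return None


# Hypothetical scenario: AS 3 logs the failure, AS 1 only sees its route move.
servers = {
    1: Omni(1, {"d": 2}, [Change(100.0, "d", None, old_next_as=3, new_next_as=2)]),
    2: Omni(2, {"d": 4}),
    3: Omni(3, {}, [Change(99.0, "d", "failure of link (3,4)")]),
    4: Omni(4, {"d": None}),
}

answer = servers[1].query("d", 100.0, servers)
print(answer)
```

In this scenario the query to the server in AS 1 is forwarded to the old neighbor, AS 3, which reports the cause it logged locally. A real deployment would also have to authenticate queries and decide how much internal detail to reveal, which this sketch ignores.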
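Diagnosis with Omnis
[Figure: User (s) and Web Server (d) connected across AS 1 through AS 4, each AS with its own server (Omni 1 through Omni 4); routers i and j. Query (i,s,d,t) is answered with "failure link (3,4)"; query (j,s,d,t’) is answered with "failure link (3,4)"]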
Conclusion
• Passive data
• AS-level view
• History (answers in the past)
• Distributed
• Active querying
• Servers, not routers
• See cause, not effect
Future Directions
• How often are the myths violated?
• Measurement studies of ISP networks
• Doesn’t Omni require lots of data?
• ISPs already collect this kind of data
• Routing protocol extensions to reveal the reasons for routing changes
• Will ASes really cooperate?
• Pressure to provide service-level agreements
• Small group of ASes that choose to cooperate
• Won’t ASes cheat?
• Need techniques to detect persistent lying