Identifying problematic inter-domain routing issues
Olaf Maennel, Anja Feldmann
Saarland University, Saarbrücken, Germany
Motivation
• BGP scalability?!
• BGP convergence times???
• A lot of open questions that need understanding!
• What really happens in the Internet?
TOOL: “Character”
• Data munching: automatic processing of raw data, providing an intermediate level
• Characterizing BGP updates: identification of update events
TOOL: “Character”
[Diagram labels: RAW-DATA, FileFinder package, your function (or "Check" functions), results]
How “character” works
• Identification of routing updates: type of changes, flapping, session resets, …
• Processing of updates in the context of related (same prefix) and surrounding (near in time) updates
• Input: table dump 1, all updates, table dump 2
• Output: route change events
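The input triple (table dump 1, all updates, table dump 2) suggests that updates can be replayed onto the first dump and the result compared with the second. The following is a minimal sketch of that idea, not the tool's actual code. The function names `replay` and `mismatches`, the dict-based table layout, and the use of the second dump as a consistency check are assumptions for illustration.

```python
def replay(table, updates):
    """Apply updates to a copy of a routing table.

    table:   dict mapping prefix -> AS path (tuple of AS numbers)
    updates: iterable of (kind, prefix, as_path), kind "A" or "W"
    (hypothetical layout, chosen for this sketch)
    """
    state = dict(table)
    for kind, prefix, as_path in updates:
        if kind == "A":            # announcement: install or replace the route
            state[prefix] = as_path
        else:                      # withdrawal: remove the route if present
            state.pop(prefix, None)
    return state


def mismatches(expected, actual):
    """Prefixes whose route differs between two tables (or exists in only one)."""
    return sorted(p for p in expected.keys() | actual.keys()
                  if expected.get(p) != actual.get(p))
```

A prefix reported by `mismatches(replay(dump1, updates), dump2)` would hint at an update that is missing from the recorded stream, under the assumptions above.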
Output: route_btoa
• All updates like Merit’s "route_btoa -m"
Columns highlighted on the slide: Timestamp, Updated Prefix, AS Path
1011363829|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|
1011387198|W|195.66.224.112|3549| 80.96.15.0/24| |
1011387339|A|195.66.224.112|3549| 80.96.15.0/24|3549 701 702 8708|
1011387369|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|
1010976980|W|195.66.224.112|3549|80.96.150.0/24| |
1010977007|A|195.66.224.112|3549|80.96.150.0/24|3549 209 1755 15471|
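To make the record layout concrete, here is a small sketch that parses one pipe-separated line as shown on the slide. The field names (`peer_ip`, `peer_as`, and so on) and the `Update` type are assumptions chosen for this example; only the field order is taken from the sample records.

```python
from typing import NamedTuple, Tuple


class Update(NamedTuple):
    timestamp: int             # Unix time of the update
    kind: str                  # "A" or "W", as in the sample records
    peer_ip: str               # assumed meaning of the third field
    peer_as: int               # assumed meaning of the fourth field
    prefix: str                # updated prefix
    as_path: Tuple[int, ...]   # empty for records without a path


def parse_update(line: str) -> Update:
    """Parse one 'timestamp|A/W|ip|as|prefix|as path|' record."""
    fields = [f.strip() for f in line.split("|")]
    ts, kind, peer_ip, peer_as, prefix, path = fields[:6]
    return Update(int(ts), kind, peer_ip, int(peer_as), prefix,
                  tuple(int(asn) for asn in path.split()))
```

Stripping each field tolerates the padding around the prefix column, and a blank path field yields an empty tuple.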
Example data sets
• RIPE’s RRC00: Jan 14, 2002 01:00 – Jan 20, 2002 01:10
Output: route_btoa
• Classification of each update is appended to the six route_btoa records shown above (Timestamp, Updated Prefix, AS Path)
Output: What has changed?
Appended columns highlighted on the slide: #update, time since last update, change to last update, what has changed?
|:|24.|199  |AA-DIFF|ASPath-way Community|3549|3320->3300|8708|origin |
|:|25.|23369|AW-DIFF| | | | | |
|:|26.|141  |WA-DIFF|ASPath-way Community|3549|3300->701 |702 |transit|
|:|27.|30   |AA-DIFF|ASPath-way Community|3549|701->3300 |702 |transit|
|:|1. |-1   |AW-DIFF| | | | | |
|:|2. |27   |WA-DIFF|ASPath-way Community|3549|3300->209 |1755|transit|
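The "time since last update" and "change to last update" columns can be illustrated with a short sketch for the updates of a single prefix. This is an assumption-laden illustration: the function name `annotate`, the `initial_kind` parameter (the state before the first update, for example taken from a table dump), and the two-letter transition labels are hypothetical. The tool's own labels carry a `-DIFF` suffix whose exact rule the slides do not spell out, so it is not reproduced here.

```python
def annotate(updates, initial_kind=None):
    """Annotate the updates of ONE prefix, given in arrival order.

    updates:      list of (timestamp, kind), kind "A" or "W"
    initial_kind: kind of the state before the first update, if known
    Returns a list of (seconds_since_last_update, transition).
    """
    rows = []
    prev_ts, prev_kind = None, initial_kind
    for ts, kind in updates:
        # -1 marks "no earlier update seen", as in the sample output
        delta = -1 if prev_ts is None else ts - prev_ts
        transition = (prev_kind or "?") + kind   # e.g. "AW", "WA", "AA"
        rows.append((delta, transition))
        prev_ts, prev_kind = ts, kind
    return rows
```

Fed with the four timestamps for 80.96.15.0/24 from the route_btoa sample, the gaps come out as 23369, 141 and 30 seconds, which matches the values in the second column of the output for updates 25 to 27.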
Output: AS Path changes
Columns highlighted on the slide: last ‘stable’ AS, from where to where?, rejoining AS
|:|24.|199  |AA-DIFF|ASPath-way Community|3549|3320->3300|8708|origin |
|:|25.|23369|AW-DIFF| | | | | |
|:|26.|141  |WA-DIFF|ASPath-way Community|3549|3300->701 |702 |transit|
|:|27.|30   |AA-DIFF|ASPath-way Community|3549|701->3300 |702 |transit|
|:|1. |-1   |AW-DIFF| | | | | |
|:|2. |27   |WA-DIFF|ASPath-way Community|3549|3300->209 |1755|transit|
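One way to obtain the last "stable" AS, the from/to pair and the rejoining AS is to compare the old and new AS paths from both ends. The sketch below is an interpretation inferred from the sample rows, not the tool's documented algorithm. The function name `path_change` and the rule "origin if the paths rejoin only at the last AS, otherwise transit" are assumptions.

```python
def path_change(old, new):
    """Compare two AS paths (sequences of AS numbers).

    Returns None for identical paths, otherwise a tuple
    (last_stable_as, old_as, new_as, rejoining_as, role).
    """
    shortest = min(len(old), len(new))
    # Length of the common head: the ASs both paths share at the start.
    i = 0
    while i < shortest and old[i] == new[i]:
        i += 1
    if i == len(old) == len(new):
        return None
    # Length of the common tail, not overlapping the common head.
    j = 0
    while j < shortest - i and old[-1 - j] == new[-1 - j]:
        j += 1
    stable = old[i - 1] if i else None
    old_as = old[i] if i < len(old) - j else None   # first AS left behind
    new_as = new[i] if i < len(new) - j else None   # first AS newly used
    rejoin = old[len(old) - j] if j else None       # where the paths meet again
    role = "origin" if j == 1 else ("transit" if j > 1 else None)
    return stable, old_as, new_as, rejoin, role
```

For the paths `3549 3300 702 8708` and `3549 701 702 8708` this yields 3549, 3300, 701, 702 and "transit", in line with update 26 in the output.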
Output: Old AS Path
Columns highlighted on the slide: AS on the “old” path, each followed by the percentage of prefixes still reachable
3549__95%_ 3320__47%_ 5483_*15%* 8708__78%_| 2 |0. |22.|#8|flapping|
3549__95%_ 3300__65%_ 702__61%_ 8708_**3%*| 5 |3. |20.|#6| |
3549__95%_ 3300__65%_ 702__63%_ 8708__36%_| 5 |21.|21.|#1| |
3549__95%_ 701__66%_ 702__64%_ 8708__53%_| 3 |0. |24.|#9| |
3549__96%_ 3300__67%_ 1755__54%_ 15471_*21%*| * |* |* |* | |
3549__96%_ 3300__67%_ 1755__54%_ 15471__33%_| * |* |* |* | |
[Figure: sets of updates for a prefix with the same attributes, numbered 1., 2., 3., 4. and >4; labels: new change, duplicate, flapping, reconvergence, n-way change]
Output: “n-way flapping”
Columns highlighted on the slide: distance to last equal update, reconvergence, percentage of other prefixes by the originating AS identified as flapping, first and last occurrence in update series, flapping, time to last flap
| 2 |0. |22.|#8|flapping|208326|85% |<- | | (8708)__72%_ 5483
| 5 |3. |20.|#6| | |8% |-1 | | (8708)__79%_ 702
| 5 |21.|21.|#1| | |8% |-2 | | (8708)__78%_ 702
| 3 |0. |24.|#9| | |8% |flap-3|23540| (8708)__78%_ 702
| * |* |* |* | | |100%| | |(15471)**95%* 1755
| * |* |* |* | | |100%| | |(15471)**95%* 1755
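The "distance to last equal update" can be sketched as a lookup of when the same attributes were last seen for a prefix. This is a minimal illustration under assumptions: the function name `distance_to_last_equal` is hypothetical, and "equal" is taken to mean an identical (kind, AS path) pair, whereas the tool may compare more attributes.

```python
def distance_to_last_equal(attrs_series):
    """For each update of ONE prefix, count how many updates back an
    update with identical attributes last occurred (None if never).

    attrs_series: list of hashable attribute values, in arrival order
    """
    last_seen = {}   # attributes -> index of the most recent occurrence
    distances = []
    for idx, attrs in enumerate(attrs_series):
        distances.append(idx - last_seen[attrs] if attrs in last_seen else None)
        last_seen[attrs] = idx
    return distances
```

Applied to the four updates for 80.96.15.0/24 (announce via 3300, withdraw, announce via 701, announce via 3300 again), the last update gets distance 3, which appears consistent with the `flap-3` entry in the output.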
Session resets
• Peering connection breakdown: a whole table must be exchanged
• Update storms are propagated through the Internet…
• How big is the problem?
Output: possible session resets
Columns highlighted on the slide: AS number, each followed by the percentage of updated vs. all prefixes associated with that AS
(8708)__72%_ 5483**66%* 3320**28%* 3549___0%_| 2 |3320 5483| |
(8708)__79%_ 702___5%_ 3300___3%_ 3549___0%_| | | |
(8708)__78%_ 702___5%_ 3300___3%_ 3549___0%_| | |peak|
(8708)__78%_ 702___5%_ 701___1%_ 3549___0%_| | |peak|
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |
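The per-AS percentage of updated versus all associated prefixes can be sketched as a set intersection. The names `updated_share` and `reset_candidates`, the input layout, and especially the `threshold` value are assumptions for illustration; the slides do not state which percentage the tool treats as a possible session reset.

```python
def updated_share(prefixes_by_as, updated_prefixes):
    """Percentage of each AS's prefixes that were updated.

    prefixes_by_as:   dict AS number -> set of all prefixes associated with it
    updated_prefixes: set of prefixes updated in the observed window
    """
    shares = {}
    for asn, prefixes in prefixes_by_as.items():
        if prefixes:   # skip ASs without known prefixes to avoid dividing by zero
            shares[asn] = 100.0 * len(prefixes & updated_prefixes) / len(prefixes)
    return shares


def reset_candidates(shares, threshold=25.0):
    """ASs whose updated share reaches the (hypothetical) threshold."""
    return sorted(asn for asn, pct in shares.items() if pct >= threshold)
```

A high share for one AS means that most of the prefixes routed via that AS changed at about the same time, which is the pattern the output highlights.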
Identification of session resets
[Figure label: All prefixes updated]
Output: possible session resets
Columns highlighted on the slide: number of ASs involved, ASs involved
(8708)__72%_ 5483**66%* 3320**28%* 3549___0%_| 2 |3320 5483| |
(8708)__79%_ 702___5%_ 3300___3%_ 3549___0%_| | | |
(8708)__78%_ 702___5%_ 3300___3%_ 3549___0%_| | |peak|
(8708)__78%_ 702___5%_ 701___1%_ 3549___0%_| | |peak|
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |
Output: Classification
Columns highlighted on the slide: further changes?, peak identification, update rate per second
|2|3320 5483| | 7.0|instable |...
| | | | 5.9|instable |...
| | |peak|16.2|instable |...
| | |peak|16.2|re-stable change|...
|1|15471 | | 1.3|instable |...
|1|15471 | | 1.4|instable |...
Further suggestions?!
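An update rate per second and a simple peak marker can be sketched by counting updates in fixed time bins. How the tool actually computes its rate and decides on `peak` is not stated on the slides, so the bin length, the median-based rule, and the names `rate_per_bin` and `peaks` are all assumptions.

```python
from collections import Counter
from statistics import median


def rate_per_bin(timestamps, bin_seconds=60):
    """Updates per second for each non-empty bin, keyed by bin start time."""
    counts = Counter(ts // bin_seconds for ts in timestamps)
    return {b * bin_seconds: n / bin_seconds for b, n in counts.items()}


def peaks(rates, factor=3.0):
    """Bin start times whose rate exceeds `factor` times the median rate.

    The median is taken over non-empty bins only (hypothetical rule).
    """
    base = median(rates.values())
    return sorted(start for start, rate in rates.items() if rate > factor * base)
```

Bins without any update are not represented, so the median here describes typical activity during busy periods rather than over the whole trace.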
Update burst
• Like packet flows
• Bursts consist of several updates for the same prefix within a short time window
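Grouping updates into bursts can be sketched much like flow grouping with an inactivity timeout: an update joins the current burst of its prefix if it arrives soon enough after the previous one. The function name `group_bursts` and the `max_gap` value of 300 seconds are assumptions; the slides only say "short time window".

```python
def group_bursts(updates, max_gap=300):
    """Group update timestamps into bursts per prefix.

    updates: iterable of (timestamp, prefix), in any order
    max_gap: largest gap in seconds between consecutive updates
             of one burst (hypothetical value)
    Returns a dict prefix -> list of bursts, each a list of timestamps.
    """
    bursts = {}
    for ts, prefix in sorted(updates):
        runs = bursts.setdefault(prefix, [])
        if runs and ts - runs[-1][-1] <= max_gap:
            runs[-1].append(ts)    # close enough: extend the current burst
        else:
            runs.append([ts])      # first update or gap too long: new burst
    return bursts
```

With this setting, the four sample updates for 80.96.15.0/24 split into a single update followed, 23369 seconds later, by a burst of three.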
Output “Character”
• Classification of updates
• Statistical information
• Missing updates / verification
Ongoing work
• RTG: a realistic Routing Table (and update) Generator
• Generation of tables and updates with ‘real-world’ characteristics
• Use RTG to benchmark router performance
Conclusion
If you are interested, please visit our website: http://www.net.uni-sb.de/~olafm
Thank you!