Router-level Internet Topology Mapping. CS790 Presentation Modified from Dr. Gunes’ slides by Talha OZ. Outline. Introduction Internet Topology Measurement Topology Discovery Issues Impact of IP Alias Resolution Topology Discovery Resolving Anonymous Routers
Router-level Internet Topology Mapping CS790 Presentation Modified from Dr. Gunes’ slides by Talha OZ
Outline • Introduction • Internet Topology Measurement • Topology Discovery Issues • Impact of IP Alias Resolution • Topology Discovery • Resolving Anonymous Routers • Graph-based Induction Technique • Resolving Alias IP Addresses • Analytical and Probe-based Alias Resolution • Resolving Genuine Subnets • Dynamic Subnet Inference • Summary
Internet • Web of interconnected networks • Grows with no central authority • Autonomous Systems optimize local communication efficiency • The building blocks are engineered and studied in depth • The global entity has not been characterized • Most real-world complex networks have non-trivial properties • Global properties cannot be inferred from local ones • Engineered with large technical diversity • Ranges from local campuses to transcontinental backbone providers
Internet Measurements • Understand topological and functional characteristics of the Internet • Essential to design, implement, protect, and operate underlying network technologies, protocols, services, and applications • Need for Internet measurements arises due to commercial, social, and technical issues • Realistic simulation environment for developed products • Improve network management • Robustness with respect to failures/attacks • Comprehend spreading of worms/viruses • Know social trends in Internet use • Scientific discovery • Scale-free (power-law), Small-world, Rich-club, Disassortativity, …
Internet Topology Measurement • Types of Internet topology maps • Autonomous System (AS) level maps • Router level maps • A router level Internet map consists of • Nodes: End-hosts and routers • Links: Point-to-point or multi-access links • Router level Internet topology discovery • A process of identifying nodes and links among them [Figures: Internet maps labeled Lumenta Jan 06, CAIDA Jan 08, CAIDA Jan 00]
Internet Topology Measurement: Background • Internet topology measurement studies • Involve topology collection / construction / analysis • Current state of the research activities • Distributed topology data collection studies/platforms • iPlane, Skitter, Dimes, DipZoom, … • 20M path traces with over 20M nodes (daily) • Topology discovery issues • Sampling • Anonymous routers • Alias IP addresses • Genuine subnets
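Internet Topology Measurements: Probing • Direct probing • Indirect probing [Figure, direct probing: the vantage point sends probes addressed to IPB and IPD with TTL=64 along the path A, B, C, D and the targets reply from IPB and IPD] [Figure, indirect probing: the vantage point sends probes addressed to IPD with TTL=1 and TTL=2 and the intermediate routers reply from IPB and IPC]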
Internet Topology Measurement: Topology Collection (traceroute) • Probe packets are carefully constructed to elicit the intended response from a probe destination • traceroute probes all nodes on a path towards a given destination • TTL-scoped probes obtain ICMP error messages from routers on the path • Each ICMP message carries the IP address of an intermediate router as its source • Merging end-to-end path traces yields the network map [Figure: vantage point S sends probes with TTL=1, 2, 3, 4 toward the destination; nodes A, B, C, D reply from IPA, IPB, IPC, IPD]
Internet Topology Measurement: Background [Figure: Internet2 backbone with routers S, N, W, C, U, K, L, A, H and their interface labels (s.2, s.3, n.1, n.3, …), showing a trace to NY and a trace to Seattle from host d]
Internet Topology Measurement: Background [Figure: the same backbone with end hosts d, e, f and interface labels s.1 to s.3, n.1 to n.3, c.1 to c.4, w.1 to w.3, u.1 to u.3, k.1 to k.3, l.1 to l.3, a.1 to a.3, h.1 to h.4]
Internet Topology Measurement: Topology Collection [Figure: Internet2 backbone routers S, N, C, W, U, K, L, A, H with end hosts d, e, f] • Traces • d - H - L - S - e • d - H - A - W - N - f • e - S - L - H - d • e - S - U - K - C - N - f • f - N - C - K - H - d • f - N - C - K - U - S - e
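The merging step can be sketched in a few lines of Python. This is a minimal illustration using the six traces listed on the slide; the function name `merge_traces` is hypothetical, and the sketch assumes that consecutive hops in a trace are directly linked.

```python
# Traces listed on the slide, one string per end-to-end path.
TRACES = [
    "d - H - L - S - e",
    "d - H - A - W - N - f",
    "e - S - L - H - d",
    "e - S - U - K - C - N - f",
    "f - N - C - K - H - d",
    "f - N - C - K - U - S - e",
]

def merge_traces(traces):
    """Merge path traces into an undirected map (nodes, links)."""
    nodes, links = set(), set()
    for trace in traces:
        hops = [hop.strip() for hop in trace.split(" - ")]
        nodes.update(hops)
        # Assumption: consecutive hops in a trace share a link.
        for u, v in zip(hops, hops[1:]):
            links.add(frozenset((u, v)))
    return nodes, links

nodes, links = merge_traces(TRACES)
print(len(nodes), "nodes,", len(links), "links")
```

Links are stored as unordered pairs, so a link seen in both directions (for example H - L and L - H) is counted once.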
Topology Sampling Issues • Sampling to discover networks • Infer characteristics of the topology • Different studies considered • Effect of sample size [Barford 01] • Sampling bias [Lakhina 03] • Path accuracy [Augustin 06] • Sampling approach [Gunes 07] • Utilized protocol [Gunes 08] • ICMP echo request • TCP SYN • UDP port unreachable
Topology Sampling Approaches • Sampling techniques • Path sampling • Diameter • Edge sampling • Capacity • Node sampling • Degree characteristics • Sampling approach • (n,n) – traceroute based topology • Returns the Internet map among n vantage points • (k,m) – traceroute based topology where k<<m (k=n) • Returns the Internet map between k sources and m destinations [Figures: (k,m)-sampling vs (n,n)-sampling; path sampling vs node sampling]
Historical Perspective on Responsiveness: Data Set • ICMP path traces from skitter • 1st collection cycle of each year (from 1999 to 2008) • Skitter had updates to destination IP addresses • Major update in the system in 2004 • Processing • Alias IP addresses • Analytical Alias Resolver (AAR) [Gunes-06] • Analytical and Probe Based Alias Resolver (APAR) [Gunes-09] • Anonymous routers • Graph Based Induction (GBI) [Gunes-08]
Current Practices in Responsiveness: Data Set • 536,743 destination IP addresses • from skitter and iPlane projects • Between 7-11 April 2008 • Probes • ICMP echo request • TCP SYN • UDP to random ports • Direct probes • ping • Indirect probes • traceroute
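Current Practices in Responsiveness: Direct probes [Figure: responses of 537 K IPs to direct probes; labels 320 K and 217 K]
Current Practices in Responsiveness: Direct probes (domain) [Figure: responses of 537 K IPs to direct probes by domain; labels 25.5 K, 10.1 K, 5 K, 1.7 K, 0.5 K]
Current Practices in Responsiveness: Indirect probes [Figure: initial vs final responsiveness over 306 K traces]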
Current Practices in Responsiveness • Nodes that respond to indirect probes might not respond to direct probes • Nodes are most responsive to ICMP probes (82%) • Nodes are least responsive to UDP probes (60%) • End hosts are less responsive than routers • Responsiveness is similar for different domains
Anonymous Router Resolution Problem • Anonymous routers do not respond to traceroute probes and appear as anonymous hops (printed as * by traceroute) in path traces • The same router may appear as an anonymous hop in multiple traces • Anonymous nodes belonging to the same router should be resolved • Anonymity Types • Ignore all ICMP packets • ICMP rate-limiting • Ignore ICMP when congested • Filter ICMP at border • Private IP address
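Anonymous Router Resolution Problem [Figure: Internet2 backbone routers S, N, C, W, U, K, L, A, H with end hosts d, e, f] • Traces (* marks an anonymous hop; here routers H, N and K are anonymous) • d - * - L - S - e • d - * - A - W - * - f • e - S - L - * - d • e - S - U - * - C - * - f • f - * - C - * - * - d • f - * - C - * - U - S - e
Anonymous Router Resolution Problem • Traces • d - * - L - S - e • d - * - A - W - * - f • e - S - L - * - d • e - S - U - * - C - * - f • f - * - C - * - * - d • f - * - C - * - U - S - e [Figure: sampled network with routers S, U, K, C, N, L, H, A, W and hosts d, e, f] [Figure: resulting network, in which only C, U, S, L, W, A and hosts d, e, f are identified]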
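To see why unresolved anonymous hops distort the map, the following sketch counts nodes when every anonymous occurrence is treated as a separate node. It assumes `*` as the marker for an anonymous hop and takes H, N and K as the anonymous routers, which follows from comparing these traces with the fully responsive ones earlier in the deck. The function name is hypothetical.

```python
from itertools import count

# Traces with anonymous hops written as "*" (marker chosen for this sketch).
TRACES = [
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
]

def nodes_without_resolution(traces):
    """Give every anonymous occurrence its own node, as a naive map would."""
    fresh = count(1)
    nodes = set()
    for trace in traces:
        for hop in trace.split():
            nodes.add(f"anon{next(fresh)}" if hop == "*" else hop)
    return nodes

unresolved = nodes_without_resolution(TRACES)
anonymous = {n for n in unresolved if n.startswith("anon")}
print(len(unresolved), "nodes, of which", len(anonymous), "are anonymous")
```

The underlying network has 12 nodes with three anonymous routers, so the naive map overstates the node count until the anonymous occurrences are grouped.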
Alias Resolution • Each interface of a router has an IP address. • A router may respond with different IP addresses to different queries. • Alias Resolution is the process of grouping the interface IP addresses of each router into a single node. • Inaccuracies in alias resolution may result in a network map that • includes artificial links/nodes • misses existing links [Figure: the Denver router with interface addresses ending in .33, .5, .18, .7 and .13]
IP Alias Resolution Problem [Figure: Internet2 backbone with routers S, N, C, W, U, K, L, A, H, end hosts d, e, f, and interface labels s.1 to s.3, n.1 to n.3, c.1 to c.4, w.1 to w.3, u.1 to u.3, k.1 to k.3, l.1 to l.3, a.1 to a.3, h.1 to h.4] • Traces • d - h.4 - l.3 - s.2 - e • d - h.4 - a.3 - w.3 - n.3 - f • e - s.1 - l.1 - h.1 - d • e - s.1 - u.1 - k.1 - c.1 - n.1 - f • f - n.2 - c.2 - k.2 - h.2 - d • f - n.2 - c.2 - k.2 - u.2 - s.3 - e
IP Alias Resolution Problem [Figure: sampled network with routers S, U, K, C, N, L, H, A, W and hosts d, e, f] • Traces • d - h.4 - l.3 - s.2 - e • d - h.4 - a.3 - w.3 - n.3 - f • e - s.1 - l.1 - h.1 - d • e - s.1 - u.1 - k.1 - c.1 - n.1 - f • f - n.2 - c.2 - k.2 - h.2 - d • f - n.2 - c.2 - k.2 - u.2 - s.3 - e [Figure: sample map without alias resolution, in which every observed interface (s.1, s.2, s.3, u.1, u.2, c.1, c.2, n.1, n.2, n.3, k.1, k.2, w.3, l.1, l.3, a.3, h.1, h.2, h.4) is a separate node]
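A short sketch of the effect, using the interface-level traces from the slide. The ground-truth grouping here relies on the slide's naming convention (interface `s.1` belongs to router `S`), which is an assumption of this example and not something a real measurement would provide; `router_of` is a hypothetical helper.

```python
# Interface-level traces from the slide.
TRACES = [
    "d - h.4 - l.3 - s.2 - e",
    "d - h.4 - a.3 - w.3 - n.3 - f",
    "e - s.1 - l.1 - h.1 - d",
    "e - s.1 - u.1 - k.1 - c.1 - n.1 - f",
    "f - n.2 - c.2 - k.2 - h.2 - d",
    "f - n.2 - c.2 - k.2 - u.2 - s.3 - e",
]

def router_of(label):
    """Map an interface label to its router: 's.1' -> 'S'; hosts stay as is."""
    return label.split(".")[0].upper() if "." in label else label

# Without alias resolution every distinct interface is its own node.
unresolved_nodes = {hop for trace in TRACES for hop in trace.split(" - ")}
# With perfect alias resolution interfaces collapse into routers.
resolved_nodes = {router_of(hop) for hop in unresolved_nodes}

print(len(unresolved_nodes), "nodes before,", len(resolved_nodes), "after")
```

Grouping the 19 observed interfaces into their 9 routers brings the map back to the 12 nodes of the underlying network.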
Genuine Subnet Resolution • Alias resolution • IP addresses that belong to the same router • Subnet resolution • IP addresses that are connected over the same medium [Figures: alias grouping and subnet grouping of addresses IP1 to IP6]
Outline • Introduction • Internet Topology Measurement • Topology Discovery Issues • Impact of IP Alias Resolution • Topology Discovery • Resolving Anonymous Routers (Hakan’s work !) • Graph-based Induction Technique • Resolving Alias IP Addresses • Analytical and Probe-based Alias Resolution • Resolving Genuine Subnets • Dynamic Subnet Inference • Summary
Summary - Anonymous Router Resolution • Responsiveness reduced in the last decade • NP-hard problem • Graph Based Induction Technique • Practical approach for anonymous router resolution • Takes ~6 hours to process data sets of ~20M path traces • Identifies common structures • Handles all anonymity types • Helpful in resolving multiple anonymous routers in a locality [Figure: nodes C, A, D, E shown as underlying topology, collected topology, neighbor matching result, and GBI result]
IP Alias Resolution Problem • A set of collected traces • w, …, b1, a1, c1, …, x • z, …, d1, a2, e1, …, y • x, …, c2, a3, b2, …, w • y, …, e2, a4, d2, …, z [Figure: a sub-graph with router a connected to b, c, d, e, on paths between w, x, y, z] • A router may appear with different IP addresses in different path traces • Need to resolve IP addresses belonging to the same router [Figure: sample map from the collected path traces with no alias resolution, where a1, a2, a3, a4, b1, b2, c1, c2, d1, d2, e1, e2 are separate nodes]
IP Alias Resolution Problem [Figure: the sub-graph] [Figure: partial alias resolution (only router a is resolved)] [Figure: partial alias resolution (only router a is not resolved)]
IP Alias Resolution: Previous Approaches • Source IP Address Based Method [Pansiot 98] • Relies on a particular implementation of ICMP error generation • IP Identification Based Method (ally) [Spring 03] • Relies on a particular implementation of the IP identifier field • Many routers ignore direct probes • DNS Based Method [Spring 04] • Relies on similarities in the host name structures, e.g. sl-bb21-lon-14-0.sprintlink.net and sl-bb21-lon-8-0.sprintlink.net • Works when systematic naming is used • Record Route Based Method [Sherwood 06] • Depends on router support for IP record route processing [Figure, source IP method: a probe sent to Dest = A is answered from address B] [Figure, IP ID method: probes sent to Dest = B, Dest = A, Dest = B are answered with B, ID=99; A, ID=100; B, ID=103]
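The IP identification idea can be illustrated with a small check: if replies collected from two addresses in the order B, A, B carry closely spaced, increasing IP IDs, the addresses probably share one counter and therefore one router. This is a sketch of the principle only, not the ally tool's code; the function name and the `max_gap` threshold of 200 are assumptions made for the example.

```python
def share_ip_id_counter(id1, id2, id3, max_gap=200):
    """Return True if three IP IDs look like samples of one shared counter.

    id1 and id3 come from one address, id2 from the other, in probe order.
    The IP ID field is 16 bits wide, so differences are taken modulo 65536.
    max_gap is an illustrative threshold, not a value from the slides.
    """
    step1 = (id2 - id1) % 65536
    step2 = (id3 - id2) % 65536
    return step1 > 0 and step2 > 0 and step1 + step2 <= max_gap

# IDs from the slide's figure: B, ID=99; A, ID=100; B, ID=103.
print(share_ip_id_counter(99, 100, 103))
```

The modulo arithmetic keeps the check valid when the counter wraps around between probes.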
Analytical Alias Resolution Approach • Leverage IP address assignment convention to infer IP aliases • Identify symmetric path segments within the collected set of path traces • Infer IP aliases • Use a number of checks to • Remove false positives • Increase confidence in the identified IP aliases
IP Address Assignment Practices: Point-to-point Links • For a point-to-point link • use either a /30 subnet or a /31 subnet • The interface IP addresses on the link are consecutive and are within a /30 subnet or a /31 subnet • use ↔ to represent the subnet relation between two IP addresses • Use the subnet relation (↔) to infer IP aliases • Example /30 network 192.168.1.4/30: interfaces 192.168.1.5 and 192.168.1.6 • Example /31 network 192.168.1.4/31: interfaces 192.168.1.4 and 192.168.1.5 [Figure: routers A and B joined by a point-to-point link]
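The subnet relation (↔) for point-to-point links can be tested with Python's standard `ipaddress` module. The sketch below is one possible reading of the convention: two addresses are related if they fall in the same /31, or in the same /30 as its two usable host addresses. The function name `p2p_subnet` is hypothetical.

```python
import ipaddress

def p2p_subnet(x, y):
    """Return the /31 or /30 network joining x and y, or None if unrelated."""
    a, b = ipaddress.ip_address(x), ipaddress.ip_address(y)
    if a == b:
        return None
    for prefix in (31, 30):
        net = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
        if b not in net:
            continue
        reserved = (net.network_address, net.broadcast_address)
        # In a /30 the network and broadcast addresses are not interfaces.
        if prefix == 30 and (a in reserved or b in reserved):
            continue
        return net
    return None

print(p2p_subnet("192.168.1.5", "192.168.1.6"))
print(p2p_subnet("192.168.1.4", "192.168.1.5"))
```

The /31 case is tried first because a /31 has no reserved addresses, so both of its addresses can be interfaces.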
IP Address Assignment Practices: Multi-access Links • A similar relation holds between IP addresses belonging to the same multi-access link • Example: Consider two IP addresses A: 129.119.1.10 and B: 129.119.1.13 • A and B are not together in a /30 or a /31 subnet • However, they are together in the /29 subnet 129.119.1.8/29 • In binary, the last octets are A: 129.119.1.00001010 and B: 129.119.1.00001101, which share their first five bits [Figure: A (.10) and B (.13) attached to the 129.119.1.8/29 subnet]
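The /29 in this example can be derived mechanically: the smallest subnet holding two addresses is given by the length of their common bit prefix. A minimal sketch with the standard `ipaddress` module follows; the function name is hypothetical.

```python
import ipaddress

def smallest_common_subnet(x, y):
    """Return the smallest IPv4 network that contains both addresses."""
    a = int(ipaddress.IPv4Address(x))
    b = int(ipaddress.IPv4Address(y))
    # Bits that differ show up in the XOR; everything above them is shared.
    prefix = 32 - (a ^ b).bit_length()
    return ipaddress.ip_network(f"{x}/{prefix}", strict=False)

print(smallest_common_subnet("129.119.1.10", "129.119.1.13"))
```

For .10 (00001010) and .13 (00001101) the XOR is 00000111, so three host bits differ and the common prefix is 29 bits long.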
Analytical Alias Resolution: Sample traceroute pairs [Figure: a pair of path traces between UTD and MIT. Hops shown: no response, 129.110.95.1, no response, 129.110.5.1, 206.223.141.74, 206.223.141.73, 206.223.141.69, 206.223.141.70, 198.32.8.33, 198.32.8.34, 198.32.8.65, 198.32.8.66, 198.32.8.85, 198.32.8.84, 192.5.89.10, 192.5.89.89, 192.5.89.9, 192.5.89.90, 18.168.0.27, 18.7.21.1, 18.168.0.25, 18.7.21.84] • Aliases • 129.110.5.1 - 206.223.141.74 • 206.223.141.73 - 206.223.141.69 • 206.223.141.70 - 198.32.8.33 • …
APAR: Analytical and Probe-based Alias Resolution • Possible sources of error • incorrect subnet assumption: two /30 subnets assumed to be a /29 • incorrect alignment of path traces: IP4 and IP8 are thought of as aliases • To prevent false positives, some conditions are defined • Trace preservation • Distance preservation (probing component of APAR) • Completeness • Common neighbor [Figure: a sample network with routers a, b, c, d, e, f and addresses IP1, IP2, IP3, IP4, IP7, IP8, IP9]
Analytical Alias Resolution: Main Idea • Use traceroute collected path traces only • No probing is required at this point • Study the relations between IP addresses in different traces • Infer subnets: Use the IP address assignment convention to infer • Point-to-point (/30 or /31) subnets, or • Multi-access (/x where x<30) subnets from the path traces • Infer IP aliases: Align path segments to infer IP aliases from the detected subnets
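The alignment step can be sketched as follows. If a forward trace contains consecutive hops (a, b) and a reverse trace contains an address d on the same point-to-point link as b (b ↔ d), then d is the far end of that link and sits on the same router as a. The hop order below is a reconstruction from the alias list on the sample traceroute slide and should be read as an assumption; the function names are hypothetical and only /30 and /31 links are handled.

```python
import ipaddress

def on_same_link(x, y):
    """True if x and y are the two ends of a /31 or /30 point-to-point link."""
    a = int(ipaddress.IPv4Address(x))
    b = int(ipaddress.IPv4Address(y))
    if a == b:
        return False
    same_31 = a >> 1 == b >> 1
    # In a /30 only the two middle addresses are usable interfaces.
    same_30_hosts = a >> 2 == b >> 2 and {a & 3, b & 3} == {1, 2}
    return same_31 or same_30_hosts

def infer_aliases(forward, reverse):
    """Pair each forward hop with the reverse-trace address on its next link."""
    aliases = []
    for prev_hop, hop in zip(forward, forward[1:]):
        for candidate in reverse:
            if on_same_link(hop, candidate):
                aliases.append((prev_hop, candidate))
    return aliases

# Assumed hop order, reconstructed from the slide's alias list.
forward = ["129.110.5.1", "206.223.141.73", "206.223.141.70", "198.32.8.34"]
reverse = ["198.32.8.33", "206.223.141.69", "206.223.141.74"]
for pair in infer_aliases(forward, reverse):
    print(pair)
```

The sketch produces candidate pairs only; the checks described on the following slides would still be needed to filter false positives.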
Analytical Alias Resolution: Potential Issues • Problems with inferring subnets accurately • False positive: two separate subnets with consecutive /30 subnet numbers may be inferred as one /29 subnet • False negative: a /29 subnet may be inferred as two separate /30 subnets • Problems with inferring IP aliases accurately • False positives and false negatives possible due to incorrectly formed subnets • Both false positives and false negatives introduce inaccuracies to the resulting topology map
Analytical Alias Resolution: Potential Solutions • How to verify the accuracy of formed subnets • Accuracy condition: Two or more IP addresses from the same subnet cannot appear in a loop-free trace (unless they are consecutive) • Check if a newly formed subnet violates this condition for any pair of available IP addresses from this subnet in any other path trace • Completeness condition: To infer a /x subnet among a set of IP addresses that belong to the address range, require that some fraction (e.g., 50%) of these addresses appear in our data set • Needed to increase our confidence in the inferred subnet • Processing order: Start with subnets with higher completeness ratio
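Both subnet checks are easy to express in code. The sketch below is one interpretation of the two conditions, with hypothetical function names; the completeness ratio is computed over the usable host addresses of the candidate subnet, which is an assumption about how the fraction is counted. The addresses in the usage lines are illustrative.

```python
import ipaddress

def completeness(subnet, observed):
    """Fraction of the subnet's host addresses that appear in the data set."""
    hosts = list(ipaddress.ip_network(subnet).hosts())
    seen = {ipaddress.ip_address(ip) for ip in observed}
    return sum(1 for h in hosts if h in seen) / len(hosts)

def violates_accuracy(subnet, trace):
    """True if two subnet members appear in one trace without being adjacent."""
    net = ipaddress.ip_network(subnet)
    positions = [i for i, hop in enumerate(trace)
                 if ipaddress.ip_address(hop) in net]
    return len(positions) >= 2 and positions[-1] - positions[0] > 1

observed = ["129.119.1.9", "129.119.1.10", "129.119.1.13"]
print(completeness("129.119.1.8/29", observed))
print(violates_accuracy("129.119.1.8/29",
                        ["129.119.1.10", "10.0.0.1", "129.119.1.13"]))
```

A candidate subnet would be kept only if its completeness reaches the chosen fraction and no trace violates the accuracy condition.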
Analytical Alias Resolution: Potential Solutions • How to verify the accuracy of inferred IP aliases • No loop condition: No inferred IP aliases should introduce any routing loops in any of the path traces • Example: Consider two traces • (…, a, b, c, d, …) • (…, e, f, g, h, b, i, …) (reverse trace) • Assume a subnet relation (g ↔ c) • Inferred alias pair (b, g) causes a loop: b and g both appear in the second trace, so the same router would be visited twice
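The no loop condition can be checked by merging the candidate pair into one node and looking for a repeated node in any trace. A minimal sketch with the example traces from this slide follows; the function name is hypothetical, and the input traces are assumed to be loop-free.

```python
def introduces_loop(traces, alias_pair):
    """True if treating the two addresses as one router repeats a hop."""
    s, t = alias_pair
    for trace in traces:
        # Rename t to s, then look for a router that occurs twice.
        merged = [s if hop == t else hop for hop in trace]
        if len(set(merged)) != len(merged):
            return True
    return False

traces = [
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h", "b", "i"],  # reverse trace
]
print(introduces_loop(traces, ("b", "g")))
```

The pair (b, g) is rejected because the second trace would pass through the merged router twice.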
Analytical Alias Resolution: Potential Solutions • How to verify the accuracy of inferred IP aliases • Common neighbor condition: Given two IP addresses s and t that are candidate aliases belonging to a router R, one of the following cases should hold: • s and t have a common neighbor in some path trace • There exists an alias pair (b,o) such that • b is a successor (or predecessor) of s • o is a predecessor (or successor) of t • involved traces are aligned such that they form two subnets, one at each side of router R • Distance condition: Given two IP addresses s and t that are candidate aliases for a router R, s and t should be at a similar distance from a vantage point • Adds an active probing component to the solution
Evaluations: Coverage Comparisons • AMP: ally (1,884 pairs) and APAR (2,034 pairs) • iPlane: ally (39,191 pairs) and APAR (50,206 pairs) • Complete ally requires (275K)^2 probes [Figure, AMP: overlap of Ally and APAR alias pairs with regions "Ally disagree" and "Causing loop"; values 1,003, 45, 864, 986, 34] [Figure, iPlane: overlap of Ally, APAR and source IP based alias pairs with regions "Ally disagree" and "Causing loop"; values 10,678, 22,886, 6,179, 3,058, 11,070, 2,514, 8,206]
Summary: Analytical and Probe-based Alias Resolution • The IP alias resolution task has a considerable effect on most of the analyzed topological characteristics • In general, false negatives have more impact than false positives • APAR • benefits from the IP address assignment of links • focuses on structural connections between routers • is more effective on data sets that • include symmetric path segments • are collected from a large number of vantage points • requires no/minimal probing overhead • complements probe-based approaches
Genuine Subnet Resolution: Problem • Subnet resolution • Identify IP addresses that are connected over the same medium • Improve the quality of the resulting topology map [Figure: routers A, B, C, D with addresses IP1, IP2, IP3, shown as (underlying topology), (observed topology) and (inferred topology)]
Subnet Resolution: Advantages • Improve the quality of the resulting topology map • Increase the scope of the map [Figure: routers A, B, C, D shown as (genuine topology) vs (observed topology) vs (inferred topology)]
Subnet Resolution: Advantages • Improve alias resolution process • Reduce the number of probes in ally based alias resolution • The ally tool requires O(n^2) probes to resolve aliases among n IP addresses • We could determine ally probes based on subnets • This approach reduces the number of probes to O(n·s), where s is the average number of IP addresses in a subnet [Figure: a trace IPa … IPb … IPc … IPd with the subnets of its hops, containing IPe, IPf, IPk, IPl, IPg, IPh, IPi]
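A quick back-of-the-envelope comparison shows the scale of the saving. The value n = 275,000 is the address count quoted on the evaluation slide; the average subnet size s = 4 is purely illustrative and not a figure from the slides. Constant factors are ignored, as in the O(·) estimates.

```python
def ally_probe_estimates(n, s):
    """Return (exhaustive, subnet_guided) probe counts up to constant factors."""
    return n * n, n * s

exhaustive, guided = ally_probe_estimates(n=275_000, s=4)  # s is hypothetical
print(f"exhaustive: {exhaustive:,}  subnet-guided: {guided:,}")
```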
Subnet Resolution: Approach [Figure: addresses within 129.110.0.0/16 grouped into inferred subnets of different sizes: 129.110.219.0/24, 129.110.4.0/24, 129.110.17.0/24, 129.110.6.0/28, 129.110.12.0/29 (members .1 to .6), 129.110.1.0/30 and 129.110.2.0/31]
Importance of IP Alias Resolution