330 likes | 437 Views
Characterizing the Internet Hierarchy from Multiple Vantage Points. Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ http://www.research.att.com/~jrex.
E N D
Characterizing the Internet Hierarchy from Multiple Vantage Points Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ http://www.research.att.com/~jrex Work with L. Subramanian, S. Agarwal, and R. Katz http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
Outline • Internet architecture • Autonomous Systems (ASes) and IP addressing • Interdomain routing and AS relationships • Type-of-relationship problem • Motivation, formulation, and practical challenges • Analyzing partial views of the AS graph • Assigning a rank to each AS from a single vantage point • Comparing ranks of ASes across multiple vantage points • Analysis results • Interdomain routing data and inferred AS relationships • Five-level classification of the Internet hierarchy • Conclusions
Internet Architecture • Divided into Autonomous Systems • Distinct regions of administrative control (~14,000) • Set of routers and links managed by a single institution • Service provider, company, university, … • Hierarchy of Autonomous Systems • Large, tier-1 provider with a nationwide backbone • Medium-sized regional provider with smaller backbone • Small stub network run by a company or university • Interaction between Autonomous Systems • Internal topology is not shared between ASes • … but, neighboring ASes interact to coordinate routing
Autonomous Systems (ASes) Path: 6, 5, 4, 3, 2, 1 4 3 5 2 6 7 1 Web server Client (128.235.184.76)
10000001 11111110 00010011 00011100 IP Addressing (for the host www.etri.re.kr) • 32 bits in dotted-quad notation (129.254.19.28) • Divided into network and host portions • 129.254.0.0/16: 16-bit prefix with 232-16 addresses 129 254 19 28 Network (16 bits) Host (16 bits)
Border Gateway Protocol (BGP) • ASes exchange info about who they can reach • IP prefix: block of destination IP addresses • AS path: sequence of ASes along the path • Policies configured by the AS’s network operator • Path selection: which of the paths to use? • Path export: which neighbors to tell? “I can reach 129.254.0.0/16 via AS 4766” “I can reach 129.254.0.0/16” ETRI 7018 4766 3748 AT&T KIX 129.254.19.28
Traffic from the customer provider d traffic customer Customer-Provider Relationship • Customer pays provider for access to the Internet • AS exports customer’s routes to all neighbors • AS exports provider’s routes only to its customers Traffic to the customer advertisements provider d customer
traffic Peer-Peer Relationship • Peers exchange traffic between their customers • Free of charge (assumption of even traffic load) • AS exports a peer’s routes only to its customers Traffic to/from the peer and its customers advertisements peer peer d
Moving Beyond the AS Graph • AS graph doesn’t capture effects of routing policies • Some paths not allowed (e.g., transit through a peer) • Local preference of paths (e.g., prefer customer path) • Can’t simply compute shortest path on the graph • AS graph doesn’t capture the Internet hierarchy • Large AS may have few neighbors (mostly peers) • Small AS may have many neighbors (mostly providers) • Can’t simply rely on node degree to rank the nodes • Need to know the relationship between AS pairs
AS Relationships Matter • Understanding the Internet’s basic structure • More than just a graph of interconnected ASes • Placement of servers for content distribution • Place replicas so they can reach everyone well • Selection of new peers or providers for an AS • Select peers/providers with good “reach” to others • Installing route filters to protect the network • Filter customer routes that include other large providers • Analyzing the convergence properties of BGP • Hierarchical relationships avoid protocol oscillation
Inferring Relationships from Routing Data • Practical realities of the Internet • AS graph is not known • AS relationships are proprietary • … at least some routing data is publicly available! • Exploiting routing data • Available via traceroute experiments or BGP tables • Provides a set of AS paths, such as “7018 4766 3748” • Implies existence of edges (7018, 4766) & (4766, 3748) • Implies that 4766 (KIX) allows AS 7018 (AT&T) to transit to AS 3748 (ETRI)
Valid and Invalid Paths • AS relationships limit the kinds of valid paths • Uphill portion: customer-provider relationships • Plateau: zero or one peer-peer edge • Downhill portion: provider-customer relationships Qwest AT&T AT&T UUNET Invalid Invalid Valid UUNET Sprint AT&T Rutgers NJIT AT&T Research Lixin Gao, “On inferring Autonomous System relationships in the Internet,” IEEE/ACM Transactions on Networking, December 2001.
Type-of-Relationship Problem • Given the inputs • AS graph G(V,E) with vertices V and edges E • Set of paths P on the graph G • Find a solution that • Labels each edge with an AS relationship • Minimizes the number of invalid paths in P • Properties of the problem • NP complete (?) • May have multiple solutions • We propose a heuristic algorithm
Practical Challenges • Peer-peer relationships are hard to infer • Mislabeling a peer-peer edge as provider-customer does not change a valid path into an invalid path • We use heuristics to detect the peer-peer edges • Some AS pairs have unusual relationships • Sibling ASes that provide transit service for each other • Backup relationship for connectivity under failure • Misconfiguration of a conventional AS relationship • We detect these cases by analyzing the “invalid” paths • Getting access to a large path set P is hard • We exploit BGP routing tables from multiple vantage points
Validation Approaches • Quantify the number of invalid paths • Small number suggests better results • …still, this doesn’t mean that inferences are correct • Compare results with other inference algorithms • Higher confidence if inferences are the same • … still, both algorithms could give wrong answers • Compare results with Routing Arbiter Database • Higher confidence if consistent with RADB routing policies • … still, RADB information is incomplete and out-of-date • Compare results with proprietary ISP data • Higher confidence if answers are correct for this AS • … still, answers may be wrong for other ASes
D C C C D D E B E A A F A F F Partial View of the AS Graph • Routing data from a single source AS • Collection of paths starting from the source • Directed graph from union of all edges in these paths B Actual graph E B
C D D C E E B F A F A Assigning Rank to AS in a Partial View • Reverse pruning algorithm to assign rank • Rank 1 to the leaves, then remove leaves • Rank 2 to the leaves, then remove leaves… • Single (largest) rank to nodes in connected component, if any B 5 5 1 1 4 4 3 3 2 2 1 1
Combining Information From Multiple Views • Vector of ranks for each AS • A single element for each of the n views • Dominance: provider-customer relationship • Provider has higher ranks than customer in most views • For example, B has (2,5) and A has (1,1) • Equivalence: peer-peer relationship • Peers have equal ranks in or inconsistent ranks • For example, C has (3,4) and D has (4,3) • Probabilistic inference • Thresholds to tolerate some variations across the views • E.g., an AS dominates in n-1 views and dominated in 1
Applying Our Algorithm • Applying the algorithm to ten public BGP tables • RouteViews table and nine Looking Glass servers • Extracted set of unique paths P for each view • Applied reverse pruning algorithm to each view • Applied inference rules to the vectors of ranks • Results of the analysis on data from April 2001 • AS graph with 10,698 ASes and 23,935 edges • Inferences were made for 99.2% of the edges • 94.5% provider-customer and 4.7% peer-peer edges • Most inferences do not require the probabilistic rules
Advantage of Multiple Vantage Points • A single vantage point is not enough • 15% of the edges appear in exactly one BGP table • Only 25% of the edges appear in all ten BGP tables
Analyzing Invalid Paths • Checking the validity of inferences • Assume the relationship inferences are correct • Identify paths that are invalid under these inferences • Compute the number of invalid paths • Investigate common anomaly triples (A, B, C) • Results of our analysis • Applied to paths in 2 of the original 10 BGP tables • Applied to paths in 4 other BGP tables • 0.5-3% of paths are invalid for five of the six tables • 8.7% of paths are invalid for the KDDI table
Common Anomaly Patterns • Misconfiguration • (1, 65112, 6461): 65112 is a private AS that should not appear between Genuity and AboveNet • Sibling relationships • (7018, 6841, 3300): Infonet Europe merged with AUCS • (1239, 1740, 7018): Cerfnet was acquired by AT&T • (1239, 8043, 6395): IXC Communications acquired SmartNAP and renamed Broadwing • Heuristic for identifying sibling relationships • AS pair that appears in a large number of “invalid” paths • Our analysis identified 22 possible sibling relationships
AS Classification • Directed AS graph • Directed edge from provider to customer • Bidirectional edge between two peers • Lowest level: Stubs • Leaf nodes: no peers or downstream customers • 8898 of the 10915 ASes (82.5% of ASes) • Ex: UC Berkeley (25), AT&T Labs (6431), and ETRI (3748) • Next lowest level: Regional ISPs • Leaf nodes after successive pruning of leaf nodes • 971 ASes of the 10915 ASes (8.9% of ASes) • Ex: PacBell (5676), US West (6223), and UUNET Canada (815) • Remaining 1046 ASes: Core
Dense Core • Ways to classify so-called “tier-1” ASes • Any AS with no upstream provider (98 such nodes) • AS set that forms the largest clique of peer edges (13 nodes) • Relaxing the definition • Tolerate some missing or misclassified edges • Tolerate some ASes with sibling relationships • “Almost a clique” • Subgraph of m nodes with in and out degree at least m/2 • Greedy algorithm for locating the largest near-clique • 20 ASes in the near-clique • 15 of the ASes form a subgraph just 3 edges short of a clique • Genuity, Sprint, UUNET, AT&T, Verio, Level3, C&W,…
Transit and Outer Core • Transit core • ASes that peer with the dense core and each other • Notion of a “weak in-way cut” to isolate these ASes • Algorithm for identifying the ASes in transit core • 129 ASes, including top providers in Europe and Asia • Ex: UUNET Europe, KDDI, and Singapore Telecom • Outer core • All of the remaining ASes in the core • 897 ASes, including large regional and national ISPs • Ex: Turkish Telecom and Minnesota Regional Network
Node Degree is Not Enough • Node degree ignores relationships • A stub AS may have many upstream providers (one bankrupt!) • A core AS may have a small # of peers (TeliaNet USA degree 40) • Some ASes have customers that don’t have AS numbers
Related Work • AS graph characterization • Constructing graph from BGP tables or traceroute experiments • Characterizing the topological properties of the graph • Inferring AS relationships (Lixin Gao) • Identifies the key properties of paths (uphill, downhill, etc.) • Heuristic using node degree to infer boundary point between the uphill and downhill portions of the path • Application of the algorithm using RouteViews routing table • Characterization of the hierarchy of ASes • Early work by Govindan/Reddy based on node degree • Recent work by Ge et al based on AS relationships
Conclusions • Inferring AS relationships • Reverse pruning to assign rank to each AS • Comparison of ranks from different vantage points • Performance evaluation • Application of algorithm to collection of ten BGP tables • Exploration of the anomalies that cause invalid paths • Characterization of Internet hierarchy • Stub, regional ISP, outer core, transit core, & dense core • Algorithms for identifying the three parts of the core • Application to AS graph inferred from the BGP tables
Ongoing Work • Classification of siblings • Use anomalous triples (A, B, C) to identify siblings • Group siblings into a single node (with union of edges) • Repeat classification of the AS hierarchy on new graph • Longitudinal study • Repeat the study over a period of time with new data • Study how AS relationships and hierarchy changes • Validation of our inference results • Compare to RADB, Lixin’s results, AT&T data, etc. • http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
1 701 703 9304 7018 Genuity Hutchinson UUNet AT&T Digression #1: Really Weird “Invalid” Paths… • Properties of the path • Two tier-1 U.S. providers (Genuity and UUNet) • One service provider in Hong Kong (Hutchinson) • Another tier-1 U.S. provider (AT&T) at the end of the path • Looking at internal AT&T configuration data… • AT&T does not have a BGP session with AS 9304 • AT&T does not originate the prefixes (e.g., 152.141.116.0/24) • Explanation • Another AS was using the AT&T AS number (for over a year!) • We sent them an e-mail and asked them to stop, and they did
Digression #1: How Could This Happen, and Persist? • BGP configuration is done locally by neighbors • Customer configures its router with AS number 7018 • Provider configures its router with neighbor of 7018 • The misconfiguration didn’t necessarily cause a problem • Hop-by-hop routing took the traffic to the right place • Most BGP policies don’t look at the identity of the ASes • Could have caused a problem: route filtering • Large providers might applying filtering to customer routers • Discard routes with other large providers in the path • Could have caused a problem: loop detection • The bogus routes did not appear in AT&T’s routing tables • AT&T router saw 7018 in the path and discarded the route • AT&T router did have a route for the supernet (152.141.0.0/16)
Digression #2: Making Up The Examples • Getting an IP address • E-mail from ansari@adm.njit.edu • “nslookup adm.njit.edu” gave 128.235.184.176 • Finding the AS number and a sample path • telnet://route-views.oregon-ix.net/ • “show ip bgp 128.235.184.176” • One of the paths is “7018 209 4246” • Double checking this belongs to NJIT • “whois –h whois.arin.net 128.235.184.176” • “whois –h whois.arin.net 4246”
Digression #2: whois –h whois.arin.net 128.235.184.76 NetType: Direct Assignment NameServer: DNS1.NJIT.EDU NameServer: DNS2.NJIT.EDU NameServer: THORIN.UTHSCSA.EDU Comment: RegDate: Updated: 2001-04-05 TechHandle: PT35-ARIN TechName: Teklinski, Peter TechPhone: +1-973-596-2908 TechEmail: pt@njit.edu OrgName: New Jersey Institute of Technology OrgID: NJIT Address: 323 Martin Luther King Boulevard City: Newark StateProv: NJ PostalCode: 07102 Country: US NetRange: 128.235.0.0 - 128.235.255.255 CIDR: 128.235.0.0/16 NetName: NJIT NetHandle: NET-128-235-0-0-1 Parent: NET-128-0-0-0-0