Characterizing the Internet Hierarchy from Multiple Vantage Points Jennifer Rexford Internet and Networking Systems AT&T Labs - Research; Florham Park, NJ http://www.research.att.com/~jrex Work with L. Subramanian, S. Agarwal, and R. Katz http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
Outline • Internet architecture • ASes, IP addressing, BGP routing, and AS relationships • Type-of-relationship problem • Motivation, formulation, and practical challenges • Analyzing partial views of the AS graph • Assigning a rank to each AS from a single vantage point • Comparing ranks of ASes across multiple vantage points • Analysis results • BGP routing data and inferred AS relationships • AS paths that are inconsistent with the inferences • Five-level classification of the Internet hierarchy • Conclusions
Internet Architecture • Divided into Autonomous Systems • Distinct regions of administrative control (~11,000) • Set of routers and links managed by a single institution • Service provider, company, university, … • Hierarchy of Autonomous Systems • Large, tier-1 provider with a nationwide backbone • Medium-sized regional provider with smaller backbone • Small stub network run by a company or university • Interaction between Autonomous Systems • Internal topology is not shared between ASes • … but, neighboring ASes interact to coordinate routing
Autonomous Systems (ASes) [Figure: ASes numbered 1 through 7, with a client and a web server; example AS path: 6, 5, 4, 3, 2, 1]
IP Addressing and Prefixes • 32 bits in dotted-quad notation (12.34.158.5) • Divided into network and host portions • 12.34.158.0/23 is a 23-bit prefix with 2^9 (512) addresses [Figure: 12.34.158.5 written in binary as 00001100 00100010 10011110 00000101, split into a network portion (23 bits) and a host portion (9 bits)]
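The prefix arithmetic on this slide can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch; the variable names are chosen here and are not part of the original slides.

```python
import ipaddress

# The prefix and address used on the slide.
prefix = ipaddress.ip_network("12.34.158.0/23")
address = ipaddress.ip_address("12.34.158.5")

# A /23 prefix leaves 32 - 23 = 9 host bits, so it covers 2**9 addresses.
host_bits = prefix.max_prefixlen - prefix.prefixlen
num_addresses = prefix.num_addresses

# The address falls inside the prefix because its first 23 bits match.
covers_address = address in prefix

# The 32-bit binary form of the address, as drawn on the slide.
address_bits = format(int(address), "032b")

print(host_bits, num_addresses, covers_address, address_bits)
```

The first 23 characters of `address_bits` are the network portion and the last 9 are the host portion.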
Interdomain Routing with BGP (Between ASes) • ASes announce info about prefixes they can reach • Local policies for path selection (which to use?) • Local policies for route propagation (who to tell?) • Policies configured by the AS’s network operator [Figure: ASes 1, 2, and 3 and the host 12.34.158.5, with the announcements “I can reach 12.34.158.0/23” and “I can reach 12.34.158.0/23 via AS 1”]
Customer-Provider Relationship • Customer pays provider for access to the Internet • AS exports customer’s routes to all neighbors • AS exports provider’s routes only to its customers [Figure: two panels, “Traffic to the customer” and “Traffic from the customer”, each showing advertisements and traffic between a provider and a customer for destination d]
Peer-Peer Relationship • Peers exchange traffic between their customers • Free of charge (assumption of even traffic load) • AS exports a peer’s routes only to its customers [Figure: “Traffic to/from the peer and its customers”, showing advertisements and traffic between two peers for destination d]
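The export rules from the customer-provider and peer-peer slides can be condensed into one small predicate. This is a minimal sketch: the function name `should_export` and the string labels are hypothetical, and real BGP policies are configured per router and can be far more detailed.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"

def should_export(learned_from, neighbor):
    """Decide whether a route is advertised to a neighbor.

    learned_from: what the AS that gave us the route is to us.
    neighbor:     what the AS we might advertise to is to us.
    """
    if learned_from == CUSTOMER:
        # Customer routes are exported to all neighbors.
        return True
    # Routes learned from a peer or a provider go only to customers.
    return neighbor == CUSTOMER

print(should_export(CUSTOMER, PEER))   # customer route offered to a peer
print(should_export(PEER, PROVIDER))   # peer route offered to a provider
```

Under these rules an AS never carries traffic between two of its peers or providers, which is what makes transit through a peer an invalid path.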
AS Relationships Matter • Motivating problems • Placement of servers for content distribution network • Selection of new peers or providers for an AS • Analyzing the convergence properties of the BGP protocol • Installing route filters to protect against misconfiguration • Understanding of the basic structure of the Internet • Knowing the AS graph is not enough • Interdomain routing is not shortest-path routing • Some paths not allowed (e.g., transit through a peer) • Local preference of paths (e.g., prefer customer path) • Node degree does not define the Internet hierarchy • Need to know the relationship between AS pairs
Inferring Relationships from Routing Data • Practical realities of the Internet • AS graph is not known • AS relationships are proprietary • … at least some routing data is publicly available! • Exploiting routing data • Available via traceroute experiments or BGP tables • Provides a set of AS paths, such as “701 7018 46” • Implies existence of edges (701, 7018) and (7018, 46) • Implies that 7018 (AT&T) allows AS 701 (UUNet) to transit to AS 46 (Rutgers)
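As a rough illustration of how AS paths imply edges, the sketch below turns path strings such as “701 7018 46” into a set of directed AS pairs. The function name `edges_from_paths` is hypothetical, and skipping repeated AS numbers (AS prepending) is an assumption added here rather than a step described on the slide.

```python
def edges_from_paths(paths):
    """Collect the directed AS-level edges implied by a set of AS paths."""
    edges = set()
    for path in paths:
        hops = [int(asn) for asn in path.split()]
        for a, b in zip(hops, hops[1:]):
            if a != b:  # assumption: ignore an AS repeated by prepending
                edges.add((a, b))
    return edges

print(sorted(edges_from_paths(["701 7018 46"])))
```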
Valid and Invalid Paths • AS relationships limit the kinds of valid paths • Uphill portion: customer-provider relationships • Plateau: zero or one peer-peer edge • Downhill portion: provider-customer relationships [Figure: one valid and two invalid example paths] Source: Lixin Gao, “On inferring Autonomous System relationships in the Internet,” IEEE/ACM Transactions on Networking, December 2001.
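The uphill, plateau, downhill rule can be checked mechanically once each edge of a path is labeled. The sketch below assumes the path is given as a list of edge labels in the order the edges are traversed; the label strings and the name `is_valid_path` are illustrative choices, not taken from the slides.

```python
UP = "customer-provider"     # edge climbs from a customer to its provider
PEER = "peer-peer"           # edge crosses between two peers
DOWN = "provider-customer"   # edge descends from a provider to its customer

def is_valid_path(edge_types):
    """Return True if the labeled path is uphill, then at most one peer edge, then downhill."""
    past_the_top = False  # set once a peer edge or a downhill edge has been seen
    for edge in edge_types:
        if edge == UP:
            if past_the_top:
                return False  # climbing again after the plateau or descent
        elif edge == PEER:
            if past_the_top:
                return False  # second peer edge, or a peer edge after descending
            past_the_top = True
        elif edge == DOWN:
            past_the_top = True
        else:
            raise ValueError(f"unknown edge type: {edge!r}")
    return True

print(is_valid_path([UP, UP, PEER, DOWN]))
print(is_valid_path([UP, DOWN, UP]))
```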
Type-of-Relationship Problem • Given the inputs • AS graph G(V,E) with vertices V and edges E • Set of paths P on the graph G • Find a solution that • Labels each edge with an AS relationship • Minimizes the number of invalid paths in P • Properties of the problem • NP complete (?) • May have multiple solutions • We propose a heuristic algorithm
Practical Challenges • Peer-peer relationships are hard to infer • Mislabeling a peer-peer edge as provider-customer does not change a valid path into an invalid path • We use heuristics to detect the peer-peer edges • Some AS pairs have unusual relationships • Sibling ASes that provide transit service for each other • Backup relationship for connectivity under failure • Misconfiguration of a conventional AS relationship • We detect these cases by analyzing the “invalid” paths • Getting access to a large path set P is hard • We exploit BGP routing tables from multiple vantage points
Validation Approaches • Quantify the number of invalid paths • Small number suggests better results • …still, this doesn’t mean that inferences are correct • Compare results with other inference algorithms • Higher confidence if inferences are the same • … still, both algorithms could give wrong answers • Compare results with Routing Arbiter Database • Higher confidence if consistent with RADB routing policies • … still, RADB information is incomplete and out-of-date • Compare results with proprietary ISP data • Higher confidence if answers are correct for this AS • … still, answers may be wrong for other ASes
Partial View of the AS Graph • Routing data from a single source AS • Collection of paths starting from the source • Directed graph from union of all edges in these paths [Figure: the actual graph and partial views over ASes A through F]
Assigning Rank to AS in a Partial View • Reverse pruning algorithm to assign rank • Rank 1 to the leaves, then remove leaves • Rank 2 to the leaves, then remove leaves… • Single (largest) rank to nodes in connected component, if any [Figure: partial views over ASes A through F, annotated with ranks from 1 to 5]
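A minimal sketch of reverse pruning on one partial view follows. It assumes a view is a list of AS paths that all start at the same source, and that a leaf is a node with no remaining outgoing edge. When only nodes on or above a cycle remain, this sketch simply gives all of them the same final rank, which is a simplification of the connected-component rule on the slide. The name `reverse_prune_ranks` and the sample paths are hypothetical.

```python
from collections import defaultdict

def reverse_prune_ranks(paths):
    """Assign a rank to every AS seen in one view by repeatedly pruning leaves."""
    successors = defaultdict(set)
    nodes = set()
    for path in paths:
        nodes.update(path)
        for a, b in zip(path, path[1:]):
            if a != b:
                successors[a].add(b)

    ranks = {}
    remaining = set(nodes)
    rank = 1
    while remaining:
        # Leaves have no outgoing edge to a node that is still present.
        leaves = {n for n in remaining if not (successors[n] & remaining)}
        if not leaves:
            # Only cyclic structure is left: share one (largest) rank.
            for n in remaining:
                ranks[n] = rank
            break
        for n in leaves:
            ranks[n] = rank
        remaining -= leaves
        rank += 1
    return ranks

# Hypothetical view seen from source S.
view = [["S", "B", "A"], ["S", "C", "D", "A"], ["S", "C", "E"]]
print(reverse_prune_ranks(view))
```

Running the same function on the paths from each vantage point yields one rank per AS per view, which is the input to the comparison across views.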
Combining Information From Multiple Views • Vector of ranks for each AS • A single element for each of the n views • Dominance: provider-customer relationship • Provider has higher ranks than customer in most views • For example, B has (2,5) and A has (1,1) • Equivalence: peer-peer relationship • Peers have equal ranks or inconsistent ranks across the views • For example, C has (3,4) and D has (4,3) • Probabilistic inference • Thresholds to tolerate some variations across the views • E.g., an AS dominates in n-1 views and is dominated in 1
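The dominance and equivalence rules can be sketched as a comparison of two rank vectors. The code below is an assumption-laden illustration: it presumes both ASes have a rank in every view, treats "most views" as "at least n minus `tolerance` views", and uses hypothetical names (`infer_relationship`, `tolerance`). The thresholds actually used in the study are not given on the slide.

```python
def infer_relationship(ranks_a, ranks_b, tolerance=0):
    """Compare the rank vectors of ASes A and B across the same n views."""
    if len(ranks_a) != len(ranks_b) or not ranks_a:
        raise ValueError("rank vectors must be non-empty and of equal length")
    n = len(ranks_a)
    if 2 * tolerance >= n:
        raise ValueError("tolerance must be less than half the number of views")

    a_higher = sum(a > b for a, b in zip(ranks_a, ranks_b))
    b_higher = sum(b > a for a, b in zip(ranks_a, ranks_b))

    if a_higher >= n - tolerance:
        return "provider-customer"  # A dominates B: A is inferred to be B's provider
    if b_higher >= n - tolerance:
        return "customer-provider"  # B dominates A
    return "peer-peer"              # equal or inconsistent ranks

print(infer_relationship((2, 5), (1, 1)))  # B versus A from the slide
print(infer_relationship((3, 4), (4, 3)))  # C versus D from the slide
```

With ten views, `tolerance=1` corresponds to the slide's example of an AS that dominates in n-1 views and is dominated in one.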
Applying Our Algorithm • Applying the algorithm to ten public BGP tables • RouteViews table and nine Looking Glass servers • Extracted set of unique paths P for each view • Applied reverse pruning algorithm to each view • Applied inference rules to the vectors of ranks • Results of the analysis on data from April 2001 • AS graph with 10,698 ASes and 23,935 edges • Inferences were made for 99.2% of the edges • 94.5% provider-customer and 4.7% peer-peer edges • Most inferences do not require the probabilistic rules
Advantage of Multiple Vantage Points • A single vantage point is not enough • 15% of the edges appear in exactly one BGP table • Only 25% of the edges appear in all ten BGP tables
Analyzing Invalid Paths • Checking the validity of inferences • Assume the relationship inferences are correct • Identify paths that are invalid under these inferences • Compute the number of invalid paths • Investigate common anomaly triples (A, B, C) • Results of our analysis • Applied to paths in 2 of the original 10 BGP tables • Applied to paths in 4 other BGP tables • 0.5-3% of paths are invalid for five of the six tables • 8.7% of paths are invalid for the KDDI table
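One way to locate anomaly triples is to scan each path for three consecutive ASes (A, B, C) where B carries traffic between two neighbors that are not its customers. A path satisfies the uphill, plateau, downhill rule exactly when it contains no such triple. This reading of "anomaly triple" is an assumption, as are the relationship labels, the function name, and the sample data below.

```python
from collections import Counter

def anomaly_triples(paths, rel):
    """Count triples (A, B, C) where B transits between two non-customers.

    rel maps a directed AS pair (x, y) to the inferred type of the edge as
    traversed from x to y: "c2p" (x is y's customer), "p2p", or "p2c".
    """
    counts = Counter()
    for path in paths:
        for a, b, c in zip(path, path[1:], path[2:]):
            first, second = rel.get((a, b)), rel.get((b, c))
            # Arriving from a provider or peer and leaving toward a
            # provider or peer violates the export rules at B.
            if first in ("p2c", "p2p") and second in ("c2p", "p2p"):
                counts[(a, b, c)] += 1
    return counts

# Hypothetical inferences and paths.
rel = {("A", "B"): "c2p", ("B", "C"): "p2c", ("C", "D"): "c2p"}
paths = [["A", "B", "C", "D"], ["B", "C", "D"], ["A", "B", "C"]]
print(anomaly_triples(paths, rel).most_common())
```

Triples that recur across many paths are the ones worth inspecting by hand, since they may point to siblings or misconfigurations rather than to wrong inferences.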
Common Anomaly Patterns • Misconfiguration • (1, 65112, 6461): 65112 is a private AS that should not appear between Genuity and AboveNet • Sibling relationships • (7018, 6841, 3300): Infonet Europe merged with AUCS • (1239, 1740, 7018): Cerfnet was acquired by AT&T • (1239, 8043, 6395): IXC Communications acquired SmartNAP and renamed Broadwing • Heuristic for identifying sibling relationships • AS pair that appears in a large number of “invalid” paths • Our analysis identified 22 possible sibling relationships
Digression: Really Weird “Invalid” Paths… [Figure: AS path 1 701 703 9304 7018, with labels Genuity, UUNet, Hutchison, and AT&T] • Properties of the path • Two tier-1 U.S. providers (Genuity and UUNet) • One service provider in Hong Kong (Hutchison) • Another tier-1 U.S. provider (AT&T) at the end of the path • Looking at internal AT&T configuration data… • AT&T does not have a BGP session with AS 9304 • AT&T does not originate the prefixes (e.g., 152.141.116.0/24) • Explanation • Another AS was using the AT&T AS number (for over a year!) • We sent them an e-mail and asked them to stop, and they did
Digression: How Could This Happen, and Persist? • BGP configuration is done locally by neighbors • Customer configures its router with AS number 7018 • Provider configures its router with neighbor of 7018 • The misconfiguration didn’t necessarily cause a problem • Hop-by-hop routing took the traffic to the right place • Most BGP policies don’t look at the identity of the ASes • Could have caused a problem: route filtering • Large providers might apply filtering to customer routers • Discard routes with other large providers in the path • Could have caused a problem: loop detection • The bogus routes did not appear in AT&T’s routing tables • AT&T router saw 7018 in the path and discarded the route • AT&T router did have a route for the supernet (152.141.0.0/16)
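The loop-detection point can be illustrated in a few lines: a BGP speaker discards a received route whose AS path already contains its own AS number. The function name `accepts_route` is hypothetical, and 64512 is used only as an arbitrary AS number that does not appear in the path.

```python
def accepts_route(local_asn, as_path):
    """Return False when the receiver's own AS number is already in the AS path."""
    return local_asn not in as_path

bogus_path = [1, 701, 703, 9304, 7018]  # the path from the previous slide

print(accepts_route(7018, bogus_path))   # an AT&T router (AS 7018)
print(accepts_route(64512, bogus_path))  # a router in an AS not on the path
```

This is why the bogus routes never showed up in AT&T's own tables, while other ASes accepted them without complaint.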
AS Classification • Directed AS graph • Directed edge from provider to customer • Bidirectional edge between two peers • Lowest level: Stubs • Leaf nodes: no peers or downstream customers • 8898 of the 10915 ASes (81.5% of ASes) • Ex: UC Berkeley (25), AT&T Labs (6431), and INRIA (1300) • Next lowest level: Regional ISPs • Leaf nodes after successive pruning of leaf nodes • 971 ASes of the 10915 ASes (8.9% of ASes) • Ex: PacBell (5676), US West (6223), and UUNET Canada (815) • Remaining 1046 ASes: Core
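The three-way split into stubs, regional ISPs, and core can be sketched as two rounds of pruning on the labeled graph. This sketch assumes a leaf is an AS with no remaining customers and no remaining peers, which is one reading of the slide; the function name and the toy graph are hypothetical.

```python
def classify_ases(nodes, customers, peers):
    """Split ASes into (stubs, regional ISPs, core) by pruning leaf nodes.

    customers: dict mapping an AS to the set of its customer ASes.
    peers:     dict mapping an AS to the set of its peer ASes.
    """
    def is_leaf(node, alive):
        has_customer = bool(customers.get(node, set()) & alive)
        has_peer = bool(peers.get(node, set()) & alive)
        return not has_customer and not has_peer

    alive = set(nodes)
    stubs = {n for n in alive if is_leaf(n, alive)}
    alive -= stubs

    regional = set()
    while True:
        leaves = {n for n in alive if is_leaf(n, alive)}
        if not leaves:
            break
        regional |= leaves
        alive -= leaves

    return stubs, regional, alive  # whatever survives pruning is the core

# Toy graph: T1 and T2 peer; R buys from T1; S1, S2, S3 are end customers.
nodes = {"T1", "T2", "R", "S1", "S2", "S3"}
customers = {"T1": {"R"}, "T2": {"S3"}, "R": {"S1", "S2"}}
peers = {"T1": {"T2"}, "T2": {"T1"}}
print(classify_ases(nodes, customers, peers))
```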
Dense Core • Ways to classify so-called “tier-1” ASes • Any AS with no upstream provider (98 such nodes) • AS set that forms the largest clique of peer edges (13 nodes) • Relaxing the definition • Tolerate some missing or misclassified edges • Tolerate some ASes with sibling relationships • “Almost a clique” • Subgraph of m nodes with in and out degree at least m/2 • Greedy algorithm for locating the largest near-clique • 20 ASes in the near-clique • 15 of the ASes form a subgraph just 3 edges short of a clique • Genuity, Sprint, UUNET, AT&T, Verio, Level3, C&W,…
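The "almost a clique" condition can be written as a simple check on a candidate set of ASes. The sketch below only tests the degree condition from the slide (in-degree and out-degree of at least m/2 within the subgraph); it does not reproduce the greedy search, whose details are not given here. A peer edge is represented as two directed edges, and all names are hypothetical.

```python
def is_near_clique(nodes, edges):
    """Check that every node has in- and out-degree >= m/2 inside the subgraph."""
    members = set(nodes)
    m = len(members)
    out_degree = dict.fromkeys(members, 0)
    in_degree = dict.fromkeys(members, 0)
    for a, b in set(edges):  # a set avoids counting a duplicate edge twice
        if a in members and b in members and a != b:
            out_degree[a] += 1
            in_degree[b] += 1
    return all(out_degree[n] >= m / 2 and in_degree[n] >= m / 2 for n in members)

def bidirectional(pairs):
    """Expand undirected peer pairs into directed edges in both directions."""
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}

ring = bidirectional([(1, 2), (2, 3), (3, 4), (4, 1)])
chain = bidirectional([(1, 2), (2, 3), (3, 4)])
print(is_near_clique({1, 2, 3, 4}, ring))
print(is_near_clique({1, 2, 3, 4}, chain))
```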
Transit and Outer Core • Transit core • ASes that peer with the dense core and each other • Notion of a “weak in-way cut” to isolate these ASes • Algorithm for identifying the ASes in transit core • 129 ASes, including top providers in Europe and Asia • Ex: UUNET Europe, KDDI, and Singapore Telecom • Outer core • All of the remaining ASes in the core • 897 ASes, including large regional and national ISPs • Ex: Turkish Telecom and Minnesota Regional Network
Node Degree is Not Enough • Node degree ignores relationships • A stub AS may have many upstream providers • A core AS may have a small number of peers • Some ASes have customers that don’t have AS numbers
Related Work • AS graph characterization • Constructing graph from BGP tables or traceroute experiments • Characterizing the topological properties of the graph • Inferring AS relationships (Lixin Gao) • Identifies the key properties of paths (uphill, downhill, etc.) • Heuristic using node degree to infer boundary point between the uphill and downhill portions of the path • Application of the algorithm using RouteViews routing table • Characterization of the hierarchy of ASes • Early work by Govindan/Reddy based on node degree • Recent work by Ge et al. based on AS relationships
Conclusions • Inferring AS relationships • Reverse pruning to assign rank to each AS • Comparison of ranks from different vantage points • Performance evaluation • Application of algorithm to collection of ten BGP tables • Exploration of the anomalies that cause invalid paths • Characterization of Internet hierarchy • Stub, regional ISP, outer core, transit core, & dense core • Algorithms for identifying the three parts of the core • Application to AS graph inferred from the BGP tables
Ongoing Work • Classification of siblings • Use anomalous triples (A, B, C) to identify siblings • Group siblings into a single node (with union of edges) • Repeat classification of the AS hierarchy on new graph • Longitudinal study • Repeat the study over a period of time with new data • Study how AS relationships and hierarchy change • Validation of our inference results • Compare to RADB, Lixin’s results, AT&T data, etc. • http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/