The Internet's Physical Topology (or, Will the Internet ever measure itself?)
Scott Kirkpatrick, School of Engineering, Hebrew University of Jerusalem
EVERGROW and OneLab2 Collaborators (thanks, not blame…): Yuval Shavitt, Eran Shir, Udi Weinsberg, Shai Carmi, Shlomo Havlin, Avishalom Shalit, Daqing Li
The Internet is our most distributed work of Engineering
• Federated initially from military and commercial networks, some of which involved highly proprietary and gratuitously different platforms: Arpanet, DECnet, PC-based systems, IBM's SNA, BitNet, Euronet…
• As a result, there are two distinct layers: BGP and above (inter-AS), and intra-AS (OSPF shortest path, MPLS, ATM, …). BGP information is exchanged by sharing recommended routes, exposing only those for which an AS will be properly compensated.
• Engineering the Internet has always been distributed, using a formal model (IETF RFCs, etc.) similar to international standards formation, yet with less commercial involvement than typical ISO practice and a US center of gravity for the deliberations.
• Layering of communications protocols has permitted a high degree of refinement, but now seems to be in stasis.
• There are no global databases, many local databases, and poor data quality.
Measuring and monitoring the Internet has undergone a revolution
• Traceroute – an old hack, now the basic tool in wide use
• Active monitors – hardware-intensive distributed software; DIMES ("Dimes@home") is one example, no longer the only one
• Many enhancements are under consideration as the problems in traceroute become very evident
• Ultimately, we expect every router (or whatever routers become in the future internet) will participate in distributed active monitoring
• The payoff comes with interactive and distributed services that can achieve greater performance at greatly decreased overhead
History of TraceRoute active measurement
• Jacobson, "traceroute" from LBL, February 1989 – something that can be rewritten for special situations, such as cellphones
• Single-machine traces to many destinations – Lucent, 1990s (Burch and Cheswick)
• Great pictures, but their interpretation is not clear; they demonstrate the need for more analytic visualization techniques (though excellent for magazine covers, t-shirts…)
• First attempt to determine the time evolution of the Internet
• First experience in operating under the "network radar"
History of Internet Measurement, ctd.
• Skitter and subsequent projects at CAIDA (SDSC): 15–50 machines (typically <25) at academic sites around the world
• RIPE and NLANR: 1–200 machines on commercial networks and telco backbones; the information is proprietary
• DIMES (>10,000 software agents) represents the next step. Current statistics:
• 8,298 users; 19,597 agents registered (in 115 countries)
• Have seen 29,404 ASes and 204,204 AS–AS links
• 6.6 B measurements saved since 9/2004
DIMES data available for general use • Monthly data files currently available from 1/2007 • Weekly data files available by web request from 9/2004 • Data sets • AS nodes • AS edges • Routers • City Edges • POPs (tested, but not released yet)
DIMES documentation
• File entries are explained; otherwise, caveat emptor.
Traceroute is more than a piece of string
• A flood of feigned suicide packets (with TTL values t = 1 to about 30 hops), each sent more than once
• Ideal situation: each packet dies at step t, and the router returns an echo message, "so sorry, your packet died at IP address I, time T"
• Non-ideal situations must be filtered to avoid data corruption:
• Errors – router inserts the destination address for I
• Non-response is common
• Multiple interfaces for a single (complex) router
• Route flaps and load balancing create false links
• Route instabilities can be reduced with careful header management (requires guessing router tricks)
• Resulting links must be resolved – to ASes, to routers, to POPs
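As a concrete illustration of the probing mechanism above, here is a minimal traceroute-style sketch in Python. It is not the DIMES agent or Jacobson's implementation, just one common way to send TTL-limited UDP packets and read the ICMP "time exceeded" replies; the destination, port, hop limit, and timeout are arbitrary illustrative choices, and the raw ICMP socket generally requires root privileges.

```python
# Minimal traceroute-style probe: send UDP packets with increasing TTL and
# collect the ICMP "time exceeded" replies from the routers where they die.
# Illustrative parameters only; the raw ICMP socket needs root privileges.
import socket

def trace(dest_name, max_hops=30, port=33434, timeout=2.0):
    dest_addr = socket.gethostbyname(dest_name)
    for ttl in range(1, max_hops + 1):
        recv = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
        send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        recv.settimeout(timeout)
        send.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)  # packet "dies" after ttl hops
        send.sendto(b"", (dest_addr, port))
        try:
            _, (hop_addr, _) = recv.recvfrom(512)   # router that reported the death
        except socket.timeout:
            hop_addr = "*"                          # non-response is common
        finally:
            send.close()
            recv.close()
        print(ttl, hop_addr)
        if hop_addr == dest_addr:                   # reached the destination
            break

if __name__ == "__main__":
    trace("example.com")
```

In practice each probe is sent more than once and the replies are filtered, as the slide notes, to handle non-responding hops, multiple router interfaces, and load-balanced paths.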
Models of the Internet are highly contentious
• Practitioner preferences – start with points in 2D
• Impose physical constraints of actual routers
• Finite number of connections
• Low connectivity/high bandwidth (core)
• High connectivity/low bandwidth (edge)
• Introduce randomness through a distance-dependent probability of interconnection
• Reorganize the net locally in ways thought to reflect engineering practice
• Ignore the existence of extended entities (large ASes)
• Results in strong resistance to scale-free models
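The construction just described resembles a Waxman-style geometric random graph. The sketch below is only an illustration of that flavor of model, not any particular practitioner's generator: the node count, connection-probability parameters, and per-node degree cap are made-up values.

```python
# Sketch of a practitioner-style geometric model: points in 2D, a
# distance-dependent connection probability, and a cap on the number of
# connections per node. All parameter values are illustrative.
import math
import random

import networkx as nx

def geometric_internet_model(n=200, beta=0.5, d0=0.15, max_degree=8, seed=1):
    rng = random.Random(seed)
    pos = {i: (rng.random(), rng.random()) for i in range(n)}
    G = nx.Graph()
    G.add_nodes_from(pos)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(pos[i], pos[j])
            # nearby nodes are much more likely to be connected
            if rng.random() < beta * math.exp(-d / d0):
                # crude physical constraint: a finite number of connections per router
                if G.degree(i) < max_degree and G.degree(j) < max_degree:
                    G.add_edge(i, j)
    return G, pos

model_graph, positions = geometric_internet_model()
print(model_graph.number_of_nodes(), "nodes,", model_graph.number_of_edges(), "edges")
```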
Use a new analytical tool – k-pruning
• Prune by grouping sites into "shells" with a common connectivity further into the Internet:
• All sites with connectivity 1 are removed (recursively) and placed in the "1-shell," leaving a "2-core"; removing the 2-shell then leaves the 3-core, and so forth.
• The union of shells 1 through k is called the "k-crust."
• At some point, kmax, pruning runs to completion. We identify the nucleus as the kmax-core.
• This is a natural, robust definition, and should apply to other large networks of interest in economics and biology.
• Cluster analysis finds interesting structure in the k-crusts.
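A minimal sketch of the k-pruning decomposition, assuming the measured AS graph is available as a networkx graph (a random toy graph stands in for it here); networkx's core_number performs exactly this recursive peeling, and the shell, crust, and nucleus terms follow the definitions on this slide.

```python
# k-pruning via core numbers: k-shells, k-crusts, and the kmax-core nucleus.
import networkx as nx

def meduza_decomposition(G):
    core = nx.core_number(G)                  # recursive pruning, one number per node
    k_max = max(core.values())
    shells = {}                               # k-shell: nodes removed at step k
    for node, k in core.items():
        shells.setdefault(k, set()).add(node)
    # k-crust: the union of shells 1..k
    crusts = {k: set().union(*(shells.get(j, set()) for j in range(1, k + 1)))
              for k in range(1, k_max)}
    nucleus = shells[k_max]                   # the kmax-core
    return shells, crusts, nucleus, k_max

# Toy stand-in for the measured AS graph:
G = nx.gnm_random_graph(1000, 3000, seed=0)
shells, crusts, nucleus, k_max = meduza_decomposition(G)
print("k_max =", k_max, "nucleus size =", len(nucleus))
```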
K-crusts show a percolation threshold
• These are the hanging tentacles of our (Red Sea) jellyfish
• For subsequent analysis, we distinguish three components: Core, Connected, and Isolated
• [Plot: size of the largest cluster in each shell; data from 01.04.2005]
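A hedged sketch of the component bookkeeping behind that plot, reusing meduza_decomposition from the previous sketch: for each crust we record the nucleus size, the size of the crust's largest connected cluster ("Connected"), and the remaining crust nodes ("Isolated").

```python
# For each k-crust, split nodes into the slide's three components:
# Core (the nucleus), Connected (largest cluster of the crust), and
# Isolated (crust nodes outside that largest cluster).
import networkx as nx

def crust_components(G, crusts, nucleus):
    sizes = {}
    for k, crust_nodes in sorted(crusts.items()):
        crust = G.subgraph(crust_nodes)
        if crust.number_of_nodes() == 0:
            continue
        largest = max(nx.connected_components(crust), key=len)
        sizes[k] = {
            "core": len(nucleus),
            "connected": len(largest),
            "isolated": crust.number_of_nodes() - len(largest),
        }
    return sizes

# The jump in "connected" as k grows marks the percolation transition.
for k, s in crust_components(G, crusts, nucleus).items():
    print(k, s)
```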
Meduza (מדוזה) model
• This picture has been stable from January 2005 (kmax = 30) to the present day, with little change in the nucleus composition.
• The precise definition of the tendrils: those sites and clusters isolated from the largest cluster in all the crusts – they connect only through the core.
What about the error bars, the bias, etc.?
• We need to address the specifics of the "network discoveries":
• How frequently observed?
• How sensitive are the observations to the number of observers?
• How do the measurements depend on the time of observation?
• The extensive literature on the subject consists mostly of straw-man counterexamples, which show that bias from this class of observation can be serious in graphs of known structure, but which do not address how to estimate structure from actual measurements.
Filtering the masses of data
Current efforts (me, Weinsberg, Carmi) are studying how the Meduza model and other observations are affected by removal of the less-reliable data:
• Infrequently seen links – less than three days' presence in a week
• Some things seen only once
• Stuff seen by rogue agents – is it intentional? Probably not.
So far all the basic observations are proving robust.
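A hedged sketch of the frequency filter. The weekly-observation layout and column names below are hypothetical stand-ins, not the actual DIMES schema; only the rule itself (drop links seen on fewer than three days in a week) comes from the slide.

```python
# Drop AS-AS links observed on fewer than min_days distinct days in a week.
# The input layout (columns: src_as, dst_as, day) is a hypothetical stand-in
# for the real DIMES weekly files.
import pandas as pd

def filter_infrequent_links(observations: pd.DataFrame, min_days: int = 3) -> pd.DataFrame:
    days_seen = (observations
                 .groupby(["src_as", "dst_as"])["day"]
                 .nunique()
                 .rename("days_seen")
                 .reset_index())
    reliable = days_seen[days_seen["days_seen"] >= min_days]
    return reliable[["src_as", "dst_as"]]

obs = pd.DataFrame({
    "src_as": [1, 1, 1, 2, 2],
    "dst_as": [2, 2, 2, 3, 3],
    "day":    [1, 2, 3, 1, 1],
})
print(filter_infrequent_links(obs))   # keeps the 1-2 link (3 days), drops 2-3 (1 day)
```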
How does the city data differ from the AS-graph information?
• Cities are local; ASes may be highly extended (ATT, Level 3, Global Xing, Google)
• About 4000 cities identified, cf. 25,000 ASes
• But similar features are seen:
• Wide spread of small-k shells
• Distinct nucleus with high path redundancy
• Many central sites participate in the nucleus
• A less pronounced Meduza structure
Is BGP routing wasting capacity?
• The peer-connected component (PCC) is capable of long-ranged communications as well as local ones.
• We've used "betweenness" to test alternate routings which ignore "Tier One" links.
• Betweenness is essentially a traffic model: each node in a set sends one packet to each other node in the set (for example, all 1- and 2-shell nodes).
• Compare the maximum betweenness with and without the nucleus ASes.
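A minimal sketch of that comparison, reusing G, shells, and nucleus from the earlier decomposition sketch; networkx's betweenness_centrality_subset restricts the traffic model to a chosen source/target set, and removing the nucleus nodes stands in for routings that avoid the Tier One core.

```python
# Compare maximum betweenness for traffic among low-shell nodes, with and
# without the nucleus ASes (reuses G, shells, nucleus from earlier sketches).
import networkx as nx

def max_betweenness(graph, endpoints):
    endpoints = [n for n in endpoints if n in graph]
    bc = nx.betweenness_centrality_subset(graph, sources=endpoints,
                                          targets=endpoints, normalized=False)
    return max(bc.values())

# Traffic set: all nodes in the 1- and 2-shells, as in the slide's example.
traffic_set = shells.get(1, set()) | shells.get(2, set())

with_nucleus = max_betweenness(G, traffic_set)

G_no_nucleus = G.copy()
G_no_nucleus.remove_nodes_from(nucleus)       # routings that ignore the core
without_nucleus = max_betweenness(G_no_nucleus, traffic_set)

print("max betweenness with the nucleus:   ", with_nucleus)
print("max betweenness without the nucleus:", without_nucleus)
```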
Conclusions – will the Internet use this information? • Undisclosed transverse capacity in peering links can provide a global backup or reserve • Unless business relationships evolve, it is not adequate to also carry long-distance traffic • One hop (in AS-graph) of extra disclosure will probably suffice to make this viable for regional traffic.