Modelling Global Internet Dynamics
Dr. Robert Baker, Troy Mackay, Brett Carson, Dr. Rajanathan Rajaratnam (University of New England)
Professor Les Cottrell (Stanford University)
[Figures: The Internet (Cheswick, 1999); A Planned Shopping Centre]
Space-Time Convergence
• This convergence, connecting origin-destination pairs, is defined by the rate of time discounting (and distance minimisation), and its rate is a function of the technology of transfer.
• The space-time convergence means that, at least theoretically, the mathematical operators can be projected beyond this interaction to larger distance scales and smaller time scales.
• It suggests that the trip operators are the same for the Internet as for a shopping centre.
• As we approach the singularity (for Internet trips), special features emerge, such as ‘virtual distance’, ‘virtual trips’ and ‘time reversal’.
The Stanford Internet Experiments
• The Stanford experiments were undertaken by Professor Les Cottrell at the Stanford Linear Accelerator Center (SLAC), Stanford, USA.
• The experiments ran from 1998 to 2004 with varying numbers of monitoring sites and remote hosts. The year 2000 had the greatest connectivity between monitoring sites and remote hosts and so presents the best opportunity to test the model.
• In 2000 the experiment featured 27 global monitoring sites, each pinging 170 remote hosts every hour. It measures the time taken between these origin-destination pairs and also the number of packets dropped because of congestion on the route.
Definitions
• Latency
  • Latency is a synonym for delay and measures how much time it takes for a packet of data to get from one designated point to another in a network.
  • Propagation: constrained by the speed of light.
  • Transmission: the medium and the size of the packet can introduce delay.
  • Router and other processing: each hub takes time to examine the packet.
  • Internal connectivity: delay within networks from intermediate devices.
  • Latency and latitude/longitude co-ordinates will be the time-space variables.
• Packet Loss
  • When too many packets arrive on an origin-destination trip, routers hold them in buffers until the traffic decreases. When the buffer fills up during times of congestion, the router drops packets. This best-effort behaviour is part of the ‘Internet Protocol’ (IP).
  • Packet loss is what is being measured here, as a proxy for peak demand.
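The propagation component above sets a floor on latency that depends only on distance. A minimal sketch of that floor, assuming great-circle routing and a signal speed in optical fibre of about two-thirds of the vacuum speed of light (both are simplifying assumptions, not part of the slides; the function names are hypothetical):

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean Earth radius
C_VACUUM_KM_PER_MS = 299.792458   # speed of light in vacuum, km per millisecond
FIBRE_FACTOR = 2.0 / 3.0          # ASSUMPTION: signal speed in fibre is about 2/3 c


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(distance_km, speed_factor=FIBRE_FACTOR):
    """Lower bound on ping time: there and back at the assumed signal speed."""
    return 2 * distance_km / (C_VACUUM_KM_PER_MS * speed_factor)


if __name__ == "__main__":
    # Approximate coordinates, for illustration only.
    d = great_circle_km(37.4, -122.2, 46.2, 6.1)
    print(f"distance {d:.0f} km, minimum round trip {min_round_trip_ms(d):.1f} ms")
```

Any measured ping time above this bound is attributable to the other components (transmission, router processing, internal connectivity) or to an indirect route.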
General Space-Time Trip Differential Equation [equation missing]
A Classification of Relevant Trip Equations using the Space-Time Operator Matrix [matrix missing]
Trips to Supermarkets or Planned Shopping Centres (Time Discounting Behaviour)
Does this type of model apply to Internet traffic?
Does this type of model apply to Internet traffic?
Time-based Random Walk
Each monitoring site serves a number of remote hosts at a particular locality, and there are n_i remote hosts linked to monitoring site i. Assume that each of these remote hosts can hop to adjacent sites with a frequency Γ that does not depend on the characteristics of i. These hops can access sites forward in time or backwards in time. It is assumed that movements forwards and backwards are equally likely.
Assumptions
• The jump frequency of transactions between sites is constant and is assumed independent of the site index i and its location in space.
• This frequency of movement does not depend on the distribution of remote hosts or users in the neighbourhood of the i-th site.
• The time distance between sites and the type of transfer network do not influence the process; the only thing that is important is the time-based ordering of the points.
• The time distance between sites is very much smaller than the smallest significant wavelength, and the amplitude A of the demand wave (A sin kt) must be insignificant outside k_max.
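The random-walk assumptions above can be illustrated with a small simulation. This is a sketch only: it uses a discrete-time approximation in which, during each small interval dt, a walker hops one site forward with probability Γ·dt and one site backward with the same probability. The function names and parameter values are hypothetical. Under these assumptions the variance of the displacement grows linearly, as 2Γt, which is the diffusive behaviour that leads to a Gaussian distribution.

```python
import random


def simulate_walk(gamma, total_time, dt, rng):
    """Return the final site offset of one walker.

    In each interval dt the walker hops forward with probability gamma*dt,
    backward with the same probability, and otherwise stays put
    (this requires 2*gamma*dt <= 1).
    """
    p = gamma * dt
    position = 0
    for _ in range(int(round(total_time / dt))):
        u = rng.random()
        if u < p:
            position += 1
        elif u < 2 * p:
            position -= 1
    return position


def displacement_variance(gamma, total_time, dt=0.1, walkers=2000, seed=0):
    """Sample variance of the final offsets over many independent walkers."""
    rng = random.Random(seed)
    finals = [simulate_walk(gamma, total_time, dt, rng) for _ in range(walkers)]
    mean = sum(finals) / walkers
    return sum((x - mean) ** 2 for x in finals) / walkers


if __name__ == "__main__":
    gamma, total_time = 0.5, 20.0
    print("simulated variance:", displacement_variance(gamma, total_time))
    print("2 * gamma * t     :", 2 * gamma * total_time)
```

Because the hop frequency is the same at every site and in both directions, the only quantity that matters is the ordering of the sites, which is the point of the third assumption.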
The Internet Demand Wave: Data Reduction Method (Time)
1. We take the raw hourly packet loss data for a given period and average it for each hour of the week.
2. This is plotted as the average week (per hour) for each monitoring host/remote host pair.
3. The data begin at Monday 00:00 local time of the remote host and are truncated to extract the first five days (Monday to Friday, 00:00-24:00 local time). An inverse temporal translation is made back to GMT. We can then view the graph of the Internet demand wave, for example for vicky.stanford.edu.
4. Weekends tend to have significantly less congestion, so extracting weekdays gives a cleaner Fourier spectrum.
5. We apply a Discrete Fourier Transform to the data for each host pair, for example vicky.stanford.edu-pinglafex.cbpf.br.
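The five reduction steps can be sketched as follows. This is an illustrative outline rather than the authors' code: it assumes the input series already starts at Monday 00:00 local time, marks missing samples with `None` (a hypothetical convention), omits the translation back to GMT, and uses a direct DFT from the standard library. With 120 weekday hours, the 24-hour component is frequency index 5.

```python
import cmath
import math

HOURS_PER_WEEK = 168
WEEKDAY_HOURS = 120  # Monday 00:00 to Friday 24:00


def average_week(hourly_loss):
    """Steps 1-2: average raw hourly packet loss into one mean week."""
    sums = [0.0] * HOURS_PER_WEEK
    counts = [0] * HOURS_PER_WEEK
    for hour, value in enumerate(hourly_loss):
        if value is None:  # missing sample (assumed convention)
            continue
        sums[hour % HOURS_PER_WEEK] += value
        counts[hour % HOURS_PER_WEEK] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def dft(samples):
    """Step 5: direct O(n^2) discrete Fourier transform."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def daily_component(hourly_loss):
    """Return (amplitude, phase in radians) of the 24-hour demand wave."""
    weekdays = average_week(hourly_loss)[:WEEKDAY_HOURS]  # steps 3-4
    spectrum = dft(weekdays)
    k = WEEKDAY_HOURS // 24  # 5 cycles in 120 hours
    amplitude = 2 * abs(spectrum[k]) / WEEKDAY_HOURS
    return amplitude, cmath.phase(spectrum[k])


if __name__ == "__main__":
    # Synthetic four-week series with a known daily wave.
    series = [3 + 1.5 * math.cos(2 * math.pi * t / 24 - 1.0) for t in range(4 * 168)]
    print(daily_component(series))
```

The phase returned for each host pair is the quantity that would be regressed against remote-host longitude.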
Phase vs Longitude: Linear Least Squares Regression
Let θ and φ be longitude and phase respectively, and let (θ_i, φ_i) be the longitude and phase of the i-th data point. Let φ = mθ + c be the linear least squares regression of the n data points. Since longitude and phase are both periodic, we must have m = 0 or m = 1 to satisfy the boundary conditions of continuity over the 24 hour boundary.
• The case m = 0 corresponds to local congestion dominated data, where the packet loss distribution is not strongly dependent on remote host longitude and is most probably due to local effects only.
• The case m = 1 corresponds to remote congestion dominated data, where the packet loss distribution is correlated with the remote host longitude.
We wish to find the slope and intercept of the regression line such as to minimize the normalized sum of the squares of the residuals. We have: [equations missing]. Critical points can occur at boundaries, discontinuities or local extrema: [equations missing]. Thus we are able to minimize the normalized sums and find a suitable slope and intercept by comparing the values at the possible critical points.
Scaling to the Earth’s Rotation: Global Periodicity
We also wish to scale the minimized sums of squares so as to produce a useful statistic for comparison. To this end we multiply by a scale factor so that they take values on the interval [0,1]. The maximum sum of squares of angular residuals occurs when the data points are uniformly distributed in the direction perpendicular to the regression line, so that the residuals are uniformly distributed on the interval [interval missing]. Thus [equation missing]. Therefore, define the statistic [equation missing], which will define global (and local) periodicity.
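The regression and scaling steps can be sketched in code, with the caveat that the slide equations did not survive, so every specific here is an assumption: residuals are wrapped to [-π, π), the slope is restricted to 0 (local) or 1 (global), the intercept is found by a simple grid search instead of the critical-point comparison described above, and the scale factor is 3/π². That factor follows from the mean square of a uniform distribution on [-π, π) being π²/3. The function names are hypothetical, and the sign convention linking phase to longitude may differ from the authors'.

```python
import math

TWO_PI = 2 * math.pi


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi


def scaled_residual(longitudes_deg, phases_rad, slope, steps=3600):
    """Scaled minimum mean squared angular residual for a fixed slope.

    Returns a value in [0, 1]: 0 for a perfect fit, about 1 when the
    residuals are spread uniformly around the circle.
    """
    thetas = [math.radians(lon) for lon in longitudes_deg]
    n = len(thetas)
    best = float("inf")
    for k in range(steps):  # grid search over the intercept c
        c = -math.pi + k * TWO_PI / steps
        mean_sq = sum(
            wrap(p - slope * t - c) ** 2 for t, p in zip(thetas, phases_rad)
        ) / n
        best = min(best, mean_sq)
    return best * 3 / math.pi ** 2


if __name__ == "__main__":
    # Synthetic hosts every 15 degrees whose phase tracks longitude exactly.
    lons = [-180 + 15 * i for i in range(24)]
    phases = [wrap(math.radians(lon) + 0.5) for lon in lons]
    print("slope 1 (global):", scaled_residual(lons, phases, 1))
    print("slope 0 (local): ", scaled_residual(lons, phases, 0))
```

A site would then be classed as globally periodic when the slope-1 fit leaves a small scaled residual, and as locally periodic when the slope-0 fit does.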
Table for Global and Local Periodicity for Internet Traffic 2000
Case Studies (2000)
• vicky.stanford.edu (West USA)
• hepnrc.hep.net (East USA)
• sunstats.cern.ch (Switzerland)
• icfamon.dl.ac.uk (UK)
• yumj2.kek.jp (Japan)
hepnrc.hep.net (East USA) [Figures: Internet Demand Wave (2000); Global/Local Periodicity Regression Plot (2000) for 5% Periodicity]
sunstats.cern.ch (Switzerland) [Figures: Internet Demand Wave (2000); Global/Local Periodicity Regression Plot (2000) for 5% Periodicity]
icfamon.dl.ac.uk (UK) [Figures: Internet Demand Wave (2000); Global/Local Periodicity Regression Plot (2000) for 5% Periodicity]
yumj2.kek.jp (Japan) [Figures: Internet Demand Wave (2000); Global/Local Periodicity Regression Plot (2000) for 5% Periodicity]
(2) Time Gaussian Behaviour
What is the relation between distance and ping time latencies? Is Internet traffic normally distributed?
Spatial and Time Partitioning (same as Padmanabhan and Subramanian (2001), Microsoft)
Ping Times: 5-15ms; 16-25ms; 26-35ms; …
Distance Units: Concentric Aggregation, 75km; 150km; 225km; …
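The partitioning above amounts to binning each (distance, ping time) observation into a latency band and a concentric distance ring, then accumulating frequencies outward. A minimal sketch, treating the bands as continuous intervals (5-15], (15-25], … and the rings as (0-75], (75-150], …, which is one reading of the slide; the function names and sample data are hypothetical:

```python
import math
from collections import defaultdict


def latency_band(ping_ms, first_low=5.0, width=10.0):
    """Return (low, high) of the band containing ping_ms, or None if below the first band."""
    if ping_ms < first_low:
        return None
    index = max(0, math.ceil((ping_ms - first_low) / width) - 1)
    low = first_low + index * width
    return (low, low + width)


def distance_ring(distance_km, width=75.0):
    """Return the outer radius (km) of the concentric ring containing distance_km."""
    return width * max(1, math.ceil(distance_km / width))


def cumulative_frequency(samples, band):
    """Cumulative share of observations in one latency band, by distance ring.

    samples is an iterable of (distance_km, ping_ms) pairs.
    """
    counts = defaultdict(int)
    for distance_km, ping_ms in samples:
        if latency_band(ping_ms) == band:
            counts[distance_ring(distance_km)] += 1
    total = sum(counts.values())
    running, result = 0, []
    for ring in sorted(counts):
        running += counts[ring]
        result.append((ring, running / total))
    return result


if __name__ == "__main__":
    samples = [(10, 7), (80, 12), (100, 14), (300, 20)]  # made-up observations
    print(cumulative_frequency(samples, (5.0, 15.0)))
```

Plotting these cumulative shares against ring distance, one curve per latency band, gives the kind of chart used for the case-study sites.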
(a) The cumulative probability for a gravity-type distribution for the distance between client and proxy for America-Online (Source: Padmanabhan and Subramanian, 2001)
(b) The cumulative probability for a gravity-type distribution for a regional shopping mall (Bankstown Square, 1998 afternoon distribution; Baker 2000)
(c) The results of a probe machine at Seattle, USA, measuring transaction delay in four categories (5-15ms; 25-35ms; 45-55ms; 65-75ms) relative to geographic distance (Source: Padmanabhan and Subramanian, 2001; Baker 2001)
A Time Gaussian is a Solution of the Time Discounting Differential Equation
Key relationship: [left-hand side missing] = 2MΔx.
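The differential equation itself did not survive on this slide, so the following is only a consistency check under an assumed form: a diffusion-type equation with a first derivative in space and a second derivative in time, with coefficient M. Under that assumption a Gaussian in time is a solution, and its variance is 2Mx, which has the same shape as the key relationship quoted above.

```latex
% ASSUMED form of the time discounting equation (not recovered from the slide):
\frac{\partial \phi}{\partial x} = M \frac{\partial^{2} \phi}{\partial t^{2}}

% Candidate solution: a Gaussian in time whose spread grows with distance x
\phi(t, x) = \frac{1}{\sqrt{4 \pi M x}} \exp\!\left( -\frac{t^{2}}{4 M x} \right)

% Check by direct differentiation:
\frac{\partial \phi}{\partial x}
  = \phi \left( \frac{t^{2}}{4 M x^{2}} - \frac{1}{2x} \right),
\qquad
M \frac{\partial^{2} \phi}{\partial t^{2}}
  = M \phi \left( \frac{t^{2}}{4 M^{2} x^{2}} - \frac{1}{2 M x} \right)
  = \phi \left( \frac{t^{2}}{4 M x^{2}} - \frac{1}{2x} \right)

% The two sides agree, and the variance of the Gaussian in t is
\sigma_{t}^{2} = 2 M x
```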
vicky.stanford.edu (West USA) [Figures: Cumulative Frequency of Latency Bands and Distance Mid-points (2000); Testing the Relationship … = 2Mx (1998-2003)]
hepnrc.hep.net (East USA) [Figures: Cumulative Frequency of Latency Bands and Distance Mid-points (2000); Testing the Relationship … = 2Mx (1998-2003)]
sunstats.cern.ch (Switzerland) [Figures: Cumulative Frequency of Latency Bands and Distance Mid-points (2000); Testing the Relationship … = 2Mx (1998-2003)]
icfamon.dl.ac.uk (UK) [Figures: Cumulative Frequency of Latency Bands and Distance Mid-points (2000); Testing the Relationship … = 2Mx (1998-2003)]
(3) Distance Decay
The distance decay metric is a corollary of a time Gaussian. For example: hepnrc.hep.net (East USA)
[Figures: (a) Log-linear Gravity Model; (b) 3-D Contour Model Showing Gaussian Distribution; (c) 2-D Density Plot Showing Gaussian Distribution]
Space-Time Convergence
• This convergence, connecting origin-destination pairs, is defined by the rate of time discounting (and distance minimisation), and its rate is a function of the technology of transfer.
• The space-time convergence means that, at least theoretically, the mathematical operators can be projected beyond this interaction to larger distance scales and smaller time scales.
• It suggests that the trip operators are the same for the Internet as for a shopping centre.
• As we approach the singularity (for Internet trips), special features emerge, such as ‘virtual distance’, ‘virtual trips’ and ‘time reversal’.
Finite Difference Form
A continuous distribution can also be ‘sampled’, where we can work backwards and derive the ‘finite difference’ form, which can be solved numerically. To this end, introduce a constant space-time rectangular grid for the independent variables (t, x) by choosing points x_n = nΔx and t_i = iΔt for integers n and i. This grid system is shown below, and its spacing is arbitrarily determined by Δx and Δt. This system could represent the sampling mesh constructed to provide data for space-time distributions in the space-time convergence.
Time and Space Estimation
The time derivative is estimated by taking a Taylor expansion around the point t_i. Taking the differences yields the central difference system. The central second difference is stated as: [equation missing]
Similarly, the space derivative around a point x_n is estimated from data forward over space from the revolution of the Earth (the Euler forward scheme): [equation missing]
Re-arranging the terms yields the finite difference equation equivalent to the supermarket differential equation: [equation missing]
where the modulus represents the ratio of space to time mesh (Ghez, 1988) and is defined by: [equation missing]
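Since the slide equations were lost, here is a hedged reconstruction of what the scheme would look like under one assumption: that the supermarket equation has a first derivative in space and a second derivative in time with coefficient M. The difference quotients are the standard central and forward forms; the symbol r for the modulus and the index convention (subscript i for time, superscript n for space) are notational choices made here, not recovered from the source.

```latex
% Notation (assumed): \phi_i^n = \phi(t_i, x_n), with t_i = i\,\Delta t and x_n = n\,\Delta x

% Central second difference in time (from Taylor expansions about t_i):
\frac{\partial^{2} \phi}{\partial t^{2}} \bigg|_{i}^{n}
  \approx \frac{\phi_{i+1}^{n} - 2\phi_{i}^{n} + \phi_{i-1}^{n}}{\Delta t^{2}}

% Euler forward difference in space (about x_n):
\frac{\partial \phi}{\partial x} \bigg|_{i}^{n}
  \approx \frac{\phi_{i}^{n+1} - \phi_{i}^{n}}{\Delta x}

% Substituting into the ASSUMED equation
%   \partial\phi/\partial x = M\, \partial^2\phi/\partial t^2
% and re-arranging gives the explicit update
\phi_{i}^{n+1} = r\,\phi_{i+1}^{n} + (1 - 2r)\,\phi_{i}^{n} + r\,\phi_{i-1}^{n},
\qquad
r = \frac{M\,\Delta x}{\Delta t^{2}}
```

With this form, all three coefficients are non-negative exactly when 0 ≤ r ≤ 1/2, that is, when Δt² ≥ 2MΔx.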
The trip to the destination (the n+1 site) requires convergence without any oscillations, and the finite difference trip back to the origin must be stable. The finite difference equation cannot have oscillatory solutions, and this is guaranteed if all the coefficients have the same sign. The modulus of the space-time grid for the data collection is positive, like M, so the coefficients of the finite difference equation must be positive. Therefore, the modulus must obey an inequality bounded by 0 and 1 [equation missing], and the trip from the destination to the residence is restricted by: [equation missing], or, for the gravity coefficient: [equation missing]
This is the gravity inequality for spatial interaction modelling for one time zone, and it applies to distance minimisation strategies. There is a Gaussian inequality derived similarly for time minimisation strategies.
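The same-sign argument can be checked numerically. The sketch below marches a standard explicit three-point scheme, φ_new[i] = r·φ[i−1] + (1 − 2r)·φ[i] + r·φ[i+1], forward from a single spike. It is an assumption that the authors' scheme has exactly this form and that r plays the role of their modulus; the grid size and step count are arbitrary. For r ≤ 1/2 every coefficient is non-negative and the profile stays smooth and bounded; for r > 1/2 the middle coefficient turns negative and the solution oscillates and grows.

```python
def step(phi, r):
    """One forward step of the explicit scheme; boundary values are held at zero."""
    new = [0.0] * len(phi)
    for i in range(1, len(phi) - 1):
        new[i] = r * phi[i - 1] + (1 - 2 * r) * phi[i] + r * phi[i + 1]
    return new


def march(r, n_points=101, n_steps=50):
    """Start from a unit spike at the centre and take n_steps forward steps."""
    phi = [0.0] * n_points
    phi[n_points // 2] = 1.0
    for _ in range(n_steps):
        phi = step(phi, r)
    return phi


if __name__ == "__main__":
    for r in (0.4, 0.6):
        profile = march(r)
        print(f"r = {r}: min = {min(profile):.3g}, max = {max(profile):.3g}")
```

Negative values in the r = 0.6 run are the oscillations that the inequality is meant to rule out.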
Is there evidence for this inequality in the Internet Experiments?
Is there evidence for a gravity inequality? hepnrc.hep.net (East USA)
The distance decay for the 5-15ms ping times (the ping times of least congestion) is a negative exponential function with an R-squared value of 0.73 and a β value of 0.015. The frequency for this distribution is calculated at 0.208, which corresponds to a more localised spatial interaction (less than 350km).
For the 15-25ms latency, the log-linear regression still showed a significant line of best fit, with an R-squared value of 0.53 and a β value of 0.004, meaning that the destinations were dispersed over a wider area (less than 1000km). The corollary is a lower interaction frequency (k = 0.11).
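The β and R-squared values quoted for each latency band come from a log-linear fit of frequency against distance. A minimal sketch of such a fit, assuming the gravity model has the negative exponential form f = A·exp(−βd) and using ordinary least squares on ln f; the function name and the synthetic data are hypothetical, and the authors' exact fitting procedure is not given in the slides:

```python
import math


def fit_distance_decay(distances_km, frequencies):
    """Least-squares fit of ln(f) = ln(A) - beta * d.

    Returns (A, beta, r_squared). Frequencies must be positive.
    """
    xs = list(distances_km)
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy) if syy > 0 else 1.0
    return math.exp(intercept), -slope, r_squared


if __name__ == "__main__":
    # Synthetic, noise-free counts at ring mid-points, generated with beta = 0.015.
    distances = [37.5, 112.5, 187.5, 262.5, 337.5]
    counts = [100 * math.exp(-0.015 * d) for d in distances]
    print(fit_distance_decay(distances, counts))
```

A smaller fitted β means a flatter decay, which is why the 15-25ms band reaches destinations over a wider area than the 5-15ms band.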
Conclusion
• The space-time convergence suggests that Internet transactions should be part of spatial interaction modelling.
• Using the packet loss demand proxy from the Stanford Internet experiments, there is an Internet demand wave, and it has features similar to those found in shopping trip modelling.
• The Internet equation is defined by: [equation missing]
• This equation has two components:
  • There is a local time Gaussian component with distance decay: distance does matter!
  • There is a global drift component from the 24-hour rotation of the Earth.
• There is a statistic that can classify sites as globally or locally periodic by standardising to unity the Earth’s rotation as the slope of the regression line.
Conclusion (cont.)
• The Internet allows us to look at trip behaviour near the space-time convergence.
• The finite difference form allows for the examination of the convergence of the space-time mesh near this point.
• The result is an inequality for the convergence to be stable and the definition of a gravity inequality.
• Examination of the ping latency data from 1998-2004 for the Stanford Internet experiments suggests that the inequality for convergence exists and that there is a fundamental boundary from the speed of light in transmission.
• The space-time distributions for one monitoring site, hepnrc.hep.net (East USA), suggest that the gravity inequality is robust for this site.