The Impact and Implications of the Growth in Residential User-to-User Traffic. Kenjiro Cho, Kensuke Fukuda, Hiroshi Esaki, Akira Kato (SIGCOMM'06). Presented by Stanley Wong, Tony Wat, Spring 2007.
1. Introduction • A worldwide increase in user-to-user traffic has been observed, putting pressure on commercial backbones • There is strong concern that Internet backbone technologies cannot keep up with rapidly growing residential traffic • To ensure the continued evolution of the Internet, the effects of growing residential traffic must be understood
1. Introduction • Japan has a high penetration rate of fibre-based broadband access (exponentially increasing), while the increase in DSL is slowing down • This makes Japan a good candidate for studying their different behaviours
1. Introduction • Traffic data is technically and politically difficult to obtain from commercial ISPs because it contains sensitive information about them • Measurement methods and policies vary among ISPs, which makes comparison difficult • Seven major Japanese commercial ISPs were involved in collecting the traffic data • The goal is to learn the ratio of residential broadband traffic to other traffic, changes in traffic patterns, and regional differences among ISPs
2. Data Collection • Two data sets • Aggregated interface counters of edge routers from 7 ISPs • analysis at the macroscopic level • Sampled NetFlow data from one of the ISPs • detailed per-customer analysis
2.1 Data collection of aggregated traffic • Most ISPs collect interface counter values on their routers, usually at 2-hour resolution • A Perl script was developed and provided to the ISPs to read log files and aggregate the data by router group • This lets ISPs avoid disclosing their internal network structure or unrelated details of their traffic
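The aggregation step might look roughly like the sketch below. It is written in Python rather than Perl, and both the log format (one line per timestamp, router, inbound rate and outbound rate) and the router-to-group mapping are hypothetical; the slides do not describe the actual script or any ISP's log layout.

```python
from collections import defaultdict

# Hypothetical mapping from router name to traffic group; the real
# groupings were chosen by each ISP and are not given in the slides.
ROUTER_GROUP = {
    "edge-a": "A1",   # residential broadband customers
    "edge-b": "A1",
    "bdr-ix": "B1",   # links to the major IXes
}


def aggregate(lines, router_group=ROUTER_GROUP):
    """Sum inbound/outbound rates per (timestamp, group).

    Each input line is assumed to look like:
        "<timestamp> <router> <in_bps> <out_bps>"
    Router names never appear in the result, only group totals,
    so the internal network structure stays private.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        timestamp, router, in_bps, out_bps = line.split()
        group = router_group.get(router)
        if group is None:
            continue  # router is not part of any group of interest
        totals[(timestamp, group)][0] += float(in_bps)
        totals[(timestamp, group)][1] += float(out_bps)
    return {key: tuple(value) for key, value in totals.items()}
```

Only the per-group totals would need to leave the ISP, which is the property the slide emphasizes.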
2.1 Data collection of aggregated traffic • Month-long traffic logs were collected six times from the 7 ISPs between 2004 and 2006 • The focus is on traffic crossing ISP boundaries • Traffic is grouped into customer, domestic and international traffic
2.2 Data collection of per-customer traffic • Sampled NetFlow data from one ISP • Sampling rate of 1/2048 on all edge routers serving residential broadband customers • Week-long data sets were collected five times between 2004 and 2005 • Data include the inbound/outbound traffic volume of each customer at 1-hour resolution, with customer attributes such as line type (fibre or DSL) and customer ID • Combined with 2 geo-IP databases to analyze geographic communication patterns
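With 1-in-2048 sampling, a common way to estimate actual volume is to multiply the sampled byte counts by 2048. The sketch below illustrates that scaling; the record layout (customer ID, hour, direction, sampled bytes) is a hypothetical simplification and not the ISP's real export format.

```python
from collections import defaultdict

SAMPLING_RATE = 2048  # one in 2048 packets is sampled, as stated on the slide


def estimate_hourly_volume(records, sampling_rate=SAMPLING_RATE):
    """Estimate per-customer bytes per hour from sampled flow records.

    Each record is a hypothetical tuple:
        (customer_id, hour, direction, sampled_bytes)
    where direction is "in" or "out".
    """
    volume = defaultdict(int)
    for customer_id, hour, direction, sampled_bytes in records:
        # Scale sampled bytes by the inverse of the sampling probability.
        volume[(customer_id, hour, direction)] += sampled_bytes * sampling_rate
    return dict(volume)
```

The estimate is noisy for low-volume customers, since few of their packets are sampled at all.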
3. Analysis of Aggregated Traffic • Between Nov 2004 and Nov 2005 • RBB customer traffic (A1) grew by 26% for inbound, 46% for outbound and 37% for combined volume • The difference between inbound and outbound widened slightly in the first 6 months • Estimated ratio (A1)/(A1+A2) = 59%
3.1 Growth of Traffic • The average rates of aggregated external traffic • The total volume of external domestic traffic (B2) exceeds the volume for the 6 major IXes (B1) • International traffic accounts for 30% of total external traffic for inbound and 26% for outbound
3.1 Growth of Traffic • Relationship between total customer traffic (A) and total external traffic (B) • Assume all inbound traffic from other ISPs is destined to customers: • Inbound traffic volume for (B) should be close to outbound traffic for (A) • Outbound traffic volume for (B) should be close to inbound traffic for (A)
3.1 Growth of Traffic • Relationship between IX traffic (B1) and total input rate of 6 major IXes • Total incoming traffic of these IXes = 42% of total traffic • Total amount of residential broadband traffic in Japan in Nov 2005: 353 Gbps for inbound, 468 Gbps for outbound
3.2 Customer Traffic • Weekly profiles take the average of the same weekdays in a month • Holidays are excluded from the weekly analysis since holiday traffic patterns are closer to those of weekends
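The weekday averaging could be sketched as follows. The timestamp format and the sample values are assumptions made for illustration; the slides do not say how the authors implemented this step.

```python
from collections import defaultdict
from datetime import date, datetime


def weekly_profile(samples, holidays=frozenset()):
    """Average traffic samples by (weekday, hour), skipping holidays.

    samples:  iterable of (timestamp, rate) pairs, where timestamp is a
              string in the assumed format "YYYY-MM-DD HH:MM".
    holidays: set of datetime.date objects to exclude.
    Returns {(weekday, hour): mean_rate}, with weekday 0 = Monday.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stamp, rate in samples:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if ts.date() in holidays:
            continue  # holiday traffic resembles weekends, so leave it out
        key = (ts.weekday(), ts.hour)
        sums[key] += rate
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

A call such as `weekly_profile(samples, holidays={date(2005, 11, 23)})` would then average all remaining Mondays together, all Tuesdays together, and so on.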
3.2 Customer Traffic • For RBB customers (A1), traffic exceeds 260 Gbps in evening hours • The peak hours are from 21:00 to 23:00 • Downstream traffic is much larger than upstream • P2P applications are believed to contribute significantly to the upstream traffic • Non-RBB customer traffic (A2) is dominated by residential traffic • Office-hour traffic is visible in the daytime, but it is smaller than residential customer traffic
3.3 External Traffic • External traffic groups are used to understand the total traffic volume in the backbone network • Top graph shows traffic to and from 6 major IXes (B1) • Middle graph shows external domestic traffic (B2) • Bottom graph shows international traffic (B3)
3.3 External Traffic • For the bottom graph (international traffic), inbound traffic is much larger than outbound • The traffic pattern is clearly different from that of domestic traffic • Peak hours are still in the evening, but the outbound traffic volume is virtually flat compared to the inbound volume
3.4 Prefectural Traffic • Investigates regional differences (between metropolitan and rural areas) • Similar temporal patterns • 70% of average traffic is constant • A prefecture's traffic is roughly proportional to the population of the prefecture
4. Analysis of Per-customer Traffic • Analyzes Sampled NetFlow data from one of the ISPs • Counts the number of unique active users, identified by customer IDs • Users are classified into 2 groups: more than 2.5 GB/day and less than 2.5 GB/day • The total number of active DSL users is slightly higher than that of fiber users
4.1 Distribution of Heavy-hitters • Cumulative distribution of total traffic volume of heavy-hitters in decreasing order of volume
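A cumulative distribution of this kind can be computed as in the sketch below, which sorts users by decreasing volume and accumulates their share of total traffic. The function name and input shape are illustrative assumptions.

```python
def cumulative_share(volumes):
    """Return [(fraction_of_users, fraction_of_traffic), ...].

    Users are ordered by decreasing volume, so the first entries
    describe the heaviest hitters.
    """
    ordered = sorted(volumes, reverse=True)
    total = sum(ordered)
    if total == 0:
        return []
    shares = []
    running = 0
    for rank, volume in enumerate(ordered, start=1):
        running += volume
        shares.append((rank / len(ordered), running / total))
    return shares
```

Reading off the traffic fraction at a given user fraction gives statements of the form "the top x% of users account for y% of the volume".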
4.1 Distribution of Heavy-hitters • Cumulative distribution of daily traffic per user on a log-log scale • Total users • Fiber users • DSL users
4.1 Distribution of Heavy-hitters • The distribution is heavy-tailed but there is a knee in the slope • Top 4% of heavy-hitters use more than 2.5 GB/day (or 230 kbits/sec) among the total users • Top 10% use more than 2.5 GB/day among the fiber users • Less clear for DSL users, but a knee can be seen at around the top 2% using more than 2.5 GB/day • Outbound traffic is larger for the majority of the users on the left side of the knee • But this does not hold for heavy-hitters on the right side of the knee
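The equivalence between 2.5 GB/day and roughly 230 kbits/sec can be checked with a short conversion. The sketch assumes decimal units (1 GB = 10^9 bytes) and a sustained rate over a full 24 hours; the helper names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HEAVY_HITTER_GB_PER_DAY = 2.5  # threshold used on the slides


def gb_per_day_to_kbps(gigabytes_per_day):
    """Convert a daily volume in GB to an average rate in kbit/s."""
    bits_per_day = gigabytes_per_day * 1e9 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e3


def is_heavy_hitter(gigabytes_per_day, threshold=HEAVY_HITTER_GB_PER_DAY):
    """Apply the slides' 2.5 GB/day threshold to one user's daily volume."""
    return gigabytes_per_day > threshold
```

Under these assumptions `gb_per_day_to_kbps(2.5)` gives about 231 kbit/s, consistent with the rounded figure on the slide.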
4.1 Distribution of Heavy-hitters • Distribution of the metropolitan prefecture is closer to that of the total users • Distribution of the rural prefecture is closer to that of DSL users
4.2 Correlation of Inbound and Outbound Volume • Correlation between inbound and outbound volumes for each user, shown as log-log scatter plots • 4300 points for fiber and 5400 for DSL • The highest-density cluster is below and parallel to the unity line, where outbound volume is about 10 times larger than inbound • The slope of the cluster seems to be slightly larger than 1 • The high-volume cluster is larger in the fiber plot • Many more low-volume users in the DSL plot
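One way to quantify the slope of such a cluster is an ordinary least-squares fit in log-log space, sketched below. This is an illustration only; the slides do not say that the authors fitted a line, and the function name is hypothetical.

```python
import math


def loglog_fit(points):
    """Least-squares line through (log10 inbound, log10 outbound).

    points: iterable of (inbound_bytes, outbound_bytes) per user.
    Users with a zero volume in either direction are skipped because
    they cannot be placed on a log scale.
    Returns (slope, intercept) of log10(out) = slope * log10(in) + intercept.
    """
    logs = [(math.log10(i), math.log10(o)) for i, o in points if i > 0 and o > 0]
    n = len(logs)
    if n < 2:
        raise ValueError("need at least two points with positive volumes")
    mean_x = sum(x for x, _ in logs) / n
    mean_y = sum(y for _, y in logs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in logs)
    if sxx == 0:
        raise ValueError("all inbound volumes are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in logs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

A slope of 1 with an intercept of 1 corresponds to outbound being 10 times inbound at every volume; a slope above 1 means the ratio grows with volume.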
4.3 Temporal Behavior • Inbound and outbound volumes are almost equal for fiber traffic • Inbound is 61% larger for heavy-hitters and outbound is 166% larger for normal users • In DSL traffic, outbound volume is 83% larger for total users, only 11% larger for heavy-hitters and 179% larger for normal users • Inbound traffic of fiber heavy-hitters is much larger than outbound traffic • Fiber traffic accounts for 86% of the total inbound volume and 80% of total residential volume
4.3 Temporal Behavior • The increase in active users in the morning leads to an increase in traffic volume, but the increase in volume is smaller
4.4 Protocol and Port Usage • Port 80 (HTTP) accounts for only 9% of total traffic • TCP dynamic ports account for 83% of total traffic, but the usage of each individual port is small • Most popular P2P file-sharing software in Japan (WINNY) • It is no longer possible to use port numbers to identify applications
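A port-based breakdown like the one above could be produced along these lines. The sketch assumes that "dynamic" means any port number of 1024 or higher and that a flow is classified by the lower of its two ports; both are assumptions, as the slides do not define the categories.

```python
from collections import defaultdict

WELL_KNOWN_LIMIT = 1024  # assumed boundary between well-known and dynamic ports


def port_category(protocol, src_port, dst_port):
    """Classify a flow by the lower of its two port numbers."""
    port = min(src_port, dst_port)
    if protocol == "tcp" and port == 80:
        return "http"
    if port < WELL_KNOWN_LIMIT:
        return f"{protocol}-well-known"
    return f"{protocol}-dynamic"


def share_by_category(flows):
    """flows: iterable of (protocol, src_port, dst_port, byte_count).

    Returns {category: fraction of total bytes}.
    """
    volume = defaultdict(int)
    for protocol, src_port, dst_port, byte_count in flows:
        volume[port_category(protocol, src_port, dst_port)] += byte_count
    total = sum(volume.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in volume.items()}
```

A large dynamic-port share spread thinly over many ports is exactly the case where this kind of classification stops identifying applications.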
4.5 Geographic Traffic Matrices • Shows the traffic matrix among residential users (RBB), domestic data-centers and leased-lines (DOM) and international addresses (INTL) • 90% is domestic communication • Both ends are either domestic residential users or other domestic addresses • Language and cultural barriers • Domestic fiber users are very well connected
4.5 Geographic Traffic Matrices • Divided into heavy-hitters and normal users • Ratio of user-to-user traffic is 69% for heavy-hitters and 49% for normal users
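The user-to-user ratio can be read off a traffic matrix built from classified flows, as in the sketch below. It assumes each flow endpoint has already been labelled RBB, DOM or INTL (for example via customer lists and geo-IP lookups); that labelling step and the function names are outside what the slides describe.

```python
from collections import defaultdict

CLASSES = ("RBB", "DOM", "INTL")


def traffic_matrix(flows):
    """flows: iterable of (src_class, dst_class, byte_count).

    Returns {(src_class, dst_class): total_bytes}.
    """
    matrix = defaultdict(int)
    for src, dst, byte_count in flows:
        if src not in CLASSES or dst not in CLASSES:
            raise ValueError(f"unknown address class in flow: {src} -> {dst}")
        matrix[(src, dst)] += byte_count
    return dict(matrix)


def user_to_user_ratio(matrix):
    """Fraction of all bytes exchanged between two residential users."""
    total = sum(matrix.values())
    if total == 0:
        return 0.0
    return matrix.get(("RBB", "RBB"), 0) / total
```

Computing the matrix separately for heavy-hitters and normal users would give the two ratios being compared.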
4.5 Geographic Traffic Matrices • Users access similar destinations regardless of the user location • No increase in traffic to neighboring prefectures can be identified • A small number of peers for video
4.5 Geographic Traffic Matrices • The user-to-user group has a much larger number of peers than the user-to-domestic group • 80% at the horizontal line have fewer than 18 dominant peers • 80% have fewer than 4.7 dominant peers
4.5 Geographic Traffic Matrices • Wider range of peer numbers regardless of the traffic volume • High-volume traffic is generated not only by P2P file-sharing but also by other applications
5. Related Work • Previous studies examined the growth rate of Internet traffic, which became harder to measure after the privatization of the Internet in the mid-90s • One study shows a 100% growth rate per year for the U.S. in 2003 • Data observed in Japan show the growth rate slowing down after 2002 and stabilizing at 50% per year • Similar rates were observed in Australia and Hong Kong • Probably because broadband deployment has already reached most technically conscious users
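To put these growth rates in perspective, they can be converted to doubling times. The worked equation below assumes constant compound annual growth at rate r, which is a simplification and not a claim made in the slides.

```latex
T_{\text{double}} = \frac{\ln 2}{\ln(1 + r)}, \qquad
r = 1.00 \;\Rightarrow\; T_{\text{double}} = 1\ \text{year}, \qquad
r = 0.50 \;\Rightarrow\; T_{\text{double}} = \frac{\ln 2}{\ln 1.5} \approx 1.71\ \text{years}
```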
5. Related Work • Findings are consistent with earlier measurements of peer-to-peer traffic, in which it is dominant in commercial backbones and exhibits different behaviour from traditional web traffic • However, known port numbers can no longer be relied on to identify applications, as peer-to-peer traffic is shifting from known to arbitrary ports
5. Related Work • Previous studies reported the asymmetric nature of peer-to-peer traffic • The comparison between fiber and DSL users in this paper shows that bandwidth demands are not asymmetric • The deployment of symmetric access will change traffic patterns
6. Implications • A large skew in traffic usage was initially observed • The top 4% of heavy-hitters account for 70% of traffic; fiber users account for 80% of traffic • Per-customer measurement found that the distribution of their traffic is heavy-tailed, and that heavy-hitters are widespread and appear to be casual users rather than more dedicated users • Traffic patterns show a diverse mixture of peer-to-peer file sharing and content downloading
6. Implications • Heavy-hitters can no longer be viewed as exceptional extremes: there are too many of them, statistically distributed over a wide range of traffic volumes • It is natural to think that casual users start playing with new applications such as video downloading and peer-to-peer file-sharing, become heavy-hitters, and shift from DSL to fiber • Or users start with fiber and look for applications that use the abundant bandwidth • Their behaviour is easily affected by social, economic or political factors
6. Implications • Total traffic volume is heavily affected by heavy-hitters, so a slight change in application algorithms or charging policies will have a significant impact on backbone traffic • ISPs are tempted to avoid congestion by suppressing traffic from extreme heavy-hitters; however, users as a whole are shifting towards high-volume usage
6. Implications • Japan can be regarded as a model of widespread symmetric residential broadband access; Korea has the highest broadband penetration ratio, but the majority of its access is not fiber • Japan has fairly closed domestic traffic, partly due to language and cultural barriers and partly due to rich connectivity within the country
7. Conclusion • Widespread residential broadband access • It is essential for researchers and industry to prepare to accommodate end users' ever-changing behaviour • Protected data-sharing mechanisms were established with commercial Japanese ISPs for data collection • Residential broadband traffic accounts for 2/3 of ISP backbone traffic and is increasing at 37% per year
7. Conclusion • Investigated differences between DSL and fiber users, heavy-hitters and normal users, and geographic traffic matrices • A small segment of users dictates the overall behaviour • The distribution of heavy-hitters is heavy-tailed, without a clear boundary between heavy-hitters and the rest of the users
Our findings • Data from Hong Kong OFTA on monthly Internet traffic of broadband customers • Nov 2005: 505739 Terabits / month = 195 Gbps • Versus the paper's estimate of 353 Gbps to 468 Gbps of residential broadband traffic in Japan in Nov 2005 http://www.ofta.gov.hk/en/tele-lic/operator-licensees/opr-isp/s2.html
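The conversion from monthly volume to average rate can be reproduced as follows, assuming decimal units (1 Terabit = 10^12 bits) and a 30-day month for November.

```latex
\frac{505739 \times 10^{12}\ \text{bit}}{30 \times 86400\ \text{s}}
= \frac{5.05739 \times 10^{17}\ \text{bit}}{2.592 \times 10^{6}\ \text{s}}
\approx 1.95 \times 10^{11}\ \text{bit/s}
\approx 195\ \text{Gbps}
```

This is an average over the whole month, so peak-hour rates would be higher.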
Our findings • HKIX switching statistics • Slightly different patterns were also observed on weekdays and weekends, with larger daytime traffic at weekends • Peak hours in HK are 22:00-00:00, versus the 21:00-23:00 reported for Japan in the paper http://www.cuhk.edu.hk/hkix/stat/aggt/hkix-aggregate.html
Significance • Japan is a good place to study the behaviour of symmetric broadband access • The proportion of users with symmetric broadband access in Japan is larger than in other countries • Can no longer rely on port numbers to identify the application used • May need to identify applications by other kinds of signatures • The bandwidth demands of P2P applications and users are not asymmetric in nature • Previous studies reported that P2P traffic is asymmetric in nature • The actual demand of users is shown when they are given symmetric bandwidth
Limitations and potential improvements • Language and cultural barriers of Japan • The majority of content is in the Japanese language • 90% of communication is domestic at both ends • Other countries may exhibit different behaviour • Potential improvements • Find out the differences in traffic behaviour among countries where language and cultural differences are not that significant
Limitations and potential improvements • Bandwidth usage is application-specific • P2P applications dominate usage patterns • Social factors: different P2P applications are popular in different countries • A slight change in P2P application behaviour can affect bandwidth usage • Potential improvements • Find out in what ways a difference in application behaviour can affect bandwidth usage • And how application behaviour can be optimized so that network resources are better utilized