
Characterizing and Modeling Internet Traffic Dynamics of Cellular Devices

M. Zubair Shafiq (1), Lusheng Ji (2), Alex X. Liu (1), Jia Wang (2). (1) Michigan State University, East Lansing, MI; (2) AT&T Labs – Research, Florham Park, NJ. 6/11/2011.







  1. Characterizing and Modeling Internet Traffic Dynamics of Cellular Devices • M. Zubair Shafiq (1), Lusheng Ji (2), Alex X. Liu (1), Jia Wang (2) • (1) Michigan State University, East Lansing, MI • (2) AT&T Labs – Research, Florham Park, NJ • 6/11/2011

  2. Motivation and Objective • Explosive increase in the data traffic volume over cellular networks • Subscriber growth • Increased network capacity • Increased device diversity • Improved device capabilities • Understanding the Internet traffic dynamics of cellular devices • Study large scale traffic • Study behavior of cellular devices • Study behavior of network applications • Develop predictive models

  3. Agenda • Data • Network architecture • Data collection • Differentiating device types • Measurement • Temporal dynamics • Application usage distribution • Modeling • Aggregate model • Multi-class model • Conclusions

  4. Data

  5. Architecture Overview • Cellular network: (1) radio access network, (2) core network • Mobile device connects to the network and establishes a Packet Data Protocol (PDP) context • IP Tunnel between mobile and GGSN using GPRS Tunneling Protocol (GTP)

  6. Data Collection • Anonymized and aggregated IP traffic records from the core network (Gn links) • Data covers a state in the United States over the period of one week • Information • Traffic volume, e.g. byte, packet, flow • Application type • Device type • Refer to Erman et al., WWW, 2009 for more details about data collection

  7. Differentiating Device Types • The Type Allocation Code (TAC), the leading eight digits of the International Mobile Equipment Identity (IMEI) number, identifies the device model • GSM Association's TAC database contains the maker, model, version, and registration time of TAC numbers • Example IMEI: 01180800XXXXXXX • Make: iPhone • Model: 3G • Version: MB704LL • Year: 2008 • We study two popular smart phone families (A, B) and one wireless modem card family (W)
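The TAC lookup on this slide can be illustrated with a minimal sketch. It assumes a 15-character IMEI whose first eight digits are the TAC. The names `split_imei`, `device_info`, and `TAC_DB` are hypothetical, and `TAC_DB` is a one-entry stand-in for the GSM Association database, filled only with the example from the slide.

```python
def split_imei(imei: str) -> dict:
    """Split a 15-character IMEI into TAC, serial number, and check digit."""
    if len(imei) != 15:
        raise ValueError("expected a 15-character IMEI")
    tac = imei[:8]
    if not tac.isdigit():
        raise ValueError("the TAC must be 8 digits")
    return {"tac": tac, "serial": imei[8:14], "check": imei[14]}


# Hypothetical stand-in for the TAC database; the single entry
# is the example shown on the slide.
TAC_DB = {
    "01180800": {"make": "iPhone", "model": "3G",
                 "version": "MB704LL", "year": 2008},
}


def device_info(imei: str):
    """Return the device record for an IMEI, or None if the TAC is unknown."""
    return TAC_DB.get(split_imei(imei)["tac"])


# The serial digits are anonymized in the slide, so only the TAC is used.
info = device_info("01180800XXXXXXX")
```

Because only the TAC is needed to identify the device family, the serial portion can stay anonymized, as it is in the slide's example.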

  8. Measurement

  9. Temporal Dynamics • Interesting trends across weekdays and weekends • Smart phone B devices are favored more by business users and smart phone A devices are popular among general consumers

  10. Application Usage Distribution • Each device family has different traffic behaviors • Still, most top peaks in the volume distribution are for the same applications • [Figure: application volume distributions for W, A, and B; top peaks labeled mail, mime, and www]

  11. Diversity vs. Volume • Diversity is characterized by information entropy: higher entropy means more diversity • Wireless modem W devices tend to have the highest entropy and total volume • Entropy and total volume for smart phone A devices are higher than those of smart phone B devices • [Figure: entropy vs. total volume for W, A, and B]
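As a rough illustration of the diversity measure, the sketch below computes Shannon entropy (in bits) over a device's per-application traffic shares. The slide does not say which distribution or logarithm base was used, so both are assumptions here, and `entropy_bits` is a hypothetical name.

```python
import math


def entropy_bits(volumes):
    """Shannon entropy (bits) of the share of traffic per application."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    h = 0.0
    for v in volumes:
        if v > 0:  # applications with zero volume contribute nothing
            p = v / total
            h -= p * math.log2(p)
    return h


# Traffic spread evenly over four applications is maximally diverse,
# while traffic concentrated in one application has zero entropy.
even = entropy_bits([25, 25, 25, 25])
concentrated = entropy_bits([100, 0, 0, 0])
```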

  12. Modeling

  13. Modeling • Aggregate Model • Traffic distribution • Temporal dynamics • Multi-class Model • Incorporate differences across device types • Cluster devices by unsupervised clustering algorithm • Based on traffic distribution • Based on temporal dynamics • Develop separate models for each cluster

  14. Modeling: Aggregate

  15. Traffic distribution • Top 10% of the applications constitute about 99% of the flows • Highly skewed distribution • Candidate models: Zipf-like, Zipf with exponential cutoff, stretched exponential
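To make the skewness and the Zipf-like fit concrete, here is a minimal sketch. It assumes a plain Zipf form, count proportional to rank^(-s), fitted by least squares on the log-log rank-frequency plot. The slide does not describe the fitting procedure, so this is one common choice rather than the authors' method, and the function names are hypothetical.

```python
import math


def top_share(counts, fraction=0.10):
    """Fraction of all flows carried by the top `fraction` of applications."""
    ranked = sorted(counts, reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


def fit_zipf_exponent(counts):
    """Estimate s in count ~ rank**(-s) by a log-log least-squares line."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two nonzero counts")
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx  # the slope is -s, so negate it


# Synthetic flow counts following an exact Zipf law with s = 2.
synthetic = [1000.0 / r ** 2 for r in range(1, 101)]
s_hat = fit_zipf_exponent(synthetic)
```

The variants named on the slide (exponential cutoff, stretched exponential) add curvature to the tail on a log-log plot, which a straight-line fit like this one cannot capture.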

  16. Temporal dynamics • Model the temporal dynamics as a random process • Order of the random process? • Autocorrelation analysis:
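A minimal sketch of the autocorrelation analysis follows, assuming an hourly traffic series and the standard biased sample estimator. The synthetic sine series is a made-up stand-in for one week of diurnal traffic, not the measured data.

```python
import math


def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    if var == 0:
        raise ValueError("series is constant")
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


# One synthetic week of hourly volumes with a 24-hour cycle.
hourly = [math.sin(2 * math.pi * h / 24) for h in range(24 * 7)]
acf = autocorrelation(hourly, max_lag=48)
```

For a diurnal series the autocorrelation peaks again near a lag of 24 hours, which is the kind of evidence that informs how much history the random process needs.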

  17. Temporal dynamics • 23rd order discrete time Markov chain model • State merging (many-to-one mapping) to reduce the amount of required training data • Inaccuracies due to: • Changing device behavior • Changing device population composition
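The idea of a higher-order Markov chain with merged states can be sketched as follows. The slide does not say how states are merged, so the threshold-based quantization below is an assumption, and `quantize`, `train_markov`, and `predict` are hypothetical names. The toy example uses order 2 for brevity; the slide's model uses order 23.

```python
from collections import Counter, defaultdict


def quantize(series, edges):
    """Many-to-one mapping: merge raw volumes into len(edges) + 1 states."""
    return [sum(v >= e for e in edges) for v in series]


def train_markov(states, order):
    """Count next-state occurrences for every length-`order` history."""
    counts = defaultdict(Counter)
    for t in range(order, len(states)):
        history = tuple(states[t - order:t])
        counts[history][states[t]] += 1
    return counts


def predict(counts, history):
    """Most frequent next state after `history`, or None if never seen."""
    seen = counts.get(tuple(history))
    if not seen:
        return None
    return seen.most_common(1)[0][0]


# Toy sequence cycling low -> medium -> high traffic.
volumes = [5, 50, 500] * 10
states = quantize(volumes, edges=[10, 100])
model = train_markov(states, order=2)
next_state = predict(model, [0, 1])
```

With S states and order k there are up to S^k possible histories, which is why merging states sharply reduces the amount of training data required.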

  18. Modeling: Multiclass

  19. Multiclass: Device Clustering • Select appropriate number of clusters • Use intra-cluster distortion measure • Find the knee of the curve using a gap-statistic-based heuristic • [Figure: distortion curves for temporal and spatial features, with k=3 marked on both]
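Finding the knee of a distortion curve can be sketched with a simple geometric heuristic: pick the k whose point lies farthest from the straight line joining the first and last points of the curve. This is a stand-in for illustration only and is not the gap statistic. The function name and the distortion values are made up.

```python
def knee_point(ks, distortions):
    """Return the k whose point is farthest from the end-to-end chord."""
    if len(ks) != len(distortions) or len(ks) < 3:
        raise ValueError("need at least three (k, distortion) pairs")
    x0, y0 = ks[0], distortions[0]
    x1, y1 = ks[-1], distortions[-1]

    def offset(k, d):
        # Proportional to the perpendicular distance from the chord.
        return abs((y1 - y0) * (k - x0) - (x1 - x0) * (d - y0))

    return max(zip(ks, distortions), key=lambda kd: offset(*kd))[0]


# Hypothetical intra-cluster distortion values for k = 1..6.
ks = [1, 2, 3, 4, 5, 6]
distortions = [100, 55, 20, 17, 15, 14]
best_k = knee_point(ks, distortions)
```

Beyond the knee, adding clusters reduces distortion only marginally, so the knee is a reasonable choice for the number of clusters.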

  20. Multiclass: Clustering Results • Spatial feature clustering • 100 element tuple: average traffic volume per application • Temporal feature clustering • 24 element tuple: average traffic volume per hour
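The 24-element temporal feature can be sketched as below, assuming each record is an (hour index, bytes) pair with hours counted from the start of the trace, and that the average is taken over the days on which the device was observed. The record layout and the name `hourly_profile` are assumptions made for illustration.

```python
def hourly_profile(records):
    """Build a 24-element tuple of average traffic volume per hour of day."""
    totals = [0.0] * 24
    days = set()
    for hour_index, nbytes in records:
        totals[hour_index % 24] += nbytes  # fold onto hour of day
        days.add(hour_index // 24)         # track distinct days observed
    n_days = max(1, len(days))
    return tuple(t / n_days for t in totals)


# Two days of activity: midnight traffic on both days, 1 a.m. on day one.
profile = hourly_profile([(0, 10), (24, 30), (1, 5)])
```

The 100-element spatial feature would follow the same pattern, keyed by application instead of hour of day.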

  21. Multiclass: Models • Separate models for each of the three temporal and spatial clusters • 3 Zipf-like application distribution models • 3 Markov chain based temporal models • More accurate than the aggregate model • [Figure panels: Temporal, Spatial]

  22. Conclusions • Analyzed Internet traffic dynamics of cellular devices in a large cellular network • Findings have implications for cellular network design, troubleshooting, performance evaluation, and optimization • Device families have subtle differences • Devise separate billing schemes • Skewness in application usage • Manufacturers and software developers can focus on the smaller subset of high-volume applications • Diurnality in temporal dynamics • Differentiate between peak and non-peak hour usage

  23. Questions?

  24. References • Data collection: J. Erman, A. Gerber, M. T. Hajiaghayi, D. Pei, and O. Spatscheck. Network-aware forward caching. In WWW, 2009. • Location information: Q. Xu, A. Gerber, Z. M. Mao, and J. Pang. AccuLoc: Practical localization of performance measurement in 3G networks. In ACM MobiSys, 2011. • Clustering: R. Tibshirani, G. Walther, and T. Hastie. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63:411-423, 2001.
