Identification of Mobile Devices from Network Traffic Measurements - a HTTP User Agent Method
Master’s Thesis, August 28, 2012
Aashish Adhikari
Supervisor: Prof. Heikki Hämmäinen
Instructor: M.Sc. Antti Riikonen
Background
• Mobile device identification aids in profiling mobile Internet usage
  • Supports pricing and business development
  • Helps tailor services to attract more users
• Device identification from network measurements
  • Type Allocation Code (TAC)
  • TCP fingerprinting
  • HTTP
    • UAProf
    • User Agent string parsing
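To illustrate the TAC method listed above: the TAC is the first eight digits of a device's IMEI, so identification reduces to a prefix lookup against an allocation table. The following is a minimal sketch, not part of the thesis tool. `TAC_TABLE` and its single entry are hypothetical placeholders, and real allocations would have to come from a TAC database.

```python
# Hypothetical lookup table: TAC -> device description (placeholder entry only).
TAC_TABLE = {
    "49015420": "ExampleBrand ExampleModel (placeholder)",
}


def tac_from_imei(imei: str) -> str:
    """Return the 8-digit TAC, i.e. the leading digits of an IMEI or IMEISV."""
    digits = "".join(ch for ch in imei if ch.isdigit())
    # 14 digits: IMEI without check digit, 15: IMEI, 16: IMEISV.
    if len(digits) not in (14, 15, 16):
        raise ValueError(f"unexpected IMEI length: {len(digits)}")
    return digits[:8]


def device_from_imei(imei: str):
    """Look the TAC up in the (hypothetical) table; None if it is not listed."""
    return TAC_TABLE.get(tac_from_imei(imei))


if __name__ == "__main__":
    # IMEI-format number used purely for illustration.
    print(tac_from_imei("49-015420-323751-8"))
    print(device_from_imei("490154203237518"))
```

Unlike the User Agent approach, this requires the IMEI to be visible in the measurement data, which is an assumption about the trace format.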
User Agent-based device identification
• UA-based identification relies on idiosyncrasies of UA string formats
• Examples of UA string formats
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; NP07)
  • Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; fi-fi) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7A341 Safari/528.16
  • NokiaN70-1/3.0546.2.3 Series60/2.8 Profile/MIDP-2.0 Configuration/CLDC-1.1
  • Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip
• WURFL Device Description Repository (DDR) and Java API (parser)
  • Frequent updates by the active community
  • Uses the Two-Step UA String Analysis algorithm
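The four example strings above can be picked apart with a few regular expressions. This shows why format idiosyncrasies matter: each family puts the device model in a different place. The sketch below is a hand-written heuristic for these four examples only. It is not the WURFL Two-Step algorithm and does not use the WURFL API, and the function name and result keys are illustrative assumptions.

```python
import re


def parse_user_agent(ua: str) -> dict:
    """Rough device guess from a UA string (illustrative heuristic only)."""
    info = {"device_type": "unknown", "brand": None, "model": None, "os": None}

    # Nokia native UAs start with the model: "NokiaN70-1/3.0546.2.3 Series60/2.8 ..."
    m = re.match(r"Nokia([A-Za-z0-9]+)", ua)
    if m:
        info.update(device_type="mobile", brand="Nokia", model=m.group(1))
        if "Series60" in ua:
            info["os"] = "Symbian"
        return info

    # App-generated Android UAs: "Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip"
    m = re.match(r"Android-[\w.]+/\S+ \(([\w-]+) [^)]*\)", ua)
    if m:
        # The model code alone does not give the brand; that needs a device database.
        info.update(device_type="mobile", model=m.group(1), os="Android")
        return info

    # iOS browsers name the device family right after the opening parenthesis.
    m = re.search(r"\((iPhone|iPad|iPod)\b", ua)
    if m:
        info.update(device_type="mobile", brand="Apple", model=m.group(1), os="iOS")
        v = re.search(r"OS (\d+(?:_\d+)*) like Mac OS X", ua)
        if v:
            info["os"] = "iOS " + v.group(1).replace("_", ".")
        return info

    # Desktop Windows browsers carry a "Windows NT" token.
    if "Windows NT" in ua:
        info.update(device_type="desktop", os="Windows")
    return info


if __name__ == "__main__":
    for ua in (
        "NokiaN70-1/3.0546.2.3 Series60/2.8 Profile/MIDP-2.0 Configuration/CLDC-1.1",
        "Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip",
    ):
        print(parse_user_agent(ua))
```

A repository-backed parser replaces rules like these with a maintained device database, so new models do not require new patterns.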
Research Questions & Objectives
• R1: How can device and device features be identified based on HTTP User Agent from mobile Internet traffic traces?
• R2: How can the identification of mobile devices (and features) aid in profiling the mobile Internet usage in Finland?
• O1: Develop a tool to identify device type, model (and features) based on the HTTP request header User Agent field
• O2: Study the output of the tool and compare it with an existing tool
• O3: Provide descriptive statistics on the mobile Internet usage in Finland based on the identified devices
Measurement Setup (Adopted from Kivi & Riikonen, 2009)
• Measurement data
  • IP traffic traces from the Gi interface in the packet core networks of two Finnish mobile network operators
  • A week’s worth of data
• Parameters utilized in this thesis
  • User Agent string, total transferred bytes, and number of flows
  • Also includes String Matching results
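The three parameters named above (User Agent string, transferred bytes, number of flows) suggest a per-UA aggregation step before any identification takes place. The following is a minimal sketch, assuming each flow record has already been reduced to a `(user_agent, bytes)` pair. That record layout and the toy values are assumptions for illustration, not the thesis's trace format or data.

```python
from collections import defaultdict


def aggregate_by_user_agent(flows):
    """Sum transferred bytes and count flows for each distinct UA string."""
    totals = defaultdict(lambda: {"bytes": 0, "flows": 0})
    for user_agent, nbytes in flows:
        totals[user_agent]["bytes"] += nbytes
        totals[user_agent]["flows"] += 1
    return dict(totals)


if __name__ == "__main__":
    # Toy records, not measurement data.
    toy_flows = [
        ("Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip", 1200),
        ("Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip", 800),
        ("NokiaN70-1/3.0546.2.3 Series60/2.8", 300),
    ]
    for ua, stats in aggregate_by_user_agent(toy_flows).items():
        print(ua, stats)
```

Aggregating first means each distinct UA string is identified once, however many flows carried it.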
Analysis Process
• Datasets
  • TCP and UDP logs
  • WURFL Repository
  • Handset Feature List
• WURFL API Implementation
• Improvements to the WURFL output
  • Custom patch file
  • Custom rules
  • New releases
• String Matching results
• Features from both WURFL and the Handset Feature List
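One way to read the process above is as a fallback chain: the repository lookup runs first, and custom rules catch strings it misses. The sketch below shows that control flow only. `primary_lookup` is a stand-in callable, not the WURFL API, and the rule format (compiled pattern plus device label) is an assumption rather than the thesis's patch-file format.

```python
import re


def identify(ua, primary_lookup, custom_rules):
    """Try the primary lookup, then custom rules; report which stage matched."""
    device = primary_lookup(ua)
    if device is not None:
        return device, "primary"
    for pattern, label in custom_rules:
        if pattern.search(ua):
            return label, "custom_rule"
    return None, "unknown"


if __name__ == "__main__":
    # Stand-in for a repository lookup: an exact-match dictionary.
    known = {"NokiaN70-1/3.0546.2.3 Series60/2.8": "Nokia N70"}
    # Hypothetical custom rule for an app-generated UA family.
    rules = [(re.compile(r"^Android-\w+/\S+ \(GT-I9000 "), "GT-I9000")]

    print(identify("NokiaN70-1/3.0546.2.3 Series60/2.8", known.get, rules))
    print(identify("Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip", known.get, rules))
    print(identify("SomeApp/1.0", known.get, rules))
```

Returning the matching stage alongside the device makes it possible to count how many identifications each improvement contributes.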
Tool Output
• WURFL works well for web browser-generated UA strings
  • Identifies desktop devices
  • Only ~0.5% false positives with the dataset
• Additional programming required to extract device information from app-generated UAs
  • Enhanced WURFL tool increased identification by 14 percentage points
  • Still uncertainties with non-standard app-generated UAs
• In comparison with String Matching
  • Facilitates manipulation of output
  • Removes the issue of identifying app-generated UA strings to some extent
  • Provides not just the brand and model of the device, but a detailed list of features including the OS, OS version, and mobile browser
  • Partly removes the cumbersome task of manually updating the device database
Descriptive Results
[Figure: Share of traffic volume and flows generated by all mobile devices]
[Figure: Operating system distribution (bytes) among the handheld devices]
• Only Handset and Tablet device types considered for further analysis
• Android-based devices generate the most traffic
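The operating system distribution described above amounts to filtering identified records to the Handset and Tablet types and normalising bytes per OS. The following is a minimal sketch under assumed record keys (`device_type`, `os`, `bytes`). The numbers in the usage example are toy values chosen for easy arithmetic, not results from the thesis.

```python
from collections import Counter


def os_byte_shares(records, device_types=("Handset", "Tablet")):
    """Share of bytes per OS, restricted to the given device types."""
    totals = Counter()
    for rec in records:
        if rec["device_type"] in device_types:
            totals[rec["os"]] += rec["bytes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {os_name: nbytes / grand_total for os_name, nbytes in totals.items()}


if __name__ == "__main__":
    # Toy records, not measurement data.
    toy = [
        {"device_type": "Handset", "os": "Android", "bytes": 600},
        {"device_type": "Tablet", "os": "iOS", "bytes": 300},
        {"device_type": "Handset", "os": "Symbian", "bytes": 100},
        {"device_type": "Desktop", "os": "Windows", "bytes": 5000},  # excluded
    ]
    print(os_byte_shares(toy))
```

The same function gives flow shares if each record's `bytes` field is replaced by a flow count.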
Contd...
[Figure: Shares of browser- and app-generated bytes and flows for Handsets]
• Clear distinction between browser- and app-generated UAs for Android and iOS
• Unrealistic results for the Symbian and MeeGo OSs
• Uncertainties probably arise from limitations of the tool, or because app-generated UAs for these OSs fall under the Unknown category
Contd...
[Figure: Share of traffic volume for selected handset features]
• Error bars result from terminals that do not have the feature or for which the data were not available
• Many features are close to saturation
• Saturation level for FM radio?
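One possible reading of the error bars is as bounds. Traffic from devices known to have a feature gives a lower bound on that feature's share. Adding traffic from devices with no feature data gives an upper bound. The sketch below implements that reading only. It is an assumption about how such bars could be derived, not the thesis's stated method, and the `(has_feature, bytes)` record layout and toy numbers are hypothetical.

```python
def feature_share_bounds(devices):
    """Lower/upper bound on a feature's traffic share.

    `devices` holds (has_feature, bytes) pairs, where has_feature is
    True, False, or None when the feature data is unavailable.
    """
    total = sum(nbytes for _, nbytes in devices)
    if total == 0:
        return 0.0, 0.0
    known_yes = sum(nbytes for flag, nbytes in devices if flag is True)
    unknown = sum(nbytes for flag, nbytes in devices if flag is None)
    lower = known_yes / total                 # unknown devices all lack the feature
    upper = (known_yes + unknown) / total     # unknown devices all have the feature
    return lower, upper


if __name__ == "__main__":
    # Toy values, not measurement data.
    print(feature_share_bounds([(True, 700), (False, 100), (None, 200)]))
```

With full feature data the two bounds coincide, so the width of the bar directly reflects how much traffic comes from devices with missing data.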
Future Work
• Application identification by the enhanced WURFL tool
• Analysis of user sessions based on the device type, model, OS, and device features
• Business perspective on the current analysis
Conclusions
• Tools used for the identification of mobile devices in web servers could be used to identify devices from mobile network traffic traces as well
• An open source, community-contributed DDR (such as WURFL) and its API can be implemented reliably
• Descriptive results show
  • Android-based handheld devices gaining popularity, with Samsung the most popular brand
  • Apple iPhone* generates the most traffic among the handsets
  • Devices with advanced features, such as 3G and touchscreen, are preferred for mobile Internet
* No clear distinction between the iPhone models