COST-TMA: meeting @ Samos, September 22nd, 23rd 2008 Multi-level Application-based Traffic Characterization in a Large-scale Wireless Network Maria Papadopouli 1,2 Joint Research with Thomas Karagiannis 3 and Manolis Ploumidis 1,2 1 Department of Computer Science, University of Crete 2 Institute of Computer Science, Foundation for Research and Technology-Hellas 3 Microsoft Research *This work was partially supported by the General Secretariat for Research and Technology and by the European Commission with a Marie Curie IRG grant
Research interests • Traffic modeling • Impact of parameters (number of flows, flow inter-arrivals, flow sizes) on accuracy • Topology & mobility modeling • Traffic forecasting (moving averages, Singular Spectrum Analysis, etc) • Client profiling • Mobile p2p computing • Data diffusion using realistic mobility models • Efficient selection of appropriate network interface/channel based on network conditions/application requirements • Efficient distributed monitoring • Understanding the impact of network conditions on user experience
Roadmap • Objectives • Testbed, data acquisition & preprocessing • Data analysis • Aggregate traffic • AP traffic • Client traffic • Conclusions • Research in progress …
Objectives • Classify flows into application types • Identify dominant & popular application types • Compare UNC network with other wired & wireless networks • Characterize AP & client traffic
Testbed, data acquisition & preprocessing • Testbed • 488 APs, 382 monitored • 6,593 distinct MAC addresses – 9,125 distinct IPs • Data acquisition • Packet header traces from egress router • Client SNMP data • Data preprocessing • Correlation of packet headers with client SNMP • Classification of flows using BLINC
Classification with BLINC: heuristics • Host behavior (e.g., client-server, collaborative) • Host popularity: number of distinct destination IPs • Clusters of hosts using a collaborative application • Number of source ports • Transport layer protocol: TCP vs. UDP • Cardinality of sets (ports vs. IPs) • Per flow average packet size • Constant in several applications (e.g., malware) • “Farms” of services: neighboring IPs • Non-payload flows (e.g., attacks)
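The cardinality heuristic (ports vs. IPs) listed above can be illustrated with a minimal Python sketch. It is not the BLINC implementation: the flow tuple layout, the function names `cardinalities` and `rough_label`, the threshold `many`, and the label strings are all assumptions made for illustration.

```python
from collections import defaultdict

def cardinalities(flows):
    """Map each source IP to (#distinct destination IPs, #distinct destination ports).

    flows: iterable of (src_ip, dst_ip, dst_port) tuples (assumed layout).
    """
    dst_ips = defaultdict(set)
    dst_ports = defaultdict(set)
    for src_ip, dst_ip, dst_port in flows:
        dst_ips[src_ip].add(dst_ip)
        dst_ports[src_ip].add(dst_port)
    return {host: (len(dst_ips[host]), len(dst_ports[host])) for host in dst_ips}

def rough_label(n_ips, n_ports, many=3):
    """Very coarse reading of the two set sizes; `many` is an illustrative threshold."""
    if n_ips >= many and n_ports == 1:
        return "many IPs, one port (address-scan-like)"
    if n_ips == 1 and n_ports >= many:
        return "one IP, many ports (port-scan-like)"
    if n_ips >= many and n_ips == n_ports:
        return "IPs and ports grow together (collaborative-like)"
    return "undetermined"

# Hypothetical flows: one host probing port 445 on three IPs,
# another host probing three ports on a single IP.
flows = [
    ("10.0.0.1", "10.0.1.1", 445),
    ("10.0.0.1", "10.0.1.2", 445),
    ("10.0.0.1", "10.0.1.3", 445),
    ("10.0.0.2", "10.0.1.9", 20),
    ("10.0.0.2", "10.0.1.9", 21),
    ("10.0.0.2", "10.0.1.9", 22),
]
for host, (n_ips, n_ports) in sorted(cardinalities(flows).items()):
    print(host, n_ips, n_ports, rough_label(n_ips, n_ports))
```

A real classifier would combine this with the other heuristics on the slide (transport protocol, average packet size, host popularity) rather than rely on two counts alone.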
Popular application types Clients with at least one flow per application type
Compare with other testbeds • Traffic share for most dominant application types • Wired & wireless testbeds • UNC wired network • Dartmouth wireless infrastructure • Residential campus may have missed all Web traffic that was not accessed through one of the well-known ports for Web
Home application type of APs • Home application type of an AP: the application type whose traffic exceeds x% of the total AP traffic • Web is the most prevalent home application type
Client traffic characterization • Client home application: the application type that accounts for more than X% of a client's traffic • Clients have strong application preferences • ~50% of clients have a home application type (for X=90) • Web: most prevalent home application type • Clients with no home application are dominated by Web • Only a minority of clients have P2P as dominant application
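The home-application definition can be written down as a short Python sketch. The record layout `(client_id, app_type, bytes)` and the function name `home_application` are assumptions for illustration; the same function applies to APs if AP identifiers are used as keys.

```python
from collections import defaultdict

def home_application(records, x=90.0):
    """Return {entity: home application type, or None if no type exceeds x%}.

    records: iterable of (entity_id, app_type, n_bytes) tuples (assumed layout).
    x: threshold in percent of the entity's total traffic.
    """
    per_entity = defaultdict(lambda: defaultdict(int))
    for entity, app, n_bytes in records:
        per_entity[entity][app] += n_bytes

    result = {}
    for entity, by_app in per_entity.items():
        total = sum(by_app.values())
        app, top = max(by_app.items(), key=lambda kv: kv[1])
        # A home application exists only if the top type's share exceeds x%.
        if total > 0 and 100.0 * top / total > x:
            result[entity] = app
        else:
            result[entity] = None
    return result

# Hypothetical per-flow byte counts for two clients.
records = [
    ("client_a", "web", 95), ("client_a", "p2p", 5),
    ("client_b", "web", 50), ("client_b", "p2p", 50),
]
print(home_application(records, x=90.0))
```

With X=90, `client_a` gets Web as its home application (95% of its bytes), while `client_b` has none because no single type exceeds the threshold.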
Wireless traffic load • Wide range of workloads & log normality is prevalent • Light traffic load but with long tails • Dichotomy among APs: • APs dominated by uploaders • APs dominated by downloaders • Majority of APs send & receive packets of small size • Significant number of APs with asymmetric packet sizes: • APs with large sent & small received packets • APs with small sent & large received packets
Application-based characterization • Most popular applications • Web browsing & p2p account for ~81% of total traffic • These applications dominate most users and APs • Web dominates both AP & client traffic share • Network management & scanning activity account for ~17% of total flows • Application mix varies within APs of the same building • Wireless clients have strong application-type interests • File transfer flows (e.g., ftp, p2p) are heavier in the wired network than in the wireless one • Flow sizes per application type • Different between wired & wireless network
In progress … • Focus on applications with real-time constraints • Impact of “extreme” network conditions on performance & user satisfaction • Statistical analysis for client profiles • Comparable analysis with other wireless networks
UNC/FORTH Web Archive • Online repository of • Wireless measurement traces • Packet header, SNMP, SYSLOG, signal quality • Models • Tools • http://netserver.ics.forth.gr/datatraces • Login/password access after free registration • Maria Papadopouli mgp@ics.forth.gr
BLINC • BLINd Classification • Classifies flows into application types • Focus on end hosts rather than on individual flows • 3-level host behavior analysis • Social • Functional • Application • Application signature based classification • Accurate flow classification
Heuristics (2/2) • Community heuristic • Farms of services in neighboring IPs • Recursive detection • Interaction between servers • Mail with Razor servers
Application level • Transport layer interaction between hosts • Based on TCP 4-tuple • Empirically derived signatures – graphlets • Nodes: Src,Dst IP & Src,Dst Port • Edges: Flows through this TCP-tuple • Protocol type • Host behavior against graphlet library
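As a rough illustration of the graphlet structure described on this slide, the Python sketch below builds the node and edge sets for one host from its flows. The column order in `COLUMNS`, the dictionary-based flow records, and the function names are assumptions; matching against a graphlet library is not shown.

```python
# Assumed column order for the graphlet; each node is a (column, value) pair.
COLUMNS = ("src_ip", "dst_ip", "src_port", "dst_port")

def build_graphlet(flows, host):
    """Return (nodes, edges) for all flows whose source IP is `host`.

    flows: iterable of dicts with the keys listed in COLUMNS (assumed layout).
    An edge links the values of two adjacent columns seen in the same flow.
    """
    nodes, edges = set(), set()
    for flow in flows:
        if flow["src_ip"] != host:
            continue
        path = [(col, flow[col]) for col in COLUMNS]
        nodes.update(path)
        edges.update(zip(path, path[1:]))
    return nodes, edges

def column_sizes(nodes):
    """Count distinct values per column, e.g. to compare ports vs. IPs."""
    sizes = {col: 0 for col in COLUMNS}
    for col, _value in nodes:
        sizes[col] += 1
    return sizes

# Hypothetical example: a client opening two connections to one Web server.
flows = [
    {"src_ip": "10.0.0.5", "dst_ip": "192.0.2.10", "src_port": 50001, "dst_port": 80},
    {"src_ip": "10.0.0.5", "dst_ip": "192.0.2.10", "src_port": 50002, "dst_port": 80},
]
nodes, edges = build_graphlet(flows, "10.0.0.5")
print(column_sizes(nodes), len(edges))
```

The per-column counts give a compact fingerprint of the host's behavior that could then be compared with empirically derived signatures.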
Bldg level application usage patterns • % of APs with home application type per bldg type • Weak correlation between building category & # of APs with home application • Distinct APs have different configurations • Uneven traffic distribution across APs of same bldg • APs dominated by Web, P2P, or unknown traffic
Conclusions • Three-level characterization of large scale infrastructure • Support admission control & AP selection mechanisms • Indicate user trends • Assist application specific traffic modeling • Web dominates both AP & client traffic share • P2P systems bear a significant impact • Clients have strong application preferences
Heuristics used in classification • Transport layer protocol: TCP vs. UDP • Cardinality of sets • Ports vs. IPs • Constant in several applications (e.g., malware) • Community heuristic • Farms of services in neighboring IPs • Non-payload flows (e.g., attacks)
Attack graphlets • Address-Scan attack • Address-Scan attack for specific IP set • Port-scan attack
Traffic asymmetry (1/2) • Asymmetry index = total downloaded traffic / total uploaded traffic • Certain APs dominated by uploaders • Asymmetry index per application type • Asymmetry index for P2P traffic < 1 for 40% of APs
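A minimal Python sketch of the asymmetry index follows. The per-AP byte counts, the function names, and the handling of zero upload volume are assumptions; the slide does not say how that case is treated.

```python
def asymmetry_index(downloaded_bytes, uploaded_bytes):
    """Asymmetry index = total downloaded / total uploaded traffic.

    Assumption: with no uploads, the index is infinite if anything was
    downloaded, and undefined (None) if there was no traffic at all.
    """
    if uploaded_bytes == 0:
        return float("inf") if downloaded_bytes > 0 else None
    return downloaded_bytes / uploaded_bytes

def share_upload_dominated(per_ap):
    """Fraction of APs (with a defined index) whose index is below 1.

    per_ap: {ap_id: (downloaded_bytes, uploaded_bytes)} (assumed layout).
    """
    indices = [asymmetry_index(down, up) for down, up in per_ap.values()]
    defined = [i for i in indices if i is not None]
    if not defined:
        return None
    return sum(i < 1 for i in defined) / len(defined)

# Hypothetical byte counts for four APs.
per_ap = {
    "ap1": (100, 400),  # index 0.25: dominated by uploads
    "ap2": (300, 100),  # index 3.0: dominated by downloads
    "ap3": (0, 0),      # no traffic: index undefined
    "ap4": (50, 0),     # downloads only: infinite index
}
print(share_upload_dominated(per_ap))
```

Restricting `per_ap` to the bytes of one application type (for example P2P) gives the per-application variant mentioned on the slide.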
Wireless user application preferences • Similar between wireless & wired users • Flow sizes per application type • Different between wired & wireless network • Possible reasons • Application dependent • User-driven