Explore the world of trace analysis with network measurement data, covering wired and wireless networks, encounter traces, and more. Learn about different trace formats and notable projects. Utilize the TRACE framework to characterize and cluster data, employ MobiLib for modeling and protocol design, and extract useful information for analysis. Enhance your investigation by parsing traces effectively and identifying patterns to draw meaningful conclusions. Uncover valuable insights using various software and algorithms, and leverage access points and authentication server syslogs for comprehensive data interpretation. Maximize efficiency with trace processing tricks and optimize trace conversion for seamless analysis.
What Is a Trace • Measurement data from real-world networks • Wired networks: NetFlow traces • Wireless networks: association traces, encounter traces, … • More general traces that represent other types of networks: GPS traces (Cabspotting)
Different types of Traces • Encounter traces • The Intel/Cambridge Haggle/Pocket Switched Networks project • The U of Toronto PDA-based encounter experiments • Your own encounter trace • Cellphone traces • MIT Reality Mining: encounters, location of users (by cellphone tower/Bluetooth), call logs
Different types of Traces • WLAN traces • UF traces, USC traces, Dartmouth • Vehicular traces • Cabspotting
Format of UF WLAN trace • The formats shown below are not the formats of the raw trace data • Association trace • <Time of the event in seconds> <Access Point> <Event> <MAC> • Login trace • <Time of the event in seconds> <Gateway> LOGIN <MAC> <Username> <Session ID>
Format of UF WLAN trace • Logout trace • <Time of the event in seconds> <Gateway> LOGOUT <MAC> <Username> <Session ID> <duration of session in seconds> <bytes_in> <bytes_out> <packet_in> <packet_out>
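The three record layouts above can be told apart by field count and by the LOGIN/LOGOUT keyword. The following is a minimal sketch, not the authors' tooling: it assumes fields are separated by whitespace and contain no embedded spaces, and the function name `parse_trace_line` and the dictionary keys are hypothetical.

```python
# Names of the five extra counters carried by a LOGOUT record, in slide order.
LOGOUT_COUNTERS = ("duration", "bytes_in", "bytes_out", "packet_in", "packet_out")


def parse_trace_line(line):
    """Parse one processed UF WLAN trace line into a dict.

    Assumption: fields are whitespace-separated and contain no spaces.
    """
    fields = line.split()

    # Association trace: <time> <access point> <event> <MAC>
    if len(fields) == 4:
        time, access_point, event, mac = fields
        return {
            "type": "association",
            "time": float(time),
            "access_point": access_point,
            "event": event,
            "mac": mac,
        }

    # Login/logout trace: <time> <gateway> LOGIN|LOGOUT <MAC> <username> <session ID> ...
    if len(fields) >= 6 and fields[2] in ("LOGIN", "LOGOUT"):
        record = {
            "type": fields[2].lower(),
            "time": float(fields[0]),
            "gateway": fields[1],
            "mac": fields[3],
            "username": fields[4],
            "session_id": fields[5],
        }
        extra = fields[6:]
        if fields[2] == "LOGIN" and not extra:
            return record
        if fields[2] == "LOGOUT" and len(extra) == len(LOGOUT_COUNTERS):
            # Attach duration and byte/packet counters as integers.
            record.update(zip(LOGOUT_COUNTERS, map(int, extra)))
            return record

    raise ValueError(f"unrecognized trace line: {line!r}")
```

Raising on unrecognized lines, rather than skipping them silently, makes it easier to notice when a trace file does not match the expected layout.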
The TRACE framework • [Diagram] Labels: Analyze, Represent, Trace, Characterize (Cluster), MobiLib, Employ (Modeling & Protocol Design)
Analyze the trace • You should have your own perspective about what to investigate • Make sure the trace, by itself or together with other available resources, provides the information you need • Decide on a scheme to parse the trace, or decide which tools (database, …) to use, to get the information out of the trace in your desired format (representation)
Analyze the trace • Now it's time to sit down and extract useful information from the trace! • At this point you have converted the trace into a specific representation or format. Find a way to analyze it; there are many possibilities
Example • Study the daily user flow relationship among locations • From the association trace, we can build a network among all the buildings around campus • If a user first associates with an AP in Building A and then goes to Building B and makes another association, we draw an edge from A to B • The weight of the edge denotes the number of users transitioning from A to B in a day
Cont. • Representation • A matrix whose entry (a, b) denotes the outflux from A to B • Then process the trace and populate the entries of the matrix; in the same run you may also want to collect other details (lags, sequences, …)
Cont. • Get your results • Analyze them with any software or algorithm you want
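The building-to-building example can be sketched in a few lines. This is one possible implementation under stated assumptions, not the authors' code: the input is a list of (time, access point, MAC) association events for a single day, `ap_to_building` is a hypothetical lookup table that the trace itself does not provide, and the edge weight counts distinct users per the slide's wording.

```python
from collections import defaultdict


def daily_transition_matrix(events, ap_to_building):
    """Count distinct users moving between buildings in one day.

    events: iterable of (time, access_point, mac) association events.
    ap_to_building: dict mapping an access point to its building (assumed input).
    Returns a sparse matrix as {(from_building, to_building): user_count}.
    """
    last_building = {}          # mac -> building of the most recent association
    movers = defaultdict(set)   # (a, b) -> set of macs seen going from a to b

    # Process events in time order so "first A, then B" is respected.
    for time, access_point, mac in sorted(events, key=lambda e: e[0]):
        building = ap_to_building.get(access_point)
        if building is None:
            continue  # access point with unknown location; skip it
        previous = last_building.get(mac)
        if previous is not None and previous != building:
            movers[(previous, building)].add(mac)
        last_building[mac] = building

    return {edge: len(users) for edge, users in movers.items()}
```

Storing a set of users per edge keeps a user who commutes between the same two buildings several times from being counted more than once. To count transitions instead of users, replace the sets with integer counters.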
Access Point Syslogs • Users are reported by MAC address • When they associate with an AP • When they disassociate from an AP • When they roam away from an AP • When some other event happens (error in packet checksum, max retry for a packet reached, etc.)
Authentication server syslogs • The authentication server reports the following events • DHCP lease – IP xxx is given to MAC yyy • User log in – User Gatorlink-ID logs in from MAC yyy • User log out – User Gatorlink-ID logs out, having been online for time ttt and sent/received bbb bytes • Every 30 minutes, each online user is reported with its traffic usage over the past 30 minutes
Tricks of Trace Processing • Identify a common format that you can convert multiple traces into • I use one file per user; within each file, each line represents “time location duration” • Abuse your hard drive • Keep intermediate results if they take a long time to generate. You will thank your former self years after generating those files
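Both tricks can be illustrated with a short sketch. It assumes the trace has already been reduced to (user, time, location, duration) tuples; the function names, the `.txt` extension, and the JSON cache format are hypothetical choices, not part of the original workflow.

```python
import json
from collections import defaultdict
from pathlib import Path


def write_per_user_files(records, out_dir):
    """Write one file per user; each line is "time location duration".

    records: iterable of (user, time, location, duration) tuples (assumed input).
    Returns the sorted list of users written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    per_user = defaultdict(list)
    for user, time, location, duration in records:
        per_user[user].append((time, location, duration))

    for user, rows in per_user.items():
        rows.sort()  # chronological order within each user's file
        # Colons (common in MAC addresses) are not valid in some file names.
        filename = str(user).replace(":", "-") + ".txt"
        with open(out / filename, "w") as handle:
            for time, location, duration in rows:
                handle.write(f"{time} {location} {duration}\n")
    return sorted(per_user)


def load_or_compute(cache_path, compute):
    """Reuse a saved intermediate result if present; otherwise compute and save it."""
    path = Path(cache_path)
    if path.exists():
        return json.loads(path.read_text())
    result = compute()
    path.write_text(json.dumps(result))
    return result
```

Wrapping an expensive pass in `load_or_compute` means it only runs once; later analyses read the saved file. JSON is used here only because it is human-readable; any serialization format would do.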