
Analyzing Large-Scale Wireless Network Traces for User Modeling and Protocol Design

This study analyzes empirical data from WLAN traces to understand user behavior, individual mobility, and global encounters in wireless networks. The research aims to incorporate the findings into modeling and protocol design, classify behavioral groups, and enable efficient broadcast techniques. Using the TRACE framework, the study presents a detailed behavior analysis and outlines future work. The analysis includes metrics for mobility models, such as node association and activity patterns, and uses them to build a Time-variant Community model that reproduces realistic mobility characteristics. The model remains theoretically tractable, which makes it useful for evaluating wireless mobile network protocols and for understanding user preferences within WLAN environments.


Presentation Transcript


  1. Analysis of Large-scale Wireless Network Traces and Its Impact on User Modeling and Protocol Design Wei-jen Hsu Advised by Dr. Ahmed Helmy

  2. Emerging Wireless Communication • Opportunities • Challenges • Dynamic network structure • Decentralized service paradigm • Tight coupling between the devices and individuals

  3. Problem Statement • To understand user behavioral patterns in mobile networks from empirical, large-scale data sets • Individual mobility characteristics • Pair-wise similarity • Global encounter pattern • To incorporate the findings in modeling and protocol design • User mobility model • Classify behavioral groups; profile-cast • Efficient broadcast

  4. The TRACE framework • Trace • Represent • Analyze • Characterize • Employ (apply)

  5. Outline • Detailed behavior analysis • Complete case • Future Work

  6. Trace: Trace Sets • In this work we mainly use WLAN traces • Mostly from university campuses or corporate networks (4 universities, 1 corporate network) • The largest data sets about wireless network users available to date (# users / lengths) • No bias: not “special-purpose”, data from all users in the network • For comparison we also look at a vehicular movement trace and a human encounter trace

  7. Trace: Trace Sets • Available information from WLAN traces • MAC addresses of the devices as identifiers • Location/Time of users (our main focus)

     Node: e0_12_29_fc_ba_8c
     Association start time | Location_ID        | Duration
     2197745                | 172.16.8.244_11009 | 4433
     2230200                | 172.16.8.244_11009 | 13320
     2257917                | 172.16.8.244_11009 | 643
     2285119                | 172.16.8.244_11009 | 1017
     2297134                | 172.16.8.244_11009 | 7153
     2304287                | 172.16.8.244_11023 | 6744
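
The sample records on this slide can be loaded into a small structure for later analysis. The following is a minimal sketch, assuming each record is three whitespace-separated fields and that start time and duration are in seconds (the slide does not state the unit). The names `Association` and `parse_trace` are illustrative and do not come from the presentation.

```python
from typing import List, NamedTuple


class Association(NamedTuple):
    start: int      # association start time (unit assumed to be seconds)
    location: str   # Location_ID exactly as it appears in the trace
    duration: int   # association duration (same assumed unit)


# Records for node e0_12_29_fc_ba_8c, copied from the slide.
SAMPLE = """\
2197745 172.16.8.244_11009 4433
2230200 172.16.8.244_11009 13320
2257917 172.16.8.244_11009 643
2285119 172.16.8.244_11009 1017
2297134 172.16.8.244_11009 7153
2304287 172.16.8.244_11023 6744
"""


def parse_trace(text: str) -> List[Association]:
    """Parse 'start location duration' lines; other lines are skipped."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        start, location, duration = fields
        records.append(Association(int(start), location, int(duration)))
    return records


records = parse_trace(SAMPLE)
```

Keeping the location identifier as an opaque string avoids guessing at its internal structure, which the slide does not explain.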

  8. Case study I – individual mobility

  9. Goal • To understand the mobility/usage pattern of individual wireless network users • To observe how environments/user type/trace-collection techniques impact the observations • To propose a realistic mobility model based on empirical observations • That is mathematically tractable • That matches with multiple scenarios

  10. Mobility Models • Mobility models are of crucial importance for the evaluation of wireless mobile network protocols • Requirements for mobility models • Realism (detailed behavior from traces) • Parameterized, tunable behavior • Mathematical tractability

  11. Represent: Metrics for Mobility Models • How often are the nodes present? • Percentage of “online” time • What kind of preference do users show in space? • The percentage of time spent at the most frequently visited locations • What kind of repetition do users show in time? • The probability of re-appearance
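
The three metrics on this slide can be computed directly from association records. The sketch below is one possible reading, not the author's implementation: the toy records, the hour-based time unit, and the day-granularity definition of re-appearance (the node is seen at the same location again exactly `gap_days` later) are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy records: (start_time, location_id, duration), times in hours (assumed unit).
records = [
    (9, "office", 3),     # day 0
    (14, "library", 1),   # day 0
    (33, "office", 2),    # day 1
    (57, "office", 4),    # day 2
    (62, "class", 2),     # day 2
]
TRACE_HOURS = 72  # the toy trace spans three days


def online_fraction(records, trace_length):
    """Fraction of the trace during which the node is associated."""
    return sum(duration for _, _, duration in records) / trace_length


def location_time_fractions(records):
    """Share of online time per location, most visited first."""
    per_location = defaultdict(float)
    for _, location, duration in records:
        per_location[location] += duration
    total = sum(per_location.values())
    shares = [(loc, t / total) for loc, t in per_location.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


def reappearance_probability(records, gap_days, num_days, hours_per_day=24):
    """P(node is seen at the same location again gap_days later), per day."""
    visited = {(start // hours_per_day, loc) for start, loc, _ in records}
    eligible = [(d, loc) for d, loc in visited if d + gap_days < num_days]
    if not eligible:
        return 0.0
    hits = sum((d + gap_days, loc) in visited for d, loc in eligible)
    return hits / len(eligible)
```

With the toy data, the node is online for 12 of 72 hours, spends three quarters of its online time at the office, and re-appears at the same location one day later in two of three eligible cases.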

  12. Characterize: Mobility Characteristics from WLANs • Skewed location preference • On/off activity pattern (plot: Prob.(online time fraction > x)) • Periodic re-appearance • Simple existing models are very different from the characteristics in WLAN

  13. Employ (apply): Time-variant Community (TVC) Model [Spy06, Hsu07] • Skewed location visiting preferences: create “communities” to be the preferred area of movement • Periodical re-appearance: create structure in time (periods, repetitive structure) (figure labels: 75%, 25%)

  14. Theoretical Tractability • For the TVC model, we can derive • Nodal spatial distribution – the demographic profile of the mobility model • Average node degree – important for cluster maintenance and geographic routing • Hitting time/ Meeting time – important for routing performance analysis • With low error when the communication range is small compared to the community sizes (communication disk < 25% of community)

  15. Theoretical Tractability (figure panels: average node degree; spatial distribution; hitting time; meeting time)

  16. Using the TVC Model – Reproducing Mobility Characteristics • (STEP1) Identify the popular locations; assign communities • (STEP2) Assign parameters to the communities according to stats • (STEP3) Add user on-off patterns (e.g., in WLAN, users are usually off when moving)

  17. Using the TVC Model – Reproducing Mobility Characteristics • WLAN trace (example: MIT trace) (figure panels: skewed location visiting preference; periodic re-appearance) *Similar matches achieved for USC and Dartmouth traces.

  18. Using the TVC Model – Reproducing Mobility Characteristics • Vehicular trace (Cab-spotting)

  19. Using the TVC Model – Reproducing Mobility Characteristics • Human encounter trace at a conference (figure panels: inter-meeting time; encounter duration, illustrated for node A encountering node B)

  20. Summary (Case Study I) • We observe some omnipresent mobility characteristics from WLANs • These characteristics are not captured by existing synthetic mobility models (hence the models are not realistic) • We propose the Time-variant Community (TVC) model, which is realistic, theoretically tractable, and flexible

  21. Case study II – Groups in WLAN

  22. Goal • Identify similar users (in terms of long-run mobility preferences) from the diverse WLAN user population • Understand the constituents of the population • Identify potential groups for group-aware service • In this work we classify users based on their long-run mobility trends (or location-visiting preferences) • We consider the semester-long USC trace (spring 2006, 94 days) and the quarter-long Dartmouth trace (spring 2004, 61 days)

  23. Represent: Representation of User Association Patterns • We choose to represent the summary of a user's association in each day by a single vector • Example day: Office, 10AM – 12PM; Library, 3PM – 4PM; Class, 6PM – 8PM • Association vector: (library, office, class) = (0.2, 0.4, 0.4) • Summarize the long-run mobility in an “association matrix”
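
The example on this slide (two hours at the office, one at the library, two in class) can be reproduced with a few lines of code. This is a minimal sketch, assuming each entry is the fraction of the day's online time spent at a location; the function name `association_vector` is illustrative.

```python
def association_vector(visits, locations):
    """Fraction of one day's online time spent at each location.

    visits: list of (location, hours online there) for a single day.
    locations: fixed ordering of locations shared by all vectors.
    """
    time_at = {loc: 0.0 for loc in locations}
    for loc, hours in visits:
        time_at[loc] += hours
    total = sum(time_at.values())
    if total == 0:
        return [0.0] * len(locations)  # the user was never online that day
    return [time_at[loc] / total for loc in locations]


locations = ["library", "office", "class"]
# Office 10AM-12PM, Library 3PM-4PM, Class 6PM-8PM, as on the slide.
day = [("office", 2), ("library", 1), ("class", 2)]
vector = association_vector(day, locations)

# Stacking one such vector per day gives the association matrix.
association_matrix = [vector]
```

Using one fixed location ordering for every day is what makes the rows comparable when they are stacked into the association matrix.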

  24. Eigen-behavior • Eigen-behaviors: The vectors that describe the maximum remaining power in the association matrix (obtained through Singular Value Decomposition), each with quantifiable importance • Eigen-behavior Distance calculates the similarity of users by weighted inner products of eigen-behaviors. • Benefits: Reduced computation and noise
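
Extracting eigen-behaviors from an association matrix can be sketched with NumPy's SVD routine. The weighting used here, each squared singular value divided by the sum of squared singular values, is an assumption about how "quantifiable importance" is measured; the slide does not give the formula. The function name `eigen_behaviors` and the sample matrix are also illustrative.

```python
import numpy as np


def eigen_behaviors(association_matrix, power=0.9):
    """Return (vectors, weights) that capture at least `power` of the matrix power.

    Assumes the matrix has at least one non-zero entry.
    """
    X = np.asarray(association_matrix, dtype=float)
    # Rows of vt are right singular vectors, one entry per location.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    weights = s ** 2 / np.sum(s ** 2)  # assumed importance measure
    # Smallest k whose cumulative weight reaches the requested power.
    k = int(np.searchsorted(np.cumsum(weights), power)) + 1
    k = min(k, len(s))
    return vt[:k], weights[:k]


# Three identical days of (library, office, class) = (0.2, 0.4, 0.4).
days = [[0.2, 0.4, 0.4]] * 3
vectors, weights = eigen_behaviors(days)
```

Because the three days are identical, a single eigen-behavior carries essentially all of the power. The sign of a singular vector is arbitrary, so comparisons should use absolute values or absolute inner products.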

  25. Identify Similar Users • With the distance between users U and V defined as 1 - Sim(U,V), we use hierarchical clustering to find similar user groups. (figure panels: USC; Dartmouth) *AMVD = Average Minimum Vector Distance
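
Grouping users by the distance 1 - Sim(U,V) can be illustrated with a small agglomerative clustering routine. This sketch assumes average linkage and a fixed distance threshold as the stopping rule; the slide names hierarchical clustering but specifies neither choice. The similarity values are hypothetical.

```python
def hierarchical_clusters(dist, threshold):
    """Average-linkage agglomerative clustering on a symmetric distance matrix.

    Merging stops once the closest pair of clusters is farther apart
    than `threshold`.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, x, y = min(
            (
                (linkage(clusters[x], clusters[y]), x, y)
                for x in range(len(clusters))
                for y in range(x + 1, len(clusters))
            ),
            key=lambda item: item[0],
        )
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical similarities Sim(U, V) for four users.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
dist = [[1 - s for s in row] for row in sim]
groups = hierarchical_clusters(dist, threshold=0.5)
```

Users 0 and 1 merge first, then users 2 and 3; the two resulting groups are 0.8 apart on average, so merging stops there. This quadratic-time loop is only meant for small examples.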

  26. Validation of User Groups • Significance of the groups – users in the same group are indeed much more similar to each other than users in randomly formed groups (0.93 vs. 0.46 for USC, 0.91 vs. 0.42 for Dartmouth) • Uniqueness of the groups – the most important group eigen-behavior is important for its own group but not for other groups

  27. Characterize User Groups in WLAN - Observations • Skewed group size distribution – the largest 10 groups account for more than 30% of population on campus. Power-law distributed group sizes. • Most groups can be described by a list of locations with a clear ordering of importance • We also observe groups visiting multiple locations with similar importance – taking the most important location for each user is not sufficient

  28. Enough of words! Let’s see how it works

  29. Summary (Case Study II) • We use SVD to obtain eigen-behaviors of individual users. • We use the eigen-behavior distances and hierarchical clustering to classify WLAN users into similar groups. • This finding is useful for mobility modeling (identifying group sizes and their frequently visited locations), network management, abnormality detection, and group-aware protocols (e.g., profile-cast, our future work)

  30. Case study III – Encounter Pattern

  31. Encounter Events • Derived from simultaneous associations to the same locations • How many other nodes does a node encounter? • On average, only 2%~7% of the population (plot: Prob.(unique encounter fraction > x))
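
Deriving encounters from simultaneous associations amounts to finding overlapping time intervals at the same location. The sketch below assumes sessions are given as (node, location, start, end) tuples and defines the unique encounter fraction as the share of the other nodes that a node meets at least once; both the data and that exact normalization are assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sessions: (node, location, start, end).
sessions = [
    ("a", "library", 0, 10),
    ("b", "library", 5, 15),
    ("c", "library", 12, 20),
    ("d", "office", 0, 20),
]


def encounters(sessions):
    """Map each node to the set of nodes it was co-located with."""
    by_location = defaultdict(list)
    for node, location, start, end in sessions:
        by_location[location].append((node, start, end))
    met = defaultdict(set)
    for visits in by_location.values():
        for (n1, s1, e1), (n2, s2, e2) in combinations(visits, 2):
            # Two intervals overlap when each starts before the other ends.
            if n1 != n2 and s1 < e2 and s2 < e1:
                met[n1].add(n2)
                met[n2].add(n1)
    return met


def unique_encounter_fraction(sessions):
    """Fraction of the other nodes that each node ever encounters."""
    nodes = {session[0] for session in sessions}
    met = encounters(sessions)
    others = max(len(nodes) - 1, 1)  # avoid dividing by zero for one node
    return {node: len(met[node]) / others for node in nodes}


fractions = unique_encounter_fraction(sessions)
```

In the toy data, node b overlaps with both a and c at the library, a and c never overlap with each other, and d meets nobody.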

  32. Represent / Characterize: Encounter-Relationship (ER) graph • Draw a link to connect a pair of nodes if they ever encounter each other • Most node pairs are connected in the ER graph • The ER graphs show Small World graph characteristics • High clustering coefficient • Low average path length
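
The two Small World indicators named on this slide can be computed on any encounter graph. The sketch below uses the average local clustering coefficient and the mean breadth-first-search hop count over connected pairs, which are common definitions but are assumed here since the slide does not define them. The edge list is hypothetical.

```python
from collections import deque


def build_er_graph(pairs):
    """Undirected adjacency sets from (node, node) encounter pairs."""
    adj = {}
    for u, v in pairs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Average over nodes of (links among neighbors) / (possible links)."""
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(
            1 for u in neighbors for v in neighbors if u < v and v in adj[u]
        )
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean BFS hop count over all connected ordered node pairs."""
    total, count = 0, 0
    for source in adj:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        total += sum(hops.values())
        count += len(hops) - 1
    return total / count if count else 0.0


# Hypothetical encounters: a triangle a-b-c plus a pendant node d.
graph = build_er_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
cc = clustering_coefficient(graph)
apl = average_path_length(graph)
```

Node identifiers must be mutually comparable (for example, all strings) because the neighbor-pair count uses `u < v` to visit each pair once.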

  33. Future Work – Profile-cast

  34. Goal • To send messages to a group of nodes within the general population • The group is defined by the intrinsic behavior patterns of the nodes (CISE students, library visitors, moviegoers) • The sender does not know the network identities (addresses) of the destinations • Different from multi-cast: No join/leave, no group maintenance

  35. Profile-cast Use Cases • Mobility profile-cast (current work) • Targeting people who move in a particular pattern (lost-and-found, context-aware announcement) • Rely on the “similarity metric” between users • Mobility-independent profile-cast (future work) • Targeting people with a certain characteristic independent of mobility (classical music lovers) • Rely on the “Small World” encounter pattern

  36. Mobility Profile-cast (inter-group) • Scoped message spread in the mobility space (figure: nodes labeled S, N, and D placed in the mobility space; “Forward??”)

  37. Inter-group profile-cast Operation: 1. Profiling • Profiling user mobility • The mobility of a node is represented by an association matrix • Each row represents an association vector for a time slot • An entry represents the percentage of online time during time slot i at location j • Singular value decomposition provides a summary of the matrix (a few eigen-behavior vectors are sufficient; e.g., for 99% of users, at most 7 vectors describe 90% of the power in the association matrices for 94 days) (figure: nodes S and N; summary vectors)

  38. Inter-group profile-cast Operation: 2. Forwarding decision • Determining user similarity • Nodes exchange their eigen-behaviors and the corresponding weights at each encounter • Similarity of user mobility is evaluated by weighted inner products of eigen-behaviors • A message is forwarded if Sim(U,V) is higher than a threshold (recall that the goal is to deliver messages to nodes with a similar profile)
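
The forwarding decision on this slide can be sketched as a similarity check at encounter time. The similarity form used here, a sum over all pairs of eigen-behaviors of the product of their weights and the absolute inner product, is one plausible reading of "weighted inner products" and is an assumption; the threshold value and the example profiles are hypothetical.

```python
def similarity(eig_u, w_u, eig_v, w_v):
    """Weighted sum of absolute inner products between two eigen-behavior sets.

    eig_*: list of eigen-behavior vectors (unit length assumed).
    w_*: matching list of weights (assumed to sum to at most 1).
    """
    total = 0.0
    for u, weight_u in zip(eig_u, w_u):
        for v, weight_v in zip(eig_v, w_v):
            dot = sum(a * b for a, b in zip(u, v))
            # abs() because the sign of a singular vector is arbitrary.
            total += weight_u * weight_v * abs(dot)
    return total


def should_forward(eig_u, w_u, eig_v, w_v, threshold):
    """Forward the message only to a sufficiently similar node."""
    return similarity(eig_u, w_u, eig_v, w_v) > threshold


# Hypothetical profiles over three locations, one eigen-behavior each.
sender = ([[1.0, 0.0, 0.0]], [1.0])
similar_node = ([[0.6, 0.8, 0.0]], [1.0])
different_node = ([[0.0, 0.0, 1.0]], [1.0])
THRESHOLD = 0.5  # illustrative value, not taken from the slides

forward_to_similar = should_forward(*sender, *similar_node, THRESHOLD)
forward_to_different = should_forward(*sender, *different_node, THRESHOLD)
```

Raising the threshold restricts forwarding to closer profiles, which is one way to trade success rate against overhead.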

  39. Evaluation • Based on the USC WLAN trace for realistic user mobility (spring 2006, 94 days, 5000 users) • We use hierarchical clustering to identify 200 distinct groups based on mobility profile. • We pick groups with 5 or more members and randomly pick 20% of the members in these groups as senders

  40. Evaluation • Spanning the spectrum of grouping knowledge • No user grouping info: Epidemic and Random Tx. (simple, not optimized) • Inferred user grouping info: Similarity-based protocol • Complete user grouping info: Centralized protocol (highly efficient, but not practical)

  41. Evaluation - Result (figure panels: Success Rate; Delay; Overhead) • Centralized: Excellent success rate with only 3% overhead. • Similarity-based: • (1) 61% success rate at low overhead, 92% success rate at 45% overhead • (2) A flexible success rate – overhead tradeoff • RTx with infinite TTL: Much more overhead under similar success rate • Short RTx with many copies: Good success rate/overhead, but delay is still long

  42. Mobility Profile-cast (intra-group) Goal (figure panels: Flooding; Similarity; Single long random walk; Multiple short random walks; source marked S)

  43. Mobility Profile-cast (inter-group) • Sending to a mobility profile specified by the sender • Gradient ascent followed by local flooding (in the mobility space) • The current message holder holds on to the message until it encounters a node with higher similarity to the target • When the message reaches a point close enough to the target, local flooding is triggered
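
The two phases on this slide can be walked through on a time-ordered list of encounters. This is a simplified sketch under stated assumptions: "close enough" is modeled as a fixed similarity threshold, local flooding spreads copies only among nodes at or above that threshold, and the similarity values and encounter sequence are hypothetical. None of these details are given in the slides.

```python
def gradient_ascent_then_flood(source, encounters, sim_to_target, flood_threshold):
    """Single-copy gradient ascent, then flooding among nodes near the target.

    encounters: time-ordered list of (node_a, node_b) meetings.
    sim_to_target: dict mapping node -> similarity to the target profile.
    Returns (holders of the single copy in order, nodes reached by flooding).
    """
    holder = source
    path = [source]
    delivered = set()
    if sim_to_target[source] >= flood_threshold:
        delivered.add(source)  # the source is already close to the target

    for a, b in encounters:
        if sim_to_target[holder] < flood_threshold:
            # Gradient-ascent phase: hand over only to a more similar node.
            if holder in (a, b):
                other = b if holder == a else a
                if sim_to_target[other] > sim_to_target[holder]:
                    holder = other
                    path.append(holder)
                    if sim_to_target[holder] >= flood_threshold:
                        delivered.add(holder)  # flooding starts here
        else:
            # Local-flooding phase: copy to any encountered node near the target.
            for x, y in ((a, b), (b, a)):
                if x in delivered and sim_to_target[y] >= flood_threshold:
                    delivered.add(y)
    return path, delivered


# Hypothetical similarities to the target profile.
sim_to_target = {
    "s": 0.1, "n1": 0.4, "n2": 0.3, "n3": 0.8, "n4": 0.2, "t1": 0.9, "t2": 0.85,
}
meetings = [
    ("s", "n1"), ("n1", "n2"), ("n1", "n3"),
    ("n3", "t1"), ("t1", "n4"), ("t2", "t1"),
]
path, delivered = gradient_ascent_then_flood("s", meetings, sim_to_target, 0.75)
```

In this run the single copy climbs from s to n1 to n3, skipping n2 because it is less similar than the current holder, and flooding then reaches t1 and t2 while leaving the dissimilar n4 untouched.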

  44. Mobility Profile-cast (inter-group) Goal (figure panels: Flooding; Flooding_sim; Gradient-ascent; Single long random walk; Multiple short random walks; each panel marks S and T.P.)

  45. Mobility Profile-cast (inter-group)

  46. Performance Comparison • Gradient ascent helps to overcome the difficult case, when the source is far from T.P. • A few long RWs are better when S is far from T.P., but many short RWs are better when S is close to T.P.

  47. Performance Comparison • A few long RWs are better when S is far from T.P., but many short RWs are better when S is close to T.P. • Gradient ascent helps to overcome the difficult case, when the source is far from T.P. • Gradient ascent has some extra delay compared with flooding

  48. Future Work • Mobility-independent profile-cast • Members of the target group are not necessarily “close” in the mobility space • The encounter pattern provides a network in which most nodes are reachable • We don’t want to flood – How to leverage the Small World encounter pattern to reach the “neighborhood” of most nodes efficiently?

  49. Mobility Independent Profile-cast Goal (figure panels: Flooding; SmallWorld-based; Single long random walk; Multiple short random walks; source marked S)

  50. Future Work • One-copy-per-clique in the “mobility space” • We expect this to work because similarity in mobility leads to frequent encounters (figure: interest space, mobility space, and physical space, with source S; “Forward?”)
