This project aims to develop a method using sensor data to determine behavior probability and assess risk based on sensor behavior. The focus is on GPS and Bluetooth sensors with the constraints of efficient mobile device usage. The difficulty lies in setting a risk threshold for behavior probabilities. While goal 2 still requires more testing, progress has been made on goal 1. The project includes location-based and relative-based behavior analysis using Hidden Markov Models and Bluetooth proximity measurements.
Overview • Overall, the project has two main goals: • 1) Develop a method that uses sensor data to determine behavior probability. • 2) Use the behavior probability to assess risk. • Behavior is defined relative to the sensors available to record the actions of the device's user; so far most of the work has been on GPS & Bluetooth. Several methods were developed for obtaining the probability that a device's current routine matches its previously established routine, under the following constraints: • Work efficiently on a mobile device, which has limited energy, storage, & computational power (so as not to interfere with the device's primary function). • Accurately identify routines in sensor data & assign them probabilities based on device history.
Overview • Achieving goal 1 is much easier than achieving goal 2. Assessing risk from a given percentage requires a threshold, usually set by human intuition. • The threshold gives a cutoff for behavior probabilities: above it, no risk; below it, risk. • E.g. if there is a 40% match to previous behavior, is there risk or not? If the threshold is at 30%, then no. • Why 30%? This is currently left to the intuition of the device's human user & is expected to change; e.g. later on the threshold could be at 65%, and then a 40% match would be risky. • Though 40% may seem high by itself, that judgment is made without context; the choice of threshold provides the context. • The difficulty is that one must almost already know what is risky in order to set a threshold. Because of this, goal 2 has not been met; much progress was made, but more testing is required. Filling in for the human element has always been tricky, so for the time being the focus is on covering goal 1 & assuming the user will determine risk from the given behavior probabilities.
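The threshold rule above can be sketched in a few lines of Python. This is only an illustration: the function name `is_risky` is hypothetical, and treating a match exactly at the threshold as "no risk" is an assumption, since the slides only say "above" and "below".

```python
def is_risky(match_probability: float, threshold: float) -> bool:
    """Return True when the behavior match falls below the user-chosen threshold.

    Assumption: a match exactly equal to the threshold counts as "no risk".
    """
    for name, value in (("match_probability", match_probability),
                        ("threshold", threshold)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return match_probability < threshold


# The 40% match from the slide, judged against the two example thresholds.
low_threshold_result = is_risky(0.40, 0.30)   # threshold at 30%
high_threshold_result = is_risky(0.40, 0.65)  # threshold at 65%
```

The same 40% match flips from not risky to risky when only the threshold changes, which is the point the slide makes about context.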
Location-Based Behavior • The map is divided into zones & the device location is recorded as a sequence of zone names representing device movement. An example path is shown, with each segment corresponding to a GPS scan event (many other zone divisions can be used). • The zone sequence of the example path is: • 123333333333333333334444444556655444444444443333333333221 • A Hidden Markov Model is then trained on these sequences, assigning them probabilities. [Figure: map divided into zones 1 to 7 with the example path]
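To make the idea of scoring zone sequences concrete, here is a deliberately simplified sketch. It is not the Hidden Markov Model the slides describe: it swaps in a plain first-order Markov chain over zone names, with add-one smoothing, because that fits in a few lines. The function names, the smoothing value, and the two test paths are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Zone sequence of the example path from the slide.
EXAMPLE_PATH = "123333333333333333334444444556655444444444443333333333221"
ZONES = "1234567"


def train_transitions(sequences, zones, smoothing=1.0):
    """Estimate P(next zone | current zone) from zone sequences.

    Simplification: a first-order Markov chain, not an HMM.
    Add-one smoothing keeps unseen transitions from getting probability 0.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current in zones:
        total = sum(counts[current].values()) + smoothing * len(zones)
        probs[current] = {
            nxt: (counts[current][nxt] + smoothing) / total for nxt in zones
        }
    return probs


def log_likelihood(seq, probs):
    """Sum of log transition probabilities along a zone sequence."""
    return sum(math.log(probs[a][b]) for a, b in zip(seq, seq[1:]))


model = train_transitions([EXAMPLE_PATH], ZONES)
routine_score = log_likelihood("1233344", model)  # follows the trained path
unusual_score = log_likelihood("1726354", model)  # jumps between zones
```

A path that follows the trained route scores higher than one that jumps between zones; an HMM would add hidden states on top of this, at a higher training cost.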
Relative-Based Behavior (Via Bluetooth proximity) • The Bluetooth sensor gives a relative position that is independent of the when & where of the device. Only the behavior relative to other Bluetooth-enabled devices matters. Behavior in this case is measured by how well the set of currently scanned device IDs matches the history of previously seen IDs & by the rate of new or unseen IDs. • Generally there are three categories of IDs, determined by the user: White, Gray, & Black. • All IDs enter the Gray list first; from this list the user picks IDs to White-list or Black-list. • White-list IDs can increase or decrease behavior probability; additionally, they are associated with low-risk situations, connected with goal 2. • Black-list IDs can increase or decrease behavior probability, but they are associated with high-risk situations, also connected with goal 2. • The increase or decrease depends on whether White/Black-list IDs are seen during the user's routine or not, respectively. • So, for goal 1, only the Gray list is considered, since the White-list & the Black-list are used to help define the situational context given by the user's choices and to determine risk (goal 2).
Relative-Based Behavior (Via Bluetooth proximity) • Since Bluetooth-enabled devices can be stationary or moving, there is a much more dynamic set of behaviors to track, so the times and order in which IDs are seen may not be reliable. • Use overall occurrence probabilities obtained from device history, with no order or time dependence (can be extended to include a limited time dependence). • Given the set of IDs seen in a scan interval, calculate the probability that the event should occur. • IDs belonging to the history increase the probability & those never seen before decrease it. The higher the probability, the lower the risk. [Figure: zones 1 to 7 with Bluetooth signal & movement direction]
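The slides do not give the formula that combines per-ID probabilities into one value for a scan interval, so the following is only one possible sketch. It assumes a "noisy-OR" combination for known IDs (each additional known ID can only raise the score) and a fixed multiplicative penalty for each unknown ID. The function name, the combination rule, and the default penalty of 0.75 are all assumptions.

```python
def interval_probability(scanned_ids, history_prob, unknown_penalty=0.75):
    """Score one scan interval against the device history.

    scanned_ids:     IDs detected in this interval (order and time ignored).
    history_prob:    hypothetical table mapping each known ID to its
                     occurrence probability in the device history.
    unknown_penalty: assumed factor applied once per never-seen ID.
    """
    miss = 1.0       # probability that none of the known IDs "explains" the scan
    unknown = 0
    for device_id in set(scanned_ids):
        if device_id in history_prob:
            miss *= 1.0 - history_prob[device_id]
        else:
            unknown += 1
    known_score = 1.0 - miss  # noisy-OR: grows with every known ID
    return known_score * (unknown_penalty ** unknown)


history = {"aa": 0.5, "bb": 0.2}
familiar = interval_probability({"aa", "bb"}, history)
with_stranger = interval_probability({"aa", "bb", "zz"}, history)
```

Under this rule a scan containing only unknown IDs scores 0, which may be too strict in practice; a nonzero baseline would be one way to soften it.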
Proximity-Based Behavior (Via Bluetooth) • Example setup: • Use a 30-day training history. • Based on the expected number of previously unknown IDs, i.e. those not in the device history, a reduction factor of ¾ is chosen (essentially a penalty for unknown IDs, which the user can set). • The day is divided into 5 min intervals (288 total); the device wakes up & scans for Bluetooth IDs for 1 minute within each interval (limited by the energy consumption of scans). • Training: • For each day of training, the scanned IDs are stored along with the number of times they have been seen in previous scans (this is updated at the end of each day). • When unknown IDs are encountered, they are added to the list of scanned IDs along with their number of occurrences & are also totaled under a single special *ID for that day. • E.g. on the first day all IDs are unknown, so *ID1 is the total of the occurrences of all IDs; *ID2 is the total of all occurrences of unknown IDs on the 2nd day; etc. for *ID3 ... • When no IDs are seen, this is treated as a special NoID, recorded for every day; just as with the unknown *IDs, there is one for every training day: NoID1, …, NoID30. • At the end of the 30 training days there are 30 *IDs representing the probability of seeing unknown IDs when scanning, and similarly 30 NoIDs, in addition to a table containing the probability of seeing any single ID from the device's history.
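The training bookkeeping described above might look roughly like the sketch below. The data layout (a list of days, each a list of scans, each scan a set of IDs), the function name, and the choice to turn counts into probabilities by dividing by the total number of scans are assumptions; the slides do not state the normalization.

```python
from collections import Counter


def train(days):
    """Accumulate Bluetooth scan statistics over the training days.

    days: list of days; each day is a list of scans; each scan is a set of IDs.
    Returns per-ID counts, the per-day unknown totals (*ID1, *ID2, ...),
    the per-day empty-scan totals (NoID1, NoID2, ...), and the scan count.
    """
    id_counts = Counter()
    unknown_per_day = []
    no_id_per_day = []
    total_scans = 0
    for scans in days:
        known_at_start = set(id_counts)  # history is updated at end of day
        day_counts = Counter()
        unknown = 0
        empty = 0
        for scan in scans:
            total_scans += 1
            if not scan:
                empty += 1
                continue
            for device_id in scan:
                day_counts[device_id] += 1
                if device_id not in known_at_start:
                    unknown += 1
        id_counts.update(day_counts)
        unknown_per_day.append(unknown)
        no_id_per_day.append(empty)
    return id_counts, unknown_per_day, no_id_per_day, total_scans


# Two toy training days with three scan intervals each (real days have 288).
toy_days = [
    [{"aa", "bb"}, set(), {"aa"}],
    [{"aa", "cc"}, {"cc"}, set()],
]
id_counts, star_ids, no_ids, total_scans = train(toy_days)
# Assumed normalization: fraction of all scans in which each ID was seen.
id_probability = {d: c / total_scans for d, c in id_counts.items()}
```

On the first toy day every ID is unknown, so the first *ID total equals the total number of ID occurrences that day, as in the slide's example.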
Proximity-Based Behavior (Via Bluetooth) • Evaluation continued: • When no IDs are detected in a scan interval, use the expected probability value from the probability distribution over NoID1, …, NoID30. • This expected probability is calculated by summing the squares of all 30 NoID probabilities. • Finally, each of the 288 scan intervals in a day can be assigned a probability based on the Bluetooth IDs detected. That is, every 5 min the device outputs a percentage value to the user estimating the degree of alignment with past behavior, relative to Bluetooth. • Note that this is detection only and does not include analysis of data transfer, if there is any. • The model can be extended to include data transfer behaviors. • Wifi or other signal types can be included similarly in other models. • The simplicity is meant to allow real-time operation (continuous scan) and to accommodate large numbers of IDs that move in and out of the system. The idea is to test the current model with different extensions, e.g. adding a logic component that analyzes how other devices interact, as seen by the user's device within its Bluetooth range. A more powerful model can always be constructed, but can it function when it is needed? Runtime & energy consumption are major factors for mobile devices and must be balanced against model robustness & the particulars of the deployment environment.
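The sum-of-squares rule for empty scans can be read as an expected value: if the NoID probabilities p_1, ..., p_30 form a distribution, the expected value of p under that distribution is the sum of p_i times p_i. The sketch below follows that reading. The function name and the step that normalizes per-day NoID counts so they sum to 1 are assumptions.

```python
def expected_no_id_probability(no_id_counts):
    """Expected NoID probability, computed as the sum of squared probabilities.

    no_id_counts: per-day counts of scan intervals with no IDs detected.
    Assumption: counts are normalized across days to form the distribution.
    """
    total = sum(no_id_counts)
    if total == 0:
        return 0.0  # no empty scans were ever recorded
    probabilities = [count / total for count in no_id_counts]
    return sum(p * p for p in probabilities)


skewed = expected_no_id_probability([1, 1, 2])      # one day dominates
uniform = expected_no_id_probability([5, 5, 5, 5])  # all days alike
```

With n equally weighted days the result is 1/n, so under this reading a uniform 30-day history would give 1/30; the value rises as empty scans concentrate on fewer days.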
SMALL DEMO OF BLUETOOTH SENSOR