Detecting and Correcting Malicious Data in VANETs Philippe Golle, Dan Greene, Jessica Staddon Palo Alto Research Center Presented by: Jacob Lynch
Table of Contents • Introduction • Related Work • Classification of Attacks • Distinguishability • Model • Example • Conclusion
Introduction • Vehicular ad-hoc networks (VANETs) rely heavily on node-to-node communication • Potential for malicious data • VANETs need a method for evaluating the validity of data • Nodes search for explanations of the data they receive and accept the data supported by the highest-scoring explanation • Nodes can tell “at least some” other nodes apart from one another • A parsimony argument accurately reflects adversarial behavior in a VANET
Introduction (2) • Each node builds a world view offline • Rules: two vehicles cannot occupy the same position at the same time, etc. • Statistics: vehicles rarely travel faster than 100 MPH, etc. • Node density combined with mobility supports the parsimony argument
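The rule and statistic checks above can be sketched as simple predicates. A minimal Python illustration, assuming a trivial `(vehicle_id, time, position)` report format; all names are illustrative, not from the paper:

```python
# Sketch of a node's offline "world view" checks (illustrative only).

MAX_PLAUSIBLE_MPH = 100  # statistic: vehicles rarely exceed 100 MPH


def violates_rule(reports):
    """Rule: two vehicles cannot occupy the same position at the same time.

    reports: iterable of (vehicle_id, time, position) tuples.
    """
    occupied = {}  # (time, position) -> vehicle_id
    for vehicle_id, time, position in reports:
        key = (time, position)
        if key in occupied and occupied[key] != vehicle_id:
            return True  # two distinct vehicles claim the same spot
        occupied[key] = vehicle_id
    return False


def is_statistically_rare(speed_mph):
    """Statistic: flag speeds that are possible but rare."""
    return speed_mph > MAX_PLAUSIBLE_MPH
```

A node could run such checks against incoming assertions to score how plausible an explanation of them is.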
Related Work • Sybil attacks can foil many algorithms • Resource testing (storage, computation, communication) in MANETs • Not appropriate for VANETs, since attackers may cheaply acquire resources • Node registration does not scale well • Position verification can identify messages coming from the same source
Classification of Attacks • Decisions based on likelihood of attack scenarios in a VANET, not accumulation of agreeing data • Distinguish attacks based on • Nature • Target • Scope • Impact
Attack Nature • Adversary may report • False information about other parts of the VANET • False information about itself • Some attacks may be unpreventable • If a node can only sense distance instead of precise location, this leaves an area within which a single node may successfully mount Sybil attacks
Attack Target • Local targets • Close proximity to the attacker • Better for the adversary because the likelihood of conflicting data from neighbors is reduced • Harder to maintain proximity, so less likely • Remote targets • Farther away • Data received from neighbor nodes may be conflicting • Easier for an adversary to set up
Attack Scope • Scope is measured by the area of nodes holding data of uncertain validity • Scope is limited if the area of affected nodes is small • The affected area may be local or remote relative to the malicious nodes • An attack is extended if a larger area of nodes is affected • The approach is designed to slow local attacks from growing into extended attacks by using information propagation
Attack Impact • Three outcomes of an attack • Undetected • Attack is completely successful • May occur when node is alone or completely surrounded by malicious nodes • Detected • Attack is detected but uncertain data remains • Nodes have access to honest nodes, but insufficient information to justify the risk in attempting to correct data • Corrected • Attack is detected and corrected with no remaining uncertain data • Lots of honest nodes available, enough information to identify false information and correct the attack
Model Exploitation • An attacker may choose an attack whose effects are hidden by other incorrect explanations that the model’s ordering relation ranks as more likely • Two ways to help prevent this • Make such hidden attacks more costly in the model than simpler attacks • Allow the model to be changed, so it adjusts to short-term and long-term changes • Even though the possibility of a complicated attack is included in the model, most attackers will use simple attacks, which makes the sophisticated attacker’s job easier
Distinguishability • In order to tell nodes apart, there are four assumptions • A node can bind observations of its local environment with the communication it receives • A node can tell its local neighbors apart • The network is sufficiently dense • Nodes can authenticate communication to one another after coming close enough
Local Distinguishability • A node can distinguish local neighbors • A node can associate a message with the physical source of that message • A node can measure the relative position of the source of a message • Example setup • Equip nodes with cameras and exchange messages using visible or infrared light • Position is estimated by analyzing the light, and a message is tied to its source because the node can tell where it came from • Time of arrival, angle of arrival, and received signal strength may also be used, though these may be tampered with
Extended Distinguishability • Nodes communicate local observations to nodes farther away • If multiple trusted nodes verify farther nodes as distinct, those nodes may be included in the world view as distinct • Private/public keys, refreshed constantly, authenticate communication • Distinguishability is lost once a key is refreshed if the node has moved out of the local neighborhood
Privacy • Trade-off between privacy and the ability to detect and correct malicious data • Changing keys increases privacy but hinders detection and correction of malicious data • If an isolated node that regularly reports its position changes its key • It is easy to infer that the new key belongs to the same node based on trajectories • Suggestions for changing keys • Change keys at synchronized times • Introduce gaps in reported data near key changes • Change keys when nodes are near one another
Model • Nodes may record an observation if the location of the event is within their observation range for the entire duration of the event • Assertions recorded by a node are instantaneously available to all other nodes • The value of data declines the farther from the event it is transmitted, so the model deals with a small area
Model (2) • To explain a set of events at a node • Each event must be tagged with a hypothesis • Hypotheses are chosen from a set of hypotheses • The set of hypotheses is partitioned into valid and invalid based on the model • If all the hypotheses matched to the set of events are valid, then the explanation is valid • Explanations are ordered by plausibility, for example using Occam’s razor
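The explanation machinery above can be sketched in a few lines. A hedged Python illustration, where the event-to-hypothesis dictionary and the score function are assumed placeholders, not the paper’s formalism:

```python
# Sketch of the explanation model (representation is illustrative).

def is_valid_explanation(explanation, valid_hypotheses):
    """An explanation tags each event with one hypothesis; it is valid
    only if every hypothesis it uses comes from the valid set."""
    return all(h in valid_hypotheses for h in explanation.values())


def best_explanation(explanations, valid_hypotheses, score):
    """Among the valid explanations, return the lowest-scoring
    (most parsimonious) one, per Occam's razor; None if none valid."""
    valid = [e for e in explanations
             if is_valid_explanation(e, valid_hypotheses)]
    return min(valid, key=score) if valid else None
```

The ordering is deliberately pluggable: `score` stands in for whatever plausibility measure the model supplies.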
Example • Assume nodes are able to precisely sense the location of neighbors within communication range • There is a set of observed events K, which can include observations nodes make about themselves • An explanation is valid in this VANET model if there is a reflexive observation for every node, and every non-reflexive observation agrees with the reflexive observations
Example (2) • Each node constructs an explanation • Label each observation in the set of events as truthful, malicious, or spoof • The observations made by the node constructing the explanation are truthful • Observers labeled as spoofs must not have any of their observations labeled truthful • One observation may be added per reflexive observation to supply correct location information consistent with the other truthful observations
Example (3) • Score each explanation by the number of distinct observers labeled malicious • The valid explanation with the fewest malicious nodes is considered the simplest and most plausible • There may be enough information in the set of events to identify all the truthful and malicious nodes
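The scoring rule above admits a direct sketch. A minimal Python illustration, with the `(observer, claim) -> label` encoding assumed for the example:

```python
# Sketch of explanation scoring: count distinct malicious observers.

def score_explanation(labels):
    """labels: dict mapping (observer, claim) -> one of
    "truthful", "malicious", "spoof"."""
    return len({observer
                for (observer, _claim), label in labels.items()
                if label == "malicious"})


def most_plausible(explanations):
    """The explanation with the fewest malicious observers wins."""
    return min(explanations, key=score_explanation)
```

Note the count is over distinct observers, so an observer with several malicious observations adds only one to the score.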
Example (4) • When there are only a few malicious nodes, explanations can be computed by • Treating truthful observations as arcs in a graph and running a breadth-first search from the node’s own location, traversing arcs as long as the next node has not been labeled malicious • Labeling all unreached nodes as spoofs • The algorithm terminates when it has found the explanations consistent with the VANET model that have the fewest malicious nodes
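The breadth-first search above can be sketched as follows. A hedged Python illustration in which the adjacency-list encoding of truthful observations is an assumption for the example:

```python
from collections import deque

# Sketch of the BFS labeling step: arcs are truthful observations,
# traversal stops at nodes hypothesized to be malicious, and anything
# unreached is labeled a spoof.


def label_by_bfs(start, arcs, malicious):
    """start: the explaining node; arcs: dict node -> observed neighbors;
    malicious: set of nodes hypothesized malicious.
    Returns (reached, spoofs)."""
    reached = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in arcs.get(node, []):
            # do not traverse through nodes labeled malicious
            if neighbor in reached or neighbor in malicious:
                continue
            reached.add(neighbor)
            queue.append(neighbor)
    all_nodes = set(arcs) | {n for ns in arcs.values() for n in ns}
    spoofs = all_nodes - reached - malicious
    return reached, spoofs
```

An outer loop (not shown) would try candidate malicious sets in order of increasing size and stop at the first set yielding an explanation consistent with the model.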
Example (5) • A second example of the model is included • Nodes are not able to distinguish other nodes with the same precision • Another breadth-first search generates explanations • Explanations are ordered by looking for few malicious nodes and a regular density, as opposed to sparse or dense patterns of nodes
Conclusion • Accurate and precise sensor data is important in identifying malicious nodes and data • Finding the most likely explanation in each case will be difficult • Manageable when there are only a few malicious nodes • Could be accelerated by having nodes share candidate explanations with each other