Sherlock is around: Detecting network failures with local evidence fusion
Study group 2012.04.09 Junction
Qiang Ma 1, Kebin Liu 2, Xin Miao 1, Yunhao Liu 1,2
1 Department of Computer Science and Engineering, Hong Kong University of Science and Technology
2 MOE Key Lab for Information System Security, School of Software, Tsinghua National Lab for Information Science and Technology, Tsinghua University
Introduction
• Motivations:
  • Widely deployed WSNs for numerous applications
  • Need to sustain for years and operate reliably
  • Error-prone and subject to component faults and performance degradation
• Exploring the root causes is more challenging in WSNs:
  • Ad-hoc nature of WSNs: large scale, dynamic topology changes
  • Limited resources of sensor nodes: power, computation capability
  • A large variety of WSN-specific protocols
Related works
• Traditional/popular diagnosis approach: sink-based
  • Actively collects global evidence from the sensor nodes to the sink
    • Remaining energy, MAC-layer backoff, neighbor table, routing table …
  • Conducts centralized analysis at the powerful back-end
  • Disadvantage
    • Large communication overhead of the evidence collection process is hard to avoid
• Self-diagnosis
  • Injects a fault inference model into the sensor nodes
  • Makes local decisions
  • Disadvantages
    • Results from single nodes are inaccurate due to the narrow scope
    • Inconsistent results from different inference processes
Local diagnosis (LD2)
• Main design
  • Diagnosis efficiency
    • Local diagnosis process instead of the back-end
    • Reduces communication overhead
  • Diagnosis accuracy
    • Takes the judgments from all nodes within the local area into consideration
System architecture
• Naïve Bayesian Classifier (NBC): encodes the probabilistic correlation between a set of state attributes and the root causes
• Workflow:
  • Each node runs the NBC (state attributes = evidences)
  • It maintains a posterior probability distribution P(root causes | evidences)
  • Once a node detects an anomaly (e.g., a neighbor has been removed from its neighbor list), the diagnosis process is triggered
  • It constructs a fusion tree and performs evidence fusion using the Dempster-Shafer theory of evidence (DST)
• Advantages:
  • Balances the workload
  • Ensures a local consensus on the final diagnosis result
Naïve Bayesian classifier (NBC)
• Parameters learned from historical data
  • R: root cause; Fi, i = 1, …, n: evidences (each Fi takes s discrete values)
• Calculate the posterior probability of each root cause:
  • P(R | F1, …, Fn) = P(R) ∏i P(Fi | R) / P(F1, …, Fn)
  • P(R) and P(Fi | R) are pre-learned; the denominator is a scale factor, constant across different R
• Each node, based on the observed Fi, calculates this posterior for every root cause
• After a mapping (normalization), the posteriors are used as the basic probability assignments in DST (see the sketch below)
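A minimal sketch, in Python rather than the motes' embedded code, of how a node could turn its locally observed metrics into the NBC posterior; the dictionary-based parameter tables, the smoothing constant, and the function name are illustrative assumptions, not LD2's actual data structures.

```python
def nbc_posterior(prior, likelihood, observed):
    """Naive Bayes posterior over root causes.

    prior:      {root_cause: P(R)}                        -- pre-learned
    likelihood: {root_cause: {metric: {value: P(F=v|R)}}} -- pre-learned
    observed:   {metric: value}                           -- evidences sampled locally
    Returns the normalized posterior P(R | F1..Fn).
    """
    scores = {}
    for r, p_r in prior.items():
        p = p_r
        for metric, value in observed.items():
            # multiply in P(Fi | R); small floor for values unseen in training
            p *= likelihood[r].get(metric, {}).get(value, 1e-6)
        scores[r] = p
    z = sum(scores.values())  # plays the role of the scale factor P(F1..Fn)
    return {r: s / z for r, s in scores.items()}
```

The normalized posterior is what the node then maps into its basic probability assignments for DST.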
Dempster-Shafer theory (DST)
• Fundamentals
  • Allows us to combine evidence from different sources and arrive at a degree of belief in all possible states/hypotheses (R, root causes) that takes into account all the available evidence (F, metrics)
• Terms:
  • Hypotheses: the possible root causes
  • The frame of discernment Ω: the set of all hypotheses; masses are assigned over its power set 2^Ω
  • Basic probability/belief assignment m: 2^Ω → [0, 1] (subjective or objective)
    • Any A ⊆ Ω with m(A) > 0 is a focal element
    • Constraints: m(∅) = 0 and Σ_{A⊆Ω} m(A) = 1
  • In LD2 the NBC posterior probabilities serve as objective assignments
Dempster-Shafer theory (DST)
• Different from the concept of probability
  • Belief: Bel(A) = Σ_{B⊆A} m(B)
  • Plausibility: Pl(A) = Σ_{B∩A≠∅} m(B), i.e., Pl(A) = 1 − Bel(¬A)
  • Belief ≤ plausibility
• In this study
  • The frame of discernment Ω = {R0, R1, …, Rn}, where the Ri are the root causes and R0 means "no problem"
  • Each node only generates basic probability assignments on these singleton hypotheses
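A small sketch of belief and plausibility for a basic probability assignment; representing a mass function as a dict keyed by frozensets of hypotheses is an assumption made for illustration throughout these sketches.

```python
def belief(m, a):
    """Bel(A): total mass of the focal elements contained in A."""
    a = frozenset(a)
    return sum(v for s, v in m.items() if s <= a)

def plausibility(m, a):
    """Pl(A): total mass of the focal elements intersecting A (= 1 - Bel(not A))."""
    a = frozenset(a)
    return sum(v for s, v in m.items() if s & a)

# Example on a frame {R0, R1, R2}; Bel({R1}) <= Pl({R1}) always holds.
m = {frozenset({"R1"}): 0.5, frozenset({"R2"}): 0.2, frozenset({"R1", "R2"}): 0.3}
print(belief(m, {"R1"}), plausibility(m, {"R1"}))  # 0.5 0.8
```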
Dempster’s rule of combination • Combine the belief from different observers (sensor nodes) • To do evidence fusion conflict factor • joint mass • Problem: • The combination result goes against the practical sense!! • When with low or high conflict factor
Low/high conflict factor
• Example (Zadeh's classic counterexample):
  • Hypotheses Ω = {T, M, C}
    • T: brain tumor
    • M: meningitis
    • C: concussion
  • The frame of discernment is 2^Ω
  • Both observers consider T very unlikely, yet the combination yields m(T) = 1!!
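The slide's original numbers were not preserved, but its m(T) = 1 conclusion matches Zadeh's classic counterexample, so the values below are assumed for illustration; dempster_combine is reused from the sketch above.

```python
# Assumed (Zadeh-style) values: both observers consider T very unlikely
# and strongly disagree about the remaining hypotheses.
T, M, C = frozenset({"T"}), frozenset({"M"}), frozenset({"C"})
m1 = {M: 0.99, T: 0.01}   # observer 1: almost certainly meningitis
m2 = {C: 0.99, T: 0.01}   # observer 2: almost certainly concussion

joint, k = dempster_combine(m1, m2)
print(k)      # 0.9999 -- nearly total conflict
print(joint)  # {frozenset({'T'}): 1.0} -- m(T) = 1, against practical sense
```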
Modified combination rule
• Trust the results that have high consensus among the nodes
• Definition 1: the distance between m1 and m2 is
  • d(m1, m2) = sqrt( ½ (‖m1‖² + ‖m2‖² − 2⟨m1, m2⟩) )
  • where ⟨m1, m2⟩ = Σ_A Σ_B m1(A) m2(B) |A∩B| / |A∪B|
  • and ‖m‖² = ⟨m, m⟩
• Proof: d is a valid distance on mass functions (details omitted)
Modified combination rule
• Definition 2: the similarity degree of m1 and m2 is Sim(m1, m2) = 1 − d(m1, m2)
  • If one node i's mi is similar to all the others, then we believe this node's mi is important
• Definition 3: the basic confidence of evidence i (i = 1, 2, …, N) is its support Sup(mi) = Σ_{j≠i} Sim(mi, mj)
  • Normalization: Crdi = Sup(mi) / Σ_{j=1..N} Sup(mj)
• Modified assignment = basic probability assignment × basic confidence
  • Reduces the impact of the evidences with less importance (see the sketch below)
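A sketch of Definitions 1-3, assuming the distance in Definition 1 is the Jousselme-style distance reconstructed above; the final step that sums the confidence-discounted masses into a weighted-average assignment is one common way to apply such confidences and is an assumption, not necessarily LD2's exact fusion step.

```python
import math

def _jaccard(a, b):
    """|A ∩ B| / |A ∪ B| for two focal elements (frozensets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def _inner(m1, m2):
    """<m1, m2> = sum over A, B of m1(A) m2(B) |A∩B|/|A∪B|."""
    return sum(m1[a] * m2[b] * _jaccard(a, b) for a in m1 for b in m2)

def mass_distance(m1, m2):
    """Definition 1: d(m1, m2) = sqrt(0.5 (||m1||^2 + ||m2||^2 - 2<m1, m2>))."""
    return math.sqrt(0.5 * (_inner(m1, m1) + _inner(m2, m2) - 2 * _inner(m1, m2)))

def basic_confidence(masses):
    """Definitions 2-3: Sim = 1 - d; a node's support is the sum of its
    similarity to every other node; confidences are the normalized supports."""
    n = len(masses)
    support = [sum(1 - mass_distance(masses[i], masses[j])
                   for j in range(n) if j != i) for i in range(n)]
    total = sum(support)
    return [s / total for s in support]

def weighted_masses(masses):
    """Modified assignment = basic probability assignment x basic confidence.
    Summing the discounted masses yields a confidence-weighted average mass,
    which is again a valid BPA because the confidences sum to 1."""
    confs = basic_confidence(masses)
    avg = {}
    for m, c in zip(masses, confs):
        for a, v in m.items():
            avg[a] = avg.get(a, 0.0) + c * v
    return avg
```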
Evidence fusion
• Criterion:
  • The fusion result stays the same even if we change the fusion order
• Theorem 1: the modified combination rule satisfies this order-independence criterion
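A quick sanity check reusing dempster_combine from the earlier sketch: Dempster's rule itself is commutative and associative, so the fused result does not depend on the combination order; reading Theorem 1 as the corresponding guarantee for the modified, confidence-weighted rule is an assumption based on the criterion stated above.

```python
R0, R1, R2 = frozenset({"R0"}), frozenset({"R1"}), frozenset({"R2"})
mA = {R0: 0.6, R1: 0.3, R2: 0.1}
mB = {R0: 0.2, R1: 0.7, R2: 0.1}
mC = {R0: 0.5, R1: 0.4, R2: 0.1}

# Fuse in two different orders and compare the results.
left, _ = dempster_combine(dempster_combine(mA, mB)[0], mC)
right, _ = dempster_combine(mA, dempster_combine(mB, mC)[0])
assert all(abs(left[a] - right[a]) < 1e-12 for a in left)
```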
Fusion algorithm
• Trigger node
  • Detects abnormal symptoms
    • Node crash
    • Traffic contention
    • Route loop
  • Determines the diagnosis area
    • ???
• Standard set
  • Reduces computation overhead
  • Built from the root node and its one-hop neighbors
• DREQ contains
  • The details of the diagnosis task
  • The standard set (=> basic confidence)
• Establish the fusion tree
Evidence fusion algorithm
• Includes handling for the case where a DREQ packet is lost (see the fusion-tree sketch below)
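A minimal sketch of the fusion along the tree, under the assumption (not the paper's published pseudocode) that each node combines its own confidence-weighted mass with its children's fused masses and forwards the result toward the root; dempster_combine is reused from the earlier sketch.

```python
class FusionNode:
    """A node in the fusion tree; `mass` is its (confidence-weighted) BPA."""
    def __init__(self, node_id, mass, children=None):
        self.node_id = node_id
        self.mass = mass
        self.children = children or []

def fuse_up(node):
    """Post-order fusion: combine this node's mass with each child's fused mass."""
    fused = node.mass
    for child in node.children:
        fused, _k = dempster_combine(fused, fuse_up(child))
    return fused

def diagnose(root):
    """The root of the fusion tree reports the root cause with the largest fused mass."""
    fused = fuse_up(root)
    return max(fused.items(), key=lambda kv: kv[1])
```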
Evaluation
• CitySee project:
  • Urban carbon dioxide sensing
  • 494 sensor nodes
• Testbed using the CTP protocol
  • 50 TelosB motes
• Comparison
  • LD2 vs. TinyD2
  • Manually injected evidences
    • Node crash
    • Traffic contention
    • Route loop
• Metrics
  • False negative rate vs. false positive rate
TinyD2 [1]
• Fault detector (self-diagnosis)
  • Finite state machine (FSM) model
  • Fault detector M = (E, S, S0, f, F)
    • E: the set of input evidences
    • S: the set of states
    • S0: the start state
    • f: the state transition function
    • F: the set of accept states (an accept state is a final diagnosis decision)
• E.g., high retransmission rate between A and B (A -> B)
  • A finds the rate increasing
  • A broadcasts its current state together with the fault detector
  • If B receives it, B checks for ACK or DATA
  • B moves to S2 and broadcasts to its neighbors Ci
  • NUM: a threshold
  • Bc: severe contention at B
[1] Kebin Liu, Qiang Ma, Xibin Zhao, Yunhao Liu, "Self-diagnosis for large scale wireless sensor networks," INFOCOM 2011.
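For contrast with LD2, a small, hypothetical sketch of TinyD2's FSM-style detector M = (E, S, S0, f, F); the concrete states, evidence names, and transitions below are illustrative assumptions, not the detector tables from [1].

```python
class FaultDetector:
    """FSM fault detector M = (E, S, S0, f, F)."""
    def __init__(self, transitions, start, accept):
        self.transitions = transitions   # f: (state, evidence) -> next state
        self.state = start               # S0
        self.accept = accept             # F: accept state -> final diagnosis

    def feed(self, evidence):
        self.state = self.transitions.get((self.state, evidence), self.state)
        return self.accept.get(self.state)  # diagnosis once an accept state is reached

# Hypothetical detector for "high retransmission rate from A to B":
detector = FaultDetector(
    transitions={
        ("S0", "A_high_retx"): "S1",    # A notices the rate increasing
        ("S1", "B_got_data"): "S2",     # B confirms it received DATA/ACK
        ("S2", "retx_over_NUM"): "Bc",  # retransmissions exceed the threshold NUM
    },
    start="S0",
    accept={"Bc": "severe contention at B"},
)

result = None
for ev in ["A_high_retx", "B_got_data", "retx_over_NUM"]:
    result = detector.feed(ev)
print(result)  # "severe contention at B"
```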
Time cost
• Setup: problem node 25 (with 16 neighbors); root node of the fusion tree: 13
• Time cost consists of:
  • Sampling evidences
  • Assigning local basic confidence
  • Establishing the fusion tree
  • Receiving & broadcasting beacons
• Observations:
  • Time cost is stable for all the tree structures
  • Traffic contention takes longer: its DEVI packet contains 3 possible root causes (1. ingress overflow, 2. egress overflow, 3. bad link), so more combination work is needed
Diagnosis accuracy
• TinyD2 performs unstably: it gets worse as the number of neighbors increases, because it fails to achieve a consensus
• LD2's error rates decrease as the number of neighbors increases: more evidence yields a more determinate diagnosis
• Several possible root causes make it difficult for TinyD2's FSM to reach an accept state
Coupling effect with the application
• Measured by application packet loss
Conclusion
• Conducts diagnosis in a local area
  • Reduces the communication overhead
  • Distributes the diagnosis workload to the sensor nodes within a diagnosis area
• Uses a fusion tree to do evidence fusion
  • A local consensus on the final diagnosis report is achieved
• Limitation: the failure types need to be predefined!!