Sherlock is around: Detecting network failures with local evidence fusion
Study group 2012.04.09 Junction
Qiang Ma 1, Kebin Liu 2, Xin Miao 1, Yunhao Liu 1,2
1 Department of Computer Science and Engineering, Hong Kong University of Science and Technology
2 MOE Key Lab for Information System Security, School of Software, Tsinghua National Lab for Information Science and Technology, Tsinghua University
Introduction
• Motivations:
  • Widely deployed WSNs for numerous applications
  • Need to sustain for years and operate reliably
  • Error-prone and subject to component faults and performance degradation
• Exploring the root causes is more challenging in WSNs:
  • Ad-hoc nature of WSNs: large scale, dynamic topology changes
  • Limited resources of sensor nodes: power, computation capability
  • A large variety of WSN-specific protocols
Related works
• Traditional/popular diagnosis approach: sink-based
  • Actively collects global evidence from the sensor nodes to the sink
    • Remaining energy, MAC-layer backoff, neighbor table, routing table …
  • Conducts centralized analysis at the powerful back-end
  • Disadvantage
    • Large communication overhead of the evidence collection process is hard to avoid
• Self-diagnosis
  • Injects a fault inference model into the sensor nodes
  • Makes local decisions
  • Disadvantages
    • Results from single nodes are inaccurate due to the narrow scope
    • Inconsistent results from different inference processes
Local diagnosis (LD2)
• Main design
  • Diagnosis efficiency
    • Local diagnosis process instead of the back-end
    • Reduces communication overhead
  • Diagnosis accuracy
    • Takes the judgments from all nodes within the local area into consideration
System architecture
• Naïve Bayesian Classifier (NBC): encodes the probabilistic correlation between a set of state attributes and the root causes
• Workflow:
  • Each node runs the NBC (state attributes = evidences)
  • It maintains a posterior probability distribution P(root causes | evidences)
  • Once a node detects an anomaly (e.g., a neighbor has been removed from its neighbor list), the diagnosis process is triggered
  • It constructs a fusion tree and performs evidence fusion using the Dempster-Shafer theory of evidence (DST)
• Advantages:
  • Balances the workload
  • Ensures a local consensus on the final diagnosis result
Naïve Bayesian classifier (NBC)
• Parameters learned from historical data
  • R: root cause; Fi, i = 1, …, n: evidences (each Fi takes s discrete values)
• Calculate the posterior probability of each root cause:
  • P(R | F1, …, Fn) = P(R) ∏i P(Fi | R) / P(F1, …, Fn)
  • P(R) and P(Fi | R) are pre-learned; the denominator is a scale factor, constant across different R
• Each node, based on the observed Fi, calculates this posterior for every root cause
• After a mapping (normalization), the posteriors are used as the basic probability assignments in DST (see the sketch below)
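A minimal sketch, in Python rather than the motes' embedded code, of how a node could turn its locally observed metrics into the NBC posterior; the dictionary-based parameter tables, the smoothing constant, and the function name are illustrative assumptions, not LD2's actual data structures.

```python
def nbc_posterior(prior, likelihood, observed):
    """Naive Bayes posterior over root causes.

    prior:      {root_cause: P(R)}                        -- pre-learned
    likelihood: {root_cause: {metric: {value: P(F=v|R)}}} -- pre-learned
    observed:   {metric: value}                           -- evidences sampled locally
    Returns the normalized posterior P(R | F1..Fn).
    """
    scores = {}
    for r, p_r in prior.items():
        p = p_r
        for metric, value in observed.items():
            # multiply in P(Fi | R); small floor for values unseen in training
            p *= likelihood[r].get(metric, {}).get(value, 1e-6)
        scores[r] = p
    z = sum(scores.values())  # plays the role of the scale factor P(F1..Fn)
    return {r: s / z for r, s in scores.items()}
```

The normalized posterior is what the node then maps into its basic probability assignments for DST.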
Dempster-Shafer theory (DST)
• Fundamentals
  • Allows us to combine evidence from different sources and arrive at a degree of belief in all possible states/hypotheses (R, root causes) that takes into account all the available evidence (F, metrics)
• Terms:
  • Hypotheses: the possible root causes
  • The frame of discernment Ω: the set of all hypotheses; masses are assigned over its power set 2^Ω
  • Basic probability/belief assignment m: 2^Ω → [0, 1] (subjective or objective)
    • Any A ⊆ Ω with m(A) > 0 is a focal element
    • Constraints: m(∅) = 0 and Σ_{A⊆Ω} m(A) = 1
  • In LD2 the NBC posterior probabilities serve as objective assignments
Dempster-Shafer theory (DST)
• Different from the concept of probability
  • Belief: Bel(A) = Σ_{B⊆A} m(B)
  • Plausibility: Pl(A) = Σ_{B∩A≠∅} m(B), i.e., Pl(A) = 1 − Bel(¬A)
  • Belief ≤ plausibility
• In this study
  • The frame of discernment Ω = {R0, R1, …, Rn}, where the Ri are the root causes and R0 means "no problem"
  • Each node only generates basic probability assignments on these singleton hypotheses
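A small sketch of belief and plausibility for a basic probability assignment; representing a mass function as a dict keyed by frozensets of hypotheses is an assumption made for illustration throughout these sketches.

```python
def belief(m, a):
    """Bel(A): total mass of the focal elements contained in A."""
    a = frozenset(a)
    return sum(v for s, v in m.items() if s <= a)

def plausibility(m, a):
    """Pl(A): total mass of the focal elements intersecting A (= 1 - Bel(not A))."""
    a = frozenset(a)
    return sum(v for s, v in m.items() if s & a)

# Example on a frame {R0, R1, R2}; Bel({R1}) <= Pl({R1}) always holds.
m = {frozenset({"R1"}): 0.5, frozenset({"R2"}): 0.2, frozenset({"R1", "R2"}): 0.3}
print(belief(m, {"R1"}), plausibility(m, {"R1"}))  # 0.5 0.8
```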
Dempster’s rule of combination • Combine the belief from different observers (sensor nodes) • To do evidence fusion conflict factor • joint mass • Problem: • The combination result goes against the practical sense!! • When with low or high conflict factor
Low/high conflict factor
• Example (Zadeh's classic counterexample):
  • Hypotheses Ω = {T, M, C}
    • T: brain tumor
    • M: meningitis
    • C: concussion
  • The frame of discernment is 2^Ω
  • Both observers consider T very unlikely, yet the combination yields m(T) = 1!!
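The slide's original numbers were not preserved, but its m(T) = 1 conclusion matches Zadeh's classic counterexample, so the values below are assumed for illustration; dempster_combine is reused from the sketch above.

```python
# Assumed (Zadeh-style) values: both observers consider T very unlikely
# and strongly disagree about the remaining hypotheses.
T, M, C = frozenset({"T"}), frozenset({"M"}), frozenset({"C"})
m1 = {M: 0.99, T: 0.01}   # observer 1: almost certainly meningitis
m2 = {C: 0.99, T: 0.01}   # observer 2: almost certainly concussion

joint, k = dempster_combine(m1, m2)
print(k)      # 0.9999 -- nearly total conflict
print(joint)  # {frozenset({'T'}): 1.0} -- m(T) = 1, against practical sense
```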
Modified combination rule
• Trust the results that have high consensus among the nodes
• Definition 1: the distance between m1 and m2 is
  • d(m1, m2) = sqrt( ½ (‖m1‖² + ‖m2‖² − 2⟨m1, m2⟩) )
  • where ⟨m1, m2⟩ = Σ_A Σ_B m1(A) m2(B) |A∩B| / |A∪B|
  • and ‖m‖² = ⟨m, m⟩
• Proof: d is a valid distance on mass functions (details omitted)
Modified combination rule
• Definition 2: the similarity degree of m1 and m2 is Sim(m1, m2) = 1 − d(m1, m2)
  • If one node i's mi is similar to all the others, then we believe this node's mi is important
• Definition 3: the basic confidence of evidence i (i = 1, 2, …, N) is its support Sup(mi) = Σ_{j≠i} Sim(mi, mj)
  • Normalization: Crdi = Sup(mi) / Σ_{j=1..N} Sup(mj)
• Modified assignment = basic probability assignment × basic confidence
  • Reduces the impact of the evidences with less importance (see the sketch below)
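A sketch of Definitions 1-3, assuming the distance in Definition 1 is the Jousselme-style distance reconstructed above; the final step that sums the confidence-discounted masses into a weighted-average assignment is one common way to apply such confidences and is an assumption, not necessarily LD2's exact fusion step.

```python
import math

def _jaccard(a, b):
    """|A ∩ B| / |A ∪ B| for two focal elements (frozensets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def _inner(m1, m2):
    """<m1, m2> = sum over A, B of m1(A) m2(B) |A∩B|/|A∪B|."""
    return sum(m1[a] * m2[b] * _jaccard(a, b) for a in m1 for b in m2)

def mass_distance(m1, m2):
    """Definition 1: d(m1, m2) = sqrt(0.5 (||m1||^2 + ||m2||^2 - 2<m1, m2>))."""
    return math.sqrt(0.5 * (_inner(m1, m1) + _inner(m2, m2) - 2 * _inner(m1, m2)))

def basic_confidence(masses):
    """Definitions 2-3: Sim = 1 - d; a node's support is the sum of its
    similarity to every other node; confidences are the normalized supports."""
    n = len(masses)
    support = [sum(1 - mass_distance(masses[i], masses[j])
                   for j in range(n) if j != i) for i in range(n)]
    total = sum(support)
    return [s / total for s in support]

def weighted_masses(masses):
    """Modified assignment = basic probability assignment x basic confidence.
    Summing the discounted masses yields a confidence-weighted average mass,
    which is again a valid BPA because the confidences sum to 1."""
    confs = basic_confidence(masses)
    avg = {}
    for m, c in zip(masses, confs):
        for a, v in m.items():
            avg[a] = avg.get(a, 0.0) + c * v
    return avg
```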
Evidence fusion
• Criterion:
  • The fusion result stays the same even if we change the fusion order
• Theorem 1: the modified combination rule satisfies this order-independence criterion
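A quick sanity check reusing dempster_combine from the earlier sketch: Dempster's rule itself is commutative and associative, so the fused result does not depend on the combination order; reading Theorem 1 as the corresponding guarantee for the modified, confidence-weighted rule is an assumption based on the criterion stated above.

```python
R0, R1, R2 = frozenset({"R0"}), frozenset({"R1"}), frozenset({"R2"})
mA = {R0: 0.6, R1: 0.3, R2: 0.1}
mB = {R0: 0.2, R1: 0.7, R2: 0.1}
mC = {R0: 0.5, R1: 0.4, R2: 0.1}

# Fuse in two different orders and compare the results.
left, _ = dempster_combine(dempster_combine(mA, mB)[0], mC)
right, _ = dempster_combine(mA, dempster_combine(mB, mC)[0])
assert all(abs(left[a] - right[a]) < 1e-12 for a in left)
```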
Fusion algorithm
• Trigger node
  • Detects abnormal symptoms
    • Node crash
    • Traffic contention
    • Route loop
  • Determines the diagnosis area
    • ???
• Standard set
  • Reduces computation overhead
  • Built from the root node and its one-hop neighbors
• DREQ contains
  • The details of the diagnosis task
  • The standard set (=> basic confidence)
• Establish the fusion tree
Evidence fusion algorithm
• Includes handling for the case where a DREQ packet is lost (see the fusion-tree sketch below)
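A minimal sketch of the fusion along the tree, under the assumption (not the paper's published pseudocode) that each node combines its own confidence-weighted mass with its children's fused masses and forwards the result toward the root; dempster_combine is reused from the earlier sketch.

```python
class FusionNode:
    """A node in the fusion tree; `mass` is its (confidence-weighted) BPA."""
    def __init__(self, node_id, mass, children=None):
        self.node_id = node_id
        self.mass = mass
        self.children = children or []

def fuse_up(node):
    """Post-order fusion: combine this node's mass with each child's fused mass."""
    fused = node.mass
    for child in node.children:
        fused, _k = dempster_combine(fused, fuse_up(child))
    return fused

def diagnose(root):
    """The root of the fusion tree reports the root cause with the largest fused mass."""
    fused = fuse_up(root)
    return max(fused.items(), key=lambda kv: kv[1])
```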
Evaluation
• CitySee project:
  • Urban carbon dioxide sensing
  • 494 sensor nodes
• Testbed using the CTP protocol
  • 50 TelosB motes
• Comparison
  • LD2 vs. TinyD2
  • Manually injected evidences
    • Node crash
    • Traffic contention
    • Route loop
• Metrics
  • False negative rate vs. false positive rate
TinyD2 [1]
• Fault detector (self-diagnosis)
  • Finite state machine (FSM) model
  • Fault detector M = (E, S, S0, f, F)
    • E: the set of input evidences
    • S: the set of states
    • S0: the start state
    • f: the state transition function
    • F: the set of accept states (an accept state is a final diagnosis decision)
• E.g., high retransmission rate between A and B (A -> B)
  • A finds the rate increasing
  • A broadcasts its current state together with the fault detector
  • If B receives it, B checks for ACK or DATA
  • B moves to S2 and broadcasts to its neighbors Ci
  • NUM: a threshold
  • Bc: severe contention at B
[1] Kebin Liu, Qiang Ma, Xibin Zhao, Yunhao Liu, "Self-diagnosis for large scale wireless sensor networks," INFOCOM 2011.
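For contrast with LD2, a small, hypothetical sketch of TinyD2's FSM-style detector M = (E, S, S0, f, F); the concrete states, evidence names, and transitions below are illustrative assumptions, not the detector tables from [1].

```python
class FaultDetector:
    """FSM fault detector M = (E, S, S0, f, F)."""
    def __init__(self, transitions, start, accept):
        self.transitions = transitions   # f: (state, evidence) -> next state
        self.state = start               # S0
        self.accept = accept             # F: accept state -> final diagnosis

    def feed(self, evidence):
        self.state = self.transitions.get((self.state, evidence), self.state)
        return self.accept.get(self.state)  # diagnosis once an accept state is reached

# Hypothetical detector for "high retransmission rate from A to B":
detector = FaultDetector(
    transitions={
        ("S0", "A_high_retx"): "S1",    # A notices the rate increasing
        ("S1", "B_got_data"): "S2",     # B confirms it received DATA/ACK
        ("S2", "retx_over_NUM"): "Bc",  # retransmissions exceed the threshold NUM
    },
    start="S0",
    accept={"Bc": "severe contention at B"},
)

result = None
for ev in ["A_high_retx", "B_got_data", "retx_over_NUM"]:
    result = detector.feed(ev)
print(result)  # "severe contention at B"
```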
Time cost
• Setup: problem node 25 (with 16 neighbors); root node of the fusion tree: 13
• Time cost consists of:
  • Sampling evidences
  • Assigning local basic confidence
  • Establishing the fusion tree
  • Receiving & broadcasting beacons
• Observations:
  • Time cost is stable for all the tree structures
  • Traffic contention takes longer: its DEVI packet contains 3 possible root causes (1. ingress overflow, 2. egress overflow, 3. bad link), so more combination work is needed
Diagnosis accuracy
• TinyD2 performs unstably: it gets worse as the number of neighbors increases, because it fails to achieve a consensus
• LD2's error rates decrease as the number of neighbors increases: more evidence yields a more determinate diagnosis
• Several possible root causes make it difficult for TinyD2's FSM to reach an accept state
Coupling effect with the application
• Measured by application packet loss
Conclusion
• Conducts diagnosis in a local area
  • Reduces the communication overhead
  • Distributes the diagnosis workload to the sensor nodes within a diagnosis area
• Uses a fusion tree to do evidence fusion
  • A local consensus on the final diagnosis report is achieved
• Limitation: the failure types need to be predefined!!