Data Gathering and Aggregation in Wireless Sensor Networks

ELG7178F “Ad Hoc Networking”, Albert Wahba, March 11, 2010

Introduction & Problem Statement
[Figure: sensors, sink(s), and end user(s), with arrows labeled “Data Queries” and “Sensors Reply”.]
How can data be effectively gathered and aggregated from sensors to end users?

Outline

Data Storage Location
• External
• Local
• Data-Centric
* [1] Wei-Peng Chen and Jennifer C. Hou, 2005

Distributed Index for Multidimensional Data (DIM)
DIM builds an in-network distributed data structure to effectively answer multidimensional range queries.
Assumptions:
• All nodes are aware of the network's geographic boundaries.
• Each sensor node is aware of its geographic location.
• Data values are normalized to lie between 0 and 1.
* [3] Xin Li, Young Jin Kim, Ramesh Govindan, and Wei Hong, 2003

DIM Zone Assignment
Bit 1 is 0 if A1 < 0.5 (else 1) and bit 2 is 0 if A2 < 0.5 (else 1); bits 3 and 4 split the chosen halves of A1 and A2 again.
                  A1 < 0.25   0.25 ≤ A1 < 0.5   0.5 ≤ A1 < 0.75   0.75 ≤ A1 < 1
0.75 ≤ A2 < 1     0101        0111              1101              1111
0.5 ≤ A2 < 0.75   0100        0110              1100              1110
0.25 ≤ A2 < 0.5   0001        0011              1001              1011
A2 < 0.25         0000        0010              1000              1010
* [3] Xin Li, Young Jin Kim, Ramesh Govindan, and Wei Hong, 2003
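
The zone assignment above amounts to halving each attribute's range in turn, one bit per step. The following is a minimal Python sketch of that idea, assuming an event is a tuple of attribute values normalized to [0, 1]. The function name `dim_zone_code` and its parameters are illustrative and are not taken from [3].

```python
def dim_zone_code(event, k):
    """Return a k-bit zone code for an event (hypothetical helper).

    Bits cycle through the attributes; each bit halves the remaining
    range of its attribute: 0 for the lower half, 1 for the upper half.
    """
    m = len(event)
    lo = [0.0] * m   # current lower bound per attribute
    hi = [1.0] * m   # current upper bound per attribute
    bits = []
    for i in range(k):
        a = i % m                      # attribute handled by this bit
        mid = (lo[a] + hi[a]) / 2
        if event[a] < mid:
            bits.append("0")
            hi[a] = mid
        else:
            bits.append("1")
            lo[a] = mid
    return "".join(bits)


print(dim_zone_code((0.1, 0.9), 4))  # top-left cell of the 4x4 grid
print(dim_zone_code((0.6, 0.3), 4))
```

Interval halving is used instead of fixed thresholds so that the same loop works for any code length k and any number of attributes.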

Routing an Event to its Owner
Example: event E1 = (0.8, 0.7) is forwarded from node to node until it reaches the node that owns its zone, which stores E1.
[Figure: deployment field with nodes 1 to 10 and zone codes 0000, 0001, 001, 010, 0110, 0111, 10, 110, 1110, and 1111, shown next to DIM’s zone tree.]
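
The ownership test at the end of routing can be sketched in a few lines of Python. This sketch assumes the rule that a node owns an event when the node's zone code is a prefix of the event's code, and it uses the ten zone codes visible in the example figure. It does not model the hop-by-hop geographic forwarding itself; `owner_zone` is a hypothetical helper name, and the code 1110 for E1 assumes the interleaved halving scheme with four bits.

```python
# Zone codes that appear in the example figure (one per node).
ZONES = ["0000", "0001", "001", "010", "0110",
         "0111", "10", "110", "1110", "1111"]


def owner_zone(event_code, zones):
    """Return the zone whose code is a prefix of event_code, or None.

    The event code must be at least as long as the longest zone code.
    Zone codes from a zone tree are prefix-free, so at most one matches.
    """
    for zone in zones:
        if event_code.startswith(zone):
            return zone
    return None


print(owner_zone("1110", ZONES))  # E1 = (0.8, 0.7) hashes to 1110
print(owner_zone("1011", ZONES))  # falls inside the larger zone 10
```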

Enhancing DIM Performance Using k-d Tree
• Divide the deployment field into cells.
• Cells are used as the storage unit.
• Each index node covers one or more cells.
• All cells belonging to the same index node store the same data.
• Dynamically control the depth of DIM’s zone tree.
• Solves the scalability problem of DIM.
• Improves energy efficiency.
* [4] Lei Xie, Lijun Chen, Daoxu Chen, Li Xie, 2009

Flat Network Architecture
• Two-Phase Pull Diffusion:
  • Sinks search by flooding, sources reply by flooding, then sinks choose the best route.
  • Suits many sources and only a few sinks.
• One-Phase Pull Diffusion:
  • Each reply is sent to the neighbor that first sent the query.
  • Suits a large number of events being queried.
• Push Diffusion:
  • Sources flood the collected data; sinks subscribe to events of interest.
  • Suits many sinks and only a few sources, e.g. target tracking.

Directed Diffusion (Two-Phase Pull)
• Consists of three phases:
  • Interest Propagation
  • Data Propagation
  • Reinforcement
* [5] C. Intanagonwiwat, R. Govindan, and D. Estrin, 2000
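
The three phases can be illustrated with a toy Python model. It is a heavily simplified sketch, not the protocol in [5]: it assumes a connected, static graph, keeps gradients only toward neighbors that are closer to the sink, and approximates reinforcement by following the first gradient at every hop. All function names and the sample topology are hypothetical.

```python
from collections import deque


def propagate_interest(graph, sink):
    """Phase 1: flood the interest from the sink; return hop counts to the sink."""
    hops = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in hops:
                hops[nb] = hops[node] + 1
                queue.append(nb)
    return hops


def build_gradients(graph, hops):
    """Phase 2 setup: each node points at every neighbor closer to the sink."""
    return {n: [nb for nb in graph[n] if hops[nb] < hops[n]] for n in graph}


def reinforced_path(gradients, source, sink):
    """Phase 3 (simplified): keep one gradient per hop from source to sink."""
    path = [source]
    while path[-1] != sink:
        path.append(gradients[path[-1]][0])
    return path


graph = {
    "sink": ["a", "b"],
    "a": ["sink", "c"],
    "b": ["sink", "c"],
    "c": ["a", "b", "src"],
    "src": ["c"],
}
hops = propagate_interest(graph, "sink")
grads = build_gradients(graph, hops)
print(reinforced_path(grads, "src", "sink"))
```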

Sensor Protocols for Information via Negotiation (SPIN, Push Diffusion)
• Data sources initiate the data-sending activities.
• Uses a three-stage handshake:
  • Advertisement (metadata).
  • Request for data.
  • Data message.
* [7] Joanna Kulik, Wendi Heinzelman, and Hari Balakrishnan, 2002
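
The three-stage handshake can be sketched as a small Python exchange between two nodes. This is an illustrative model only: the class `SpinNode`, the function `spin_exchange`, and the message tuples are assumptions made for the example and are not part of [7].

```python
class SpinNode:
    """A node that stores payloads keyed by their metadata (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # metadata -> payload

    def wants(self, meta):
        """Request advertised data only if it is not already stored."""
        return meta not in self.data


def spin_exchange(sender, receiver, meta):
    """Run one advertisement/request/data handshake; return the messages sent."""
    messages = [("ADV", sender.name, receiver.name, meta)]
    if receiver.wants(meta):
        messages.append(("REQ", receiver.name, sender.name, meta))
        receiver.data[meta] = sender.data[meta]
        messages.append(("DATA", sender.name, receiver.name, meta))
    return messages


a, b = SpinNode("A"), SpinNode("B")
a.data["temp@zone3"] = 21.5
first = spin_exchange(a, b, "temp@zone3")   # full ADV, REQ, DATA handshake
second = spin_exchange(a, b, "temp@zone3")  # B already has the data: ADV only
print([m[0] for m in first], [m[0] for m in second])
```

The second exchange shows why negotiation helps: the payload is never sent to a node that already holds it.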

A Novel Real-Time Routing Protocol
• Assumptions:
  • The network is data-centric.
  • Each sensor knows its energy level.
  • Sensors have IDs.
• Real-time route tree.
• Alternate suboptimal (slower) routes.
• Route monitoring and reporting algorithm.
• No single route is used all the time.
* [6] Li-Ming He, Xi’an, 2009

Minimum-Latency Aggregation Protocols
Assumptions:
• Interference radius p = 1.
• The communication topology is rooted at the sink.
• Communication is synchronous and time-slotted.
• A node transmits at most one fixed-size packet in each time slot.
• Child nodes must transmit before their parents can transmit.
[Figure: sink s with interference radius p = 1.]
* [18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, Xiaohua Jia, 2009

Minimum-Latency Aggregation Protocols Development History
Minimum latency for p = 1:
• (Δ − 1)R (2005)
• 23R + Δ − 18 (2007)
• 15R + Δ − 4 (2009, SAS)
• 2R + O(log R) + Δ (2009, PAS)
Notation:
• p: interference radius
• Δ: maximum degree of the communication topology
• R: radius of the communication topology (maximum hop distance)
[Figure: unit-disk graph (UDG) with sink s and radius R.]
* [18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, Xiaohua Jia, 2009
* [19] www.wikipedia.org (graphs only)
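
To get a feel for how these bounds compare, the closed-form ones can be evaluated for sample values of R and Δ. The Python sketch below uses hypothetical function names, and the values R = 10 and Δ = 50 are chosen only for illustration. The PAS bound contains an O(log R) term with an unspecified constant, so only its leading terms 2R + Δ are computed.

```python
def bound_2005(R, delta):
    """(Δ - 1)R"""
    return (delta - 1) * R


def bound_2007(R, delta):
    """23R + Δ - 18"""
    return 23 * R + delta - 18


def bound_sas(R, delta):
    """15R + Δ - 4"""
    return 15 * R + delta - 4


def pas_leading_terms(R, delta):
    """2R + Δ, deliberately omitting the O(log R) term."""
    return 2 * R + delta


R, delta = 10, 50  # illustrative values, not from the slides
for name, f in [("2005", bound_2005), ("2007", bound_2007),
                ("SAS", bound_sas), ("PAS (leading terms)", pas_leading_terms)]:
    print(name, f(R, delta))
```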

Connected Dominating Sets (CDS) Construction
• Phase One: construct a dominating set (DS) U.
  • U is a maximal independent set (MIS).
• Phase Two: select a set of connectors W.
  • There is an edge between two dominators iff they have a common neighbor.
• U ∪ W is a CDS.
* [18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, Xiaohua Jia, 2009
* [19] www.wikipedia.org (graphs only)
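
The two phases can be sketched in Python as follows. Phase one picks a first-fit MIS in breadth-first order from the sink. For phase two this sketch swaps in a common, simpler connector rule (the BFS parent of every non-root dominator), which may differ from the connector selection used in [18]. The function name `build_cds` and the sample graph are hypothetical, and the graph is assumed to be connected.

```python
from collections import deque


def build_cds(graph, sink):
    """Return (dominators U, connectors W) for a connected graph."""
    # Breadth-first search from the sink: visiting order and BFS parents.
    order, parent = [sink], {sink: None}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)
                queue.append(v)

    # Phase one: first-fit maximal independent set in BFS order.
    dominators = set()
    for u in order:
        if not any(v in dominators for v in graph[u]):
            dominators.add(u)

    # Phase two (simplified rule): the BFS parent of each non-root
    # dominator becomes a connector.
    connectors = {parent[u] for u in dominators if parent[u] is not None}
    return dominators, connectors


graph = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "d"],
    "c": ["a", "e"],
    "d": ["b", "e"],
    "e": ["c", "d"],
}
U, W = build_cds(graph, "s")
print(sorted(U), sorted(W))
```

Because the MIS is built in BFS order, every connector chosen this way is adjacent to an earlier dominator, which keeps U ∪ W connected.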

Iterative Minimal Covering (IMC)
• Input: Y = { y1, y2, y3, y4, y5 }, X = { x1, x2, x3, x4, x5, x6, x7 }; initially A = { }.
• Result: A = { (x1, y2), (x2, y2), (x3, y2), (x4, y3), (x5, y5), (x6, y5), (x7, y5) }.
• Link labels:
  • ℓ(x1, y2) = 1, ℓ(x4, y3) = 1, ℓ(x6, y5) = 1
  • ℓ(x2, y2) = 2, ℓ(x5, y5) = 2
  • ℓ(x3, y2) = 3, ℓ(x7, y5) = 3
[Figure: bipartite graph between y1 to y5 and x1 to x7 with the selected links and their labels.]
* [18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, Xiaohua Jia, 2009

Canonical Breadth-First-Search (CBFS)
Parent rank assignment:
• If v has no child → Rank(v) = 0
• If v has only 1 child → Rank(v) = r
• If v has more than 1 child → Rank(v) = r + 1
where r is the maximum rank among the children of v.
[Figure: tree with levels v0 to v6 (R’); each node is annotated with its rank and each link with its label; the root v0 has rank 3 and all nodes at level v6 have rank 0.]
* [18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, Xiaohua Jia, 2009
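
The rank rules translate directly into a bottom-up traversal. The Python sketch below applies the three rules exactly as stated on this slide; the function name `assign_ranks`, the child-list representation, and the sample tree are assumptions made for the example.

```python
def assign_ranks(children, root):
    """Return {node: rank} using the slide's three rules (hypothetical helper).

    `children` maps a node to the list of its children; leaves may be absent.
    """
    ranks = {}

    def visit(v):
        kids = children.get(v, [])
        if not kids:
            ranks[v] = 0                      # no child
        else:
            r = max(visit(c) for c in kids)   # maximum rank among children
            ranks[v] = r if len(kids) == 1 else r + 1
        return ranks[v]

    visit(root)
    return ranks


# Illustrative tree: v0 has children a and b; a has two leaves, b has one.
tree = {"v0": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
print(assign_ranks(tree, "v0"))
```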

Pipelined Aggregation Scheduling (PAS)
Link Time Slot = (R’ – i) + 44j + 4(ℓ – 1)
where:
• i = radius, 0 ≤ i ≤ R’
• j = node rank, 0 ≤ j ≤ r
• ℓ = link label
Worked examples with R’ = 6:
• i = 4, j = 2, ℓ = 1: (6 – 4) + 44(2) + 4(1 – 1) = 90
• i = 6, j = 0, ℓ = 1: (6 – 6) + 44(0) + 4(1 – 1) = 0
[Figure: tree with levels v0 to v6 (R’); each node is annotated with its rank, link label, and resulting time slot.]
* [18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, Xiaohua Jia, 2009
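
The slot formula is simple enough to check with a short Python function. The name `pas_time_slot` and its parameter names are hypothetical; the constants 44 and 4 are taken from the slide as given.

```python
def pas_time_slot(r_prime, level, rank, label):
    """Link time slot = (R' - i) + 44*j + 4*(l - 1).

    r_prime: R' from the slide
    level:   i, the node's radius, 0 <= i <= R'
    rank:    j, the node's rank
    label:   l, the link label (starting at 1)
    """
    return (r_prime - level) + 44 * rank + 4 * (label - 1)


# The two worked examples from the slide (R' = 6):
print(pas_time_slot(6, 4, 2, 1))  # (6-4) + 44(2) + 4(1-1)
print(pas_time_slot(6, 6, 0, 1))  # (6-6) + 44(0) + 4(1-1)
```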

Conclusion
• Data gathering and aggregation in wireless sensor networks can be classified by:
  • Data storage: External, Local, and Data-Centric.
  • Network architecture:
    • Flat: Two-Phase Pull Diffusion, One-Phase Pull Diffusion, and Push Diffusion.
    • Hierarchical: Tree, Grid, Cluster, and Chain.
  • Resources: Maximum Lifetime, Data Reliability, and Minimum Latency.
• Several algorithms can deliver optimal performance for a given application.
• No single algorithm works for all applications.
• Data gathering and aggregation algorithms have advanced in recent years as a result of major improvements in electronic design.

Questions?

Q1: Mapping an event to a DIM zone.
• The Distributed Index for Multidimensional Data (DIM) algorithm builds an in-network distributed data structure to effectively store multi-attribute events and to effectively answer multidimensional range queries.
• The algorithm divides the area of interest into several zones, and then uses a hash function to map a multi-attribute event to a geographic zone.
• The hashing scheme assigns a k-bit zone code to an event as follows:
  • For i between 1 and m (m is the total number of attributes), if A_i < 0.5, the i-th bit of the zone code is 0, else 1.
  • For i between m + 1 and 2m, if A_(i−m) < 0.25 or A_(i−m) ∈ [0.5, 0.75), the i-th bit of the zone code is 0, else 1, because the next-level divisions are at 0.25 and 0.75, which split the ranges into [0, 0.25), [0.25, 0.5), [0.5, 0.75), and [0.75, 1).
  • This procedure is repeated until all k bits have been assigned.
• Using the DIM algorithm explained in the lecture, show in which DIM zone the following event will be stored:
  • Temperature = 0.9
  • Humidity = 0.4
  • The event was initiated from the node located at zone 000.
• What would the answer be if the event passed through a node located at zone 1001?

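Q1 Answer
Event: < 0.9, 0.4 >
• Bit 1: 0.9 > 0.5? Yes → 1
• Bit 2: 0.4 > 0.5? No → 0
• Bit 3: 0.9 > 0.75? Yes → 1
• Answer with 3-bit zone codes (event initiated at zone 000): 101
• Bit 4: 0.4 > 0.25? Yes → 1
• Answer with 4-bit zone codes (event passing through zone 1001): 1011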

Q2: Applying the IMC Algorithm
• The Iterative Minimal Covering (IMC) algorithm is used to construct a spanning inward s-arborescence tree, which is associated with a link labeling.
• The IMC algorithm takes as input a pair (X, Y) of disjoint subsets X and Y such that X is covered by Y, and outputs a single-hop (X, Y)-aggregation schedule.
• Using the IMC algorithm explained in the lecture (the algorithm outline is provided in [18]), provide the minimal covering set of Y with the associated link labels.
[Figure: bipartite graph between y1, y2, y3, y4, y5 and x1, x2, x3, x4, x5.]

Q2 Answer
• Input: Y = { y1, y2, y3, y4, y5 }, X = { x1, x2, x3, x4, x5 }; initially A = { }.
• Result: A = { (x1, y1), (x2, y1), (x3, y3), (x4, y3), (x5, y1) }.
• Link labels:
  • ℓ(x1, y1) = 1, ℓ(x3, y3) = 1
  • ℓ(x2, y1) = 2, ℓ(x4, y3) = 2
  • ℓ(x5, y1) = 3
[Figure: the bipartite graph from the question with the selected links and their labels.]

Q3: Applying the PAS Algorithm
a) In the Pipelined Aggregation Scheduling (PAS) protocol, each sensor node is assigned a specific time slot based on its node rank, communication radius, and link label. The link labels shown on the following graph were calculated using the IMC algorithm. Calculate the rank of each node using the following rules:
• If v has no child → Rank(v) = 0
• If v has only 1 child → Rank(v) = r
• If v has more than 1 child → Rank(v) = r + 1
where r is the maximum rank among the children of v.
b) Then use the following equation to assign a time slot to each sensor node:
Link Time Slot = (R’ – i) + 44j + 4(ℓ – 1)
where:
• i = radius, 0 ≤ i ≤ R’
• j = node rank, 0 ≤ j ≤ r
• ℓ = link label
• R’ = radius of the communication topology
c) Based on your answer to part (b), what are the advantages and disadvantages of the PAS algorithm?
[Figure: tree with levels v0 to v3 and an IMC link label on each link.]

Q3 Answer
a) Applying the rules given in the question yields the node ranks directly. The red number inside each node represents its rank.
b) Using the node ranks from part (a) and the equation given in the question with R’ = 3, all time slots can be calculated, as shown by the green numbers in the following graph.
c) Although the number of sensor nodes is very small, the total number of time slots needed to complete data aggregation is 50, which indicates that the PAS algorithm is not suitable for a network with a small communication radius. The advantage of the PAS algorithm is that the sink node starts receiving data after only 2 time slots, because pipelining increases the network throughput.
[Figure: the tree from the question (levels v0 to v3) annotated with node ranks in red and time slots in green; the largest time slot is 50.]

References
[1] Wei-Peng Chen and Jennifer C. Hou, “Data Gathering and Fusion in Sensor Networks” (book chapter), 2005.
[2] Ivan Stojmenovic, “Data Gathering and Aggregation in Wireless Sensor Networks” (presentation).
[3] Xin Li, Young Jin Kim, Ramesh Govindan, and Wei Hong, “Multi-Dimensional Range Queries in Sensor Networks,” 2003.
[4] Lei Xie, Lijun Chen, Daoxu Chen, and Li Xie, “A Decentralized Storage Scheme for Multi-Dimensional Range Queries Over Sensor Networks,” 2009.
[5] C. Intanagonwiwat, R. Govindan, and D. Estrin, “Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks,” 2000.
[6] Li-Ming He, Xi’an, “A Novel Real-Time Routing Protocol for Wireless Sensor Networks,” 2009.
[7] Joanna Kulik, Wendi Heinzelman, and Hari Balakrishnan, “Negotiation-Based Protocols for Disseminating Information in Wireless Sensor Networks,” 2002.
[8] HyungSeok Kim, Tarek F. Abdelzaher, and Wook Hyun Kwon, “Minimum-Energy Asynchronous Dissemination to Mobile Sinks in Wireless Sensor Networks,” 2003.
[9] Fan Ye, Haiyun Luo, Jerry Cheng, Songwu Lu, and Lixia Zhang, “A Two-Tier Data Dissemination Model for Large-Scale Wireless Sensor Networks,” 2002.
[10] Chiu-Kuo Liang and Chih-Shiuan Li, “Spiral Grid Routing for Load Balance in Wireless Sensor Networks,” 2009.
[11] Wendi Rabiner Heinzelman, Anantha Chandrakasan, and Hari Balakrishnan, “Energy-Efficient Communication Protocol for Wireless Microsensor Networks,” 2000.
[12] M. Tabibzadeh, M. Sarram, and M. Ghasemzadeh, “Adaptable Protocol for Time Critical Information Dissemination via Negotiation in Large Scale Wireless Sensor Networks,” 2009.
[13] S. Lindsey, C. Raghavendra, and K. M. Sivalingam, “Data Gathering Algorithms in Sensor Networks Using Energy Metrics,” 2002.
[14] Samuel Madden, Michael J. Franklin, Joseph Hellerstein, and Wei Hong, “TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks,” 2002.
[15] Yanwei Wu, Xiang-Yang Li, YunHao Liu, and Wei Lou, “Energy-Efficient Wake-Up Scheduling for Data Collection and Aggregation,” 2010.
[16] Yuuki Iima, Akimitsu Kanzaki, Takahiro Hara, and Shojiro Nishio, “An Evaluation of Overhearing-Based Data Transmission Reduction in Wireless Sensor Networks,” 2009.
[17] Tian He, Brian M. Blum, John A. Stankovic, and Tarek Abdelzaher, “AIDA: Adaptive Application-Independent Data Aggregation in Wireless Sensor Networks,” 2004.
[18] Peng-Jun Wan, Scott C.-H. Huang, Lixin Wang, Zhiyuan Wan, and Xiaohua Jia, “Minimum-Latency Aggregation Scheduling in Multihop Wireless Networks,” 2009.
[19] www.wikipedia.org (graphs only).
