Clustering in Ad hoc and Sensor Networks

Clustering in Ad hoc and Sensor Networks

Why Clustering? • The data collected by each sensor is communicated through the network to a single processing center that uses the data • Clustering groups nodes into groups such that each node communicate information only to clusterheads and then the clusterheads communicate the aggregated information to the processing center, saving energy and bandwidth • The cost of transmitting a bit is higher than a computation; therefore, it may be beneficial to organize the sensors into clusters • Cluster-based control structures provides more efficient use of resources for large dynamic networks • Clustering can be used for • Transmission management • Backbone formation • Routing Efficiency

Link-Clustered Architecture[Baker+ 1981a, 1981b, Ephremides+ 1987] • Reduces interference in multiple-access broadcast environment • Distinct clusters are formed to schedule transmissions in a contention-free way • Each cluster has a clusterhead, one or more gateways and zero or more ordinary nodes • Clusterhead schedules transmission and allocates resources within its cluster • Gateways connect adjacent clusters • To establish link-clustered control structure • Discover neighbors • Select clusterhead to form clusters • Decide on gateways between clusters

Cluster Clusterhead Gateway Ordinary node Link-Clustered Architecture[Baker+ 1981a, 1981b, Ephremides+ 1987]

Clusterheads • Resemble base stations in cellular networks, but dynamic • Responsible for resource allocation • Maintains network topology • Acts as routers – forwards packets from one node to another • Aware of its cluster members • Aware of its one-hop neighboring clusterheads Since clusterheads decide network topology, election of clusterheads optimally is critical

Previous Work • Highest-Degree Heuristic [Gerla+ 1995, Parekh 1994] • Computes the degree of a node based on the distance (transmission range) between the node and the other nodes • The node with the maximum number of neighbors (maximum degree) is chosen to be a clusterhead and any tie is broken by the node ids Drawbacks: • A clusterhead cannot handle a large number of nodes due to resource limitations • Load handling capacity of the clusterhead puts an upper bound on the node-degree • The throughput of the system drops as the number of nodes in cluster increases

Previous Work • Lowest-ID Heuristic [Baker+ 1981a-b, Ephremides+ 1987] • The node with the minimum node-id is chosen to be a clusterhead • A node is called a gateway if it lies within the transmission range of two or more clusters • Distributed gateway is a pair of nodes that reside within different clusters, but they are within the transmission range of each other Drawbacks: • Since it is biased towards nodes with smaller node-ids, leading to battery drainage • It does not attempt balance the load for across all the nodes

Previous Work • Node-Weight Heuristic [Basagni 1999a, 1999b] • Node-weights are assigned to nodes based on the suitability of a node being a clusterhead • The node is chosen to be a clusterhead if its node-weight is higher than any of its neighbor’s node-weights and any tie is broken by the minimum node ids Drawbacks: • No concrete criteria of assigning the node-weights • Works well for “quasi-static” networks where the nodes do not move much or move very slowly

Optimizing Clustering Algorithm in Mobile Ad hoc Networks Using Genetic Algorithmic Approach [Turgut+ 2002] Weighted Clustering Algorithm (WCA) • Aclusterhead can ideally support nodes • Ensures efficient MAC functioning • Minimizes delay and maximizes throughput • A clusterhead uses more battery power • Does extra work due to packet forwarding • Communicates with more number of nodes • A clusterhead should be less mobile • Helps to maintain same configuration • Avoids frequent WCA invocation • A better power usage with physically closer nodes • More power for distant nodes due to signal attenuation

Weighted Clustering Algorithm (WCA) Steps 1. Compute the degreedv each node v Coordinate distance, predefined transmission range. • Compute the degree-differencefor every node For efficient MAC (medium access control) functioning. Upper bound on # of nodes a cluster head can handle.

3 2 12 13 4 1 14 17 15 7 16 5 6 Weighted Clustering Algorithm (WCA) Steps 3. Compute the sum of the distancesDv with all neighbors Energy consumption; more energy for greater dist. communication. Power required to support a link increases faster than linearly with distance.(For cellular networks)

Yt Yt-1 time Xt-1 Xt Weighted Clustering Algorithm (WCA) Steps 4. Compute the average speed of every node; gives a measure of mobilityMv where and are the coordinates of the node at time and Component with less mobility is a better choice for clusterhead.

Weighted Clustering Algorithm (WCA) Steps • Compute the total (cumulative) timePv a node acts as clusterhead Battery drainage = Power consumed 6. Calculate the combined weightWv for each node Wv = w1Δv + w2Dv + w3Mv + w4Pvfor each node 7. Find min Wv; choose node v as the cluster head, remove all neighbors of v for further WCA • Repeat steps 2 to 7 for the remaining nodes

Load Balancing Factor (LBF) • It is desirable to balance the loads among the clusters • Load balancing factor (LBF) has defined as (should be high) where, is the number of clusterheads is the cardinality of cluster i and is the average number of neighbors of a clusterhead (N being the total number of nodes in the system)

Connectivity • For clusters to communicate with each other, it is assumed that clusterheads are capable of operating in dual power mode • A clusterhead uses low power mode to communicate with its immediate neighbors within its transmission range and high power mode is used for communication with neighboring clusters • Connectivity is defined as (for multiple component graph) • Probability that a node is reachable from any other node ( 0 – 1; 1 being most desirable)

Demonstration Scattered nodes in the network

Demonstration Clusterheads are identified

Demonstration Clusters are formed

Demonstration Clusters are connected

Features of WCA • Invocation of WCA is on-demand • Reduces information exchange by less system updates • Reduces computation/communication costs • Manages mobility by reaffiliations • Delays (avoids) invocation of clustering as far as possible • WCA is distributive • No clusterhead is over loaded • Balances load by limiting the cluster size

Performance Metric • Number of clusterheads • Number of reaffiliations • a process where a node detaches from one clusterhead and attaches to another • Number of dominant set updates • when a node can no longer attach to any of the existing clusterheads These parameters are studied for the varying number of nodes transmission range maximum displacement

Simulation Environment • System with N nodes on a 100x100 grid • N was varied between 20 and 60 • Nodes moved in all directions randomly • Velocity of nodes were varied uniformly between 0 and 10 • Transmission range of nodes was varied between 0 and 70 • Ideal degree was fixed at = 10 • Weighing factors: w1 = 0.7, w2 = 0.2, w3 = 0.05 and w4 = 0.05

Max displacement = 5 (const) Transmission range = 0 - 70 Number of nodes = 20 - 60 Ideal degree = 10 Experimental Results

Max displacement = 1 - 10 Transmission range = 30 (const) Number of nodes = 20 - 60 Ideal degree = 10 Experimental Results

Load Balancing

Connectivity

Performance of WCA

Genetic Algorithms • Map the possible solutions of the problem to symbolic space • Possible solutions form a pool of solutions – population • Solution strings – chromosomes and components of chromosomes – genes • Genetic Algorithm operations: • Selection • Crossover • Mutation • Replacement • Elitism

1 99 88 44 5 3 15 8 3 7 - - - - - - - - - - - - 1 55 77 6 5 3 2 . . 50 Encoding of the Chromosome • N = # of nodes in the network each with unique node id [1..N] used to encode the chromosome by integer permutation all the ids should be included without any duplication, and without order. For instance: N = 100 , node ids [1..100] Pool size = 50 (50 strings of integers/chromosomes)

WV Values 1 56 12 25 8 7 3 - - - - - - - - - 55 25 34 44 35 64 45 34 22 3 6 1 7 5 3 Node Ids of all nodes 2 … 1 22 5 7 . . . … 3 10 2 . . 3 7 25 12 . . 50 … 5 4 99 100 … 8 63 8 Neighbors list Mapping WCA to GA Data Encoded into chromosomes WCA intermediate results

WV Values Node Id of a ClusterHead … 1 22 5 7 … 7 3 10 . . . . . . . . 25 12 76 2 … 15 55 Mapping WCA to GA ClusterHead Set for a single chromosome

GA Steps 1.Choose Initial Population Randomly generate the initial population. Pool size = 50 (means 50 chromosomes) While (new_pool_size < old_pool_size) repeat step 3 to 6 (repeat step 2 until the number of generation or the convergence is met) 2. Selection Compute the fitness value for each chromosome by WV . Roulette Wheel method is used based on the fitness values. 3. Crossover X_Order1 method is used. Crossover rate = 0.8

GA Steps 4.Mutation Swap method is used; randomly selecting two gene at positions i and j. Mutation rate = 0.1 5. Replacement Append method is used. The new children will be appended into the new pool. 6. Elitism - Check if the new children are better than the best, then replace the best by the child - Avoid being stuck on local optima

Cfit Value Algorithm FitnessValue = 0; 1. For each gene in chromosome repeat step 2 to 3 2. node = gene[I]; 3. if node is not clusterH and is not a member of the other clusterH and Nodedegree <= MAX_DEGREE ( const ) Then it is a clusterH, Compute WV for this node insert it into clusterHSet fitnessValue += WV;

Cfit Value Algorithm 4. For each remaining node I from the network If (it is not a clusterH and member of other clusterH, and NodeDegree <= MAX_DEGREE) then Compute WV for this node insert it into clusterHSet fitnessValue += WV;

Performance Metric • Number of clusterheads • Number of reaffiliations • a process where a node detaches from one clusterhead and attaches to another These parameters are studied for the varying number of nodes transmission range maximum displacement • Load Distribution

Simulation Environment • System with N nodes on a 100x100 grid • N was varied between 20 and 60 • Nodes moved in all directions randomly • Velocity of nodes were varied uniformly between 0 and 10 • Transmission range of nodes was varied between 0 and 70 • Ideal degree was fixed at = 10 • Weighing factors: w1 = 0.7, w2 = 0.2, w3 = 0.05 and w4 = 0.05

Max displacement = 5 (const) Transmission range = 0 - 70 Number of nodes = 20 - 60 Ideal degree = 10 Experimental Results WCA Optimized WCA

Max displacement = 1 - 10 Transmission range = 30 (const) Number of nodes = 20 - 60 Ideal degree = 10 Experimental Results Optimized WCA WCA

Load Balancing with WCA

Load Balancing with GA The load balancing factor has improvement ten times with GA

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • This paper proposes a distributed, randomized clustering algorithm to organize the sensors in a wireless sensor network into clusters to minimize the energy used to communicate information from all nodes to the processing center • By the generation of hierarchy of clusterheads, the energy savings increase with the number of levels in the hierarchy • Sensor detects events and then communicate the collected information to a central location where parameters characterizing these events are estimated • In the clustered environment, the data gathered by the sensors is communicated to the data processing center through a hierarchy of clusterheads • The processing center determines the final estimates of the parameters using information communicated by the clusterheads

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • The processing center can be a specialized device or one of the sensors itself • In such clustered environment, sensor data is communicated over smaller distances, the energy consumed in the network will be much lower than the energy consumption when every sensor communicates directly to the information processing center • The results in stochastic geometry are used to derive values of parameters for the algorithm that minimize the energy spent in the sensor network

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • A New, Energy-Efficient, Single-Level Clustering Algorithm • Each sensor becomes a clusterhead (CH) with probability p and advertises itself as a clusterhead to the sensors within its radio range – these clusterheads are called volunteer clusterheads • This advertisement is forwarded to all the sensors that are no more than k hops away from the clusterhead • Any sensor node that is not clusterhead itself receiving such advertisement joins the cluster of the closest clusterhead • Any sensor node that is neither a clusterhead nor has joined any cluster itself becomes a clusterhead – called forced clusterheads • Since the advertisement forwarding has been limited to k hops, if a sensor does not receive a CH advertisement within time duration t (where t is the time required for data to reach the CH from any sensor k hops away), it means that the sensor node is not within k hops of any volunteer CHs

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • A New, Energy-Efficient, Single-Level Clustering Algorithm • Therefore, the sensor node becomes a forced clusterhead • The CH can transmit the aggregated information to the processing center after every t units of time since all the sensors within a cluster are at most k hops away from the CH • The limit on the number of hops allows the CH to reschedule their transmissions • This is a distributed algorithm and does not demand clock synchronization between the sensors • The energy consumed for the information gathered by the sensors to reach the processing center will depend on the parameters p and k • Since the objective of this work is to organize sensors in clusters to minimize the energy consumption, values of the parameters (p and k) must be found to ensure the goal

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] d / r • A New, Energy-Efficient, Single-Level Clustering Algorithm • Assumptions made for the optimal parameters are as follows: • The sensors are distributed as per a homogeneous spatial Poisson process of intensity λ in 2-dimensional space • All sensors transmit at the same power level – have the same radio range r • Data exchanged between two communicating sensors not within each others’ radio range is forwarded by other sensors • A distance of d between any sensor and its CH is equivalent to hops • Each sensor uses 1 unit of energy to transmit or receive 1 unit of data • A routing infrastructure is in place; when a sensor communicates data to another sensor, only the sensors on the routing path forward the data • The communication environment is contention- and error-free; sensors do not have to retransmit any data

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • A New, Energy-Efficient, Hierarchical Clustering Algorithm • This algorithm is extension of the previous one by allowing more than one level of clustering in place • Assume that there are h levels in the clustering hierarchy with level 1 being the lowest level and level h being the highest • The sensors communicate the gathered data to level-1 clusterheads (CHs) • The level-1 CHs aggregate this data and communicate the aggregated data to level-2 CHs and so on • Finally, level-h CHs communicate the aggregated data or estimates based on this aggregated data to the processing center

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • A New, Energy-Efficient, Hierarchical Clustering Algorithm • The cost of communicating the information from the sensors to the processing center is the energy consumed by the sensors to communicate the information to level-1 CHs, plus the energy consumed by the level-1 CHs to communicate the aggregated data to level-2 CHs, …., plus the energy consumed by the level-h CHs to communicate the aggregated data to the information processing center • Algorithm Details • The algorithm works in a bottom-up fashion • First, it elects the level-1 clusterheads, then level-2 clusterheads, and so on

An Energy Efficient Hierarchical Clustering Algorithm for Wireless Sensor Networks [Bandyopadhyay+, 2003] • A New, Energy-Efficient, Hierarchical Clustering Algorithm • Algorithm Details • Level-1 clusterheads are chosen as follows: • Each sensor decides to become a level-1 CH with certain probability p1 and advertises itself as a clusterhead to the sensors within its radio range • This advertisement is forwarded to all the sensors within k1 hops of the advertising CH • Each sensor receiving an advertisement joins the cluster of the closest level-1 CH; the remaining sensors become forced level-1 CHs • Level-1 CHs then elect themselves as level-2 CHs with a certain probability p2 and broadcast their decision of becoming a level-2 CH • This decision is forwarded to all the sensors within k2 hops

Clustering in Ad hoc and Sensor Networks