310 likes | 516 Views
Power Control to Mitigate Internetwork Interference in Wireless Body Area Networks. Ramtin Kazemi Department of Electronic Engineering Macquarie University Sydney, Australia. Outline. WBANs Problem Motivation System Model Game Theory Approach Genetic Fuzzy Approach
E N D
Power Control to Mitigate Internetwork Interference in Wireless Body Area Networks Ramtin Kazemi Department of Electronic Engineering Macquarie University Sydney, Australia
Outline • WBANs • Problem Motivation • System Model • Game Theory Approach • Genetic Fuzzy Approach • Reinforcement Learning Approach • Simulation Results • Conclusion
WBAN – Wireless Body Area Network • Definition: • Short-range (<3m) • Ultra low power consumption • Data rate: 10 bps ~10Mbps • Star Topology (One BNC multiple BNs) • Regulatory Compliance for safe in-body communications (transmission power, duty cycle, frequency bands…) • Applications: • Patient real-time monitoring • Affordable healthcare systems • Entertainments and gaming • Sports • Military application • Ubiquitous and pervasive communications
Problem Motivation • Energy is the most valuable resource in WBANs • Inter-network interference: • Reduced SINR • Increased Packet Loss • More Power Consumption • Less Reliability for life critical applications • Solution: Power Control • Achieving a certain goal by using as little power for transmissions as possible • Common Goals: • Maximizing throughput • Minimizing energy consumption per bit
System Model • SINR: • The Power Control Optimization Problem:
Game Theory–based Approach Power Control Game (PCG): • The player imodels the link between the transmitter i and receiver iat a certain time slot. • Strategies are the power levels: • where • The payoff function for each player i is the difference between its utility function and a price function as: • Pricing Parameters: and • Nash Equilibrium existence and uniqueness • Static or Dynamic adjustment of the tradeoff between throughput and power consumption
Fuzzy Power Controller (FPC) • FPC Structure: • Gaussian membership functions: • Inputs: • PI : Interference power level • SINR: The Signal to Interference and Noise Ratio • Pt : The current transmission power level • Output: • The next transmission power level
Fuzzy Power Controller (FPC) - contd • Chromosome Structure: • Real-valued codes for Parameter Genes Integer-valued codes for Rule Genes • Fitness Function: • C:Normalized channel capacity • P:Normalized transmission power level • N: Normalized number of iterations FPC needs to converge to a steady state solution • Recombination Operators: • Rule Genes: N-point crossover and bit inversion mutation • Parameter Genes: Arithmetic crossover and stochastic mutation
RL-based Power Controller (RLPC) • State Vector: • PI(t) :Interference power level • SINR(t) : The Signal to Interference and Noise Ratio • PT(t) : The current transmission power level • Reward function: • C : Normalized channel capacity • P:Normalized transmission power level • RBF Approximators: • ϕi(.) : Radial basis functions: • θi : Weighting coefficient of basis function i • y(x) : Function to be approximated • N : The number of basis functions
Simulation Environment • 10m by 10m room • WBANs move around according to a Random Walk model • One implanted sensor node inside the body at a depth of 50mm • Gateway is located one meter away from the sensor on the body surface • Channel gains: • In-body path from the sensor node to the body surface (CM1) • On-body path from the body surface to the gateway in the same WBAN (CM3) • Off-body path from the body surface to a gateway in another WBAN (CM4).
Simulation Results Average node power consumption versus the number of nodes Average node throughput versus the number of nodes
Simulation Results Average node energy consumption per bit versus the number of nodes
Simulation Results Average node lifetime versus the number of nodes
Conclusion • We have developed a lightweight power control approach in WBANs based on reinforcement learning. • We compared the performance of the proposed approach, RLPC, with two other approaches, one based on the game theory, namely PCG, and the other one based on the genetic fuzzy systems, namely FPC. • Simulation results shows that RLPC provides a good tradeoff between power consumption and throughput, as it sacrifices only a small amount of channel capacity for a great saving in power consumption. • The energy consumption per bit in RLPC is much less than that of PCG and FPC, as nodes live almost 10 times longer than those in PCG and 5 times longer than those in FPC.
WBAN Sensors • Sensor Architecture: • Sensing • Sampling • Processing • Communicating • Types of Sensors: • ECG (electrocardiogram) sensor for monitoring heart activity • EEG (electroencephalography) sensor for monitoring brain electrical activity • EMG (electromyography) sensor for monitoring muscle activity • Blood pressure sensor • Pulse Oximiter sensor used to measure the blood oxygen saturation (SpO2) • Glucose sensor • Tilt sensor for monitoring trunk position • Activity sensors used to estimate user’s movements
Game Theory • A non-cooperative Game: • There are mplayers in the game as: M= {1,2,...,m} • The global strategy is given by the Cartesian product of all players’ action set: , • The set of payoff functions: • The set of strategies chosen by all players except player i: • Nash Equilibrium: no player has incentive to deviate unilaterally from his strategy given that the other players do not deviate: • Best-Response of player i to the strategies chosen by the other players is the strategy that yields him the greatest payoff:
Reinforcement Learning Problem Agent state xt reward rt action ut Environment rt+1 xt+1 u0 u1 u2 … x0 x1 x2 x3 r1 r2 r3 Goal: Learn to choose actions at that maximize discounted cumulative rewards: r1+r2+2 r3+… where 0<<1 is a discount factor
MDP Model • Finite set of states X and finite set of actions U(xt) • At each time step the agent observes state xtX and chooses action utU(xt), then state changes to xt+1 and receives immediate reward rt+1 • State transition function: xt+1=f(xt ,ut) • Reward function: rt+1=r (xt ,ut) • Next reward and state only depend on current state xt and action ut (Markov property) • Functions f(xt ,ut) and r(xt ,ut) may be non-deterministic • Functions f(xt ,ut) and r(xt ,ut) not necessarily known • Try to learn the best policy pt(x,u): X Uthat maximizes cumulative rewards : Rt=rt+rt+1+ 2rt+2+…starting from any arbitrary state
Non-Deterministic Case • State transition function f(x,u) : • Reward function r(x,u):
Value Functions • State value function denotes the expected reward for starting in state x and following policy p. • Action value function denotes the expected rewards for starting from state x, taking action u and following policy p afterwards.
Bellman Optimality: • Bellman Equations: • Optimal Policy: The policy p* that maximizes Vp(x) or equivalently Qp(x,u):
Realistic Assumtionms • Model-free:Transition probability function and reward function are unknown • So we need to estimate value functions or 2. Large and continuous-space problems:where X and/or U(x) are infinite, exact solutions cannot be found in general - So approximationis necessary
Approximation • Curse of Dimensionality: • Large State-Action Space • Continuous State-Action space • Generalization: Using approximation, the Q-values of each state influence the Q-values nearby states, so the agent can also make reasonable decisions in nearby states. This can help algorithms work well despite using only a limited number of samples.
Fuzzy Logic • Fuzzy Sets: A set that allows its members to have different degree of membership (membership function) in the interval [0,1] • Q: How cool is 36 F° ? • A: It is 30% Cool and 70% Freezing • Fuzzy Operators: • 1- Fuzzy Conjunction (Union) : • µAUB(x) = Max(µA(x),µB(x)) • 2- Fuzzy Disjunction (Intersection): • µA∩B(x)= Min(µA(x),µB(x)) • 3- Complement: • µAc(x) = 1 - µA(x) 0.7 0.3
Fuzzy Control • FIS: Fuzzy Inference System: • A way of mapping an input space to an output space using fuzzy logic • Fuzzy Rules: • “IF x is A THEN y is B“ • x is input (antecedent) • y is output (consequent) • A and B are linguistic values defined by fuzzy sets • FIS Components: • 1. Fuzzification module: transforms the inputs, which are crisp numbers, into fuzzy sets • 2. Knowledge base: stores IF-THEN rules provided by experts • 3. Inference engine: simulates the human reasoning process by making fuzzy inference on the inputs and IF-THEN rules • 4. Defuzzification module: transforms the fuzzy set obtained by the inference engine into a crisp value
FIS Implementation: • Mamdani’s Method: • 1- Evaluate the antecedent for each rule • 2- Obtain each rule's conclusion • 3- Aggregate conclusions • 4- Defuzzification
Genetic Fuzzy System • Genetic Algorithms (GA): • Fitness function: An objective function to be optimized • Chromosomes:Coded potential solution of the problem • Population: A set of chromosomes on which GA operates • Crossover and Mutation: Recombination operators to obtain new individuals • GA Iterative process: 1-Selection 2-Recombination 3-Mutation 4-Evaluation • Genetic Fuzzy System (GFS): • Parameterizing the fuzzy KB, • i.e. rules and membership functions, • transformed into a suitable genetic • representation as chromosomes and • finding those parameter values that are • optimal with respect to the design criteria • using GA