This study proposes a practical game-theoretic model to analyze the cooperation between nodes in a mobile ad hoc network (MANET), taking into account measurable parameters and feasible protocol modifications. The objective is to maximize network utility while considering the power cost imposed by packet relay on the relay nodes.
A Proposed Game-Theoretic Model of Cooperation between Nodes in a MANET Jim Catt ECE 695 Department of Electrical and Computer Engineering Purdue School of Engineering and Technology Spring 2006
Introduction and Motivation • In mobile ad hoc networks (MANET), nodes in the network must provide some level of relay service to other nodes in the network to achieve optimal global efficiency of network operation. • However, packet relay imposes a power cost on the relaying node. • Since MANET nodes are often battery powered, this is costly and shortens node lifetime. • The most rational local strategy for each node is not to cooperate and to transmit only its own packets.
Introduction and Motivation • If all nodes adopt this locally rational strategy, network connectedness drops to zero. • All nodes lose in this case – nodal utility drops to zero • Yet, if each node cooperates, there is the possibility to maximize the utility of all nodes. • This is a classical game theory scenario • Game theory has been utilized to analyze several aspects of MANET operation • This project is restricted to analysis of cooperation
Objective • The objective of this work is to develop a practical game-theoretic model of nodal cooperation that uses measurable, realistic parameters to make strategy choices, and when combined with feasible protocol modifications, can be reasonably implemented in MANET nodes.
Prisoner’s Dilemma • The Prisoner’s Dilemma is often used as a pedagogic example of game theory • Preliminaries • Player – an entity with preferences • Strategy – A set of actions available to a player, in response to the strategy of other players • Outcome – The result of the complete set of strategic choices by all players in the game • Utility – the amount of welfare a player derives from an outcome (or strategy) • Often expressed as a utility function, a mathematical mapping of the welfare received by the player from an outcome. • Payoff – Usually formulated as: p = utility - cost
Prisoner’s Dilemma • The Prisoner’s Dilemma scenario: • Two people are arrested for armed robbery • Not enough evidence to convict for armed robbery, but enough to convict for theft of getaway car • Each prisoner is given the following choices: • You confess and implicate your partner, but your partner doesn’t confess, you go free, she gets ten years in prison • If you both confess, both get 5 years in prison • If neither confesses, both get 2 years for auto theft. • Utility (payoff) mapping: • Go free 4 • 2 years 3 • 5 years 2 • 10 years 0
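Prisoner’s Dilemma • The game can be represented in strategic form by a matrix of (Prisoner 1, Prisoner 2) payoffs, with rows for Prisoner 1 and columns for Prisoner 2, where cooperate means staying silent and defect means confessing: • (Cooperate, Cooperate): (3, 3) • (Cooperate, Defect): (0, 4) • (Defect, Cooperate): (4, 0) • (Defect, Defect): (2, 2) • The prisoners are separated and cannot communicate. • What will they decide?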
Prisoner’s Dilemma • Consider one prisoner at a time • For a specific strategy – either defect or cooperate – there are two possible payoffs • Which strategy offers the best set of potential payoffs? Or, equivalently, which strategy maximizes the minimum payoff?
Prisoner’s Dilemma • (Defect, Defect) is an equilibrium solution to the game (Nash Equilibrium) • However, this clearly isn’t the optimal solution, which is (Cooperate, Cooperate). • Hence, a Nash equilibrium isn’t necessarily an optimal solution to a game!
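The equilibrium argument can be checked mechanically. The following is a minimal Python sketch, using the payoffs from the utility mapping given earlier (go free 4, 2 years 3, 5 years 2, 10 years 0); the function names are illustrative only. It tests every strategy pair for a profitable unilateral deviation and also computes the maximin action for each prisoner.

```python
from itertools import product

# Payoffs from the utility mapping above: (Prisoner 1, Prisoner 2).
# "C" = cooperate (stay silent), "D" = defect (confess).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 4),
    ("D", "C"): (4, 0),
    ("D", "D"): (2, 2),
}
ACTIONS = ("C", "D")


def is_nash(a1, a2):
    """True if neither player gains by unilaterally switching actions."""
    u1, u2 = PAYOFFS[(a1, a2)]
    best1 = all(PAYOFFS[(alt, a2)][0] <= u1 for alt in ACTIONS)
    best2 = all(PAYOFFS[(a1, alt)][1] <= u2 for alt in ACTIONS)
    return best1 and best2


def nash_equilibria():
    """All pure-strategy pairs from which no player wants to deviate."""
    return [pair for pair in product(ACTIONS, repeat=2) if is_nash(*pair)]


def maximin_action(player):
    """Action that maximizes the worst-case payoff of player 0 or 1."""
    def worst(action):
        if player == 0:
            return min(PAYOFFS[(action, other)][0] for other in ACTIONS)
        return min(PAYOFFS[(other, action)][1] for other in ACTIONS)
    return max(ACTIONS, key=worst)
```

Calling `nash_equilibria()` returns only the (Defect, Defect) pair, even though (Cooperate, Cooperate) gives both players a higher payoff.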
Strategies • Types of strategies: • Pure Strategy – a player chooses to play a certain strategy with probability 1. Usually only encountered in games of perfect information. • Mixed Strategy – a player has a set of strategies to choose from. A probability distribution describes the likelihood that a particular strategy will be chosen.
Game Theory and Cooperation in MANETs • Classical game theory models for cooperation in MANETs: • economic payment model • punishment/reward model. • Regardless of model, there is little consistency in the formulation of utility functions. • Many formulations employ abstractions for utilities and costs (less practical) • Some are based on some energy measure (more practical). • Many require extraordinary overhead in the exchange of information between nodes
Proposed Approach • Premise: the basic resource available to a node is its lifetime store of energy (battery life). • This resource is available to be consumed for either computational functions or information exchange functions, both part of “mission” execution • Node behavior should strike a balance between: • achieving maximum lifetime • executing its mission.
Proposed Approach: Ground Rules • Sending and receiving packets requires cooperation. • Payment is in-kind (punishment/reward framework) • Payoff should be proportional to the benefit received. • Cost for cooperation: • decrease in potential lifetime, or • alternately, lost opportunity to transmit own packets in the future.
Problem Formulation • Dual objectives: • Maximization of the lifetime function • Subject to maintaining reward R ≥ 0. • Assumptions and conventions • Slotted communication intervals of fixed length • Packet length L is fixed for this study. • Data (symbol) rate Rb is fixed for this study. • One packet time = Tp = L/Rb.
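As a quick worked example of the packet-time convention, with illustrative values that are not taken from the study (a packet of L = 8000 bits and a rate of Rb = 1 Mbit/s, treating the symbol rate as a bit rate):

```latex
T_p = \frac{L}{R_b} = \frac{8000\ \text{bits}}{10^{6}\ \text{bits/s}} = 8\ \text{ms}
```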
Assumptions and Conventions (cont.) • On average, a node is connected to two or more adjacent nodes • nodes are uniformly distributed throughout the region of interest, and • The average mobility of the network is sufficiently high such that no node is confined to an edge or border region for long periods of time
Restrictions • Only selfish nodes are considered, not malicious nodes • The proposed approach is for steady state conditions. • Modification for startup conditions requires further study. • Energy consumption associated with packet reception is ignored because even a selfish node will listen for its own packets.
Playing the Game • A node has a relay buffer and an own-packet buffer. • At each slot time, a node plays a mixed strategy, and may choose from the following action set: • Neither transmit nor relay • Transmit its own packet, given that a packet is available in its own buffer • Relay a received packet, given that a packet is available in its relay buffer. • For this version of the game, the node will not transmit if: • both its own buffer and its relay buffer are empty. • either sending its own packet or relaying a packet causes its cumulative payoff to be negative for the current slot time
Playing the Game • PR = probability that node i relays a packet. • PO = probability that node i sends its own packet. • R = payoff received by node i when it relays a packet • O = payoff received by node i when it sends its own packet • The expected payoff (reward) for node i, is: • A rational node will act to maintain cumulative R ≥ 0. Or: • Equality with zero is allowed because temporarily, the only strategy available to node i may cause R = 0.
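One way to write the expected payoff, assuming the relay and own-packet payoffs are denoted \pi_R and \pi_O (notation introduced here only for illustration, not confirmed by the source), is the usual probability-weighted sum over the two transmit actions, together with the rationality constraint:

```latex
R = P_R\,\pi_R + P_O\,\pi_O, \qquad R \ge 0
```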
Definitions • Definitions • Total available energy at t=0 is ET. • k = 1,2…,N, the number of packets relayed by node i for other nodes • m = 1,2…,M the number of own packets transmitted by node i. • The total number of relay nodes (end-to-end) required for node i’s m-th packet, is a random variable, • j = 0,1,2…,J, set of links to adjacent nodes • The power used to transmit the m-th packet over the j-th link is a random variable denoted by: • The energy used to transmit the m-th packet over the j-th link is given by: • Denote relay energy as Er, and energy used to transmit own packet as Eo.
Energy usage function • Average CPU power is Wcpu. • At time t, the total energy remaining for node i is:
Lifetime function • The maximum possible lifetime is: • Maximum remaining lifetime at time t is:
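A minimal sketch of energy bookkeeping consistent with the definitions above. The formulas are assumptions rather than the study's stated equations: remaining energy is taken as ET minus the CPU energy Wcpu·t minus the energy spent on relayed and own packets, per-packet energy is transmit power times Tp, and lifetime is energy divided by Wcpu. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EnergyLedger:
    """Hypothetical per-node energy bookkeeping (all formulas are assumptions)."""
    e_total: float          # E_T, joules available at t = 0
    w_cpu: float            # W_cpu, average CPU power in watts
    t_packet: float         # T_p = L / R_b, seconds per packet
    relay_energy: list = field(default_factory=list)  # E_r per relayed packet
    own_energy: list = field(default_factory=list)    # E_o per own packet

    def packet_energy(self, tx_power: float) -> float:
        """Energy to send one packet at the given transmit power (W * s = J)."""
        return tx_power * self.t_packet

    def record_relay(self, tx_power: float) -> None:
        self.relay_energy.append(self.packet_energy(tx_power))

    def record_own(self, tx_power: float) -> None:
        self.own_energy.append(self.packet_energy(tx_power))

    def remaining_energy(self, t: float) -> float:
        """Assumed form: E_T - W_cpu * t - sum(E_r) - sum(E_o)."""
        spent = self.w_cpu * t + sum(self.relay_energy) + sum(self.own_energy)
        return self.e_total - spent

    def max_lifetime(self) -> float:
        """Assumed upper bound: the node never transmits, only the CPU draws power."""
        return self.e_total / self.w_cpu

    def max_remaining_lifetime(self, t: float) -> float:
        """Assumed form: remaining energy spent on the CPU alone."""
        return max(self.remaining_energy(t), 0.0) / self.w_cpu
```

Under these assumptions, every relayed or own packet shortens the maximum remaining lifetime by its transmit energy divided by Wcpu, which is the kind of incremental lifetime cost the payoff functions need.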
Payoff functions • Payoff = utility - cost.
Constructing PR and PO • PR and PO give the strategy rule that can be used by the node to pick its strategy at each slot time. • PR and PO should be proportional to the payoffs received by node i, and the level of cooperation received by node i. • Define V as a measure of the relationship between the payoffs, or, the ratio of the absolute values of the payoffs: • The expected payoff R becomes :
Constructing PR and PO • Define the following events: • AQR = the event that there is a packet in the relay buffer • AQO = the event that there is a packet in own transmit buffer • AR = the event that a packet is relayed • AO = the event that own packet is transmitted • AT = the event that a packet is transmitted, either a relayed packet or own packet • ARS = the event that a relayed packet successfully reaches its destination • AOS = the event that the node’s own packet successfully reaches its destination
Constructing PR and PO • Assertions: • The relevant event space is AT = (AR ∪ AO) • PO = P(AO|AT), and PR = P(AR|AT) • PO + PR = 1 • [Figure: event-space diagram relating AQR, AQO, AOS, ARS, AO, and AR]
Constructing PR and PO • The cooperation experienced by node i for relay of its own packets is P(AOS|AO). • Define the weighted payoff, O’ and weighted V’: • as P(AOS|AO) → 0, V’ → 0, PO → 1, PR → 0. • as P(AOS|AO) → 1, V’, PO, and PR all approach equilibrium values
Strategy Rule parameters • Define β as an estimate of P(AOS|AO): • Define an estimate of PR: • update each parameter prior to each new slot time
Strategy Rule parameters • Define the cumulative reward up to the current slot time: • Define the candidate updates for RC: • Define :
Strategy Rule algorithm • If AQR=1, calculate R,k+1 and RR. • If AQO=1, calculate O,m+1 and RO. • if (AQO=1 & AQR=0), • if O > 0, then AO=1 (send own packet), else do nothing • else if (AQO=0 & AQR=1), • if RR >= 0, then AR=1 (accept relay request), else do nothing • else if (AQO=1 & AQR=1), • if (…), then AR=0 (reject relay), and if O > 0, AO=1 (send own packet), else do nothing • else if RR >= 0, then AR=1 (accept relay request) • else if O > 0, then AO=1 (send own packet) • else do nothing • end • update β, PR and RC.
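The branching above can be sketched in Python as a single per-slot decision function. This is an illustration under assumptions, not the author's implementation: the test that rejects a relay when both buffers hold a packet is not given in the text, so it is passed in as a hypothetical `prefer_own` flag, and the updates of β, PR and RC are left to the caller.

```python
def choose_action(aqr, aqo, payoff_own, reward_if_relay, prefer_own=False):
    """Return "relay", "own", or "idle" for one slot.

    aqr / aqo       -- True if the relay / own buffer holds a packet
    payoff_own      -- payoff for sending the next own packet
    reward_if_relay -- candidate cumulative reward if the relay is accepted (RR)
    prefer_own      -- hypothetical stand-in for the test that rejects the
                       relay when both buffers hold a packet
    """
    can_own = aqo and payoff_own > 0
    can_relay = aqr and reward_if_relay >= 0

    if aqo and not aqr:
        return "own" if can_own else "idle"
    if aqr and not aqo:
        return "relay" if can_relay else "idle"
    if aqo and aqr:
        if prefer_own:
            # Relay is rejected outright; own packet goes out only if it pays.
            return "own" if can_own else "idle"
        if can_relay:
            return "relay"
        if can_own:
            return "own"
    return "idle"
```

For example, `choose_action(True, True, 1.0, -1.0)` falls through to sending the node's own packet, because accepting the relay would drive the cumulative reward negative.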
Strategy Rule algorithm • This algorithm can be applied on a global basis (no discrimination between nodes requesting relays) or on a node-by-node basis (a β parameter is calculated for each node).
Proposed Protocol Modifications for Own Packets • Routing Tables • For AODV, routing tables are modified to include all nodes on the path to the destination. However, the current routing method is still employed (i.e. next hop routing). • No change to DSR for path routing list • Furthermore, the routing table is modified by adding two fields to hold values that are used to estimate cooperation from other nodes. • NUM_PKT_OFFERED • NUM_PKT_ACCEPTED • These fields can be used to estimate each node’s unique β if distinguishing between nodes achieves better fairness. • Otherwise, when summed over all nodes, they can be used to calculate a global β
Proposed Protocol Modifications for Own Packets • Transport protocol must support an ACK mechanism in order to estimate P(AOS|AO) • A destination node k sends an ACK for each packet successfully received from node i (i.e., use a wireless, pseudo connection-oriented transport protocol) • To reduce overhead, an ACK could be applied to a block of packets, where block size is adjustable
Implementation for Own Packets • When node i transmits its own packet to destination node k: • If node j is an intermediate (relay) node on the path to node k, • NUM_PKT_OFFEREDj += 1. • If an ACK is received from node k, • NUM_PKT_ACCEPTEDj += 1. • If the ACK timer expires, execute normal transport protocol congestion adaptation. • If RERR is received for node k before the ACK timeout, • NUM_PKT_OFFEREDj -= 1.
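The counter updates above can be sketched as a small Python class. The class and method names are hypothetical, and the estimator β = accepted / offered is an assumption: the text says only that the two fields are used to estimate β, per node or summed over all nodes.

```python
from collections import defaultdict


class CooperationTable:
    """Per-relay counters mirroring NUM_PKT_OFFERED / NUM_PKT_ACCEPTED."""

    def __init__(self):
        self.offered = defaultdict(int)
        self.accepted = defaultdict(int)

    def on_send(self, relays):
        """Own packet handed to the path: count an offer for every relay j."""
        for j in relays:
            self.offered[j] += 1

    def on_ack(self, relays):
        """ACK from the destination: every relay j on the path forwarded it."""
        for j in relays:
            self.accepted[j] += 1

    def on_rerr(self, relays):
        """RERR before ACK timeout: withdraw the offer so it is not held against j."""
        for j in relays:
            self.offered[j] = max(self.offered[j] - 1, 0)

    def beta(self, j=None):
        """Assumed estimator: accepted / offered, per node j or global if j is None."""
        if j is None:
            offered = sum(self.offered.values())
            accepted = sum(self.accepted.values())
        else:
            offered, accepted = self.offered[j], self.accepted[j]
        return accepted / offered if offered else None
```

Returning `None` when nothing has been offered yet keeps the start-up case explicit instead of silently reporting zero cooperation.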
Summary • Developed payoff functions that include parameters incorporating energy usage and cooperation level. • Can be calculated from available or reasonably measurable information, or from minor modifications to protocol • Developed a stochastic decision rule based on modified payoff functions, thereby taking into account the influence on battery life and cooperation • Proposed minor protocol modification and routing table modification that enable the strategy rule. • Developed an algorithm implementing the strategy rule
Future Work • Formally verify that the proposed approach achieves a stable and optimal or pseudo-optimal equilibrium. • Alternately, prove that the proposed framework is Pareto-efficient. • Test the model using a network simulation tool to verify that: • it achieves optimality • it is stable • it is insensitive to noisy β and estimate of PR • the proposed protocol modifications are viable and do not add unacceptable overhead cost. • Develop a better method to estimate P(AOS|AO), as the estimator should take into account the impact of packet loss due to congestion or noise, i.e., remove or reduce the influence of these effects on β. • β may also need smoothing to account for lag in feedback • Develop modifications to the model that take into account start up conditions
Utility functions • The utility function for a node transmitting its own packet is: • Utility has units of hops per joule. Maximizing utility with regard to resource usage also maximizes remaining lifetime.
Utility associated with relaying a packet • When node i relays a packet for node j, it should receive a benefit (utility) that is proportional to the utility accrued to node j. • Let hj be the total number of relay nodes required for j’s packet. Node i’s share of the utility accrued to j is:
Cost functions • The cost incurred by node i for either transmitting its own packet or relaying a packet is the incremental decrease in its potential future utility. • The incremental cost in lifetime for relaying a packet is:
Cost functions • Likewise, the incremental cost in lifetime for transmitting own packet is: • Let be the average utility received by node i in one packet time as a result of transmitting one of its own packets. Then, the incremental utility cost to node i when it relays a packet is proportional to the incremental cost in lifetime:
Cost functions • Likewise, the incremental utility cost to node i for transmitting its own packet is: