Mean field models of interacting objects: fluid equations, independence assumptions and pitfalls

Jean-Yves Le Boudec, EPFL, June 2009

Abstract
We consider a generic model of N interacting objects, where each object has a state and the interaction between objects is Markovian, i.e. the evolution of the system depends only on the collection of states at any point in time. This is quite a general modeling framework, which has been successfully applied to model many forms of communication protocols. When the number of objects N is large, one often uses simplifying assumptions called "mean field approximation", "fluid approximation", "fixed point method" or "decoupling assumption". In this tutorial we explain the meaning of these four concepts and show that the first two, namely mean field approximation and fluid approximation, are generally valid. However, we also show that the last two, namely fixed point method and decoupling assumption, require more care, as they may not be valid even in simple cases. We give sufficient conditions under which they are valid. We illustrate the concepts with the analysis of the 802.11 WiFi protocol.
This slide show is available on my home page under “talks” or directly at http://ica1www.epfl.ch/PS_files/lebSlides.htm

Contents
• A Simple Model of Interacting Objects
• Fluid Approximation
• Mean Field Approximation, Fast Simulation
• Stationary Regime and the Decoupling Assumption
• The Fixed Point Method
• Useful Extensions

A Simple Model of Interacting Objects
• Time is discrete
• N objects
• Object n has state X_n(t) ∈ {1, …, I}
• (X_1(t), …, X_N(t)) is Markov
• Objects can be observed only through their state
• N is large, I is small
• Called “Mean Field Interaction Models” in the Performance Evaluation community (but mean field has other meanings in physics, see later)
• Example 1: N wireless nodes, state = retransmission stage k
• Example 2: N wireless nodes, state = (k, c) (c = node class)
• Example 3: N wireless nodes, state = (k, c, x) (x = node location)

Example: 2-step malware propagation
• Mobile nodes are either ‘S’ (Susceptible), ‘D’ (Dormant) or ‘A’ (Active): 3 states
• N nodes
• Nodes meet pairwise (bluetooth)
• Possible interactions:
  • Recovery: D -> S
  • Mutual upgrade: D + D -> A + A
  • Infection by active: D + A -> A + A
  • Recovery: A -> S
  • Recruitment by Dormant: S + D -> D + D
  • Direct infection: S -> A

Details of the Example
• To specify the previous model entirely (i.e. to be able to simulate it), we need to specify the transition matrix
• In a compact form, we define the probability of each type of transition:
  • Recovery: D -> S
  • Mutual upgrade: D + D -> A + A
  • Infection by active: D + A -> A + A
  • Recovery: A -> S
  • Recruitment by Dormant: S + D -> D + D
  • Direct infection: S -> A
• Simulation algorithm: at every time step
  • Pick one case with probability as given in the table; the sum of the probabilities is less than 1, so it is possible to do nothing at one step
  • (case 1) Pick one node uniformly at random among all nodes that are in state ‘D’
  • (case 2) Pick one pair of nodes uniformly at random among all pairs of nodes that are in state ‘D’
  • (case 3) Pick one node among ‘A’ nodes, and one among ‘D’ nodes, each uniformly at random
  • etc.
• S, D, A are the numbers of nodes in state ‘S’, ‘D’, ‘A’
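The simulation algorithm above can be sketched as follows. Because nodes in the same state are interchangeable, the sketch tracks only the counts (S, D, A) rather than individual nodes. The rate constants in RATES, the state-dependent form of each probability, and the initial condition are assumptions made for illustration; they are not the values of the original table.

```python
import random

# HYPOTHETICAL rate constants (not the values of the original table).
# They sum to less than 1, so the case probabilities always sum to less than 1.
RATES = {
    "recovery_D": 0.10,      # D -> S
    "mutual_upgrade": 0.20,  # D + D -> A + A
    "infection": 0.20,       # D + A -> A + A
    "recovery_A": 0.10,      # A -> S
    "recruitment": 0.30,     # S + D -> D + D
    "direct": 0.01,          # S -> A
}


def step(counts, n_nodes, rng):
    """Advance the counts (S, D, A) by one time slot and return the new counts."""
    S, D, A = counts
    s, d, a = S / n_nodes, D / n_nodes, A / n_nodes
    # (probability of the case, change to (S, D, A) if the case is picked).
    # Each probability is zero whenever the case is infeasible.
    cases = [
        (RATES["recovery_D"] * d, (+1, -1, 0)),
        (RATES["mutual_upgrade"] * d * max(D - 1, 0) / n_nodes, (0, -2, +2)),
        (RATES["infection"] * d * a, (0, -1, +1)),
        (RATES["recovery_A"] * a, (+1, 0, -1)),
        (RATES["recruitment"] * s * d, (-1, +1, 0)),
        (RATES["direct"] * s, (-1, 0, +1)),
    ]
    u = rng.random()
    for prob, (dS, dD, dA) in cases:
        if u < prob:
            return (S + dS, D + dD, A + dA)
        u -= prob
    return counts  # no case picked: nothing happens in this slot


def simulate(n_nodes=1000, n_steps=50_000, seed=1):
    """Return the list of counts (S, D, A) at every time slot."""
    rng = random.Random(seed)
    counts = (n_nodes - 10, 10, 0)  # hypothetical initial condition
    trajectory = [counts]
    for _ in range(n_steps):
        counts = step(counts, n_nodes, rng)
        trajectory.append(counts)
    return trajectory


if __name__ == "__main__":
    S, D, A = simulate()[-1]
    print(f"final proportions: S={S / 1000:.3f} D={D / 1000:.3f} A={A / 1000:.3f}")
```

With these assumed probabilities, the expected number of objects that change state in one slot stays bounded as N grows, which is the kind of scaling the rest of the talk relies on.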

Simulation Runs, N = 1000 nodes
Figure: state (D, A or S) of nodes 1, 2 and 3 over time; D(t), proportion of nodes in state i = 1; A(t), proportion of nodes in state i = 2.

Sample Runs with N = 1000

What can we do with a Mean Field Interaction Model?
• Large N asymptotics = fluid limit
  • Markov chain replaced by a deterministic dynamical system (ODE)
  • Fast simulation
  • Issues: when is it valid (we don't want to devote an entire PhD to showing the mean field limit); how to formulate the ODE
• Large t asymptotics ≈ stationary behaviour
  • Useful performance metric
  • Issues: is the stationary regime of the ODE an approximation of the stationary regime of the original system? Does this justify the “Decoupling Assumption”?

Scaling Assumptions
• We want to simplify the model for large N; for this we need scaling assumptions
• Let W^N(t) be the (random) number of objects that do a transition at time slot t when there are N objects
• Informally, the main scaling assumptions are:
  • The expectation of W^N(t) tends to a constant as N grows
  • The second moment of W^N(t) remains bounded as N grows
• I.e., for large N, the probability that a given object makes a transition in one time slot is O(1/N)
• This is equivalent to a time scale assumption: the time slot duration is O(1/N)

Formal Statements
• Definition: occupancy measure M^N_i(t) = fraction of objects in state i at time t
• Example: M^N(t) = (D(t), A(t), S(t))
• Definition: drift = expected change to M^N(t) in one time slot
• The scaling assumptions are:

Writing the Drift Without Error
• Drift = sum over all transitions of (probability of transition) × (delta to the system state M^N(t))
• Can be automated: http://icawww1.epfl.ch/IS/tsed
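The recipe "sum over all transitions of probability × delta" can be written down mechanically. The following is a minimal sketch for the malware example, with the occupancy measure ordered as m = (d, a, s); the rate constants and the form of each probability are hypothetical, chosen only to make the example concrete.

```python
# Each transition: (probability as a function of m = (d, a, s),
#                   change in the counts (D, A, S) when it fires).
# All rate constants are HYPOTHETICAL.
TRANSITIONS = [
    (lambda d, a, s: 0.10 * d,     (-1,  0, +1)),  # recovery        D -> S
    (lambda d, a, s: 0.20 * d * d, (-2, +2,  0)),  # mutual upgrade  D + D -> A + A
    (lambda d, a, s: 0.20 * d * a, (-1, +1,  0)),  # infection       D + A -> A + A
    (lambda d, a, s: 0.10 * a,     ( 0, -1, +1)),  # recovery        A -> S
    (lambda d, a, s: 0.30 * s * d, (+1,  0, -1)),  # recruitment     S + D -> D + D
    (lambda d, a, s: 0.01 * s,     ( 0, +1, -1)),  # direct          S -> A
]


def drift(m, n_nodes):
    """Expected change of the occupancy measure m in one time slot."""
    total = [0.0, 0.0, 0.0]
    for prob, delta in TRANSITIONS:
        p = prob(*m)
        for i, change in enumerate(delta):
            # One object changing state moves the occupancy measure by 1/N.
            total[i] += p * change / n_nodes
    return tuple(total)


if __name__ == "__main__":
    print(drift((0.5, 0.0, 0.5), n_nodes=100))
```

Every delta sums to zero, so the components of the drift sum to zero as well, which is a quick sanity check when a drift is written by hand. Multiplying the result by N gives the rescaled drift, which does not depend on N in this sketch.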

Example

Fluid Approximation
• Theorem: under the scaling assumption, the stochastic system M^N(t) can be approximated by the fluid limit μ(t), solution of the ODE dμ/dt = F(μ), where F is the rescaled drift of M^N(t)
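A fluid limit of this kind can be computed numerically with any ODE solver. The sketch below uses explicit Euler steps, which is a choice of convenience and not something prescribed by the talk. The function F is a hypothetical rescaled drift for the malware example with m = (d, a, s); its coefficients are assumptions.

```python
def F(m):
    """HYPOTHETICAL rescaled drift of the malware example, m = (d, a, s)."""
    d, a, s = m
    to_s = 0.10 * d + 0.10 * a              # recoveries D -> S and A -> S
    d_to_a = 0.40 * d * d + 0.20 * d * a    # mutual upgrade and infection by active
    s_to_d = 0.30 * s * d                   # recruitment by dormant
    s_to_a = 0.01 * s                       # direct infection
    return (
        s_to_d - 0.10 * d - d_to_a,
        d_to_a + s_to_a - 0.10 * a,
        to_s - s_to_d - s_to_a,
    )


def fluid_limit(m0, t_end, dt=0.01):
    """Integrate dm/dt = F(m) with explicit Euler steps; return [(t, m), ...]."""
    m = tuple(m0)
    trajectory = [(0.0, m)]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        m = tuple(mi + dt * fi for mi, fi in zip(m, F(m)))
        trajectory.append((k * dt, m))
    return trajectory


if __name__ == "__main__":
    t, (d, a, s) = fluid_limit((0.01, 0.0, 0.99), t_end=200.0)[-1]
    print(f"t={t:.0f}: d={d:.3f} a={a:.3f} s={s:.3f}")
```

The components of F sum to zero, so the trajectory stays on the set d + a + s = 1, as an occupancy measure must.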

Example

Figure: fluid limit (N = +∞) compared with the stochastic system (N = 1000).

Fluid Approximation Theorem
• Definition: re-scaled occupancy measure
• [Benaïm, L]:

Computing the Mean Field Limit
• Compute the drift of M^N and its limit over intensity

Propagation of Chaos
• Convergence to an ODE implies “propagation of chaos” [Sznitman, 1991]
• This says that, for large N, any k objects are approximately independent, with distribution given by the mean field limit

Mean Field Independence
• At any time t, k nodes are asymptotically independent
• Thus for large t:
  • Prob(node n is dormant) ≈ 0.3
  • Prob(node n is active) ≈ 0.6
  • Prob(node n is susceptible) ≈ 0.1

Fast Simulation Result
• A stronger result than propagation of chaos: it does not require exchangeability [Tembine, L et al, 2009]
• Assume we know the state of object n at time 0; we can approximate its evolution by replacing all other objects collectively by the ODE
• The state of object n is then a jump process, with transition matrix driven by the ODE

Example
• P^N_{i,j}(m) is the transition probability for one object, given that the state of the system is m
• Note: knowing the transition matrix P^N(m) is not enough to be able to simulate (or analyze) the system with N objects, because there may be simultaneous transitions of several objects (in the example, up to 2)
• However, the fast simulation result says that, in the large N limit, we can consider one (or k) objects as if they were independent of the other N-k
• (X^N_1(t/N), M^N(t/N)) can be approximated by the process (X_1(t), m(t)), where m(t) follows the ODE and X_1(t) is a jump process with time-dependent transition matrix A(m(t))

The state of one object is a jump process with transition matrix A^N, where m = (D, A, S) depends on time (it is the solution of the ODE)
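The fast simulation idea can be sketched in a few lines: one tagged object is simulated as a jump process while all other objects are replaced by the ODE. In the sketch below, the per-object jump rates in rates() are hypothetical, the ODE is the one implied by those rates, and both the ODE and the jump process are discretised with a small step dt (a first-order approximation chosen for brevity).

```python
import random

STATES = ("D", "A", "S")


def rates(state, m):
    """HYPOTHETICAL jump rates of one object out of `state`, given m = (d, a, s)."""
    d, a, s = m
    if state == "D":
        return {"S": 0.10, "A": 0.40 * d + 0.20 * a}
    if state == "A":
        return {"S": 0.10}
    return {"D": 0.30 * d, "A": 0.01}  # state == "S"


def F(m):
    """Drift of the ODE implied by the rates: mass in each state times its rates."""
    flow = dict.fromkeys(STATES, 0.0)
    for state, mass in zip(STATES, m):
        for target, rate in rates(state, m).items():
            flow[state] -= mass * rate
            flow[target] += mass * rate
    return tuple(flow[k] for k in STATES)


def fast_simulation(x0, m0, t_end, dt=0.01, seed=0):
    """Simulate one tagged object against the ODE; return [(t, state, m), ...]."""
    rng = random.Random(seed)
    x, m = x0, tuple(m0)
    path = [(0.0, x, m)]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        # The tagged object jumps to `target` with probability rate * dt.
        u = rng.random()
        for target, rate in rates(x, m).items():
            if u < rate * dt:
                x = target
                break
            u -= rate * dt
        # All other objects are replaced by the deterministic ODE (Euler step).
        m = tuple(mi + dt * fi for mi, fi in zip(m, F(m)))
        path.append((k * dt, x, m))
    return path


if __name__ == "__main__":
    t, x, m = fast_simulation("S", (0.01, 0.0, 0.99), t_end=100.0)[-1]
    print(t, x, tuple(round(v, 3) for v in m))
```

The cost of one run does not depend on N, which is what makes the approach attractive for large systems.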

Example
Figure: occupancy measure μ(t), together with the pdf of node 1, node 2 and node 3.

ODEs
• Let p^N_j(t|i) be the probability that a node that starts in state i is in state j at time t
• The fast simulation result says that:
• With the ODEs:
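A plausible form of these relations, assuming the standard Kolmogorov forward equations for a jump process with time-dependent transition matrix A(m(t)), is sketched below. Here p(t|i) denotes the row vector (p_j(t|i))_j, time is measured on the rescaled (ODE) time axis, and the symbol F for the drift of the ODE is an assumption of this sketch.

```latex
\begin{align*}
p^N_j(t \mid i) &\approx p_j(t \mid i), \qquad p_j(0 \mid i) = \mathbf{1}_{\{i = j\}}, \\
\frac{d}{dt}\, p(t \mid i) &= p(t \mid i)\, A\big(m(t)\big), \\
\frac{d}{dt}\, m(t) &= F\big(m(t)\big).
\end{align*}
```

The first ODE is linear in p once m(t) is known, so the two equations can be integrated together in a single pass.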

Computing the Transition Probability
• P^N_{i,j}(m) is the transition probability for one object, given that the state of the system is m

The Mean Field Approximation
• Common in physics
• Consists in pretending that X^N_m(t) and X^N_n(t) are independent in the time evolution equation
• It is asymptotically true for large N, at fixed time t, for our model of interacting objects
• Also called “decoupling assumption” (in computer science)

Stationary Regime
• Original system (stochastic): (X^N(t)) and (M^N(t)) are Markov, finite state space, discrete time
• Assume either one is irreducible, thus has a unique stationary probability π^N
• For large N, how does π^N relate to the stationary regime of the ODE?
• Diagram: the law of M^N(t) tends to π^N as t -> +∞ and to μ(t) as N -> +∞; is the limit of π^N as N -> +∞ the same as the limit of μ(t) as t -> +∞ ???

Stationary Regime: The Good Case
• There is one simple case: assume (H) the ODE has a global attractor m*, i.e. μ(t) converges to m* for all initial conditions
• Theorem: under (H), (1) m* is the limit of π^N for large N [π^N = stationary probability of (X^N_1(t), …, X^N_N(t))], and (2) asymptotic independence (mean field assumption) holds in stationary regime; this is called the “decoupling assumption”

Example
• In stationary regime:
  • Prob(node n is dormant) ≈ 0.3
  • Prob(node n is active) ≈ 0.6
  • Prob(node n is susceptible) ≈ 0.1
  • Nodes m and n are independent
• We are in the good case: the diagram commutes
• Diagram: the law of M^N(t) tends to π^N as t -> +∞, which tends to m* as N -> +∞; the law of M^N(t) tends to μ(t) as N -> +∞, which tends to m* as t -> +∞

Counter-Example
• Same as before, except for one parameter value: h = 0.1 instead of 0.3
• The ODE does not converge to a unique attractor (limit cycle)
• Assumption (H) does not hold; does the decoupling assumption still hold?

Decoupling Assumption Does Not Hold Here
• In stationary regime, m(t) = (D(t), A(t), S(t)) follows the limit cycle
• Assume you are in stationary regime (the simulation has run for a long time) and you observe that one node, say n = 1, is in state ‘A’
• It is more likely that m(t) is in region R (marked on the figure)
• Therefore, it is more likely that some other node, say n = 2, is also in state ‘A’
• This is synchronization

Numerical Example
Figure: values compared in the numerical example:
• Stationary point of the ODE
• Mean of the limit of the stationary distribution of M^N = pdf of one node in stationary regime
• pdf of node 2 in stationary regime, given node 1 is A
• pdf of node 2 in stationary regime, given node 1 is D
• pdf of node 2 in stationary regime, given node 1 is S

Where is the Catch?
• The fluid approximation and the fast simulation result say that nodes m and n are asymptotically independent
• But we saw that nodes may not be asymptotically independent
• … is there a contradiction?

The Diagram Does Not Commute
• For large t and N: where T is the period of the limit cycle
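A plausible form of the relation on this slide, assuming that for large t and N the system is at a uniformly distributed phase of the limit cycle and that two nodes are independent given that phase, is sketched below. The notation μ_i(s) for the i-th component of the ODE solution along the limit cycle is an assumption of this sketch.

```latex
\begin{align*}
\mathbb{P}\big(X^N_1(t) = i,\; X^N_2(t) = j\big)
  &\approx \frac{1}{T} \int_0^T \mu_i(s)\, \mu_j(s)\, ds \\
  &\neq \Big(\frac{1}{T} \int_0^T \mu_i(s)\, ds\Big)
        \Big(\frac{1}{T} \int_0^T \mu_j(s)\, ds\Big)
  \quad \text{in general.}
\end{align*}
```

The time average of a product differs from the product of the time averages unless μ_i or μ_j is constant along the cycle, which is why independence at every fixed time does not carry over to the stationary regime.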

Generic Result for Stationary Regime
• Original system (stochastic): (X^N(t)) is Markov, finite, discrete time
• Assume it is irreducible, thus has a unique stationary probability π^N
• Let ϖ^N be the corresponding stationary distribution for M^N(t), i.e. P(M^N(t) = (x_1, …, x_I)) = ϖ^N(x_1, …, x_I) for x_i of the form k/N, k integer
• Theorem: for large N, ϖ^N is concentrated close to the Birkhoff center of the ODE
• Birkhoff center: closure of the set of points m such that m ∈ ω(m)
• Omega limit: ω(m) = set of limit points of the orbit starting at m

• Here: Birkhoff center = limit cycle ∪ fixed point
• The theorem says that the stochastic system for large N is close to the Birkhoff center, i.e. the stationary regime of the ODE is a good approximation of the stationary regime of the stochastic system

• At fixed t, the large N limit is deterministic (Dirac)
• In stationary regime it is not
• The stationary regime is periodic
• Sampling algorithm for the stationary regime: pick t uniformly at random in a period; set m = μ(t)
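The sampling algorithm can be sketched as follows. The periodic orbit mu(t) below is a toy closed curve on the simplex, used only as a stand-in for the actual limit cycle; its shape and the period T are assumptions. Given the sampled phase, nodes are drawn independently from m, which is how the large N limit behaves at a fixed time.

```python
import math
import random

T = 10.0  # HYPOTHETICAL period of the limit cycle
STATES = ("D", "A", "S")


def mu(t):
    """Toy periodic orbit (d, a, s); a stand-in for the real limit cycle."""
    phase = 2.0 * math.pi * t / T
    d = 0.30 + 0.10 * math.cos(phase)
    a = 0.40 + 0.25 * math.sin(phase)
    return (d, a, 1.0 - d - a)


def sample_stationary(k, rng):
    """Sample the states of k nodes in stationary regime."""
    t = rng.uniform(0.0, T)   # pick t uniformly at random in a period
    m = mu(t)                 # set m = mu(t)
    # Given m, the k nodes are independent with distribution m.
    return rng.choices(STATES, weights=m, k=k)


def estimate(n_samples=100_000, seed=0):
    """Estimate P(node 1 is A) and P(node 1 is A and node 2 is A)."""
    rng = random.Random(seed)
    one_a = both_a = 0
    for _ in range(n_samples):
        x1, x2 = sample_stationary(2, rng)
        one_a += x1 == "A"
        both_a += x1 == "A" and x2 == "A"
    return one_a / n_samples, both_a / n_samples


if __name__ == "__main__":
    p_a, p_aa = estimate()
    print(f"P(A) = {p_a:.3f}, P(A, A) = {p_aa:.3f}, P(A)^2 = {p_a ** 2:.3f}")
```

For this toy orbit the joint probability exceeds the product of the marginals, which is the synchronization effect described earlier: the nodes are independent given the phase, but not unconditionally.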

Quiz
• M^N(t) is a Markov chain on E = {(a, b, c) ≥ 0, a + b + c = 1, a, b, c multiples of 1/N}
• A. M^N(t) is periodic, this is why there is a limit cycle for large N.
• B. For large N, the stationary probability of M^N tends to be concentrated on the blue cycle.
• C. For large N, the stationary probability of M^N tends to a Dirac.
• D. M^N(t) is not ergodic, this is why there is a limit cycle for large N.
Figure: the state space E (for N = 200).

Take Home Message
• For large N the decoupling assumption holds at any fixed time t
• It holds in stationary regime under assumption (H): the ODE has a unique global stable point to which all trajectories converge
• Otherwise the decoupling assumption may not hold in stationary regime
• It has nothing to do with the properties at finite N
• In our example, for h = 0.3 the decoupling assumption holds in stationary regime; for h = 0.1 it does not
• Study the ODE!

The Fixed Point Method
• Commonly used to model protocol performance, finite state machines, etc.
• When valid, works as follows:
  • Nodes 1…N each have a state in {1, 2, …, I}
  • Assume N is large and therefore nodes are independent (decoupling assumption)
  • Let π_i be the probability that any given node n is in state i
  • Write the equilibrium equations using the independence
  • This can often be cast as a fixed point equation for π, solved numerically by iteration

The Fixed Point Method Consists in Finding the Stationary Points of the ODE
• The transition matrix for one object depends on the occupancy measure, assumed equal to the vector π of per-node state probabilities
• The equilibrium equation for π can be rewritten in equivalent forms; for large N it is the same as setting the drift to zero
• Thus π is a fixed point of the ODE
• This is justified if assumption (H) holds, otherwise not

Existence and Uniqueness of a Fixed Point are not Sufficient for Validity of Fixed Point Method
• The essential assumption is (H): μ(t) converges to a unique m*
• It is not sufficient to find that there is a unique stationary point, i.e. a unique solution to F(m*) = 0
• Counter-example on the figure:
  • (X^N(t)) is irreducible and thus has a unique stationary probability π^N
  • There is a unique stationary point (= fixed point) (red cross): F(m*) = 0 has a unique solution, but it is not a stable equilibrium
  • The fixed point method would say here: Prob(node n is dormant) ≈ 0.1, and nodes are independent
  • … but in reality, we have seen that nodes are not independent: they are correlated and synchronized

Figure: stationary point of the ODE, compared with the mean of the limit of the stationary distribution of M^N (= pdf of one node in stationary regime).

Correct Use of Fixed Point Method
• Verify the scaling assumption
• Write the ODE
• Study the stationary regime of the ODE, not just the fixed point
• Verify assumption (H), i.e. there is a unique attractor to which all solutions converge
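A first numerical check of assumption (H) is to integrate the ODE from several initial conditions and compare where the trajectories end up. The sketch below does this with explicit Euler steps. It gives evidence only, not a proof: agreement for a handful of starting points does not establish a global attractor. The two drift functions are toy examples chosen for illustration and are not the malware model. Both have the centre of the simplex as their unique stationary point, but only the first one satisfies (H).

```python
import math

CENTER = (1 / 3, 1 / 3, 1 / 3)


def integrate(F, m0, t_end, dt=0.001):
    """Integrate dm/dt = F(m) with explicit Euler steps; return m(t_end)."""
    m = tuple(m0)
    for _ in range(int(round(t_end / dt))):
        m = tuple(mi + dt * fi for mi, fi in zip(m, F(m)))
    return m


def endpoints_agree(F, initial_conditions, t_end=50.0, tol=1e-3):
    """True if all trajectories end within `tol` of the first one."""
    ends = [integrate(F, m0, t_end) for m0 in initial_conditions]
    return all(math.dist(ends[0], end) < tol for end in ends[1:])


def contracting(m):
    """Toy drift: every trajectory converges to CENTER, so (H) holds."""
    return tuple(c - mi for c, mi in zip(CENTER, m))


def rotating(m):
    """Toy drift: unique stationary point CENTER, but trajectories circle it."""
    d, a, s = m
    return (a - s, s - d, d - a)


if __name__ == "__main__":
    starts = [(0.5, 0.3, 0.2), (0.4, 0.3, 0.3)]
    print("contracting:", endpoints_agree(contracting, starts))
    print("rotating:   ", endpoints_agree(rotating, starts))
```

Solving F(m) = 0 alone would not distinguish the two cases, since both drifts vanish only at the centre; only the study of the trajectories does.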
