11 - Markov Chains Jim Vallandingham
Outline • Irreducible Markov Chains • Outline of Proof of Convergence to Stationary Distribution • Convergence Example • Reversible Markov Chain • Monte Carlo Methods • Hastings-Metropolis Algorithm • Gibbs Sampling • Simulated Annealing • Absorbing Markov Chains
Stationary Distribution • As n → ∞, P^n approaches a limiting matrix in which each row is the stationary distribution φ′
Stationary Dist. Example • Long-term averages: • 24% of time spent in state E1 • 39% of time spent in state E2 • 21% of time spent in state E3 • 17% of time spent in state E4
Stationary Distribution • Any finite, aperiodic, irreducible Markov chain will converge to a stationary distribution • Regardless of the starting distribution (a numeric illustration follows below) • Outline of the proof requires linear algebra • Appendix B.19
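A minimal sketch of this convergence, assuming an arbitrary 4-state transition matrix (the slides' example with long-run fractions 24/39/21/17 is not reproduced here): two different starting distributions end up at the same stationary distribution.

```python
# Sketch: a finite, aperiodic, irreducible chain converges to the same
# stationary distribution from any start. The matrix is an arbitrary
# illustration, not the one from the slides.
import numpy as np

P = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.3, 0.4, 0.2],
    [0.3, 0.2, 0.2, 0.3],
])

P100 = np.linalg.matrix_power(P, 100)          # 100-step transition probabilities

start_a = np.array([1.0, 0.0, 0.0, 0.0])       # start surely in state E1
start_b = np.array([0.25, 0.25, 0.25, 0.25])   # start uniformly at random

print(start_a @ P100)   # both prints agree: the stationary distribution,
print(start_b @ P100)   # independent of the starting distribution
```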
L.A. : Eigenvalues • Let P be an s × s matrix • P has s eigenvalues λ_1, …, λ_s • Found as the s solutions of det(P − λI) = 0 • Assume all eigenvalues of P are distinct
L.A. : left & right eigenvectors • Corresponding to each eigenvalue • Is a right eigenvector - • And a left eigenvector - • For which: • Assume they are normalized:
L.A. : Spectral Expansion • Can express P in terms of its eigenvectors and eigenvalues: P = λ_1 r_1 l_1′ + λ_2 r_2 l_2′ + … + λ_s r_s l_s′ • Called a spectral expansion of P
L.A. : Spectral Expansion • If λ_i is an eigenvalue of P with corresponding left and right eigenvectors l_i′ & r_i • Then λ_i^n is an eigenvalue of P^n with the same left and right eigenvectors l_i′ & r_i
L.A. : Spectral Expansion • Implies the spectral expansion of P^n can be written as: P^n = λ_1^n r_1 l_1′ + λ_2^n r_2 l_2′ + … + λ_s^n r_s l_s′
Outline of Proof • Going back to the proof… • P is the transition matrix of a finite, aperiodic, irreducible Markov chain • P has one eigenvalue, λ_1, equal to 1 • All other eigenvalues have absolute value < 1
Outline of Proof • Choosing left and right eigenvectors of λ_1 = 1 • Requirements: • Right eigenvector r_1 = 1, the column vector of all 1's (P1 = 1, since each row of P sums to 1) • Left eigenvector l_1′ a probability vector (entries sum to 1) • These also satisfy the normalization: l_1′ r_1 = l_1′ 1 = 1
Outline of Proof • Also: l_1′ P = l_1′ • Can be shown that there is a unique solution of this equation that also satisfies l_1′ 1 = 1 • This is the same equation satisfied by the stationary distribution, φ′ P = φ′, so that l_1′ = φ′
Outline of Proof • P^n gives the n-step transition probabilities • The spectral expansion of P^n is: P^n = r_1 l_1′ + Σ_{i≥2} λ_i^n r_i l_i′ • Only one eigenvalue equals 1; the rest have absolute value < 1, so λ_i^n → 0 • So as n increases, P^n approaches r_1 l_1′ = 1 φ′, a matrix whose every row is φ′
Convergence Example • [The slides' numerical transition matrix, eigenvalues, and eigenvectors are not reproduced here] • The example matrix has eigenvalue λ_1 = 1, and all of its other eigenvalues are less than 1 in absolute value • Its left & right eigenvectors are chosen to satisfy the normalization l_i′ r_i = 1, with l_1′ equal to the stationary distribution • In the spectral expansion P^n = r_1 l_1′ + Σ_{i≥2} λ_i^n r_i l_i′, the terms for i ≥ 2 go to 0 as n grows, leaving r_1 l_1′, each row of which is the stationary distribution (see the sketch below)
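To make the convergence example concrete, here is a small sketch verifying the spectral expansion P^n = Σ λ_i^n r_i l_i′ and the limit P^n → 1 φ′. The 2×2 matrix is a stand-in assumption, since the slides' example matrix was not preserved.

```python
# Sketch: spectral expansion of P^n and its convergence to a matrix
# whose rows are the stationary distribution. Stand-in matrix only.
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])           # eigenvalues 1.0 and 0.7

eigvals, R = np.linalg.eig(P)        # columns of R are right eigenvectors
L = np.linalg.inv(R)                 # rows of L are left eigenvectors,
                                     # normalized so that l_i' r_i = 1

n = 50
# Spectral expansion of P^n: the |lambda_i| < 1 term vanishes as n grows.
Pn = sum(eigvals[i] ** n * np.outer(R[:, i], L[i, :]) for i in range(2))

print(Pn)                                # both rows approx. (2/3, 1/3)
print(np.linalg.matrix_power(P, n))      # agrees with direct computation
```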
Reversible Markov Chains • Typically we move forward in 'time' in a Markov chain: 1 → 2 → 3 → … → t • What about moving backward in this chain? t → t−1 → t−2 → … → 1
Reversible Markov Chains • [Figure: Species A and Species B descending from a common ancestor; following the chain from Species A runs back in time to the ancestor, then forward in time to Species B]
Reversible Markov Chains • Have a finite, irreducible, aperiodic Markov chain • with stationary distribution φ′ • During t transitions, the chain will move through states: X_1, X_2, …, X_t • Reverse chain • Define Y_i = X_{t+1−i} • Then the reverse chain will move through states: Y_1 = X_t, Y_2 = X_{t−1}, …, Y_t = X_1
Reversible Markov Chains • Want to show that the structure determining the reverse chain's sequence is also a Markov chain • Its typical element q_{ij} is found from the typical element p_{ij} of P, using: q_{ij} = φ_j p_{ji} / φ_i
Reversible Markov Chains • Shown by using Bayes' rule to invert the conditional probability • Intuitively: • The future is independent of the past, given the present • The past is independent of the future, given the present
Reversible Markov Chains • The stationary distribution of the reverse chain is still φ′ • Follows from the stationary distribution property: Σ_i φ_i q_{ij} = Σ_i φ_j p_{ji} = φ_j Σ_i p_{ji} = φ_j
Reversible Markov Chains • A Markov chain is said to be reversible if q_{ij} = p_{ij} for all i, j • This holds only if φ_i p_{ij} = φ_j p_{ji} for all i, j (the detailed balance condition)
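A sketch of checking the detailed balance condition numerically. The tridiagonal ("birth-death" style) matrix below is an illustrative assumption, chosen because such chains are always reversible.

```python
# Sketch: test reversibility via detailed balance,
# phi_i * p_ij == phi_j * p_ji for all i, j.
import numpy as np

def is_reversible(P, phi, tol=1e-12):
    """True if the chain with transition matrix P and stationary
    distribution phi satisfies detailed balance."""
    F = phi[:, None] * P          # F[i, j] = phi_i * p_ij
    return np.allclose(F, F.T, atol=tol)

P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])

# Recover phi as the left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
phi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
phi /= phi.sum()                  # phi = (0.25, 0.5, 0.25)

print(is_reversible(P, phi))      # True for this tridiagonal chain
```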
Markov Chain Monte Carlo • Class of algorithms for sampling from probability distributions • Involves constructing a Markov chain • whose stationary distribution is the desired (target) distribution • The state of the chain after a large number of steps is used as a sample from the desired distribution • We discuss 2 algorithms • Gibbs Sampling • Simulated Annealing
Basic Problem • Find a transition matrix P such that • its stationary distribution is the target distribution • We know the Markov chain will converge to its stationary distribution, regardless of the initial distribution • How can we find such a P?
Basic Idea • Construct a transition matrix Q • the "candidate-generating matrix" • Modify it to have the correct stationary distribution • Modification involves inserting factors a_{ij} • So that p_{ij} = q_{ij} a_{ij} for i ≠ j • There are various ways of picking the a's
Hastings-Metropolis • Goal: construct an aperiodic, irreducible Markov chain • having a prescribed stationary distribution • Produces a correlated sequence of draws from a target density that may be difficult to sample by classical independent-draw methods
Hastings-Metropolis Process: • Choose a set of constants a_{ij} • Such that 0 ≤ a_{ij} ≤ 1 • And a_{ij} = min(1, (φ_j q_{ji}) / (φ_i q_{ij})) • Define, for j ≠ i: p_{ij} = q_{ij} a_{ij} (accept state change with probability a_{ij}; reject it otherwise) • And p_{ii} = 1 − Σ_{j≠i} p_{ij} (chain doesn't change value)
Hastings-Metropolis Example • Target φ′ = (.4 .6) • [The slides' candidate matrix Q, the resulting P, and the powers P^2 and P^50 are not reproduced here; the rows of P^n converge to φ′ = (.4 .6)]
Algorithmic Description • Start with state E_1, then iterate: • Propose E′ from q(E_t, E′) • Calculate the ratio a = [φ(E′) q(E′, E_t)] / [φ(E_t) q(E_t, E′)] • If a ≥ 1, • Accept: E_{t+1} = E′ • Else • Accept with probability a • If rejected, E_{t+1} = E_t
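Putting this algorithmic description into code: a minimal Hastings-Metropolis sketch for the two-state target φ = (.4, .6) from the earlier example. The symmetric candidate matrix Q is an assumption (the slides' Q was not preserved); with a symmetric Q the ratio reduces to φ(E′)/φ(E_t), the Metropolis case.

```python
# Sketch: Hastings-Metropolis for a two-state target phi = (0.4, 0.6).
import random

phi = [0.4, 0.6]
Q = [[0.5, 0.5],
     [0.5, 0.5]]                    # candidate-generating matrix (assumed)

def mh_step(i):
    j = random.choices([0, 1], weights=Q[i])[0]    # propose E' from q(E_t, .)
    a = (phi[j] * Q[j][i]) / (phi[i] * Q[i][j])    # acceptance ratio
    return j if random.random() < a else i         # accept w.p. min(1, a)

state, visits = 0, [0, 0]
for _ in range(100_000):
    state = mh_step(state)
    visits[state] += 1

print([v / sum(visits) for v in visits])   # approx (0.4, 0.6)
```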
Gibbs Sampling Definitions • Let Y = (Y_1, Y_2, …, Y_d) be the random vector of interest • Let φ(y) be the distribution of Y • Assume each component Y_k takes only finitely many values • We define a Markov chain whose states are the possible values of Y
Gibbs Sampling Process • Enumerate the possible vectors in some order • 1, 2, …, s • Let vector j correspond to the jth state in the chain • p_{ij}: • 0 if vectors i & j differ in more than 1 component • If they differ in at most 1 component, say the first, p_{ij} is proportional to the conditional probability that the first component takes its new value y_1*, given the values of the other components
Gibbs Sampling • Assume a joint distribution p(X, Y) • Looking to sample k values of X • Begin with an initial value y_0 • Sample x_i using p(X | Y = y_{i−1}) • Once x_i is found, use it to find y_i • by sampling from p(Y | X = x_i) • Repeat k times
Gibbs Sampling • Allows us to deal with univariate conditional distributions • Instead of complex joint distributions • The chain has stationary distribution equal to the target joint distribution (see the sketch below)
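A minimal sketch of the two-variable Gibbs loop just described. The target, a standard bivariate normal with correlation ρ and conditionals N(ρ·y, 1 − ρ²), is an illustrative assumption, since the slides do not fix a particular joint distribution.

```python
# Sketch: two-variable Gibbs sampling, alternating draws from the
# univariate conditionals of an (assumed) bivariate normal target.
import random

rho = 0.8
x, y = 0.0, 0.0
samples = []

for _ in range(10_000):
    # p(X | Y = y) for a standard bivariate normal: N(rho*y, 1 - rho^2)
    x = random.gauss(rho * y, (1 - rho ** 2) ** 0.5)
    # p(Y | X = x): N(rho*x, 1 - rho^2)
    y = random.gauss(rho * x, (1 - rho ** 2) ** 0.5)
    samples.append((x, y))

# The empirical correlation of the draws should be close to rho.
n = len(samples)
mx = sum(s[0] for s in samples) / n
my = sum(s[1] for s in samples) / n
cov = sum((s[0] - mx) * (s[1] - my) for s in samples) / n
vx = sum((s[0] - mx) ** 2 for s in samples) / n
vy = sum((s[1] - my) ** 2 for s in samples) / n
print(cov / (vx * vy) ** 0.5)
```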
Why is this Hastings-Metropolis? • If we define the candidate probabilities q_{ij} to be the Gibbs conditional probabilities above • Can see that for Gibbs: φ_i q_{ij} = φ_j q_{ji} • So the acceptance ratio a is always 1, and every proposed move is accepted
Simulated Annealing • Goal: find the (approximate) minimum of some positive function f • The function is defined on an extremely large number of states, s • And we want to find those states where the function is minimized • The value of the function for state E_j is f(E_j)
Simulated Annealing Process • Construct a neighborhood of each state • The set of states "close" to that state • The variable in the Markov chain can move to a neighbor in one step • Moves outside the neighborhood are not allowed
Simulated Annealing • Requirements of neighborhoods: • If E_m is in the neighborhood of E_j, then E_j is in the neighborhood of E_m • The number of states in a neighborhood (N) is the same for every state • Neighborhoods are linked so that the chain can eventually make it from any E_j to any E_m • If in state E_j, the next move must be within the neighborhood of E_j
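A minimal simulated-annealing sketch satisfying these neighborhood requirements: states 0..99 arranged in a ring, each with the two neighbors {j−1, j+1} (symmetric, fixed size N = 2, and linked). The function f and the cooling schedule are illustrative assumptions.

```python
# Sketch: simulated annealing on a toy ring of states.
import math
import random

S = 100

def f(j):
    # Arbitrary positive function with local minima (illustrative only).
    return abs(j - 63) + 10 + 10 * math.cos(j / 3)

def neighbors(j):
    # Symmetric, fixed-size neighborhoods linking every pair of states.
    return [(j - 1) % S, (j + 1) % S]

state = random.randrange(S)
best = state
T = 10.0                                   # initial "temperature"
for step in range(20_000):
    cand = random.choice(neighbors(state)) # moves outside the neighborhood
                                           # are not allowed
    delta = f(cand) - f(state)
    # Accept downhill moves always; uphill moves w.p. exp(-delta / T).
    if delta <= 0 or random.random() < math.exp(-delta / T):
        state = cand
    if f(state) < f(best):
        best = state
    T = max(1e-3, T * 0.9995)              # cool the temperature gradually

print(best, f(best))                       # approximate minimizer and value
```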