Capacity of Finite-State Channels: Lyapunov Exponents and Shannon Entropy
Tim Holliday, Peter Glynn, Andrea Goldsmith
Stanford University
Introduction
• We show that the entropies H(X), H(Y), H(X,Y), and H(Y|X) for finite-state Markov channels are Lyapunov exponents
• This result provides an explicit connection between dynamical systems theory and information theory
• It also clarifies information-theoretic connections to hidden Markov models
• This allows novel proof techniques from other fields to be applied to information theory problems
Finite-State Channels
• The channel state Zn ∈ {c0, c1, …, cd} is a Markov chain with transition matrix R(cj, ck)
• States correspond to distributions on the input/output symbols: P(Xn = x, Yn = y) = q(x, y | zn, zn+1)
• Commonly used to model ISI channels, magnetic recording channels, etc.
[Figure: channel state-transition diagram over states c0, c1, c2, c3 with transitions labeled R(c0, c2) and R(c1, c3)]
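As a concrete illustration (not from the talk), the sketch below sets up a two-state Gilbert-Elliott-style instance of this model in Python: a state transition matrix R, per-state joint symbol distributions q, and the stationary distribution μ. The state labels, crossover probabilities, and the simplification that symbols depend only on the current state are assumptions made here.

```python
import numpy as np

# Channel states: 0 = "good", 1 = "bad" (illustrative two-state chain).
R = np.array([[0.99, 0.01],   # R[c0, c1] = P(Z_{n+1} = c1 | Z_n = c0)
              [0.10, 0.90]])

# Binary input (i.i.d. uniform) through a state-dependent BSC.
# q[z][x, y] = P(X_n = x, Y_n = y | Z_n = z); symbols here depend only on the
# current state, a special case of the general q(x, y | z_n, z_{n+1}).
def bsc_joint(eps):
    q = np.zeros((2, 2))
    for x in range(2):
        for y in range(2):
            q[x, y] = 0.5 * (eps if x != y else 1.0 - eps)
    return q

q = {0: bsc_joint(0.01),   # "good" state: low crossover probability
     1: bsc_joint(0.20)}   # "bad" state: high crossover probability

# Stationary distribution mu of the channel chain, solving mu R = mu.
evals, evecs = np.linalg.eig(R.T)
mu = np.real(evecs[:, np.argmax(np.real(evals))])
mu = mu / mu.sum()
```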
Time-Varying Channels with Memory
• We consider finite-state Markov channels with no channel state information
• Time-varying channels with finite memory induce infinite memory in the channel output
• Capacity for time-varying, infinite-memory channels is therefore defined in terms of a limit
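One standard way to write this limiting definition (a textbook form, stated here as an assumption that the limit exists, e.g. for indecomposable finite-state channels) is:

```latex
C \;=\; \lim_{n \to \infty} \frac{1}{n} \, \max_{p(x^n)} \, I\!\left(X^n ; Y^n\right)
```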
Previous Research
• Mutual information for the Gilbert-Elliott channel [Mushkin & Bar-David, 1989]
• Finite-state Markov channels with i.i.d. inputs [Goldsmith & Varaiya, 1996]
• Recent research on simulation-based computation of mutual information for finite-state channels [Arnold, Vontobel, Loeliger, Kavčić, 2001, 2002, 2003], [Pfister & Siegel, 2001, 2003]
Symbol Matrices
• For each symbol pair (x, y) ∈ X × Y define a |Z| × |Z| matrix G(x,y)
• Each element corresponds to the joint probability of the symbols and the channel transition:
  G(x,y)(c0, c1) = R(c0, c1) q(x, y | c0, c1), (c0, c1) ∈ Z × Z
• Here (c0, c1) are the channel states at times (n, n+1)
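Continuing the illustrative Python sketch above, the symbol matrices G(x,y) can be assembled directly from R and q (the helper name symbol_matrix is ours, not from the talk):

```python
# Symbol matrices: one |Z| x |Z| matrix per symbol pair, with entries
# G(x, y)[c0, c1] = R[c0, c1] * q(x, y | c0, c1).
# In the illustrative channel above, q depends only on the current state c0.
def symbol_matrix(x, y, R, q):
    nz = R.shape[0]
    Gxy = np.zeros((nz, nz))
    for c0 in range(nz):
        for c1 in range(nz):
            Gxy[c0, c1] = R[c0, c1] * q[c0][x, y]
    return Gxy

G = {(x, y): symbol_matrix(x, y, R, q) for x in range(2) for y in range(2)}

# Sanity check: summing over all symbol pairs recovers the transition matrix R.
assert np.allclose(sum(G.values()), R)
```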
Probabilities as Matrix Products
• Let μ be the stationary distribution of the channel; the joint probability of a symbol sequence is then a matrix product:
  P(X1 = x1, …, Xn = xn, Y1 = y1, …, Yn = yn) = μ G(x1,y1) G(x2,y2) ··· G(xn,yn) 1
  where 1 is the all-ones column vector
• The matrices G(x,y) are deterministic functions of the random pair (X, Y)
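A minimal sketch of this matrix-product formula in the running example (the function name sequence_probability is ours):

```python
# P(X_1..n = x_1..n, Y_1..n = y_1..n) = mu * G(x_1,y_1) * ... * G(x_n,y_n) * 1
def sequence_probability(pairs, mu, G):
    v = mu.copy()
    for (x, y) in pairs:
        v = v @ G[(x, y)]
    return v.sum()               # right-multiplication by the all-ones vector

# Example: probability that the first three symbol pairs are all (0, 0).
p3 = sequence_probability([(0, 0)] * 3, mu, G)

# Consistency check: the length-1 probabilities sum to 1.
assert np.isclose(sum(sequence_probability([(x, y)], mu, G)
                      for x in range(2) for y in range(2)), 1.0)
```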
Entropy as a Lyapunov Exponent
• The Shannon entropy H(X,Y) is equivalent to the Lyapunov exponent of the random matrix product formed from G(X,Y)
• Similar expressions exist for H(X), H(Y), and H(Y|X)
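For reference, the Lyapunov exponent of this random matrix product is the Furstenberg-Kesten limit below; identifying it with -H(X,Y) assumes natural logarithms (entropy in nats) and the sign convention implied by the growth-rate interpretation on the next slide:

```latex
\lambda(X,Y) \;=\; \lim_{n\to\infty} \frac{1}{n}\,
  \log \bigl\| G_{(X_1,Y_1)} G_{(X_2,Y_2)} \cdots G_{(X_n,Y_n)} \bigr\|
  \;=\; -\,H(X,Y)
```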
Growth Rate Interpretation
• The typical set An is the set of sequences x1, …, xn satisfying
  2^{-n(H(X)+ε)} ≤ P(x1, …, xn) ≤ 2^{-n(H(X)-ε)}
• By the AEP, P(An) > 1 - ε for sufficiently large n
• The Lyapunov exponent is the average rate of growth of the probability of a typical sequence
• In order to compute λ(X) we need information about the "direction" of the system
Lyapunov Direction Vector
• The vector pn is the "direction" associated with λ(X) for any initial distribution μ:
  pn = P(Zn+1 | X1, …, Xn) = μ GX1 GX2 ··· GXn / || μ GX1 GX2 ··· GXn ||
• It also defines the conditional channel state probability
• The vector has a number of interesting properties:
  – It is the standard prediction filter in hidden Markov models
  – pn is a Markov chain if μ is the stationary distribution for the channel
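A sketch of this filter recursion in the running Python example; the choice of the 1-norm (which keeps pn a probability vector over channel states) is an assumption, and the filter is shown here driven by joint (x, y) observations, with the input-only version obtained by using the marginal matrices GX(x) = Σy G(x,y) instead:

```python
# Direction vector / prediction filter:
# p_{n+1} = p_n G(x_{n+1}, y_{n+1}) / || p_n G(x_{n+1}, y_{n+1}) ||.
# With the 1-norm, p_n stays a probability vector over channel states,
# i.e. p_n = P(Z_{n+1} = . | symbols observed up to time n).
def filter_step(p, Gxy):
    v = p @ Gxy
    norm = v.sum()               # 1-norm of a nonnegative row vector
    return v / norm, norm

p = mu.copy()
for (x, y) in [(0, 0), (0, 1), (1, 1)]:     # an arbitrary short observation
    p, _ = filter_step(p, G[(x, y)])
# p now approximates P(Z_4 = . | (X,Y)_1^3) for this observation sequence.
```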
Random Perron-Frobenius Theory
• The vector p is the random Perron-Frobenius eigenvector associated with the random matrix GX
• For all n the direction vector satisfies the normalized recursion pn+1 = pn GXn+1 / || pn GXn+1 ||
• For the stationary version of p, the Lyapunov exponent we wish to compute can be written as the expectation λ(X) = E[ log || p GX || ]
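Continuing the sketch, the exponent can be estimated from a long symbol sequence by averaging the per-step log normalization factors of the filter; with the 1-norm bookkeeping above, these factors multiply out to the sequence probability, so the average converges to the Lyapunov exponent (the estimator below is ours, stated under that assumption):

```python
# Estimate a Lyapunov exponent from a symbol sequence and a family of matrices:
# lambda ~ (1/n) sum_k log || p_{k-1} G_{s_k} ||_1, with p_0 = mu.
def lyapunov_estimate(symbols, mu, mats):
    p = mu.copy()
    log_sum = 0.0
    for s in symbols:
        p, norm = filter_step(p, mats[s])
        log_sum += np.log(norm)
    return log_sum / len(symbols)        # in nats; divide by log(2) for bits

# Example: the joint entropy, H(X,Y) = -lambda(X,Y), from a sequence `pairs`
# of (x, y) symbol pairs:
#   H_joint_bits = -lyapunov_estimate(pairs, mu, G) / np.log(2)
```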
Technical Difficulties
• The Markov chain pn is not irreducible if the input/output symbols are discrete!
• Standard existence and uniqueness results cannot be applied in this setting
• We have shown that pn possesses a unique stationary distribution if the matrices GX are irreducible and aperiodic
• The proof exploits the contraction property of positive matrices
Computing Mutual Information
• Compute the Lyapunov exponents λ(X), λ(Y), and λ(X,Y) as expectations (a deterministic computation)
• Mutual information can then be expressed in terms of these exponents as
  I(X;Y) = H(X) + H(Y) - H(X,Y)
• We also prove continuity of the Lyapunov exponents as functions of the channel parameters q and R
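In the running example, the three exponents can be estimated with the same filter-based routine by also forming marginal symbol matrices, and then combined via I(X;Y) = H(X) + H(Y) - H(X,Y) (function and variable names are ours):

```python
# Marginal symbol matrices: G_X(x) = sum_y G(x,y) and G_Y(y) = sum_x G(x,y).
GX = {x: sum(G[(x, y)] for y in range(2)) for x in range(2)}
GY = {y: sum(G[(x, y)] for x in range(2)) for y in range(2)}

def mutual_information_estimate(pairs, mu, G, GX, GY):
    lam_xy = lyapunov_estimate(pairs, mu, G)                    # joint
    lam_x = lyapunov_estimate([x for (x, _) in pairs], mu, GX)  # input only
    lam_y = lyapunov_estimate([y for (_, y) in pairs], mu, GY)  # output only
    # I(X;Y) = H(X) + H(Y) - H(X,Y) = -lam_x - lam_y + lam_xy, in bits.
    return (lam_xy - lam_x - lam_y) / np.log(2)
```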
Simulation-Based Computation (Previous Work)
• Step 1: Simulate a long sequence of input/output symbols
• Step 2: Estimate the entropy from the sample entropy of the simulated sequence, e.g. Ĥn = -(1/n) log P(X1, …, Xn, Y1, …, Yn)
• Step 3: For sufficiently large n, assume that the sample-based entropy has converged
• Problems with this approach:
  – Need to characterize initialization bias and confidence intervals
  – Standard theory doesn't apply for discrete symbols
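A minimal simulation sketch for Steps 1-2 in the running example (the sampling scheme, sequence length, and seed are arbitrary choices):

```python
# Step 1: simulate the channel chain and a sequence of (x, y) symbol pairs.
def simulate(n, R, q, mu, seed=0):
    rng = np.random.default_rng(seed)
    pairs = []
    z = int(rng.choice(len(mu), p=mu))
    for _ in range(n):
        flat = q[z].ravel()                      # joint law of (x, y) given z
        idx = int(rng.choice(flat.size, p=flat))
        pairs.append((idx // 2, idx % 2))
        z = int(rng.choice(R.shape[1], p=R[z]))  # channel state transition
    return pairs

pairs = simulate(100_000, R, q, mu)

# Step 2: sample entropy of the simulated sequence (in bits).
H_joint_bits = -lyapunov_estimate(pairs, mu, G) / np.log(2)
```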
Rigorous Simulation Methodology
• We prove a new functional central limit theorem for sample entropy with discrete symbols
• A new confidence-interval methodology for simulated estimates of entropy
  – How good is our estimate?
• A method for bounding the initialization bias in sample-entropy simulations
  – How long do we have to run the simulation?
• Proofs involve techniques from stochastic processes and random matrix theory
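The talk's FCLT-based confidence intervals and bias bounds are not reproduced here; purely as a generic illustration, the sketch below forms a batch-means confidence interval from the per-step entropy increments of the running example, with burn-in, batch count, and normal critical value chosen arbitrarily:

```python
# Illustrative batch-means confidence interval for the sample entropy (bits).
# Not the FCLT-based methodology of the talk; a generic simulation-output tool.
def batch_means_ci(pairs, mu, G, n_batches=20, burn_in=1000, z_crit=1.96):
    p = mu.copy()
    increments = []                       # per-step contributions to H(X,Y)
    for (x, y) in pairs:
        p, norm = filter_step(p, G[(x, y)])
        increments.append(-np.log(norm) / np.log(2))
    body = np.array(increments[burn_in:])
    body = body[:(len(body) // n_batches) * n_batches]
    batches = body.reshape(n_batches, -1).mean(axis=1)
    half = z_crit * batches.std(ddof=1) / np.sqrt(n_batches)
    return batches.mean() - half, batches.mean() + half

lo, hi = batch_means_ci(pairs, mu, G)     # e.g. using `pairs` simulated above
```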
Computational Complexity of Lyapunov Exponents
• Lyapunov exponents are notoriously difficult to compute, regardless of the computation method
  – Even approximating them is NP-hard [Tsitsiklis 1998]
• Dynamical systems driven by random matrices typically possess poor convergence properties
  – Initial transients in simulations can linger for extremely long periods of time
Conclusions
• Lyapunov exponents are a powerful new tool for computing the mutual information of finite-state channels
• Our results permit rigorous computation, even in the case of discrete inputs and outputs
• Although the computational complexity is high, multiple computation methods are available
• The new connection between Information Theory and Dynamical Systems provides information theorists with a new set of tools to apply to challenging problems