
Capacity of Finite-State Channels: Lyapunov Exponents and Shannon Entropy


Presentation Transcript


  1. Capacity of Finite-State Channels: Lyapunov Exponents and Shannon Entropy
  Tim Holliday, Peter Glynn, Andrea Goldsmith (Stanford University)

  2. Introduction
  • We show that the entropies H(X), H(Y), H(X,Y), and H(Y|X) of finite-state Markov channels are Lyapunov exponents.
  • This result provides an explicit connection between dynamic systems theory and information theory.
  • It also clarifies the information-theoretic connections to hidden Markov models.
  • This allows novel proof techniques from other fields to be applied to information theory problems.

  3. Finite-State Channels
  [Figure: channel state diagram with states c0, c1, c2, c3 and transition probabilities R(ci, cj)]
  • The channel state Zn ∈ {c0, c1, …, cd} is a Markov chain with transition matrix R(cj, ck).
  • Each state corresponds to a distribution on the input/output symbols: P(Xn = x, Yn = y) = q(x, y | zn, zn+1).
  • Commonly used to model ISI channels, magnetic recording channels, etc.
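As a concrete illustration of this model, the sketch below sets up a two-state Gilbert-Elliott-style channel in Python: a transition matrix R, a state-dependent bit-flip probability standing in for q(x, y | zn, zn+1), and the stationary state distribution. All numerical values and names (R, eps, q, mu) are illustrative assumptions, not parameters from the talk.

```python
import numpy as np

# Channel state transition matrix R(cj, ck): state 0 = "good", state 1 = "bad".
R = np.array([[0.99, 0.01],
              [0.10, 0.90]])

# Assumed state-dependent bit-flip probability; the input X is i.i.d. uniform
# binary and the output Y is X flipped with probability eps[state].
eps = np.array([0.01, 0.20])

def q(x, y, c0, c1):
    """Joint symbol distribution q(x, y | c0, c1).

    In this toy model the symbols depend only on the current state c0,
    a special case of the general q(x, y | zn, zn+1)."""
    flip = eps[c0] if x != y else 1.0 - eps[c0]
    return 0.5 * flip

# Stationary distribution mu of the state chain (left Perron eigenvector of R).
w, v = np.linalg.eig(R.T)
mu = np.real(v[:, np.argmax(np.real(w))])
mu = mu / mu.sum()
print("stationary state distribution:", mu)   # ~ [0.909, 0.091]
```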

  4. Time-Varying Channels with Memory
  • We consider finite-state Markov channels with no channel state information.
  • Time-varying channels with finite memory induce infinite memory in the channel output.
  • Capacity for time-varying, infinite-memory channels is defined in terms of a limit.

  5. Previous Research
  • Mutual information for the Gilbert-Elliott channel [Mushkin and Bar-David, 1989]
  • Finite-state Markov channels with i.i.d. inputs [Goldsmith and Varaiya, 1996]
  • Recent research on simulation-based computation of mutual information for finite-state channels [Arnold, Vontobel, Loeliger, Kavčić, 2001, 2002, 2003]; [Pfister, Siegel, 2001, 2003]

  6. Symbol Matrices
  • For each symbol pair (x, y) ∈ X × Y, define a |Z| × |Z| matrix G(x,y).
  • Each element corresponds to the joint probability of the symbols and the channel transition: G(x,y)(c0, c1) = R(c0, c1) q(x, y | c0, c1) for all (c0, c1) ∈ Z × Z, where (c0, c1) are the channel states at times (n, n+1).

  7. Probabilities as Matrix Products
  • Let μ be the stationary distribution of the channel. The joint probability of a symbol sequence is then a matrix product: P(X1 = x1, …, Xn = xn, Y1 = y1, …, Yn = yn) = μ G(x1,y1) G(x2,y2) ⋯ G(xn,yn) 1, where 1 is the all-ones column vector.
  • The matrices G are deterministic functions of the random pair (X, Y).
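The following sketch spells out both constructions for the toy channel above: it builds the symbol matrices G(x,y) entrywise from R and q, then evaluates the joint probability of a short symbol sequence as the matrix product μ G(x1,y1) ⋯ G(xn,yn) 1. The channel parameters are the same assumed toy values as before.

```python
import numpy as np

R = np.array([[0.99, 0.01], [0.10, 0.90]])   # toy state transition matrix (assumed)
eps = np.array([0.01, 0.20])                 # assumed state-dependent error probability
mu = np.array([1 / 1.1, 0.1 / 1.1])          # stationary distribution of R

def G(x, y):
    """Symbol matrix with entries G(x,y)(c0,c1) = R(c0,c1) * q(x, y | c0, c1)."""
    out = np.zeros_like(R)
    for c0 in range(2):
        for c1 in range(2):
            flip = eps[c0] if x != y else 1.0 - eps[c0]
            out[c0, c1] = R[c0, c1] * 0.5 * flip
    return out

def joint_prob(xs, ys):
    """P(X_1..X_n = xs, Y_1..Y_n = ys) = mu * G(x1,y1) * ... * G(xn,yn) * 1."""
    row = mu.copy()
    for x, y in zip(xs, ys):
        row = row @ G(x, y)
    return row.sum()   # right-multiplication by the all-ones column vector

print(joint_prob([0, 1, 1], [0, 1, 0]))   # probability of one short symbol path
```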

  8. Entropy as a Lyapunov Exponent
  • The Shannon entropy rate is (up to sign) the Lyapunov exponent of the product of the random matrices G(X,Y): λ(X,Y) = lim n→∞ (1/n) log ‖G(X1,Y1) G(X2,Y2) ⋯ G(Xn,Yn)‖ = −H(X,Y) almost surely.
  • Similar expressions hold for H(X), H(Y), and H(Y|X).

  9. Growth Rate Interpretation
  • The typical set An is the set of sequences x1, …, xn satisfying 2^(−n(H(X)+ε)) ≤ P(x1, …, xn) ≤ 2^(−n(H(X)−ε)).
  • By the AEP, P(An) > 1 − ε for sufficiently large n.
  • The Lyapunov exponent is the average rate of growth of the probability of a typical sequence.
  • In order to compute λ(X), we need information about the "direction" of the system.

  10. Lyapunov Direction Vector
  • The vector pn is the "direction" associated with λ(X), for any initial distribution μ.
  • It also gives the conditional channel state probability.
  • The vector has a number of interesting properties:
    • It is the standard prediction filter in hidden Markov models.
    • pn is a Markov chain if μ is the stationary distribution of the channel.
  pn = P(Zn+1 | X1, …, Xn) = μ GX1 GX2 ⋯ GXn / ‖μ GX1 GX2 ⋯ GXn‖
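A minimal Python sketch of the same recursion, assuming the toy channel above. The slide states the filter for the input process X alone; the recursion is identical for Y or for the pair (X,Y), and the pair version is used here because the output actually carries state information in the toy model. The per-step log normalizing constants sum to log P of the observed path, which is what later slides average to obtain the Lyapunov exponent.

```python
import numpy as np

R = np.array([[0.99, 0.01], [0.10, 0.90]])   # toy transition matrix (assumed)
eps = np.array([0.01, 0.20])                 # assumed state-dependent error probability
mu = np.array([1 / 1.1, 0.1 / 1.1])          # stationary distribution of R

def G(x, y):
    """Symbol matrix G(x,y)(c0,c1) = R(c0,c1) * q(x, y | c0, c1) for the toy channel."""
    out = np.zeros_like(R)
    for c0 in range(2):
        for c1 in range(2):
            flip = eps[c0] if x != y else 1.0 - eps[c0]
            out[c0, c1] = R[c0, c1] * 0.5 * flip
    return out

def filter_step(p, x, y):
    """One step of the prediction filter:  p_new = p G(x,y) / ||p G(x,y)||.

    p is the conditional distribution of the next channel state given the
    symbols seen so far; the log of the normalizing constant is the per-step
    increment of log P(symbols)."""
    v = p @ G(x, y)
    norm = v.sum()        # l1 norm, since all entries are nonnegative
    return v / norm, np.log(norm)

# Push a short symbol sequence through the filter.
p, log_prob = mu.copy(), 0.0
for x, y in zip([0, 1, 1, 0, 0], [0, 1, 0, 0, 0]):
    p, log_norm = filter_step(p, x, y)
    log_prob += log_norm
print("direction vector p_5:", p)
print("log P(symbol path): ", log_prob)
```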

  11. Random Perron-Frobenius Theory
  • The vector p is the random Perron-Frobenius eigenvector associated with the random matrix GX.
  • For all n we have pn+1 = pn GXn+1 / ‖pn GXn+1‖.
  • For the stationary version of p, the same relation holds in distribution: p is a distributional fixed point of the map p → p GX / ‖p GX‖.
  • The Lyapunov exponent we wish to compute is λ(X) = E[ log ‖p GX‖ ], with (p, X) drawn from the stationary distribution.

  12. Technical Difficulties
  • The Markov chain pn is not irreducible if the input/output symbols are discrete!
  • Standard existence and uniqueness results cannot be applied in this setting.
  • We have shown that pn possesses a unique stationary distribution if the matrices GX are irreducible and aperiodic.
  • The proof exploits the contraction property of positive matrices.

  13. Computing Mutual Information
  • Compute the Lyapunov exponents λ(X), λ(Y), and λ(X,Y) as expectations (a deterministic computation).
  • Then the mutual information can be expressed as I(X;Y) = H(X) + H(Y) − H(X,Y), with each entropy obtained from the corresponding Lyapunov exponent.
  • We also prove continuity of the Lyapunov exponents in the channel parameters q and R; hence the mutual information is continuous in these parameters as well.

  14. Simulation-Based Computation (Previous Work)
  • Step 1: Simulate a long sequence of input/output symbols.
  • Step 2: Estimate the entropy using the sample entropy −(1/n) log P(X1, …, Xn, Y1, …, Yn).
  • Step 3: For sufficiently large n, assume that the sample-based entropy has converged.
  • Problems with this approach:
    • Need to characterize the initialization bias and confidence intervals.
    • Standard theory doesn't apply for discrete symbols.
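Putting the previous pieces together, here is a sketch of this simulation-based estimate for the toy channel: simulate the state chain and the symbol pairs, run the normalized matrix product, and take minus the average per-step log normalizer as the estimate of H(X,Y) (in nats). All parameters are the assumed toy values, and this is only the naive Step 1-3 procedure, not the bias and confidence-interval analysis developed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
R = np.array([[0.99, 0.01], [0.10, 0.90]])   # toy transition matrix (assumed)
eps = np.array([0.01, 0.20])                 # assumed state-dependent error probability
mu = np.array([1 / 1.1, 0.1 / 1.1])          # stationary distribution of R

def G(x, y):
    """Symbol matrix G(x,y)(c0,c1) = R(c0,c1) * q(x, y | c0, c1) for the toy channel."""
    out = np.zeros_like(R)
    for c0 in range(2):
        for c1 in range(2):
            flip = eps[c0] if x != y else 1.0 - eps[c0]
            out[c0, c1] = R[c0, c1] * 0.5 * flip
    return out

def simulate(n):
    """Step 1: simulate the channel state chain and n input/output symbol pairs."""
    z = rng.choice(2, p=mu)
    xs, ys = [], []
    for _ in range(n):
        x = int(rng.integers(2))                    # i.i.d. uniform binary input
        y = x ^ int(rng.random() < eps[z])          # flip with state-dependent probability
        xs.append(x)
        ys.append(y)
        z = rng.choice(2, p=R[z])                   # channel state transition
    return xs, ys

def sample_entropy(xs, ys):
    """Step 2: -(1/n) log P(x_1..x_n, y_1..y_n), via the normalized matrix product."""
    p, log_prob = mu.copy(), 0.0
    for x, y in zip(xs, ys):
        v = p @ G(x, y)
        norm = v.sum()
        log_prob += np.log(norm)
        p = v / norm
    return -log_prob / len(xs)

xs, ys = simulate(200_000)
print("sample estimate of H(X,Y) in nats:", sample_entropy(xs, ys))
```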

  15. Simulation Traces for Computation of H(X,Y)

  16. Rigorous Simulation Methodology
  • We prove a new functional central limit theorem for sample entropy with discrete symbols.
  • A new confidence interval methodology for simulated estimates of entropy:
    • How good is our estimate?
  • A method for bounding the initialization bias in sample entropy simulations:
    • How long do we have to run the simulation?
  • Proofs involve techniques from stochastic processes and random matrix theory.
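For comparison only, below is a generic batch-means confidence interval applied to the per-step log normalizers from the filter recursion. This is a standard simulation-output-analysis device, not the functional-CLT-based methodology of the talk (which is what rigorously justifies such intervals in the discrete-symbol case); the function name and batch count are illustrative choices.

```python
import numpy as np

def batch_means_ci(log_norms, num_batches=20, z=1.96):
    """Approximate 95% confidence interval for the entropy rate, from the
    per-step increments log ||p_{k-1} G(x_k, y_k)|| of the filter recursion.

    The mean of the increments estimates the Lyapunov exponent; its negative
    estimates the entropy rate H."""
    inc = np.asarray(log_norms, dtype=float)
    n = (len(inc) // num_batches) * num_batches          # drop the ragged tail
    batch_avgs = inc[:n].reshape(num_batches, -1).mean(axis=1)
    m = batch_avgs.mean()
    se = batch_avgs.std(ddof=1) / np.sqrt(num_batches)
    return -m, (-(m + z * se), -(m - z * se))            # point estimate and CI for H

# Usage: collect the log normalizers while running the filter over a simulated
# path, then report batch_means_ci(log_norms).
```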

  17. Computational Complexity of Lyapunov Exponents
  • Lyapunov exponents are notoriously difficult to compute, regardless of the computation method.
    • NP-complete problem [Tsitsiklis 1998]
  • Dynamic systems driven by random matrices typically possess poor convergence properties.
    • Initial transients in simulations can linger for extremely long periods of time.

  18. Conclusions
  • Lyapunov exponents are a powerful new tool for computing the mutual information of finite-state channels.
  • The results permit rigorous computation, even in the case of discrete inputs and outputs.
  • Computational complexity is high, but multiple computation methods are available.
  • The new connection between Information Theory and Dynamic Systems provides information theorists with a new set of tools to apply to challenging problems.
