Chapter 5 A Measure of Information
Outline • 5.1 Axioms for the uncertainty measure • 5.2 Two Interpretations of the uncertainty function • 5.3 Properties of the uncertainty function • 5.4 Entropy and Coding • 5.5 Shannon-Fano Coding
5.1 Axioms for the uncertainty measure • X: a discrete random variable taking the values x1, x2, ..., xM with probabilities p1, p2, ..., pM • h(p): the uncertainty of an event with probability p • h(pi): the uncertainty of the event {X = xi} • The average uncertainty of X: H(p1, ..., pM) = Σ pi h(pi) • If p1 = p2 = ... = pM = 1/M, we write f(M) = H(1/M, ..., 1/M)
Axiom 1: f(M) should be a monotonically increasing function of M; that is, M < M' implies f(M) < f(M'). For example, f(2) < f(6). • Axiom 2: Let X take M equally likely values (x1, ..., xM) and let Y, independent of X, take L equally likely values (y1, ..., yL). The joint experiment (X, Y) then has M·L equally likely outcomes, and we require f(M·L) = f(M) + f(L).
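As a preview of Theorem 5.1 below, a quick numerical check (a minimal sketch; C = 1 and base 2 are arbitrary illustrative choices) that f(M) = C·log M satisfies Axioms 1 and 2:

```python
import math

C = 1.0                                     # any C > 0 works; base 2 is also an arbitrary choice
f = lambda M: C * math.log2(M)

print(f(2) < f(6))                          # True: f is monotonically increasing (Axiom 1)
print(math.isclose(f(4 * 8), f(4) + f(8)))  # True: f(M*L) = f(M) + f(L) (Axiom 2)
```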
Axiom 3 (Group Axiom): Let X = (x1, x2, ..., xr, xr+1, ..., xM) and construct a compound experiment: first choose between group A = {x1, ..., xr} and group B = {xr+1, ..., xM}, then choose an outcome within the selected group. With pA = p1 + ... + pr and pB = pr+1 + ... + pM, the axiom requires H(p1, ..., pM) = H(pA, pB) + pA·H(p1/pA, ..., pr/pA) + pB·H(pr+1/pB, ..., pM/pB).
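A numerical check of the group axiom, using the H function that Theorem 5.1 below identifies (a minimal sketch; the five probabilities are illustrative values, reused from the example in Section 5.2):

```python
import math

def H(probs):
    """Uncertainty in bits: H(p1, ..., pM) = -sum(pi * log2(pi))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = [0.3, 0.2, 0.2, 0.15, 0.15]            # illustrative distribution
A, B = p[:2], p[2:]                        # group A = {x1, x2}, group B = {x3, x4, x5}
pA, pB = sum(A), sum(B)

lhs = H(p)
rhs = H([pA, pB]) + pA * H([x / pA for x in A]) + pB * H([x / pB for x in B])
print(round(lhs, 4), round(rhs, 4))        # both ~2.271: the two-stage description agrees
```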
Axiom 4: H(p, 1-p) is a continuous function of p, i.e., a small change in p corresponds to a small change in uncertainty. • We can use the four axioms above to find the H function. • Thm 5.1: The only function satisfying the four given axioms is H(p1, ..., pM) = -C Σ pi log pi, where C > 0 and the logarithm base is > 1.
For example, take C = 1 and base 2, and consider a coin with outcomes {tail, head}. [Plot of H(p, 1-p) versus p: the curve rises from 0 at p = 0 to its maximum of 1 (maximum uncertainty) at p = 1/2, then falls back to 0 (minimum uncertainty) at p = 1.]
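A small sketch tabulating the curve just described (C = 1, base 2; the sample points are arbitrary):

```python
import math

def H2(p):
    """Binary entropy H(p, 1 - p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"p = {p:.1f}   H(p, 1-p) = {H2(p):.4f}")
# Maximum uncertainty (1 bit) at p = 0.5; minimum (0) at p = 0 and p = 1.
```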
5.2 Two Interpretations of the uncertainty function • (1) H(p1, ..., pM) may be interpreted as the expectation of a random variable W = w(X), where w(xi) = log(1/pi); that is, H(p1, ..., pM) = E[W] = Σ pi log(1/pi).
(2) H(p1, ..., pM) may be interpreted as the minimum average number of 'yes/no' questions required to specify the value of X. For example, H(X) = H(0.3, 0.2, 0.2, 0.15, 0.15) = 2.27 for the outcomes x1, x2, x3, x4, x5. Question tree: first ask 'Does x = x1 or x2?'; if yes, ask 'Is x = x1?' (so x1 and x2 each take 2 questions); if no, ask 'Is x = x3?' (x3 takes 2 questions), and if not, ask 'Is x = x4?' (x4 and x5 each take 3 questions).
Avg # of questions = 2(0.7) + 3(0.3) = 2.3 > 2.27. • H.W.: X = {x1, x2}, p(x1) = 0.7, p(x2) = 0.3. How many questions (on average) are required to specify the outcome of a joint experiment involving 2 independent observations of X?
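A sketch of the computation above (the probabilities and the per-outcome question counts 2, 2, 2, 3, 3 are read off the example and its question tree):

```python
import math

p = {"x1": 0.3, "x2": 0.2, "x3": 0.2, "x4": 0.15, "x5": 0.15}
questions = {"x1": 2, "x2": 2, "x3": 2, "x4": 3, "x5": 3}   # depth of each leaf in the tree

H = -sum(q * math.log2(q) for q in p.values())
avg_questions = sum(p[x] * questions[x] for x in p)

print(f"H(X) = {H:.2f} bits")                      # 2.27
print(f"average questions = {avg_questions:.2f}")  # 2.30, slightly above H(X)
```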
5.3 Properties of the uncertainty function • Lemma 5.2: Let p1, ..., pM and q1, ..., qM be arbitrary positive numbers with Σ pi = Σ qi = 1. Then -Σ pi log pi ≤ -Σ pi log qi, with equality iff pi = qi for all i. The proof uses the inequality ln x ≤ x - 1. [Plot: the line y = x - 1 lies above the curve y = ln x, touching it only at x = 1.]
Thm 5.3: H(p1, ..., pM) ≤ log M, with equality iff pi = 1/M for all i.
5.4 Entropy and Coding • Noiseless Coding Theorem • Source X takes values x1, x2, ..., xM with probabilities p1, p2, ..., pM; value xi is assigned codeword wi of length ni. • Objective: minimize the average codeword length n̄ = Σ pi ni. • Code alphabet: {a1, a2, ..., aD}; e.g., D = 2 gives {0, 1}.
Thm (Noiseless Coding Thm): If n̄ = Σ pi ni is the average codeword length of a uniquely decodable code for X, then n̄ ≥ H_D(X), with equality iff pi = D^(-ni) for i = 1, 2, …, M. • Note: H_D(X) = Σ pi log_D(1/pi) is the uncertainty of X computed by using the base D.
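A hedged illustration of the bound; the source distribution and the prefix-code lengths below are made-up choices, not taken from the slides:

```python
import math

D = 2                                      # binary code alphabet {0, 1}
p = [0.4, 0.3, 0.2, 0.1]                   # assumed source probabilities
lengths = [1, 2, 3, 3]                     # lengths of a prefix code such as 0, 10, 110, 111

kraft = sum(D ** -n for n in lengths)      # <= 1 for any uniquely decodable code
n_bar = sum(pi * ni for pi, ni in zip(p, lengths))
H_D = sum(pi * math.log(1 / pi, D) for pi in p)

print(kraft)                               # 1.0
print(n_bar, H_D)                          # 1.9 >= 1.846...; equality would require pi = D**-ni
```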
A code is called "absolutely optimal" if it achieves the lower bound n̄ = H_D(X) of the noiseless coding thm. • Ex. A source with H(X) = 7/4 admits an absolutely optimal code with n̄ = 7/4 (see the sketch below).
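The slide records only the value H(X) = 7/4; one source consistent with it (assumed here for illustration) is the dyadic distribution below, for which the code 0, 10, 110, 111 achieves the bound exactly:

```python
import math

p = [1/2, 1/4, 1/8, 1/8]                   # assumed dyadic distribution with H(X) = 7/4
code = ["0", "10", "110", "111"]           # prefix code with lengths 1, 2, 3, 3

H = -sum(pi * math.log2(pi) for pi in p)
n_bar = sum(pi * len(w) for pi, w in zip(p, code))

print(H, n_bar)                            # both 1.75: the lower bound is achieved
# Equality holds because every pi is an exact power of 1/2 (pi = 2**-ni).
```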
5.5 Shannon-Fano Coding • Select the integers ni such that log_D(1/pi) ≤ ni < log_D(1/pi) + 1, i.e., ni = ⌈log_D(1/pi)⌉. Multiplying by pi and summing over i gives H_D(X) ≤ n̄ < H_D(X) + 1. => An instantaneous code can be constructed with the lengths n1, n2, …, nM obtained from Shannon-Fano coding, because these lengths satisfy the Kraft inequality Σ D^(-ni) ≤ 1.
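A minimal sketch of the Shannon-Fano length selection ni = ⌈log_D(1/pi)⌉, with checks of the Kraft inequality and of H_D(X) ≤ n̄ < H_D(X) + 1 (the distribution is an illustrative choice):

```python
import math

def shannon_fano_lengths(p, D=2):
    """Choose n_i with log_D(1/p_i) <= n_i < log_D(1/p_i) + 1."""
    return [math.ceil(math.log(1 / pi, D)) for pi in p]

p = [0.4, 0.3, 0.2, 0.1]                   # illustrative distribution
lengths = shannon_fano_lengths(p)          # [2, 2, 3, 4]

kraft = sum(2 ** -n for n in lengths)      # 0.6875 <= 1, so an instantaneous code exists
n_bar = sum(pi * ni for pi, ni in zip(p, lengths))
H = -sum(pi * math.log2(pi) for pi in p)

print(lengths, kraft)
print(H, n_bar, H + 1)                     # 1.846... <= 2.4 < 2.846...
```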
In fact, we can always approach the lower bound as closely as desired if we are allowed to use "block coding". • Take a series of s observations of X and let Y = (x1, x2, …, xs). Assign a codeword to Y: since the observations are independent, H_D(Y) = s·H_D(X), and Shannon-Fano coding of Y gives s·H_D(X) ≤ n̄ < s·H_D(X) + 1, i.e., fewer than H_D(X) + 1/s code symbols per value of X. => Block coding decreases the average codeword length per value of X.
Ex. X = {x1, x2} with p(x1) = 0.7, p(x2) = 0.3. Any code that encodes one value of X at a time needs n̄ ≥ 1 code symbol per value, but H(X) = H(0.3) = 0.88129 (the binary entropy H(p) at p = 0.3 or p = 0.7, read from a look-up table), so block coding is needed to approach the bound (see the sketch below).
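A sketch of the block-coding effect for this source: Shannon-Fano lengths are applied to blocks of s independent observations, and the resulting rate per value of X is compared with the guarantee H(X) + 1/s (the helper function is illustrative, not from the slides):

```python
import math
from itertools import product

p = {"x1": 0.7, "x2": 0.3}                      # the source from the example
H = -sum(q * math.log2(q) for q in p.values())  # 0.88129 bits

def per_symbol_rate(s):
    """Shannon-Fano code the block Y = (X_1, ..., X_s); return average bits per value of X."""
    block_probs = [math.prod(p[x] for x in block) for block in product(p, repeat=s)]
    n_bar = sum(q * math.ceil(math.log2(1 / q)) for q in block_probs)
    return n_bar / s

for s in (1, 2, 3, 4):
    # Guarantee per value of X: H(X) <= n_bar/s < H(X) + 1/s, so large s pushes the rate toward H(X).
    print(s, round(per_symbol_rate(s), 4), "<", round(H + 1 / s, 4))
```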
How do we find the actual code symbols? • We simply assign them in order: S-F coding gives the lengths n1, n2, ..., nM, and we then assign to each xi a codeword of length ni, taking the codewords in order (shortest first) so that no codeword is a prefix of an earlier one. This is always possible because the Shannon-Fano lengths satisfy the Kraft inequality (one such assignment is sketched below).
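One standard way to "assign them in order" is the canonical assignment sketched below (an illustration, not necessarily the exact table on the original slide); it works for any lengths that satisfy the Kraft inequality:

```python
def assign_codewords(lengths):
    """Canonical binary assignment: shorter codewords first, each the next unused prefix-free string."""
    codewords = []
    code, prev_len = 0, 0
    for n in sorted(lengths):                 # assign in order of increasing length
        code <<= (n - prev_len)               # pad the running counter out to the new length
        codewords.append(format(code, f"0{n}b"))
        code += 1
        prev_len = n
    return codewords

print(assign_codewords([2, 2, 3, 4]))         # ['00', '01', '100', '1010'], a prefix code
```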