The Relations between Information Theory and Set Theory: I-Measure
Recall Shannon’s Information Measures
For random variables X, Y and Z, we have by definition:
• $H(X) = -\sum_{x} p(x)\log p(x) = -E[\log p(X)]$, that is, the negative of the expectation of $\log p(X)$.
• $H(X,Y) = -\sum_{x,y} p(x,y)\log p(x,y) = -E[\log p(X,Y)]$
• $H(Y|X) = -\sum_{x,y} p(x,y)\log p(y|x) = -E[\log p(Y|X)]$
These measures are, respectively, the entropy of X, the joint entropy of X and Y, and the conditional entropy of Y given X.
$I(X;Y) = \sum_{x,y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)} = E\left[\log\frac{p(X,Y)}{p(X)p(Y)}\right]$ is the mutual information between random variables X and Y.
$I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = I(Y;X)$: the mutual information between two random variables is symmetric.
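The definitions above can be checked numerically on a small example. The following is a minimal sketch, assuming a made-up joint pmf of two binary variables and base-2 logarithms; the helper names (`entropy`, `marginal`, `conditional_entropy`, `mutual_information`) are illustrative and do not come from the slides.

```python
import math
from collections import defaultdict

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, axis):
    """Marginal pmf of coordinate `axis` from a joint pmf keyed by (x, y)."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[axis]] += p
    return dict(out)

def conditional_entropy(joint):
    """H(Y|X) = -sum p(x,y) log2 p(y|x), straight from the definition."""
    px = marginal(joint, 0)
    return -sum(p * math.log2(p / px[x]) for (x, _), p in joint.items() if p > 0)

def mutual_information(joint):
    """I(X;Y) = sum p(x,y) log2[p(x,y) / (p(x) p(y))]."""
    px, py = marginal(joint, 0), marginal(joint, 1)
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical joint pmf of two binary variables (values chosen for illustration).
joint_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}
joint_yx = {(y, x): p for (x, y), p in joint_xy.items()}  # roles of X and Y swapped

h_x = entropy(marginal(joint_xy, 0))
h_y = entropy(marginal(joint_xy, 1))
h_xy = entropy(joint_xy)
h_y_given_x = conditional_entropy(joint_xy)
i_xy = mutual_information(joint_xy)
i_yx = mutual_information(joint_yx)

print(f"H(X,Y) = {h_xy:.4f}, H(Y|X) = {h_y_given_x:.4f}, I(X;Y) = {i_xy:.4f}")
```

On this example the two orderings give the same mutual information, and it agrees with $H(Y) - H(Y|X)$, as the symmetry statement requires.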
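The Chain Rules
• Chain rules for entropy, conditional entropy and mutual information of random variables:
$H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i | X_1, \ldots, X_{i-1})$
$H(X_1, X_2, \ldots, X_n | Y) = \sum_{i=1}^{n} H(X_i | X_1, \ldots, X_{i-1}, Y)$
$I(X_1, X_2, \ldots, X_n; Y) = \sum_{i=1}^{n} I(X_i; Y | X_1, \ldots, X_{i-1})$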
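As a sanity check, the chain rule for entropy can be verified numerically for n = 3. This is a sketch under stated assumptions: the joint pmf is drawn at random purely for illustration, logarithms are base 2, and the helpers `marginal`, `entropy` and `cond_entropy` are hypothetical names. The conditional entropies are computed from their definition rather than from the chain rule itself, so the comparison is not circular.

```python
import math
import random
from collections import defaultdict
from itertools import product

def marginal(joint, idxs):
    """Marginal pmf of the coordinates listed in idxs, keyed by tuples."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idxs)] += p
    return out

def entropy(joint, idxs):
    """Joint entropy in bits of the coordinates listed in idxs."""
    return -sum(p * math.log2(p) for p in marginal(joint, idxs).values() if p > 0)

def cond_entropy(joint, target, given):
    """H(X_target | X_given) = -E[log2 p(x_target | x_given)], from the definition."""
    both = marginal(joint, list(target) + list(given))
    cond = marginal(joint, list(given))
    k = len(target)
    return -sum(p * math.log2(p / cond[key[k:]]) for key, p in both.items() if p > 0)

# Hypothetical joint pmf of three binary variables, drawn at random for illustration.
random.seed(0)
weights = [random.random() for _ in range(8)]
total = sum(weights)
joint = {outcome: w / total
         for outcome, w in zip(product((0, 1), repeat=3), weights)}

lhs = entropy(joint, [0, 1, 2])                                   # H(X1, X2, X3)
rhs = sum(cond_entropy(joint, [i], range(i)) for i in range(3))   # sum of H(Xi | X1..X(i-1))

print(f"H(X1,X2,X3) = {lhs:.6f}, chain-rule sum = {rhs:.6f}")
```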
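Markov Chains
• Random variables $X_1, X_2, \ldots, X_n$ form a Markov chain $X_1 \to X_2 \to \cdots \to X_n$ if
$P(X_1, X_2, \ldots, X_n) = P(X_1)P(X_2|X_1)\cdots P(X_n|X_{n-1})$ (i)
This implies that
$P(X_n | X_1, \ldots, X_{n-1}) = \frac{P(X_1, \ldots, X_n)}{P(X_1, \ldots, X_{n-1})} = \frac{P(X_1)P(X_2|X_1)\cdots P(X_n|X_{n-1})}{P(X_1)P(X_2|X_1)\cdots P(X_{n-1}|X_{n-2})}$ by (i)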
$P(X_n | X_1, \ldots, X_{n-1}) = P(X_n | X_{n-1})$ (ii)
This shows that the Markov chain is memoryless: the realisation of $X_n$ conditioned on $X_1, \ldots, X_{n-1}$ depends only on $X_{n-1}$.
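The step from (i) to (ii) can be illustrated on a three-variable chain. The sketch below assumes hypothetical transition tables for binary variables (the numbers are invented for illustration): it builds the joint distribution from the factorisation (i) and then recovers $P(X_3 | X_1, X_2)$ from the joint to compare it with $P(X_3 | X_2)$.

```python
from itertools import product

# Hypothetical distributions for a binary chain X1 -> X2 -> X3.
p1 = {0: 0.3, 1: 0.7}
p2_given_1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p2_given_1[x1][x2]
p3_given_2 = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}  # p3_given_2[x2][x3]

# Joint distribution built from the factorisation (i).
joint = {(a, b, c): p1[a] * p2_given_1[a][b] * p3_given_2[b][c]
         for a, b, c in product((0, 1), repeat=3)}

def conditional_x3(joint, a, b, c):
    """P(X3=c | X1=a, X2=b), computed from the joint as P(a,b,c) / P(a,b)."""
    p_ab = sum(joint[(a, b, k)] for k in (0, 1))
    return joint[(a, b, c)] / p_ab

# Largest deviation between P(x3 | x1, x2) and P(x3 | x2) over all outcomes.
max_gap = max(abs(conditional_x3(joint, a, b, c) - p3_given_2[b][c])
              for a, b, c in product((0, 1), repeat=3))

print(f"max |P(x3|x1,x2) - P(x3|x2)| = {max_gap:.2e}")
```

The gap is zero up to floating-point rounding, whatever value $X_1$ takes.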
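A Signed Measure Function and Its Properties
A signed measure $\lambda$ is a function that maps the elements (sets) of a sigma-algebra S (on a universal set Ω) into the real numbers such that, for any sequence of pairwise disjoint sets $E_1, E_2, \ldots$ in S, the infinite series $\sum_{i=1}^{\infty} \lambda(E_i)$ is convergent and $\lambda\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} \lambda(E_i)$. As a consequence, $\lambda(\emptyset) = 0$, $\lambda\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} \lambda(E_i)$ for pairwise disjoint $E_1, \ldots, E_n$, and for elements F and E of S with $E \subseteq F$ we have $\lambda(F - E) = \lambda(F) - \lambda(E)$.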
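A finite toy example may make these properties concrete. The sketch below assumes a four-point universal set with invented signed point weights and defines $\lambda$ of a subset as the sum of its weights; the names `weight` and `lam` are hypothetical. It checks additivity on disjoint sets and the difference rule for a set E contained in F.

```python
# Hypothetical signed weights on the four points of a finite universal set.
weight = {"a": 1.5, "b": -0.5, "c": 2.0, "d": -1.0}

def lam(subset):
    """Signed measure of a subset: the sum of the signed point weights."""
    return sum(weight[w] for w in subset)

E = {"a", "b"}
F = {"a", "b", "c"}      # E is a subset of F
D = {"c", "d"}           # D is disjoint from E

empty_value = lam(set())                 # measure of the empty set
union_value = lam(E | D)                 # measure of a disjoint union
difference_value = lam(F - E)            # measure of a set difference

print(empty_value, union_value, lam(E) + lam(D), difference_value, lam(F) - lam(E))
```

Unlike an ordinary measure, individual values here can be negative, for instance `lam({"b", "d"})`.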
I-Measure and Signed Measure
To construct a one-to-one correspondence between information theory and set theory, we associate with each random variable X an abstract set X'. So, for random variables $X_1, X_2, \ldots, X_n$, we associate respectively the sets $X'_1, X'_2, \ldots, X'_n$. The collection of sets obtained by applying any sequence of the usual set operations (union, intersection and difference) to $X'_1, X'_2, \ldots, X'_n$ is a sigma-algebra S on the universal set $\Omega = \bigcup_{i=1}^{n} X'_i$. Any set of the form $\bigcap_{i=1}^{n} Y_i$, where $Y_i$ is either $X'_i$ or $(X'_i)^c$ (the complement of $X'_i$ in Ω), is called an atom of S.
For the set N = {1, 2, …, n}, let G be any subset of N. Let $X_G = \{X_i : i \in G\}$ and $X'_G = \bigcup_{i \in G} X'_i$. For subsets G, G' and G'' of N, the one-to-one correspondence between Shannon’s information measures and set theory is expressed as follows:
$\lambda(X'_G) = H(X_G)$ (1)
The signed measure λ defined by (1) on the universal set $\Omega = \bigcup_{i=1}^{n} X'_i$ is called the I-measure for the random variables $X_1, X_2, \ldots, X_n$. For λ to be consistent with all of Shannon’s information measures, we should have:
$\lambda(X'_G \cap X'_{G'} - X'_{G''}) = I(X_G; X_{G'} | X_{G''})$ (2)
When G'' is empty, (2) becomes $\lambda(X'_G \cap X'_{G'}) = I(X_G; X_{G'})$ (3). When G = G', (2) becomes $\lambda(X'_G - X'_{G''}) = H(X_G | X_{G''})$ (4). Finally, when G = G' and G'' is empty, (4) becomes (1): $\lambda(X'_G) = H(X_G)$.
Does (1) imply (2)? Let’s check. For random variables X, Y and Z we have
$\lambda(X' \cap Y' - Z') = \lambda(X' \cup Z') + \lambda(Y' \cup Z') - \lambda(X' \cup Y' \cup Z') - \lambda(Z')$
$= H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)$ by (1),
which is nothing but $I(X;Y|Z)$. So $\lambda(X' \cap Y' - Z') = I(X;Y|Z)$. It turns out that (1) implies (2), and the signed measure λ defined above has been proved to be unique.
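The reason λ has to be a signed measure, rather than an ordinary one, shows up in a standard three-variable example. The sketch below assumes X and Y are independent fair bits and Z = X XOR Y (an illustrative choice, not taken from the slides), with base-2 logarithms and a hypothetical helper `H`. It evaluates $\lambda(X' \cap Y' - Z')$ through the entropy expression above, and then obtains the measure of the central atom $X' \cap Y' \cap Z'$ by additivity as $I(X;Y) - I(X;Y|Z)$.

```python
import math
from collections import defaultdict
from itertools import product

def H(joint, idxs):
    """Joint entropy in bits of the coordinates listed in idxs."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idxs)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

# X and Y are independent fair bits and Z = X XOR Y (illustrative assumption).
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}
X, Y, Z = 0, 1, 2

# lambda(X' n Y' - Z') = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) = I(X;Y|Z)
i_xy_given_z = H(joint, [X, Z]) + H(joint, [Y, Z]) - H(joint, [X, Y, Z]) - H(joint, [Z])

# lambda(X' n Y') = H(X) + H(Y) - H(X,Y) = I(X;Y)
i_xy = H(joint, [X]) + H(joint, [Y]) - H(joint, [X, Y])

# Additivity: lambda(X' n Y') = lambda(X' n Y' n Z') + lambda(X' n Y' - Z')
triple_atom = i_xy - i_xy_given_z

print(f"I(X;Y|Z) = {i_xy_given_z}, I(X;Y) = {i_xy}, central atom = {triple_atom}")
```

Here the central atom has measure -1 bit, so the I-measure genuinely takes negative values on some atoms.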
Hence we can clearly see the one-to-one correspondence between Shannon’s information measures and set theory in general.
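Applications of the I-Measure
For three random variables $X_1$, $X_2$ and $X_3$ we have:
• $\lambda(X'_1 \cup X'_2 \cup X'_3) = H(X_1, X_2, X_3)$
$\lambda(X'_1 \cup X'_2 \cup X'_3) = \lambda\left(X'_1 \cup (X'_2 - X'_1) \cup (X'_3 - (X'_1 \cup X'_2))\right)$
$= \lambda(X'_1) + \lambda(X'_2 - X'_1) + \lambda(X'_3 - (X'_1 \cup X'_2))$
$= H(X_1) + H(X_2|X_1) + H(X_3|X_1, X_2) = H(X_1, X_2, X_3)$,
which is nothing but the chain rule for entropy for the three random variables $X_1$, $X_2$ and $X_3$.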
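$\lambda[(X'_1 \cup X'_2 \cup X'_3) \cap Y'] = I(X_1, X_2, X_3; Y)$ by (3)
$= \lambda[(X'_1 \cap Y') \cup (X'_2 \cap Y') \cup (X'_3 \cap Y')]$
$= \lambda[(X'_1 \cap Y') \cup ((X'_2 \cap Y') - X'_1) \cup ((X'_3 \cap Y') - (X'_1 \cup X'_2))]$
$= \lambda(X'_1 \cap Y') + \lambda(X'_2 \cap Y' - X'_1) + \lambda(X'_3 \cap Y' - (X'_1 \cup X'_2))$
$= I(X_1; Y) + I(X_2; Y | X_1) + I(X_3; Y | X_1, X_2)$,
which is the chain rule for mutual information between three random variables taken jointly ($X_1$, $X_2$, $X_3$) and another random variable (Y).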
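Both derivations rest on splitting a set into pairwise disjoint pieces so that additivity of λ applies. That purely set-theoretic step can be checked mechanically. The sketch below is an illustration under an assumed encoding: each atom of the sigma-algebra generated by the four sets is represented as a 4-bit tuple, and the names `atoms`, `Yp` and `is_disjoint_cover` are hypothetical.

```python
from itertools import combinations, product

# Each atom is encoded as a 4-bit tuple: bit k is 1 when the atom lies inside
# the k-th set (X1', X2', X3', Y'). The all-zero pattern is excluded because the
# universal set is the union of the four sets.
atoms = [bits for bits in product((0, 1), repeat=4) if any(bits)]
X1, X2, X3, Yp = (frozenset(a for a in atoms if a[k]) for k in range(4))

def is_disjoint_cover(pieces, whole):
    """True if the pieces are pairwise disjoint and their union equals `whole`."""
    disjoint = all(not (p & q) for p, q in combinations(pieces, 2))
    return disjoint and frozenset().union(*pieces) == whole

# Decomposition behind the chain rule for entropy.
entropy_pieces = [X1, X2 - X1, X3 - (X1 | X2)]
entropy_ok = is_disjoint_cover(entropy_pieces, X1 | X2 | X3)

# Decomposition behind the chain rule for mutual information.
mi_pieces = [X1 & Yp, (X2 & Yp) - X1, (X3 & Yp) - (X1 | X2)]
mi_ok = is_disjoint_cover(mi_pieces, (X1 | X2 | X3) & Yp)

print(entropy_ok, mi_ok)
```

Because the pieces are disjoint, the measure of each union is the sum of the measures of its pieces, which is exactly what the two chains of equalities use.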
Information Diagram
The one-to-one relation between Shannon’s information measures and set theory suggests that we should be able to display Shannon’s information measures in an information diagram using a Venn diagram. In general, however, for n random variables we need n - 1 dimensions to display all of them completely. On the other hand, when the random variables we are dealing with form a Markov chain, two dimensions are enough to display the measures.
[Figure: example information diagram for a Markov chain]
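One simplification that a Markov chain brings to the diagram can be seen numerically: for $X_1 \to X_2 \to X_3$, the atom $X'_1 \cap X'_3 - X'_2$ has measure $I(X_1; X_3 | X_2) = 0$, so it need not be drawn. The sketch below assumes hypothetical transition tables for binary variables and base-2 logarithms. It illustrates only this one vanishing atom, not the full argument for why two dimensions suffice for longer chains.

```python
import math
from collections import defaultdict
from itertools import product

def H(joint, idxs):
    """Joint entropy in bits of the coordinates listed in idxs."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in idxs)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

# Hypothetical distributions for a binary Markov chain X1 -> X2 -> X3.
p1 = {0: 0.3, 1: 0.7}
p2_given_1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p2_given_1[x1][x2]
p3_given_2 = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}  # p3_given_2[x2][x3]

joint = {(a, b, c): p1[a] * p2_given_1[a][b] * p3_given_2[b][c]
         for a, b, c in product((0, 1), repeat=3)}

# lambda(X1' n X3' - X2') = H(X1,X2) + H(X2,X3) - H(X1,X2,X3) - H(X2) = I(X1;X3|X2)
i_13_given_2 = H(joint, [0, 1]) + H(joint, [1, 2]) - H(joint, [0, 1, 2]) - H(joint, [1])

# lambda(X1' n X3') = H(X1) + H(X3) - H(X1,X3) = I(X1;X3)
i_13 = H(joint, [0]) + H(joint, [2]) - H(joint, [0, 2])

print(f"I(X1;X3|X2) = {i_13_given_2:.2e}, I(X1;X3) = {i_13:.6f}")
```

The conditional mutual information is zero up to rounding, while the unconditional $I(X_1; X_3)$ stays positive in this example.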
References
• Sterling K. Berberian, Fundamentals of Real Analysis.
• Raymond W. Yeung, A First Course in Information Theory. The Chinese University of Hong Kong.