
The Origin of Entropy


Presentation Transcript


  1. The Origin of Entropy Rick Chang

  2. Agenda • Introduction • References • What is information? • A straightforward way to derive the form of entropy • A mathematical way to derive the form of entropy • Conclusion

  3. Introduction • We use entropy matrices to measure dependencies between pairs of genes, but why? • What is entropy?

  4. Introduction – cont. • I will: try to explain what information and entropy are • I will not: tell you how entropy is related to GA; I don't know (possibly a topic for future work)

  5. References • C. E. Shannon, "A Mathematical Theory of Communication," 1949, Part I and Appendix 2 • David J. C. MacKay, "Information Theory, Inference, and Learning Algorithms," 2003, Chapters 1 and 4 • Robert G. Gallager, "Information Theory and Reliable Communication," 1976, Chapter 2

  6. Claude E. Shannon (1916–2001)

  7. What is information? • Ensemble: the outcome x is the value of a random variable, which takes on one of a set of possible values $A_X = \{a_1, a_2, \ldots, a_n\}$, having probabilities $P_X = \{p_1, p_2, \ldots, p_n\}$, with $p_i \geq 0$ and $\sum_{i=1}^{n} p_i = 1$
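A minimal Python sketch of this definition; the outcomes and probabilities are made up for illustration:

    # An ensemble: outcomes A_X with probabilities P; values are illustrative.
    ensemble = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    assert all(p >= 0 for p in ensemble.values())     # every p_i >= 0
    assert abs(sum(ensemble.values()) - 1.0) < 1e-12  # probabilities sum to 1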

  8. What is information?

  9. What is information? • Hartley, R. V. L., "Transmission of Information": If the number of messages in the set is finite, then this number or any monotonic function of this number can be regarded as a measure of the information produced when one message is chosen from the set, all choices being equally likely.

  10. A straightforward way • When we try to measure the influence of event y on event x, we may consider the ratio $p(x \mid y)/p(x)$, which is > 1 when the occurrence of event y increases our belief in event x, = 1 when events x and y are independent, and < 1 when the occurrence of event y decreases our belief in event x

  11. A straightforward way – cont. • We define the information provided about event x by the occurrence of event y as $I(x; y) = \log \frac{p(x \mid y)}{p(x)}$, which is > 0 when the occurrence of event y increases our belief in event x, = 0 when events x and y are independent, and < 0 when the occurrence of event y decreases our belief in event x

  12. Why use the logarithm? • It is more convenient: practically more useful • It is nearer to our intuitive feeling: we intuitively measure entities by linear comparison • It is mathematically more suitable: many of the limiting operations are simple in terms of the logarithm

  13. Mutual information • $I(x; y) = \log \frac{p(x \mid y)}{p(x)} = \log \frac{p(x, y)}{p(x)\,p(y)} = I(y; x)$ • the mutual information between event x and event y is symmetric
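A small numeric sketch of this quantity (log base 2, so the result is in bits); the joint distribution here is hypothetical, chosen only to exercise the formula:

    from math import log2

    # Hypothetical probabilities for two events x and y.
    p_xy = 0.4                 # p(x, y)
    p_x, p_y = 0.5, 0.5        # marginals p(x), p(y)
    p_x_given_y = p_xy / p_y   # p(x | y) = 0.8

    # I(x; y) = log p(x|y)/p(x) = log p(x,y)/(p(x) p(y)) = I(y; x)
    i_xy = log2(p_x_given_y / p_x)
    i_yx = log2(p_xy / (p_x * p_y))
    assert abs(i_xy - i_yx) < 1e-12  # symmetry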

  14. Mutual information – cont. • Mutual information uses the logarithm to quantify the difference between our belief in event x given event y and our prior belief in event x • it is the amount of uncertainty about event x that we can resolve after the occurrence of event y

  15. Self-information • Consider an event y with p(x | y) = 1; then $I(x; y) = \log \frac{1}{p(x)}$ • this is the amount of uncertainty about event x that we resolve after we know event x will certainly occur, i.e., the prior uncertainty of event x • Define the self-information of event x as $I(x) = \log \frac{1}{p(x)}$
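A one-line numeric check, assuming an illustrative prior p(x) = 1/4: when some event y makes x certain, the mutual information reduces to the prior surprise of x:

    from math import log2

    p_x = 0.25  # illustrative prior probability of event x
    # With p(x | y) = 1, I(x; y) = log(1 / p(x)) = I(x), the self-information.
    assert log2(1.0 / p_x) == 2.0  # I(x) = 2 bits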

  16. Intuitively [diagram: a bar of information about the system, whose full length means we know everything about the system; a marked portion is our prior knowledge about event x]

  17. Intuitively – cont. [diagram: the same bar with an additional marker showing our knowledge after we know event x will certainly occur]

  18. Intuitively – cont. [diagram: the gap between the two markers is the information of event x, i.e., the uncertainty of event x]

  19. Conditional self-information • Likewise, define the conditional self-information of event x, given the occurrence of event y, as $I(x \mid y) = \log \frac{1}{p(x \mid y)}$ • We now have $I(x; y) = I(x) - I(x \mid y)$
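A quick numeric check of the identity $I(x; y) = I(x) - I(x \mid y)$, reusing the hypothetical numbers from the earlier sketch:

    from math import log2

    p_x, p_x_given_y = 0.5, 0.8          # hypothetical values, as above
    i_x = log2(1 / p_x)                  # self-information I(x)
    i_x_given_y = log2(1 / p_x_given_y)  # conditional self-information I(x | y)
    i_mutual = log2(p_x_given_y / p_x)   # mutual information I(x; y)
    assert abs(i_mutual - (i_x - i_x_given_y)) < 1e-12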

  20. Intuitively – cont. [diagram: a bar of information about event x, whose full length means we know everything about event x (we know it will certainly occur); markers show our prior knowledge about event x and our knowledge after the occurrence of event y]

  21. Intuitively – cont. [diagram: on the same bar, the portion gained by the occurrence of event y is the mutual information between event x and event y]

  22. A straightforward way – cont. • Like above, define the joint self-information of event x and event y as $I(x, y) = \log \frac{1}{p(x, y)}$ • We now have $I(x, y) = I(x) + I(y) - I(x; y)$

  23. A straightforward way – cont. • On average, the uncertainty of event y is never increased by knowledge of x; it is decreased unless x and y are independent, in which case it is unchanged

  24. From instance to expectation • Averaging each instance quantity over the ensemble gives the corresponding expectation: • I(x; y) → I(X; Y) • I(x) → H(X) • I(x | y) → H(X | Y) • I(x, y) → H(X, Y) • I(x; y) = I(x) − I(x | y) → I(X; Y) = H(X) − H(X | Y) • I(x, y) = I(x) + I(y) − I(x; y) → H(X, Y) = H(X) + H(Y) − I(X; Y)
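A Python sketch that computes each averaged quantity from a made-up joint distribution and verifies both identities numerically:

    from math import log2

    # Hypothetical joint distribution p(x, y) over two binary variables.
    p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    px = {x: sum(v for (a, b), v in p.items() if a == x) for x in (0, 1)}
    py = {y: sum(v for (a, b), v in p.items() if b == y) for y in (0, 1)}

    h_xy = -sum(v * log2(v) for v in p.values())  # joint entropy H(X,Y)
    h_x = -sum(v * log2(v) for v in px.values())  # H(X)
    h_y = -sum(v * log2(v) for v in py.values())  # H(Y)
    h_x_given_y = h_xy - h_y                      # H(X|Y), by the chain rule
    i = sum(v * log2(v / (px[a] * py[b])) for (a, b), v in p.items())  # I(X;Y)

    assert abs(i - (h_x - h_x_given_y)) < 1e-9    # I(X;Y) = H(X) - H(X|Y)
    assert abs(h_xy - (h_x + h_y - i)) < 1e-9     # H(X,Y) = H(X) + H(Y) - I(X;Y)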

  25. Relationship [diagram: a bar of total length H(X,Y) divided into H(X|Y), I(X;Y), and H(Y|X); H(X) spans H(X|Y) + I(X;Y), and H(Y) spans I(X;Y) + H(Y|X)]

  26. Entropy • The entropy of an ensemble is defined to be the average value of the self-information over all events x: $H(X) = \sum_{x} p(x) \log \frac{1}{p(x)}$ • it is the average prior uncertainty of the ensemble

  27. Interesting properties of H(X) • H = 0 if and only if all the $p_i$ but one are zero, this one having the value unity. Thus only when we are certain of the outcome does H vanish. Otherwise H is positive. • For a given n, H is a maximum and equal to $\log n$ when all the $p_i$ are equal, i.e., $p_i = 1/n$. This is also intuitively the most uncertain situation. • Any change toward equalization of the probabilities $p_1, \ldots, p_n$ increases H.
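Each of the three properties can be checked numerically; a minimal sketch, with the distributions chosen arbitrarily:

    from math import log2

    def H(ps):
        # Entropy in bits; terms with p = 0 are taken as 0.
        return -sum(p * log2(p) for p in ps if p > 0)

    assert H([1.0, 0.0, 0.0]) == 0.0             # certain outcome: H = 0
    n = 4
    assert abs(H([1 / n] * n) - log2(n)) < 1e-9  # uniform case: H = log n
    # Moving toward equal probabilities increases H:
    assert H([0.7, 0.1, 0.1, 0.1]) < H([0.4, 0.3, 0.2, 0.1]) < H([1 / n] * n)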

  28. A mathematical way • Can we find a measure of how uncertain we are of an ensemble? • If there is such a measure, say $H(p_1, \ldots, p_n)$, it is reasonable to require of it the following properties: • H should be continuous in the $p_i$ • If all the $p_i$ are equal, $p_i = 1/n$, then H should be a monotonic increasing function of n • If a choice is broken down into two successive choices, the original H should be the weighted sum of the individual values of H

  29. A mathematical way – cont. • If a choice is broken down into two successive choices, the original H should be the weighted sum of the individual values of H • For example, $H(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}) = H(\tfrac{1}{2}, \tfrac{1}{2}) + \tfrac{1}{2} H(\tfrac{2}{3}, \tfrac{1}{3})$; the coefficient $\tfrac{1}{2}$ appears because the second choice occurs only half the time
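This example can be verified numerically; a short sketch in Python:

    from math import log2

    def H(ps):
        return -sum(p * log2(p) for p in ps if p > 0)

    # Choosing among {1/2, 1/3, 1/6} directly equals choosing between two
    # halves first, then (half the time) choosing between 2/3 and 1/3.
    direct = H([1/2, 1/3, 1/6])
    staged = H([1/2, 1/2]) + 0.5 * H([2/3, 1/3])
    assert abs(direct - staged) < 1e-9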

  30. A mathematical way – cont. • Theorem: the only H satisfying the three above properties is of the form $H = -K \sum_{i=1}^{n} p_i \log p_i$, where K is a positive constant

  31. A mathematical way – cont. • Proof: let $A(n) = H(\tfrac{1}{n}, \ldots, \tfrac{1}{n})$. From property (3) we can decompose a choice from $s^m$ equally likely possibilities into a series of m choices from s equally likely possibilities and obtain $A(s^m) = m A(s)$
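This decomposition step can be sanity-checked numerically; a short sketch, with s and m picked arbitrarily:

    from math import log2

    def H(ps):
        return -sum(p * log2(p) for p in ps if p > 0)

    def A(n):
        # A(n) = H(1/n, ..., 1/n): the entropy of n equally likely choices.
        return H([1 / n] * n)

    s, m = 3, 4
    assert abs(A(s ** m) - m * A(s)) < 1e-9  # A(s^m) = m * A(s)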

  32. A mathematical way – cont. • Similarly, $A(t^n) = n A(t)$ • We can choose n arbitrarily large and find an m to satisfy $s^m \le t^n < s^{m+1}$; taking logarithms and dividing by $n \log s$ gives $\left| \frac{m}{n} - \frac{\log t}{\log s} \right| < \frac{1}{n} \quad (1)$

  33. A mathematical way – cont. • From the monotonic property of A(n), $A(s^m) \le A(t^n) \le A(s^{m+1})$, i.e., $m A(s) \le n A(t) \le (m+1) A(s)$; dividing by $n A(s)$ gives $\left| \frac{m}{n} - \frac{A(t)}{A(s)} \right| < \frac{1}{n} \quad (2)$

  34. A mathematical way – cont. • From equations (1) and (2), $\left| \frac{A(t)}{A(s)} - \frac{\log t}{\log s} \right| < \frac{2}{n}$; since n can be arbitrarily large, we get $A(t) = K \log t$, where K must be positive to satisfy property (2)

  35. A mathematical way – cont. • Now suppose we have a choice from n possibilities with commeasurable probabilities $p_i = \frac{n_i}{\sum n_i}$, where all the $n_i$ are integers • We can break down a choice from $\sum n_i$ equally likely possibilities into a choice from n possibilities with probabilities $p_1, \ldots, p_n$ and then, if the i-th was chosen, a choice from $n_i$ possibilities with equal probabilities

  36. A mathematical way – cont. • Using property (3) again, we equate the total choice from $\sum n_i$ possibilities as computed by the two methods: $K \log \sum n_i = H(p_1, \ldots, p_n) + K \sum p_i \log n_i$
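This bookkeeping identity can also be checked numerically (taking K = 1 and log base 2); the integer counts below are illustrative:

    from math import log2

    def H(ps):
        return -sum(p * log2(p) for p in ps if p > 0)

    counts = [3, 2, 1]                # illustrative n_i
    total = sum(counts)               # sum of the n_i
    ps = [n / total for n in counts]  # commeasurable probabilities p_i

    lhs = log2(total)                 # K log(sum n_i), with K = 1
    rhs = H(ps) + sum(p * log2(n) for p, n in zip(ps, counts))
    assert abs(lhs - rhs) < 1e-9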

  37. A mathematical way – cont. • Hence $H = K \left[ \sum p_i \log \sum n_i - \sum p_i \log n_i \right] = -K \sum p_i \log \frac{n_i}{\sum n_i} = -K \sum p_i \log p_i$ • If the $p_i$ are not commeasurable, they may be approximated by rationals and the same expression must hold by our continuity assumption (property (1)) • The choice of the coefficient K is a matter of convenience and amounts to the choice of a unit of measure

  38. Conclusion • We first used an intuitive method to measure the information content of an event or an ensemble • We explained intuitively why we choose the logarithm • Mutual information and entropy were introduced • We showed the relationship between information content and uncertainty • Finally, we set down three assumptions and derived the only possible form of the measure, showing that the logarithm must be adopted

  39. Thanks
