
Presentation Transcript


  1. Today: Entropy / Information Theory

  2. Information Theory

  3. Claude Shannon Ph.D. 1916-2001

  4. Entropy

  5. Entropy A measure of the disorder in a system

  6. Entropy The (average) number of yes/no questions needed to completely specify the state of a system

  7. The (average) number of yes/no questions needed to completely specify the state of a system

  8. What if there were two coins?

  9. What if there were two coins?

  10. What if there were two coins?

  11. What if there were two coins?

  12. number of states = 2^(number of yes/no questions): 2 states, 1 question; 4 states, 2 questions; 8 states, 3 questions; 16 states, 4 questions.

  13. number of states = 2^(number of yes/no questions), so log2(number of states) = number of yes/no questions
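
A quick numeric check of the states-to-questions relation on slides 12 and 13, as a minimal Python sketch:

```python
import math

# log2(number of states) = number of yes/no questions
for states in [2, 4, 8, 16]:
    print(states, "states ->", math.log2(states), "questions")
# 2 states -> 1.0 questions ... 16 states -> 4.0 questions
```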

  14. H = log2(n): the entropy H is the number of yes/no questions required to specify the state of the system, where n is the number of states of the system, assumed (for now) to be equally likely.

  15. Consider Dice

  16. The Six Sided Die H = log2(6) = 2.585 bits

  17. The Four Sided Die H = log2(4) = 2.000 bits

  18. The Twenty Sided Die H = log2(20) = 4.322 bits

  19. What about all three dice? H = log2(4·6·20)

  20. What about all three dice? H = log2(4)+log2(6)+log2(20)

  21. What about all three dice? H = 8.907 bits

  22. What about all three dice? Entropy, from independent elements of a system, adds
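
A minimal Python check of slides 19-22: the entropy of the joint system of three independent dice equals the sum of the individual entropies.

```python
import math

# Each fair die alone: H = log2(number of faces)
h4, h6, h20 = math.log2(4), math.log2(6), math.log2(20)

# All three dice together: 4 * 6 * 20 = 480 equally likely joint outcomes
h_all = math.log2(4 * 6 * 20)

print(h4 + h6 + h20)  # 8.907 bits
print(h_all)          # 8.907 bits -- entropies of independent elements add
```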

  23. Trivial Fact 1: log2(x) = -log2(1/x) Let's rewrite this a bit...

  24. Trivial Fact 2: if there are n equally likely possibilities, p = (1/n) Trivial Fact 1: log2(x) = -log2(1/x)

  25. Trivial Fact 2: if there are n equally likely possibilities, p = (1/n)

  26. What if the n states are not equally probable? Maybe we should use the expected value of the entropies, a weighted average by probability
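
The weighted average described on slide 26 gives the standard Shannon form H = -Σ pi log2(pi). A minimal Python sketch, with the usual convention that a term with pi = 0 contributes nothing:

```python
import math

def entropy(p):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i)), skipping p_i = 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit    -- n = 2, equally likely states
print(entropy([0.9, 0.1]))   # ~0.47 bits -- unequal probabilities, lower entropy
print(entropy([1.0, 0.0]))   # 0.0 bits   -- a definite state
```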

  27. Let's do a simple example: n = 2, how does H change as we vary p1 and p2?

  28. n = 2 p1 + p2 = 1

  29. How about n = 3? p1 + p2 + p3 = 1

  30. The bottom line intuitions for Entropy: • Entropy is a statistic for describing a probability distribution. • Probability distributions which are flat, broad, sparse, etc. have HIGH entropy. • Probability distributions which are peaked, sharp, narrow, compact, etc. have LOW entropy. • Entropy adds for independent elements of a system, thus entropy grows with the dimensionality of the probability distribution. • Entropy is zero IFF the system is in a definite state, i.e. p = 1 somewhere and 0 everywhere else.
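
A small Python illustration of the flat-vs-peaked intuition on slide 30, using the same entropy formula as above on three made-up 8-state distributions:

```python
import math

H = lambda p: -sum(pi * math.log2(pi) for pi in p if pi > 0)

flat     = [1/8] * 8              # broad and uniform
peaked   = [0.93] + [0.01] * 7    # sharp and concentrated
definite = [1.0] + [0.0] * 7      # a single certain state

print(H(flat))      # 3.0 bits  -- the maximum for 8 states
print(H(peaked))    # ~0.56 bits
print(H(definite))  # 0.0 bits  -- zero iff the state is definite
```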

  31. Pop Quiz: [four numbered panels, 1-4]

  32. Entropy The (average) number of yes/no questions needed to completely specify the state of a system

  33. At 11:16 am (Pacific) on June 29th of the year 2001, there were approximately 816,119 words in the English language. H(english) = 19.6 bits. Twenty Questions: 2^20 = 1,048,576. What's a winning 20 Questions strategy?
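
A quick check of the numbers on slide 33 in Python: roughly 816,119 words corresponds to just under 20 bits, and 2^20 = 1,048,576 exceeds the word count, which is why twenty well-chosen yes/no questions (each splitting the remaining candidates roughly in half) are enough to pin down a word.

```python
import math

print(math.log2(816_119))  # ~19.6 bits
print(2 ** 20)             # 1048576 > 816119
```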

  34. (Break)

  35. So, what is information? It’s a change in what you don’t know. It’s a change in the entropy.

  36. Information as a measure of correlation x y

  37. Information as a measure of correlation x y

  38. [Bar plots of P(Y) and P(Y|x=heads): both uniform, probability 1/2 on heads and 1/2 on tails.] H(Y) = 1, H(Y|x=heads) = 1, so I(X;Y) = H(Y) - H(Y|X) = 0 bits

  39. Information as a measure of correlation x y

  40. Information as a measure of correlation x y

  41. [Bar plots of P(Y) and P(Y|x=heads): P(Y) uniform over heads/tails, P(Y|x=heads) concentrated on one outcome.] H(Y) = 1, H(Y|x=heads) ~ 0, so I(X;Y) = H(Y) - H(Y|X) ~ 1 bit
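
A minimal Python sketch of the two coin scenarios on slides 36-41, assuming a first toss X and a second toss Y that is either independent of X or an exact copy of it:

```python
import math

H = lambda p: -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Slides 36-38: Y independent of X. Learning x = heads leaves P(Y) at (1/2, 1/2).
H_Y = H([0.5, 0.5])             # 1 bit
H_Y_given_X = H([0.5, 0.5])     # still 1 bit
print(H_Y - H_Y_given_X)        # I(X;Y) = 0 bits -- no correlation, no information

# Slides 39-41: Y copies X. Learning x = heads makes P(Y | x=heads) = (1, 0).
H_Y_given_X = H([1.0, 0.0])     # 0 bits
print(H_Y - H_Y_given_X)        # I(X;Y) = 1 bit -- perfect correlation
```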

  42. Information Theory in Neuroscience x y

  43. The Critical Observation: Information is Mutual. I(X;Y) = I(Y;X), i.e. H(Y) - H(Y|X) = H(X) - H(X|Y)

  44. The Critical Observation: I(Stimulus;Spike) = I(Spike;Stimulus) What a spike tells the brain about the stimulus is the same as what our stimulus choice tells us about the likelihood of a spike.

  45. The Critical Observation: [stimulus -> response] This, we can measure: what our stimulus choice tells us about the likelihood of a spike.

  46. How to use Information Theory: Show your system stimuli. Measure neural responses. Estimate P(neural response | stimulus presented); from that, estimate P(neural response). Compute H(neural response) and H(neural response | stimulus presented). Calculate I(response; stimulus).
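
A minimal sketch of that recipe in Python, using a hypothetical table of joint counts (rows are stimuli, columns are discretized responses); the numbers are made up purely for illustration:

```python
import math

# Hypothetical counts: counts[s][r] = number of trials with stimulus s and response r
counts = [[40, 10],   # stimulus A
          [15, 35]]   # stimulus B

total = sum(sum(row) for row in counts)
p_joint = [[c / total for c in row] for row in counts]
p_stim = [sum(row) for row in p_joint]                        # P(stimulus)
p_resp = [sum(row[r] for row in p_joint) for r in range(2)]   # P(response)

H = lambda p: -sum(pi * math.log2(pi) for pi in p if pi > 0)

h_resp = H(p_resp)                                  # H(response)
h_resp_given_stim = sum(                            # H(response | stimulus)
    ps * H([pj / ps for pj in row])
    for ps, row in zip(p_stim, p_joint) if ps > 0
)

print(h_resp - h_resp_given_stim)  # I(response; stimulus) in bits
```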
