Kolmogorov Complexity and Universal Distribution

Kolmogorov Complexity and Universal Distribution. Presented by Min Zhou Nov. 18, 2002. Content. Kolmogorov complexity Universal Distribution Inductive Learning. Principle of Indifference (Epicurus). Keep all hypotheses that are consistent with the facts. Occam’s Razor.

Presentation Transcript


  1. Kolmogorov Complexity and Universal Distribution Presented by Min Zhou Nov. 18, 2002

  2. Content • Kolmogorov complexity • Universal Distribution • Inductive Learning

  3. Principle of Indifference (Epicurus) • Keep all hypotheses that are consistent with the facts

  4. Occam’s Razor • Among all hypotheses consistent with the facts, choose the simplest • Newton’s Rule #1 for doing natural philosophy • We are to admit no more causes of natural things than such as are both true and sufficient to explain the appearances

  5. Question • What does “simplest” mean? • How to define simplicity? • Can a thing be simple under one definition and not under another?

  6. Bayes’ Rule • $P(H \mid D) = P(D \mid H)\,P(H) / P(D)$ • $P(H)$ is often considered the initial degree of belief in H • In essence, Bayes’ rule is a mapping from the prior probability $P(H)$ to the posterior probability $P(H \mid D)$, determined by the data D
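As a minimal sketch of the rule above, the following applies Bayes' rule over a finite set of hypotheses. The coin hypotheses, the prior values and the function name `posterior` are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Map a prior P(H) to the posterior P(H|D) for a finite hypothesis set.

    prior:      dict hypothesis -> P(H)
    likelihood: dict hypothesis -> P(D|H)
    """
    # P(D) = sum over all hypotheses of P(D|H) * P(H)
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    return {h: likelihood[h] * prior[h] / evidence for h in prior}

# Hypothetical example: is a coin fair or two-headed?
# Data D: three heads in a row.
prior = {"fair": Fraction(9, 10), "two_headed": Fraction(1, 10)}
likelihood = {"fair": Fraction(1, 8), "two_headed": Fraction(1)}
post = posterior(prior, likelihood)
print(post)
```

Exact fractions are used so that the posterior sums to exactly 1; with these assumed numbers, three heads move the belief in a two-headed coin from 1/10 to 8/17.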

  7. How to get P(H) • By the law of large numbers, we can get P(H|D) if we use many examples • But we want to learn as much as possible from only a limited amount of data • P(H) may be unknown, uncomputable, or may not even exist • Can we find a single probability distribution to use as the prior in each different case, with approximately the same result as if we had used the real distribution?

  8. Hume on Induction • Induction is impossible because we can only reach conclusions by using known data and methods • So the conclusion is logically already contained in the starting configuration

  9. Solomonoff’s Theory of Induction • Maintain all hypotheses consistent with the data • Incorporate “Occam’s Razor”: assign the simplest hypotheses the highest probability • Update using Bayes’ rule
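The three ingredients above can be illustrated with a toy model. This is not Solomonoff's actual construction, which ranges over all programs and is uncomputable. Here the hypothesis class is assumed to be "the sequence is a binary pattern repeated forever", and the prior 2^(-2n) for a pattern of length n is a hypothetical choice that favours short patterns and sums to at most 1.

```python
from fractions import Fraction
from itertools import product

def predict_next(data, max_len=6):
    """Return P(next bit = '1' | data) under a toy simplicity-weighted mixture."""
    weight = {"0": Fraction(0), "1": Fraction(0)}
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            pattern = "".join(bits)
            # Hypothesis: the sequence is `pattern` repeated forever.
            repeated = pattern * (len(data) // n + 2)
            # Deterministic hypothesis: likelihood is 1 if consistent, else 0,
            # so Bayes' rule simply discards the inconsistent hypotheses.
            if repeated.startswith(data):
                prior = Fraction(1, 2 ** (2 * n))  # shorter pattern, higher prior
                weight[repeated[len(data)]] += prior
    total = weight["0"] + weight["1"]
    if total == 0:
        raise ValueError("no hypothesis in the toy class is consistent with the data")
    return weight["1"] / total

print(predict_next("0101"))  # small probability of '1': pattern "01" dominates
```

All consistent hypotheses are kept, but the shortest consistent pattern "01" carries most of the weight, so the mixture strongly predicts "0" as the next bit.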

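  10. Kolmogorov Complexity • k(s) is the length of the shortest program which, on no input, prints out s • $k(s) \le |s| + c$ for a constant c that does not depend on s • For every n there is a string s of length n with $k(s) \ge n$ • k(s) is objective (independent of the programming language, up to an additive constant) by the Invariance Theorem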
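k(s) itself cannot be computed, but any lossless compressor gives a computable upper bound, up to an additive constant for the size of the decompressor. The sketch below uses zlib as a stand-in; the helper name `compressed_bits` and the two sample strings are assumptions chosen for illustration.

```python
import random
import zlib

def compressed_bits(data: bytes) -> int:
    """Length in bits of a zlib encoding of `data`.

    This is only a computable proxy that bounds k(data) from above
    (up to a constant); it is not the Kolmogorov complexity itself.
    """
    return 8 * len(zlib.compress(data, 9))

rng = random.Random(0)  # fixed seed so the example is repeatable
regular = b"01" * 5000                                   # highly patterned, 10,000 bytes
noisy = bytes(rng.getrandbits(8) for _ in range(10000))  # pseudo-random, 10,000 bytes

print(compressed_bits(regular), "bits for the patterned string")
print(compressed_bits(noisy), "bits for the pseudo-random string")
```

The patterned string shrinks to a small fraction of its length, while the pseudo-random one does not shrink at all under zlib. Note that the pseudo-random string actually has low Kolmogorov complexity (a short program with the seed prints it), which shows why a compressor gives only an upper bound.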

  11. Universal Distribution • $P(s) = 2^{-k(s)}$ • We use k(s) to describe the complexity of an object. By Occam’s Razor, the simplest object should have the highest probability.

  12. Problem: $\sum_s P(s) > 1$ • For every n, there exists an n-bit string s with $k(s) = \log n$ (up to an additive constant), so $P(s) = 2^{-\log n} = 1/n$ • $1/2 + 1/3 + \dots > 1$ (the sum diverges)
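The argument on this slide can be written out as one chain of inequalities. It assumes that for each n we pick one n-bit string $s_n$ (for example, n ones) whose complexity is at most $\log_2 n + c$; the constant c is an assumption covering the fixed overhead of the program and does not appear on the slide.

```latex
\sum_{s} 2^{-k(s)}
  \;\ge\; \sum_{n \ge 1} 2^{-k(s_n)}
  \;\ge\; \sum_{n \ge 1} 2^{-(\log_2 n + c)}
  \;=\; 2^{-c} \sum_{n \ge 1} \frac{1}{n}
  \;=\; \infty
```

The strings $s_n$ are distinct because they have different lengths, so the first inequality only drops terms. Since the harmonic series diverges, $2^{-k(s)}$ cannot be normalised into a probability distribution without a further restriction on the programs.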

  13. Levin’s improvement • Use prefix-free programs: a set of programs, no one of which is a prefix of any other • Kraft’s inequality • Let $l_1, l_2, \dots$ be a sequence of natural numbers. There is a prefix code with this sequence as the lengths of its binary code words iff $\sum_n 2^{-l_n} \le 1$
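Both directions of Kraft's inequality can be tried out on finite length sequences. The sketch below checks the sum and, when it is at most 1, builds one possible prefix code by the standard "assign code words in order of increasing length" construction. The function names are illustrative, and lengths are assumed to be positive integers.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Exact value of the Kraft sum: sum of 2^(-l) over the given lengths."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

def prefix_code(lengths):
    """Build binary prefix-free code words with the given lengths.

    Returns the code words in order of increasing length.
    Raises ValueError if Kraft's inequality is violated.
    """
    if kraft_sum(lengths) > 1:
        raise ValueError("Kraft's inequality violated: no prefix code exists")
    code, value, prev = [], 0, 0
    for l in sorted(lengths):
        value <<= (l - prev)             # extend the counter to the new length
        code.append(f"{value:0{l}b}")    # write it as an l-bit binary word
        value += 1                       # next word skips everything below this one
        prev = l
    return code

print(kraft_sum([1, 2, 3, 3]))    # 1
print(prefix_code([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```

For lengths such as 1, 1, 2 the Kraft sum is 5/4, so the function raises instead of returning a code.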

  14. Multiplicative domination • Levin proved that for the universal distribution p and any enumerable distribution p’ there exists a constant c such that $c \cdot p(s) \ge p'(s)$ for all s, where c depends on p and p’, but not on s • If the true prior distribution is computable, then using the single fixed universal distribution p is almost as good as using the actually true distribution itself

  15. Turing’s thesis: A universal Turing machine can compute all intuitively computable functions • Kolmogorov’s thesis: The Kolmogorov complexity gives the shortest description length among all description lengths that can be effectively approximated according to intuition • Levin’s thesis: The universal distribution gives the largest probability among all the distributions that can be effectively approximated according to intuition

  16. Universal Bet • Street gambler Bob tosses a coin and makes an offer: • Next is heads “1”: Bob gives Alice 2 dollars • Next is tails “0”: Alice pays Bob 1 dollar • Is Bob honest? • Side bet: flip the coin 1000 times and record the result as a string s • Alice pays 1 dollar, Bob pays Alice $2^{1000-k(s)}$ dollars

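  17. Good offer: • $\sum_{|s|=1000} 2^{-1000} \cdot 2^{1000-k(s)} = \sum_{|s|=1000} 2^{-k(s)} \le 1$, so with a fair coin Bob’s expected payout is at most Alice’s 1-dollar stake • If Bob is honest, Alice increases her money polynomially • If Bob cheats, Alice increases her money exponentially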
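Because k(s) is not computable, the side bet cannot be settled exactly, but its behaviour can be sketched by substituting a real compressor for k(s). The code below uses the zlib-compressed length in bits as that stand-in. This is an assumption, as are the function name `log2_payoff` and the two sample outcomes. It reports the base-2 logarithm of Bob's payout rather than the payout itself.

```python
import random
import zlib

def log2_payoff(bits: str) -> int:
    """log2 of the payout 2^(n - C(s)), where C(s) is the zlib-compressed
    length of s in bits, used here as a computable stand-in for k(s)."""
    n = len(bits)
    packed = int(bits, 2).to_bytes((n + 7) // 8, "big")  # pack 8 flips per byte
    c = 8 * len(zlib.compress(packed, 9))
    return n - c

rng = random.Random(0)  # fixed seed so the example is repeatable
fair = "".join(rng.choice("01") for _ in range(1000))  # looks like honest tosses
cheat = "1" * 1000                                     # Bob's coin always lands heads

print("log2 payout, fair-looking sequence:", log2_payoff(fair))
print("log2 payout, all-heads sequence:  ", log2_payoff(cheat))
```

A sequence that looks random does not compress, so the exponent is negative and Alice gets back less than a dollar. A blatantly regular sequence compresses to a few bytes, so the payout is astronomically large. This is the sense in which the bet penalises a cheating Bob.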

  18. Notice • The complexity of a string is non-computable

  19. Conclusion • Kolmogorov complexity – optimal effective descriptions of objects • Universal Distribution – optimal effective probability of objects • Both are objective and absolute

  20. Reference • Ming Li, Paul Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, 2nd Edition, Springer-Verlag, 1997