
Bayesian Reasoning



Presentation Transcript


  1. Bayesian Reasoning
Thomas Bayes (1702-1761), Pierre-Simon Laplace (1749-1827)
A/Prof Geraint Lewis, A/Prof Peter Tuthill
“Probability theory is nothing but common sense reduced to calculation.” (Laplace)

  2. Are you a Bayesian or Frequentist?
“There are three kinds of lies: lies, damned lies, and statistics” (Benjamin Disraeli) ...and Bayesian statistics.
Fig 1. A Frequentist Statistician. Fig 2. Bayesian Statistics Conference. [figure slide]

  3. What is Inference?
Deductive Inference (Logic), Aristotle, 4th century B.C.
Major premise: if A is true, then B is true (in Boolean notation: A = AB).
STRONG SYLLOGISMS:
• A is true (minor premise); therefore B is true (conclusion).
• B is false (minor premise); therefore A is false (conclusion).
Inductive Inference (Plausible Reasoning)
WEAK SYLLOGISMS:
• B is true (minor premise); therefore A is more plausible.
• A is false (minor premise); therefore B is less plausible.
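As a sanity check, here is a minimal Python sketch (my own, not from the slides) that enumerates every truth assignment consistent with the major premise A → B and confirms both strong syllogisms:

```python
from itertools import product

# Enumerate all truth assignments consistent with the major premise A -> B.
for A, B in product([True, False], repeat=2):
    if not ((not A) or B):
        continue                 # discard worlds where the premise fails
    if A:
        assert B                 # modus ponens: A true forces B true
    if not B:
        assert not A             # modus tollens: B false forces A false
print("Both strong syllogisms hold in every world consistent with A -> B.")
```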

  4. What is Inference?
Deductive Logic: cause → effects or outcomes.
Inductive Logic: effects or observations → possible causes.

  5. What is a Probability?
Frequentists: P(A) = long-run relative frequency of A occurring in identical repeats of an observation; “A” is restricted to propositions about random variables.
Bayesians: P(A|B) = real-number measure of the plausibility of proposition A, given (conditional upon) the truth of proposition B; “A” can be any logical proposition.
All probabilities are conditional; we must be explicit about what our assumptions B are (there is no such thing as an absolute probability!).

  6. Probability depends on our state of Knowledge
The Monty Hall problem: a prize hides behind one of three doors, A, B, or C. [figure slide]
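A quick Monte Carlo sketch of the Monty Hall game (my own illustration; the slide shows only the doors): switching wins about two thirds of the time, precisely because the host's choice changes our state of knowledge.

```python
import random

def play(switch: bool) -> bool:
    doors = ["A", "B", "C"]
    prize = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a door that is neither our pick nor the prize.
    opened = random.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

trials = 100_000
for switch in (False, True):
    wins = sum(play(switch) for _ in range(trials))
    print(f"switch={switch}: win rate {wins / trials:.3f}")  # ~0.333 vs ~0.667
```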

  7. Probability depends on our state of Knowledge
An urn holds 7 red and 5 blue balls. On the first draw, P(red) = 7/12 and P(blue) = 5/12; what should we assign to the second draw? [figure slide]
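A short sketch of the urn example, assuming draws without replacement: learning the colour of the first draw changes the second-draw probability, yet with no knowledge of the first draw the second is still 7/12 red.

```python
from fractions import Fraction

red, blue = 7, 5
total = red + blue

p_red_1 = Fraction(red, total)                      # 7/12
p_blue_1 = Fraction(blue, total)                    # 5/12

# Second draw, conditioned on the colour of the first.
p_red_2_given_red_1 = Fraction(red - 1, total - 1)  # 6/11
p_red_2_given_blue_1 = Fraction(red, total - 1)     # 7/11

# Second draw with no knowledge of the first: marginalise it out.
p_red_2 = p_red_1 * p_red_2_given_red_1 + p_blue_1 * p_red_2_given_blue_1
print(p_red_2)  # 7/12: unchanged when we know nothing about the first draw
```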

  8. The Desiderata of Bayesian Probability Theory
• Degrees of plausibility are represented by real numbers (a higher degree of belief is represented by a larger number).
• With extra evidence supporting a proposition, its plausibility should increase monotonically up to a limit (certainty).
• Consistency: multiple ways of arriving at a conclusion must all produce the same answer (see book for additional details).

  9. Logic and Probability
• In the certainty limit, where probabilities go to zero (falsehood) or one (truth), the sum and product rules reduce to formal Boolean deductive logic (the strong syllogisms).
• Bayesian probability is therefore an extension of formal logic into intermediate states of knowledge.
• Bayesian inference gives a measure of our state of knowledge about nature, not a measure of nature itself.

  10. The two rules underlying probability theory
SUM RULE: P(A|B) + P(Ā|B) = 1
PRODUCT RULE: P(A,B|C) = P(A|C) P(B|A,C) = P(B|C) P(A|B,C)
[Figure: Venn diagram over all kangaroos, split by eye colour (blue/brown) and handedness (left/right), with the intersection “blue eyes, left-handed”.]
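A toy numerical check of both rules; the kangaroo counts below are invented for illustration, since the slide's diagram gives no numbers.

```python
from fractions import Fraction

# Hypothetical joint counts over all kangaroos (invented for illustration).
counts = {("blue", "left"): 3, ("blue", "right"): 9,
          ("brown", "left"): 12, ("brown", "right"): 36}
total = sum(counts.values())

def P(pred):
    """Probability that a kangaroo satisfies pred(eyes, hand)."""
    return Fraction(sum(n for k, n in counts.items() if pred(*k)), total)

def P_cond(pred, given):
    """P(pred | given): restrict the table to 'given', then renormalize."""
    sub = {k: n for k, n in counts.items() if given(*k)}
    return Fraction(sum(n for k, n in sub.items() if pred(*k)), sum(sub.values()))

blue = lambda e, h: e == "blue"
left = lambda e, h: h == "left"

# Sum rule: P(A|B) + P(not-A|B) = 1
assert P(blue) + P(lambda e, h: not blue(e, h)) == 1

# Product rule, both factorizations:
joint = P(lambda e, h: blue(e, h) and left(e, h))
assert joint == P(blue) * P_cond(left, blue) == P(left) * P_cond(blue, left)
print("Sum and product rules verified; P(blue, left) =", joint)
```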

  11. Bayes’ Theorem
Bayes’ Theorem: P(Hi|D,I) = P(Hi|I) P(D|Hi,I) / P(D|I)
Hi = proposition asserting the truth of a hypothesis of interest
I = proposition representing the prior information
D = proposition representing the data
P(D|Hi,I) = likelihood: probability of obtaining the data given that the hypothesis is true
P(Hi|I) = prior: probability of the hypothesis before the new data
P(Hi|D,I) = posterior: probability of the hypothesis after the new data
P(D|I) = normalization factor (so the probabilities of all hypotheses Hi sum to 1)
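A minimal sketch of a discrete Bayes-theorem update; the two coin hypotheses and their numbers are invented for illustration.

```python
# Two hypotheses about a coin: fair, or biased with P(heads) = 0.8.
priors = {"fair": 0.5, "biased": 0.5}   # P(Hi | I)
p_heads = {"fair": 0.5, "biased": 0.8}  # defines the likelihood

data = ["H", "H", "T"]                  # observed tosses (made up)

def likelihood(h, data):
    """P(D | Hi, I): product of per-toss probabilities, assuming independence."""
    p = 1.0
    for toss in data:
        p *= p_heads[h] if toss == "H" else 1 - p_heads[h]
    return p

unnorm = {h: priors[h] * likelihood(h, data) for h in priors}
evidence = sum(unnorm.values())         # P(D | I), the normalization factor
posterior = {h: v / evidence for h, v in unnorm.items()}
print(posterior)                        # posteriors sum to 1
```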

  12. Example: The Gambler’s coin problem
P(H|D,I) = P(H|I) P(D|H,I) / P(D|I)
Normalization factor: ignore this for now, as we only need relative merit.
Prior: what do we know about the coin? Assume the probability of a head, H, is uniformly distributed on [0, 1].
Likelihood: if the data D give R heads in N tosses, then P(D|H,I) ∝ H^R (1-H)^(N-R).
The full distribution, assuming independence of throws, is the binomial distribution; we omit the terms not containing H and use a proportionality.
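A grid-based sketch of this posterior, for a hypothetical R heads in N tosses; with a uniform prior the posterior is just the normalized likelihood.

```python
import numpy as np

R, N = 7, 10                     # hypothetical data: 7 heads in 10 tosses
H = np.linspace(0, 1, 1001)      # grid over the coin's head probability

prior = np.ones_like(H)          # uniform prior on [0, 1]
likelihood = H**R * (1 - H)**(N - R)

posterior = prior * likelihood
posterior /= np.trapz(posterior, H)   # normalize so the pdf integrates to 1

print("posterior peaks at H =", H[np.argmax(posterior)])  # at R/N = 0.7
```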

  13. Data Example: A fair coin?
Observed tosses: H H T T [figure slide]
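Feeding the slide's four tosses through the same grid update, one toss at a time (a continuation of the sketch above):

```python
import numpy as np

H = np.linspace(0, 1, 1001)
posterior = np.ones_like(H)                 # start from the uniform prior
for toss in "HHTT":
    posterior *= H if toss == "H" else (1 - H)
    posterior /= np.trapz(posterior, H)     # renormalize after each toss
    print(toss, "-> posterior peak at H =", round(H[np.argmax(posterior)], 2))
```

After H H the peak sits at H = 1, then the two tails pull it back: H H T peaks at 2/3, and H H T T peaks at 0.5, consistent with a fair coin.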

  14. Example: A fair coin? [figure slide]

  15. The effects of the Prior [figure slide]
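A sketch of how the prior pulls on the same data; the data (7 heads in 10 tosses) and the three prior shapes are my own choices for illustration. With little data the prior matters; with enough data the likelihood dominates all of them.

```python
import numpy as np

H = np.linspace(1e-6, 1 - 1e-6, 1001)       # avoid the endpoints for the U-shaped prior
R, N = 7, 10                                # hypothetical: 7 heads in 10 tosses
likelihood = H**R * (1 - H)**(N - R)

# Three illustrative priors (invented for this sketch).
priors = {
    "uniform":       np.ones_like(H),
    "peaked at 0.5": H**50 * (1 - H)**50,    # strong prior belief in a fair coin
    "U-shaped":      H**-0.5 * (1 - H)**-0.5,  # favours extreme biases
}

for name, prior in priors.items():
    post = prior * likelihood
    post /= np.trapz(post, H)               # normalize each posterior
    mean = np.trapz(H * post, H)
    print(f"{name:14s} posterior mean = {mean:.3f}")
```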
