
Bayesian Statistics: Asking the “Right” Questions

Presentation Transcript


  1. Bayesian Statistics: Asking the “Right” Questions Michael L. Raymer, Ph.D.

  2. Statistical Games “The defendant’s DNA is consistent with the evidentiary sample, and the defendant’s DNA type occurs with a frequency of one in 10,000,000,000.” “Only about 0.1% of wife batterers actually murder their wives. Therefore, evidence of abuse and battering should not be admissible in a murder trial.” M. Raymer – WSU, FBS

  3. The Question • “Given the evidentiary DNA type and the defendant’s DNA type, what is the probability that the evidence sample contains the defendant’s DNA?” • Information available: • How common is each allele in a particular population? • CPI, RMP, etc.

  4. An Example Problem • Suppose the rate of breast cancer is 1% • Mammograms detect breast cancer in 80% of cases where it is present • 10% of the time, mammograms will indicate breast cancer in a healthy patient • If a woman has a positive mammogram result, what is the probability that she has breast cancer?

  5. Results • 75% -- 3 • 50% -- 1 • 25% -- 2 • <10% -- a lot

  6. Determining Probabilities • Counting all possible outcomes • If you flip a coin 4 times, what is the probability that you will get heads exactly twice? • TTTT THTT HTTT HHTT • TTTH THTH HTTH HHTH • TTHT THHT HTHT HHHT • TTHH THHH HTHH HHHH • P(2 heads) = 6/16 = 0.375
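
The counting argument on this slide can be checked by brute force. The following is a minimal sketch that enumerates all 16 sequences and counts those with exactly two heads; the variable names are illustrative and not from the presentation.

```python
from itertools import product

# Enumerate every sequence of 4 coin flips (2**4 = 16 equally likely outcomes).
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

# Keep only the sequences that contain exactly two heads.
two_heads = [seq for seq in outcomes if seq.count("H") == 2]

p_two_heads = len(two_heads) / len(outcomes)
print(len(two_heads), len(outcomes), p_two_heads)  # 6 16 0.375
```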

  7. Statistical Preliminaries • Frequency and Probability • We can guess at probabilities by counting frequencies: • P(heads) = 0.5 • The law of large numbers: the more samples we take, the closer the observed frequency will get to 0.5.
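
A small simulation can illustrate the law of large numbers for a fair coin. This is only a sketch: the function name `heads_frequency` and the fixed seed are assumptions made here so that the run is repeatable.

```python
import random

def heads_frequency(n_flips, seed=0):
    """Return the fraction of heads in n_flips simulated fair coin flips."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The observed frequency should tend toward 0.5 as the sample grows.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

With few flips the frequency can sit well away from 0.5; with many flips it typically settles close to it.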

  8. Distributions • Counting frequencies gives us distributions • Gaussian Distribution (Continuous) • Binomial Distribution (Discrete)

  9. Density Estimation • Parametric • Assume a distribution (e.g., Gaussian). • Estimate the parameters (μ, σ). • Non-parametric • Histogram sampling • Bin size is critical • Gaussian smoothing can help
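
The two approaches on this slide can be compared on a toy sample. The sketch below assumes a made-up list of apple diameters (not data from the presentation), fits a Gaussian by estimating its mean and standard deviation, and builds a histogram density with a hypothetical helper `histogram_density` to show how the bin width changes the estimate.

```python
from collections import Counter
from statistics import NormalDist, mean, stdev

# Hypothetical apple diameters in inches (illustrative data only).
diameters = [2.6, 2.9, 3.0, 3.1, 3.1, 3.2, 3.3, 3.4, 3.6, 3.8]

# Parametric: assume a Gaussian and estimate its two parameters.
mu, sigma = mean(diameters), stdev(diameters)
gaussian = NormalDist(mu, sigma)

# Non-parametric: histogram with a chosen bin width.
def histogram_density(data, bin_width):
    """Map each bin's left edge to its density estimate, count / (n * width)."""
    counts = Counter(int(x // bin_width) for x in data)
    n = len(data)
    return {k * bin_width: c / (n * bin_width) for k, c in sorted(counts.items())}

print(f"mu={mu:.2f}, sigma={sigma:.2f}, pdf(3.0)={gaussian.pdf(3.0):.3f}")
print(histogram_density(diameters, bin_width=0.5))   # coarse bins
print(histogram_density(diameters, bin_width=0.25))  # finer bins, noisier shape
```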

  10. Combining Probabilities • Non-overlapping outcomes: • Possible Overlap: • Independent Events: The Product Rule
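
The formulas for these three cases did not survive in the transcript. The standard rules they most likely refer to are sketched below; the event names A and B are assumptions, not the slide's own notation.

```latex
\begin{align*}
P(A \text{ or } B) &= P(A) + P(B)
  && \text{(non-overlapping outcomes)} \\
P(A \text{ or } B) &= P(A) + P(B) - P(A \text{ and } B)
  && \text{(possible overlap)} \\
P(A \text{ and } B) &= P(A)\,P(B)
  && \text{(independent events: the product rule)}
\end{align*}
```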

  11. Product Rule Example • P(Engine > 200 H.P.) = 0.2 • P(Color = red) = 0.3 • Assuming independence: • P(Red & Fast) = 0.2 × 0.3 = 0.06 • 1/4 * 1/10 * 1/6 * 1/8 * 1/5 ≈ 1/10,000
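
The arithmetic on this slide can be reproduced directly. The sketch below multiplies the two car probabilities and then evaluates the five-factor product exactly with `Fraction`; the variable names are illustrative.

```python
from fractions import Fraction
from math import prod

# Independent events: multiply the individual probabilities.
p_fast = 0.2  # P(Engine > 200 H.P.)
p_red = 0.3   # P(Color = red)
p_red_and_fast = p_fast * p_red
print(round(p_red_and_fast, 2))  # 0.06

# The five-factor product from the slide, kept exact with Fraction.
factors = [Fraction(1, 4), Fraction(1, 10), Fraction(1, 6),
           Fraction(1, 8), Fraction(1, 5)]
combined = prod(factors)
print(combined)  # 1/9600
```

The exact product is 1/9600, which the slide rounds to about 1 in 10,000.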

  12. Statistical Decision Making • One variable: A ring was found at the scene of the crime. The ring is size 11. The defendant’s ring size is also 11. If a random ring were left at the crime scene, what is the probability that it would have been size 11?

  13. Multiple Variables • Assume independence: • Note what happens to significant digits! The ring is size 11, and also made of platinum.

  14. Which Question? • If a fruit has a diameter of 4”, how likely is it to be an apple? • [Figure: two sets, labeled 4” Fruit and Apples]

  15. “Inverting” the question • Given an apple, what is the probability that it will have a diameter of 4”? • Given a 4” diameter fruit, what is the probability that it is an apple?

  16. Forensic DNA Evidence • Given alleles (17, 17), (19, 21), (14, 15.1), what is the probability that a DNA sample belongs to Bob? • Find all (17, 17), (19, 21), (14, 15.1) individuals: how many of them are Bob? • How common are 17, 19, 21, 14, and 15.1 in “the population”?

  17. Conditional Probabilities • For related events, we can express probability conditionally: • Statistical Independence:
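
The two formulas on this slide are missing from the transcript. The standard definitions are sketched below, with A and B as assumed event names.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \text{ and } B)}{P(B)}
  && \text{(conditional probability, for } P(B) > 0\text{)} \\
P(A \mid B) &= P(A)
  && \text{(statistical independence)}
\end{align*}
```

Under independence, knowing B tells us nothing about A, which is equivalent to P(A and B) = P(A) P(B).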

  18. Bayesian Decision Making • Terminology • We have an object, and we want to decide if it belongs to a class • Is this fruit a type of apple? • Does this DNA come from a Caucasian American? • Is this car a sports car? • We measure features of the object (evidence): • Size, weight, color • Alleles at various loci

  19. Bayesian Notation • Feature/Evidence Vector: • Classes & Posterior Probability:

  20. A Simple Example • You are given a fruit with a diameter of 4” – is it a pear or an apple? • To begin, we need to know the distributions of diameters for pears and apples.

  21. Maximum Likelihood • [Figure: class-conditional distributions, P(x) plotted against diameter from 1” to 6”]

  22. A Key Problem • We based this decision on the class-conditional probability • What we really want to use is the posterior probability • What if we found the fruit in a pear orchard? • We need to know the prior probability of finding an apple or a pear!

  23. Prior Probabilities • Prior probability + Evidence → Posterior Probability • Without evidence, what is the “prior probability” that a fruit is an apple? • What is the prior probability that a DNA sample comes from the defendant?

  24. The heart of it all • Bayes Rule

  25. Bayes Rule or
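
The formulas themselves were lost in the transcript. A standard statement of Bayes rule is sketched below, where the symbols C_i (a class) and E (the evidence) are assumptions and may differ from the notation on the original slides.

```latex
\begin{align*}
P(C_i \mid E) &= \frac{P(E \mid C_i)\,P(C_i)}{P(E)}
  = \frac{P(E \mid C_i)\,P(C_i)}{\sum_j P(E \mid C_j)\,P(C_j)} \\[4pt]
\text{posterior} &= \frac{\text{class-conditional probability} \times \text{prior}}{\text{evidence}}
\end{align*}
```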

  26. Example Revisited • Is it an ordinary apple or an uncommon pear?

  27. Bayes Rule Example

  28. Bayes Rule Example

  29. Posing the question • What are the classes? • What is the evidence? • What is the prior probability? • What is the class-conditional probability?

  30. An Example Problem • Suppose the rate of breast cancer is 1% • Mammograms detect breast cancer in 80% of cases where it is present • 10% of the time, mammograms will indicate breast cancer in a healthy patient • If a woman has a positive mammogram result, what is the probability that she has breast cancer?

  31. Practice Problem Revisited • Classes: healthy, cancer • Evidence: positive mammogram (pos), negative mammogram (neg) • If a woman has a positive mammogram result, what is the probability that she has breast cancer?

  32. A Counting Argument • Suppose we have 1000 women • 10 will have breast cancer • 8 of these will have a positive mammogram • 990 will not have breast cancer • 99 of these will have a positive mammogram • Of the 107 women with a positive mammogram, 8 have breast cancer • 8/107 ≈ 0.075 = 7.5%
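
The counting argument translates directly into a few lines of arithmetic. The sketch below uses the rates stated in the problem (1% prevalence, 80% detection, 10% false positives); the variable names are illustrative.

```python
# Counting argument with the problem's numbers: 1% prevalence,
# 80% detection rate, 10% false-positive rate.
population = 1000
with_cancer = round(population * 0.01)            # 10 women
without_cancer = population - with_cancer         # 990 women
true_positives = round(with_cancer * 0.80)        # 8 positive mammograms
false_positives = round(without_cancer * 0.10)    # 99 positive mammograms
all_positives = true_positives + false_positives  # 107 positive mammograms

p_cancer_given_positive = true_positives / all_positives
print(f"{true_positives}/{all_positives} = {p_cancer_given_positive:.3f}")  # 8/107 = 0.075
```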

  33. Solution
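
The worked solution is not preserved in the transcript. One way to reach the same answer as the counting argument is to apply Bayes rule for two classes, as in the sketch below; the function `posterior` and its parameter names are assumptions introduced here, not the presenter's code.

```python
def posterior(prior, p_evidence_given_class, p_evidence_given_other):
    """Two-class Bayes rule: P(class | evidence).

    prior                  -- P(class) before seeing the evidence
    p_evidence_given_class -- P(evidence | class)
    p_evidence_given_other -- P(evidence | not class)
    """
    numerator = p_evidence_given_class * prior
    evidence = numerator + p_evidence_given_other * (1.0 - prior)
    return numerator / evidence

# Mammogram problem: P(cancer) = 0.01, P(pos | cancer) = 0.80, P(pos | healthy) = 0.10.
p_cancer_given_pos = posterior(prior=0.01,
                               p_evidence_given_class=0.80,
                               p_evidence_given_other=0.10)
print(f"{p_cancer_given_pos:.4f}")  # 0.0748
```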

  34. An Example Problem • Suppose the chance of a randomly chosen person being guilty is 0.001 • When a person is guilty, a DNA sample will match that individual 99% of the time. • 0.0001 of the time, a DNA sample will exhibit a false match for an innocent individual • If a DNA test demonstrates a match, what is the probability of guilt?

  35. Solution
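
The slide's worked solution is missing from the transcript. Applying Bayes rule to the numbers stated in the problem gives the following; the event labels are assumptions chosen for readability.

```latex
\begin{align*}
P(\text{guilty} \mid \text{match})
  &= \frac{P(\text{match} \mid \text{guilty})\,P(\text{guilty})}
          {P(\text{match} \mid \text{guilty})\,P(\text{guilty})
           + P(\text{match} \mid \text{innocent})\,P(\text{innocent})} \\[4pt]
  &= \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.0001 \times 0.999}
   = \frac{0.00099}{0.0010899} \approx 0.908
\end{align*}
```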

  36. Marginal Distributions

  37. Combining Marginals • Assuming independent features: • If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier).
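
A Naïve Bayes decision maker can be sketched in a few lines. Everything numeric below is hypothetical: the priors, the Gaussian parameters for each feature, and the feature names `diameter_in` and `weight_oz` are made up for illustration and do not come from the presentation.

```python
from statistics import NormalDist

# Hypothetical class models: one Gaussian per feature plus a prior.
# All numbers are invented for illustration.
CLASSES = {
    "apple": {"prior": 0.7,
              "diameter_in": NormalDist(3.2, 0.4),
              "weight_oz": NormalDist(6.5, 1.0)},
    "pear": {"prior": 0.3,
             "diameter_in": NormalDist(2.8, 0.3),
             "weight_oz": NormalDist(6.0, 1.2)},
}

def naive_bayes_posteriors(features):
    """Posterior for each class, treating the features as independent."""
    scores = {}
    for name, model in CLASSES.items():
        score = model["prior"]
        for feature, value in features.items():
            # Independence assumption: multiply the per-feature densities.
            score *= model[feature].pdf(value)
        scores[name] = score
    total = sum(scores.values())  # the evidence term, shared by all classes
    return {name: s / total for name, s in scores.items()}

post = naive_bayes_posteriors({"diameter_in": 4.0, "weight_oz": 7.0})
print(max(post, key=post.get), post)
```

With these invented parameters a 4” fruit is far out in the tail of the pear distribution, so the decision is "apple" even before the prior is taken into account.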

  38. Bayes Decision Rule • Provably optimal when the features (evidence) follow Gaussian distributions and are independent.

  39. Forensic DNA • Classes: DNA from defendant, DNA not from defendant • Evidence: Allele matches at various loci • Assumption of independence • Prior Probabilities? • Assumed equal (0.5) • What is the true prior probability that an evidence sample came from a particular individual?

  40. The Importance of Priors
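
The content of this slide is not preserved, but the effect of the prior is easy to demonstrate. The sketch below reuses the match rates from the DNA example problem (99% true match, 0.0001 false match) and varies only the prior; the set of priors tried and the function name are assumptions.

```python
P_MATCH_GIVEN_SOURCE = 0.99        # P(match | sample is from this person)
P_MATCH_GIVEN_NOT_SOURCE = 0.0001  # P(match | sample is from someone else)

def posterior_given_match(prior):
    """P(source | match) for a given prior P(source), by two-class Bayes rule."""
    numerator = P_MATCH_GIVEN_SOURCE * prior
    return numerator / (numerator + P_MATCH_GIVEN_NOT_SOURCE * (1.0 - prior))

# Same evidence, three different priors.
for prior in (0.5, 0.001, 1e-6):
    print(f"prior={prior:g}  posterior={posterior_given_match(prior):.4f}")
```

The same match is nearly conclusive with an equal prior of 0.5, gives about 0.91 with a prior of 0.001, and falls below 0.01 with a prior of one in a million.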

  41. Likelihood Ratios • When deciding between two possibilities, we don’t need the exact probabilities. We only need to know which one is greater. • The denominator for all the classes is always equal. • Can be eliminated • Useful when there are many possible classes
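
The cancellation described here can be written out explicitly. In the sketch below, C_1 and C_2 (the two classes) and E (the evidence) are assumed symbols; the numeric line reuses the rates from the DNA example problem purely as an illustration.

```latex
\begin{align*}
\frac{P(C_1 \mid E)}{P(C_2 \mid E)}
  &= \frac{P(E \mid C_1)\,P(C_1)\,/\,P(E)}{P(E \mid C_2)\,P(C_2)\,/\,P(E)}
   = \underbrace{\frac{P(E \mid C_1)}{P(E \mid C_2)}}_{\text{likelihood ratio}}
     \times
     \underbrace{\frac{P(C_1)}{P(C_2)}}_{\text{prior odds}} \\[4pt]
\text{e.g.}\quad
  & \frac{0.99}{0.0001} \times \frac{0.001}{0.999}
   = 9900 \times \frac{0.001}{0.999} \approx 9.91
\end{align*}
```

A ratio greater than 1 favors C_1; the shared denominator P(E) never has to be computed.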

  42. Likelihood Ratio Example

  43. Likelihood Ratio Example

  44. From alleles to identity: • It is relatively easy to find the allele frequencies in the population • Marginal probability distributions • Independence assumption • Class conditional probabilities • Equal prior probabilities • Bayesian posterior probability estimate

  45. Thank you.

  46. A Key Advantage • The oldest citation: T. Bayes. “An essay towards solving a problem in the doctrine of chances.” Phil. Trans. Roy. Soc., 53, 1763.
