1. A Quick Overview of Probability Tom Mitchell
Machine Learning 10-601
Jan 21 2009
a significant amount of this material is pilfered from Andrew Moore’s slides and William Cohen’s slides
www.cs.cmu.edu/~awm/tutorials
http://www.cs.cmu.edu/~tom/10601_sp08/slides/probability-1-23-2008.ppt
2. The Problem of Induction David Hume (1711-1776): pointed out
Empirically, induction seems to work
Statement (1) is an application of induction.
This stumped people for about 200 years
3. A Second Problem of Induction A black crow seems to support the hypothesis “all crows are black”.
A pink highlighter supports the hypothesis “all non-black things are non-crows”
Thus, a pink highlighter supports the hypothesis “all crows are black”.
4. Probability Theory Events
discrete random variables, continuous random variables, compound events
Axioms of probability
What defines a reasonable theory of uncertainty
Independent events
Conditional probabilities
Bayes rule and beliefs
Joint probability distribution
5. Random Variables Informally, A is a random variable if
A denotes something about which we are uncertain
perhaps the outcome of a randomized experiment
Examples
A = True if a randomly drawn person from our class is female
A = Hometown of a randomly drawn person from our class
A = True if two randomly drawn persons from our class have same birthday
A = True if the 1,000,000,000,000th digit of pi is 7
Define P(A) as “the fraction of possible worlds in which A is true”
the set of possible worlds is called the sample space, S
A random variable A is a function defined over S
A: S → {0,1}
6. A little formalism More formally, we have
a sample space S (e.g., set of students in our class)
aka the set of possible worlds
a random variable is a function defined over the sample space
Gender: S → { m, f }
Weight: S → Reals
an event is a subset of S
e.g., the subset of S for which Gender=f
e.g., the subset of S for which (Gender=m) AND (nationality=US)
we’re often interested in probabilities of specific events
and specific events conditioned on other specific events
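As a rough illustration of these definitions, the sketch below treats a tiny made-up class roster as the sample space S and assumes every student is equally likely to be drawn. The roster, its field names, and the helpers `event` and `prob` are hypothetical and not from the original slides.

```python
from fractions import Fraction

# Hypothetical sample space S: each possible world is one student drawn from the class.
S = [
    {"name": "s1", "gender": "f", "nationality": "US", "weight": 61.0},
    {"name": "s2", "gender": "m", "nationality": "US", "weight": 80.5},
    {"name": "s3", "gender": "m", "nationality": "IN", "weight": 72.0},
    {"name": "s4", "gender": "f", "nationality": "CN", "weight": 55.5},
]

# Random variables are functions defined over S.
def gender(s):
    return s["gender"]          # Gender: S -> {m, f}

def weight(s):
    return s["weight"]          # Weight: S -> Reals

def nationality(s):
    return s["nationality"]

def event(pred):
    """An event is the subset of S on which the predicate holds."""
    return [s for s in S if pred(s)]

def prob(ev, given=None):
    """P(ev), or P(ev | given), assuming all worlds in S are equally likely."""
    universe = S if given is None else given
    hits = [s for s in universe if s in ev]
    return Fraction(len(hits), len(universe))

female = event(lambda s: gender(s) == "f")
us = event(lambda s: nationality(s) == "US")
male_and_us = event(lambda s: gender(s) == "m" and nationality(s) == "US")

p_female = prob(female)                        # P(Gender=f)
p_male_and_us = prob(male_and_us)              # P(Gender=m AND nationality=US)
p_male_given_us = prob(male_and_us, given=us)  # P(Gender=m | nationality=US)
```

Conditioning simply shrinks the universe of possible worlds to the conditioning event; `prob` would fail on an empty conditioning event, which matches the fact that P(A | B) is undefined when P(B) = 0.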
7. Visualizing A
8. The Axioms of Probability 0 <= P(A) <= 1
P(True) = 1
P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
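The axioms can be checked numerically on a small example. The sketch below assumes a fair six-sided die as the sample space (an example chosen here, not taken from the slides) and defines P as the fraction of possible worlds in which an event holds.

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die, all equally likely.
S = range(1, 7)

def P(event):
    """Fraction of possible worlds in S for which the event is true."""
    return Fraction(sum(1 for w in S if event(w)), len(S))

def A(w):
    return w % 2 == 0      # the roll is even

def B(w):
    return w > 3           # the roll is greater than 3

p_true = P(lambda w: True)                 # P(True)
p_false = P(lambda w: False)               # P(False)
p_a_or_b = P(lambda w: A(w) or B(w))       # left-hand side of the fourth axiom
p_sum = P(A) + P(B) - P(lambda w: A(w) and B(w))  # right-hand side
```

Subtracting P(A and B) corrects for the worlds (here the rolls 4 and 6) that would otherwise be counted twice.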
10. These Axioms are Not to be Trifled With There have been many many other approaches to understanding “uncertainty”:
Fuzzy Logic, three-valued logic, Dempster-Shafer, non-monotonic reasoning, …
25 years ago people in AI argued about these; now they mostly don’t
Any scheme for combining uncertain information, uncertain “beliefs”, etc,… really should obey these axioms
If you gamble based on “uncertain beliefs”, then [you can be exploited by an opponent] ⇔ [your uncertainty formalism violates the axioms] - de Finetti 1931 (the “Dutch book argument”)
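A minimal sketch of one simple Dutch book, under assumptions made here for illustration: the agent treats its stated probability for an event as the fair price of a ticket that pays 1 if the event occurs, and an opponent sells it one ticket on A and one on not A. The function name and the specific prices are hypothetical; this shows only the flavor of the argument, not the full theorem.

```python
from fractions import Fraction

def agent_net_payoff(price_a, price_not_a, a_is_true):
    """Net gain for an agent who buys one ticket on A and one on not A.

    Each ticket pays 1 if its event occurs and 0 otherwise. The agent pays
    its own stated probability for each ticket, believing that price is fair.
    """
    cost = price_a + price_not_a
    payout = (1 if a_is_true else 0) + (0 if a_is_true else 1)
    return payout - cost

# Incoherent beliefs: P(A) + P(not A) = 6/5, violating P(not A) = 1 - P(A).
incoherent = (Fraction(3, 5), Fraction(3, 5))
# Coherent beliefs for comparison.
coherent = (Fraction(3, 5), Fraction(2, 5))

loss_if_true = agent_net_payoff(*incoherent, a_is_true=True)
loss_if_false = agent_net_payoff(*incoherent, a_is_true=False)
fair_if_true = agent_net_payoff(*coherent, a_is_true=True)
fair_if_false = agent_net_payoff(*coherent, a_is_true=False)
```

With the incoherent prices the agent loses 1/5 whether or not A turns out to be true, so the opponent profits with no risk; with the coherent prices the same pair of bets breaks even.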
11. Interpreting the axioms 0 <= P(A) <= 1
P(True) = 1
P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
14. Theorems from the Axioms 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
⇒ P(not A) = P(~A) = 1 - P(A)
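One way to see why this follows, sketched here for illustration rather than taken from the slides: substitute B = ~A into the fourth axiom and use the logical facts that (A or ~A) is True and (A and ~A) is False.

```latex
\begin{align*}
P(A \lor \lnot A) &= P(A) + P(\lnot A) - P(A \land \lnot A) \\
P(\text{True})    &= P(A) + P(\lnot A) - P(\text{False}) \\
1                 &= P(A) + P(\lnot A) - 0 \\
P(\lnot A)        &= 1 - P(A)
\end{align*}
```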
16. Elementary Probability in Pictures P(~A) + P(A) = 1
17. Another useful theorem 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
⇒ P(A) = P(A ^ B) + P(A ^ ~B)
18. Elementary Probability in Pictures P(A) = P(A ^ B) + P(A ^ ~B)
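A possible derivation, supplied here as a sketch: A is logically equivalent to (A ^ B) or (A ^ ~B), and those two pieces can never both be true, so the fourth axiom applies with an overlap term of P(False) = 0.

```latex
\begin{align*}
P(A) &= P\big((A \land B) \lor (A \land \lnot B)\big) \\
     &= P(A \land B) + P(A \land \lnot B)
        - P\big((A \land B) \land (A \land \lnot B)\big) \\
     &= P(A \land B) + P(A \land \lnot B) - P(\text{False}) \\
     &= P(A \land B) + P(A \land \lnot B)
\end{align*}
```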
19. Multivalued Discrete Random Variables Suppose A can take on more than 2 values
A is a random variable with arity k if it can take on exactly one value out of {v1,v2, .. vk}
Thus…
20. Elementary Probability in Pictures
21. More about Multivalued Random Variables Using the axioms of probability…
0 <= P(A) <= 1, P(True) = 1, P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
And assuming that A obeys…
22. More about Multivalued Random Variables Using the axioms of probability and assuming that A obeys…
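The property that A is assumed to obey is elided in this transcript. Assuming it is the one implied by the definition of arity (A takes exactly one of its k values, so the values are mutually exclusive and exhaustive), the sketch below checks two consequences on made-up numbers: the probabilities of the values sum to 1, and P(B) can be recovered by summing P(B ^ A=vj) over all values. The variable names and every number are hypothetical.

```python
from fractions import Fraction

# Hypothetical distribution for A = hometown, a random variable of arity k = 3.
p_A = {
    "Pittsburgh": Fraction(1, 2),
    "Boston": Fraction(3, 10),
    "Austin": Fraction(1, 5),
}

# Exhaustive and mutually exclusive values: the probabilities sum to 1.
total = sum(p_A.values())

# Because the values are mutually exclusive, P(A=v1 or A=v2) is a plain sum.
p_first_two = p_A["Pittsburgh"] + p_A["Boston"]

# Hypothetical values of P(B ^ A=v) for some Boolean event B.
p_B_and_A = {
    "Pittsburgh": Fraction(1, 4),
    "Boston": Fraction(1, 10),
    "Austin": Fraction(1, 10),
}

# Summing over every value of A recovers P(B).
p_B = sum(p_B_and_A.values())
```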
23. Definition of Conditional Probability
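The formula on this slide did not survive extraction. The standard definition, which is presumably what the slide showed, is the fraction of worlds in which B is true that also have A true, with the chain rule as an immediate corollary:

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \land B)}{P(B)}, \qquad P(B) > 0 \\
P(A \land B) &= P(A \mid B)\, P(B)
\end{align*}
```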
24. Conditional Probability in Pictures
25. Independent Events Definition: two events A and B are independent if Pr(A and B)=Pr(A)*Pr(B).
Intuition: outcome of A has no effect on the outcome of B (and vice versa).
We need to assume the different rolls are independent to solve the problem.
You almost always need to assume independence of something to solve any learning problem.
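The definition can be tested by brute-force enumeration. The sketch below assumes two rolls of a fair die with all 36 ordered outcomes equally likely; that uniform assumption is itself what makes the two rolls independent. The event names and the helper `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: all ordered pairs of two fair die rolls.
S = list(product(range(1, 7), repeat=2))

def P(event):
    """Fraction of the 36 equally likely outcomes in which the event holds."""
    return Fraction(sum(1 for w in S if event(w)), len(S))

def first_is_six(w):
    return w[0] == 6

def second_is_even(w):
    return w[1] % 2 == 0

def total_is_twelve(w):
    return w[0] + w[1] == 12

def independent(e1, e2):
    """Check the definition Pr(A and B) = Pr(A) * Pr(B)."""
    return P(lambda w: e1(w) and e2(w)) == P(e1) * P(e2)

rolls_independent = independent(first_is_six, second_is_even)
sum_independent = independent(first_is_six, total_is_twelve)
```

Events about different rolls pass the check, while an event about the total fails it, since knowing the first roll is a six changes the chance that the total is twelve.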
28. More General Forms of Bayes Rule
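The formulas for these slides are missing from the transcript. For reference, the standard statements of Bayes rule are given below, for a Boolean A and for a multivalued A with values v1 .. vk; these may differ in notation from what the slides showed.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)} \\[4pt]
P(A \mid B) &= \frac{P(B \mid A)\, P(A)}
                    {P(B \mid A)\, P(A) + P(B \mid \lnot A)\, P(\lnot A)} \\[4pt]
P(A = v_i \mid B) &= \frac{P(B \mid A = v_i)\, P(A = v_i)}
                          {\sum_{j=1}^{k} P(B \mid A = v_j)\, P(A = v_j)}
\end{align*}
```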
34. The Joint Distribution
38. Using the Joint
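The slide bodies here were figures and did not survive. As a sketch of the usual idea, a joint distribution over Boolean variables can be stored as one row per combination of values, with row probabilities summing to 1, and the probability of any event is the sum of the rows that match it. The three variables A, B, C and all the numbers below are made up for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution over Boolean variables (A, B, C).
# Keys are (A, B, C) truth assignments; values are row probabilities.
joint = {
    (False, False, False): Fraction(30, 100),
    (False, False, True):  Fraction(5, 100),
    (False, True,  False): Fraction(10, 100),
    (False, True,  True):  Fraction(5, 100),
    (True,  False, False): Fraction(5, 100),
    (True,  False, True):  Fraction(10, 100),
    (True,  True,  False): Fraction(25, 100),
    (True,  True,  True):  Fraction(10, 100),
}

def prob(event):
    """P(event) = sum of the probabilities of the rows matching the event."""
    return sum(p for row, p in joint.items() if event(row))

total = sum(joint.values())                          # must be 1
p_a = prob(lambda r: r[0])                           # P(A)
p_a_and_not_b = prob(lambda r: r[0] and not r[1])    # P(A ^ ~B)
p_a_or_b = prob(lambda r: r[0] or r[1])              # P(A or B)
```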
41. Inference with the Joint
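The slide content is missing here. A minimal sketch of inference from a joint distribution, assuming the usual recipe P(E1 | E2) = P(E1 ^ E2) / P(E2) with both terms computed by summing matching rows. The two variables and every number in the table are invented for illustration and are not medical data.

```python
from fractions import Fraction

# Hypothetical joint distribution over (meningitis, sore_neck); numbers are made up.
joint = {
    (True,  True):  Fraction(4, 10000),
    (True,  False): Fraction(1, 10000),
    (False, True):  Fraction(996, 10000),
    (False, False): Fraction(8999, 10000),
}

def prob(event):
    """Sum the probabilities of all rows in which the event holds."""
    return sum(p for row, p in joint.items() if event(row))

def cond_prob(e1, e2):
    """P(e1 | e2) = P(e1 ^ e2) / P(e2); undefined when P(e2) is 0."""
    p_e2 = prob(e2)
    if p_e2 == 0:
        raise ValueError("conditioning event has probability 0")
    return prob(lambda r: e1(r) and e2(r)) / p_e2

def meningitis(r):
    return r[0]

def sore_neck(r):
    return r[1]

prior = prob(meningitis)                      # P(meningitis)
posterior = cond_prob(meningitis, sore_neck)  # P(meningitis | sore neck)
```

In this invented table the evidence raises the probability from 1/2000 to 1/250, which is still small because sore necks are far more common than meningitis.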
51. Inference is a big deal I’ve got this evidence. What’s the chance that this conclusion is true?
I’ve got a sore neck: how likely am I to have meningitis?
I see my lights are out and it’s 9pm. What’s the chance my spouse is already asleep?