CS 416 Artificial Intelligence, Lecture 14: Uncertainty (Chapter 13)
An apology to Red Sox fans • The only team in baseball history to force a game seven after trailing a series 3-0 • I was playing the probabilities…
Shortcomings of first-order logic • Consider dental diagnosis with a diagnostic rule: Toothache ⇒ Cavity • Not all patients with toothaches have cavities; there are other causes of toothache
Shortcomings of first-order logic • We could expand the rule: Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ∨ … • What’s wrong with this? • An unlimited number of toothache causes
Shortcomings of first-order logic • Alternatively, create a causal rule: Cavity ⇒ Toothache • Again, not all cavities cause pain. The rule must be expanded with qualifications
Shortcomings of first-order logic • Both diagnostic and causal rules require countless qualifications • Difficult to be exhaustive • Too much work • We don’t know all the qualifications • Even correctly qualified rules may not be useful if the data needed to apply them is missing at run time
Shortcomings of first-order logic • As an alternative to exhaustive logic… • Probability Theory • Serves as a hedge against our laziness and ignorance
Degrees of belief • I believe with 50% probability that the glass is full • Note this does not indicate the statement is half-true • We are not talking about a glass half-full • “The glass is full” is the only statement being considered • My statement indicates I believe with probability 0.5 that the statement is true. It makes no claims about any other beliefs I have regarding the glass. • Fuzzy logic is what handles partial truths
Decision Theory • What is rational behavior in context of probability? • Pick answer that satisfies goals with highest probability of actually working? • Sometimes more risk is acceptable • Must have a utility function that measures the many factors related to an agent’s happiness with an outcome • An agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all the possible outcomes of the action
Building probability notation • Propositions • Like propositional logic. The things we believe • Atomic Events • A complete specification of the state of the world • Prior Probability • Probability something is true in absence of other data • Conditional Probability • Probability something is true given something else is known
Propositions • Like propositional logic • Random variables refer to parts of the world with unknown status • Random variables have a well-defined domain • Boolean • Discrete (countable) • Continuous
Atomic events • A complete specification of the world • All variables in the world are assigned values • Atomic events are mutually exclusive – only one can be true • The set of all atomic events is exhaustive – at least one must be true • Any atomic event entails the truth or falsehood of every proposition
Prior probability • The degree of belief in the absence of other info • P(Weather): • P(Weather = sunny) = 0.7 • P(Weather = rainy) = 0.2 • P(Weather = cloudy) = 0.08 • P(Weather = snowy) = 0.02 • P(Weather) = <0.7, 0.2, 0.08, 0.02> • Probability distribution for the random variable Weather
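A minimal sketch of how such a prior distribution could be represented in code (the dictionary name and structure are illustrative, not part of the lecture):

```python
# Prior distribution for the random variable Weather as a simple dict.
# The values must sum to 1 to form a valid probability distribution.
weather_prior = {"sunny": 0.7, "rainy": 0.2, "cloudy": 0.08, "snowy": 0.02}

assert abs(sum(weather_prior.values()) - 1.0) < 1e-9
print(weather_prior["sunny"])  # P(Weather = sunny) = 0.7
```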
Prior probability - Discrete • Joint probability distribution • P(Weather, Natural Disaster) = an n × m table of probabilities • n = instances of weather • m = instances of natural disasters • Full joint probability distribution • Assigns a probability to every combination of values of every variable • What about continuous variables, where a table won’t suffice?
Prior probability - Continuous • Probability density functions (PDFs) • P(X = x) = U[18, 26](x), the uniform density on the interval [18, 26] • The probability density for tomorrow’s temperature being 20.5 degrees Celsius is U[18, 26](20.5) = 1/(26 − 18) = 0.125 • For a continuous variable this is a density, not a probability; probabilities come from integrating the density over an interval
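As a quick check of the arithmetic, a small sketch of the uniform density (the function name is illustrative):

```python
def uniform_pdf(x, lo=18.0, hi=26.0):
    """Density of the Uniform[lo, hi] distribution at x."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

print(uniform_pdf(20.5))  # 0.125 -- a density value, not a probability
```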
Conditional probability • The probability of a given that all we know is b • P(a | b) • Written in terms of unconditional probabilities: P(a | b) = P(a ∧ b) / P(b), whenever P(b) > 0 • Equivalently, the product rule: P(a ∧ b) = P(a | b) P(b)
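A tiny illustration of the definition above; the numbers 0.12 and 0.2 are the P(cavity ∧ toothache) and P(toothache) values that appear later on the normalization slide:

```python
def conditional(p_a_and_b, p_b):
    """P(a | b) = P(a ^ b) / P(b), defined only when P(b) > 0."""
    if p_b == 0:
        raise ValueError("P(a | b) is undefined when P(b) = 0")
    return p_a_and_b / p_b

print(conditional(0.12, 0.2))  # P(cavity | toothache) = 0.6
```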
Axioms of probability • All probabilities are between 0 and 1: 0 ≤ P(a) ≤ 1 • Necessarily true propositions have probability 1; necessarily false propositions have probability 0 • The probability of a disjunction is: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
Using axioms of probability • The probability of a proposition is equal to the sum of the probabilities of the atomic events in which it holds: P(a) = ∑ P(e) over all atomic events e in which a holds
An example • Marginalization: P(Y) = ∑z P(Y, z) • Conditioning: P(Y) = ∑z P(Y | z) P(z)
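A sketch of marginalization over a full joint distribution, assuming the standard dental-example table whose entries are consistent with the numbers on the normalization slide below:

```python
# Full joint distribution over (Toothache, Catch, Cavity).
joint = {
    # (toothache, catch, cavity): probability
    (True,  True,  True):  0.108,
    (True,  False, True):  0.012,
    (False, True,  True):  0.072,
    (False, False, True):  0.008,
    (True,  True,  False): 0.016,
    (True,  False, False): 0.064,
    (False, True,  False): 0.144,
    (False, False, False): 0.576,
}

def prob(holds):
    """P(proposition) = sum of the atomic events in which the proposition holds."""
    return sum(p for event, p in joint.items() if holds(*event))

# Marginalization: P(cavity) = sum over toothache, catch of P(toothache, catch, cavity)
p_cavity = prob(lambda toothache, catch, cavity: cavity)                  # 0.2
p_cavity_or_toothache = prob(lambda toothache, catch, cavity: cavity or toothache)  # 0.28
print(p_cavity, p_cavity_or_toothache)
```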
Normalization • The two previous calculations had the same denominator, so we can replace 1/P(toothache) with a normalization constant α • P(Cavity | toothache) = α P(Cavity, toothache) • = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)] • = α [<0.108, 0.016> + <0.012, 0.064>] = α <0.12, 0.08> = <0.6, 0.4> • Generalized (X = Cavity, e = toothache, y = catch): P(X | e) = α P(X, e) = α ∑y P(X, e, y) • The entries P(X, e, y) are a subset of the full joint distribution
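A small sketch of the normalization step, reusing the numbers from this slide (the helper name is illustrative):

```python
def normalize(values):
    """Scale unnormalized probabilities so they sum to 1 (alpha = 1 / total)."""
    total = sum(values)
    return [v / total for v in values]

# P(Cavity | toothache) = alpha <P(cavity, toothache), P(~cavity, toothache)>
unnormalized = [0.108 + 0.012, 0.016 + 0.064]   # <0.12, 0.08>
print(normalize(unnormalized))                   # [0.6, 0.4]
```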
Using the full joint distribution • It does not scale well… • n Boolean variables • Table size O(2^n) • Processing time O(2^n)
Independence • Independence of variables in a domain can dramatically reduce the amount of information necessary to specify the full joint distribution • Adding Weather (four states) to the 8-cell dental table requires creating four versions of it (one for each weather state) = 8 × 4 = 32 cells
Independence • P(toothache, catch, cavity, Weather = cloudy) = P(Weather = cloudy | toothache, catch, cavity) × P(toothache, catch, cavity) • Because weather and dentistry are independent: P(Weather = cloudy | toothache, catch, cavity) = P(Weather = cloudy) • So P(toothache, catch, cavity, Weather = cloudy) = P(Weather = cloudy) × P(toothache, catch, cavity) • The 32-entry joint factors into a 4-cell Weather table and an 8-cell dental table
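An illustrative sketch of the factorization, assuming the Weather prior from the earlier slide and one entry of the dental joint table:

```python
# With Weather independent of the dental variables, the 32-entry joint never
# has to be stored; it factors into a 4-entry table and an 8-entry table.
p_weather = {"sunny": 0.7, "rainy": 0.2, "cloudy": 0.08, "snowy": 0.02}
p_dental = 0.108  # P(toothache, catch, cavity) from the full joint table

p_joint = p_weather["cloudy"] * p_dental
print(p_joint)  # P(toothache, catch, cavity, Weather = cloudy) = 0.00864
```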
Bayes’ Rule • P(b | a) = P(a | b) P(b) / P(a) • Useful when you know three of the terms and need the fourth
Example • Meningitis • Doctor knows meningitis causes a stiff neck 50% of the time: P(s | m) = 0.5 • Doctor knows unconditional facts • The probability of having meningitis is P(m) = 1 / 50,000 • The probability of having a stiff neck is P(s) = 1 / 20 • The probability of having meningitis given a stiff neck: P(m | s) = P(s | m) P(m) / P(s) = (0.5 × 1/50,000) / (1/20) = 0.0002
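The same computation as a short sketch (variable names are illustrative):

```python
def bayes(p_effect_given_cause, p_cause, p_effect):
    """Bayes' rule: P(cause | effect) = P(effect | cause) P(cause) / P(effect)."""
    return p_effect_given_cause * p_cause / p_effect

p_s_given_m = 0.5        # P(stiff neck | meningitis)
p_m = 1 / 50_000         # prior probability of meningitis
p_s = 1 / 20             # prior probability of a stiff neck

print(bayes(p_s_given_m, p_m, p_s))  # 0.0002, i.e. 1 in 5,000
```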
Power of Bayes’ rule • Why not collect diagnostic evidence directly? • Statistically sample to learn P(m | s) = 1 / 5,000 • If P(m) changes (e.g., due to an outbreak), the Bayes’ rule computation adjusts automatically, but the sampled P(m | s) is rigid
Conditional independence • Consider the infeasibility of full joint distributions • To combine multiple pieces of evidence with Bayes’ rule we would need P(toothache ∧ catch | Cavity) for every value of Cavity • Simplify using independence • Toothache and catch are not independent • Toothache and catch are independent given the presence or absence of a cavity
Conditional independence • Toothache and catch are independent given the presence or absence of a cavity: P(toothache ∧ catch | Cavity) = P(toothache | Cavity) P(catch | Cavity) • If you already know you have a cavity, there’s no reason to believe the toothache and the dentist’s pick catching are related
Conditional independence • In general, when a single cause influences multiple effects, all of which are conditionally independent given the cause: P(Cause, Effect1, …, Effectn) = P(Cause) ∏i P(Effecti | Cause)
Naïve Bayes • Even when “effect” variables are not conditionally independent, this model is sometimes used • Sometimes called a Bayesian Classifier
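A sketch of the single-cause model used as a classifier, assuming conditional probabilities derived from the dental joint table (which happens to satisfy the conditional-independence assumption exactly); function and variable names are illustrative:

```python
# P(Cause, e1, ..., en) = P(Cause) * prod_i P(ei | Cause), then normalize.
def naive_bayes_posterior(prior, likelihoods, observed):
    """Return P(Cause | observed effects), normalized over Cause in {True, False}."""
    scores = {}
    for cause in (True, False):
        p = prior if cause else 1 - prior
        for effect, value in observed.items():
            p_e = likelihoods[effect][cause]          # P(effect = True | cause)
            p *= p_e if value else 1 - p_e
        scores[cause] = p
    total = sum(scores.values())
    return {cause: score / total for cause, score in scores.items()}

# Conditional probabilities computed from the dental full joint distribution
likelihoods = {
    "toothache": {True: 0.6, False: 0.1},   # P(toothache | cavity), P(toothache | ~cavity)
    "catch":     {True: 0.9, False: 0.2},   # P(catch | cavity),     P(catch | ~cavity)
}
# P(Cavity | toothache, catch) is roughly <0.871, 0.129>
print(naive_bayes_posterior(prior=0.2, likelihoods=likelihoods,
                            observed={"toothache": True, "catch": True}))
```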