Probabilistic Reasoning CIS 479/579 Bruce R. Maxim UM-Dearborn
Uncertainty • Dealing with incomplete and uncertain data is an important part of many AI systems • Approaches • Ad Hoc Uncertainty Factors • Classical Probability Theory • Fuzzy Set Theory • Dempster-Shafer Theory of Evidence
Using Probabilistic Reasoning • Relevant world or domain is random • more knowledge does not allow us to describe the situation more precisely • Relevant domain is not random, but we rarely have access to enough data • experiments too costly or too dangerous • Domain is not random, just not described in sufficient detail • need to get more knowledge into the system
Certainty Factor Questions • How are certainties associated with rule inputs (e.g. antecedents)? • How does the rule translate input certainty to output certainty (i.e. how deterministic is the rule?) • How do you determine the certainty of facts supported by several rules?
Ad Hoc Approach • The minimum of the values on the interval [0,1] associated with each rule antecedent is the rule’s input certainty • Assume some attenuation factor or deterministic rule will be used as a multiplier to map input certainty to output certainty • When several rules support the same fact, the maximum of the rule output certainties is the overall certainty of the fact
Ad Hoc Example • Rule A1 translates input to output (0.9) * (1.0) = 0.9 • Rule A2 translates input to output (0.25) * (1.0) = 0.25 • Fact supported by A1 and A2 max(0.25, 0.9) = 0.9 • Input to Rule A7 min(0.9, 0.25) = 0.25 • Rule A7 translates input to output (0.25) * (0.8) = 0.2
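The min/attenuate/max scheme above can be sketched in a few lines. The rule names and numbers follow the slide's example; the function names are ours.

```python
def input_certainty(antecedents):
    """A rule's input certainty is the minimum of its antecedent certainties."""
    return min(antecedents)

def output_certainty(input_cert, attenuation):
    """The rule attenuates its input certainty by a fixed multiplier."""
    return input_cert * attenuation

def combine(certainties):
    """A fact supported by several rules takes the maximum output certainty."""
    return max(certainties)

# Rule A1: one antecedent at 0.9, attenuation 1.0
a1 = output_certainty(input_certainty([0.9]), 1.0)        # 0.9
# Rule A2: one antecedent at 0.25, attenuation 1.0
a2 = output_certainty(input_certainty([0.25]), 1.0)       # 0.25
# Fact supported by both A1 and A2
fact = combine([a1, a2])                                  # 0.9
# Rule A7: antecedents 0.9 and 0.25, attenuation 0.8
a7 = output_certainty(input_certainty([0.9, 0.25]), 0.8)  # 0.2
```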
Probability Axioms
P(E) = (number of desired outcomes) / (total number of outcomes) = |event| / |sample space|
P(not E) = P(~E) = 1 – P(E)
Additive Laws
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
If A and B are mutually exclusive, A ∩ B = ∅, so P(A ∩ B) = 0 and
P(A or B) = P(A) + P(B)
Multiplicative Laws
P(A and B) = P(A ∩ B) = P(A) * P(B|A) = P(B) * P(A|B)
For independent events: P(B|A) = P(B) and P(A|B) = P(A), so
P(A ∩ B) = P(A) * P(B)
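Both laws can be checked by counting outcomes directly. The single-die example below is ours, not from the slides; exact fractions avoid floating-point noise.

```python
from fractions import Fraction

# Sample space: one roll of a fair die. A = "even", B = "greater than 3".
space = range(1, 7)
A = {x for x in space if x % 2 == 0}   # {2, 4, 6}
B = {x for x in space if x > 3}        # {4, 5, 6}

def P(event):
    """Probability = |event| / |sample space|."""
    return Fraction(len(event), 6)

# Additive law: P(A or B) = P(A) + P(B) - P(A and B)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Conditional probability and the multiplicative law:
# P(A and B) = P(A) * P(B|A)
P_B_given_A = P(A & B) / P(A)
assert P(A & B) == P(A) * P_B_given_A
```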
Bayes Theorem
P(Hi | E) = probability Hi is true given evidence E
P(E | Hi) = probability E is observed given Hi
P(Hi) = probability Hi is true regardless of evidence
P(Hi | E) = P(E | Hi) * P(Hi) / P(E) = P(E | Hi) * P(Hi) / Σk=1..n P(E | Hk) * P(Hk)
Bayes Example • Prior probability it will rain P(H) = 0.8 • Conditional probabilities: Geese on the lake, given rain tomorrow P(E | H) = 0.02 Geese on lake, with no rain tomorrow P(E | ~H) = 0.025
Bayes Example • Evidence P(E) = P(E | H) * P(H) + P(E | ~H) * P(~H) = (0.02)*(0.8) + (0.025)*(0.2) = (0.016) + (0.005) = 0.021 • Posterior probability Rain given geese on lake P(H | E) = (P( E | H) * P(H)) / P(E) = (0.016 / 0.021) = 0.7619
Bayes Example • Posterior probability No rain given geese on lake P(~H | E) = (P( E | ~H) * P(~H)) / P(E) = (0.005 / 0.021) = 0.2381
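The geese/rain calculation above translates directly into code. This uses P(E|H) = 0.02, the value the slide's arithmetic actually uses.

```python
# Priors and likelihoods from the slide's example
p_h = 0.8             # prior: rain tomorrow
p_e_given_h = 0.02    # geese on the lake, given rain
p_e_given_not_h = 0.025  # geese on the lake, given no rain

# Total probability of the evidence:
# P(E) = P(E|H)*P(H) + P(E|~H)*P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)   # 0.021

# Posteriors via Bayes theorem
p_h_given_e = p_e_given_h * p_h / p_e                   # ~0.7619
p_not_h_given_e = p_e_given_not_h * (1 - p_h) / p_e     # ~0.2381
```

Note that the two posteriors sum to 1, as they must for complementary hypotheses.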
Weakness of Bayes Approach • Difficult to obtain all the a priori conditional and joint probabilities required • The database of priors is hard to modify because of the large number of interactions • Lots of calculations required • Outcomes must be disjoint • Accuracy depends on a complete set of hypotheses
Problems Which Can Make Use of Probabilistic Inference • Information available is of varying certainty or completeness • Need nearly optimal solutions • Need to justify decisions in favor of alternate decisions • General rules of inference are known or can be found for the problem
Fuzzy Set Theory • In ordinary set theory every element “x” from a given universe is either in or out of a set S: x ∈ S or x ∉ S • In fuzzy set theory set membership is not so easily determined
When is a pile of chalk big? • If we have three pieces of chalk in the room is that considered a big pile of chalk? • Some people might say, yes that is a big pile and some would not. • Someplace between those three pieces of chalk and a whole room full of chalk the pile of chalk turns from a small pile into a big pile. • This could be a different spot for different people.
Membership Function
f : X → [0, 1]
Each element x belongs to the fuzzy set S to degree f(x), with f(x) = 1 meaning full membership and f(x) = 0 meaning none.
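A membership function for the "big pile of chalk" set from the earlier slide might look like the sketch below. The thresholds (3 pieces is definitely not big, 100 definitely is, linear ramp in between) are illustrative assumptions, not from the slides.

```python
def big_pile(n):
    """Membership degree of an n-piece pile in the fuzzy set 'big pile'.

    Assumed thresholds: <= 3 pieces is not big (0.0), >= 100 is
    definitely big (1.0), with a linear ramp in between.
    """
    if n <= 3:
        return 0.0
    if n >= 100:
        return 1.0
    return (n - 3) / (100 - 3)

for n in (3, 10, 50, 100):
    print(n, round(big_pile(n), 2))
```

Different people would draw the ramp differently, which is exactly the point of the chalk-pile example: the boundary of the set is a matter of degree, not a crisp cutoff.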
Possibilistic Logic (Dependent Events) vs. Probabilistic Logic (Independent Events)

Expression                    Possibilistic                    Probabilistic
A                             a                                a
B                             b                                b
not A                         1 - a                            1 - a
A and B                       min(a, b)                        a * b
A or B                        max(a, b)                        a + b - a*b
A → B = not A or (A and B)    max(1 - a, b)                    (1 - a) + a*b
A xor B                       max(min(a, 1-b), min(1-a, b))    a + b - 2ab + a²b + ab² - a²b²
Possibilistic Example Assume P(X) = 0.5, P(Y) = 0.1, P(Z) = 0.2 Determine P(X → (Y or Z)) P(Y or Z) = max(P(Y), P(Z)) = max(0.1, 0.2) = 0.2 P(X → (Y or Z)) = max(1 – P(X), P(Y or Z)) = max(1 – 0.5, 0.2) = max(0.5, 0.2) = 0.5
Probabilistic Example Assume P(X) = 0.5, P(Y) = 0.1, P(Z) = 0.2 Determine P(X → (Y or Z)) P(Y or Z) = P(Y) + P(Z) – P(Y) * P(Z) = 0.1 + 0.2 – 0.1 * 0.2 = 0.3 – 0.02 = 0.28 P(X → (Y or Z)) = (1 – P(X)) + P(X) * P(Y or Z) = (1 – 0.5) + 0.5 * 0.28 = 0.5 + 0.14 = 0.64
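The two worked examples can be reproduced by coding the connectives from the table directly. The function names are ours; the formulas follow the table.

```python
# Possibilistic connectives (dependent events)
def poss_or(a, b):
    return max(a, b)

def poss_implies(a, b):
    """A -> B as 'not A or B': max(1 - a, b)."""
    return max(1 - a, b)

# Probabilistic connectives (independent events)
def prob_or(a, b):
    return a + b - a * b

def prob_implies(a, b):
    """A -> B as 'not A or (A and B)': (1 - a) + a*b."""
    return (1 - a) + a * b

x, y, z = 0.5, 0.1, 0.2
poss = poss_implies(x, poss_or(y, z))   # max(0.5, 0.2) = 0.5
prob = prob_implies(x, prob_or(y, z))   # 0.5 + 0.5*0.28 = 0.64
```

The possibilistic answer (0.5) and the probabilistic answer (0.64) differ because the probabilistic connectives assume independence, while the possibilistic ones only take a max or min.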
Bayesian Inference • Symptoms S1: Clanking Sound S2: Low pickup S3: Starting problem S4: Parts are hard to find • Conclusion C1: Repair Estimate > $250
Bayesian Inference • Intermediate Hypotheses H1: Thrown connecting rod H2: Wrist Pin Loose H3: Car Out of Tune • Secondary Hypotheses H4: Replace or Rebuild Engine H5: Tune Engine
Bayesian Inference • These must be known in advance: P(H1), P(H2), P(H3) and P(S | Hi) for i = 1, 2, 3 • Computed using Bayes formula: P(Hi | S) = P(S | Hi) * P(Hi) / P(S), where P(S) = Σk P(Hk) * P(S | Hk), for i = 1, 2, 3
Bayesian Inference • H4: Replace or Rebuild Engine P(H4) = P(H1 or H2) = max(P(H1 | S), P(H2 | S)) • H5: Tune Engine P(H5) = P(not (H1 or H2) and H3) = min(1 – max(P(H1 | S), P(H2 | S)), P(H3)) • C1: Repair Estimate > $250 P(C1) = P(H4 or (H5 and S4)) = max(P(H4 | S), min(P(H5 | S), V)) note: V = 1 if S4 is true and 0 otherwise
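The hypothesis-combination step above can be sketched as one function. The max/min combination rules follow the slides; the posterior values passed in at the bottom are assumed for illustration, not computed from real data.

```python
def diagnose(p_h1, p_h2, p_h3, s4_present):
    """Combine posteriors P(H1|S), P(H2|S), P(H3|S) per the slide's rules.

    Returns (P(H4), P(H5), P(C1)):
      H4 = replace/rebuild engine, H5 = tune engine,
      C1 = repair estimate > $250.
    """
    p_h4 = max(p_h1, p_h2)                 # H4 = H1 or H2
    p_h5 = min(1 - max(p_h1, p_h2), p_h3)  # H5 = not (H1 or H2) and H3
    v = 1.0 if s4_present else 0.0         # S4: parts are hard to find
    p_c1 = max(p_h4, min(p_h5, v))         # C1 = H4 or (H5 and S4)
    return p_h4, p_h5, p_c1

# Example with assumed posteriors: engine trouble unlikely, tune-up likely,
# and parts hard to find (S4 true).
p_h4, p_h5, p_c1 = diagnose(0.1, 0.2, 0.7, True)
```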