
CPSC 7373: Artificial Intelligence Lecture 5: Probabilistic Inference


Presentation Transcript


  1. CPSC 7373: Artificial Intelligence, Lecture 5: Probabilistic Inference. Jiang Bian, Fall 2012, University of Arkansas at Little Rock

  2. Overview and Example • The alarm (A) might go off because of either a Burglary (B) and/or an Earthquake (E), and when the alarm (A) goes off, either John (J) and/or Mary (M) will call to report it. • Possible questions: Given evidence about B or E, what is the probability that J or M will call? • Answer to this type of question: the posterior distribution P(Q1, Q2 … | E1=e1, E2=e2), i.e., the probability distribution of one or more query variables given the values of the evidence variables. [Network diagram: evidence B, E -> hidden A -> query J, M]

  3. Overview and Example • The alarm (A) might go off because of either a Burglary (B) and/or an Earthquake (E), and when the alarm (A) goes off, either John (J) and/or Mary (M) will call to report it. • Possible questions: Out of all the possible values for all the query variables, which combination of values has the highest probability? • Answer to this type of question: argmax_q P(Q1=q1, Q2=q2 … | E1=e1, …), i.e., which assignment of values to the query variables is most likely given the evidence values? [Network diagram: evidence B, E -> hidden A -> query J, M]

  4. Overview and Example • Imagine the situation where Mary has called to report that the alarm is going off, and we want to know whether or not there has been a burglary. For each of the nodes, say whether it is an evidence node, a hidden node, or a query node. [Network diagram: B, E -> A -> J, M]

  5. Overview and Example • Imagine the situation where Mary has called to report that the alarm is going off, and we want to know whether or not there has been a burglary. For each of the nodes, say whether it is an evidence node, a hidden node, or a query node. • Evidence: M. Query: B. Hidden: E, A, J. [Network diagram: B, E -> A -> J, M]

  6. Inference through enumeration • P(+b|+j, +m) = ??? Imagine the situation where both John and Mary have called to report that the alarm is going off, and we want to know the probability of a burglary. • Definition of conditional probability: P(Q|E) = P(Q, E) / P(E). [Network diagram: B, E -> A -> J, M]

  7. Inference through enumeration • P(+b|+j, +m) = ??? = P(+b, +j, +m) / P(+j, +m) • Numerator: P(+b, +j, +m), obtained by summing the joint over the hidden variables E and A. • Definition of conditional probability: P(Q|E) = P(Q, E) / P(E). [Network diagram: B, E -> A -> J, M]

  8. Inference through enumeration • Given +e and +a, what is the corresponding term in the sum? [Network diagram: B, E -> A -> J, M]

  9. Inference through enumeration P(+b, +j, +m)

  10. Inference through enumeration P(+j, +m)

  11. Inference through enumeration • P(+b|+j, +m) = P(+b, +j, +m) / P(+j, +m) = 0.0005922376 / 0.0020841 ≈ 0.284 • Definition of conditional probability: P(Q|E) = P(Q, E) / P(E). [Network diagram: B, E -> A -> J, M]
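
A minimal Python sketch of this inference by enumeration. The CPT values below are the standard ones for this burglary/alarm network (Russell & Norvig); they are assumed here, since the tables themselves are not in the transcript, but they reproduce the 0.284 result above.

    # Inference by enumeration on the burglary/alarm network.
    import itertools

    P_B = {True: 0.001, False: 0.999}                     # P(B)
    P_E = {True: 0.002, False: 0.998}                     # P(E)
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}    # P(+a | B, E)
    P_J = {True: 0.90, False: 0.05}                       # P(+j | A)
    P_M = {True: 0.70, False: 0.01}                       # P(+m | A)

    def joint(b, e, a, j, m):
        """Full joint P(B=b, E=e, A=a, J=j, M=m) as a product of the CPT entries."""
        p = P_B[b] * P_E[e]
        p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
        p *= P_J[a] if j else 1 - P_J[a]
        p *= P_M[a] if m else 1 - P_M[a]
        return p

    # P(+b, +j, +m): sum out the hidden variables E and A.
    p_bjm = sum(joint(True, e, a, True, True)
                for e, a in itertools.product([True, False], repeat=2))
    # P(+j, +m): additionally sum out B.
    p_jm = sum(joint(b, e, a, True, True)
               for b, e, a in itertools.product([True, False], repeat=3))
    print(p_bjm, p_jm, p_bjm / p_jm)   # ~0.000592, ~0.002085, ~0.284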

  12. Enumeration • We assumed binary events/Boolean variables. • Only 5 variables: 2^5 = 32 rows in the joint table. • Practically, what if we have a large network? [Network diagram: B, E -> A -> J, M]

  13. Example: Car diagnosis • Initial evidence: engine won't start. • Testable variables (thin ovals), diagnosis variables (thick ovals). • Hidden variables (shaded) ensure sparse structure and reduce parameters.

  14. Example: Car insurance • Predict claim costs (medical, liability, property) given data on the application form (other unshaded nodes). • If all variables were Boolean: 2^27 rows in the full joint table. NOT Boolean in reality.

  15. Speed Up Enumeration • P(+b, +j, +m) • Pulling out terms: P(+b, +j, +m) = Σ_e Σ_a P(+b) P(e) P(a|+b, e) P(+j|a) P(+m|a) = P(+b) Σ_e P(e) Σ_a P(a|+b, e) P(+j|a) P(+m|a)
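
A sketch of the same computation with constant terms pulled out of the sums, reusing the CPT dictionaries assumed in the enumeration sketch above; P(+b) moves outside both sums and P(e) outside the inner sum.

    # Pulling out terms: same value as p_bjm above, with fewer multiplications.
    p_bjm_factored = P_B[True] * sum(
        P_E[e] * sum(
            (P_A[(True, e)] if a else 1 - P_A[(True, e)]) * P_J[a] * P_M[a]
            for a in [True, False])
        for e in [True, False])
    print(p_bjm_factored)   # ~0.000592, as before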

  16. Speed up enumeration • Maximize independence: the structure of the Bayes network determines how efficiently we can calculate the probability values. • A chain-structured network X1 -> X2 -> … -> Xn needs only O(n) numbers, whereas a densely connected network over X1, …, Xn needs O(2^n).
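
A quick sketch of the parameter counting behind that comparison, assuming Boolean variables, so a node with k parents needs 2^k numbers (num_parameters is a hypothetical helper, not from the slides):

    # Count CPT entries for a Boolean network from each node's number of parents.
    def num_parameters(parent_counts):
        return sum(2 ** k for k in parent_counts)

    n = 10
    chain = [0] + [1] * (n - 1)    # X1 -> X2 -> ... -> Xn
    dense = list(range(n))         # node i has all i earlier nodes as parents
    print(num_parameters(chain))   # 1 + 2*(n-1) = 19, i.e. O(n)
    print(num_parameters(dense))   # 2^n - 1 = 1023, i.e. O(2^n)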

  17. Bayesian networks: definition • A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions • Syntax: • a set of nodes, one per variable • a directed, acyclic graph (link = “directly influences") • a conditional distribution for each node given its parents: P(Xi|Parents(Xi)) • In the simplest case, conditional distribution represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values
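
One minimal way to encode this definition in Python, assumed here for illustration (the CPT numbers are the standard alarm-network values used earlier): each node records its parents and a table giving P(X = true) for every combination of parent values.

    # A Bayes net as a dict: node -> (parents, CPT indexed by parent-value tuples).
    alarm_net = {
        "B": ([],         {(): 0.001}),
        "E": ([],         {(): 0.002}),
        "A": (["B", "E"], {(True, True): 0.95, (True, False): 0.94,
                           (False, True): 0.29, (False, False): 0.001}),
        "J": (["A"],      {(True,): 0.90, (False,): 0.05}),
        "M": (["A"],      {(True,): 0.70, (False,): 0.01}),
    }
    # P(X = false | parents) is just the complement of the stored entry.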

  18. Constructing Bayesian Networks • Dependent or independent? Is P(J|M) = P(J)? • The alarm (A) might go off because of either a Burglary (B) and/or an Earthquake (E), and when the alarm (A) goes off, either John (J) and/or Mary (M) will call to report it. • Suppose we choose the ordering M, J, A, B, E. [Diagrams: the full network B, E -> A -> J, M, and the partially built network with nodes M and J]

  19. [Partially built network: M, J -> A] • Is P(A|J, M) = P(A|J)? Is P(A|J, M) = P(A)?

  20. [Partially built network: M, J -> A -> B] • Is P(B|A, J, M) = P(B|A)? Is P(B|A, J, M) = P(B)?

  21. [Partially built network: M, J -> A -> B, E] • Is P(E|B, A, J, M) = P(E|A)? Is P(E|B, A, J, M) = P(E|A, B)?

  22. • Deciding conditional independence is hard in non-causal directions. • (Causal models and conditional independence seem hardwired for humans!) • Assessing conditional probabilities is hard in non-causal directions. • The resulting network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed. [Network built with the ordering M, J, A, B, E]
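
A quick check of that count, reusing the hypothetical num_parameters helper sketched after slide 16; the causal ordering B, E, A, J, M needs only 10 numbers.

    # Ordering M, J, A, B, E: parents are M: {}, J: {M}, A: {J, M}, B: {A}, E: {A, B}.
    print(num_parameters([0, 1, 2, 1, 2]))   # 1 + 2 + 4 + 2 + 4 = 13
    # Causal ordering B, E, A, J, M: B: {}, E: {}, A: {B, E}, J: {A}, M: {A}.
    print(num_parameters([0, 0, 2, 1, 1]))   # 1 + 1 + 4 + 2 + 2 = 10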

  23. Variable Elimination • Variable elimination: carry out summations right-to-left, storing intermediate results (factors) to avoid re-computation. • P(B|+j, +m) ∝ P(B) Σ_e P(e) Σ_a P(a|B, e) P(+j|a) P(+m|a) (sum out A, then sum out E).

  24. Variable Elimination • Summing out a variable from a product of factors: move any constant factors outside the summation, then add up the submatrices in the pointwise product of the remaining factors. • Still NP-hard in general, but faster than enumeration. • Pointwise product of factors f1 and f2.
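
A rough Python sketch of these two factor operations, assuming a factor is stored as a pair (variables, table) where the table maps a tuple of Boolean values to a probability; this representation is an illustration, not the slides'.

    from itertools import product

    def pointwise_product(f1, f2):
        """Multiply two factors, matching values on their shared variables."""
        (vars1, t1), (vars2, t2) = f1, f2
        out_vars = vars1 + [v for v in vars2 if v not in vars1]
        table = {}
        for vals in product([True, False], repeat=len(out_vars)):
            assign = dict(zip(out_vars, vals))
            v1 = tuple(assign[v] for v in vars1)
            v2 = tuple(assign[v] for v in vars2)
            table[vals] = t1[v1] * t2[v2]
        return (out_vars, table)

    def sum_out(var, factor):
        """Sum a variable out of a factor, leaving a factor over the remaining variables."""
        variables, table = factor
        i = variables.index(var)
        out_vars = variables[:i] + variables[i + 1:]
        out = {}
        for vals, p in table.items():
            key = vals[:i] + vals[i + 1:]
            out[key] = out.get(key, 0.0) + p
        return (out_vars, out)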

  25. Variable Elimination • Network: R -> T -> L, with factors P(R), P(T|R), P(L|T). • 1) Joining factors: P(R, T) = P(R) P(T|R).

  26. Variable Elimination • After the join, the network is (R, T) -> L with factors P(R, T) and P(L|T). • 2) Marginalize on the variable R to get a table over just the variable T: P(R, T) -> P(T).

  27. Variable Elimination • After the join, the network is (R, T) -> L with factors P(R, T) and P(L|T). • 2) Marginalize on the variable R to get a table over just the variable T: P(R, T) -> P(T).

  28. Variable Elimination • The network is now T -> L with factors P(T) and P(L|T). • 3) Joint probability: P(T, L) = P(T) P(L|T).

  29. Variable Elimination • The network is now T -> L with factors P(T) and P(L|T). • 3) Joint probability: P(T, L) = P(T) P(L|T).

  30. Variable Elimination • 4) Marginalize out T from the joint P(T, L) to get P(L).

  31. Variable Elimination • 4) Marginalize out T from the joint P(T, L) to get P(L). • Choice of ordering is important!
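
The whole R -> T -> L elimination, using the factor helpers sketched after slide 24. The CPT numbers here are illustrative assumptions only, since the slides' tables are not in the transcript.

    P_R = (["R"], {(True,): 0.1, (False,): 0.9})
    P_T_given_R = (["T", "R"], {(True, True): 0.8, (False, True): 0.2,
                                (True, False): 0.1, (False, False): 0.9})
    P_L_given_T = (["L", "T"], {(True, True): 0.3, (False, True): 0.7,
                                (True, False): 0.1, (False, False): 0.9})

    f_RT = pointwise_product(P_R, P_T_given_R)   # 1) join: P(R, T)
    f_T  = sum_out("R", f_RT)                    # 2) marginalize R: P(T)
    f_TL = pointwise_product(f_T, P_L_given_T)   # 3) join: P(T, L)
    f_L  = sum_out("T", f_TL)                    # 4) marginalize T: P(L)
    print(f_L)   # P(+l) = 0.134, P(-l) = 0.866 (up to float rounding)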

  32. Approximate Inference: Sampling • Example: estimate the joint probability of heads and tails for a 1-cent coin and a 5-cent coin by flipping them repeatedly and counting the outcomes. • Advantages: computationally easier; works even without knowing the CPTs.

  33. Sampling Example • Network: Cloudy (P(C)) -> Sprinkler (P(S|C)) and Rain (P(R|C)) -> WetGrass (P(W|S, R)). • Sample so far: +c, ¬s, +r. • Sampling is consistent if we want to compute the full joint probability of the network or of individual variables. • What about conditional probabilities, e.g., P(w|¬c)? Rejection sampling: reject the samples that do not match the evidence we are conditioning on.
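
A sketch of prior sampling plus rejection sampling on that network. Only P(+s|+c) = 0.1 and P(+w|+s, +r) = 0.99 appear later in the slides; every other number below is an illustrative assumption.

    import random

    P_C = 0.5
    P_S = {True: 0.1, False: 0.5}                      # P(+s | C)
    P_R = {True: 0.8, False: 0.2}                      # P(+r | C)
    P_W = {(True, True): 0.99, (True, False): 0.90,
           (False, True): 0.90, (False, False): 0.01}  # P(+w | S, R)

    def prior_sample():
        """Sample (C, S, R, W) by following the network topology."""
        c = random.random() < P_C
        s = random.random() < P_S[c]
        r = random.random() < P_R[c]
        w = random.random() < P_W[(s, r)]
        return c, s, r, w

    def rejection_sample(n=100_000):
        """Estimate P(+w | not c) by throwing away the samples where C is true."""
        kept = [w for c, s, r, w in (prior_sample() for _ in range(n)) if not c]
        return sum(kept) / len(kept)

    print(rejection_sample())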

  34. Rejection sampling • Too many rejected samples make it inefficient (e.g., when the evidence is unlikely). • Likelihood weighting: fix the evidence variables and sample only the others; without weighting the samples, the estimate is inconsistent. [Diagram: small example network B -> A]

  35. Likelihood weighting • Network: Cloudy (P(C)) -> Sprinkler (P(S|C)) and Rain (P(R|C)) -> WetGrass (P(W|S, R)). Query: P(R|+s, +w). • Weight the samples: sample +c; fixing the evidence +s contributes weight 0.1 (= P(+s|+c)); sample +r; fixing the evidence +w contributes weight 0.99 (= P(+w|+s, +r)). The sample +c, +s, +r, +w gets weight 0.1 x 0.99. • What about P(C|+s, +r)?
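
A sketch of likelihood weighting for P(+r | +s, +w), reusing the sprinkler-network CPTs assumed in the previous sketch: evidence variables are clamped and each sample carries the likelihood of that evidence as its weight.

    def weighted_sample():
        """Return (r, weight) with S and W clamped to the evidence +s, +w."""
        weight = 1.0
        c = random.random() < P_C
        s = True                   # evidence S = +s
        weight *= P_S[c]           # multiply in P(+s | c)
        r = random.random() < P_R[c]
        w = True                   # evidence W = +w
        weight *= P_W[(s, r)]      # multiply in P(+w | +s, r)
        return r, weight

    def likelihood_weighting(n=100_000):
        num = den = 0.0
        for _ in range(n):
            r, wt = weighted_sample()
            den += wt
            if r:
                num += wt
        return num / den           # estimate of P(+r | +s, +w)

    print(likelihood_weighting())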

  36. Gibbs Sampling • Markov Chain Monte Carlo (MCMC): sample one variable at a time, conditioning on the current values of all the others. • Example chain: (+c, +s, -r, -w) -> (+c, -s, -r, -w) -> (+c, -s, +r, -w), where each step resamples a single variable.
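
A structural sketch of that loop only; sample_var_given_others is a hypothetical helper (a full implementation would condition just on the variable's Markov blanket), so this shows the shape of Gibbs sampling rather than a complete algorithm.

    def gibbs_sampling(state, non_evidence_vars, sample_var_given_others, n_steps=10_000):
        """Repeatedly resample one non-evidence variable at a time, keeping the rest fixed."""
        samples = []
        for _ in range(n_steps):
            for var in non_evidence_vars:
                state[var] = sample_var_given_others(var, state)
            samples.append(dict(state))
        return samples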

  37. Monty Hall Problem • Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 2 [but the door is not opened], and the host, who knows what's behind the doors, opens another door, say No. 1, which has a goat. He then says to you, "Do you want to pick door No. 3?" Is it to your advantage to switch your choice? P(C=3|S=2) = ?? P(C=3|H=1,S=2) = ??

  38. Monty Hall Problem • Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 2 [but the door is not opened], and the host, who knows what's behind the doors, opens another door, say No. 1, which has a goat. He then says to you, "Do you want to pick door No. 3?" Is it to your advantage to switch your choice? P(C=3|S=2) = 1/3 P(C=3|H=1,S=2) = 2/3 Why???

  39. Monty Hall Problem • P(C=3|H=1, S=2) = P(H=1|C=3, S=2) P(C=3|S=2) / Σ_i P(H=1|C=i, S=2) P(C=i|S=2) = 2/3 • P(C=1|S=2) = P(C=2|S=2) = P(C=3|S=2) = 1/3
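
A short check of these numbers by direct enumeration, assuming the standard host behaviour: the host never opens the selected door, never reveals the car, and otherwise chooses uniformly at random.

    def p_host_opens(h, car, selected):
        """P(H = h | C = car, S = selected) under the standard host behaviour."""
        if h == selected or h == car:
            return 0.0
        allowed = [d for d in (1, 2, 3) if d != selected and d != car]
        return 1.0 / len(allowed)

    selected = 2
    prior = {c: 1.0 / 3 for c in (1, 2, 3)}              # P(C = c | S = 2)
    joint = {c: prior[c] * p_host_opens(1, c, selected)  # P(H = 1, C = c | S = 2)
             for c in (1, 2, 3)}
    posterior = {c: joint[c] / sum(joint.values()) for c in (1, 2, 3)}
    print(posterior)   # {1: 0.0, 2: 1/3, 3: 2/3} -> switching to door 3 wins 2/3 of the time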
