
Bayes Nets



  1. Bayes Nets Introduction to Artificial Intelligence CS440/ECE448 Lecture 19 New homework out today!

  2. Last lecture • Independence and conditional independence • Bayes nets This lecture • The semantics of Bayes nets • Inference with Bayes nets Reading • Chapter 14

  3. Marginalization & Conditioning • Marginalization: Given a joint distribution over a set of variables, the distribution over any subset (called a marginal distribution for historical reasons) can be calculated by summing out the other variables: P(X) = Σz P(X, Z=z) • Conditioning: Given a conditional distribution P(X | Z), we can compute the unconditional distribution P(X) using marginalization and the product rule: P(X) = Σz P(X, Z=z) = Σz P(X | Z=z) P(Z=z)
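
Both operations are easy to see in a short Python sketch (the toy joint distribution below is invented for illustration):

```python
# Marginalization and conditioning over a small discrete joint distribution.
# The joint P(X, Z) is stored as a dict keyed by (x, z); values sum to 1.
joint = {
    (True, True): 0.30, (True, False): 0.10,
    (False, True): 0.20, (False, False): 0.40,
}

# Marginalization: P(X=x) = sum over z of P(X=x, Z=z)
def marginal_x(joint, x):
    return sum(p for (xv, _), p in joint.items() if xv == x)

# Conditioning: P(X=x) = sum over z of P(X=x | Z=z) P(Z=z)
def marginal_x_by_conditioning(joint, x):
    total = 0.0
    for z in (True, False):
        p_z = sum(p for (_, zv), p in joint.items() if zv == z)
        total += (joint[(x, z)] / p_z) * p_z   # P(x | z) * P(z)
    return total

assert abs(marginal_x(joint, True) - marginal_x_by_conditioning(joint, True)) < 1e-9
print(marginal_x(joint, True))  # 0.4
```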

  4. Absolute Independence • Two random variables A and B are (absolutely) independent iff P(A, B) = P(A)P(B) • Using the product rule, for independent A and B we can show: P(A, B) = P(A | B)P(B) = P(A)P(B) Therefore P(A | B) = P(A) • If n Boolean variables are independent, the full joint is: P(X1, …, Xn) = Πi P(Xi) The full joint is generally specified by 2^n − 1 numbers, but when the variables are independent only n numbers are needed. • Absolute independence is a very strong requirement, seldom met!
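
As a quick numerical check (values invented), a joint built as a product of marginals passes the P(A, B) = P(A)P(B) test for every value pair:

```python
# Test absolute independence: P(A, B) == P(A) P(B) for all value pairs.
def is_independent(joint, tol=1e-9):
    p_a = {a: sum(p for (av, _), p in joint.items() if av == a) for a in (True, False)}
    p_b = {b: sum(p for (_, bv), p in joint.items() if bv == b) for b in (True, False)}
    return all(abs(joint[(a, b)] - p_a[a] * p_b[b]) <= tol
               for a in (True, False) for b in (True, False))

# Joint built from P(A=true) = 0.6 and P(B=true) = 0.3, so it is independent
# by construction: 4 table entries, but only 2 numbers are really needed.
indep_joint = {(a, b): (0.6 if a else 0.4) * (0.3 if b else 0.7)
               for a in (True, False) for b in (True, False)}
print(is_independent(indep_joint))  # True
```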

  5. Conditional Independence • Some evidence may be irrelevant, allowing simplification, e.g., P(Cavity | Toothache, CubsWin) = P(Cavity | Toothache) • This property is known as conditional independence and can be expressed as: P(X | Y, Z) = P(X | Z) which says that X and Y are independent given Z. • If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache: 1. P(Catch | Toothache, cavity) = P(Catch | cavity) The same independence holds if I don't have a cavity: 2. P(Catch | Toothache, ~cavity) = P(Catch | ~cavity)

  6. Equivalent definitions of conditional independence X and Y are independent given Z when: P(X | Y, Z) = P(X | Z) or P(Y | X, Z) = P(Y | Z) or P(X, Y | Z) = P(X | Z) P(Y | Z)
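
A small sketch that checks the three forms agree numerically, on a toy distribution constructed so that X and Y are conditionally independent given Z (all numbers invented):

```python
# Build P(X, Y, Z) = P(X | Z) P(Y | Z) P(Z), which makes X and Y
# conditionally independent given Z by construction.
p_z = {True: 0.2, False: 0.8}
p_x_given_z = {True: 0.9, False: 0.3}   # P(X=true | Z=z)
p_y_given_z = {True: 0.4, False: 0.6}   # P(Y=true | Z=z)

def pxyz(x, y, z):
    px = p_x_given_z[z] if x else 1 - p_x_given_z[z]
    py = p_y_given_z[z] if y else 1 - p_y_given_z[z]
    return px * py * p_z[z]

for z in (True, False):
    for x in (True, False):
        p_x_z = sum(pxyz(x, y, z) for y in (True, False)) / p_z[z]
        for y in (True, False):
            p_y_z = sum(pxyz(xv, y, z) for xv in (True, False)) / p_z[z]
            p_xy_z = pxyz(x, y, z) / p_z[z]
            assert abs(p_xy_z - p_x_z * p_y_z) < 1e-9   # P(X,Y|Z) = P(X|Z)P(Y|Z)
            assert abs(p_xy_z / p_y_z - p_x_z) < 1e-9   # P(X|Y,Z) = P(X|Z)
print("all three definitions agree")
```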

  7. Example • Topology of network encodes conditional independence assertions: • Weather is independent of the other variables • Toothache and Catch are conditionally independent given Cavity

  8. Example I am at work. Neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by a minor earthquake. Is there a burglar? Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls Network topology reflects "causal" knowledge: [Figure: Burglary and Earthquake → Alarm → JohnCalls, MaryCalls]

  9. Compactness • A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values. • Each row requires one number p for Xi = true (the number for Xi = false is just 1 − p). • If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers. • I.e., it grows linearly with n, vs. O(2^n) for the full joint distribution. • For the burglary net: 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 − 1 = 31).
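
The counting argument is easy to reproduce; the parent lists below mirror the burglary network from the previous slide:

```python
# One number per CPT row: a Boolean node with k parents needs 2**k numbers.
parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}
print(sum(2 ** len(ps) for ps in parents.values()))  # 1 + 1 + 4 + 2 + 2 = 10
print(2 ** len(parents) - 1)                         # full joint: 2^5 - 1 = 31
```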

  10. Semantics • "Global" semantics defines the full joint distribution as the product of the local conditional distributions: e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(¬b) P(¬e) P(a | ¬b ∧ ¬e) P(j | a) P(m | a) • "Local" semantics: each node is conditionally independent of its nondescendants given its parents: P(Xi | X1, …, Xi-1) = P(Xi | Parents(Xi)) Theorem: Local semantics ⇔ global semantics
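
A sketch of the global semantics on the burglary network, assuming the CPT values used in the textbook's version of this example (Russell & Norvig, Ch. 14):

```python
# P(j, m, a, ~b, ~e) = P(~b) P(~e) P(a | ~b, ~e) P(j | a) P(m | a)
P_b, P_e = 0.001, 0.002                              # priors P(b), P(e)
P_a = {(True, True): 0.95, (True, False): 0.94,      # P(a | b, e)
       (False, True): 0.29, (False, False): 0.001}
P_j_given_a = {True: 0.90, False: 0.05}              # P(j | a)
P_m_given_a = {True: 0.70, False: 0.01}              # P(m | a)

p = (1 - P_b) * (1 - P_e) * P_a[(False, False)] \
    * P_j_given_a[True] * P_m_given_a[True]
print(p)  # about 0.000628
```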

  11. Full Joint as fully connected Bayes Net The chain rule is derived by successive application of the product rule: P(X1, …, Xn) = P(X1, …, Xn-1) P(Xn | X1, …, Xn-1) = P(X1, …, Xn-2) P(Xn-1 | X1, …, Xn-2) P(Xn | X1, …, Xn-1) = … = Πi=1..n P(Xi | X1, …, Xi-1) What does this look like as a Bayes net? [Figure: fully connected network over X1, X2, X3, X4, X5]

  12. P(A,B,C) = P(C|A,B) P(B|A) P(A) [Figure: fully connected network over A, B, C, with tables for P(A), P(B|A), and P(C|A,B)] • This is as complicated a network as possible for three random variables. • It is not the only way to represent P(A,B,C) as a Bayes net.

  13. P(A,B,C) = P(A|B,C) P(B|C) P(C) [Figure: fully connected network over A, B, C, with tables for P(C), P(B|C), and P(A|B,C)] • This is just as complicated a network as the previous one. • Suppose B and C are independent of each other, i.e., P(B | C) = P(B). What does the Bayes net look like?

  14. P(A,B,C) = P(A|B,C) P(B|C) P(C) Suppose B and C are independent of each other, i.e., P(B | C) = P(B). What does the Bayes net look like? The link between C and B goes away, and B's table simplifies to P(B). [Figure: network with tables for P(C), P(B), and P(A|B,C)]

  15. P(A,B,C) = P(A|B,C) P(B|C) P(C) • Suppose A is independent of B given C, i.e., P(A | B, C) = P(A | C). What does the Bayes net look like? [Figure: network with tables for P(C), P(B|C), and P(A|B,C)]

  16. P(A,B,C) = P(A|B,C) P(B|C) P(C) • Suppose A is independent of B given C, i.e., P(A | B, C) = P(A | C). The link between B and A disappears, and A's table simplifies to P(A|C). [Figure: network with tables for P(C), P(B|C), and P(A|C)]

  17. Constructing belief networks Choose an ordering of the variables X1, …, Xn. For i = 1 to n: • Add node Xi to the network. • Draw links from the parents in {X1, …, Xi-1} satisfying the conditional independence property, i.e., P(Xi | X1, …, Xi-1) = P(Xi | Parents(Xi)). • Create the conditional probability table for node Xi. Note that there are many legal belief networks for a set of random variables; the specific network depends on the order chosen (see the sketch below).
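
Here is a rough brute-force sketch of this loop on a toy three-variable joint (the distribution, its structure, and the exact-equality test are all invented for illustration; with real data one would use a statistical independence test instead):

```python
from itertools import combinations, product

VALS = (True, False)

# Toy joint over (X0, X1, X2): true structure is X0 -> X1, with X2
# independent of both (all numbers invented).
def p_full(x0, x1, x2):
    p1_true = 0.8 if x0 else 0.2
    return 0.5 * (p1_true if x1 else 1 - p1_true) * (0.3 if x2 else 0.7)

def p_cond(var, val, fixed):
    """P(X[var]=val | fixed), where `fixed` maps variable index -> value."""
    free = [i for i in range(3) if i != var and i not in fixed]
    num = den = 0.0
    for v in VALS:
        for combo in product(VALS, repeat=len(free)):
            assign = dict(fixed)
            assign[var] = v
            assign.update(zip(free, combo))
            p = p_full(assign[0], assign[1], assign[2])
            den += p
            if v == val:
                num += p
    return num / den

order = [0, 1, 2]
for i, x in enumerate(order):
    preds = order[:i]
    chosen = None
    for k in range(len(preds) + 1):          # try smaller parent sets first
        for cand in combinations(preds, k):
            ok = True
            for combo in product(VALS, repeat=len(preds)):
                fixed_all = dict(zip(preds, combo))
                fixed_par = {v: fixed_all[v] for v in cand}
                if abs(p_cond(x, True, fixed_all) - p_cond(x, True, fixed_par)) > 1e-9:
                    ok = False
                    break
            if ok:
                chosen = cand
                break
        if chosen is not None:
            break
    print(f"X{x}: parents {list(chosen)}")   # X0: [], X1: [0], X2: []
```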

  18. An ordering: Fever, Spots, Flu, Measles [Figure: the resulting network over Fever, Spots, Flu, Measles] Not too intuitive…

  19. Another order: Flu, Measles, Fever, Spots [Figure: the resulting network over Flu, Measles, Fever, Spots] It is often better to start by adding causes and then effects.

  20. Example Suppose we choose the ordering M, J, A, B, E: P(J | M) = P(J)? No. P(A | J, M) = P(A | J)? No. P(A | J, M) = P(A)? No. P(B | A, J, M) = P(B | A)? Yes. P(B | A, J, M) = P(B)? No. P(E | B, A, J, M) = P(E | A)? No. P(E | B, A, J, M) = P(E | A, B)? Yes. [Figure: network built in the order MaryCalls, JohnCalls, Alarm, Burglary, Earthquake]

  21. Inference in Bayes nets • Typical query: Compute P(X | E1=e1, …, Em=em) = P(X | E=e) • Denote by Y = (Y1, …, Yk) the remaining (hidden) variables. • P(X | E=e) = P(X, E=e) / P(E=e) = α P(X, E=e), where α = 1 / P(E=e) is a normalization constant. • P(X | E=e) = α Σy P(X, E=e, Y=y) • Then use the CPTs to compute the joint probabilities.
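
A minimal enumeration sketch along these lines; the positional-index encoding and the query helper are invented, and the CPT values again follow the textbook's burglary example:

```python
from itertools import product

def query(joint_fn, n, x_idx, evidence):
    """Return P(X[x_idx] | evidence) by summing out the hidden variables
    and normalizing; `evidence` maps variable index -> observed value."""
    hidden = [i for i in range(n) if i != x_idx and i not in evidence]
    dist = {}
    for xval in (True, False):
        total = 0.0
        for combo in product((True, False), repeat=len(hidden)):
            assign = dict(evidence)
            assign[x_idx] = xval
            assign.update(zip(hidden, combo))
            total += joint_fn(*(assign[i] for i in range(n)))
        dist[xval] = total
    alpha = 1.0 / sum(dist.values())         # alpha = 1 / P(E=e)
    return {v: p * alpha for v, p in dist.items()}

# Full joint of the burglary net as a product of its CPTs.
# Variable order: (B, E, A, J, M).
def burglary_joint(b, e, a, j, m):
    pb = 0.001 if b else 0.999
    pe = 0.002 if e else 0.998
    pa_true = {(True, True): 0.95, (True, False): 0.94,
               (False, True): 0.29, (False, False): 0.001}[(b, e)]
    pa = pa_true if a else 1 - pa_true
    pj = (0.90 if j else 0.10) if a else (0.05 if j else 0.95)
    pm = (0.70 if m else 0.30) if a else (0.01 if m else 0.99)
    return pb * pe * pa * pj * pm

# P(Burglary | JohnCalls=true, MaryCalls=true) -- roughly {True: 0.284, ...}
print(query(burglary_joint, 5, 0, {3: True, 4: True}))
```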
