Bayes Nets

Introduction to Artificial Intelligence, CS440/ECE448, Lecture 19
New homework out today!

Last lecture
• Independence and conditional independence
• Bayes nets

This lecture
• The semantics of Bayes nets
• Inference with Bayes nets

Reading
• Chapter 14

Marginalization & Conditioning
• Marginalization: Given a joint distribution over a set of variables, the distribution over any subset (called a marginal distribution for historical reasons) can be calculated by summing out the other variables: $P(X) = \sum_z P(X, Z=z)$
• Conditioning: Given a conditional distribution $P(X \mid Z)$ and the distribution $P(Z)$, we can compute the unconditional distribution $P(X)$ by using marginalization and the product rule: $P(X) = \sum_z P(X, Z=z) = \sum_z P(X \mid Z=z)\,P(Z=z)$
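A minimal sketch of both operations on a two-variable table. The four joint probabilities are made up for illustration, and the names `joint`, `marginal`, `p_x`, and `p_x_again` are assumptions, not part of the lecture.

```python
# Joint distribution P(X, Z) over two Boolean variables, keyed by (x, z).
# The four numbers are made up for illustration.
joint = {
    (True, True): 0.12, (True, False): 0.08,
    (False, True): 0.08, (False, False): 0.72,
}

def marginal(joint, axis):
    """Sum out every variable except the one at position `axis`."""
    result = {}
    for assignment, p in joint.items():
        value = assignment[axis]
        result[value] = result.get(value, 0.0) + p
    return result

# Marginalization: P(X) = sum_z P(X, Z=z)
p_x = marginal(joint, 0)

# Conditioning: P(X) = sum_z P(X | Z=z) P(Z=z)
p_z = marginal(joint, 1)
p_x_given_z = {(x, z): p / p_z[z] for (x, z), p in joint.items()}
p_x_again = {
    x: sum(p_x_given_z[(x, z)] * p_z[z] for z in (True, False))
    for x in (True, False)
}
```

Both routes should give the same distribution over X, since the second is just the first with each joint entry rewritten by the product rule.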

Absolute Independence
• Two random variables A and B are (absolutely) independent iff $P(A, B) = P(A)\,P(B)$
• Using the product rule for independent A and B, we can show: $P(A, B) = P(A \mid B)\,P(B) = P(A)\,P(B)$. Therefore $P(A \mid B) = P(A)$
• If n Boolean variables are independent, the full joint is: $P(X_1, \ldots, X_n) = \prod_i P(X_i)$. The full joint is generally specified by $2^n - 1$ numbers, but when the variables are independent only n numbers are needed.
• Absolute independence is a very strong requirement, seldom met!
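As a rough illustration of the counting argument, the sketch below builds the full joint of three independent Boolean variables from only n numbers. The probabilities in `p_true` are arbitrary, and all names here are assumptions made for the example.

```python
from itertools import product
from math import prod

# P(X_i = true) for n independent Boolean variables (arbitrary values).
p_true = [0.1, 0.5, 0.7]
n = len(p_true)

def joint_prob(assignment):
    """P(X_1, ..., X_n) = prod_i P(X_i) under absolute independence."""
    return prod(p if value else 1.0 - p for p, value in zip(p_true, assignment))

# All 2^n entries of the full joint are recovered from just n numbers.
full_joint = {a: joint_prob(a) for a in product([True, False], repeat=n)}

numbers_for_full_joint = 2**n - 1   # general case
numbers_if_independent = n          # independent case
```

With n = 3 the gap is small (7 numbers vs. 3), but it doubles with every added variable.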

Conditional Independence
• Some evidence may be irrelevant, allowing simplification, e.g., P(Cavity | Toothache, Cubswin) = P(Cavity | Toothache)
• This property is known as conditional independence and can be expressed as P(X | Y, Z) = P(X | Z), which says that X and Y are independent given Z.
• If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
  1. P(Catch | Toothache, cavity) = P(Catch | cavity)
• The same independence holds if I don't have a cavity:
  2. P(Catch | Toothache, ~cavity) = P(Catch | ~cavity)
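The two equalities can be checked numerically on a small joint. This is a sketch under assumed numbers: the values in `p_cavity`, `p_toothache`, and `p_catch` are invented for the example, and the joint is deliberately built so that Toothache and Catch depend only on Cavity.

```python
from itertools import product
from math import isclose

# Invented numbers: P(cavity), P(toothache | Cavity), P(catch | Cavity).
p_cavity = 0.2
p_toothache = {True: 0.6, False: 0.1}
p_catch = {True: 0.9, False: 0.2}

def bern(p, value):
    """Probability of a Boolean value when P(true) = p."""
    return p if value else 1.0 - p

# Joint over (cavity, toothache, catch).
joint = {
    (c, t, k): bern(p_cavity, c) * bern(p_toothache[c], t) * bern(p_catch[c], k)
    for c, t, k in product([True, False], repeat=3)
}

def prob(event):
    """Probability of the set of worlds where event(c, t, k) holds."""
    return sum(p for world, p in joint.items() if event(*world))

def cond(event, given):
    """P(event | given) computed from the joint."""
    return prob(lambda c, t, k: event(c, t, k) and given(c, t, k)) / prob(given)

# P(catch | toothache, cavity) vs. P(catch | cavity)
with_toothache = cond(lambda c, t, k: k, lambda c, t, k: t and c)
without_toothache = cond(lambda c, t, k: k, lambda c, t, k: c)

# Without conditioning on Cavity, Toothache does carry information about Catch.
catch_given_toothache = cond(lambda c, t, k: k, lambda c, t, k: t)
catch_marginal = prob(lambda c, t, k: k)
```

In this toy joint the first pair of values agrees while the second pair does not, which is the difference between conditional and absolute independence.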

Example
• Topology of the network encodes conditional independence assertions:
  • Weather is independent of the other variables
  • Toothache and Catch are conditionally independent given Cavity

Example
I am at work. Neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by a minor earthquake. Is there a burglar?
Variables: Burglar, Earthquake, Alarm, JohnCalls, MaryCalls
Network topology reflects "causal" knowledge: Burglar → Alarm ← Earthquake, Alarm → JohnCalls, Alarm → MaryCalls

Compactness
• A CPT (conditional probability table) for Boolean $X_i$ with k Boolean parents has $2^k$ rows for the combinations of parent values.
• Each row requires one number p for $X_i = \mathit{true}$ (the number for $X_i = \mathit{false}$ is just $1 - p$).
• If each variable has no more than k parents, the complete network requires $O(n \cdot 2^k)$ numbers.
• I.e., it grows linearly with n, vs. $O(2^n)$ for the full joint distribution.
• For the burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. $2^5 - 1 = 31$).
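The count for the burglary net can be reproduced with a few lines. This sketch assumes all variables are Boolean and encodes the network as a parent list (Burglar and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls). The function names are illustrative.

```python
# Parent lists for the burglary network; every variable is assumed Boolean.
parents = {
    "Burglar": [],
    "Earthquake": [],
    "Alarm": ["Burglar", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

def cpt_numbers(parents):
    """One number per CPT row; a node with k Boolean parents has 2^k rows."""
    return sum(2 ** len(ps) for ps in parents.values())

def full_joint_numbers(n):
    """Independent numbers needed for a full joint over n Boolean variables."""
    return 2**n - 1

network_size = cpt_numbers(parents)             # 1 + 1 + 4 + 2 + 2
joint_size = full_joint_numbers(len(parents))   # 2^5 - 1
```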

Semantics
• “Global” semantics defines the full joint distribution as the product of the local conditional distributions, e.g., $P(j \wedge m \wedge a \wedge b \wedge e) = P(b)\,P(e)\,P(a \mid b, e)\,P(j \mid a)\,P(m \mid a)$
• “Local” semantics: each node is conditionally independent of its nondescendants given its parents: $P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid \mathit{Parents}(X_i))$
• Theorem: Local semantics $\Leftrightarrow$ global semantics
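A sketch of the global semantics for the burglary network: one joint entry is the product of one row from each CPT. The slides give no CPT values, so the numbers below are assumed for illustration only, and `joint_probability` is a hypothetical helper name.

```python
from itertools import product

# Assumed CPT values (not given in the lecture); each is P(variable = true | parents).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (burglar, earthquake)
P_J = {True: 0.90, False: 0.05}                       # keyed by alarm
P_M = {True: 0.70, False: 0.01}                       # keyed by alarm

def bern(p, value):
    """Probability of a Boolean value when P(true) = p."""
    return p if value else 1.0 - p

def joint_probability(b, e, a, j, m):
    """P(b, e, a, j, m) = P(b) P(e) P(a | b, e) P(j | a) P(m | a)."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

# The entry with every variable true.
all_true = joint_probability(True, True, True, True, True)

# The 32 entries defined this way form a proper distribution.
total = sum(joint_probability(*w) for w in product([True, False], repeat=5))
```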

Full Joint as fully connected Bayes Net
The chain rule is derived by successive application of the product rule:
$P(X_1, \ldots, X_n) = P(X_1, \ldots, X_{n-1})\,P(X_n \mid X_1, \ldots, X_{n-1})$
$= P(X_1, \ldots, X_{n-2})\,P(X_{n-1} \mid X_1, \ldots, X_{n-2})\,P(X_n \mid X_1, \ldots, X_{n-1})$
$= \prod_{i=1}^{n} P(X_i \mid X_1, \ldots, X_{i-1})$
What does this look like as a Bayes net? (Figure: nodes $X_1$, $X_2$, $X_3$, $X_4$, $X_5$.)
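Read as a network, the chain rule makes every earlier variable a parent of every later one. The sketch below, which assumes Boolean variables and uses illustrative function names, builds that fully connected structure and counts its CPT entries; the total matches the size of the full joint, so nothing is saved.

```python
def fully_connected_parents(n):
    """Chain-rule network: the parents of X_i are X_1, ..., X_{i-1}."""
    return {f"X{i}": [f"X{j}" for j in range(1, i)] for i in range(1, n + 1)}

def cpt_numbers(parents):
    """One number per CPT row; a node with k Boolean parents has 2^k rows."""
    return sum(2 ** len(ps) for ps in parents.values())

net = fully_connected_parents(5)
numbers = cpt_numbers(net)   # 1 + 2 + 4 + 8 + 16
```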

P(A,B,C) = P(C|A,B) P(B|A) P(A)
• This is as complicated a network as possible for three random variables.
• It is not the only way to represent P(A,B,C) as a Bayes net.
Nodes and tables:
• A: table for P(A)
• B: table for P(B|A)
• C: table for P(C|A,B)

P(A,B,C) = P(A|B,C) P(B|C) P(C)
• This is just as complicated a network as the previous network.
• Suppose B and C are independent of each other, i.e., P(B | C) = P(B). What does the Bayes net look like?
Nodes and tables:
• C: table for P(C)
• B: table for P(B|C)
• A: table for P(A|B,C)

P(A,B,C) = P(A|B,C) P(B|C) P(C)
• Suppose B and C are independent of each other, i.e., P(B | C) = P(B). What does the Bayes net look like?
• The link between C and B goes away and B's table is simplified, so P(A,B,C) = P(A|B,C) P(B) P(C).
Nodes and tables:
• C: table for P(C)
• B: table for P(B)
• A: table for P(A|B,C)

Constructing belief networks
Choose an ordering of variables $X_1, \ldots, X_n$.
For i = 1 to n:
• Add node $X_i$ to the network.
• Choose parents for $X_i$ from $\{X_1, \ldots, X_{i-1}\}$ that satisfy the conditional independence property $P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid \mathit{Parents}(X_i))$, and draw a link from each parent to $X_i$.
• Create the conditional probability table for node $X_i$.
Note that there are many legal belief networks for a set of random variables, and the specific network depends upon the order chosen.
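The construction loop can be sketched by brute force when the full joint is available. This is only an illustration under strong assumptions: variables are Boolean and identified by their position in the assignment tuple, the smallest parent set that satisfies the independence property is chosen, and the toy joint over (Cavity, Toothache, Catch) uses invented numbers. The names `marginal`, `screens_off`, and `build_network` are hypothetical.

```python
from itertools import combinations, product
from math import isclose

def marginal(joint, idxs):
    """Sum the joint down to the variables at positions idxs."""
    out = {}
    for world, p in joint.items():
        key = tuple(world[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out

def screens_off(joint, i, preds, subset):
    """True if P(X_i | preds) equals P(X_i | subset) for every assignment."""
    full, full_given = marginal(joint, preds + [i]), marginal(joint, preds)
    sub, sub_given = marginal(joint, subset + [i]), marginal(joint, subset)
    for key, p in full.items():
        given = full_given[key[:-1]]
        if given == 0:
            continue  # conditioning event has zero probability
        sub_key = tuple(key[preds.index(s)] for s in subset)
        lhs = p / given
        rhs = sub[sub_key + (key[-1],)] / sub_given[sub_key]
        if not isclose(lhs, rhs, abs_tol=1e-9):
            return False
    return True

def build_network(joint, order):
    """For each variable in order, pick a smallest sufficient parent set."""
    parents = {}
    for pos, i in enumerate(order):
        preds = order[:pos]
        for size in range(len(preds) + 1):
            found = next((list(s) for s in combinations(preds, size)
                          if screens_off(joint, i, preds, list(s))), None)
            if found is not None:
                parents[i] = found
                break
    return parents

def bern(p, value):
    return p if value else 1.0 - p

# Toy joint over positions 0 = Cavity, 1 = Toothache, 2 = Catch (invented numbers).
joint = {
    (c, t, k): bern(0.2, c) * bern(0.6 if c else 0.1, t) * bern(0.9 if c else 0.2, k)
    for c, t, k in product([True, False], repeat=3)
}

causes_first = build_network(joint, [0, 1, 2])   # Cavity, Toothache, Catch
effects_first = build_network(joint, [1, 2, 0])  # Toothache, Catch, Cavity
```

On this toy joint, the cause-first ordering should yield two links and the effect-first ordering three, which illustrates how the chosen order shapes the network.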

An ordering: Fever, Spots, Flu, Measles
(Figure: network over the nodes Fever, Spots, Flu, Measles.)
Not too intuitive.

Another order: Flu, Measles, Fever, Spots
(Figure: network over the nodes Flu, Measles, Spots, Fever.)
It is often better to start by adding causes and then effects.

Inference in Bayes nets
• Typical query: Compute $P(X \mid E_1=e_1, \ldots, E_m=e_m) = P(X \mid E=e)$
• Denote by $Y = (Y_1, \ldots, Y_k)$ the remaining (hidden) variables.
• $P(X \mid E=e) = P(X, E=e) / P(E=e) = \alpha\,P(X, E=e)$, where $\alpha = 1 / P(E=e)$ is a normalizing constant
• $P(X \mid E=e) = \alpha \sum_y P(X, E=e, Y=y)$
• Then use the CPTs to compute the joint probabilities.
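A minimal sketch of this query procedure (inference by enumeration) on the burglary network: sum the joint over the hidden variables for each value of the query variable, then normalize. The CPT values are assumed for illustration, since the slides do not list them, and `query` is a hypothetical helper. Enumerating all worlds is exponential in the number of variables, so this is suitable only for tiny networks.

```python
from itertools import product

# Assumed CPT values (not given in the lecture); each is P(variable = true | parents).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (burglar, earthquake)
P_J = {True: 0.90, False: 0.05}                       # keyed by alarm
P_M = {True: 0.70, False: 0.01}                       # keyed by alarm
VARS = ["Burglar", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]

def bern(p, value):
    """Probability of a Boolean value when P(true) = p."""
    return p if value else 1.0 - p

def joint(b, e, a, j, m):
    """Joint entry as the product of the CPT rows."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

def query(var, evidence):
    """P(var | evidence): sum the joint over hidden variables, then normalize."""
    unnormalized = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if all(world[name] == val for name, val in evidence.items()):
            unnormalized[world[var]] += joint(*values)
    alpha = 1.0 / sum(unnormalized.values())   # alpha = 1 / P(E=e)
    return {x: alpha * p for x, p in unnormalized.items()}

# The scenario from the example: John calls, Mary does not.
posterior = query("Burglar", {"JohnCalls": True, "MaryCalls": False})
```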
