Tutorial on Bayesian Networks • Jack Breese, Microsoft Research, breese@microsoft.com • Daphne Koller, Stanford University, koller@cs.stanford.edu • First given as an AAAI’97 tutorial.
Probabilities • Probability distribution $P(X \mid \xi)$ • X is a random variable • Discrete • Continuous • $\xi$ is the background state of information
Discrete Random Variables • Finite set of possible outcomes • X binary: $P(x) + P(\bar{x}) = 1$
Continuous Random Variable • Probability distribution (density function) over continuous values
Bayesian networks • Basics • Structured representation • Conditional independence • Naïve Bayes model • Independence facts
Bayesian Networks • Network: Smoking → Cancer
P(S=no)      0.80
P(S=light)   0.15
P(S=heavy)   0.05
Smoking=       no     light  heavy
P(C=none)      0.96   0.88   0.60
P(C=benign)    0.03   0.08   0.25
P(C=malig)     0.01   0.04   0.15
Product Rule • P(C,S) = P(C|S) P(S)
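Marginalization • Summing the joint P(C,S) over Cancer gives P(Smoke): $P(S) = \sum_C P(C,S)$ • Summing over Smoking gives P(Cancer): $P(C) = \sum_S P(C,S)$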
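As a minimal sketch of the product rule and of marginalization, the joint P(C,S) can be tabulated from the prior and conditional table printed on the Bayesian Networks slide, and then summed over Smoking to get P(Cancer). The dictionary names are illustrative and not part of the tutorial.

```python
# Prior P(S) and conditional table P(C | S), copied from the Smoking -> Cancer slide.
P_S = {"no": 0.80, "light": 0.15, "heavy": 0.05}
P_C_given_S = {
    "no":    {"none": 0.96, "benign": 0.03, "malig": 0.01},
    "light": {"none": 0.88, "benign": 0.08, "malig": 0.04},
    "heavy": {"none": 0.60, "benign": 0.25, "malig": 0.15},
}

# Product rule: P(C=c, S=s) = P(C=c | S=s) * P(S=s)
joint = {(c, s): P_C_given_S[s][c] * P_S[s]
         for s in P_S for c in P_C_given_S[s]}

# Marginalization: sum the joint over Smoking to obtain P(Cancer).
P_C = {}
for (c, s), p in joint.items():
    P_C[c] = P_C.get(c, 0.0) + p

for c, p in P_C.items():
    print(f"P(C={c}) = {p:.4f}")
```

Summing the same joint over Cancer instead recovers the prior P(S), which is a quick sanity check on the table.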
Bayes Rule Revisited
Cancer=        none    benign  malignant
P(S=no)        0.821   0.522   0.421
P(S=light)     0.141   0.261   0.316
P(S=heavy)     0.037   0.217   0.263
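A short sketch of how such a posterior table can be produced with Bayes rule, P(S | C) = P(C | S) P(S) / P(C), using the prior and conditional table as printed on the Bayesian Networks slide. The function name is illustrative. Computed from those printed values, the results are close to, but not identical with, the table above (for example P(S=no | C=none) ≈ 0.826 rather than 0.821), so the table appears to have been generated from a slightly different heavy-smoker column.

```python
# Prior P(S) and conditional table P(C | S), as printed on the Smoking -> Cancer slide.
P_S = {"no": 0.80, "light": 0.15, "heavy": 0.05}
P_C_given_S = {
    "no":    {"none": 0.96, "benign": 0.03, "malig": 0.01},
    "light": {"none": 0.88, "benign": 0.08, "malig": 0.04},
    "heavy": {"none": 0.60, "benign": 0.25, "malig": 0.15},
}

def posterior_smoking(c):
    """Return P(S | C=c) via Bayes rule."""
    unnormalized = {s: P_C_given_S[s][c] * P_S[s] for s in P_S}  # P(c | s) P(s)
    p_c = sum(unnormalized.values())                             # P(c)
    return {s: v / p_c for s, v in unnormalized.items()}

for c in ("none", "benign", "malig"):
    post = posterior_smoking(c)
    print(c, {s: round(p, 3) for s, p in post.items()})
```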
A Bayesian Network • Nodes: Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor
Independence • Age and Gender are independent. • $P(A,G) = P(G) P(A)$ • $P(A \mid G) = P(A)$, written $A \perp G$ • $P(G \mid A) = P(G)$, written $G \perp A$ • $P(A,G) = P(G \mid A) P(A) = P(G) P(A)$ • $P(A,G) = P(A \mid G) P(G) = P(A) P(G)$
Conditional Independence • Cancer is independent of Age and Gender given Smoking. • $P(C \mid A,G,S) = P(C \mid S)$, written $C \perp A,G \mid S$ • Figure: Age, Gender, Smoking, Cancer
More Conditional Independence: Naïve Bayes • Serum Calcium and Lung Tumor are dependent. • Serum Calcium is independent of Lung Tumor, given Cancer: $P(L \mid SC, C) = P(L \mid C)$ • Figure: Cancer, Serum Calcium, Lung Tumor
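Naïve Bayes in general • Hypothesis H with evidence variables $E_1, E_2, E_3, \dots, E_n$ • 2n + 1 parameters (all variables binary): $P(h)$, and $P(e_i \mid h)$, $P(e_i \mid \bar{h})$ for $i = 1, \dots, n$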
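To make the parameter count concrete, here is a minimal sketch of a naïve Bayes posterior for a binary hypothesis with binary evidence variables. All numbers are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def naive_bayes_posterior(p_h, likelihoods, evidence):
    """P(h | e_1..e_n) for binary H and conditionally independent binary E_i.

    likelihoods[i] = (P(e_i | h), P(e_i | not h)); evidence[i] is True or False.
    """
    score_h, score_not_h = p_h, 1.0 - p_h
    for (p_e_h, p_e_not_h), e in zip(likelihoods, evidence):
        score_h *= p_e_h if e else 1.0 - p_e_h
        score_not_h *= p_e_not_h if e else 1.0 - p_e_not_h
    # Normalize over the two values of H.
    return score_h / (score_h + score_not_h)

# Hypothetical parameters: 1 prior + 2 per evidence variable = 2n + 1 numbers.
p_h = 0.1
likelihoods = [(0.9, 0.2), (0.7, 0.1), (0.6, 0.5)]
print(naive_bayes_posterior(p_h, likelihoods, [True, True, False]))
```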
More Conditional Independence: Explaining Away • Exposure to Toxics and Smoking are independent: $E \perp S$ • Exposure to Toxics is dependent on Smoking, given Cancer • $P(E=\text{heavy} \mid C=\text{malignant}) > P(E=\text{heavy} \mid C=\text{malignant}, S=\text{heavy})$ • Figure: Exposure to Toxics, Smoking, Cancer
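Put it all together • Nodes: Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor • $P(A,G,E,S,C,L,SC) = P(A) P(G) P(E \mid A) P(S \mid A,G) P(C \mid E,S) P(SC \mid C) P(L \mid C)$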
General Product (Chain) Rule for Bayesian Networks • $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa_i)$, where $Pa_i = \text{parents}(X_i)$
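The chain rule can be sketched as a loop over nodes that multiplies each node's conditional probability given its parents. The data layout below (a dict mapping each node to its parents and table) is an assumption made for illustration; the numbers are the Smoking and Cancer tables printed earlier in the deck.

```python
# Each node maps to (parents, cpt); cpt maps a tuple of parent values to a
# distribution over the node's own values.
network = {
    "S": ((), {(): {"no": 0.80, "light": 0.15, "heavy": 0.05}}),
    "C": (("S",), {
        ("no",):    {"none": 0.96, "benign": 0.03, "malig": 0.01},
        ("light",): {"none": 0.88, "benign": 0.08, "malig": 0.04},
        ("heavy",): {"none": 0.60, "benign": 0.25, "malig": 0.15},
    }),
}

def joint_probability(assignment, network):
    """P(x_1..x_n) as the product over nodes of P(x_i | parents(x_i))."""
    prob = 1.0
    for node, (parents, cpt) in network.items():
        parent_values = tuple(assignment[p] for p in parents)
        prob *= cpt[parent_values][assignment[node]]
    return prob

# P(S=heavy, C=malig) = P(S=heavy) * P(C=malig | S=heavy)
print(joint_probability({"S": "heavy", "C": "malig"}, network))
```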
Conditional Independence • A variable (node) is conditionally independent of its non-descendants given its parents. • Cancer is independent of Age and Gender given Exposure to Toxics and Smoking. • Non-descendants: Age, Gender • Parents: Exposure to Toxics, Smoking • Descendants: Serum Calcium, Lung Tumor
Another non-descendant • Cancer is independent of Diet given Exposure to Toxics and Smoking. • Figure: Age, Gender, Exposure to Toxics, Smoking, Diet, Cancer, Serum Calcium, Lung Tumor
Independence and Graph Separation • Given a set of observations, is one set of variables dependent on another set? • Observing effects can induce dependencies. • d-separation (Pearl 1988) allows us to check conditional independence graphically.
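One common way to test d-separation mechanically is the moralized ancestral graph check: keep only the ancestors of the variables involved, connect parents that share a child, drop edge directions, delete the observed nodes, and test whether the two sets are still connected. The sketch below assumes the cancer network's edges as implied by the factors used elsewhere in the deck; the function and variable names are illustrative.

```python
from itertools import combinations

# Assumed structure of the cancer network (node -> list of parents).
PARENTS = {
    "Age": [], "Gender": [],
    "Exposure": ["Age"],
    "Smoking": ["Age", "Gender"],
    "Cancer": ["Exposure", "Smoking"],
    "SerumCalcium": ["Cancer"],
    "LungTumor": ["Cancer"],
}

def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen

def d_separated(xs, ys, zs, parents=PARENTS):
    """True if every path between xs and ys is blocked given zs."""
    keep = ancestors(set(xs) | set(ys) | set(zs), parents)
    adj = {n: set() for n in keep}
    for child in keep:
        ps = parents[child]
        for p in ps:                      # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):  # "marry" parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    blocked = set(zs)
    frontier = [x for x in xs if x not in blocked]
    reached = set(frontier)
    while frontier:                       # graph search that avoids observed nodes
        n = frontier.pop()
        for m in adj[n]:
            if m not in blocked and m not in reached:
                reached.add(m)
                frontier.append(m)
    return not (reached & set(ys))

print(d_separated({"Cancer"}, {"Age", "Gender"}, {"Exposure", "Smoking"}))
print(d_separated({"Exposure"}, {"Smoking"}, {"Age", "Cancer"}))
```

The second query illustrates the point that observing an effect (Cancer) can induce a dependency between its causes.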
CPCS Network
Structuring • Network structure corresponding to “causality” is usually good. • Extending the conversation. • Figure: Age, Gender, Exposure to Toxics, Smoking, Genetic Damage, Cancer, Lung Tumor
Local Structure • Causal independence: from $2^n$ to $n+1$ parameters • Asymmetric assessment: similar savings in practice. • Typical savings (#params): • 145 to 55 for a small hardware network; • 133,931,430 to 8254 for CPCS!!
Course Contents • Concepts in Probability • Bayesian Networks • Inference • Decision making • Learning networks from data • Reasoning over time • Applications
Inference • Patterns of reasoning • Basic inference • Exact inference • Exploiting structure • Approximate inference
Predictive Inference • How likely are elderly males to get malignant cancer? • $P(C=\text{malignant} \mid \text{Age}>60, \text{Gender}=\text{male})$
Combined • How likely is an elderly male patient with high Serum Calcium to have malignant cancer? • $P(C=\text{malignant} \mid \text{Age}>60, \text{Gender}=\text{male}, \text{Serum Calcium}=\text{high})$
Explaining away • If we see a lung tumor, the probability of heavy smoking and of exposure to toxics both go up. • If we then observe heavy smoking, the probability of exposure to toxics goes back down.
Inference in Belief Networks • Find $P(Q=q \mid E=e)$ • Q the query variable • E set of evidence variables • $P(q \mid e) = \frac{P(q, e)}{P(e)}$ • $X_1, \dots, X_n$ are network variables except Q, E • $P(q, e) = \sum_{x_1, \dots, x_n} P(q, e, x_1, \dots, x_n)$
Basic Inference • Network: S → C • $P(c) = ?$ • P(C,S) = P(C|S) P(S)
Basic Inference • Network: A → B → C • $P(b) = \sum_a P(a, b) = \sum_a P(b \mid a) P(a)$ • $P(c) = \sum_b P(c \mid b) P(b)$ • $P(c) = \sum_{b,a} P(a, b, c) = \sum_{b,a} P(c \mid b) P(b \mid a) P(a) = \sum_b P(c \mid b) \sum_a P(b \mid a) P(a)$, where the inner sum is $P(b)$
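A small numeric check of the identity on this slide: summing the full joint over a and b gives the same P(c) as first computing P(b) and then summing over b. The conditional tables below are hypothetical values for a binary chain A → B → C, not numbers from the tutorial.

```python
from itertools import product

# Hypothetical tables for a binary chain A -> B -> C.
P_A = {True: 0.3, False: 0.7}
P_B_given_A = {True: {True: 0.9, False: 0.1},     # indexed [a][b]
               False: {True: 0.2, False: 0.8}}
P_C_given_B = {True: {True: 0.6, False: 0.4},     # indexed [b][c]
               False: {True: 0.05, False: 0.95}}

def p_c_brute_force(c):
    """Sum the full joint P(a, b, c) over all values of a and b."""
    return sum(P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]
               for a, b in product((True, False), repeat=2))

def p_c_pushed_in(c):
    """Push the sum over a inside: first P(b), then sum over b."""
    p_b = {b: sum(P_B_given_A[a][b] * P_A[a] for a in P_A)
           for b in (True, False)}
    return sum(P_C_given_B[b][c] * p_b[b] for b in p_b)

print(p_c_brute_force(True), p_c_pushed_in(True))
```

The second form never builds the three-variable joint, which is the saving that the later variable elimination slides exploit.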
Inference in trees • Network: Y1 → X ← Y2 • $P(x) = \sum_{y_1, y_2} P(x \mid y_1, y_2) P(y_1, y_2)$ • Because of independence of $Y_1$, $Y_2$: $P(x) = \sum_{y_1, y_2} P(x \mid y_1, y_2) P(y_1) P(y_2)$
Polytrees • A network is singly connected (a polytree) if it contains no undirected loops. • Theorem: Inference in a singly connected network can be done in linear time*. • Main idea: in variable elimination, need only maintain distributions over single nodes. • * in network size including table sizes.
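The problem with loops • Network: Cloudy → Rain, Cloudy → Sprinkler, Rain → Grass-wet, Sprinkler → Grass-wet • P(c) = 0.5 • P(s) over the two Cloudy states: 0.01, 0.99 • P(r) over the two Cloudy states: 0.01, 0.99 • Grass-wet is a deterministic OR: the grass is dry only if no rain and no sprinklers. • $P(\bar{g}) = P(\bar{r}, \bar{s}) \approx 0$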
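The problem with loops contd. • $P(\bar{g}) = P(\bar{g} \mid r, s) P(r, s) + P(\bar{g} \mid r, \bar{s}) P(r, \bar{s}) + P(\bar{g} \mid \bar{r}, s) P(\bar{r}, s) + P(\bar{g} \mid \bar{r}, \bar{s}) P(\bar{r}, \bar{s})$, where the first three conditionals are 0 and the last is 1 • So $P(\bar{g}) = P(\bar{r}, \bar{s}) \approx 0$ • Problem: treating Rain and Sprinkler as independent gives $P(\bar{r}, \bar{s}) = P(\bar{r}) P(\bar{s}) \approx 0.5 \cdot 0.5 = 0.25$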
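The gap between the two answers can be checked numerically. The sketch below assumes that rain is likely and the sprinkler unlikely when it is cloudy, and the reverse when it is not; the column order of the slide's tables did not survive, and this is the reading consistent with the slide's conclusion. Variable names are illustrative.

```python
# Assumed reading of the slide's tables (column order not recoverable).
P_CLOUDY = 0.5
P_RAIN = {True: 0.99, False: 0.01}        # P(r | cloudy), P(r | not cloudy)
P_SPRINKLER = {True: 0.01, False: 0.99}   # P(s | cloudy), P(s | not cloudy)

def p_cloudy(c):
    return P_CLOUDY if c else 1.0 - P_CLOUDY

# Exact: the grass is dry only if there is no rain and no sprinkler, so sum
# over Cloudy while keeping Rain and Sprinkler coupled through it.
exact_dry = sum(p_cloudy(c) * (1 - P_RAIN[c]) * (1 - P_SPRINKLER[c])
                for c in (True, False))

# Polytree-style shortcut: treat Rain and Sprinkler as independent.
p_no_rain = sum(p_cloudy(c) * (1 - P_RAIN[c]) for c in (True, False))
p_no_sprinkler = sum(p_cloudy(c) * (1 - P_SPRINKLER[c]) for c in (True, False))
naive_dry = p_no_rain * p_no_sprinkler

print(f"exact P(dry) = {exact_dry:.4f}, independent approximation = {naive_dry:.4f}")
```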
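Variable elimination • Network: A → B → C • $P(c) = \sum_b P(c \mid b) \sum_a P(b \mid a) P(a)$, where the inner sum is $P(b)$ • $P(A) \times P(B \mid A) = P(B, A)$; sum out A to get $P(B)$ • $P(B) \times P(C \mid B) = P(C, B)$; sum out B to get $P(C)$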
Inference as variable elimination • A factor over X is a function from val(X) to numbers in [0,1]: • A CPT is a factor • A joint distribution is also a factor • BN inference: • factors are multiplied to give new ones • variables in factors summed out • A variable can be summed out as soon as all factors mentioning it have been multiplied.
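The two factor operations named above, multiplication and summing out, can be sketched with a small table-based class. This is an illustrative implementation for boolean variables only; the class name, its methods, and the chain's numbers are assumptions rather than anything specified in the tutorial.

```python
from itertools import product

class Factor:
    """A table over boolean variables: maps value tuples to numbers."""

    def __init__(self, variables, table):
        self.variables = tuple(variables)
        self.table = dict(table)

    def multiply(self, other):
        """Pointwise product over the union of both variable sets."""
        variables = self.variables + tuple(
            v for v in other.variables if v not in self.variables)
        table = {}
        for values in product((True, False), repeat=len(variables)):
            row = dict(zip(variables, values))
            a = self.table[tuple(row[v] for v in self.variables)]
            b = other.table[tuple(row[v] for v in other.variables)]
            table[values] = a * b
        return Factor(variables, table)

    def sum_out(self, var):
        """Marginalize var away by adding up the rows that agree elsewhere."""
        keep = tuple(v for v in self.variables if v != var)
        table = {}
        for values, p in self.table.items():
            row = dict(zip(self.variables, values))
            key = tuple(row[v] for v in keep)
            table[key] = table.get(key, 0.0) + p
        return Factor(keep, table)

# Hypothetical tables for the chain A -> B -> C.
f_a = Factor(["A"], {(True,): 0.3, (False,): 0.7})
f_b_a = Factor(["B", "A"], {(True, True): 0.9, (False, True): 0.1,
                            (True, False): 0.2, (False, False): 0.8})
f_c_b = Factor(["C", "B"], {(True, True): 0.6, (False, True): 0.4,
                            (True, False): 0.05, (False, False): 0.95})

p_b = f_a.multiply(f_b_a).sum_out("A")   # A is summed out once all its factors are multiplied
p_c = p_b.multiply(f_c_b).sum_out("B")
print(p_c.variables, p_c.table)
```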
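Variable Elimination with loops • Network: Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor • $P(A) \times P(G) \times P(S \mid A,G) = P(A,G,S)$; sum out G to get $P(A,S)$ • $P(A,S) \times P(E \mid A) = P(A,E,S)$; sum out A to get $P(E,S)$ • $P(E,S) \times P(C \mid E,S) = P(E,S,C)$; sum out E, S to get $P(C)$ • $P(C) \times P(L \mid C) = P(C,L)$; sum out C to get $P(L)$ • Complexity is exponential in the size of the factors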
Join trees* • A join tree is a partially precompiled factorization • Clusters: {A, G, S}, {A, E, S}, {E, S, C}, {C, S-C}, {C, L} • P(A) × P(G) × P(S | A,G) × P(A,S) • * aka junction trees, Lauritzen-Spiegelhalter, Hugin alg., …
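Computational complexity • Theorem: Inference in a multi-connected Bayesian network is NP-hard. • Boolean 3CNF formula f over U, V, W, Y: two three-literal clauses, the first over u, v, w and the second over u, w, y • U, V, W, Y have prior probability 1/2; each clause is an OR node and the clauses feed an AND node • Probability(f is true) = $\frac{1}{2^n} \cdot$ # satisfying assignments of f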
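Stochastic simulation • Network: Burglary, Earthquake, Alarm, Call, Newscast • P(b) = 0.03, P(e) = 0.001 • P(a) over the four (B, E) combinations: 0.98, 0.7, 0.4, 0.01 • P(n) over the two Earthquake states: 0.3, 0.001 • P(c) over the two Alarm states: 0.05, 0.8 • Evidence: Call = c • Samples are drawn over (B, E, A, C, N), e.g. with probabilities 0.03, 0.001, 0.4, 0.8, 0.3 along the way • $P(b \mid c) \approx \frac{\#\text{ of live samples with } B=b}{\text{total } \#\text{ of live samples}}$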
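The estimator on this slide can be sketched as rejection sampling: draw complete samples in topological order, discard those that contradict the evidence Call = c, and count how many of the surviving ("live") samples have B = b. The assignment of table entries to parent states is an assumption (the column headers were lost): 0.98, 0.7, 0.4, 0.01 are read as P(a) for (b, e), (b, not e), (not b, e), (not b, not e), with P(c | a) = 0.8, P(c | not a) = 0.05, P(n | e) = 0.3 and P(n | not e) = 0.001.

```python
import random

# Assumed reading of the alarm table: keys are (burglary, earthquake).
P_ALARM = {(True, True): 0.98, (True, False): 0.7,
           (False, True): 0.4, (False, False): 0.01}

def sample_network(rng):
    """Draw one complete sample (b, e, a, c, n) in topological order."""
    b = rng.random() < 0.03
    e = rng.random() < 0.001
    a = rng.random() < P_ALARM[(b, e)]
    c = rng.random() < (0.8 if a else 0.05)
    n = rng.random() < (0.3 if e else 0.001)
    return b, e, a, c, n

def estimate_b_given_c(num_samples=100_000, seed=0):
    """Estimate P(b | c) as (# live samples with B=b) / (# live samples)."""
    rng = random.Random(seed)
    live = live_with_b = 0
    for _ in range(num_samples):
        b, e, a, c, n = sample_network(rng)
        if c:                     # a sample is live only if it matches Call = c
            live += 1
            live_with_b += b
    return live_with_b / live

print(estimate_b_given_c())
```

Most samples are thrown away because the evidence is unlikely a priori, which is the inefficiency that likelihood weighting addresses.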
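Likelihood weighting • Network: Burglary, Earthquake, Alarm, Call, Newscast • Evidence: Call = c • P(c) over the two Alarm states: 0.05, 0.8 • P($\bar{c}$) over the two Alarm states: 0.95, 0.2 • Samples are drawn over (B, E, A, C, N) with the evidence fixed, and each sample carries a weight (0.8 or 0.05 in the examples shown) • $P(b \mid c) = \frac{\text{weight of samples with } B=b}{\text{total weight of samples}}$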
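A minimal sketch of likelihood weighting on the same network: the evidence variable Call is never sampled; instead every sample is weighted by the probability of the evidence given its sampled parents. The mapping of table entries to parent states is an assumption, as the column headers were lost: P(c | a) = 0.8 and P(c | not a) = 0.05, with the alarm entries read as (b, e), (b, not e), (not b, e), (not b, not e).

```python
import random

# Assumed reading of the alarm table: keys are (burglary, earthquake).
P_ALARM = {(True, True): 0.98, (True, False): 0.7,
           (False, True): 0.4, (False, False): 0.01}

def weighted_sample(rng):
    """Sample the non-evidence variables; weight by P(Call = c | alarm)."""
    b = rng.random() < 0.03
    e = rng.random() < 0.001
    a = rng.random() < P_ALARM[(b, e)]
    n = rng.random() < (0.3 if e else 0.001)   # sampled for completeness only
    weight = 0.8 if a else 0.05                # evidence is weighted, not sampled
    return b, weight

def estimate_b_given_c(num_samples=100_000, seed=0):
    """Estimate P(b | c) as (weight of samples with B=b) / (total weight)."""
    rng = random.Random(seed)
    total = with_b = 0.0
    for _ in range(num_samples):
        b, w = weighted_sample(rng)
        total += w
        if b:
            with_b += w
    return with_b / total

print(estimate_b_given_c())
```

Unlike rejection sampling, no sample is discarded, so every draw contributes to the estimate.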
MCMC with Gibbs Sampling • Fix the values of observed variables • Set the values of all non-observed variables randomly • Perform a random walk through the space of complete variable assignments. On each move: • Pick a variable X • Calculate Pr(X=true | all other variables) • Set X to true with that probability • Repeat many times. The frequency with which any variable X is true is its posterior probability. • Converges to the true posterior when frequencies stop changing significantly • stable distribution, mixing
Markov Blanket Sampling • How to calculate Pr(X=true | all other variables)? • Recall: a variable is independent of all others given its Markov Blanket • parents • children • other parents of children • So the problem becomes calculating Pr(X=true | MB(X)) • We solve this sub-problem exactly • Fortunately, it is easy to solve
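The Gibbs procedure with Markov blanket sampling can be sketched as follows. Pr(X=true | MB(X)) is proportional to P(x | parents) times the product of P(child | its parents) over X's children, so only local tables are touched. The example reuses the burglary network from the stochastic simulation slides, with the table-to-parent-state assignment taken as an assumption; the data layout and function names are illustrative.

```python
import random

# node -> (parents, table of P(node=True | parent values)); assumed reading of the slides.
NET = {
    "B": ((), {(): 0.03}),
    "E": ((), {(): 0.001}),
    "A": (("B", "E"), {(True, True): 0.98, (True, False): 0.7,
                       (False, True): 0.4, (False, False): 0.01}),
    "N": (("E",), {(True,): 0.3, (False,): 0.001}),
    "C": (("A",), {(True,): 0.8, (False,): 0.05}),
}
CHILDREN = {v: [c for c, (ps, _) in NET.items() if v in ps] for v in NET}

def local_prob(var, value, state):
    """P(var=value | parents of var, as set in state)."""
    parents, table = NET[var]
    p_true = table[tuple(state[p] for p in parents)]
    return p_true if value else 1.0 - p_true

def blanket_prob_true(var, state):
    """Pr(var=True | Markov blanket), computed exactly from local tables."""
    scores = {}
    for value in (True, False):
        trial = dict(state, **{var: value})
        score = local_prob(var, value, trial)
        for child in CHILDREN[var]:
            score *= local_prob(child, trial[child], trial)
        scores[value] = score
    return scores[True] / (scores[True] + scores[False])

def gibbs(query, evidence, sweeps=20_000, burn_in=1_000, seed=0):
    """Estimate Pr(query=True | evidence) by Gibbs sampling."""
    rng = random.Random(seed)
    state = dict(evidence)                        # observed variables stay fixed
    hidden = [v for v in NET if v not in evidence]
    for v in hidden:                              # random initial assignment
        state[v] = rng.random() < 0.5
    hits = 0
    for t in range(burn_in + sweeps):
        for v in hidden:                          # resample each hidden variable
            state[v] = rng.random() < blanket_prob_true(v, state)
        if t >= burn_in and state[query]:
            hits += 1
    return hits / sweeps

print(gibbs("B", {"C": True}))
```

The burn-in length and number of sweeps are arbitrary choices here; in practice they depend on how quickly the chain mixes.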
Example • Figure: network over A, B, C, X
Example • Nodes: Smoking, Heart disease, Lung disease, Shortness of breath
Example • Evidence: s, b • Nodes: Smoking, Heart disease, Lung disease, Shortness of breath
Example • Evidence: s, b • Randomly set: h, g • Nodes: Smoking, Heart disease, Lung disease, Shortness of breath
Example • Evidence: s, b • Randomly set: h, g • Sample H using P(H|s,g,b) • Nodes: Smoking, Heart disease, Lung disease, Shortness of breath