
Bayesian Networks



Presentation Transcript


  1. Bayesian Networks CHAPTER 14 Oliver Schulte Summer 2011

  2. Motivation

  3. Logical Inference Does knowledge K entail A? Model-based Rule-based Model checking enumerate possible worlds Logical Calculus Proof Rules Resolution

  4. Probabilistic Inference What is P(A|K)? Model-based Rule-based Sum over possible worlds Probability Calculus • Product Rule • Marginalization • Bayes’ theorem

  5. Knowledge Representation Format

  6. Bayes Net Applications • Used in many applications: medical diagnosis, the Microsoft Office assistant, … • A 400-page book about applications. • Companies: Hugin, Netica, Microsoft.

  7. Basic Concepts

  8. Bayesian Network Structure • A graph where: • Each node is a random variable. • Edges are directed. • There are no directed cycles (directed acyclic graph).

  9. Example: Bayesian Network Graph • Nodes: Cavity, Catch, Toothache • Cavity is the parent of both Catch and Toothache.

  10. Bayesian Network • A Bayesian network structure + • For each node X, for each value x of X, a conditional probability P(X=x | Y1=y1, …, Yk=yk) for every assignment of values y1, …, yk to the parents Y1, …, Yk of X. • Such a conditional probability can be interpreted as a probabilistic Horn clause X = x ← Y1 = y1, …, Yk = yk with the given probability. • Demo in the AIspace tool

  11. Example: Complete Bayesian Network

  12. The Story • You have a new burglar alarm installed at home. • It's reliable at detecting burglary but also responds to earthquakes. • You have two neighbors who promise to call you at work when they hear the alarm. • John always calls when he hears the alarm, but sometimes confuses the alarm with the telephone ringing. • Mary listens to loud music and sometimes misses the alarm.

  13. Bayes Nets Encode the Joint Distribution

  14. Bayes Nets and the Joint Distribution • A Bayes net compactly encodes the joint distribution over the random variables X1,…,Xn. How? • Let x1,…,xn be a complete assignment of a value to each random variable. Then P(x1,…,xn) = Π P(xi|parents(Xi))where the index i=1,…,n runs over all n nodes. • This is the product formula for Bayes nets. • In words, the joint probability is computed as follows. • For each node Xi: • Find the assigned value xi. • Find the values y1,..,yk assigned to the parents of Xi. • Look up the conditional probability P(xi|y1,..,yk) in the Bayes net. • Multiply together these conditional probabilities.
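The node-by-node recipe above can be sketched in a few lines of Python. The CPT layout (nested dicts keyed by a node's value followed by its parents' values) and the example numbers are illustrative assumptions, not taken from the slides:

```python
def joint_probability(assignment, parents, cpt):
    """P(x1,...,xn) = product over nodes of P(xi | values of parents(Xi))."""
    p = 1.0
    for node, value in assignment.items():
        # Look up the CPT entry for this node's value given its parents' values.
        key = (value,) + tuple(assignment[u] for u in parents[node])
        p *= cpt[node][key]
    return p

# Tiny chain Cavity -> Toothache with made-up probabilities.
parents = {'Cavity': (), 'Toothache': ('Cavity',)}
cpt = {
    'Cavity': {(True,): 0.2, (False,): 0.8},
    'Toothache': {(True, True): 0.6, (False, True): 0.4,
                  (True, False): 0.1, (False, False): 0.9},
}
print(round(joint_probability({'Cavity': True, 'Toothache': False},
                              parents, cpt), 3))  # 0.08 = 0.2 * 0.4
```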

  15. Product Formula Example: Burglary • Query: What is the joint probability that all variables are true? • P(M, J, A, E, B) = P(M|A) P(J|A) P(A|E,B) P(E) P(B) = .7 x .9 x .95 x .002 x .001

  16. Cavity Example • Query: What is the joint probability that there is a cavity but no toothache and the probe doesn't catch? • P(Cavity = T, Toothache = F, Catch = F) = P(Cavity = T) P(Toothache = F|Cavity = T) P(Catch = F|Cavity = T) = .2 x .076 x .46

  17. Compactness of Bayesian Networks • Consider n binary variables • An unconstrained joint distribution requires O(2^n) probabilities • If we have a Bayesian network with a maximum of k parents for any node, then we need only O(n 2^k) probabilities • Example • Full unconstrained joint distribution • n = 30: need 2^30 ≈ 10^9 probabilities for the full joint distribution • Bayesian network • n = 30, k = 4: need 30 x 2^4 = 480 probabilities
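The parameter counts on this slide are easy to reproduce:

```python
# Parameter counts: full joint vs. a Bayes net over n binary variables
# with at most k parents per node.
def full_joint_params(n):
    return 2 ** n          # one probability per complete assignment

def bayes_net_params(n, k):
    return n * 2 ** k      # per node: one CPT row per parent assignment

print(full_joint_params(30))    # 1073741824, i.e. about 10^9
print(bayes_net_params(30, 4))  # 480
```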

  18. Completeness of Bayes nets • The Bayes net encodes all joint probabilities. • Knowledge of all joint probabilities is sufficient to answer any probabilistic query. • A Bayes net can in principle answer every query.

  19. Is it Magic? • Why does the product formula work? • The Bayes net topological semantics. • The Chain Rule.

  20. Bayes Nets Graphical Semantics

  21. Bayes net topological semantics • A Bayes net is constructed so that: each variable is conditionally independent of its nondescendants given its parents. • The graph alone (without the specified probabilities) entails these conditional independencies.

  22. Example: Common Cause Scenario • Graph: Cavity is the parent of both Catch and Toothache. • Catch, Toothache are conditionally independent given Cavity.

  23. Independence = Disconnected • Graph: three disconnected nodes A, B, C. • Complete Independence: • all nodes are independent of each other • P(A,B,C) = P(A) P(B) P(C)

  24. Chain Independence • Graph: chain A → B → C. • temporal independence: • Each node is independent of the past given its immediate predecessor. • P(A,B,C) = P(C|B) P(B|A) P(A)
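A quick sanity check of the chain factorization: with any CPT numbers (the ones below are made up), the product P(C|B) P(B|A) P(A) sums to 1 over all assignments, so it is a valid joint distribution:

```python
# Chain A -> B -> C with illustrative (made-up) probabilities.
pA = {True: 0.3, False: 0.7}   # P(A)
pB = {True: 0.8, False: 0.4}   # P(B=T | A)
pC = {True: 0.5, False: 0.2}   # P(C=T | B)

def p(a, b, c):
    # p(A,B,C) = P(A) P(B|A) P(C|B)
    return (pA[a]
            * (pB[a] if b else 1 - pB[a])
            * (pC[b] if c else 1 - pC[b]))

total = sum(p(a, b, c) for a in (True, False)
            for b in (True, False) for c in (True, False))
print(round(total, 9))  # 1.0
```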

  25. Burglary Example • JohnC, MaryC are conditionally independent given Alarm. • Exercise: Use the graphical criterion to deduce at least one other conditional independence relation.
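The stated independence can be verified numerically by brute-force enumeration. The CPT values below are the standard textbook numbers for this network, an assumption beyond the few values quoted on these slides:

```python
from itertools import product

# Standard textbook CPTs for the burglary network (assumed values).
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}   # P(J=T | A)
P_M = {True: 0.70, False: 0.01}   # P(M=T | A)

def joint(b, e, a, j, m):
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

def prob(query, given):
    """P(query | given) by summing the joint over all assignments."""
    num = den = 0.0
    for b, e, a, j, m in product([True, False], repeat=5):
        w = dict(B=b, E=e, A=a, J=j, M=m)
        p = joint(b, e, a, j, m)
        if all(w[k] == v for k, v in given.items()):
            den += p
            if all(w[k] == v for k, v in query.items()):
                num += p
    return num / den

# Conditional independence: learning MaryC doesn't change P(JohnC | Alarm).
print(round(prob({'J': True}, {'A': True}), 6))             # 0.9
print(round(prob({'J': True}, {'A': True, 'M': True}), 6))  # 0.9
```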

  26. Derivation of the Product Formula

  27. The Chain Rule • We can always write P(a, b, c, …, z) = P(a | b, c, …, z) P(b, c, …, z) (Product Rule) • Repeatedly applying this idea, we obtain P(a, b, c, …, z) = P(a | b, c, …, z) P(b | c, …, z) P(c | …, z) … P(z) • Order the variables such that children come before parents. • Then, given its parents, each node is independent of its other ancestors, by the topological independence. • P(a, b, c, …, z) = Π_X P(X | parents(X))

  28. Example in Burglary Network • P(M, J, A, E, B) = P(M|J,A,E,B) P(J,A,E,B) = P(M|A) P(J,A,E,B) = P(M|A) P(J|A,E,B) P(A,E,B) = P(M|A) P(J|A) P(A,E,B) = P(M|A) P(J|A) P(A|E,B) P(E,B) = P(M|A) P(J|A) P(A|E,B) P(E) P(B) • Each simplification step is an application of the Bayes net topological independence.

  29. Explaining Away

  30. A characteristic pattern for Bayes nets • Graph: common effect A → C ← B. • Independent Causes: • A and B are independent. (why?) • "Explaining away" effect: • Given C, observing A makes B less likely. • E.g. Bronchitis in the UBC "Simple Diagnostic Problem". • A and B are (marginally) independent • but become dependent once C is known.
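Explaining away is easy to see numerically in the A → C ← B pattern. The probabilities below are made-up illustrative numbers: each cause is rare, and either cause makes the effect likely.

```python
# "Explaining away": A and B are independent causes of C.
pA = pB = 0.1                      # prior of each cause (made-up)
pC = {(True, True): 0.99, (True, False): 0.9,
      (False, True): 0.9, (False, False): 0.01}  # P(C=T | A, B)

def joint(a, b, c):
    p = (pA if a else 1 - pA) * (pB if b else 1 - pB)
    return p * (pC[(a, b)] if c else 1 - pC[(a, b)])

TF = (True, False)
# P(B=T | C=T): sum the joint over the unobserved cause A.
p_b_given_c = (sum(joint(a, True, True) for a in TF) /
               sum(joint(a, b, True) for a in TF for b in TF))
# P(B=T | C=T, A=T): A already explains the effect, so B becomes less likely.
p_b_given_ca = (joint(True, True, True) /
                sum(joint(True, b, True) for b in TF))
print(round(p_b_given_c, 3), round(p_b_given_ca, 3))  # 0.505 0.109
```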

  31. More graphical independencies. • If A, B have a common effect C, they become dependent conditional on C. • This is not covered by our topological semantics. • A more general criterion called d-separation covers this.

  32. 1st-order Bayes nets • Can we combine 1st-order logic with Bayes nets? • Basic idea: use nodes with 1st-order variables, like Prolog Horn clauses. • For inference, follow grounding approach to 1st-order reasoning. • Important open topic, many researchers working on this, including yours truly.

  33. What Else Is There? • Efficient Inference Algorithms exploit the graphical structure (see book). • Much work on learning Bayes nets from data (including yours truly).

  34. Summary • Bayes nets represent probabilistic knowledge in a graphical way, with analogies to Horn clauses. • Used in many applications and companies. • The graph encodes dependencies (correlations) and independencies. • Supports efficient probabilistic queries.
