
Bayesian Networks




Presentation Transcript


  1. Bayesian Networks VISA Hyoungjune Yi

  2. BN – Intro. • Introduced by Pearl (1986) • Resembles human reasoning • Causal relationship • Decision support system / Expert system

  3. Common Sense Reasoning about uncertainty • June is waiting for Larry and Jacobs, who are both late for the VISA seminar • June is worried that if the roads are icy, one or both of them may have crashed his car • Suddenly June learns that Larry has crashed • June thinks: “If Larry has crashed then probably the roads are icy. So Jacobs has also crashed” • June then learns that it is warm outside and the roads are salted • June thinks: “Larry was unlucky; Jacobs should still make it”

  4. Causal Relationships • [Diagram: State of Road (icy / not icy) → Larry (crash / no crash); State of Road → Jacobs (crash / no crash)]

  5. Larry Crashed! • [Diagram: State of Road (icy / not icy) → Larry (crash / no crash); State of Road → Jacobs (crash / no crash)] • Information flow: from Larry, through State of Road, to Jacobs

  6. But Roads are dry • [Diagram: State of Road (not icy) → Larry (crash / no crash); State of Road → Jacobs (crash / no crash)] • Information flow: blocked at State of Road, which is now known
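
June's reasoning in slides 3 to 6 can be checked numerically. The following is a minimal sketch, assuming made-up probabilities (the slides give none) and the structure State of Road → Larry, State of Road → Jacobs; the names `P_ICY`, `P_CRASH`, `joint` and `prob_jacobs_crash` are illustrative only.

```python
# Hypothetical numbers, chosen only for illustration (not from the slides).
P_ICY = 0.3                          # prior P(Road = icy)
P_CRASH = {True: 0.8, False: 0.1}    # P(crash | icy?), same for both drivers


def joint(icy, larry, jacobs):
    """P(Road, Larry, Jacobs) = P(Road) P(Larry | Road) P(Jacobs | Road)."""
    p = P_ICY if icy else 1 - P_ICY
    p *= P_CRASH[icy] if larry else 1 - P_CRASH[icy]
    p *= P_CRASH[icy] if jacobs else 1 - P_CRASH[icy]
    return p


def prob_jacobs_crash(larry=None, icy=None):
    """P(Jacobs = crash | evidence), summing the joint over consistent worlds."""
    num = den = 0.0
    for i in (True, False):
        for l in (True, False):
            for j in (True, False):
                if larry is not None and l != larry:
                    continue
                if icy is not None and i != icy:
                    continue
                p = joint(i, l, j)
                den += p
                if j:
                    num += p
    return num / den


prior = prob_jacobs_crash()                           # no evidence yet
after_larry = prob_jacobs_crash(larry=True)           # "Larry has crashed"
after_dry = prob_jacobs_crash(larry=True, icy=False)  # "roads are not icy"
print(prior, after_larry, after_dry)
```

With these assumed numbers the belief that Jacobs crashed rises once Larry's crash is known, and falls back to the not-icy crash rate once the road state is observed, regardless of Larry.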

  7. Wet grass • To avoid icy roads, Larry moves to UCLA; Jacobs moves to USC • One morning as Larry leaves for work, he notices that his grass is wet. He wonders whether he left his sprinkler on or whether it has rained • Glancing over at Jacobs’ lawn, he notices that it is also wet • Larry thinks: “Since Jacobs’ lawn is wet, it probably rained last night” • Larry then thinks: “If it rained then that explains why my lawn is wet, so probably the sprinkler is off”

  8. Larry’s grass is wet • [Diagram: Sprinkler (on / off) → Larry’s grass (wet) ← Rain (yes / no); Rain → Jacobs’ grass (wet / dry)] • Information flow: from Larry’s grass to both Sprinkler and Rain

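  9. Jacobs’ grass is also wet • [Diagram: Sprinkler (on / off) → Larry’s grass (wet) ← Rain (yes / no); Rain → Jacobs’ grass (wet)] • Information flow: from Jacobs’ grass to Rain, and from Rain to Sprinkler through Larry’s wet grass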
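
The wet-grass story is an instance of "explaining away". A minimal sketch, assuming the structure Sprinkler → Larry's grass ← Rain → Jacobs' grass and invented probabilities (all numbers and names below are hypothetical):

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (not from the slides).
P_RAIN = 0.2
P_SPRINKLER = 0.1
# P(Larry's grass wet | rain, sprinkler)
P_LARRY_WET = {(True, True): 0.99, (True, False): 0.9,
               (False, True): 0.9, (False, False): 0.0}
# P(Jacobs' grass wet | rain)
P_JACOBS_WET = {True: 0.9, False: 0.05}

NAMES = ("rain", "sprinkler", "larry_wet", "jacobs_wet")


def bern(p, value):
    """Probability of a binary outcome whose 'True' probability is p."""
    return p if value else 1 - p


def joint(rain, sprinkler, larry_wet, jacobs_wet):
    """Joint probability as the product of the four local distributions."""
    return (bern(P_RAIN, rain)
            * bern(P_SPRINKLER, sprinkler)
            * bern(P_LARRY_WET[(rain, sprinkler)], larry_wet)
            * bern(P_JACOBS_WET[rain], jacobs_wet))


def prob_sprinkler_on(**evidence):
    """P(sprinkler = on | evidence) by brute-force enumeration."""
    num = den = 0.0
    for values in product((True, False), repeat=4):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(**world)
        den += p
        if world["sprinkler"]:
            num += p
    return num / den


prior = prob_sprinkler_on()
after_larry = prob_sprinkler_on(larry_wet=True)
after_both = prob_sprinkler_on(larry_wet=True, jacobs_wet=True)
print(prior, after_larry, after_both)
```

Under these assumptions, seeing his own wet lawn makes the sprinkler more likely, and seeing Jacobs' wet lawn lowers it again because rain now accounts for the evidence.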

  10. Bayesian Network • Data structure which represents the dependence between variables • Gives a concise specification of the joint prob. dist. • A Bayesian Belief Network is a graph for which the following holds • Nodes are a set of random variables • Each node has a conditional prob. table • Edges denote conditional dependencies • DAG: no directed cycles • Markov condition

  11. Bayesian network • [Diagram: nodes Y1, Y2, X] • Markov Assumption • Each random variable X is independent of its non-descendants given its parents Pa(X) • Formally, Ind(X; NonDesc(X) | Pa(X)) if G is an I-Map of P (I-Maps are defined later)

  12. Markov Assumption • [Diagram: nodes Burglary (B), Earthquake (E), Radio (R), Alarm (A), Call (C)] • In this example: • Ind(E; B) • Ind(B; E, R) • Ind(R; A, B, C | E) • Ind(A; R | B, E) • Ind(C; B, E, R | A)
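
The five statements above can be derived mechanically from the graph. A minimal sketch, assuming the parent sets implied by the factorization on slide 16 (Radio has parent Earthquake, Alarm has parents Burglary and Earthquake, Call has parent Alarm); `PARENTS`, `descendants` and `markov_statement` are illustrative names:

```python
# Parent sets read off the slide-16 factorization
# P(B) P(E) P(R|E) P(A|B,E) P(C|A).
PARENTS = {
    "Burglary": [],
    "Earthquake": [],
    "Radio": ["Earthquake"],
    "Alarm": ["Burglary", "Earthquake"],
    "Call": ["Alarm"],
}


def descendants(node):
    """All nodes reachable from `node` by following edges forward."""
    children = {n: [c for c, ps in PARENTS.items() if n in ps] for n in PARENTS}
    seen, stack = set(), [node]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def markov_statement(node):
    """Return (non-descendants excluding parents, parents) for `node`."""
    parents = set(PARENTS[node])
    others = set(PARENTS) - {node} - descendants(node) - parents
    return sorted(others), sorted(parents)


for node in PARENTS:
    others, parents = markov_statement(node)
    given = f" | {', '.join(parents)}" if parents else ""
    print(f"Ind({node}; {', '.join(others)}{given})")
```

Each printed line corresponds to one bullet on the slide, with full node names in place of the initials.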

  13. I-Maps • A DAG G is an I-Map of a distribution P if all the Markov assumptions implied by G are satisfied by P • Examples: [two diagrams over nodes X and Y]

  14. I-MAP • G is a Minimal I-Map iff • G is an I-Map of P • If G’ ⊂ G then G’ is not an I-Map of P • An I-Map is not unique

  15. Factorization • [Diagram: nodes X and Y] • Given that G is an I-Map of P, can we simplify the representation of P? • Example: • Since Ind(X;Y), we have that P(X|Y) = P(X) • Applying the chain rule, P(X,Y) = P(X|Y) P(Y) = P(X) P(Y) • Thus, we have a simpler representation of P(X,Y)

  16. Factorization Theorem • [Diagram: nodes Burglary (B), Earthquake (E), Radio (R), Alarm (A), Call (C)] • Thm: if G is an I-Map of P, then the chain-rule expansion P(C,A,R,E,B) = P(B) P(E|B) P(R|E,B) P(A|R,B,E) P(C|A,R,B,E) simplifies to P(C,A,R,E,B) = P(B) P(E) P(R|E) P(A|B,E) P(C|A)

  17. So, what? • We can write P in terms of “local” conditional probabilities • If G is sparse, that is, |Pa(Xi)| < k, ⇒ each conditional probability can be specified compactly, e.g. for binary variables, these require O(2^k) params. ⇒ the representation of P is compact: linear in the number of variables
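
The saving can be counted directly for the five-node alarm example. A minimal sketch, assuming all variables are binary and the parent sets from the slide-16 factorization; the function names are illustrative:

```python
# Parent sets from P(B) P(E) P(R|E) P(A|B,E) P(C|A); all variables binary.
PARENTS = {
    "Burglary": [],
    "Earthquake": [],
    "Radio": ["Earthquake"],
    "Alarm": ["Burglary", "Earthquake"],
    "Call": ["Alarm"],
}


def full_joint_params(n):
    """Free parameters of an unrestricted joint over n binary variables."""
    return 2 ** n - 1


def network_params(parents):
    """One free parameter per node per joint setting of its parents."""
    return sum(2 ** len(ps) for ps in parents.values())


print(full_joint_params(len(PARENTS)))  # unrestricted joint
print(network_params(PARENTS))          # factorized form
```

Here the unrestricted joint needs 31 free parameters, while the factorized form needs 1 + 1 + 2 + 4 + 2 = 10.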

  18. Formal definition of BN • A Bayesian network specifies a probability distribution via two components: • A DAG G • A collection of conditional probability distributions P(Xi|Pai) • The joint distribution P is defined by the factorization P(X1, …, Xn) = ∏i P(Xi|Pai) • Additional requirement: G is a minimal I-Map of P

  19. Bayesian Network - Example • [Diagram: nodes Pneumonia (P), Tuberculosis (T), Lung Infiltrates (I), XRay, Sputum Smear] • [Table P(I | P, T): one row per setting of (P, T), in slide order: 0.8 / 0.2, 0.6 / 0.4, 0.2 / 0.8, 0.01 / 0.99] • Each node Xi has a conditional probability distribution P(Xi|Pai) • If variables are discrete, P is usually multinomial • P can be linear Gaussian, mixture of Gaussians, …

  20. BN Semantics • [Diagram: network over P, T, I, X, S] • conditional independencies in BN structure + local probability models = full joint distribution over domain • Compact & natural representation: • nodes have ≤ k parents ⇒ 2^k n vs. 2^n params

  21. d-separation • d-sep(X;Y | Z, G) • X is d-separated from Y, given Z if all paths from a node in X to a node in Y are blocked given Z • Meaning ? • On the blackboard • Path • Active: dependency between end nodes in the path • Blocked: No dependency • Common cause, Intermediate, common effect • On the blackboard
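
Since the slide defers the details to the blackboard, here is one way to test d-separation in code. This is a sketch of a standard equivalent criterion, not necessarily the method shown in the talk: X and Y are d-separated given Z exactly when they are disconnected in the moralized ancestral graph of X ∪ Y ∪ Z after removing Z. It assumes X, Y and Z are disjoint and reuses the alarm network's parent sets from the slide-16 factorization; all names are illustrative.

```python
from itertools import combinations

PARENTS = {
    "Burglary": [],
    "Earthquake": [],
    "Radio": ["Earthquake"],
    "Alarm": ["Burglary", "Earthquake"],
    "Call": ["Alarm"],
}


def ancestral_set(nodes, parents):
    """The given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(xs, ys, zs, parents=PARENTS):
    """True if every path between xs and ys is blocked given zs."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestral_set(xs | ys | zs, parents)
    # Moralize: connect each child to its parents, and co-parents to each other.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = parents[child]  # all parents of a kept node are kept as well
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(ps, 2):
            adj[a].add(b)
            adj[b].add(a)
    # Undirected search from xs that never enters an observed node.
    stack = list(xs)
    seen = set(stack)
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        for nb in adj[node]:
            if nb not in seen and nb not in zs:
                seen.add(nb)
                stack.append(nb)
    return True


# Common cause: Radio <- Earthquake -> Alarm
print(d_separated({"Radio"}, {"Alarm"}, set()), d_separated({"Radio"}, {"Alarm"}, {"Earthquake"}))
# Intermediate: Burglary -> Alarm -> Call
print(d_separated({"Burglary"}, {"Call"}, set()), d_separated({"Burglary"}, {"Call"}, {"Alarm"}))
# Common effect: Burglary -> Alarm <- Earthquake
print(d_separated({"Burglary"}, {"Earthquake"}, set()), d_separated({"Burglary"}, {"Earthquake"}, {"Alarm"}))
```

The three printed pairs cover the three path types named on the slide: observing the middle node blocks a common-cause or intermediate path, whereas observing a common effect activates the path between its parents.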

  22. BN – Belief, Evidence and Query • BN is for “Query” - partly • Query involves evidence • Evidence is an assignment of values to a set of variables in the domain • Query is a posteriori belief • Belief • P(x) = 1 or P(x) = 0

  23. Learning Structure • Problem Definition • Given: Data D • Return: directed graph expressing BN • Issue • Superfluous edges • Missing edges • Very difficult • http://robotics.stanford.edu/people/nir/tutorial/

  24. BN Learning • [Diagram: Data → Inducer → BN over P, T, I, X, S] • BN models can be learned from empirical data • parameter estimation via numerical optimization • structure learning via combinatorial search • BN hypothesis space biased towards distributions with independence structure
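
As a small illustration of parameter estimation, consider the simplest case: when the data are complete and variables are discrete, the maximum-likelihood estimate of a conditional probability table reduces to relative frequencies, so no iterative optimization is needed. The sketch below assumes a hypothetical data set of (alarm, call) observations; the data and the name `estimate_cpt` are invented for illustration.

```python
from collections import Counter

# Hypothetical complete observations of (alarm, call); not from the slides.
DATA = [
    (True, True), (True, True), (True, False), (False, False),
    (False, False), (False, False), (False, True), (True, True),
]


def estimate_cpt(samples):
    """Return {parent_value: P(child = True | parent_value)} from counts."""
    totals, positives = Counter(), Counter()
    for parent, child in samples:
        totals[parent] += 1
        positives[parent] += int(child)
    return {value: positives[value] / totals[value] for value in totals}


cpt = estimate_cpt(DATA)
print(cpt[True], cpt[False])  # estimates of P(call | alarm), P(call | no alarm)
```

With missing values or hidden variables this closed form no longer applies, which is where iterative numerical methods come in.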
