Bayes Rule and Bayesian Networks


Presentation Transcript


  1. Bayes Rule and Bayesian Networks Presentation by: Ravikiran Gunale Y7159

  2. Bayes Rule: If the conditional probability of event B given event A is known, how do we determine the conditional probability of event A given event B? P(A | B) = P(B | A) P(A) / P(B) • Bayes' rule provides a mathematical rule for how you should change your existing beliefs in the light of new evidence.
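
A minimal worked example of the rule, with invented numbers (a test B for a condition A); P(B) comes from the law of total probability:

```python
# Invented numbers for illustration: a test B for a condition A.
p_b_given_a = 0.9       # P(B | A): probability of a positive test given A
p_a = 0.01              # P(A): prior probability of A
p_b_given_not_a = 0.05  # P(B | not A): false-positive rate
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)  # total probability
p_a_given_b = p_b_given_a * p_a / p_b                  # Bayes' rule
print(f"P(A | B) = {p_a_given_b:.3f}")                 # ~0.154: still low!
```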

  3. Propositional logic is fully formalized. • A common situation in human reasoning, however, is the need to perform inference from incomplete and uncertain knowledge. • Propositional logic fails in this situation. • Hence, Bayes' rule and Bayesian networks.

  4. Bayesian Networks: A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph. (Figure: a car example drawn as a directed graph; the nodes include fuel, a clean spark plug, the fuel-meter reading, and whether the car starts.)

  5. Let G = (V, E) be a directed acyclic graph. • X is a Bayesian network with respect to G if its joint probability density function can be written as a product of the individual density functions, conditional on their parent variables: P(x1, ..., xn) = Π_i P(xi | parents(xi)). • Equivalently, X is a Bayesian network with respect to G if it satisfies the local Markov property: each variable is conditionally independent of its non-descendants given its parent variables.
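
A minimal sketch of this factorization on an invented two-parent network A → C ← B (all CPT numbers are made up):

```python
# Each node maps to (parents, CPT); the CPT gives P(node = True | parent values).
network = {
    "A": ((), {(): 0.3}),
    "B": ((), {(): 0.6}),
    "C": (("A", "B"), {(True, True): 0.9, (True, False): 0.5,
                       (False, True): 0.4, (False, False): 0.1}),
}

def joint(assignment):
    """P(full assignment) as a product of local conditional probabilities."""
    prob = 1.0
    for node, (parents, cpt) in network.items():
        p_true = cpt[tuple(assignment[par] for par in parents)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob

print(joint({"A": True, "B": False, "C": True}))  # 0.3 * 0.4 * 0.5 = 0.06
```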

  6. An Example. Nodes: family out (fo), unhealthy (U), lights on (lo), dog out (do), hear bark (H). Arcs: fo → lo, fo → do, U → do, do → H.
P(fo) = 0.15, P(U) = 0.1
P(do | fo, U) = 0.99, P(do | fo, not U) = 0.9, P(do | not fo, U) = 0.97, P(do | not fo, not U) = 0.3
P(lo | fo) = 0.6, P(lo | not fo) = 0.15
P(H | do) = 0.7, P(H | not do) = 0.01
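
A sketch of exact inference by enumeration on this network, computing P(fo | lo, H) with the probabilities above (the query itself is chosen here for illustration; it is not on the slide):

```python
from itertools import product

P_fo, P_U = 0.15, 0.1
P_do = {(True, True): 0.99, (True, False): 0.9,
        (False, True): 0.97, (False, False): 0.3}   # P(do | fo, U)
P_lo = {True: 0.6, False: 0.15}                     # P(lo | fo)
P_H  = {True: 0.7, False: 0.01}                     # P(H | do)

def joint(fo, U, lo, do, H):
    """Joint probability as the product of the five local conditionals."""
    p = (P_fo if fo else 1 - P_fo) * (P_U if U else 1 - P_U)
    p *= P_do[(fo, U)] if do else 1 - P_do[(fo, U)]
    p *= P_lo[fo] if lo else 1 - P_lo[fo]
    p *= P_H[do] if H else 1 - P_H[do]
    return p

# Sum out the unobserved variables U and do, with evidence lo = H = true.
num = sum(joint(True, U, True, do, True)
          for U, do in product([True, False], repeat=2))
den = sum(joint(fo, U, True, do, True)
          for fo, U, do in product([True, False], repeat=3))
print(f"P(fo | lo, H) = {num / den:.3f}")
```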

  7. Independence assumptions: • If there are n binary random variables, the complete joint distribution is specified by 2^n - 1 probabilities. • For n = 5, 2^n - 1 = 31, but the network above needs only 10 values (one probability per node per parent configuration: 1 + 1 + 2 + 4 + 2). For a comparably sparse network with n = 10, only 21 values are needed. Where do these savings come from? • Bayesian networks have built-in independence assumptions; in the example above, hearing a bark is independent of the family being out once we know whether the dog is out.
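
The parameter count can be checked mechanically; the parent lists below mirror the network from the previous slide:

```python
# Counting free parameters: each binary node needs one probability per
# configuration of its parents, i.e. 2**len(parents) values.
parents = {"fo": [], "U": [], "lo": ["fo"], "do": ["fo", "U"], "H": ["do"]}
n_params = sum(2 ** len(ps) for ps in parents.values())
print(n_params)                 # 10, versus 2**5 - 1 = 31 for the full joint
```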

  8. d-separation: • Let P be a trail from node u to node v. Then P is said to be d-separated (blocked) by a set of nodes Z iff one of the following holds: • P contains a chain, i → m → j, such that the middle node m is in Z, • P contains a chain, i ← m ← j, such that the middle node m is in Z, • P contains a fork, i ← m → j, such that the middle node m is in Z, or • P contains a collider (inverted fork), i → m ← j, such that the middle node m is not in Z and no descendant of m is in Z. • If u and v are not d-separated by Z, they are d-connected.
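
A minimal sketch that checks these four cases for one given trail; the toy graph below (nodes u, m, v, w) is invented for illustration:

```python
# `dag` maps each node to its list of children.
dag = {"u": ["m"], "m": ["v"], "v": [], "w": ["m"]}

def descendants(dag, node):
    """All nodes reachable from `node` by following edges forward."""
    out, stack = set(), list(dag[node])
    while stack:
        n = stack.pop()
        if n not in out:
            out.add(n)
            stack.extend(dag[n])
    return out

def blocked(dag, trail, Z):
    """True if the trail is d-separated by the conditioning set Z."""
    for i, m, j in zip(trail, trail[1:], trail[2:]):
        head_i = m in dag[i]                 # is the edge i -> m ?
        head_j = m in dag[j]                 # is the edge j -> m ?
        if head_i and head_j:                # collider i -> m <- j
            if m not in Z and not (descendants(dag, m) & Z):
                return True                  # case 4: collider outside Z blocks
        elif m in Z:
            return True                      # cases 1-3: chain/fork through Z
    return False

print(blocked(dag, ["u", "m", "v"], Z=set()))   # False: open chain
print(blocked(dag, ["u", "m", "v"], Z={"m"}))   # True: chain through Z
print(blocked(dag, ["u", "m", "w"], Z=set()))   # True: unconditioned collider
```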

  9. Consistent probabilities: Consider a system in which we have P(A | B) = 0.7, P(B | A) = 0.3, P(B) = 0.5. These values are inconsistent: P(A and B) = P(A | B) P(B) = 0.35, but P(A and B) = P(B | A) P(A) would then force P(A) = 0.35 / 0.3 ≈ 1.17 > 1. The following property of Bayesian networks comes to the rescue: if you specify the probabilities of all the nodes given all parent combinations, then • the numbers will be consistent, and • the network will uniquely define a distribution.
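
The arithmetic behind the inconsistency can be checked directly:

```python
# Both factorizations of P(A and B) must agree; here that forces P(A) > 1.
p_a_given_b, p_b_given_a, p_b = 0.7, 0.3, 0.5
p_ab = p_a_given_b * p_b            # P(A and B) = 0.35
p_a = p_ab / p_b_given_a            # implied P(A) = 0.35 / 0.3
print(f"implied P(A) = {p_a:.3f}")  # ~1.167 > 1, so the values are inconsistent
```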

  10. Inference and learning. Parameter learning, case 1: complete data. Each data case is a configuration over all the variables in the network. To ensure that the parameters can be learned independently, we make two assumptions: • global (parameter) independence • local (parameter) independence

  11. Maximum likelihood estimation: The likelihood of M given D is L(M | D) = Π_{d in D} P(d | M), where M is the network and D the set of cases. We then choose the parameter value that maximizes the likelihood: a* = arg max_a L(M_a | D). Bayesian estimation: • MLE has drawbacks when used on a sparse database. • For Bayesian estimation, start with a prior distribution over the parameters and use the data to update it.
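
A sketch of both estimators for a single CPT entry of the dog network, P(do | fo, not U); the data cases are invented, and the Beta(1, 1) prior is just one simple choice:

```python
cases = [  # each case: (fo, U, do) -- hypothetical complete-data observations
    (True, False, True), (True, False, True), (True, False, False),
    (False, True, True), (True, False, True),
]
# Restrict to the parent configuration fo = true, U = false.
matching = [do for fo, U, do in cases if fo and not U]
n, k = len(matching), sum(matching)
mle = k / n                  # 3/4 = 0.75; unstable when n is small
bayes = (k + 1) / (n + 2)    # posterior mean under a Beta(1, 1) prior
print(mle, bayes)            # 0.75 vs ~0.667: the prior pulls toward 0.5
```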

  12. Incomplete data: Some values may be missing or intentionally removed, and in the extreme case some variables may simply not be observable. Approximate techniques, such as Expectation-Maximization (EM) or gradient-based methods, are then used for parameter estimation.
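
A minimal EM sketch for the smallest interesting case, a two-node network X → Y with some X values unobserved (data and starting parameters invented):

```python
data = [(1, 1), (None, 1), (0, 0), (None, 0), (1, 1), (0, 1)]  # (x, y) pairs
p, q1, q0 = 0.5, 0.5, 0.5   # P(X=1), P(Y=1 | X=1), P(Y=1 | X=0)

for _ in range(50):
    # E-step: expected value of X for every case (posterior where missing).
    r = []
    for x, y in data:
        if x is not None:
            r.append(float(x))
        else:
            a = p * (q1 if y else 1 - q1)          # joint with X = 1
            b = (1 - p) * (q0 if y else 1 - q0)    # joint with X = 0
            r.append(a / (a + b))                  # posterior P(X=1 | y)
    # M-step: re-estimate the parameters from expected counts.
    ys = [y for _, y in data]
    p = sum(r) / len(r)
    q1 = sum(ri * yi for ri, yi in zip(r, ys)) / sum(r)
    q0 = sum((1 - ri) * yi for ri, yi in zip(r, ys)) / sum(1 - ri for ri in r)

print(round(p, 3), round(q1, 3), round(q0, 3))
```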

  13. Structure Learning: • In the simplest case, the Bayesian network is specified by an expert and afterwards used for inference. • In other applications, the network structure and the parameters must be learned from data. • Type 1, the chain X → Y → Z, and type 2, the fork X ← Y → Z, represent the same dependency structure (X independent of Z given Y) and cannot be told apart from observational data; type 3, the collider X → Y ← Z, can be uniquely identified, as illustrated below. • Constraint-based learning therefore first determines the skeleton of the underlying graph systematically, • and then orients all arrows whose directionality is dictated by the conditional independencies observed.
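
A small sampling sketch of why the collider stands out: in X → Y ← Z, X and Z are marginally independent but become dependent once Y is observed (the xor mechanism is an invented example):

```python
import random

random.seed(0)
samples = [(random.random() < 0.5, random.random() < 0.5)
           for _ in range(100_000)]
data = [(x, z, x ^ z) for x, z in samples]   # collider: Y = X xor Z

def p_x(rows):
    """Empirical P(X = true) over the given rows."""
    return sum(x for x, _, _ in rows) / len(rows)

print(p_x([r for r in data if r[1]]))           # P(X | Z): ~0.5, independent
print(p_x([r for r in data if r[1] and r[2]]))  # P(X | Z, Y): 0.0, dependent!
```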

  14. Applications: • The most common application is medical diagnosis, e.g. PATHFINDER, a program to diagnose diseases of the lymph node. • Other uses include modeling knowledge in computational biology and bioinformatics, decision support systems, image processing, etc.
