250 likes | 462 Views
Chapter 5 Belief Updating in Bayesian Networks. Bayesian Networks and Decision Graphs Finn V. Jensen Qunyuan Zhang Division. of Statistical Genomics, CGS Statistical Genetics Forum May 7,2007. Contents of the Book. I A practical Guide to Normative Systems
E N D
Chapter 5 Belief Updating in Bayesian Networks Bayesian Networks and Decision Graphs Finn V. Jensen Qunyuan Zhang Division. of Statistical Genomics, CGS Statistical Genetics Forum May 7,2007
Contents of the Book I A practical Guide to Normative Systems 1 Causal and Bayesian Network 2 Building Models 3 Learning, Adaption, and Tuning 4 Decision Graphs II Algorithms for Normative Systems 5 Belief Updating in Bayesian Network 6 Bayesian Network Analysis Tools 7 Algorithms for Influence Diagrams
Structure of the Book I. What is BN? II. How to create a BN? III. What can we use BN to do? and how? [to know sth.] Prob.(a single variable | BN) Joint Prob.(a set variables | BN) Importance of varibales evidence sensitivity parameter sensitivity Data conflict analysis [to make decision] Optimal decision (cost & gain) 1 Causal and Bayesian Network 2 Building Models 3 Learning, Adaption, and Tuning 5 Belief Updating in Bayesian Network 6 Bayesian Network Analysis Tools 4 Decision Graphs 7 Algorithms for Influence Diagrams
V1+V2 V1+V2 V1+V2 V1+V2 AXC AXC AXC AXC D2 D2 D2 D2 V1+V2 V1+V2 V1+V2 V1+V2 AXC AXC AXC AXC D1 T T A B C T V1 V2 D2 BN & Decision Tree U=V1+V2 D1 P(A,C|D1,T,D2)
“BN” of the Book Rules & Theories Data & Algorithms Concept of BN BN Learning (uncertain part of structure) Model Biulding (known part of structure) BN (structure & parameters) Probability Calculation Knowing, Understanding & Explaining Decisions Actions Cost & Gain Changes
C X1 A B X3 X2 Y D e X1 X2 X3 E F Y Chapter 5 Belief Updating in Bayesian Networks Belief = Probability Belief updating = Probability calculating based on a BN (model, parameters and/or evidences) Linear Model BN Logistic Model Marginal Probability P(Y) =∑[-Y] φ Conditional Probability P(Y| X1,X2,X3)
Marginal Probability Calculation in BN I. Simplification (5.5) II. Marginalization (5.2),(5.3),(5.4),(5.6) III. Simulation (5.7)
e F F B B D D G G E E E A A A C C C D D D B B B d-separation e e eG E E A A C C F F F e I. Simplifications By excluding the non-informative nodes (white nodes) Barren Nodes Graph-theoretic Representation e Definitions, Propositions & Theorems
II. Marginalization Calculating sums of products of potentials by eliminating variables repeatedly
Joint Probabilities Marginal Probabilities
A1 A2 A3 A4 A5 A6 An Example of Marginalization/Elimination BN parameters (potentials) : φ1=P (A1) , φ2=P (A2|A1) , φ3=P (A3|A1), φ4=P (A4|A2) φ5=P (A5|A2, A3), φ6=P (A6|A3) P(A4)=? Distributive Law
A1 A2 A3 A4 A5 A6 Marginalization/Elimination Order Variable Elimination Order
A1 A2 A3 A4 A5 A6 Marginalization/Elimination Domain: a set of variables in BN Potential: a real-valued probabilistic table over a domain φ1=P (A1) , φ2=P (A2|A1) , φ3=P (A3|A1), φ4=P (A4|A2) φ5=P (A5|A2, A3), φ6=P (A6|A3) Graph-theoretic Representation Definition 5.1 (Elimination) Let Фbe a set of potentials, and let X be a variable. X is eliminated from Ф by: 1.Remove all potentials in Ф with X in their domains. Call the removed set ФX X=A3 => ФX=(φ3, φ5, φ6 ), Ф=(φ1, φ2, φ4 ) 2.Calculate φ-X = ∑xΠФX = ∑A3φ3φ5φ6 3.Add φ-X to Ф. Call the result set Ф-X =(φ1, φ2, φ4 , φ-X ) P(Y) is calculated by repeatedly eliminating the variables except Y Question : how to find an efficient/optimal elimination order? Definitions, Propositions & Theorems
A1 A1 A2 A3 A2 A3 A4 A5 A6 A4 A5 A6 Domain Graphs Graph-theoretic Representation BN graph 6 domains φ1(A1) , φ2 (A2,A1) , φ3(A3,A1), φ4(A4,A2) φ5 (A5,A2,A3), φ6(A6,A3) Domain graph 6 domains φ1(A1) , φ2 (A2,A1) , φ3(A3,A1), φ4(A4,A2) φ5 (A5,A2,A3), φ6(A6,A3) Definitions, Propositions & Theorems
Perfect Elimination Sequence A1 A1 Graph-theoretic Representation A2 A2 A3 A4 A5 A6 A4 A5 A6 Fill-ins (red links) Perfect Elimination Sequence An elimination sequence without introducing fill-ins. e.g. A6, A5, A3, A1, A2 down to A4 => P(A4) A5, A6, A3, A1, A2 down to A4 => P(A4) A1, A5, A6, A3, A2 down to A4 => P(A4) Definitions, Propositions & Theorems
Domain Set of Elimination Sequence The domain set of an elimination sequence is the set of domains of potentials produced during the elimination where potentials that are subsets of other potentials are removed. For the sequence A6, A5, A3, A1, A2 down to A4 => P(A4) the set of domains is {(A6,A3),(A2,A3,A5),(A1,A2,A3), (A1,A2),(A2,A4)} Domain set reflects the complexity of an elimination sequence. Question: how to find the smallest domain set ? Graph-theoretic Representation Definitions, Propositions & Theorems
Set of Cliques All perfect elimination sequences produce the same the domain set, namely the set of cliques of the domain graph. e.g. all the sequences A6, A5, A3, A1, A2 down to A4 A5, A6, A3, A1, A2 down to A4 A1, A5, A6, A3, A2 down to A4 produce the domain set {(A6,A3),(A2,A3,A5),(A1,A2,A3), (A1,A2),(A2,A4)} which contains 5 domains / cliques Any perfect elimination sequence is optimal. Cliques are a set of domains produce by perfect elimination sequences. Clique set is the optimal set of domains. Question: how to determine the set of cliques? Graph-theoretic Representation Definitions, Propositions & Theorems
Triangulated Graphs An undirected graph with a perfect elimination sequence is called a triangulated graph. A triangulated graph A nontriangulated graph Perfect elimination sequence No perfect elimination sequence A5, A2, A4, A3 down to A1 Graph-theoretic Representation A1 A2 A1 A2 A3 A3 A4 A5 A4 A5 Definitions, Propositions & Theorems
Cliques in Triangulated Graphs D A B X E C X : a node in domain graph Fx : the set of neighbor nodes of X plus X Simplicial: nodes with a complete neighbor set are called simplicial To determine the set of cliques in a triangulated graph 1. Eliminate a simplicial node X. Fx is a clique candidate. 2. If Fx does not include all remaining nodes, go to 1. 3. Prune the set of cliques candidates by removing sets that are subsets of other clique candidates. 4. The resulting set is the set of cliques. Question: given a set of cliques, how to determine the perfect elimination order? Graph-theoretic Representation Definitions, Propositions & Theorems
A B E C D F H G I BCDE V10 J BCDE BCDE ABCD DEFI ABCD DEFI CGHJ V5 ABCD V1 BCDG V1 DEFI V3 BCD S1 DE S3 BCD S1 CG S5 BCDG BCDG CGHJ CGHJ Cliques (V) and Separators (S) Join Tree An organized tree of cliques, in which all nodes on the path between V and W contain the intersection of V and W. Graph-theoretic Representation A join tree Elimination sequence A,F,I,H,J,G,B,C,D down to E Not a join tree A domain graph Definitions, Propositions & Theorems
φ1,φ2,φ3 V4: A1, A2, A3 ↑ ↓ S1:A3 ↑ ↓ S4:A2 ↑ ↓ S2:A2,A3 φ4 V6: A2, A4 φ6 V1: A3, A6 φ5 V2: A2, A3, A5 Propagation Junction Trees A junction tree is a join tree with the following structure: 1. Each potential is attached to a clique containing the domain of this potential (cliques) 2. Each link has the appropriate separator attached (separable) 3. Each separator contains two “mailboxes”, one for each direction (mutual communication) Graph-theoretic Representation Collect evidence to V6 distribute evidence from V6 Junction trees provide a general framework for finding optimal elimination sequence for triangulated graphs. Question: what if a graph is non-triangulated? Definitions, Propositions & Theorems
A B C A B C A B C D E D E D E F G F G F G H I J H I J H I J Triangulations Convert a non-triangulated graph into a triangulated one by adding new link(s) BN non-triangulated graph triangulated graph Graph-theoretic Representation Optimal triangulation? Minimal fill-in size? Heuristic approach: eliminate repeatedly a smplicial node, and if this is not possible, eliminate a node X with minimal size of Fx. Definitions, Propositions & Theorems
A C B E D III. Stochastic Simulations Gibbs Sampling Evidence: B=n, E=n; P(B=n,E=n) is rare P(A)=? P(C| B=n,E=n, A=a0, D=d0) => c1 P(D| B=n,E=n,C=c1,A=a0) => d1 P(A| B=n,E=n, D=d1,C=c1) => a1 P(C| B=n,E=n, A=a1, D=d1) => c2 . . discard P(C| B=n,E=n, A=at-1, D=dt-1) => ct . collect . . Forward Sampling 1. P(A) => A 2. P(B|A)=>B, P(C|A)=>C 3. P(D|B)=>D 4. P(E|C,D)=>E 5. Repeat steps 1~4