Belief Propagation: An Extremely Rudimentary Discussion
McLean & Pavel
The Problem • When we have a tree structure describing dependencies between variables, we want to update our probability distributions based on evidence • Trees are nice to work with, because the joint distribution can be expressed as the product of the edge marginals divided by the product of the separator-node marginals • Simple case: if your risk of heart disease depends on your father’s risk of heart disease, what can you say about your own risk if you know your grandfather’s risk? • Essentially: given a prior distribution on the tree, find the posterior distribution given some observed evidence
The Solution • Belief Propagation is an algorithm for incorporating evidence into a tree distribution • It is non-iterative: only two passes through the tree are needed to obtain the updated distribution
The Algorithm • First, incorporate the evidence: for each observed variable, pick one edge containing it, and set every entry in that edge table that does not correspond to the observed value to zero • Next, choose some edge as the root • Collect evidence inward from every direction toward the root • Normalize the root edge table • Distribute evidence outward from the root in every direction
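The first step above can be sketched in a few lines of plain Python (not from the slides; the table layout and function name are assumptions for illustration). An edge table is stored as a nested list indexed `T[x][y]`, and observing `x = v` zeroes every row that disagrees with the observation:

```python
# Hypothetical sketch: an edge table is a nested list indexed as T[x][y].
# Incorporating the evidence "x = observed_x" zeroes every row of the
# table whose x-value differs from the observation.

def incorporate_evidence(T_xy, observed_x):
    """Return a copy of the edge table with all rows whose x-value
    does not match the observed value set to zero."""
    return [row[:] if x == observed_x else [0.0] * len(row)
            for x, row in enumerate(T_xy)]

# Example: a joint table over two binary variables x and y.
T_xy = [[0.3, 0.1],
        [0.2, 0.4]]
T_xy_star = incorporate_evidence(T_xy, observed_x=0)
# Only the row for x = 0 survives: [[0.3, 0.1], [0.0, 0.0]]
```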
Okay, but what do you mean by “collect evidence”? • Well, we want to propagate the evidence through the system • This is fairly simple for singly linked items: update the marginal from the joint, then update the next joint from that marginal, and so on • So if we observe x, Txy becomes T*xy, and we get T*y by summing T*xy over all x • Then T*yz = (Tyz)(T*y/Ty) [diagram: chain x - y - z]
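The single-chain update can be sketched concretely (a minimal example in plain Python, not from the slides, assuming binary variables and made-up table entries), mirroring the slide’s notation Txy, T*y, T*yz:

```python
# Sketch of one collection step along the chain x - y - z.
# After observing x, T_xy becomes T*_xy; summing T*_xy over x gives
# the updated marginal T*_y; then T*_yz = T_yz * (T*_y / T_y).

T_xy = [[0.3, 0.1],     # joint table over x and y (assumed numbers)
        [0.2, 0.4]]
T_yz = [[0.25, 0.25],   # joint table over y and z
        [0.25, 0.25]]

# Prior marginal of y: sum T_xy over x.
T_y = [sum(row[y] for row in T_xy) for y in range(2)]        # [0.5, 0.5]

# Observe x = 0: zero the disagreeing row of T_xy.
T_xy_star = [T_xy[0], [0.0, 0.0]]

# Updated marginal of y.
T_y_star = [sum(row[y] for row in T_xy_star) for y in range(2)]  # [0.3, 0.1]

# Pass the update through to the y-z edge via the ratio of marginals.
T_yz_star = [[T_yz[y][z] * T_y_star[y] / T_y[y] for z in range(2)]
             for y in range(2)]
```

The key design point is that the y-z table never looks at x directly: everything it needs arrives through the ratio T*y/Ty.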
And if we have multiply linked items? • Then it’s slightly (but only slightly) more complicated • Now if we observe x1 and x2, we get T*x1y and T*x2y • We then calculate T1y and T2y (the equivalents of T*y from before, but each using only the information from one of the xs) • Now, T*yz = (Tyz)(T1y/Ty)(T2y/Ty) • See Pavel’s handout for a complete walkthrough using this graph, and a justification of the calculation of T*yz [diagram: x1 and x2 each linked to y, y linked to z]
But you’ve got two different marginals for y! That can’t be right! • Patience. All will work out in time. • After we have finished collecting evidence, we normalize the root table – in this case, the root would be T*yz • Now we distribute evidence – this is the same process as collecting evidence, but in the opposite direction • Note that during distribution each node receives input from only a single edge, so we can compute proper marginals • When we have finished distributing evidence, we have a probability distribution over the tree that reflects the incorporated evidence.
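The final normalize-and-distribute steps can be sketched as well (plain Python, not from the slides; the starting root table uses made-up numbers standing in for the result of a collection pass on the x1, x2 → y → z graph). The root supplies one consistent y-marginal, resolving the “two marginals” worry:

```python
# Sketch of the final steps: normalize the root table T*_yz, then
# distribute the update back to the x1-y edge with the same ratio rule,
# now dividing by the y-marginal that edge currently carries.

T_yz_star  = [[0.12, 0.12],   # root table after collection (assumed numbers)
              [0.02, 0.02]]
T_x1y_star = [[0.3, 0.1],     # x1-y edge with x1 = 0 already observed
              [0.0, 0.0]]

# Normalize the root into a proper joint distribution.
total = sum(sum(row) for row in T_yz_star)
T_yz_star = [[p / total for p in row] for row in T_yz_star]

# The root now supplies a single consistent marginal for y.
T_y_star = [sum(row) for row in T_yz_star]

# Distribute: divide out the y-marginal the x1-y edge currently holds,
# multiply in the root's marginal (guarding against zero entries).
T1_y = [sum(row[y] for row in T_x1y_star) for y in range(2)]
T_x1y_final = [[T_x1y_star[x][y] * T_y_star[y] / T1_y[y] if T1_y[y] else 0.0
                for y in range(2)]
               for x in range(2)]
```

After this pass every edge table agrees on its shared marginals, which is exactly the consistency the slide promises.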