
Today


Presentation Transcript


  1. Today

  2. Next week

  3. Marginalization

  Suppose you have some joint probability P(x1, x2, x3, y1, y2, y3), involving observations, y, and hidden states, x. Suppose you're at x1, and you want to find the marginal probability there, given the observations. Normally, you would have to compute

  P(x1) = Σ_{x2} Σ_{x3} P(x1, x2, x3, y1, y2, y3).

  For N other hidden nodes, each of M states, that will take M^N additions.
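A brute-force check of the counting argument above: marginalizing down to x1 by direct summation visits every one of the M^(N-1) settings of the other hidden nodes. The joint table here is a random made-up example, not anything from the slides.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
M, N = 2, 3                          # M states per node, N hidden nodes
joint = rng.random((M,) * N)         # illustrative joint P(x1, x2, x3)
joint /= joint.sum()                 # normalize to a probability table

marg = np.zeros(M)
count = 0
for rest in itertools.product(range(M), repeat=N - 1):
    count += 1                       # one pass per setting of (x2, x3)
    for x1 in range(M):
        marg[x1] += joint[(x1,) + rest]

# The loop indeed ran M**(N-1) times and reproduces the axis-sum marginal.
assert count == M ** (N - 1)
assert np.allclose(marg, joint.sum(axis=(1, 2)))
```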

  4. Special case: Markov network

  [Figure: chain Markov network x1 – x2 – x3, with an observation yi attached to each hidden node xi.]

  But suppose the joint probability has a special structure, shown by this Markov network. Then this sum,

  P(x1) = Σ_{x2} Σ_{x3} P(x1, x2, x3, y1, y2, y3),

  can be computed with N M^2 additions, as follows…

  5. Derivation of belief propagation

  P(x1) = Σ_{x2} Σ_{x3} P(x1, x2, x3, y1, y2, y3)

  6. The posterior factorizes

  P(x1) = Σ_{x2} Σ_{x3} P(x1, x2, x3, y1, y2, y3)
        = Σ_{x2} Σ_{x3} Φ(x1, y1) Φ(x2, y2) Ψ(x1, x2) Φ(x3, y3) Ψ(x2, x3)

  x1_MMSE = mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)

  7. Propagation rules

  P(x1) = Σ_{x2} Σ_{x3} P(x1, x2, x3, y1, y2, y3)
        = Σ_{x2} Σ_{x3} Φ(x1, y1) Φ(x2, y2) Ψ(x1, x2) Φ(x3, y3) Ψ(x2, x3)
        = Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)
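A sketch of this rewrite in code: distributing the sums over the factorized posterior turns one O(M^N) summation into chained O(M^2) summations. The potential tables phi and psi are random illustrative values, not numbers from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 2
phi1, phi2, phi3 = (rng.random(M) for _ in range(3))   # evidence terms
psi12, psi23 = rng.random((M, M)), rng.random((M, M))  # pairwise terms

# Naive order: expand the full product over (x2, x3), then sum everything.
naive = np.einsum('a,b,c,ab,bc->a', phi1, phi2, phi3, psi12, psi23)

# Propagation order: sum over x3 first, then x2, as in the last line above.
m3to2 = psi23 @ phi3                  # sum_{x3} Φ(x3,y3) Ψ(x2,x3)
m2to1 = psi12 @ (phi2 * m3to2)        # sum_{x2} Φ(x2,y2) Ψ(x1,x2) m3to2(x2)
fast = phi1 * m2to1

assert np.allclose(naive, fast)       # same marginal, far fewer additions
```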

  8. Propagation rules

  P(x1) = Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)

  9. Belief and message update rules

  b_j(x_j) = Π_{k∈N(j)} m_k^j(x_j)

  m_i^j(x_j) = Σ_{x_i} Ψ(x_i, x_j) Π_{k∈N(i)\j} m_k^i(x_i)
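The update rules above can be sketched directly for the 3-node chain. Everything here is illustrative: the potentials are random tables, and the neighbor structure and variable names are my own encoding of the chain, not code from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 2
phi = [rng.random(M) for _ in range(3)]            # Φ_i(x_i, y_i), y fixed
psi = {(0, 1): rng.random((M, M)), (1, 2): rng.random((M, M))}

def edge_pot(i, j):
    # Ψ for edge (i, j), indexed [x_i, x_j] regardless of storage order.
    return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

neighbors = {0: [1], 1: [0, 2], 2: [1]}
msgs = {(i, j): np.ones(M) for i in neighbors for j in neighbors[i]}

for _ in range(5):                                 # exact after 2 sweeps on a chain
    new = {}
    for i in neighbors:
        for j in neighbors[i]:
            prod = phi[i].copy()                   # local evidence at i
            for k in neighbors[i]:
                if k != j:
                    prod *= msgs[(k, i)]           # incoming messages, except from j
            m = edge_pot(i, j).T @ prod            # sum over x_i
            new[(i, j)] = m / m.sum()              # normalize for stability
    msgs = new

def belief(j):
    b = phi[j].copy()
    for k in neighbors[j]:
        b *= msgs[(k, j)]
    return b / b.sum()

# On a chain BP is exact: compare to the brute-force marginal at node 0.
joint = np.einsum('a,b,c,ab,bc->abc', phi[0], phi[1], phi[2],
                  edge_pot(0, 1), edge_pot(1, 2))
exact = joint.sum(axis=(1, 2))
exact /= exact.sum()
assert np.allclose(belief(0), exact)
```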

  10. Belief propagation updates

  m_i^j(x_j) = Σ_{x_i} Ψ(x_i, x_j) Φ(x_i, y_i) Π_{k∈N(i)\j} m_k^i(x_i)

  In vector form, the update is a matrix product of Ψ with the elementwise (.*) product of the local evidence and the incoming messages.
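A small check of that vectorized form, with made-up tables (the .* notation on the slide is Matlab's elementwise product; numpy's `*` plays the same role):

```python
import numpy as np

rng = np.random.default_rng(5)
M = 3
psi = rng.random((M, M))          # Ψ[x_i, x_j]
phi_i = rng.random(M)             # local evidence Φ(x_i, y_i)
m_in = rng.random((2, M))         # incoming messages m_k^i from k ≠ j

# Loop form: m_i^j(x_j) = Σ_{x_i} Ψ(x_i,x_j) Φ(x_i,y_i) Π_k m_k^i(x_i)
loop = np.zeros(M)
for xj in range(M):
    for xi in range(M):
        loop[xj] += psi[xi, xj] * phi_i[xi] * m_in[0, xi] * m_in[1, xi]

# Vectorized form: elementwise products, then one matrix-vector product.
vec = psi.T @ (phi_i * m_in.prod(axis=0))

assert np.allclose(loop, vec)
```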

  11. Simple example For the 3-node example, worked out in detail, see Sections 2.0, 2.1 of:

  12. Optimal solution in a chain or tree: Belief Propagation • "Do the right thing" Bayesian algorithm. • For Gaussian random variables over time: Kalman filter. • For hidden Markov models: forward/backward algorithm (and MAP variant is Viterbi).

  13. Other loss functions • The above rules let you compute the marginal probability at a node. From that, you can compute the mean estimate. • But you can also use a related algorithm to compute the MAP estimate for x1.

  14. MAP estimate for a chain or a tree

  x1_MAP = argmax_{x1} max_{x2} max_{x3} P(x1, x2, x3, y1, y2, y3)

  15. The posterior factorizes

  x1_MAP = argmax_{x1} max_{x2} max_{x3} Φ(x1, y1) Φ(x2, y2) Ψ(x1, x2) Φ(x3, y3) Ψ(x2, x3)

  16. Propagation rules

  x1_MAP = argmax_{x1} Φ(x1, y1) max_{x2} Φ(x2, y2) Ψ(x1, x2) max_{x3} Φ(x3, y3) Ψ(x2, x3)
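The max-product (MAP) variant replaces each sum with a max. A minimal sketch on the same 3-node chain, with random illustrative potentials:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 2
phi = [rng.random(M) for _ in range(3)]                # Φ_i(x_i, y_i)
psi12, psi23 = rng.random((M, M)), rng.random((M, M))  # Ψ(x1,x2), Ψ(x2,x3)

# Backward max-messages along the chain: x3 -> x2 -> x1.
m32 = np.max(psi23 * phi[2][None, :], axis=1)          # max over x3
m21 = np.max(psi12 * (phi[1] * m32)[None, :], axis=1)  # max over x2
x1_map = int(np.argmax(phi[0] * m21))                  # argmax over x1

# Agrees with brute-force maximization of the full joint.
joint = np.einsum('a,b,c,ab,bc->abc', phi[0], phi[1], phi[2], psi12, psi23)
assert x1_map == np.unravel_index(joint.argmax(), joint.shape)[0]
```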

  17. Using conditional probabilities instead of compatibility functions

  By Bayes' rule.

  18. Writing it as a factorization

  By the fact that conditioning on x1 makes y1 and x2, x3, y2, y3 independent.

  19. Writing it as a factorization

  Now use Bayes' rule (with x2) for the rightmost term.

  20. Writing it as a factorization

  From the Markov structure, conditioning on x1 and x2 is the same as conditioning on x2.

  21. Writing it as a factorization

  Conditioning on x2 makes y2 independent of x3 and y3.

  22. Writing it as a factorization

  The same operations, once more, with the far right term.

  23. A toy problem • 10 nodes. 2 states for each node. Local evidence as shown in the slide figure.
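A sketch of this toy setup: a 10-node chain with binary states. The local evidence and the smoothness-favoring pairwise potential below are made up, since the actual numbers live in the slide figure.

```python
import itertools

import numpy as np

rng = np.random.default_rng(4)
n, M = 10, 2
phi = rng.random((n, M))                     # local evidence Φ_i(x_i, y_i)
psi = np.array([[0.9, 0.1],
                [0.1, 0.9]])                 # favors agreeing neighbors

# Forward and backward sum-product sweeps along the chain.
fwd = np.ones((n, M))
bwd = np.ones((n, M))
for i in range(1, n):
    m = psi.T @ (phi[i - 1] * fwd[i - 1])
    fwd[i] = m / m.sum()
for i in range(n - 2, -1, -1):
    m = psi @ (phi[i + 1] * bwd[i + 1])
    bwd[i] = m / m.sum()

beliefs = phi * fwd * bwd
beliefs /= beliefs.sum(axis=1, keepdims=True)  # marginal at every node

# On a chain BP is exact: check node 0 against brute-force enumeration
# of all 2**10 configurations.
exact = np.zeros(M)
for cfg in itertools.product(range(M), repeat=n):
    p = np.prod(phi[np.arange(n), cfg]) * np.prod(psi[cfg[:-1], cfg[1:]])
    exact[cfg[0]] += p
exact /= exact.sum()
assert np.allclose(beliefs[0], exact)
```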

  24. Classic 1976 paper

  25. Relaxation labelling

  26. Belief propagation Relaxation labelling

  27. Yair’s motion example

  28. Yair’s figure/ground example
