Markov Networks
• Undirected graphical models (e.g., a four-node graph over A, B, C, D)
• Potential functions defined over cliques:
P(x) = (1/Z) ∏_c Φ_c(x_c)
• Equivalently, in log-linear form:
P(x) = (1/Z) exp(∑_i w_i f_i(x))
where w_i is the weight of feature i and f_i(x) is feature i
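The weighted-feature (log-linear) form can be computed directly for a tiny network. Below is a minimal sketch, not from the slides: four binary variables A, B, C, D with two hypothetical indicator features and made-up weights, with the partition function Z obtained by brute-force enumeration.

```python
import itertools
import math

# Hypothetical example: each feature f_i maps an assignment x = (a, b, c, d)
# to 0 or 1, and w_i is the weight of feature i.
features = [
    lambda x: float(x[0] == x[1]),  # f_0: A and B agree
    lambda x: float(x[2] == x[3]),  # f_1: C and D agree
]
weights = [1.5, 0.7]

def unnormalized(x):
    """exp(sum_i w_i * f_i(x))"""
    return math.exp(sum(w * f(x) for w, f in zip(weights, features)))

# Partition function Z: sum over all 2^4 assignments (only feasible
# because the state space here is tiny).
Z = sum(unnormalized(x) for x in itertools.product([0, 1], repeat=4))

def prob(x):
    """P(x) = (1/Z) exp(sum_i w_i f_i(x))"""
    return unnormalized(x) / Z

print(prob((1, 1, 0, 0)))  # assignments satisfying more features score higher
```

Enumerating Z is exactly what becomes intractable for large networks, which is why the inference slides that follow turn to sampling and message passing.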
Hammersley-Clifford Theorem
If the distribution is strictly positive (P(x) > 0)
and the graph encodes its conditional independences,
then the distribution is a product of potentials over the cliques of the graph.
The converse also holds. ("Markov network = Gibbs distribution")
Inference in Markov Networks
• Goal: compute marginals & conditionals of P(X)
• Exact inference is #P-complete
• Conditioning on the Markov blanket is easy (for binary x_i):
P(x_i | MB(x_i)) = exp(∑_j w_j f_j(x)) / [exp(∑_j w_j f_j(x)|_{x_i=0}) + exp(∑_j w_j f_j(x)|_{x_i=1})]
• Gibbs sampling exploits this
Markov Chain Monte Carlo
• Gibbs sampler:
1. Start with an initial assignment to the nodes
2. One node at a time, sample each node given the others
3. Repeat
4. Use the samples to compute P(X)
• Convergence: burn-in + mixing time
• Many modes ⇒ run multiple chains
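The Gibbs sampler above can be sketched on a small hypothetical pairwise binary network (an Ising-style chain with agreement-rewarding edges; the model, weights, and sample sizes below are all assumptions, not from the slides). Each step resamples one node from its conditional given its Markov blanket, which only needs the edge features touching that node.

```python
import math
import random

# Hypothetical pairwise model: edge (u, t) contributes weight w * [x_u == x_t].
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}

def local_score(x, i, v):
    """Sum of edge features touching node i when x[i] = v (its Markov blanket)."""
    s = 0.0
    for (u, t), w in edges.items():
        if u == i:
            s += w * (v == x[t])
        elif t == i:
            s += w * (x[u] == v)
    return s

def gibbs(n_samples, burn_in=500, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(4)]  # step 1: initial assignment
    counts = [0] * 4
    for t in range(burn_in + n_samples):
        for i in range(4):                       # step 2: one node at a time
            p1 = math.exp(local_score(x, i, 1))  # unnormalized P(x_i=1 | MB)
            p0 = math.exp(local_score(x, i, 0))
            x[i] = 1 if rng.random() < p1 / (p0 + p1) else 0
        if t >= burn_in:                         # discard burn-in samples
            for i in range(4):
                counts[i] += x[i]
    return [c / n_samples for c in counts]       # step 4: estimated P(X_i = 1)

print(gibbs(5000))
```

By the model's 0/1 symmetry the true marginals are all 0.5, so the estimates give a rough check on burn-in and mixing.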
Other Inference Methods
• Belief propagation (sum-product)
• Mean field / variational approximations
MAP Inference
• Iterated conditional modes
• Simulated annealing
• Graph cuts
• Belief propagation (max-product)
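Iterated conditional modes is the simplest of these: repeatedly set each node to the value that maximizes the joint score given its neighbors, until a full sweep changes nothing. A minimal sketch on a hypothetical pairwise binary model (the biases and edge weights below are invented for illustration); note ICM is a local search and can stop at a local optimum.

```python
# Hypothetical model: node biases reward x_i = 1, edges reward agreement.
bias = [2.5, 0.0, 0.0, -2.0]
edges = {(0, 1): 2.0, (1, 2): 1.5, (2, 3): 0.5}

def score(x):
    """Unnormalized log-probability of a full assignment."""
    s = sum(b for b, v in zip(bias, x) if v == 1)
    s += sum(w for (u, t), w in edges.items() if x[u] == x[t])
    return s

def icm(x):
    changed = True
    while changed:            # repeat until a full sweep changes nothing
        changed = False
        for i in range(len(x)):
            best = max((0, 1), key=lambda v: score(x[:i] + [v] + x[i + 1:]))
            if best != x[i]:
                x[i] = best
                changed = True
    return x

print(icm([0, 0, 0, 0]))
```

Each update can only increase the score, so the loop terminates; simulated annealing and max-product BP are the usual ways to escape the local optima ICM can get stuck in.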
Learning Weights
• Maximize likelihood (or posterior)
• Convex optimization: gradient ascent, quasi-Newton methods, etc.
• Requires inference at each step (slow!)
∂/∂w_i log P_w(x) = n_i(x) − E_w[n_i(x)]
(feature count according to the data minus feature count according to the model)
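The gradient (data feature count minus expected feature count under the model) can be followed directly when the model is small enough to do exact inference by enumeration. A sketch with hypothetical features and a made-up training set; the inference-per-step cost shows up here as the `model_dist` call inside every gradient evaluation.

```python
import itertools
import math

# Hypothetical features over three binary variables, and an assumed dataset.
features = [
    lambda x: float(x[0] == x[1]),
    lambda x: float(x[1] == x[2]),
]
states = list(itertools.product([0, 1], repeat=3))
data = [(1, 1, 1), (0, 0, 0), (1, 0, 0), (0, 0, 1)]

def model_dist(w):
    """Exact inference by enumeration -- the slow step repeated every update."""
    scores = [math.exp(sum(wi * f(x) for wi, f in zip(w, features)))
              for x in states]
    Z = sum(scores)
    return [s / Z for s in scores]

def gradient(w):
    p = model_dist(w)
    grad = []
    for f in features:
        data_count = sum(f(x) for x in data) / len(data)    # count in data
        model_count = sum(pi * f(x) for pi, x in zip(p, states))  # count in model
        grad.append(data_count - model_count)
    return grad

w = [0.0, 0.0]
for _ in range(2000):  # plain gradient ascent on the (concave) log-likelihood
    w = [wi + 0.5 * gi for wi, gi in zip(w, gradient(w))]

print(w)
```

At the optimum the two counts match exactly, which is the moment-matching condition the gradient formula encodes.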
Pseudo-Likelihood
• Likelihood of each variable given its Markov blanket in the data
• Does not require inference at each step
• Very fast numerical optimization
• Ignores non-local dependencies
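Why pseudo-likelihood avoids inference: each factor P(x_i | MB(x_i)) normalizes over the values of a single variable, so no partition function over all assignments is needed. A minimal sketch computing the average log-pseudo-likelihood for a hypothetical pairwise binary model (edges and data are assumptions for illustration).

```python
import math

# Hypothetical pairwise model: edge (u, t) contributes w * [x_u == x_t].
edges = {(0, 1): 1.0, (1, 2): 0.5}

def local_score(x, i, v):
    """Sum of edge features touching node i when x[i] = v."""
    s = 0.0
    for (u, t), w in edges.items():
        if u == i:
            s += w * (v == x[t])
        elif t == i:
            s += w * (x[u] == v)
    return s

def log_pseudo_likelihood(data):
    total = 0.0
    for x in data:
        for i in range(len(x)):
            s1 = math.exp(local_score(x, i, 1))
            s0 = math.exp(local_score(x, i, 0))
            observed = s1 if x[i] == 1 else s0
            # log P(x_i | MB(x_i)): normalized over x_i's two values only
            total += math.log(observed / (s0 + s1))
    return total / len(data)

print(log_pseudo_likelihood([(1, 1, 1), (0, 0, 1)]))
```

The per-variable normalizer (s0 + s1) is all that replaces Z, which is what makes optimizing this objective fast; the price is that dependencies beyond each variable's blanket are ignored.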
Learning Structure
• Feature search:
1. Start with atomic features
2. Form conjunctions of pairs of features
3. Select the best and add it to the feature set
4. Repeat until no improvement
• Evaluation: likelihood, K-L divergence
• Approximation: previous weights don't change
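The greedy feature search above can be sketched end to end on a tiny enumerable model. Everything below (features, data, learning rates, the cap of two additions) is an invented setup for illustration; also note that, unlike the slide's approximation of freezing previous weights, this sketch refits all weights from scratch for each candidate, which is only affordable because the model is tiny.

```python
import itertools
import math

states = list(itertools.product([0, 1], repeat=3))
data = [(1, 1, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0)]  # assumed

def fit_and_score(features, iters=300, lr=0.5):
    """Fit weights by gradient ascent; return the average log-likelihood."""
    w = [0.0] * len(features)
    for _ in range(iters):
        scores = [math.exp(sum(wi * f(x) for wi, f in zip(w, features)))
                  for x in states]
        Z = sum(scores)
        p = [s / Z for s in scores]
        for i, f in enumerate(features):
            data_cnt = sum(f(x) for x in data) / len(data)
            model_cnt = sum(pi * f(x) for pi, x in zip(p, states))
            w[i] += lr * (data_cnt - model_cnt)
    Z = sum(math.exp(sum(wi * f(x) for wi, f in zip(w, features)))
            for x in states)
    return sum(sum(wi * f(x) for wi, f in zip(w, features)) - math.log(Z)
               for x in data) / len(data)

# Step 1: atomic features x_i == 1.
feats = [(lambda x, i=i: float(x[i] == 1)) for i in range(3)]
best_ll = fit_and_score(feats)

for _ in range(2):  # sketch-level cap: add at most two conjunctions
    candidates = []
    for a, b in itertools.combinations(range(len(feats)), 2):
        # Step 2: conjunction of a pair of existing features.
        conj = (lambda x, fa=feats[a], fb=feats[b]: fa(x) * fb(x))
        candidates.append((fit_and_score(feats + [conj]), conj))
    ll, conj = max(candidates, key=lambda t: t[0])  # step 3: select best
    if ll <= best_ll + 1e-3:
        break                                       # step 4: no improvement
    best_ll, feats = ll, feats + [conj]

print(len(feats), round(best_ll, 3))
```

In this data x0 and x1 always agree, a dependence no atomic feature can express, so the search adds at least one conjunction and the likelihood improves over the independent model.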