CS498-EA Reasoning in AI, Lecture #15
Instructor: Eyal Amir, Fall Semester 2011
Summary of last time: Inference • We presented the variable elimination (VE) algorithm • Specifically, VE for finding the marginal P(Xi) over one variable Xi from X1,…,Xn • Fix an elimination order on the variables; one variable Xj is eliminated at a time: (a) move the unneeded terms (those not involving Xj) outside the summation over Xj, (b) create a new potential function fXj(.) over the other variables appearing in the terms of the summation in (a) • Works for both BNs and MFs (Markov Fields); a small sketch follows below
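A minimal Python sketch of one elimination step, assuming factors are stored as dicts keyed by assignment tuples (the representation and function names are illustrative, not the course's reference code):

```python
from itertools import product

def multiply(f1, scope1, f2, scope2, domains):
    """Pointwise product of two factors (dicts keyed by assignment tuples)."""
    scope = list(dict.fromkeys(scope1 + scope2))        # union of scopes, order kept
    out = {}
    for assignment in product(*(domains[v] for v in scope)):
        a = dict(zip(scope, assignment))
        out[assignment] = (f1[tuple(a[v] for v in scope1)] *
                           f2[tuple(a[v] for v in scope2)])
    return out, scope

def sum_out(f, scope, var):
    """Eliminate `var` from factor f by summing it out."""
    new_scope = [v for v in scope if v != var]
    out = {}
    for assignment, p in f.items():
        a = dict(zip(scope, assignment))
        key = tuple(a[v] for v in new_scope)
        out[key] = out.get(key, 0.0) + p
    return out, new_scope
```

To eliminate Xj one would multiply together exactly the factors whose scope mentions Xj and then sum Xj out of the product, producing the new potential fXj over the remaining variables.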
Today • Treewidth methods: • Variable elimination • Clique tree algorithm • Treewidth
Junction Tree • Why junction tree? • Foundation for "Loopy Belief Propagation" approximate inference • More efficient for some tasks than VE • We can avoid cycles if we turn highly interconnected subsets of the nodes into "supernodes" (clusters) • Objective • Compute P(V | E), where V is a value of a variable and E is the evidence for a set of variables
Properties of Junction Tree • An undirected tree • Each node is a cluster (nonempty set) of variables • Running intersection property: given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y • Separator sets (sepsets): the intersection of the two adjacent clusters • (Figure: clusters ABD, ADE, DEF joined by sepsets AD and DE; e.g., ABD is a cluster and DE a sepset)
Potentials • A potential over a set of variables X is denoted φX • Marginalization: φX = ΣY\X φY, the marginalization of φY into X (for X ⊆ Y) • Multiplication: φXY = φX φY, the multiplication of φX and φY
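As a concrete (assumed) representation, a potential can be stored as a numpy array together with the ordered list of variables labeling its axes; the two operations above then look roughly as follows:

```python
import numpy as np

def marginalize(phi, scope, keep):
    """Marginalize phi (axes ordered by `scope`) onto the variables in `keep`."""
    axes = tuple(i for i, v in enumerate(scope) if v not in keep)
    return phi.sum(axis=axes), [v for v in scope if v in keep]

def multiply(phi1, scope1, phi2, scope2):
    """Pointwise product of two potentials over the union of their scopes."""
    scope = scope1 + [v for v in scope2 if v not in scope1]

    def expand(phi, s):
        # Reorder phi's axes to follow `scope` and insert size-1 axes for the
        # variables missing from s, so the two arrays broadcast correctly.
        perm = [s.index(v) for v in scope if v in s]
        shape = [phi.shape[s.index(v)] if v in s else 1 for v in scope]
        return np.transpose(phi, perm).reshape(shape)

    return expand(phi1, scope1) * expand(phi2, scope2), scope
```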
Properties of Junction Tree • Belief potentials: map each instantiation of a cluster or sepset into a real number • Constraints: • Consistency: for each cluster X and neighboring sepset S, ΣX\S φX = φS • The joint distribution: P(U) = (product of all cluster potentials) / (product of all sepset potentials)
Properties of Junction Tree • If a junction tree satisfies these properties, it follows that: • For each cluster (or sepset) C, φC = P(C) • The probability distribution of any variable X can be computed from any cluster (or sepset) C that contains X: P(X) = ΣC\{X} φC
Building Junction Trees: DAG → Moral Graph → Triangulated Graph → Identifying Cliques → Junction Tree
Constructing the Moral Graph • (Figure: the example DAG over the nodes A, B, C, D, E, F, G, H)
Constructing the Moral Graph • Add undirected edges between all co-parents which are not currently joined (marrying parents)
Constructing the Moral Graph • Add undirected edges between all co-parents which are not currently joined (marrying parents) • Drop the directions of the arcs
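A minimal sketch of these two steps, assuming the DAG is given as a dict mapping each node to the list of its parents (the representation is illustrative):

```python
from itertools import combinations

def moral_graph(parents):
    """Marry co-parents and drop arc directions; returns undirected adjacency sets."""
    adj = {v: set() for v in parents}
    for child, pa in parents.items():
        for p in pa:                             # drop directions: child -- parent
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):         # marry parents: co-parent -- co-parent
            adj[p].add(q)
            adj[q].add(p)
    return adj
```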
Triangulating • An undirected graph is triangulated iff every cycle of length > 3 contains an edge that connects two nonadjacent nodes of the cycle
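One standard way to obtain a triangulated graph is to eliminate nodes one by one and record the fill-in edges; the sketch below assumes the adjacency-set representation from the moralization sketch and takes the elimination order as given:

```python
def triangulate(adj, order):
    """Return a triangulated copy of `adj` by eliminating nodes in `order`."""
    work = {v: set(nbrs) for v, nbrs in adj.items()}    # graph being eliminated
    tri  = {v: set(nbrs) for v, nbrs in adj.items()}    # graph collecting fill-ins
    for v in order:
        nbrs = list(work[v])
        for i in range(len(nbrs)):                      # connect all remaining
            for j in range(i + 1, len(nbrs)):           # neighbors of v pairwise
                p, q = nbrs[i], nbrs[j]
                work[p].add(q); work[q].add(p)
                tri[p].add(q);  tri[q].add(p)
        for n in nbrs:                                  # then remove v
            work[n].discard(v)
        del work[v]
    return tri
```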
Identifying Cliques • A clique is a subgraph of an undirected graph that is complete and maximal • (Figure: the cliques of the triangulated example graph are ABD, ADE, ACE, DEF, CEG, EGH)
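For small graphs the maximal cliques can simply be enumerated, e.g. with networkx's find_cliques; this is one convenient option, shown on the adjacency-set representation used in the sketches above:

```python
import networkx as nx

def maximal_cliques(adj):
    """Enumerate the maximal cliques of an undirected graph given as adjacency sets."""
    G = nx.Graph()
    G.add_nodes_from(adj)
    G.add_edges_from((u, v) for u, nbrs in adj.items() for v in nbrs)
    return [frozenset(c) for c in nx.find_cliques(G)]
```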
Junction Tree • A junction tree is a subgraph of the clique graph that • is a tree • contains all the cliques • satisfies the running intersection property • (Figure: junction tree over the cliques ABD, ADE, ACE, DEF, CEG, EGH with sepsets AD, AE, CE, DE, EG)
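The slide only lists the required properties; one common construction (assumed here) builds the clique graph with edges weighted by sepset size and keeps a maximum-weight spanning tree, which satisfies the running intersection property when the cliques come from a triangulated graph:

```python
import networkx as nx

def junction_tree(cliques):
    """Build a junction tree from the maximal cliques (frozensets) of a triangulated graph."""
    cg = nx.Graph()
    cg.add_nodes_from(cliques)
    for i, ci in enumerate(cliques):
        for cj in cliques[i + 1:]:
            sep = ci & cj
            if sep:                                     # candidate sepset
                cg.add_edge(ci, cj, weight=len(sep), sepset=sep)
    return nx.maximum_spanning_tree(cg, weight='weight')
```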
Principle of Inference: DAG → Junction Tree → (initialization) → Inconsistent Junction Tree → (propagation) → Consistent Junction Tree → (marginalization)
Example: Create Join Tree • HMM with 2 time steps: hidden variables X1, X2 with observations Y1, Y2 • Junction Tree: clusters (X1,Y1), (X1,X2), (X2,Y2) connected through sepsets X1 and X2
Example: Initialization • Assign each CPT to one cluster containing its family, e.g. φX1,Y1 = P(X1) P(Y1|X1), φX1,X2 = P(X2|X1), φX2,Y2 = P(Y2|X2) • Initialize all sepset potentials to 1 (a numeric sketch follows)
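A numeric sketch of this initialization for binary variables; the CPT values below are made up purely for illustration:

```python
import numpy as np

prior = np.array([0.6, 0.4])                    # P(X1), assumed numbers
trans = np.array([[0.7, 0.3], [0.2, 0.8]])      # P(X2 | X1), rows indexed by X1
emit  = np.array([[0.9, 0.1], [0.3, 0.7]])      # P(Y | X), rows indexed by X

phi_X1Y1 = prior[:, None] * emit                # cluster (X1,Y1): P(X1) P(Y1|X1)
phi_X1X2 = trans.copy()                         # cluster (X1,X2): P(X2|X1)
phi_X2Y2 = emit.copy()                          # cluster (X2,Y2): P(Y2|X2)
phi_X1   = np.ones(2)                           # sepset X1
phi_X2   = np.ones(2)                           # sepset X2
```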
Example: Collect Evidence • Choose an arbitrary clique, e.g. (X1,X2), where all potential functions will be collected • Call recursively on neighboring cliques for messages: • 1. Call (X1,Y1): • 1. Projection: φX1 = ΣY1 φX1,Y1 • 2. Absorption: φX1,X2 ← φX1,X2 · (φX1^new / φX1^old)
Example: Collect Evidence (cont.) • 2. Call (X2,Y2): • 1. Projection: φX2 = ΣY2 φX2,Y2 • 2. Absorption: φX1,X2 ← φX1,X2 · (φX2^new / φX2^old)
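Continuing the arrays from the initialization sketch, both collect messages into the root cluster (X1,X2) amount to a sum (projection) followed by a rescaling (absorption):

```python
# Message from (X1,Y1) through sepset X1:
phi_X1_new = phi_X1Y1.sum(axis=1)               # projection: sum out Y1
phi_X1X2  *= (phi_X1_new / phi_X1)[:, None]     # absorption into (X1,X2)
phi_X1     = phi_X1_new

# Message from (X2,Y2) through sepset X2:
phi_X2_new = phi_X2Y2.sum(axis=1)               # projection: sum out Y2
phi_X1X2  *= (phi_X2_new / phi_X2)[None, :]     # absorption into (X1,X2)
phi_X2     = phi_X2_new
```

After the collect phase the root potential phi_X1X2 already equals P(X1,X2).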
Example: Distribute Evidence • Pass messages recursively to neighboring nodes • Pass message from (X1,X2) to (X1,Y1): • 1. Projection: φX1 = ΣX2 φX1,X2 • 2. Absorption: φX1,Y1 ← φX1,Y1 · (φX1^new / φX1^old)
Example: Distribute Evidence (cont.) • Pass message from (X1,X2) to (X2,Y2): • 1. Projection: φX2 = ΣX1 φX1,X2 • 2. Absorption: φX2,Y2 ← φX2,Y2 · (φX2^new / φX2^old)
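The distribute phase, again continuing the same arrays; after these two messages every cluster and sepset potential equals the marginal over its variables:

```python
# Message from (X1,X2) back to (X1,Y1):
phi_X1_new = phi_X1X2.sum(axis=1)               # projection: sum out X2
phi_X1Y1  *= (phi_X1_new / phi_X1)[:, None]     # absorption into (X1,Y1)
phi_X1     = phi_X1_new

# Message from (X1,X2) back to (X2,Y2):
phi_X2_new = phi_X1X2.sum(axis=0)               # projection: sum out X1
phi_X2Y2  *= (phi_X2_new / phi_X2)[:, None]     # absorption into (X2,Y2)
phi_X2     = phi_X2_new

# e.g. phi_X2Y2 now equals P(X2,Y2) and phi_X1 equals P(X1).
```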
Example: Inference with evidence • Assume we want to compute P(X2 | Y1=0, Y2=1) (state estimation) • Assign likelihoods to the potential functions during initialization: multiply the indicator for Y1=0 into φX1,Y1 and the indicator for Y2=1 into φX2,Y2
Example: Inference with evidence (cont.) • Repeating the same steps as in the previous case, every potential becomes proportional to the joint with the evidence, e.g. φX1,X2 = P(X1, X2, Y1=0, Y2=1) • Normalizing the marginal over X2 yields P(X2 | Y1=0, Y2=1)
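A self-contained numeric sketch of this query, reusing the assumed CPT numbers from the earlier sketches; evidence is entered by multiplying indicator likelihoods into the observation clusters at initialization:

```python
import numpy as np

prior = np.array([0.6, 0.4])                    # P(X1), assumed
trans = np.array([[0.7, 0.3], [0.2, 0.8]])      # P(X2 | X1), assumed
emit  = np.array([[0.9, 0.1], [0.3, 0.7]])      # P(Y | X), assumed

like_Y1 = np.array([1.0, 0.0])                  # evidence Y1 = 0
like_Y2 = np.array([0.0, 1.0])                  # evidence Y2 = 1

# Initialization with the likelihoods folded in:
phi_X1Y1 = prior[:, None] * emit * like_Y1[None, :]
phi_X1X2 = trans.copy()
phi_X2Y2 = emit * like_Y2[None, :]

# Collect toward (X1,X2): project each leaf onto its sepset and absorb
# (the sepset potentials start at 1, so no division is needed here).
phi_X1X2 *= phi_X1Y1.sum(axis=1)[:, None]
phi_X1X2 *= phi_X2Y2.sum(axis=1)[None, :]

# The root now holds P(X1, X2, Y1=0, Y2=1); marginalize and normalize.
p_X2 = phi_X1X2.sum(axis=0)
print(p_X2 / p_X2.sum())                        # P(X2 | Y1=0, Y2=1)
```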
Next Time • Learning BNs and MFs
Example: Naïve Bayesian Model • A common model in early diagnosis: • Symptoms are conditionally independent given the disease (or fault) • Thus, if • X1,…,Xp denote the symptoms exhibited by the patient (headache, high fever, etc.) and • H denotes the hypothesis about the patient's health • then P(X1,…,Xp,H) = P(H)P(X1|H)…P(Xp|H) • This naïve Bayesian model allows a compact representation • It does embody strong independence assumptions
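A small sketch of this factorization with assumed, illustrative CPTs (three binary symptoms, binary hypothesis); the posterior over H needs only the product of p+1 small factors:

```python
import numpy as np

p_H = np.array([0.99, 0.01])                    # P(H): healthy vs. diseased, assumed
p_X_given_H = [np.array([[0.9, 0.1],            # P(Xi | H): rows indexed by H,
                         [0.2, 0.8]])           # columns by the symptom value
               for _ in range(3)]

def posterior(symptoms):
    """P(H | x1,...,xp), computed via the naive Bayes product and normalized."""
    score = p_H.copy()
    for cpt, x in zip(p_X_given_H, symptoms):
        score = score * cpt[:, x]
    return score / score.sum()

print(posterior([1, 1, 0]))                     # e.g. two symptoms present, one absent
```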
Elimination on Trees • Formally, for any tree, there is an elimination ordering with induced width = 1 • Theorem: Inference on trees is linear in the number of variables
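A sketch of why the theorem holds on a chain (the simplest tree): each elimination step touches only a width-1 factor, so n-1 small sums suffice. The CPT values below are assumed for illustration:

```python
import numpy as np

def chain_marginal(prior, transitions):
    """Return P(Xn) for a chain X1 -> ... -> Xn given the prior and transition CPTs."""
    belief = prior
    for T in transitions:                       # T[i, j] = P(X_{k+1}=j | X_k=i)
        belief = belief @ T                     # eliminate X_k: one constant-size sum
    return belief

prior = np.array([0.5, 0.5])
T = np.array([[0.9, 0.1], [0.2, 0.8]])
print(chain_marginal(prior, [T, T, T]))         # P(X4) for a 4-node chain
```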