CS498-EA Reasoning in AI, Lecture #8 Instructor: Eyal Amir Fall Semester 2009
Summary of last time: Inference • We presented the variable elimination algorithm • Specifically, VE for finding the marginal P(Xi) over one variable Xi from X1,…,Xn • Choose an order on the variables and eliminate one variable Xj at a time: (a) move unneeded terms (those not involving Xj) outside the summation over Xj, (b) create a new potential function fXj(·) over the other variables appearing in the terms of the summation in (a) • Works for both BNs and MFs (Markov fields)
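To make the elimination steps concrete, here is a minimal Python sketch (not from the lecture; the factor representation and helper names are illustrative assumptions):

from itertools import product

# Minimal variable elimination sketch. A factor is a pair (vars, table):
# vars is a tuple of variable names, table maps a tuple of values
# (aligned with vars) to a float. All variables are binary here.

def multiply(f, g):
    fv, ft = f
    gv, gt = g
    vs = tuple(dict.fromkeys(fv + gv))  # union of variables, order-preserving
    table = {}
    for vals in product([0, 1], repeat=len(vs)):
        asg = dict(zip(vs, vals))
        table[vals] = (ft[tuple(asg[v] for v in fv)]
                       * gt[tuple(asg[v] for v in gv)])
    return vs, table

def sum_out(f, x):
    fv, ft = f
    keep = tuple(v for v in fv if v != x)
    table = {}
    for vals, p in ft.items():
        key = tuple(val for v, val in zip(fv, vals) if v != x)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def marginal(factors, order):
    """Eliminate variables in `order`; return the normalized remaining factor."""
    for x in order:
        rel = [f for f in factors if x in f[0]]       # terms involving x
        rest = [f for f in factors if x not in f[0]]  # moved outside the sum
        prod = rel[0]
        for f in rel[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, x)]           # the new potential f_x(.)
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    z = sum(result[1].values())
    return result[0], {k: p / z for k, p in result[1].items()}

# Example: chain A -> B with P(A) and P(B|A); eliminate A to get P(B).
pA = (("A",), {(0,): 0.6, (1,): 0.4})
pBgA = (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})
print(marginal([pA, pBgA], ["A"]))  # P(B=0)=0.62, P(B=1)=0.38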
Today • Treewidth methods: • Variable elimination • Clique tree algorithm • Treewidth
Junction Tree • Why junction tree? • Foundations for “Loopy Belief Propagation” approximate inference • More efficient for some tasks than VE • We can avoid cycles if we turn highly-interconnected subsets of the nodes into “supernodes” (clusters) • Objective • Compute P(V = v | E = e), where v is a value of a variable V and e is the evidence for a set of variables E
Properties of Junction Tree • An undirected tree • Each node is a cluster (nonempty set) of variables • Running intersection property: given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y • Separator sets (sepsets): the intersection of adjacent clusters • [Figure: clusters ABD, ADE, DEF connected through sepsets AD and DE]
Potentials • Potentials: denoted by φX, a function mapping each instantiation of a set of variables X to a nonnegative real number • Marginalization • φX = ΣY\X φY, the marginalization of φY into X ⊆ Y • Multiplication • φZ = φX φY, the multiplication of φX and φY, where Z = X ∪ Y
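As a concrete (assumed) illustration of these two operations, with numpy arrays as potentials:

import numpy as np

# Sketch: potentials over clusters as numpy arrays, one axis per variable
# (all binary); the values are random placeholders.
rng = np.random.default_rng(0)
phi_ABD = rng.random((2, 2, 2))    # potential over cluster (A, B, D)
phi_DE = rng.random((2, 2))        # potential over the set (D, E)

# Marginalization of phi_ABD into (A, D): sum out B (axis 1).
phi_AD = phi_ABD.sum(axis=1)

# Multiplication of phi_AD and phi_DE: broadcast over the union (A, D, E).
phi_ADE = phi_AD[:, :, None] * phi_DE[None, :, :]
print(phi_ADE.shape)               # (2, 2, 2), one axis each for A, D, E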
Properties of Junction Tree • Belief potentials: • Map each instantiation of clusters or sepsets into a real number • Constraints: • Consistency: for each cluster X and neighboring sepset S, ΣX\S φX = φS • The joint distribution: P(U) = (product of all cluster potentials φX) / (product of all sepset potentials φS)
Properties of Junction Tree • If a junction tree satisfies these properties, it follows that: • For each cluster (or sepset) X, φX = P(X) • The probability distribution of any variable X can be read off any cluster (or sepset) C that contains X: P(X) = ΣC\{X} φC
Building Junction Trees • DAG → Moral Graph → Triangulated Graph → Identifying Cliques → Junction Tree
Constructing the Moral Graph • Add undirected edges to all co-parents which are not currently joined (“marrying parents”) • Drop the directions of the arcs • [Figure: example DAG over nodes A–H and its moral graph]
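A possible networkx sketch of these two steps; the edge list is an assumption, taken from the classic Huang–Darwiche example, which matches the cliques shown later in these slides:

import networkx as nx
from itertools import combinations

# Moralization sketch. The DAG below is assumed, not given in the slides.
dag = nx.DiGraph([("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"),
                  ("D", "F"), ("E", "F"), ("C", "G"), ("E", "H"), ("G", "H")])

moral = dag.to_undirected()                  # drop the directions of the arcs
for node in dag:
    for u, v in combinations(dag.predecessors(node), 2):
        moral.add_edge(u, v)                 # marry co-parents
print(sorted(moral.edges()))  # adds D-E (parents of F) and E-G (parents of H)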
Triangulating • An undirected graph is triangulated iff every cycle of length > 3 contains an edge that connects two nonadjacent nodes • [Figure: triangulated version of the moral graph]
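A sketch of triangulation by node elimination, continuing the code above; the elimination order is an assumption chosen to reproduce the cliques on the next slide, and a heuristic such as min-fill would produce a valid, possibly different triangulation:

# Eliminating a node connects all its remaining neighbors; the fill-in
# edges accumulated along the way make the graph chordal (triangulated).
def triangulate(graph, order):
    g, tri = graph.copy(), graph.copy()
    for node in order:
        nbrs = list(g.neighbors(node))
        for u, v in combinations(nbrs, 2):
            g.add_edge(u, v)
            tri.add_edge(u, v)               # fill-in edge
        g.remove_node(node)
    return tri

triangulated = triangulate(moral, ["F", "H", "G", "B", "C", "D", "A", "E"])
print(nx.is_chordal(triangulated))           # True; fill-ins are A-D and A-E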
Identifying Cliques • A clique is a subgraph of an undirected graph that is complete and maximal • [Figure: cliques ABD, ADE, DEF, ACE, CEG, EGH identified in the triangulated graph]
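Continuing the sketch, networkx can enumerate the maximal cliques directly:

# Maximal cliques of the triangulated graph.
cliques = [frozenset(c) for c in nx.find_cliques(triangulated)]
print(sorted("".join(sorted(c)) for c in cliques))
# -> ['ABD', 'ACE', 'ADE', 'CEG', 'DEF', 'EGH'] for the assumed example DAG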
Junction Tree • A junction tree is a subgraph of the clique graph that • is a tree • contains all the cliques • satisfies the running intersection property • [Figure: clique graph over ABD, ADE, DEF, ACE, CEG, EGH with sepsets AD, AE, CE, DE, EG, and the junction tree selected from it]
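One standard construction, sketched as a continuation of the code above: weight each pair of intersecting cliques by its sepset size, then take a maximum-weight spanning tree of the clique graph; for a chordal graph this satisfies the running intersection property.

# Build the clique graph and extract a maximum-weight spanning tree.
clique_graph = nx.Graph()
for c1, c2 in combinations(cliques, 2):
    sepset = c1 & c2
    if sepset:
        clique_graph.add_edge(c1, c2, weight=len(sepset))
junction_tree = nx.maximum_spanning_tree(clique_graph)
for c1, c2 in junction_tree.edges():
    print("".join(sorted(c1)), "-[", "".join(sorted(c1 & c2)), "]-",
          "".join(sorted(c2)))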
Principle of Inference • DAG → Junction Tree → (initialization) → Inconsistent Junction Tree → (propagation) → Consistent Junction Tree → (marginalization) → P(X)
Example: Create Join Tree • HMM with 2 time steps: X1 → X2, with observations Y1 and Y2 • Junction Tree: clusters (X1,Y1), (X1,X2), (X2,Y2) connected through sepsets X1 and X2
Example: Initialization • Assign each CPT to a cluster that contains its variables, and set all sepset potentials to 1: • φX1,Y1 = P(X1) P(Y1|X1) • φX1,X2 = P(X2|X1) • φX2,Y2 = P(Y2|X2) • φX1 = 1, φX2 = 1
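A numeric sketch of this initialization; the CPT values are made-up placeholders:

import numpy as np

# 2-step HMM junction tree: clusters (X1,Y1), (X1,X2), (X2,Y2),
# sepsets X1 and X2. All variables binary; the CPT numbers are assumptions.
P_X1 = np.array([0.5, 0.5])                  # P(X1)
P_X2_X1 = np.array([[0.7, 0.3],              # P(X2|X1), rows index X1
                    [0.3, 0.7]])
P_Y_X = np.array([[0.9, 0.1],                # P(Y|X), rows index X
                  [0.2, 0.8]])

phi_X1Y1 = P_X1[:, None] * P_Y_X             # absorbs P(X1) and P(Y1|X1)
phi_X1X2 = P_X2_X1.copy()                    # absorbs P(X2|X1)
phi_X2Y2 = P_Y_X.copy()                      # absorbs P(Y2|X2)
phi_X1 = np.ones(2)                          # sepset X1
phi_X2 = np.ones(2)                          # sepset X2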
Example: Collect Evidence • Choose an arbitrary clique, e.g. (X1,X2), where all potential functions will be collected • Call neighboring cliques recursively for messages: • 1. Call (X1,Y1): • Projection: φX1(new) = ΣY1 φX1,Y1 • Absorption: φX1,X2 ← φX1,X2 · φX1(new) / φX1(old)
Example: Collect Evidence (cont.) • 2. Call (X2,Y2): • Projection: φX2(new) = ΣY2 φX2,Y2 • Absorption: φX1,X2 ← φX1,X2 · φX2(new) / φX2(old)
Example: Distribute Evidence • Pass messages recursively to neighboring nodes • Pass message from (X1,X2) to (X1,Y1): • Projection: φX1(new) = ΣX2 φX1,X2 • Absorption: φX1,Y1 ← φX1,Y1 · φX1(new) / φX1(old)
Example: Distribute Evidence (cont.) • Pass message from (X1,X2) to (X2,Y2): • Projection: φX2(new) = ΣX1 φX1,X2 • Absorption: φX2,Y2 ← φX2,Y2 · φX2(new) / φX2(old) • A code sketch of the full collect/distribute pass follows below
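Putting the collect and distribute passes together for this small tree (a sketch continuing the initialization code above; with these potentials no division by zero occurs):

# Collect to (X1,X2): absorb messages from both leaves.
new_X1 = phi_X1Y1.sum(axis=1)                # projection over Y1
phi_X1X2 *= (new_X1 / phi_X1)[:, None]       # absorption
phi_X1 = new_X1
new_X2 = phi_X2Y2.sum(axis=1)                # projection over Y2
phi_X1X2 *= (new_X2 / phi_X2)[None, :]
phi_X2 = new_X2

# Distribute from (X1,X2) back to the leaves.
new_X1 = phi_X1X2.sum(axis=1)                # projection over X2
phi_X1Y1 *= (new_X1 / phi_X1)[:, None]
phi_X1 = new_X1
new_X2 = phi_X1X2.sum(axis=0)                # projection over X1
phi_X2Y2 *= (new_X2 / phi_X2)[:, None]
phi_X2 = new_X2

# The tree is now consistent: every cluster holds a marginal of the joint.
print(phi_X1X2.sum())                        # 1.0 (no evidence entered)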
Example: Inference with evidence • Assume we want to compute P(X2 | Y1=0, Y2=1) (state estimation) • Assign likelihoods to the potential functions during initialization: multiply each observation cluster by an indicator that zeroes out the entries disagreeing with the evidence, e.g. φX1,Y1 = P(X1) P(Y1|X1) · 1[Y1=0] and φX2,Y2 = P(Y2|X2) · 1[Y2=1]
Example: Inference with evidence (cont.) • Repeating the same propagation steps as in the previous case, we obtain cluster potentials proportional to the joint with the evidence, e.g. φX2,Y2 ∝ P(X2, Y1=0, Y2=1) • Normalizing then gives P(X2 | Y1=0, Y2=1)
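A sketch of the evidence-entry step, continuing the code above (re-initialize, then re-run the collect/distribute pass):

# Enter evidence Y1=0, Y2=1 by zeroing disagreeing entries at initialization.
phi_X1Y1 = P_X1[:, None] * P_Y_X * np.array([1.0, 0.0])   # keep only Y1=0
phi_X2Y2 = P_Y_X * np.array([0.0, 1.0])                   # keep only Y2=1
phi_X1X2 = P_X2_X1.copy()
phi_X1 = np.ones(2)
phi_X2 = np.ones(2)
# ...run the collect/distribute pass shown earlier; afterwards
# phi_X2Y2.sum(axis=1) is proportional to P(X2, Y1=0, Y2=1), so
# P(X2 | Y1=0, Y2=1) = phi_X2Y2.sum(axis=1) / phi_X2Y2.sum()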
Next Time • Learning BNs and MFs
Example: Naïve Bayesian Model • A common model in early diagnosis: • Symptoms are conditionally independent given the disease (or fault) • Thus, if • X1,…,Xp denote the symptoms exhibited by the patient (headache, high fever, etc.) and • H denotes the hypothesis about the patient's health • then, P(X1,…,Xp,H) = P(H)P(X1|H)…P(Xp|H) • This naïve Bayesian model allows compact representation • It does embody strong independence assumptions
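A numeric illustration of this model in Python; the probabilities and symptom count are made-up assumptions:

import numpy as np

# Naive Bayes posterior: H binary, three binary symptoms.
# P(H | x) is proportional to P(H) * prod_i P(x_i | H).
P_H = np.array([0.99, 0.01])                 # P(H = healthy, H = ill)
P_X_H = np.array([[0.1, 0.2, 0.05],          # P(X_i = 1 | H = healthy)
                  [0.8, 0.7, 0.60]])         # P(X_i = 1 | H = ill)
x = np.array([1, 1, 0])                      # observed symptoms
lik = np.prod(np.where(x == 1, P_X_H, 1 - P_X_H), axis=1)
posterior = P_H * lik / (P_H * lik).sum()
print(posterior)                             # approx. [0.894, 0.106]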
Elimination on Trees • Formally, for any tree, there is an elimination ordering with induced width = 1 • Theorem: Inference on trees is linear in the number of variables
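A small sketch of why such an ordering always exists: repeatedly eliminate a leaf, which never creates fill-in edges (the example tree and names are illustrative assumptions):

import networkx as nx

# On a tree, a leaf-first elimination ordering has induced width 1,
# so inference costs O(n) small-table operations.
tree = nx.Graph([("A", "B"), ("B", "C"), ("B", "D")])
order = []
g = tree.copy()
while len(g) > 1:
    leaf = next(n for n in g if g.degree(n) == 1)
    order.append(leaf)         # eliminating a leaf adds no fill-in edges
    g.remove_node(leaf)
order.append(next(iter(g)))
print(order)                   # ['A', 'C', 'B', 'D'] for this example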