460 likes | 590 Views
PGM 2002/03 Tirgul 4 Exact Inference. Inference in Simple Chains. How do we compute P(X 2 ) ?. X 1. X 2. Inference in Simple Chains (cont.). How do we compute P(X 3 ) ? we already know how to compute P(X 2 ). X 1. X 2. X 3. X n. X 1. X 2. X 3. Inference in Simple Chains (cont.).
E N D
Inference in Simple Chains How do we compute P(X2)? X1 X2
Inference in Simple Chains (cont.) How do we compute P(X3)? • we already know how to compute P(X2)... X1 X2 X3
Xn X1 X2 X3 Inference in Simple Chains (cont.) ... How do we compute P(Xn)? • Compute P(X1), P(X2), P(X3), … • We compute each term by using the previous one Complexity: • Each step costs O(|Val(Xi)|*|Val(Xi+1)|) operations • Compare to naïve evaluation, that requires summing over joint values of n-1 variables
Inference in Simple Chains (cont.) • Suppose that we observe the value of X2 =x2 • How do we compute P(X1|x2)? • Recall that we it suffices to compute P(X1,x2) X1 X2
Inference in Simple Chains (cont.) • Suppose that we observe the value of X3 =x3 • How do we compute P(X1,x3)? • How do we compute P(x3|x1)? X1 X2 X3
Inference in Simple Chains (cont.) ... • Suppose that we observe the value of Xn =xn • How do we compute P(X1,xn)? • We compute P(xn|xn-1), P(xn|xn-2), … iteratively X1 X2 X3 Xn
Inference in Simple Chains (cont.) ... ... • Suppose that we observe the value of Xn =xn • We want to find P(Xk|xn ) • How do we compute P(Xk,xn )? • We compute P(Xk ) by forward iterations • We compute P(xn | Xk ) by backward iterations X1 X2 Xk Xn
E A B C D Elimination in Chains • We now try to understand the simple chain example using first-order principles • Using definition of probability, we have
E A B C D Elimination in Chains • By chain decomposition, we get
E A B C D Elimination in Chains • Rearranging terms ...
E A B C D Elimination in Chains • Now we can perform innermost summation • This summation, is exactly the first step in the forward iteration we describe before X
E A B C D Elimination in Chains • Rearranging and then summing again, we get X X
E A B C D Elimination in Chains with Evidence • Similarly, we understand the backward pass • We write the query in explicit form
E A B C D Elimination in Chains with Evidence • Eliminating d, we get X
E A B C D Elimination in Chains with Evidence • Eliminating c, we get X X
E A B C D Elimination in Chains with Evidence • Finally, we eliminate b X X X
Variable Elimination General idea: • Write query in the form • Iteratively • Move all irrelevant terms outside of innermost sum • Perform innermost sum, getting a new term • Insert the new term into the product
Smoking Visit to Asia Tuberculosis Lung Cancer Abnormality in Chest Bronchitis Dyspnea X-Ray A More Complex Example • “Asia” network:
S V L T B A X D • We want to compute P(d) • Need to eliminate: v,s,x,t,l,a,b Initial factors
S V L T B A X D Compute: • We want to compute P(d) • Need to eliminate: v,s,x,t,l,a,b Initial factors Eliminate: v Note: fv(t) = P(t) In general, result of elimination is not necessarily a probability term
S V L T B A X D Compute: • We want to compute P(d) • Need to eliminate: s,x,t,l,a,b • Initial factors Eliminate: s Summing on s results in a factor with two arguments fs(b,l) In general, result of elimination may be a function of several variables
S V L T B A X D Compute: • We want to compute P(d) • Need to eliminate: x,t,l,a,b • Initial factors Eliminate: x Note: fx(a) = 1 for all values of a !!
S V L T B A X D Compute: • We want to compute P(d) • Need to eliminate: t,l,a,b • Initial factors Eliminate: t
S V L T B A X D Compute: • We want to compute P(d) • Need to eliminate: l,a,b • Initial factors Eliminate: l
S V L T B A X D Compute: • We want to compute P(d) • Need to eliminate: b • Initial factors Eliminate: a,b
Variable Elimination • We now understand variable elimination as a sequence of rewriting operations • Actual computation is done in elimination step • Exactly the same computation procedure applies to Markov networks • Computation depends on order of elimination • We will return to this issue in detail
S V L T B A X D Dealing with evidence • How do we deal with evidence? • Suppose get evidence V = t, S = f, D = t • We want to compute P(L, V = t, S = f, D = t)
S V L T B A X D Dealing with Evidence • We start by writing the factors: • Since we know that V = t, we don’t need to eliminate V • Instead, we can replace the factors P(V) and P(T|V) with • These “select” the appropriate parts of the original factors given the evidence • Note that fp(V) is a constant, and thus does not appear in elimination of other variables
S V L T B A X D Dealing with Evidence • Given evidence V = t, S = f, D = t • Compute P(L, V = t, S = f, D = t ) • Initial factors, after setting evidence:
S V L T B A X D Dealing with Evidence • Given evidence V = t, S = f, D = t • Compute P(L, V = t, S = f, D = t ) • Initial factors, after setting evidence: • Eliminating x, we get
S V L T B A X D Dealing with Evidence • Given evidence V = t, S = f, D = t • Compute P(L, V = t, S = f, D = t ) • Initial factors, after setting evidence: • Eliminating x, we get • Eliminating t, we get
S V L T B A X D Dealing with Evidence • Given evidence V = t, S = f, D = t • Compute P(L, V = t, S = f, D = t ) • Initial factors, after setting evidence: • Eliminating x, we get • Eliminating t, we get • Eliminating a, we get
S V L T B A X D Dealing with Evidence • Given evidence V = t, S = f, D = t • Compute P(L, V = t, S = f, D = t ) • Initial factors, after setting evidence: • Eliminating x, we get • Eliminating t, we get • Eliminating a, we get • Eliminating b, we get
Complexity of variable elimination • Suppose in one elimination step we compute This requires • multiplications • For each value for x, y1, …, yk, we do m multiplications • additions • For each value of y1, …, yk , we do |Val(X)| additions Complexity is exponential in number of variables in the intermediate factor!
Numeric example: “Green” Network Rain Hiker Gazlan Car Waste Pollution
Example: Suppose we wish to calculate P(c0|p1,w0), using the algorithm shown in class. Answer: First, We’ll calculate P(C,p1,w0). We have the initial factors: P(R)=fR (R) = r1 0.2 r0 0.8
Initial Factors P(G|R) = fG(G,R) = r0 g0 0.1 r0 g1 0.9 r1 g0 0.3 r1 g1 0.7 P(H|R) = fH(H,R) = h0 r0 0.2 h0 r1 0.7 h1 r0 0.8 h1 r1 0.3
Initial Factors (cont.) P(w0|G,H) = fW(w0,G,H) = w0 h0 g0 0.9 w0 h0 g1 0.7 w0 h1 g0 0.8 w0 h1 g1 0.15 P(C|H) = fC(C,H) = h0 c0 0.95 h0 c1 0.05 h1 c0 0.4 h1 c1 0.6
Initial Factors (cont.) P(p1|C) = FP(p1,C) = p1 c0 0.2 p1 c1 0.9 We will now choose an elimination order: P(F,p1,w0) =
Calculation The steps are: gwghr = fWXfG = w0 g0 h0 r0 0.9X0.1 = 0.09 w0 g0 h0 r1 0.9X0.3 = 0.27 w0 g0 h1 r0 0.8X0.1 = 0.08 w0 g0 h1 r1 0.8X0.3 = 0.24 w0 g1 h0 r0 0.7X0.9 = 0.63 w0 g1 h0 r1 0.7X0.7 = 0.49 w0 g1 h1 r0 0.15X0.9 = 0.135 w0 g1 h1 r1 0.15X0.7 = 0.105
Calculation (cont.) mhwr = = w0 h0 r0 0.09+0.63 = 0.72 w0 h0 r1 0.27 + 0.49 = 0.76 w0 h1 r0 0.08+0.135 = 0.215 w0 h1 r1 0.24+0.105 = 0.345 gwhr = fH x fR x mhwr = w0 h0 r0 0.8x0.2x0.72 = 0.1152 w0 h0 r1 0.2x0.7x0.76 = 0.1064 w0 h1 r0 0.8x0.8x0.215 = 0.1376 w0 h1 r1 0.2x0.3x0.345 = 0.0207
Calculation (cont.) mhw = = w0 h0 0.1152+0.1064 = 0.2216 w0 h1 0.1376+0.0207 = 0.1583 gwhc =fC x mhw = w0 h0 c0 0.2216x0.95 = 0.21052 w0 h0 c1 0.2216x0.05= 0.01108 w0 h1 c0 0.1583x0.4 = 0.06332 w0 h1 c1 0.1583x0.6 = 0.09498 mwc = = w0 c0 0.21052+0.06332= 0.27384 w0 c1 0.01108+0.09498= 0.10606
Calculation (cont.) gwcp = fP x mwc = w0 c0 p1 0.2x0.27384 = 0.054768 w0 c1 p1 0.8x0.10606 = 0.095454 Now, we have:
The Computational Cost Computing gwghr = fW X fG requires 8 multiplications. Computing mhwr = requires 4 additions. Computing gwhr = fH x fR x mhwr requires 8 multiplications. Computing mhw = requires 2 additions. Computing gwhc =fC x mhw requires 4 multiplications. Computing mwc = requires 2 additions. Computing gwcp = fP x mwc requires 2 multiplications. For a total of: 8+8+4+2 = 22 multiplications and 4+2+2 = 8 additions.
Elimination order does matter We can choose another elimination order, say: R,G,H For a total of 16+4+4+2 = 26 multiplications, and 4+2+2 = 8 additions.