S3 - SEMINAR ON DATA MINING - BAYESIAN NETWORKS - B. INFERENCE
Concha Bielza, Pedro Larrañaga
Computational Intelligence Group, Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid
Master Universitario en Inteligencia Artificial
Basic concepts
Inference in Bayesian networks
• Types of queries
• Exact inference:
  • Brute-force computation
  • Variable elimination algorithm
  • Message passing algorithm
• Approximate inference:
  • Probabilistic logic sampling
Types of queries
Queries: posterior probabilities
[Figure: example network with nodes Burglary, Earthquake, Alarm, News, WCalls]
Given some evidence e (observations), answer queries about P:
• Posterior probability of a target variable(s) X: P(X | e), a vector with one entry per state of X
• Other names: probability propagation, belief updating or revision
Types of queries
Semantically, for any kind of reasoning:
• Predictive or deductive reasoning (causal inference): predict effects, e.g. P(Symptoms | Disease). The target variable is usually a descendant of the evidence.
• Diagnostic reasoning (diagnostic inference): diagnose the causes, e.g. P(Disease | Symptoms). The target variable is usually an ancestor of the evidence.
Types of queries
More queries: maximum a posteriori (MAP)
Most likely configurations (abductive inference): the event that best explains the evidence
• Total abduction: search for the most likely configuration of all the unobserved variables
• Partial abduction: search for the most likely configuration of a subset of the unobserved variables (the explanation set)
• K most likely explanations
In general, the MAP configuration cannot be computed component-wise, by maximizing each P(xi | e) separately.
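The warning about component-wise maximization can be checked on a tiny example. The sketch below uses a made-up posterior P(x1, x2 | e) over two binary variables (the numbers and the function names are illustrative assumptions, not taken from the slides) and compares the joint MAP with the configuration obtained by maximizing each marginal separately.

```python
# Hypothetical posterior P(x1, x2 | e); the numbers are illustrative only.
posterior = {
    (0, 0): 0.0,
    (0, 1): 0.4,
    (1, 0): 0.3,
    (1, 1): 0.3,
}

def joint_map(post):
    """Total abduction: the single most likely joint configuration."""
    return max(post, key=post.get)

def componentwise_max(post, n_vars=2):
    """Maximize each marginal P(xi | e) on its own (not a valid MAP in general)."""
    result = []
    for i in range(n_vars):
        marginal = {0: 0.0, 1: 0.0}
        for config, p in post.items():
            marginal[config[i]] += p
        result.append(max(marginal, key=marginal.get))
    return tuple(result)

print(joint_map(posterior))          # joint MAP configuration
print(componentwise_max(posterior))  # per-variable maxima
```

Here the joint MAP is (0, 1) with probability 0.4, while the per-variable maxima give (1, 1), which has probability only 0.3.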
Types of queries
More queries: maximum a posteriori (MAP)
Use MAP for:
• Classification: find the most likely label, given the evidence
• Explanation: find the most likely scenario, given the evidence
Types of queries
More queries: decision-making
Optimal decisions (of maximum expected utility), with influence diagrams
Exact inference [Pearl'88; Lauritzen & Spiegelhalter'88]
Brute-force computation of P(X | e)
• First, consider P(Xi), without observed evidence e.
• Conceptually simple but computationally complex.
• For a BN with n variables, each with its P(Xj | Pa(Xj)), the brute-force approach sums the joint probability distribution (JPD) over all the other variables:
  P(X_i) = \sum_{x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n} \prod_{j=1}^{n} P(x_j \mid pa(x_j))
• But this amounts to computing the JPD, which is often very inefficient and even computationally intractable.
CHALLENGE: without computing the JPD, exploit the factorization encoded by the BN and the distributive law (local computations).
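A minimal sketch of the brute-force approach follows. It assumes the burglary network of the earlier slides has the structure B → A ← E, E → N, A → W (the figure itself is not recoverable here), and every CPT number is made up for illustration. The function name `brute_force` is likewise hypothetical.

```python
from itertools import product

# Assumed structure: B -> A <- E, E -> N, A -> W. All numbers are illustrative.
P_B = {1: 0.01, 0: 0.99}                                   # P(B)
P_E = {1: 0.02, 0: 0.98}                                   # P(E)
P_A1 = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}  # P(A=1 | B, E)
P_N1 = {1: 0.9, 0: 0.01}                                   # P(N=1 | E)
P_W1 = {1: 0.8, 0: 0.05}                                   # P(W=1 | A)

VARS = ("B", "E", "A", "N", "W")

def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(b, e, a, n, w):
    """One entry of the JPD, from the BN factorization."""
    return (P_B[b] * P_E[e] * bern(P_A1[(b, e)], a)
            * bern(P_N1[e], n) * bern(P_W1[a], w))

def brute_force(target, evidence=None):
    """P(target | evidence) by enumerating all 2^5 = 32 joint configurations."""
    evidence = evidence or {}
    scores = {0: 0.0, 1: 0.0}
    for values in product((0, 1), repeat=len(VARS)):
        config = dict(zip(VARS, values))
        if any(config[v] != val for v, val in evidence.items()):
            continue  # configuration inconsistent with the evidence
        scores[config[target]] += joint(*values)
    total = sum(scores.values())
    return {state: s / total for state, s in scores.items()}

print(brute_force("B"))            # prior of Burglary
print(brute_force("B", {"W": 1}))  # posterior after observing WCalls = 1
```

The loop visits every joint configuration, which is exactly the cost the rest of the section tries to avoid.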
Exact inference
Improving brute-force
Use the JPD factorization and the distributive law.
The JPD is a table with 32 entries (if the variables are binary).
Exact inference
Improving brute-force
Arrange the computations effectively by moving some additions (over X4, and over X5 and X3) inside the products.
The biggest table then has 8 entries (like the biggest table in the BN).
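As a small worked illustration of moving additions inside products, assume a hypothetical three-variable chain X1 → X2 → X3 (this is not the five-variable network of the slide, whose equations are not recoverable here). The distributive law lets the sum over x1 be computed once, as a function f of x2 only:

```latex
\begin{aligned}
P(x_3) &= \sum_{x_1} \sum_{x_2} P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_2) \\
       &= \sum_{x_2} P(x_3 \mid x_2) \underbrace{\sum_{x_1} P(x_1)\, P(x_2 \mid x_1)}_{f(x_2)}
\end{aligned}
```

With binary variables, the first line builds a table over three variables (8 entries), whereas the second never handles more than two variables at a time.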
Exact inference
Variable elimination algorithm (ONE variable)
Wanted: P(Xi)
• Start from a list with all the functions of the problem.
• Select an elimination order σ of all variables (except Xi).
• For each Xk from σ, if F is the set of functions that involve Xk:
  • Delete F from the list.
  • Eliminate Xk: combine all the functions that contain this variable and marginalize out Xk, i.e. compute f' = \sum_{x_k} \prod_{f \in F} f
  • Add f' to the list.
• Output: combination (multiplication) of all functions in the current list.
Exact inference
Variable elimination algorithm
Repeat the algorithm for each target variable.
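The variable elimination steps can be sketched as follows. This is a minimal illustration for binary variables only, in which a factor is represented as a pair (tuple of variable names, table keyed by value tuples); that representation, the function names, and the chain A → B → C with its numbers are all assumptions made for the example.

```python
from itertools import product

def multiply(f, g):
    """Combine two factors into one over the union of their variables."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = tuple(dict.fromkeys(f_vars + g_vars))  # ordered union
    out_tab = {}
    for values in product((0, 1), repeat=len(out_vars)):
        a = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(a[v] for v in f_vars)]
                           * g_tab[tuple(a[v] for v in g_vars)])
    return out_vars, out_tab

def sum_out(f, var):
    """Marginalize `var` out of factor f."""
    f_vars, f_tab = f
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    out_tab = {}
    for values, p in f_tab.items():
        key = values[:idx] + values[idx + 1:]
        out_tab[key] = out_tab.get(key, 0.0) + p
    return out_vars, out_tab

def variable_elimination(factors, order):
    """Eliminate the variables in `order`; return the product of what is left."""
    factors = list(factors)
    for xk in order:
        involved = [f for f in factors if xk in f[0]]      # the set F
        if not involved:
            continue
        factors = [f for f in factors if xk not in f[0]]   # delete F from the list
        combined = involved[0]
        for f in involved[1:]:
            combined = multiply(combined, f)
        factors.append(sum_out(combined, xk))              # add f' to the list
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Hypothetical chain A -> B -> C with made-up CPTs.
f_a = (("A",), {(0,): 0.6, (1,): 0.4})
f_b = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
f_c = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5})

print(variable_elimination([f_a, f_b, f_c], ["A", "B"]))  # factor over C only
```

Without evidence the output is already the marginal of the target; with evidence, the factors would first be restricted to the observed values and the result normalized, which this sketch leaves out.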
Exact inference
Example with Asia network
[Figure: Asia network with nodes Visit to Asia (A), Smoking (S), Tuberculosis (T), Lung Cancer (L), Tuberculosis or Lung Cancer (E), Bronchitis (B), X-Ray (X), Dyspnea (D)]
Exact inference
Brute-force approach
Compute P(D) by brute force, summing the JPD over all the other variables.
Complexity is exponential in the number of variables: the JPD has one entry per joint configuration, i.e. the product of the numbers of states of all the variables.
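Assuming the standard structure of the Asia network (A → T, S → L, S → B, T → E, L → E, E → X, E → D, B → D), which the slide shows only as a figure, the brute-force computation of P(D) can be written as:

```latex
P(d) = \sum_{a,\, s,\, t,\, l,\, e,\, b,\, x}
       P(a)\, P(s)\, P(t \mid a)\, P(l \mid s)\, P(b \mid s)\,
       P(e \mid t, l)\, P(x \mid e)\, P(d \mid e, b)
```

With eight binary variables, the JPD inside the sum has 2^8 = 256 entries.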
Exact inference
Note: an intermediate function obtained during the computation is not necessarily a probability term.
Exact inference
Variable elimination algorithm
• Local computations (due to moving the additions).
• Complexity is exponential in the maximum number of variables in the factors of the summation (here the largest table has size 8).
• The elimination ordering is important, but finding an optimal (minimum-cost) ordering is NP-hard [Arnborg et al.'87]; heuristics are used to find good sequences.
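The effect of the elimination ordering can be illustrated by tracking only the scopes of the factors, without any numbers. The sketch below assumes the standard Asia structure (A → T, S → L, S → B, T → E, L → E, E → X, E → D, B → D) and a hypothetical helper `max_factor_scope`; it reports the largest number of variables that ever appear together in a combined factor.

```python
def max_factor_scope(scopes, order):
    """Largest combined-factor scope created while eliminating `order`."""
    scopes = [frozenset(s) for s in scopes]
    worst = 0
    for xk in order:
        involved = [s for s in scopes if xk in s]
        if not involved:
            continue
        combined = frozenset().union(*involved)   # scope of the product of F
        worst = max(worst, len(combined))
        scopes = [s for s in scopes if xk not in s]
        scopes.append(combined - {xk})            # scope of f' after summing out xk
    return worst

# One scope per CPT of the (assumed) Asia network; the target D is never eliminated.
ASIA_SCOPES = [
    {"A"}, {"S"}, {"T", "A"}, {"L", "S"}, {"B", "S"},
    {"E", "T", "L"}, {"X", "E"}, {"D", "E", "B"},
]

good = ["A", "X", "T", "S", "L", "B", "E"]
bad = ["E", "A", "S", "T", "L", "B", "X"]

print(max_factor_scope(ASIA_SCOPES, good))  # 3 variables, i.e. tables of size 2^3 = 8
print(max_factor_scope(ASIA_SCOPES, bad))   # 6 variables, i.e. tables of size 2^6 = 64
```

Eliminating E first joins every factor that mentions E, which is why the second ordering is so much more expensive.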
Exact inference
Message passing algorithm
• Operates by passing messages among the nodes of the network. Nodes act as processors that receive, calculate and send information. These are called propagation algorithms.
• Clique tree propagation is based on the same principle as VE, but with a sophisticated caching strategy that:
  • makes it possible to compute the posterior probability distributions of all variables in twice the time it takes to compute that of a single variable
  • works in an intuitively appealing fashion, namely message propagation
Exact inference
Basic operations for a node
• Ask-info(i,j): target node i asks node j for information, and does so for all its neighbors j. They do the same until there are no nodes left to ask.
• Send-message(i,j): each node sends a message to the node that asked it for information, until the target node is reached.
• A message is defined over the intersection of the domains of fi and fj. It is computed by combining fi with the messages that node i has received from its other neighbors and marginalizing out the variables that are not in the domain of fj.
• Finally, the computation is local at each node i: the target combines all the received information with its own and marginalizes to obtain the distribution of the target variable.
Exact inference
[Figure: Ask procedure for X2 (CollectEvidence)]
Exact inference
[Figure: P(X2) as a message passing algorithm]
Exact inference
VE as a message passing algorithm
[Figure: direct correspondence between the messages and the VE computations]
Exact inference
Computing P(Xi | e) for all (unobserved) variables Xi at a time
• We can perform the previous process for each node, but many messages are repeated!
• Or, we can use 2 rounds of messages as follows:
  • Select a node as a root (or pivot).
  • Ask or collect evidence from the leaves toward the root (messages in downward direction), as in VE.
  • Distribute evidence from the root toward the leaves (messages in upward direction).
  • Calculate marginal distributions at each node by local computation, i.e. using its incoming messages.
• This algorithm never constructs tables larger than those in the BN.
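The two rounds of messages can be sketched on a simplified model. The code below is not the clique tree formulation of the slides: it assumes a tree with one unary table per node and one pairwise table per edge, which is enough to encode a chain-shaped BN. The function name, the data layout, and the chain A → B → C with its numbers are illustrative assumptions.

```python
def two_pass_marginals(unary, pairwise, edges, root):
    """Sum-product on a tree: collect toward the root, then distribute from it."""
    neighbors = {v: [] for v in unary}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def psi(i, xi, j, xj):
        # Pairwise tables are stored once per edge, indexed [x_first][x_second].
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    # Depth-first traversal from the root: every node appears after its parent.
    parent, order, stack = {root: None}, [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for u in neighbors[v]:
            if u != parent[v]:
                parent[u] = v
                stack.append(u)

    messages = {}

    def local_product(i, skip=None):
        """Node i's own table times all incoming messages except the one from `skip`."""
        values = list(unary[i])
        for k in neighbors[i]:
            if k != skip:
                values = [a * b for a, b in zip(values, messages[(k, i)])]
        return values

    def send(i, j):
        incoming = local_product(i, skip=j)
        messages[(i, j)] = [
            sum(psi(i, xi, j, xj) * incoming[xi] for xi in range(len(incoming)))
            for xj in range(len(unary[j]))
        ]

    for v in reversed(order):      # first sweep: CollectEvidence
        if parent[v] is not None:
            send(v, parent[v])
    for v in order:                # second sweep: DistributeEvidence
        if parent[v] is not None:
            send(parent[v], v)

    marginals = {}
    for v in unary:
        belief = local_product(v)
        total = sum(belief)
        marginals[v] = [b / total for b in belief]
    return marginals

# Hypothetical chain A -> B -> C: P(A) as a unary table, CPTs as pairwise tables.
unary = {"A": [0.6, 0.4], "B": [1.0, 1.0], "C": [1.0, 1.0]}
pairwise = {("A", "B"): [[0.7, 0.3], [0.2, 0.8]],   # P(B | A)
            ("B", "C"): [[0.9, 0.1], [0.5, 0.5]]}   # P(C | B)
marginals = two_pass_marginals(unary, pairwise, [("A", "B"), ("B", "C")], root="B")
print(marginals)
```

Each edge carries exactly two messages, one per sweep, and every node then obtains its marginal from its own table and its incoming messages.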
Exact inference
Message passing algorithm
[Figure: tree with a root node and numbered messages. First sweep: CollectEvidence. Second sweep: DistributeEvidence.]
Exact inference
Networks with loops
If the net is not a polytree, the algorithm does not work:
• Requests/messages go around a cycle indefinitely (information travels along 2 paths and is counted twice).
• The independence assumptions applied in the algorithm cannot be used here (the polytree property "any node separates the graph into 2 unconnected parts" no longer holds).
Alternatives?
Exact inference
Complexity
• The complexity of propagation algorithms in polytrees (i.e., networks without loops, meaning no cycles in the underlying undirected graph) is linear in the size (nodes + arcs) of the network [brute-force is exponential].
• Exact inference in multiply-connected BNs is an NP-hard problem [Cooper 1990].
Exact inference
Alternative: clustering methods [Lauritzen & Spiegelhalter'88]
• Transform the BN into a probabilistically equivalent polytree by merging nodes, removing the multiple paths between two nodes.
• Method implemented in the main BN software packages.
Example: metastatic cancer (M) is a possible cause of brain tumors (B) and an explanation for increased total serum calcium (S). In turn, either of these could explain a patient falling into a coma (C). Severe headache (H) is also associated with brain tumors.
• Create a new node Z that combines S and B: Z = (S, B), with states {tt, ft, tf, ff}.
• P(Z | M) = P(S | M) P(B | M), since S and B are conditionally independent (c.i.) given M.
• P(H | Z) = P(H | B), since H is c.i. of S given B.
[Figure: original network over M, S, B, C, H and the clustered polytree over M, Z, C, H]
Exact inference
Alternative: clustering methods
Steps for the JUNCTION TREE CLUSTERING ALGORITHM:
• Moralize the BN
• Triangulate the moral graph and obtain the cliques
• Create the junction tree and its separators
• Compute new parameters
• Message passing algorithm
COMPILATION: transform the BN into a polytree (slow, and memory-hungry if the network is dense, but done only once).
Belief updating: message passing (fast).
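The first step, moralization, is simple enough to sketch: connect every pair of parents that share a child, then drop the arc directions. The example below uses the structure of the metastatic cancer network as described in the text (M → S, M → B, S → C, B → C, B → H); the function name `moralize` and the edge representation are assumptions for illustration.

```python
from itertools import combinations

def moralize(parents):
    """Return the undirected edges of the moral graph of a DAG.

    `parents` maps each node to the list of its parents.
    """
    edges = set()
    for child, pars in parents.items():
        for p in pars:
            edges.add(frozenset((p, child)))   # keep each arc, without direction
        for p, q in combinations(pars, 2):
            edges.add(frozenset((p, q)))       # "marry" parents of a common child
    return edges

# Metastatic cancer example: M -> S, M -> B, S -> C, B -> C, B -> H.
CANCER_PARENTS = {"M": [], "S": ["M"], "B": ["M"], "C": ["S", "B"], "H": ["B"]}

moral_edges = moralize(CANCER_PARENTS)
print(sorted(tuple(sorted(e)) for e in moral_edges))
```

The only added edge is S-B, because S and B are both parents of C; the resulting graph has six undirected edges.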
Approximate inference
Why? Because exact inference is intractable (NP-hard) with large (more than 40 nodes) and densely connected BNs: the associated cliques for the junction tree algorithm, or the intermediate factors in the VE algorithm, grow in size, generating an exponential blowup in the number of computations performed.
Both deterministic and stochastic simulation methods are used to find approximate answers.
Approximate inference
Stochastic simulation
• Uses the network to generate a large number of cases (full instantiations) from the network distribution.
• P(Xi | e) is estimated from these cases by counting observed frequencies in the samples. By the Law of Large Numbers, the estimate converges to the exact probability as more cases are generated.
• Approximate propagation in BNs within an arbitrary tolerance or accuracy is an NP-hard problem.
• In practice, if e is not too unlikely, convergence is quick.
Approximate inference
Probabilistic logic sampling [Henrion'88]
A forward sampling algorithm:
• Given an ancestral ordering of the nodes (parents before children), generate a value for X once values have been generated for its parents (i.e. from the root nodes down to the leaves), using the conditional probability given the known values of the parents.
• When all the nodes have been visited, we have a case, an instantiation of all the nodes in the BN.
• Repeat and use the observed frequencies to estimate P(Xi | e).
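A minimal sketch of probabilistic logic sampling is given below. It reuses the structure of the metastatic cancer example (M → S, M → B, S → C ← B, B → H); every CPT number is hypothetical, and the names `sample_case` and `logic_sampling` are illustrative. Cases that disagree with the evidence are simply discarded, which is why unlikely evidence slows convergence.

```python
import random

ORDER = ["M", "S", "B", "C", "H"]   # an ancestral ordering (parents before children)
PARENTS = {"M": (), "S": ("M",), "B": ("M",), "C": ("S", "B"), "H": ("B",)}

# P(node = True | parent values); all numbers are made up for illustration.
P_TRUE = {
    "M": {(): 0.2},
    "S": {(True,): 0.8, (False,): 0.2},
    "B": {(True,): 0.2, (False,): 0.05},
    "C": {(True, True): 0.8, (True, False): 0.8,
          (False, True): 0.8, (False, False): 0.05},
    "H": {(True,): 0.8, (False,): 0.6},
}

def sample_case(rng):
    """Generate one full instantiation by sampling each node given its parents."""
    case = {}
    for node in ORDER:
        parent_values = tuple(case[p] for p in PARENTS[node])
        case[node] = rng.random() < P_TRUE[node][parent_values]
    return case

def logic_sampling(target, evidence, n_cases, seed=0):
    """Estimate P(target = True | evidence) from the cases that match the evidence."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n_cases):
        case = sample_case(rng)
        if all(case[v] == val for v, val in evidence.items()):
            kept += 1
            hits += case[target]
    return hits / kept if kept else float("nan")

print(logic_sampling("M", {}, 20000))            # estimate of P(M = True)
print(logic_sampling("M", {"C": True}, 20000))   # estimate of P(M = True | C = True)
```

With these made-up tables the exact values are 0.2 and 0.425, so the two estimates should land close to them for a sample of this size.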
Software genie.sis.pitt.edu
Software http.cs.berkeley.edu/~murphyk/
Software leo.ugr.es/elvira