An Implementation of Multiply Sectioned Bayesian Networks Metron, Inc. Chris Boner Thor Whalen
Outline • Multiply sectioned Bayes nets (MSBN) • Problem formulation and elements of a solution • Using a junction tree to construct an MSBN • Matlab tool
Multiply Sectioned Bayes Net • What is a Bayes Net? • What is a Multiply Sectioned Bayes Net (MSBN)? • Motivation
What is a Bayes Net? • A Bayes Net is a representation of a probability distribution P(V) on a set V = {X1, ..., Xn} of variables
What is a Bayes Net? • A BN consists of • A Directed Acyclic Graph (DAG) • Nodes: variables of V • Edges: causal relations • A DAG is a directed graph with no directed cycles [Figure: an example DAG on nodes X1 to X13; adding an edge that closes a directed cycle means the graph is no longer a DAG]
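The DAG condition can be checked mechanically. The sketch below uses Kahn's topological-sort algorithm on a small hypothetical four-node graph (the node names and edges are illustrative, not the slide's X1 to X13 graph).

```python
from collections import deque

def is_dag(nodes, edges):
    """Return True if the directed graph has no directed cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is reached exactly when no directed cycle exists.
    return visited == len(nodes)

# Hypothetical example graphs.
nodes = ["X1", "X2", "X3", "X4"]
acyclic_edges = [("X1", "X2"), ("X2", "X3"), ("X1", "X4"), ("X4", "X3")]
cyclic_edges = acyclic_edges + [("X3", "X1")]  # closes the cycle X1 -> X2 -> X3 -> X1
```

Calling `is_dag(nodes, acyclic_edges)` accepts the first graph, while the extra edge in `cyclic_edges` makes the check fail.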
What is a Bayes Net? • A BN consists of (continued) • A list of conditional probability distributions (CPDs), one for every node of the DAG
What is a Bayes Net? • The DAG characterizes the (in)dependence structure of P(V) • Example: A and B are independent given C, i.e. P(A, B | C) = P(A | C) P(B | C), or equivalently P(A | B, C) = P(A | C) • We will say that C separates A and B [Figure: nodes A, C, B, with C between A and B]
What is a Bayes Net? • The CPDs characterize the probabilistic and/or deterministic relations between parent states and children states
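To make the two ingredients concrete, here is a minimal sketch of a three-node net C → A, C → B with binary states. The CPD numbers are invented for illustration. The joint distribution is the product of the CPDs, and the check confirms that C separates A and B in the sense defined above.

```python
from itertools import product

# Hypothetical CPDs for the net C -> A, C -> B (binary states 0/1).
p_c = {0: 0.7, 1: 0.3}
p_a_given_c = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_a_given_c[c][a]
p_b_given_c = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}  # p_b_given_c[c][b]

# The BN represents P(A, B, C) as the product of its CPDs.
joint = {
    (a, b, c): p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b]
    for a, b, c in product((0, 1), repeat=3)
}

def c_separates_a_and_b(joint, tol=1e-12):
    """Check P(A, B | C) == P(A | C) P(B | C) for every state."""
    for c in (0, 1):
        p_c_total = sum(p for (_, _, cc), p in joint.items() if cc == c)
        for a, b in product((0, 1), repeat=2):
            p_ab = joint[(a, b, c)] / p_c_total
            p_a = sum(joint[(a, bb, c)] for bb in (0, 1)) / p_c_total
            p_b = sum(joint[(aa, b, c)] for aa in (0, 1)) / p_c_total
            if abs(p_ab - p_a * p_b) > tol:
                return False
    return True
```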
What is a Bayes Net? • The prior distributions on the variables of parentless nodes, along with the CPDs of the BN, induce prior distributions (called “beliefs” in the literature) on all the variables • If the system receives evidence on a variable: • this evidence impacts its belief, • along with the beliefs of all other variables [Figure: the example DAG with its parentless nodes highlighted and evidence entering at one node]
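As a small worked illustration, assume a hypothetical two-node net X1 → X2 with invented numbers. The prior on the parentless node X1 and the CPD of X2 give a prior belief on X2; observing X2 = 1 then changes the belief on X1 by Bayes' rule.

```python
# Hypothetical two-node net X1 -> X2 with binary states.
p_x1 = {0: 0.8, 1: 0.2}                                    # prior on the parentless node
p_x2_given_x1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}  # p_x2_given_x1[x1][x2]

# Prior belief on X2, induced by the prior on X1 and the CPD.
belief_x2 = {
    x2: sum(p_x1[x1] * p_x2_given_x1[x1][x2] for x1 in (0, 1))
    for x2 in (0, 1)
}

# Evidence: X2 is observed to be 1. Update the belief on X1.
evidence_x2 = 1
posterior_x1 = {
    x1: p_x1[x1] * p_x2_given_x1[x1][evidence_x2] / belief_x2[evidence_x2]
    for x1 in (0, 1)
}
```

With these numbers the belief that X1 = 1 rises from 0.2 to 0.14 / 0.22 once the evidence is entered.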
What is a Multiply Sectioned Bayes Net? • Subnets of the BN are maintained independently • Each subnet locally integrates the evidence it receives • When inter-subnet communication is possible, messages are passed that enable fusion of evidence received by other subnets
Motivation • Multi-agent systems where: • each agent only has partial knowledge of the domain • communication among agents is limited • some agent-agent connections may be impossible, sporadic and/or low bandwidth • decisions must be made by agents based on local observations and limited information from other agents
Motivation (2) • Distributed computing for reusable systems where: • probabilistic knowledge can be captured once and used for multiple cases • queries and evidence will be localized, that is, there are phases when • new evidence and queries are repeatedly directed to small parts of network • only a small part of the network is needed for decision-making
Problem Formulation and elements of a solution • Problem specification • A naïve solution • A less naïve solution • Sufficient information • Communication graph considerations
Problem Specification Given: • A BN on V={X1, ..., Xn} • A number of agents, each having: • Qi: a set of query variables • Ei: a set of evidence variables
Problem Specification Determine: • An agent communication graph • A subset Si of V for each agent • An inference protocol that specifies • How to fuse evidence and messages received from other agents • The content of messages between agents
A Naïve Solution • Every agent has a copy of the entire Bayes net • Agents communicate evidence (findings or likelihood functions), which is re-propagated through each copy of the BN
A Naïve Solution Pros • Each agent’s queries are as informed as possible once all the evidence it has received is propagated • Inter-agent communications require relatively low bandwidth Cons • Could be a colossal waste of memory and processing time • Each agent may be able to achieve fully informed queries by representing a much smaller section of the BN
A Less Naïve Solution • The previous solution allows each agent to compute the posterior probability of all the variables • But all the agent is interested in is the posterior of its query variables • Hence it is sufficient for every agent to represent only - its query variables, - its evidence variables, - the evidence variables of the other agents • Con: this could still be a colossal waste of memory and processing time [Figure: four agents, each holding its own query variables and evidence variables]
Sufficient information Specifications • A Bayes net • A number of agents, each having • query variables • evidence variables [Figure: a Bayes net on variables A to M with agents 1 to 4, query and evidence variables marked]
Sufficient information The naïve solution • Agents contain their own query and evidence variables • In order to receive evidence from the other agents, agent 1 must represent variables E, F, G, H, I, J, K, L, and M [Figure: agent 1 holds A, B and copies of the triples E F G, H I J and K L M held by agents 2, 3 and 4]
Sufficient information The naïve solution • Agent 1 must represent many variables! How else could the other agents communicate their evidence? • Note that Z separates X and Y whether Y is equal to {K,L,M}, {H,I,J}, or {E,F,G} [Figure: agent 1 with its variables A, B labeled X, a separating set labeled Z, and each other agent's evidence triple labeled Y]
Sufficient information • Z separates X and Y → P(Y|Z) = P(Y|X,Z) → Likelihood given Z of evidence on Y = Likelihood given X and Z of evidence on Y → It is sufficient for agent 2 to send its posterior on Z to agent 1 for the latter to compute its posterior on X • Agent 1 holds P(X,Z), with X = {A,B} and Z = {C,D} • Agent 2 holds P(Y,Z), with Y = {E,F,G} and Z = {C,D} • On evidence eY, agent 2 computes P(Y,Z|eY), sums out Y to obtain P(Z|eY), and sends P(Z|eY) to agent 1 • Agent 1 multiplies P(Z|eY) by P(X,Z) P(Z)^-1 to obtain P(X,Z|eY)
Sufficient information • Z separates X and Y → P(Y|X,Z) = P(Y|Z) → Likelihood given X and Z of evidence on Y = Likelihood given Z of evidence on Y, i.e. P(eY|X,Z) = P(eY|Z) → It is sufficient for agent 2 to send its posterior on Z to agent 1 for the latter to compute its posterior on X
Because:
P(X,Z) P(Z)^-1 P(Z|eY)
= P(X,Z) P(Z)^-1 P(Z,eY) P(eY)^-1
= P(X,Z) P(eY|Z) P(eY)^-1
= P(X,Z) P(eY)^-1 P(eY|X,Z)
= P(X,Z,eY) P(eY)^-1
= P(X,Z|eY)
[Figure: agent 1 holds X = {A,B} and Z = {C,D}; agent 2 holds Y = {E,F,G} and Z = {C,D}]
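The identity can be checked numerically. The sketch below assumes a hypothetical chain X → Z → Y of single binary variables (the slide's X, Y and Z are sets of variables) with invented CPDs. Agent 1 only ever touches P(X,Z) and the message P(Z|eY), yet arrives at the same posterior as a computation on the full joint.

```python
from itertools import product

states = (0, 1)
# Hypothetical CPDs for the chain X -> Z -> Y, so that Z separates X and Y.
p_x = {0: 0.6, 1: 0.4}
p_z_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # [x][z]
p_y_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # [z][y]

joint = {
    (x, z, y): p_x[x] * p_z_given_x[x][z] * p_y_given_z[z][y]
    for x, z, y in product(states, repeat=3)
}

# Local knowledge of each agent.
p_xz = {(x, z): sum(joint[(x, z, y)] for y in states)
        for x, z in product(states, repeat=2)}            # agent 1
p_yz = {(y, z): sum(joint[(x, z, y)] for x in states)
        for y, z in product(states, repeat=2)}            # agent 2
p_z = {z: sum(p_xz[(x, z)] for x in states) for z in states}

# Agent 2 observes Y = 1 and sends its posterior on Z.
evidence_y = 1
p_ey = sum(p_yz[(evidence_y, z)] for z in states)
message = {z: p_yz[(evidence_y, z)] / p_ey for z in states}  # P(Z | eY)

# Agent 1 updates: P(X,Z) P(Z)^-1 P(Z|eY).
updated = {(x, z): p_xz[(x, z)] * message[z] / p_z[z]
           for x, z in product(states, repeat=2)}

# Reference answer computed from the full joint.
direct = {(x, z): joint[(x, z, evidence_y)] / p_ey
          for x, z in product(states, repeat=2)}
```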
Sufficient information Using separation • Agent 1 only needs to represent two extra variables • Agent 1 may compute its posterior queries faster from CD than from EFGHIJKLM • Communication lines need to transmit two variables instead of three [Figure: agent 1 holds A, B, C, D; each of the other agents holds C, D together with its own evidence triple (E F G, H I J or K L M)]
Communication Graph Considerations • [Figure: a communication graph on agents 1 to 6] • Agent 6 receives info from agent 1 through both agent 4 and agent 5. How should subnet 6 deal with possible redundancy? • One solution (often adopted) would be to impose a tree structure on the communication graph
Communication Graph Considerations • When choosing the communication graph, one should take into consideration • The quality of the possible communication lines • The processing speed of the agents • The importance of given queries [Figure: two candidate communication graphs; if a given agent is the key decision-making agent, the graph that places it centrally is more appropriate than the other]
Communication Graph Considerations • In a tree communication graph every edge is the only communication line between two parts of the network • Hence it must deliver enough information so that the evidence received in one part may convey its impact to the query variables of the other part • We restrict ourselves to the case where every node represented by an agent can be queried or receive evidence • In this case it is sufficient that the set of variables Z represented in any communication line separates the set X of variables on one side of the network from the set Y of variables on the other side
Using a Junction Tree to construct an MSBN • The junction tree and its use • Building a Junction tree • Moralization • Triangulation • The junction graph • From junction graph to junction tree • Partitioning the junction tree • Adding and removing agents • A note on continuous variables
The junction tree and its use • Bayesian Network • one-dim. random variables • conditional probabilities • Secondary Structure / Junction Tree • multi-dim. random variables • joint probabilities (potentials) [Figure: a Bayes net on nodes a to h and its junction tree with cliques abd, ade, ace, ceg, def, egh and sepsets ad, ae, ce, de, eg]
The junction tree and its use • A junction tree is a graphical model of a probability space: • Nodes of a JT are sets of variables • Edges of a JT (called sepsets) are labeled by the intersection of the sets of variables of the nodes they join • The set of variables Z of any edge of a JT separates the sets of variables of the sub-trees on both sides of this edge • e.g. {a,e} separates {a,b,d,e,f} and {a,c,e,g,h} [Figure: junction tree with cliques abd, ade, ace, ceg, def, egh and sepsets ad, ae, ce, de, eg]
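The {a,e} example can be reproduced directly from the tree. The sketch below encodes the junction tree as read from the slide's clique and sepset labels (the edge list is my reading of the figure), cuts one edge, and collects the variables on each side. It checks the graph-level property that the two sides share exactly the sepset, which is what the separation claim rests on.

```python
# Cliques from the slide; edges inferred from its sepset labels ad, ae, de, ce, eg.
abd, ade, ace, ceg, def_, egh = (
    frozenset(s) for s in ("abd", "ade", "ace", "ceg", "def", "egh")
)
tree_edges = [(abd, ade), (ade, ace), (ade, def_), (ace, ceg), (ceg, egh)]

def side_variables(edges, cut):
    """Return the variable sets on each side of the tree once `cut` is removed."""
    remaining = [e for e in edges if e != cut]

    def collect(start):
        seen, stack = {start}, [start]
        while stack:
            clique = stack.pop()
            for u, v in remaining:
                for here, there in ((u, v), (v, u)):
                    if here == clique and there not in seen:
                        seen.add(there)
                        stack.append(there)
        return frozenset().union(*seen)

    return collect(cut[0]), collect(cut[1])

left, right = side_variables(tree_edges, (ade, ace))
shared = left & right  # variables common to both sides of the cut
```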
The junction tree and its use • So any partition of a junction tree into sub-trees will allow for distributed inference [Figure: the example junction tree split into sub-trees, with the sepsets {a,e} and {e,g} on the cut edges]
Example of the Junction Tree Approach [Figure: a Bayes net on TopCat, Feat 1 to Feat 8, Sens1a, Sens1b, Sens2a and Sens2b, divided among agents, each with its query nodes and evidence nodes marked]
Building a Junction Tree • DAG → Moral Graph → Triangulated Graph → Identifying Cliques → Junction Tree [Figure: each stage shown on the example graph with nodes a to h]
Building a Junction Tree: Moralization 1) Add an edge between every pair of nodes having a common child. 2) Drop the directions of all edges. [Figure: the example Bayes net on TopCat, Feat 1 to Feat 8 and the sensor nodes, before and after moralization]
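The two moralization steps translate almost line for line into code. In this sketch the DAG on nodes a to h is a hypothetical one, chosen because it is consistent with the cliques shown on the earlier junction tree slide; the slides do not list its edges explicitly.

```python
from itertools import combinations

def moralize(parents):
    """Return the moral graph (undirected adjacency sets) of a DAG.

    `parents` maps every node to the list of its parents.
    """
    adj = {node: set() for node in parents}
    for child, pas in parents.items():
        # Step 2: keep each original edge, without its direction.
        for p in pas:
            adj[child].add(p)
            adj[p].add(child)
        # Step 1: connect ("marry") every pair of parents of a common child.
        for p, q in combinations(pas, 2):
            adj[p].add(q)
            adj[q].add(p)
    return adj

# Hypothetical DAG on a..h (an assumption, not given edge by edge in the slides).
parents = {
    "a": [], "b": ["a"], "c": ["a"], "d": ["b"],
    "e": ["c"], "f": ["d", "e"], "g": ["c"], "h": ["e", "g"],
}
moral = moralize(parents)
edge_count = sum(len(nbrs) for nbrs in moral.values()) // 2
```

Here moralization adds the edges d-e (common child f) and e-g (common child h) to the nine original edges.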
Building a Junction Tree: Triangulation • Add edges to the graph to triangulate (induced) cycles of length greater than three • In the example there is only one induced cycle of length greater than three, and there are only two ways to triangulate it • One of the two ways can be shown to be problematic, so we choose the other for our example [Figure: the moral graph of the example net with the induced cycle and the two candidate fill-in edges]
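One common way to triangulate, not necessarily the one used in the tool, is node elimination: remove nodes in some order and, before removing each node, connect all of its remaining neighbors. The sketch below applies this to a hypothetical moral graph on nodes a to h (the same illustrative graph as the a to h junction tree example, not the TopCat net); the elimination order is also an assumption.

```python
from itertools import combinations

def adjacency(edge_labels):
    """Build undirected adjacency sets from two-letter edge labels like 'ab'."""
    adj = {}
    for u, v in edge_labels:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def triangulate(adj, order):
    """Return (triangulated adjacency, fill-in edges) for an elimination order."""
    filled = {n: set(nbrs) for n, nbrs in adj.items()}
    work = {n: set(nbrs) for n, nbrs in adj.items()}
    fill_ins = []
    for node in order:
        # Connect every pair of not-yet-eliminated neighbors of `node`.
        for u, v in combinations(sorted(work[node]), 2):
            if v not in work[u]:
                work[u].add(v)
                work[v].add(u)
                filled[u].add(v)
                filled[v].add(u)
                fill_ins.append((u, v))
        # Eliminate `node` from the working graph.
        for nbr in work[node]:
            work[nbr].discard(node)
        del work[node]
    return filled, fill_ins

# Hypothetical moral graph; its only chordless cycle of length > 3 is a-b-d-e-c-a.
moral = adjacency("ab ac bd ce df ef cg eh gh de eg".split())
triangulated, fill_ins = triangulate(moral, order="hgfcbdea")
```

Different elimination orders give different fill-in edges, which is one way of exploring the alternative triangulations mentioned on the slide.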
Building a Junction Tree: Junction Graph • A complete subgraph is one with an edge between every pair of its vertices • A clique is a complete subgraph contained in no other complete subgraph [Figure: examples on the triangulated graph of vertex sets that are cliques and of a complete subgraph that is not a clique until it is extended]
Building a Junction Tree: Junction Graph 1) Identify cliques. Every clique corresponds to a node in the junction graph (JG). 2) Draw an edge between two nodes if they share variables. 3) Label the edges of the JG with the intersection of the cliques they join. [Figure: the cliques of the triangulated example net and the resulting junction graph]
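The three steps can be sketched as follows. Clique identification here uses the basic Bron–Kerbosch algorithm (a standard maximal-clique enumeration, chosen for brevity; the slides do not say how the tool finds cliques), and the input is a hypothetical triangulated graph on nodes a to h rather than the TopCat net.

```python
from itertools import combinations

def adjacency(edge_labels):
    """Build undirected adjacency sets from two-letter edge labels like 'ab'."""
    adj = {}
    for u, v in edge_labels:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

def junction_graph(cliques):
    """Edges between cliques that share variables, labeled by the intersection."""
    return [(c1, c2, c1 & c2) for c1, c2 in combinations(cliques, 2) if c1 & c2]

# Hypothetical triangulated graph on a..h.
triangulated = adjacency("ab ac ad ae bd ce cg de df ef eg eh gh".split())
cliques = maximal_cliques(triangulated)
junction_edges = junction_graph(cliques)
```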
From Junction Graph to Junction Tree 1) Weight every edge with the number of variables it is labeled with 2) Find a maximal weight spanning (i.e. covering all JG nodes) tree 3) The corresponding subgraph of the JG is a Junction Tree (JT) • The cliques are the nodes of the JT • The edges of the JT are called sepsets [Figure: the junction graph of the example net with edge weights 1 and 2, and the spanning tree selected from it]
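A maximal weight spanning tree can be found with Kruskal's algorithm, taking edges in order of decreasing weight. The sketch below is one possible implementation, run on the six cliques of the a to h example rather than on the TopCat net.

```python
from itertools import combinations

def junction_tree(cliques):
    """Maximal weight spanning tree of the junction graph (Kruskal's algorithm)."""
    # Steps 1 and 2 of the junction graph: weighted, labeled edges.
    edges = []
    for c1, c2 in combinations(cliques, 2):
        sepset = c1 & c2
        if sepset:
            edges.append((len(sepset), c1, c2, sepset))
    edges.sort(key=lambda e: e[0], reverse=True)  # heaviest edges first

    # Union-find over cliques to avoid closing cycles.
    parent = {c: c for c in cliques}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    tree = []
    for _, c1, c2, sepset in edges:
        r1, r2 = find(c1), find(c2)
        if r1 != r2:
            parent[r1] = r2
            tree.append((c1, c2, sepset))
    return tree

cliques = [frozenset(s) for s in ("abd", "ade", "ace", "ceg", "def", "egh")]
tree = junction_tree(cliques)
sepsets = {sepset for _, _, sepset in tree}
```

For this example the five weight-2 edges already form a spanning tree, so the result reproduces the sepsets ad, ae, de, ce and eg of the earlier slide.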
Partitioning the Junction Tree • Partition the nodes of the JT: • Sets of the partition → subnets • Edges between two nodes of different subnets → communication lines • It is desirable for the portion of the JT inside a subnet to be connected [Figure: the junction tree of the example net with its cliques grouped into subnets]
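A minimal sketch of this step, assuming the a to h junction tree and an invented assignment of cliques to three agents: each subnet is the union of its cliques' variables, and every tree edge that crosses the partition becomes a communication line carrying the sepset variables.

```python
# Junction tree of the a..h example; clique names double as their variable sets.
cliques = {name: frozenset(name) for name in ("abd", "ade", "ace", "ceg", "def", "egh")}
tree_edges = [("abd", "ade"), ("ade", "ace"), ("ade", "def"), ("ace", "ceg"), ("ceg", "egh")]

# Hypothetical partition: clique name -> agent id.
assignment = {"abd": 1, "ade": 1, "def": 1, "ace": 2, "ceg": 3, "egh": 3}

def build_subnets(cliques, tree_edges, assignment):
    """Return (variables per agent, communication lines between agents)."""
    subnets = {}
    for name, agent in assignment.items():
        subnets.setdefault(agent, set()).update(cliques[name])
    lines = [
        (assignment[u], assignment[v], cliques[u] & cliques[v])
        for u, v in tree_edges
        if assignment[u] != assignment[v]
    ]
    return subnets, lines

subnets, lines = build_subnets(cliques, tree_edges, assignment)
```

In this partition every agent's cliques form a connected piece of the tree, which is the desirable case noted on the slide.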
Partitioning the Junction Tree [Figure: the partitioned junction tree and the corresponding subnets, with evidence entering at the sensor variables]
Adding and removing Agents • Here we address the problem of adding agents to, and removing agents from, the network • Consider the BN given earlier • Is it possible to add and remove agents (containing sensor variables) and perform inference without reconfiguring the network? [Figure: the example BN extended with additional sensor pairs Sens3a/Sens3b to Sens6a/Sens6b]
Adding and removing Agents (2) • Adding the cliques containing the new variables does not change the structure of the junction tree, so new agents containing these variables may easily be added and removed, along with a single communication line to the rest of the network [Figure: new sensor cliques for Sens1, Sens3 and Sens5 attached to the clique containing Feat 1 and Feat 2]
Adding and removing Agents (3) • Adding the cliques containing the new variables does not change the structure of the junction tree, so new agents containing these variables may easily be added and removed, along with a single communication line to the rest of the network [Figure: new sensor cliques for Sens2, Sens4 and Sens6 attached along the cliques containing Feat 5 and Feat 6]
Adding and removing Agents (4) • It is not desirable to have sensors chained in this way, since evidence received in one agent must pass through other agents to reach the central agent • It would be preferable to have the sensor agents communicate with the central agent directly [Figure: the Sens2, Sens4 and Sens6 cliques chained one after another in the junction tree]
Adding and removing Agents (5) • By adding an extra variable to the appropriate clique, we now have a different junction tree structure that is better suited to our application [Figure: the modified junction tree, in which the Sens2, Sens4 and Sens6 cliques each connect directly to the central part of the tree]
A note on continuous variables • We wish to extend JT inference to handle continuous variables • JT inference ↔ potential manipulation • In general, a potential Φ on X = (X1, ..., Xn) is a function Φ: X1 × ... × Xn → [0,+∞) • We need to define • Multiplication (and division) of two potentials • use function multiplication for this • Marginalization of a potential • use integration instead of summation
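For reference, the discrete versions of the two operations look like this. The sketch is a minimal table-based representation of my own (binary states, invented numbers), not the tool's data structure; in the continuous case the sum in `marginalize` is what integration would replace.

```python
from itertools import product

def multiply(vars1, table1, vars2, table2, states=(0, 1)):
    """Pointwise product of two potentials given as {value tuple: number} tables."""
    out_vars = tuple(dict.fromkeys(vars1 + vars2))  # union, order preserved
    out = {}
    for values in product(states, repeat=len(out_vars)):
        assign = dict(zip(out_vars, values))
        key1 = tuple(assign[v] for v in vars1)
        key2 = tuple(assign[v] for v in vars2)
        out[values] = table1[key1] * table2[key2]
    return out_vars, out

def marginalize(vars_, table, keep):
    """Sum out every variable of the potential that is not listed in `keep`."""
    idx = [vars_.index(v) for v in keep]
    out = {}
    for values, p in table.items():
        key = tuple(values[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return tuple(keep), out

# Hypothetical potentials: P(A) and P(B | A).
p_a = {(0,): 0.8, (1,): 0.2}
p_b_given_a = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7}  # keys are (a, b)

joint_vars, joint = multiply(("A",), p_a, ("A", "B"), p_b_given_a)
b_vars, p_b = marginalize(joint_vars, joint, ("B",))
```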
A note on continuous variables • Prior distributions and all possible evidence likelihood functions must be represented algebraically by a class of functions closed under multiplication and integration • If the class of functions doesn’t encompass the prior distributions and evidence exactly, the question arises whether approximate inference or discretization might yield better results
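As one example of such a class (my choice of illustration, not one named in the slides), Gaussian densities are closed under multiplication up to a constant factor: the product of two Gaussians in the same variable is proportional to a Gaussian whose precision is the sum of the two precisions. A short sketch:

```python
import math

def gauss(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def multiply_gaussians(m1, v1, m2, v2):
    """Mean and variance of the Gaussian proportional to N(m1, v1) * N(m2, v2)."""
    var = 1.0 / (1.0 / v1 + 1.0 / v2)   # precisions add
    mean = var * (m1 / v1 + m2 / v2)    # precision-weighted mean
    return mean, var

mean, var = multiply_gaussians(0.0, 1.0, 2.0, 4.0)

# The ratio product / combined density should be the same constant at every x.
ratios = [
    gauss(x, 0.0, 1.0) * gauss(x, 2.0, 4.0) / gauss(x, mean, var)
    for x in (-1.0, 0.0, 1.5)
]
```

The constant ratio is the normalization that a potential-based scheme would have to carry along or renormalize away.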
A note on continuous variables • Algebraic manipulation is not a straightforward computational task • Exact integration is not always possible • Is numerical integration necessarily better than approximate inference or discretization?
MatLab Tool • Setup • Functionality • Adding and removing subnets
MatLab Tool Inputs • Bayes Net • variables and states • CPDs • entered in a simple text format or translated from a .dne Netica file • Junction tree from Bayes Net • entered in a simple text format or translated from a .dne Netica file • Partition of the junction tree