Bayes-ball—an Efficient Algorithm to Assess D-separation A Presentation for CSCE 582 (Fall 03) Marco Valtorta

D-separation
• Two nodes x and y in a directed acyclic graph (DAG) are d-separated if, for each chain between x and y in the DAG, there is an intermediate node v on the chain such that either:
  • the connection at v is serial and v has evidence, or
  • the connection at v is diverging and v has evidence, or
  • the connection at v is converging and there is no evidence on v or on any of its descendants.
• Two sets of nodes X and Y are d-separated if all pairs x and y, with x in X and y in Y, are d-separated.
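
The three blocking conditions can be checked mechanically for one chain at a time. The following is a minimal sketch, not part of the original slides: the function names and the input format (a dict `children` mapping each node to its children, and a set `evidence`) are assumptions made for illustration.

```python
def descendants(node, children):
    """Return the set of all descendants of `node` in the DAG."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def chain_is_blocked(chain, children, evidence):
    """True if some interior node of `chain` blocks it, given the evidence set."""
    for prev, mid, nxt in zip(chain, chain[1:], chain[2:]):
        # Converging connection: both neighbours on the chain point into mid.
        converging = (mid in children.get(prev, ())
                      and mid in children.get(nxt, ()))
        if converging:
            # Blocked only if neither mid nor any descendant has evidence.
            if mid not in evidence and not (descendants(mid, children) & evidence):
                return True
        elif mid in evidence:
            # Serial or diverging connection with evidence on mid.
            return True
    return False


# Hypothetical toy graph: a -> c <- b, c -> d.
toy_children = {"a": ["c"], "b": ["c"], "c": ["d"]}
collider_no_evidence = chain_is_blocked(["a", "c", "b"], toy_children, set())
collider_descendant_evidence = chain_is_blocked(["a", "c", "b"], toy_children, {"d"})
```

Here the converging chain a, c, b is blocked with no evidence and becomes unblocked once the descendant d is observed. Testing d-separation this way requires enumerating every chain, which is what the following slides argue against.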

Testing D-separation
• Direct testing of d-separation may be inefficient, because of the large number of chains in directed acyclic graphs (DAGs).
• How many paths are there in a connected DAG?
• First consider how to count the number of paths in a DAG:
  • Topologically sort the DAG.
  • The number of paths reaching node i is the sum, over all predecessors of i, of the number of paths that reach each predecessor.
• This leads to an inefficient recursive algorithm and an efficient dynamic programming algorithm.
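
The dynamic programming version of the count can be sketched as follows. This is an illustrative sketch rather than code from the presentation: the function name, the edge-list input format, and the use of Kahn's algorithm for the topological sort are all assumptions.

```python
from collections import deque


def count_paths_from(source, nodes, edges):
    """Count the directed paths from `source` to every node of a DAG."""
    children = {v: [] for v in nodes}
    indegree = {v: 0 for v in nodes}
    for u, v in edges:
        children[u].append(v)
        indegree[v] += 1

    # Topological sort (Kahn's algorithm): repeatedly remove nodes
    # that have no remaining predecessors.
    order = []
    ready = deque(v for v in nodes if indegree[v] == 0)
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    # paths[v] is the sum of paths[u] over all predecessors u of v.
    paths = {v: 0 for v in nodes}
    paths[source] = 1  # the source is the special case
    for u in order:
        for v in children[u]:
            paths[v] += paths[u]
    return paths


# Hypothetical example: a diamond a -> {b, c} -> d plus a shortcut a -> d.
example_paths = count_paths_from(
    "a",
    ["a", "b", "c", "d"],
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")],
)
```

Each edge is processed once after the sort, so the count takes time linear in the size of the graph, whereas the naive recursion recomputes the count of a shared predecessor once per path that reaches it.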

Example
[Figure: example DAG. A topological order of the nodes is indicated inside the nodes. The number of paths reaching a node from node 1 is written over each node. (Node 1 is a special case.)]

Special Case: Staged Networks
• Special case: there is an edge from each node in a layer to each node in the next layer, i.e., the layers are stages.
• Each layer has the same number of nodes, because x*y subject to x+y=c is maximized by x=y=c/2.
• How many stages?
• For N nodes and c nodes per stage, there are N/c stages and c^(N/c) paths.
• c=3 is the worst case.

Worst Case (ctd.)
• In fact, for a fixed constant c, x^(c/x) is maximum over real x for x = e; among integers, x = 3 gives the largest value.
• The general case of DAGs and the special case of staged networks motivate the search for more efficient algorithms to test d-separation.
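
The worst-case claim for staged networks can be checked numerically. The sketch below is illustrative only: the function name and the choice of N = 60 (picked because it is divisible by every stage size tried) are assumptions.

```python
def staged_path_count(total_nodes, nodes_per_stage):
    """Number of stage-by-stage paths, c ** (N / c), for c dividing N."""
    if total_nodes % nodes_per_stage != 0:
        raise ValueError("nodes_per_stage must divide total_nodes")
    stages = total_nodes // nodes_per_stage
    return nodes_per_stage ** stages


N = 60  # hypothetical network size
counts = {c: staged_path_count(N, c) for c in (2, 3, 4, 5, 6)}
worst = max(counts, key=counts.get)  # stage size with the most paths
```

For N = 60 the count is 3^20 = 3,486,784,401 with three nodes per stage, against 2^30 = 4^15 = 1,073,741,824 with two or four nodes per stage, so c = 3 does give the largest count among the sizes tried.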

The Bayes-ball Algorithm
• There are several efficient algorithms for computing d-separation.
• Bayes-ball is both efficient (O(V+E)) and easy to code.
• The main idea is to pass the Bayes-ball to nodes in a DAG in different ways, according to who passes the ball (child or parent) and the state of the node that receives the ball (observed or unobserved).

Passing the Ball
[Figure: the ball comes from the left; e marks an observed node.]
• From parent to unobserved child: the ball passes through to all children.
• From parent to observed child: the ball bounces back to all parents.
• From child to unobserved parent: the ball bounces back to children and passes through to all parents.
• From child to observed parent: the ball is blocked.
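
The four passing rules amount to a small lookup on two inputs: where the ball came from and whether the receiving node is observed. A minimal sketch, with a hypothetical function name and string labels chosen for illustration:

```python
def pass_ball(came_from, observed):
    """Return the directions in which a node passes the ball on.

    came_from: "parent" or "child" (who passed the ball to this node).
    observed:  True if the receiving node has evidence.
    """
    if came_from == "parent":
        # Observed child bounces the ball back; unobserved child passes it on.
        return {"parents"} if observed else {"children"}
    if came_from == "child":
        # Observed parent blocks; unobserved parent sends the ball both ways.
        return set() if observed else {"parents", "children"}
    raise ValueError("came_from must be 'parent' or 'child'")
```

An empty set means the ball is blocked at that node.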

Example
What is A d-connected to?
[Figure: network with nodes A, S, T, L, B, E, X, D; e marks evidence nodes; the labels (1), (2), (2a), (3), (4) on the arcs correspond to the steps below.]
• (1) Passes through T
• (2) Bounces back from E
• (2a) Passes through
• (3) Passes through and is blocked by S
• (4) Bounces back and is blocked, because E is marked on the top
Termination requires a marking scheme. Two visits of each edge are sufficient.

The Marking Scheme
• Nodes may be marked as:
  • visited, if they are passed the ball at all
  • marked on the top, if they pass the ball to their parents
  • marked on the bottom, if they pass the ball to their children

Nodes D-separated From v
Insert v in a list of nodes to be visited, as if from one of its children. Call the list LV.
(1) If LV is empty, stop; else, take a node j out of LV and mark it visited.
(2) If j is not an evidence node and it is visited from a child:
  • if the top of j is not marked, mark its top and insert each of its parents in LV;
  • if the bottom of j is not marked, mark its bottom and schedule each of its children to be visited.
(3) If j is visited from a parent:
  • if j is an evidence node and the top of j is not marked, then mark its top and schedule a visit to each of its parents;
  • if j is not an evidence node and the bottom of j is not marked, then mark its bottom and schedule each of its children to be visited.
(4) Go back to 1.
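
The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the presenter's code: the function name and the input format (a dict mapping every node to a list of its parents) are hypothetical. The example network is also an assumption: the arcs are those of the standard "Asia" chest-clinic network, which matches the node labels of the example slides, and the evidence is assumed to be on S and E.

```python
from collections import deque


def bayes_ball(parents, start, evidence):
    """Run Bayes-ball from `start`; return the (visited, top, bottom) marks.

    parents: dict mapping EVERY node to a list of its parents (assumed format).
    """
    children = {v: [] for v in parents}
    for child, node_parents in parents.items():
        for p in node_parents:
            children[p].append(child)

    visited, top, bottom = set(), set(), set()
    # LV: the start node is scheduled as if visited from one of its children.
    to_visit = deque([(start, "child")])
    while to_visit:                                   # step (1)
        j, came_from = to_visit.popleft()
        visited.add(j)
        if came_from == "child" and j not in evidence:    # step (2)
            if j not in top:
                top.add(j)
                to_visit.extend((p, "child") for p in parents[j])
            if j not in bottom:
                bottom.add(j)
                to_visit.extend((c, "parent") for c in children[j])
        elif came_from == "parent":                       # step (3)
            if j in evidence and j not in top:
                top.add(j)
                to_visit.extend((p, "child") for p in parents[j])
            if j not in evidence and j not in bottom:
                bottom.add(j)
                to_visit.extend((c, "parent") for c in children[j])
    return visited, top, bottom


# Assumed structure: the standard "Asia" network, evidence on S and E.
asia_parents = {
    "A": [], "S": [], "T": ["A"], "L": ["S"], "B": ["S"],
    "E": ["T", "L"], "X": ["E"], "D": ["E", "B"],
}
visited, top, bottom = bayes_ball(asia_parents, "A", {"S", "E"})
not_visited = set(asia_parents) - visited
```

Under these assumptions the ball reaches A, T, E, L, and S, and never reaches B, X, or D. A parent is always scheduled as visited "from a child" and a child as visited "from a parent", and the top and bottom marks ensure that each node passes the ball in each direction at most once.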

Example
What is A d-connected to?
[Figure: network with nodes A, S, T, L, B, E, X, D; e marks evidence nodes; the arcs carry the step labels (1), (2), (2a), (3), (4).]
Successive contents of LV:
• LV={A from child}
• LV={T from parent}
• LV={A from child, T, E from parent}
• LV={E from parent}
• LV={T from child, L from child}
• LV={L from child}
• LV={E from parent, S from child}
• LV={S from child}
• LV={}
Every evidence node that is visited is d-connected to A. Every other node that is marked on the bottom is d-connected to A.
Note: Every non-evidence node that is visited is also marked on the bottom, so every node that is visited is d-connected to A.

Example
What is T d-connected to?
[Figure: network with nodes A, S, T, L, B, E, X, D; e marks evidence nodes.]
Successive contents of LV:
• LV={T from child}
• LV={A from child, T, E from parent}
• LV={E from parent}
• LV={T from child, L from child}
• LV={L from child}
• LV={E from parent, S from child}
• LV={S from child}
• LV={}
Every evidence node that is visited is d-connected to T. Every other node that is marked on the bottom is d-connected to T.
Note: Every non-evidence node that is visited is also marked on the bottom, so every node that is visited is d-connected to T.

Correctness
• In the case in which the start node has evidence, the algorithm visits no node other than the start node. This is not in accordance with the definition of d-separation, but it conforms to the intention of the author to identify the nodes that are relevant to assessing the belief in the start node.
• A non-evidence node j is marked on the bottom if and only if there is a non-blocked path from the start node to j.
• Note: for the version given in the presentation, this can be simplified to: any visited node is d-connected to the start node.
• Any edge is traversed at most once in each direction, so the algorithm has complexity O(n+m), where n is the number of nodes and m is the number of edges.
• O(n) operations are needed to initialize the markings of the nodes.

Extensions and Reference
• Extensions involve explicit functional nodes and influence diagrams (decision networks).
• Original reference:
  • Ross D. Shachter. “Bayes-Ball: The Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence Diagrams).” Proceedings of UAI-98, pp. 480-487.
  • Soft copy at http://www.stanford.edu/~shachter/pubs/bayesbl.pdf

Extra Credit
• Prove the correctness of the Bayes-ball algorithm with marking.
• Two extra credit points if turned in by September 9 (one week from today).
• One extra credit point if turned in later, at the discretion of the instructor.