A Comparison of Lauritzen-Spiegelhalter, Hugin, and Shenoy-Shafer Architectures for Computing Marginals of Probability Distributions Scott Langevin Chris Streett Alicia Ruvinsky
History • Computing marginals of multivariate discrete probability distributions in uncertain reasoning • 1986 – Pearl’s architecture • Singly connected Bayes nets • Singly connected (a.k.a. polytree) = there exists at most one path between any 2 nodes • 1988 – Lauritzen and Spiegelhalter create LS • 1990 – Jensen et al. modify LS, creating Hugin • 1990 – Inspired by previous work, Shenoy and Shafer propose a framework using join trees to produce marginals • 1997 – Shenoy refines the Shenoy-Shafer architecture with binary join trees • This will be referred to as the Shenoy-Shafer (SS) architecture
Chest Clinic Problem • Dyspnoea (D) may be caused by Tuberculosis (T), Lung Cancer (L) or Bronchitis (B) • A recent visit to Asia (A) can increase the chance of T • Smoking (S) increases the chance of L and B • X-ray (X) – does not discriminate between L and T • Either (E) Tuberculosis or Lung Cancer can result in a positive X-ray
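The dependency structure described above can be sketched as a parent map. This is an illustrative encoding only (a plain dict, with the slide's variable abbreviations); it carries the graph structure, not the published conditional probability tables.

```python
# Parent map for the Chest Clinic ("Asia") Bayesian network.
# Keys are variables, values are their parent lists per the slide.
parents = {
    "A": [],          # recent visit to Asia
    "S": [],          # smoking
    "T": ["A"],       # tuberculosis, influenced by A
    "L": ["S"],       # lung cancer, influenced by S
    "B": ["S"],       # bronchitis, influenced by S
    "E": ["T", "L"],  # either tuberculosis or lung cancer
    "X": ["E"],       # positive X-ray, caused by E
    "D": ["E", "B"],  # dyspnoea, caused by E or B
}
```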
Axioms For Local Computation 3 axioms enabling efficient local computation of marginals of the joint valuation: • Order of deletion does not matter • Suppose σ is a valuation for s, and suppose X1, X2 ∈ s. Then (σ↓(s – {X1}))↓(s – {X1, X2}) = (σ↓(s – {X2}))↓(s – {X1, X2}) • Commutativity and associativity of combination • Suppose ρ, σ, and τ are valuations for r, s, and t, respectively. Then ρ×σ = σ×ρ, and ρ×(σ×τ) = (ρ×σ)×τ • Distributivity of marginalization over combination • Suppose ρ and σ are valuations for r and s, respectively, suppose X ∈ s, and suppose X ∉ r. Then (ρ×σ)↓((r∪s) – {X}) = ρ×(σ↓(s – {X}))
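The axioms can be checked numerically by representing potentials as numpy arrays with one axis per variable: marginalizing a variable out is summing over its axis, and combination is a broadcast multiply. The shapes and random values below are arbitrary illustrations, not tied to the Chest Clinic model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Axiom 1: order of deletion does not matter.
# sigma is a valuation for s = {X1, X2, X3} (axes 0, 1, 2).
sigma = rng.random((2, 3, 4))
a = sigma.sum(axis=0).sum(axis=0)   # delete X1, then X2 -> over {X3}
b = sigma.sum(axis=1).sum(axis=0)   # delete X2, then X1 -> over {X3}
assert np.allclose(a, b)

# Axiom 2: combination is pointwise multiplication, so commutativity
# and associativity are inherited from ordinary arithmetic.

# Axiom 3: distributivity of marginalization over combination.
# rho is over r = {X1}, tau is over s = {X1, X2}; X = X2 is in s, not in r.
rho = rng.random(2)
tau = rng.random((2, 3))
lhs = (rho[:, None] * tau).sum(axis=1)   # (rho x tau)↓((r∪s) – {X2})
rhs = rho * tau.sum(axis=1)              # rho x (tau↓(s – {X2}))
assert np.allclose(lhs, rhs)
```

Axiom 3 is what makes local computation pay off: the sum over X2 can be pushed inside, past any factor whose domain does not contain X2.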
Lauritzen-Spiegelhalter Architecture – Junction Trees • First construct a junction tree for the BN • A join tree where each node is a clique • Associate each potential Kv with the smallest clique that contains {V} ∪ Pa(V). If a clique is assigned more than one potential, associate the product (combination) of those potentials with the clique • Evidence is modeled as potentials and associated with the smallest clique that includes the domain of the potential • Pick a node with the largest state space in the junction tree to be the root
[Figure: LS junction tree for the Chest Clinic problem, with cliques {A,T}, {S,L,B}, {T,L,E}, {L,E,B}, {E,X}, and {E,B,D}]
Lauritzen-Spiegelhalter Architecture - Calculating Marginals • Two phases: inward pass, outward pass • Involves sending messages, which are potentials, to neighboring nodes • Inward pass • Each node sends a message to its inward neighbor after it receives messages from all of its outward neighbors. If it has no outward neighbors (a leaf), it sends immediately • When sending, the message is computed by marginalizing the current potential to the intersection with the inward neighbor. The message is sent to the inward neighbor, and the current potential is divided by the message • When receiving a message from an outward neighbor, the current potential is multiplied by the message • The inward pass ends when the root has received a message from all of its outward neighbors
[Figure: Lauritzen-Spiegelhalter – Inward Pass. Before: clique cj holds Xj, clique ci holds Xi′. After: cj holds Xj′ = Xj × Xi′↓(ci ∩ cj); ci holds Xi″ = Xi′ / Xi′↓(ci ∩ cj)]
Lauritzen-Spiegelhalter Architecture - Calculating Marginals • Outward pass • Each node sends messages to its outward neighbors after it receives the message from its inward neighbor. The root, which has no inward neighbor, initiates the pass • When sending, the message is computed by marginalizing the current potential to the intersection with the outward neighbor. The message is sent to the outward neighbor • When receiving the message from the inward neighbor, the current potential is multiplied by the message • The outward pass ends when all leaves have received messages from their inward neighbors • Final Step • At the end of the outward pass each clique is associated with a potential representing the marginal of the posterior for that clique • To compute the marginal of the posterior for each variable in the BN, find the smallest-domain clique containing the variable and marginalize it to that variable
[Figure: Lauritzen-Spiegelhalter – Outward Pass. Before: clique cj holds Xj‴, clique ci holds Xi″. After: cj still holds Xj‴; ci holds Xi‴ = Xi″ × Xj‴↓(ci ∩ cj)]
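The two LS passes can be sketched on a minimal two-clique junction tree over variables (Y, Z): a leaf clique ci = {Y, Z} and a root cj = {Z}, with potentials as numpy arrays (axis 0 is Y, axis 1 is Z). The numbers are illustrative placeholders.

```python
import numpy as np

Xi = np.array([[0.2, 0.3], [0.1, 0.4]])   # potential on clique ci = {Y, Z}
Xj = np.array([0.6, 0.4])                 # potential on root clique cj = {Z}

# Inward pass: ci marginalizes its potential to the separator {Z},
# sends that as the message, and divides its own potential by it.
msg = Xi.sum(axis=0)       # marginalize Y out -> potential over {Z}
Xi = Xi / msg              # LS-specific: sender divides by the message
Xj = Xj * msg              # root multiplies the message in

# Outward pass: the root sends its updated potential, marginalized to
# the separator (here cj's whole domain), and ci multiplies it in.
Xi = Xi * Xj               # broadcast multiply over the Z axis

# Each clique now holds the (unnormalized) marginal of the joint for
# its domain; the marginal over {Z} can be read off either clique.
print(Xj)
print(Xi.sum(axis=0))      # agrees with Xj
```

The division on the inward pass is exactly what prevents the message from being counted twice when it comes back multiplied into Xj on the outward pass.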
Hugin Architecture – Junction Trees + Separators • Similar to the LS method but with a computational enhancement: separators • Construct the junction tree as in LS but introduce a separator node between each pair of adjacent cliques. The domain of a separator is the intersection of the two cliques • The separator stores the potential over the intersection of the two cliques • Pick any clique node to be the root
[Figure: Hugin junction tree for the Chest Clinic problem, with cliques {A,T}, {S,L,B}, {T,L,E}, {L,E,B}, {E,X}, {E,B,D} and separators {T}, {L,B}, {L,E}, {E,B}, {E} between adjacent cliques]
Hugin Architecture - Calculating Marginals • Two phases: inward pass, outward pass • Involves sending messages, which are potentials, to neighboring nodes • Inward Pass • Same as LS, but the sender does not divide its current potential by the message. Instead the message is saved in the separator • Outward Pass • The separator divides the incoming message by its saved potential, and the result is multiplied into the potential of the receiving node • Final Step • At the end of the outward pass each clique and separator is associated with a potential representing the marginal of the posterior for the domain of that node • To compute the marginal of the posterior for each variable in the BN, first look for the smallest-domain separator containing the variable and marginalize it to that variable. If no such separator exists, proceed as in LS: find the smallest-domain clique containing the variable and marginalize
[Figure: Hugin – Inward Pass. Before: clique cj holds Xj, clique ci holds Xi′. After: ci holds Xi″ = Xi′ (sender unchanged); the separator stores Xi″↓(ci ∩ cj); cj holds Xj′ = Xj × Xi″↓(ci ∩ cj)]
[Figure: Hugin – Outward Pass. Before: clique cj holds Xj‴, the separator holds Xi″↓(ci ∩ cj), clique ci holds Xi″. After: the separator holds Xj‴↓(ci ∩ cj) / Xi″↓(ci ∩ cj); ci holds Xi‴ = Xi″ × (Xj‴↓(ci ∩ cj) / Xi″↓(ci ∩ cj))]
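The Hugin variant can be sketched on a minimal two-clique tree over variables (Y, Z): clique ci = {Y, Z}, root cj = {Z}, and a separator over {Z} between them. The numbers are illustrative placeholders; the point is where the division happens.

```python
import numpy as np

Xi = np.array([[0.2, 0.3], [0.1, 0.4]])   # clique ci = {Y, Z}
Xj = np.array([0.6, 0.4])                 # root clique cj = {Z}

# Inward pass: the message is stored in the separator;
# the sender's potential is left unchanged (no division here).
sep = Xi.sum(axis=0)       # marginalize Y out -> separator potential
Xj = Xj * sep              # root multiplies the message in

# Outward pass: the root's message (its potential marginalized to the
# separator - here cj's whole domain) is divided by the stored
# separator potential before being multiplied into ci.
out = Xj
Xi = Xi * (out / sep)      # broadcast multiply over the Z axis
sep = out                  # separator now holds the updated potential

print(sep)                 # marginal potential over {Z}, read directly
```

Compared with LS, the division moved from the sending clique (on the inward pass) to the separator (on the outward pass); afterwards the separator itself holds a marginal, which is what makes single-variable queries cheaper.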
Shenoy-Shafer Architecture – Binary Join Trees • Setup: • First, arrange the elements of the hypergraph (generated from the domains of the potentials) into a binary join tree • Binary join tree = a join tree in which no node has more than 3 neighbors • All combinations are done in pairs, i.e. functions are combined 2 at a time • All singleton subsets appear in the binary join tree (each attached at the node with the smallest subset containing that variable) • Associate each potential with a node whose subset corresponds to its domain (see figure)
Shenoy-Shafer Architecture – Calculating Marginals • Each node that is to compute a marginal requests a message from each of its neighbors • A node, r, receiving a message request from a neighbor, s, will in turn request messages from its other neighbors • Upon receiving messages from its other neighbors, r combines all the messages it receives with its own potential, then marginalizes the result to r∩s (note: leaves reply right away) • Message from r to s, formally: μr→s = (×{μt→r | t ∈ N(r) \ {s}} × αr)↓(r∩s) • μr→s : message from r to s • N(r) : neighbors of r • αr : probability potential associated with node r • When the marginal-computing node receives all replies, it computes the marginal • It combines all incoming messages together with its own probability potential and reports the result as its marginal • φ↓r = ×{μt→r | t ∈ N(r)} × αr (φ denotes the joint potential)
[Figure: SS message passing between neighboring nodes r and s, with input potentials αr and αs, messages μr→s and μs→r, and computed marginals φ↓r and φ↓s]
Important SS Architecture Storage Differences • No division operations • Input potentials remain unchanged during the propagation process • The marginal of the joint probability for a variable is computed at the corresponding singleton variable node of the binary join tree
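The request/reply scheme can be sketched as a recursive message function on a minimal two-node join tree over variables (Y, Z): node ci over {Y, Z} and node cj over {Z}. Node names, the `message` helper, and the numbers are all illustrative; note that the input potentials are never overwritten and no division occurs.

```python
import numpy as np

potentials = {
    "ci": np.array([[0.2, 0.3], [0.1, 0.4]]),  # node ci over {Y, Z}
    "cj": np.array([0.6, 0.4]),                # node cj over {Z}
}
neighbors = {"ci": ["cj"], "cj": ["ci"]}

def message(r, s):
    """Message μ_{r→s}: combine r's input potential with the messages
    from r's other neighbors, then marginalize to the separator r∩s."""
    phi = potentials[r]
    for t in neighbors[r]:
        if t != s:                 # request messages from other neighbors
            phi = phi * message(t, r)
    # the only separator here is {Z}: sum Y (axis 0) out if present
    return phi.sum(axis=0) if phi.ndim == 2 else phi

# Marginal at cj: its own potential combined with all incoming messages.
marg_Z = potentials["cj"] * message("ci", "cj")
print(marg_Z)
```

Because `phi = phi * ...` always builds new arrays, the entries of `potentials` are untouched after propagation, which is the storage/flexibility difference the slide highlights.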
Comparing LS and Hugin • Hugin is computationally more efficient than LS • Hugin has fewer additions and divisions (they are equal on multiplications) • Computation of marginals is always done from the separator, which exploits smaller domain sizes • Marginals of single variables: • in LS – find the clique with the smallest domain • in Hugin – search through separators as well as cliques • Example: using Figure 2, find P(T) and P(X) under each architecture • LS is more storage efficient than Hugin • LS does not store separator potentials • In the interest of computational efficiency, the comparison of Hugin and SS will be explored
Comparing Hugin and SS • SS is more computationally efficient than Hugin • SS has no divisions (and, on average, fewer computations) • The efficiency advantage of SS increases with larger state spaces • Calculating the probability of singleton variables • adds expense in Hugin • is always calculated and available in SS • SS is more flexible than Hugin (as well as LS) • due to the lack of division • CONJECTURE: Hugin is more storage efficient than SS • To be investigated • SS has a larger data structure, hence most likely uses more space