Probabilistic networks Inference and Other Problems

Hans L. Bodlaender, Utrecht University

Overview
• Probabilistic networks
• The inference problem
• Tree decompositions and an algorithm for probabilistic inference
• The Maximum Probable Assignment problem
• Monotonicity
Probabilistic networks - IPA fall days

Decision support systems: reasoning with uncertainty
• Decision support systems (and/or expert systems)
• Reasoning with uncertainty
• Set of stochastic variables: observations, other variables, variable(s) of interest
• In the 1980s the probabilistic network model was proposed.
• Also called: Bayesian networks, belief networks, graphical models

Probabilistic networks
• Directed acyclic graph
• Each node is a (discrete) stochastic variable
• E.g. Boolean variable
• Given for each variable is its conditional probability distribution, conditional on the values of the parents of the node
[Figure: example network on nodes x1, x2, x3, x4, x5, … with probability tables]
Pr(x1) = 0.7, Pr(¬x1) = 0.3
Pr(x5|x3) = 0.6, Pr(¬x5|x3) = 0.4, Pr(x5|¬x3) = 0.2, Pr(¬x5|¬x3) = 0.8
Pr(x4|x2 and x3) = 0.12, etc.


Configuration
• A configuration c is an assignment of a value to each variable (node).
• For a set W of variables, or a variable v, and a configuration c, denote by c_W and c_v the restrictions (partial configurations).
• Probability of configuration c: Pr(c) = ∏_v Pr(c_v | c_parents(v)), the product over all variables v of the conditional probability of the value of v given the values of its parents.
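In a probabilistic network the probability of a configuration is the product, over all variables, of the conditional probability of each value given the values of the parents. A minimal sketch follows, assuming a hypothetical three-node chain x1 → x3 → x5 of Boolean variables; the arc x1 → x3 and the table for x3 are illustrative assumptions, not taken from the slides.

```python
from itertools import product

# Hypothetical toy network (assumption): x1 -> x3 -> x5, all Boolean.
parents = {"x1": [], "x3": ["x1"], "x5": ["x3"]}

# cpt[v][values of parents] = Pr(v = True | parents).
# The x3 entries are made-up values for illustration.
cpt = {
    "x1": {(): 0.7},
    "x3": {(True,): 0.9, (False,): 0.5},
    "x5": {(True,): 0.6, (False,): 0.2},
}


def configuration_probability(config):
    """Multiply, for every variable, Pr(value | values of its parents)."""
    prob = 1.0
    for v, pars in parents.items():
        p_true = cpt[v][tuple(config[u] for u in pars)]
        prob *= p_true if config[v] else 1.0 - p_true
    return prob


all_true = {"x1": True, "x3": True, "x5": True}
p_all_true = configuration_probability(all_true)  # 0.7 * 0.9 * 0.6

# The probabilities of all configurations sum to 1.
total = sum(
    configuration_probability(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
```

Storing only Pr(v = True | parents) is enough for Boolean variables; a non-binary variable would need one entry per value.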

Topological sort of a directed acyclic graph
• Order of vertices such that edges go from left to right:
• List vertices v1, …, vn such that for each arc (vi, vj): i < j.
• Always exists for a DAG, and can be found in O(|V|+|E|) time.
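One standard way to obtain such an order in O(|V|+|E|) time is Kahn's algorithm: repeatedly output a vertex without remaining incoming arcs. The sketch below is one possible implementation; the function name and the example arcs are assumptions chosen for illustration.

```python
from collections import deque


def topological_sort(vertices, arcs):
    """Return the vertices in an order where every arc goes left to right."""
    successors = {v: [] for v in vertices}
    indegree = {v: 0 for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1

    # Vertices without incoming arcs can be placed first.
    queue = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    if len(order) != len(vertices):
        raise ValueError("graph contains a directed cycle")
    return order


# Hypothetical example arcs (assumption, for illustration only).
vertices = ["x5", "x4", "x3", "x2", "x1"]
arcs = [("x1", "x2"), ("x1", "x3"), ("x2", "x4"), ("x3", "x4"), ("x3", "x5")]
order = topological_sort(vertices, arcs)
```

Each vertex enters the queue once and each arc is examined once, which gives the linear running time.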

Generating a random configuration
Make a topological sort of G
For i = 1 to n do
  generate a value for vi using the probabilities dictated by the values already generated for the parents of vi
[Figure: example network on nodes x1, x2, x3, x4, x5, … with probability tables]
Pr(v1) = 0.7, Pr(¬v1) = 0.3
Pr(v2|v1) = 0.3, Pr(¬v2|v1) = 0.7, Pr(v2|¬v1) = 0.4, Pr(¬v2|¬v1) = 0.6
Pr(v5|v3) = 0.6, Pr(¬v5|v3) = 0.4, Pr(v5|¬v3) = 0.2, Pr(¬v5|¬v3) = 0.8
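The sampling loop can be sketched as follows. The tables for v1, v2 and v5 are the ones shown on the slide; the parent of v3 and its table are assumptions added so that the example is complete, and the topological order is written out by hand.

```python
import random

# Parents of each variable. The arc v1 -> v3 is an assumption.
parents = {"v1": [], "v2": ["v1"], "v3": ["v1"], "v5": ["v3"]}

# cpt[v][values of parents] = Pr(v = True | parents).
cpt = {
    "v1": {(): 0.7},
    "v2": {(True,): 0.3, (False,): 0.4},
    "v3": {(True,): 0.5, (False,): 0.5},  # made-up values (assumption)
    "v5": {(True,): 0.6, (False,): 0.2},
}

# A topological order of the network, written out by hand.
order = ["v1", "v2", "v3", "v5"]


def sample_configuration(rng):
    """Draw each variable in topological order, given its parents' values."""
    config = {}
    for v in order:
        p_true = cpt[v][tuple(config[u] for u in parents[v])]
        config[v] = rng.random() < p_true
    return config


rng = random.Random(0)
samples = [sample_configuration(rng) for _ in range(20000)]
fraction_v1_true = sum(s["v1"] for s in samples) / len(samples)
```

Because parents always precede their children in the order, the table lookup never refers to a value that has not been generated yet.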

Inference problem
• Given: values for some variables (observations) c_O
• Question: probability distribution of one variable conditional on the observations, or: Pr(v = x | c_O) for each value x of the variable of interest v
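For very small networks the inference problem can be solved by brute force: sum the probabilities of all configurations consistent with the observations, grouped by the value of the variable of interest, and normalise. The sketch below does this for a hypothetical chain x1 → x3 → x5; the table for x3 is an assumption. It takes time exponential in the number of variables, so it only illustrates the problem statement and is not the tree decomposition algorithm.

```python
from itertools import product

# Hypothetical toy network (assumption): x1 -> x3 -> x5, all Boolean.
parents = {"x1": [], "x3": ["x1"], "x5": ["x3"]}
cpt = {
    "x1": {(): 0.7},
    "x3": {(True,): 0.9, (False,): 0.5},  # made-up values (assumption)
    "x5": {(True,): 0.6, (False,): 0.2},
}
variables = list(parents)


def joint(config):
    """Probability of a full configuration: product of table entries."""
    prob = 1.0
    for v in variables:
        p_true = cpt[v][tuple(config[u] for u in parents[v])]
        prob *= p_true if config[v] else 1.0 - p_true
    return prob


def posterior(query, observations):
    """Distribution of `query` given `observations`, by enumeration."""
    weight = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(variables)):
        config = dict(zip(variables, values))
        if all(config[v] == val for v, val in observations.items()):
            weight[config[query]] += joint(config)
    total = weight[True] + weight[False]
    return {value: w / total for value, w in weight.items()}


prior_x3 = posterior("x3", {})                  # no observations
x1_given_x5 = posterior("x1", {"x5": True})     # reasoning against the arcs
```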

Use of inference problem
• Network models information from an application domain (medical, agricultural, weather forecasting, …)
• User gives values for some variables (symptoms of a patient, observed values) and wants to know the distribution for other variables (likelihood of success of a treatment, diagnosis)
• Used nowadays in many applications


Inference problem is #P-complete
• #P-completeness implies NP-hardness.
• Proof of #P-hardness:
• Counting the satisfying truth assignments of a 3CNF formula is #P-complete
• E.g.: (x1 or x2 or ¬x4) and (x5 or ¬x1 or ¬x3) and …
• Transform to a probabilistic network

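Example transformation
(x1 or x2 or ¬x4) and …
[Figure: network built from the formula, with variable nodes x1, x2, x3, x4, x5, x6 feeding clause nodes, which feed conjunction nodes]
• Variable nodes: Pr(xi) = 0.5, Pr(¬xi) = 0.5
• Clause nodes: T with probability 1 when satisfied; otherwise 0
• Conjunction nodes: T with probability 1 when both parents are true
• Probability equals #sat / 2^n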
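The claim that the probability equals #sat / 2^n can be checked by hand on a small formula. The sketch below assumes a formula consisting of only the two clauses that are written out on the slides, over n = 5 variables, and encodes a literal as a signed integer (i for xi, -i for ¬xi); both choices are assumptions for illustration. It enumerates the uniformly distributed variable nodes and evaluates the deterministic clause and conjunction nodes.

```python
from fractions import Fraction
from itertools import product


def probability_formula_true(clauses, n):
    """Pr(final conjunction node = T) in the network built from the formula."""
    weight = Fraction(1, 2) ** n  # each assignment of x1..xn has this probability
    total = Fraction(0)
    for values in product([False, True], repeat=n):
        # A clause node is T with probability 1 exactly when its clause holds.
        clause_nodes = [
            any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ]
        # The conjunction nodes are T exactly when all clause nodes are T.
        if all(clause_nodes):
            total += weight
    return total


# (x1 or x2 or not x4) and (x5 or not x1 or not x3); truncated formula (assumption).
clauses = [(1, 2, -4), (5, -1, -3)]
n = 5
probability = probability_formula_true(clauses, n)
number_of_satisfying_assignments = probability * 2**n
```

Using `Fraction` keeps the arithmetic exact, so multiplying the probability by 2^n recovers the count of satisfying assignments as an integer.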

Using tree decompositions to solve the problem
• Fast when the width of the tree decomposition is small
• Tree decomposition of the moralisation of G: tree with each node a bag (a set of variables)
• For all v: the bags containing v form a connected subtree
• For all v: there is a bag containing v and its parents (the bag covers v)
• Lauritzen-Spiegelhalter algorithm
[Figure: network on x1, x2, x3, x4, x5 and a tree decomposition with bags {x1, x2, x3}, {x3, x2, x4}, {x4, x5}]
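The two conditions on the bags can be checked mechanically. The sketch below is a hypothetical checker: it takes the parent sets of the network, the bags, and the edges of the tree, and assumes (without verifying it) that those edges really form a tree. The arcs of the example network are an assumption chosen so that the three bags from the figure are valid.

```python
def is_valid_decomposition(parents, bags, tree_edges):
    """Check the two bag conditions for a network given by its parent sets."""
    # Condition: for every v some bag contains v together with its parents.
    for v, pars in parents.items():
        family = {v, *pars}
        if not any(family <= bag for bag in bags.values()):
            return False

    neighbours = {i: set() for i in bags}
    for a, b in tree_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Condition: for every v the bags containing v form a connected subtree.
    for v in parents:
        holding = {i for i, bag in bags.items() if v in bag}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in neighbours[i] & holding:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != holding:
            return False
    return True


# Hypothetical arcs (assumption): x1->x2, x1->x3, x2->x4, x3->x4, x4->x5.
parents = {"x1": [], "x2": ["x1"], "x3": ["x1"], "x4": ["x2", "x3"], "x5": ["x4"]}
bags = {1: {"x1", "x2", "x3"}, 2: {"x3", "x2", "x4"}, 3: {"x4", "x5"}}
tree_edges = [(1, 2), (2, 3)]
valid = is_valid_decomposition(parents, bags, tree_edges)
```

The largest bag here has three variables, so a table for one bag has at most 2^3 entries for Boolean variables.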

LS algorithm
• Here: description without observations.
• Take a node with the variable of interest in its bag as root.
• Compute for each node i of T, with bag X, a table.
• For each assignment c_X of values to the variables in X, with Y the variables covered in the subtree with root i, compute v_i(c_X): [Formula; figure labels: X, Y, "extends"]

Computing tables bottom up
• A table for a node can be computed when the tables of the children are known
• E.g., compute tables in postorder (bottom up)

Example: node with two children with all bags identical
[Figure: node i with bag X and two children j1 and j2, both also with bag X; labels Y1 and Y2 below j1 and j2]

LS algorithm
• For other types of nodes, the table can be computed similarly.
• Time for one table is linear in the size of the table: bag size k gives time O(2^k) for binary variables.
• Linear time when the bag size is bounded by a constant (bounded treewidth). Happens often in practice!
• The table of the root allows us to compute the distribution for the variables in the root bag
• Similar scheme when observations are given, and when variables are discrete but not all binary
• A scheme that also moves downwards in the tree computes the distribution for all variables: also linear time for bounded treewidth

MAP problem
• Given: probabilistic network, some observations
• Question: most likely configuration given the observations
• Applications: most likely explanation, verification of the design of probabilistic networks

MAP is NP-hard
• Shimony, 1994
(x1 or x2 or ¬x4) and …
[Figure: network built from the formula, on variable nodes x1, x2, x3, x4, x5, x6]
• Pr(xi) = 0.5, Pr(¬xi) = 0.5
• T: probability 1 when satisfied or Y is T, otherwise 1/2

MAP with tree decompositions
• A similar algorithm as for inference can solve MAP in linear time when a tree decomposition of the moralisation with bounded bag size (treewidth) is given
• Compute for c_X: [Formula; figure labels: X, Y]

Fixed parameter variant of MAP
• MAP(p):
• Given: probabilistic network
• Question: is there a configuration with probability at least p?
• Can be solved in O(f(p) n) time, i.e., linear for fixed p.
• Joint work with van der Gaag and van den Eijkhof.
• Similar result when there are observations (values for some variables), and we look for a configuration consistent with the observations

Algorithm uses branch and bound
• Look at variables in order of a topological sort
• Recursive process: branch for the assignment of a value to the next variable
• Plus … bounding mechanism
[Figure: branch and bound tree. The root ("Start here") branches into v1=T and v1=F; these branch further on the value of v2, then v3, …]

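Bounding
• Recall: the probability of a configuration is the product, over all variables, of the conditional probability of each value given the values of the parents
• Parents of v are before v in the topological sort
• Compute for a node z in the branch and bound tree, with its assigned values, the probability P(z) of the values assigned so far
• P(z) can be computed from P(parent(z)) and the choice for the ith variable
• Bound when P(z) < p: this can never be a solution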

Recursive scheme
• E-MPA-p(values for first i variables, p, pz)
• If i = n (we have done all variables), then return true (output the sequence); stop.
• Else: for each possible value x for v_{i+1}:
• Compute pznew = pz * Pr(v_{i+1} = x | values for first i vertices)
• If pznew ≥ p, then E-MPA-p(values for first i variables and then x, p, pznew)
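The recursive scheme can be sketched in a few lines for Boolean variables. The function name, the data layout and the toy network x1 → x3 → x5 (including the table for x3) are assumptions made for illustration; the variables must be supplied in topological order so that the parents of the next variable already have values.

```python
def find_configuration(order, parents, cpt, p, assigned=None, pz=1.0):
    """Return a configuration with probability >= p, or None if none exists."""
    if assigned is None:
        assigned = {}
    i = len(assigned)
    if i == len(order):
        return dict(assigned)  # all variables done: output the sequence

    v = order[i]
    p_true = cpt[v][tuple(assigned[u] for u in parents[v])]
    for value, prob in ((True, p_true), (False, 1.0 - p_true)):
        pz_new = pz * prob
        if pz_new >= p:  # otherwise bound: this branch can never reach p
            assigned[v] = value
            result = find_configuration(order, parents, cpt, p, assigned, pz_new)
            del assigned[v]
            if result is not None:
                return result
    return None


# Hypothetical toy network (assumption): x1 -> x3 -> x5, all Boolean.
parents = {"x1": [], "x3": ["x1"], "x5": ["x3"]}
cpt = {
    "x1": {(): 0.7},
    "x3": {(True,): 0.9, (False,): 0.5},  # made-up values (assumption)
    "x5": {(True,): 0.6, (False,): 0.2},
}
order = ["x1", "x3", "x5"]

found = find_configuration(order, parents, cpt, p=0.3)
not_found = find_configuration(order, parents, cpt, p=0.4)
```

In this toy network the most probable configuration has probability 0.378, so the search succeeds for p = 0.3 and fails for p = 0.4.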

Time analysis
• If a node has at least two children in the tree, then
• For each child, pznew ≥ p, hence Pr(v_{i+1} = x | values for first i vertices) ≥ p
• Hence, for a child with value x and another child with value x': pznew ≤ pz * (1 - Pr(v_{i+1} = x' | values for first i vertices)) ≤ pz * (1-p)
• After a node in the tree with two children, the value of pz has dropped to at most (1-p) times its previous value
• Hence a root-to-leaf path has at most log p / log(1-p) nodes with two children, so the number of leaves is bounded by a function of p alone. (How often can you multiply 1 by 1-p till you are smaller than p?)
• Time is O(f(p) * n).
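The question of how often a value starting at 1 can shrink by a factor (1-p) before it drops below p can be answered numerically: the largest k with (1-p)^k ≥ p is the integer part of log p / log(1-p). A small sketch, with a hypothetical function name, assuming 0 < p < 1:

```python
import math


def max_branching_nodes_on_path(p):
    """Largest k such that (1 - p)**k >= p, for 0 < p < 1."""
    return math.floor(math.log(p) / math.log(1.0 - p))


k_for_tenth = max_branching_nodes_on_path(0.1)
k_for_quarter = max_branching_nodes_on_path(0.25)
```

For p = 0.1 this gives 21, since 0.9^21 is still above 0.1 while 0.9^22 is below it. The bound depends on p only, not on the number of variables n.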

Partial MAP
• Variant of MAP where we ask for values for a subset of the variables with maximum probability, given some observations
• Park: NP^PP-complete, and NP-complete when G is a polytree (underlying undirected graph is a tree)

Monotonicity
• Joint work with Linda van der Gaag and Ad Feelders
• Monotonicity is often a requested property of a probabilistic network
• E.g.: if a patient has more severe symptoms, one expects the diagnosis to be more severe
• Ordering on the values of variables
• c_X ≤ c'_X if for all x in X: c_X(x) ≤ c'_X(x)
• Two observations that are ordered should imply an ordering of the probabilities of the values for the variable of interest (formal definition follows).

Monotonicity in mode
• Let z be the output variable.
• The mode of z given values c_X for some other variables X: T(z | c_X) is the value for z such that Pr(z | c_X) is maximal. (+ tie-breaking rule)
• Take an ordering on the values of each variable.
• The probabilistic network with observable nodes X and output variable z is isotone when for each pair of value assignments c_X, c'_X to X, one has:
• c_X ≤ c'_X implies T(z | c_X) ≤ T(z | c'_X)
• Antitone: c_X ≤ c'_X implies T(z | c_X) ≥ T(z | c'_X)
• Monotone: isotone or antitone
• Monotone in distribution: similar, but looking at the cumulative distribution. Identical to monotonicity in mode when all variables are binary.
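For a handful of binary observable variables, isotonicity in mode can be tested directly from the definition by comparing all ordered pairs of assignments. The sketch below assumes the simplest possible setting: the observable variables a and b are the parents of z, so Pr(z = T | c_X) is read straight from a table. The ordering F < T, the tie-breaking rule (ties go to T) and both tables are assumptions for illustration. The check takes time exponential in the number of observable variables.

```python
from itertools import product


def mode(pr_z_true):
    """Mode of a Boolean z; a tie at 0.5 is broken in favour of True (assumption)."""
    return pr_z_true >= 0.5


def is_isotone_in_mode(table, observables):
    """Check c <= c' implies mode(z | c) <= mode(z | c'), with False < True."""
    assignments = list(product([False, True], repeat=len(observables)))
    for c in assignments:
        for d in assignments:
            ordered = all(cv <= dv for cv, dv in zip(c, d))
            if ordered and mode(table[c]) > mode(table[d]):
                return False
    return True


observables = ["a", "b"]

# table[(a, b)] = Pr(z = True | a, b); made-up values (assumption).
isotone_table = {
    (False, False): 0.1, (False, True): 0.4,
    (True, False): 0.6, (True, True): 0.9,
}
violating_table = {
    (False, False): 0.7, (False, True): 0.4,
    (True, False): 0.6, (True, True): 0.9,
}

first_is_isotone = is_isotone_in_mode(isotone_table, observables)
second_is_isotone = is_isotone_in_mode(violating_table, observables)
```

In the second table the mode is T for (F, F) but F for the larger assignment (F, T), which violates isotonicity.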

Results
• Testing if a network is monotone (isotone, antitone) in mode (in distribution) is:
• coNP^PP-complete
• coNP-complete for polytrees

Hardness proof (sketch)
• Transformation from a variant of the Partial MAP problem: can we set values for M such that Pr(E=T | c_M) > p?
[Figure: G, the instance of Partial MAP, containing M and E, extended with nodes A, B, C]
• Pr(A=T | E=T) = 1
• Pr(A=T | E=F) = (1/2 - p)/(1-p)
• Pr(C=T | A, B) = 1 if A and B are F, otherwise 0
• M ∪ {B}: set of observable variables; C: variable of interest
• The proof shows that the new network is monotone in mode, and monotone in distribution, if and only if there is a c_M with Pr(E=T | c_M) > p

Conclusions
• Probabilistic (belief, Bayesian) networks form a mathematically precise model
• Used in several decision support systems
• Use and design of networks pose interesting challenges, many of them algorithmic
• Sometimes special structures help (tree decompositions), also in practice
