
Artificial Intelligence: Probabilistic reasoning


Presentation Transcript


  1. Artificial Intelligence: Probabilistic reasoning. Fall 2008. Professor: Luigi Ceccaroni

  2. Bayesian networks
  • A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions.
  • Syntax:
    • a set of nodes, one per variable
    • a directed, acyclic graph (links ≈ "directly influences")
    • a conditional distribution for each node given its parents: P(Xi | Parents(Xi))
  • In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values.
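As a concrete illustration of this syntax, here is a minimal sketch of a two-node fragment of the Toothache/Cavity example from the next slide, encoded as plain Python dictionaries. The layout is my own, not a standard API, and the probability values are invented purely for illustration:

```python
# A minimal sketch of the syntax above: each node stores its parents and a
# CPT mapping each combination of parent values to a distribution over the
# node's own values. NOTE: the numbers here are invented for illustration.
network = {
    "Cavity": {
        "parents": (),
        "cpt": {(): {True: 0.2, False: 0.8}},  # no parents: a prior
    },
    "Toothache": {
        "parents": ("Cavity",),
        "cpt": {
            (True,):  {True: 0.6, False: 0.4},  # P(Toothache | cavity)
            (False,): {True: 0.1, False: 0.9},  # P(Toothache | ¬cavity)
        },
    },
}
```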

  3. Example
  • The topology of the network encodes conditional independence assertions:
    • Weather is independent of the other variables.
    • Toothache and Catch are conditionally independent given Cavity.

  4. Example
  • What is the probability of having a heart attack?
  • This probability depends on four variables:
    • Sport
    • Diet
    • Blood pressure
    • Smoking
  • Knowing the dependencies among these variables lets us build a Bayesian network.

  5. Constructing Bayesian networks
  1. Choose an ordering of variables X1, …, Xn.
  2. For i = 1 to n:
    • add Xi to the network
    • select parents from X1, …, Xi-1 such that P(Xi | Parents(Xi)) = P(Xi | X1, …, Xi-1)
  This choice of parents guarantees:
  P(X1, …, Xn) = ∏i=1..n P(Xi | X1, …, Xi-1)    (chain rule)
               = ∏i=1..n P(Xi | Parents(Xi))    (by construction)

  6. Example
  Network structure: Sport and Diet are parents of Blood pressure; Blood pressure and Smoking are parents of Heart attack.
  Priors:
    P(Sp = yes) = 0.1    P(Sp = no) = 0.9
    P(Sm = yes) = 0.4    P(Sm = no) = 0.6
    P(Di = balanced) = 0.4    P(Di = unbalanced) = 0.6
  CPT for Blood pressure given Sport and Diet:
    Sport  Diet        P(Bp = high)  P(Bp = normal)
    yes    balanced    0.01          0.99
    yes    unbalanced  0.2           0.8
    no     balanced    0.25          0.75
    no     unbalanced  0.7           0.3
  CPT for Heart attack given Blood pressure and Smoking:
    Bp      Sm   P(Ha = yes)  P(Ha = no)
    high    yes  0.8          0.2
    normal  yes  0.6          0.4
    high    no   0.7          0.3
    normal  no   0.3          0.7
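For later reference, here is one possible transcription of these tables into Python dictionaries. This is a sketch, not part of the original slides; names like P_bp_high are my own:

```python
# The example network's parameters, read off the tables above.
P_sp = {True: 0.1, False: 0.9}               # P(Sport)
P_sm = {True: 0.4, False: 0.6}               # P(Smoking)
P_di = {"balanced": 0.4, "unbalanced": 0.6}  # P(Diet)

# P(Bp = high | Sport, Diet); P(Bp = normal | ...) is the complement.
P_bp_high = {
    (True,  "balanced"):   0.01,
    (True,  "unbalanced"): 0.2,
    (False, "balanced"):   0.25,
    (False, "unbalanced"): 0.7,
}

# P(Ha = yes | Bp, Smoking); P(Ha = no | ...) is the complement.
P_ha_yes = {
    ("high",   True):  0.8,
    ("normal", True):  0.6,
    ("high",   False): 0.7,
    ("normal", False): 0.3,
}
```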

  7. Compactness
  • A CPT for a Boolean variable Xi with k Boolean parents has 2^k rows, one for each combination of parent values.
  • Each row requires one number p for Xi = true (the number for Xi = false is just 1 - p).
  • If each of the n variables has no more than k parents (k << n), the complete network requires O(n · 2^k) numbers.

  8. Representation cost
  • The network grows linearly with n, vs. O(2^n) for the explicit full joint distribution.
  • Examples:
    • With 10 variables and at most 3 parents: 80 vs. 1024
    • With 100 variables and at most 5 parents: 3200 vs. ~10^30
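Both comparisons can be checked with two lines of arithmetic (a throwaway check, not from the slides):

```python
# n * 2**k CPT entries for the network vs. 2**n rows for the explicit joint.
for n, k in [(10, 3), (100, 5)]:
    print(n * 2**k, "vs.", 2**n)
# 80 vs. 1024
# 3200 vs. 1267650600228229401496703205376 (~10**30)
```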

  9. Semantics
  The full joint distribution is defined as the product of the local conditional distributions:
  P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi))
  Example:
  P(sp ∧ Di = balanced ∧ Bp = high ∧ ¬sm ∧ ¬ha) =
  = P(sp) P(Di = balanced) P(Bp = high | sp, Di = balanced) P(¬sm) P(¬ha | Bp = high, ¬sm)

  10. Bayesian networks, joint distribution: example
  P(ha ∧ Bp = high ∧ sm ∧ sp ∧ Di = balanced) =
  = P(ha | Bp = high, sm) P(Bp = high | sp, Di = balanced) P(sm) P(sp) P(Di = balanced) =
  = 0.8 × 0.01 × 0.4 × 0.1 × 0.4 = 0.000128
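The same product, written out with each CPT entry labeled, makes a quick check of the slide's arithmetic:

```python
# Each factor below is a single entry from the slide-6 tables.
p = (0.8     # P(ha | Bp = high, sm)
     * 0.01  # P(Bp = high | sp, Di = balanced)
     * 0.4   # P(sm)
     * 0.1   # P(sp)
     * 0.4)  # P(Di = balanced)
print(p)     # ~0.000128
```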

  11. Exact inference in Bayesian networks: example
  • Inference by enumeration: P(X | e) = α P(X, e) = α Σy P(X, e, y)
  • Let's calculate: P(Smoking | Heart attack = yes, Sport = no)
  • The full joint distribution of the network is:
    P(Sp, Di, Bp, Sm, Ha) = P(Sp) P(Di) P(Bp | Sp, Di) P(Sm) P(Ha | Bp, Sm)
  • We want to calculate: P(Sm | ha, ¬sp).

  12. Exact inference in Bayesian networks: example
  P(Sm | ha, ¬sp) = α P(Sm, ha, ¬sp) =
  = α ΣDi∈{b, ¬b} ΣBp∈{h, n} P(Sm, ha, ¬sp, Di, Bp) =
  = α P(¬sp) P(Sm) ΣDi∈{b, ¬b} P(Di) ΣBp∈{h, n} P(Bp | ¬sp, Di) P(ha | Bp, Sm) =
  = α ⟨0.9 × 0.4 × (0.4 × (0.25 × 0.8 + 0.75 × 0.6) + 0.6 × (0.7 × 0.8 + 0.3 × 0.6)),
      0.9 × 0.6 × (0.4 × (0.25 × 0.7 + 0.75 × 0.3) + 0.6 × (0.7 × 0.7 + 0.3 × 0.3))⟩ =
  = α ⟨0.253, 0.274⟩ = ⟨0.48, 0.52⟩
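A minimal sketch of this enumeration in Python, using the CPT dictionaries transcribed after slide 6 (redefined here so the snippet runs on its own; all names are mine, not the slides'):

```python
# Inference by enumeration for P(Sm | Ha = yes, Sp = no).
P_sp = {True: 0.1, False: 0.9}
P_sm = {True: 0.4, False: 0.6}
P_di = {"balanced": 0.4, "unbalanced": 0.6}
P_bp_high = {(False, "balanced"): 0.25, (False, "unbalanced"): 0.7}  # Sp = no rows
P_ha_yes = {("high", True): 0.8, ("normal", True): 0.6,
            ("high", False): 0.7, ("normal", False): 0.3}

unnormalized = {}
for sm in (True, False):
    total = 0.0
    for di in ("balanced", "unbalanced"):
        for bp in ("high", "normal"):
            p_bp = P_bp_high[(False, di)] if bp == "high" else 1 - P_bp_high[(False, di)]
            total += P_di[di] * p_bp * P_ha_yes[(bp, sm)]
    unnormalized[sm] = P_sp[False] * P_sm[sm] * total  # P(¬sp) P(Sm) Σ...

alpha = 1 / sum(unnormalized.values())
print({sm: round(alpha * v, 2) for sm, v in unnormalized.items()})
# {True: 0.48, False: 0.52}
```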

  13. Variable elimination algorithm
  • The variable elimination algorithm avoids the repeated calculations of inference by enumeration.
  • Each variable is represented by a factor.
  • Intermediate results are saved to be reused later.
  • Irrelevant variables, being constant factors, are not computed directly.

  14. Variable elimination algorithm
  • CALCULA-FACTOR ("compute factor") generates the factor corresponding to variable var in the joint probability distribution function.
  • PRODUCTO-Y-SUMA ("product and sum") multiplies factors and sums over the hidden variable.
  • PRODUCTO ("product") multiplies a set of factors.
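Below is a minimal Python sketch of the two core factor operations named above; this is my own rendering, not the slides' pseudocode. A factor is represented as a pair of a variable tuple and a table from value tuples to numbers:

```python
# Sketch of the factor operations behind PRODUCTO and PRODUCTO-Y-SUMA.
# A factor is (vars, table): vars is a tuple of names, table maps a tuple of
# values (one per var, in order) to a number.

def factor_product(f1, f2):
    """PRODUCTO: pointwise product of two factors on compatible assignments."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
    out = {}
    for a1, p1 in t1.items():
        asg = dict(zip(vars1, a1))
        for a2, p2 in t2.items():
            # keep only row pairs that agree on the shared variables
            if all(asg.get(v, x) == x for v, x in zip(vars2, a2)):
                full = {**asg, **dict(zip(vars2, a2))}
                out[tuple(full[v] for v in out_vars)] = p1 * p2
    return out_vars, out

def sum_out(var, f):
    """The summation half of PRODUCTO-Y-SUMA: marginalize out one variable."""
    vars_, table = f
    i = vars_.index(var)
    out_vars = vars_[:i] + vars_[i + 1:]
    out = {}
    for a, p in table.items():
        key = a[:i] + a[i + 1:]
        out[key] = out.get(key, 0.0) + p
    return out_vars, out
```

The walkthrough on the following slides is exactly repeated applications of these two operations: build the factors fHa and fBp, multiply them, sum out Bp, and so on.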

  15. Variable elimination algorithm - Example
  α P(¬sp) P(Sm) ΣDi∈{b, ¬b} P(Di) ΣBp∈{h, n} P(Bp | ¬sp, Di) P(ha | Bp, Sm)
  • Factor for variable Heart attack, P(ha | Bp, Sm), fHa(Bp, Sm) (read off the Ha CPT):
      Bp      Sm   fHa(Bp, Sm)
      high    yes  0.8
      normal  yes  0.6
      high    no   0.7
      normal  no   0.3

  16. Variable elimination algorithm - Example
  • Factor for variable Blood pressure, P(Bp | ¬sp, Di), fBp(Bp, Di) (the Sp = no rows of the Bp CPT):
      Bp      Di          fBp(Bp, Di)
      high    balanced    0.25
      normal  balanced    0.75
      high    unbalanced  0.7
      normal  unbalanced  0.3
  • To combine the factors just obtained, we compute the product fHa(Bp, Sm) × fBp(Bp, Di) = fHaBp(Bp, Sm, Di).

  17. Variable elimination algorithm - Example
  fHaBp(Bp, Sm, Di) = fHa(Bp, Sm) × fBp(Bp, Di)

  18. Variable elimination algorithm - Example
  • We sum over the values of variable Bp to obtain the factor fHaBp(Sm, Di).
  • Factor for variable Di, fDi(Di): fDi(balanced) = 0.4, fDi(unbalanced) = 0.6.

  19. Variable elimination algorithm - Example
  • fHaDiBp(Sm, Di) = fDi(Di) × fHaBp(Sm, Di)
  • We sum over the values of variable Di to obtain the factor fHaDiBp(Sm).

  20. Variable elimination algorithm - Example
  • Factor for variable Sm, fSm(Sm): fSm(yes) = 0.4, fSm(no) = 0.6.
  • fHaSmDiBp(Sm) = fSm(Sm) × fHaDiBp(Sm)
  • Normalizing, we obtain: P(Sm | ha, ¬sp) = ⟨0.48, 0.52⟩.
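The final normalization step can be checked numerically with the unnormalized values from slide 12, α⟨0.253, 0.274⟩:

```python
# Normalizing the final one-variable factor on Sm (values from slide 12).
f_sm = {"yes": 0.253, "no": 0.274}
alpha = 1 / sum(f_sm.values())
print({k: round(alpha * v, 2) for k, v in f_sm.items()})  # {'yes': 0.48, 'no': 0.52}
```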

  21. Summary
  • Bayesian networks provide a natural representation for (causally induced) conditional independence.
  • Topology + CPTs = a compact representation of the joint distribution.
  • Generally easy for domain experts to construct.
