Probabilistic Reasoning


  1. Probabilistic Reasoning ECE457 Applied Artificial Intelligence Spring 2008 Lecture #9

  2. Outline • Bayesian networks • D-separation and independence • Inference • Russell & Norvig, sections 14.1 to 14.4 ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 2

  3. Recall the Story from FOL • Anyone passing their 457 exam and winning the lottery is happy. Anyone who studies or is lucky can pass all their exams. Bob did not study but is lucky. Anyone who’s lucky can win the lottery. • Is Bob happy?

  4. Add Probabilities • Anyone passing their 457 exam and winning the lottery has a 99% chance of being happy. Anyone only passing their 457 exam has an 80% chance of being happy, someone only winning the lottery has a 60% chance, and someone who does neither has a 20% chance. Anyone who studies has a 90% chance of passing their exams. Anyone who’s lucky has a 50% chance of passing their exams. Anyone who’s both lucky and who studied has a 99% chance of passing, but someone who didn’t study and is unlucky has a 1% chance of passing. There’s a 20% chance that Bob studied, but a 75% chance that he’ll be lucky. Anyone who’s lucky has a 40% chance of winning the lottery, while an unlucky person only has a 1% chance of winning. • What’s the probability of Bob being happy?

  5. Probabilities in the Story • Example of probabilities in the story • P(Lucky) = 0.75 • P(Study) = 0.2 • P(PassExam|Study) = 0.9 • P(PassExam|Lucky) = 0.5 • P(Win|Lucky) = 0.4 • P(Happy|PassExam,Win) = 0.99 • Some variables directly affect others! • Graphical representation of dependencies and conditional independencies between variables?

  6. Bayesian Network [Figure: Bayesian network over Study, Lucky, PassExam, Win, Happy] • Belief network • Directed acyclic graph • Nodes represent variables • Edges represent conditional relationships • Concise representation of any full joint probability distribution

  7. Bayesian Network [Figure: Bayesian network over Study, Lucky, PassExam, Win, Happy] • Nodes with no parents have prior probabilities • Nodes with parents have conditional probability tables • For all truth value combinations of their parents
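To make the "concise representation" point concrete, the sketch below counts how many probabilities the Bob network needs compared with a full joint table. The parent sets are an assumption read off the story (Study and Lucky feed PassExam, Lucky feeds Win, PassExam and Win feed Happy), and the names `parents` and `cpt_rows` are illustrative only.

```python
# Parent sets for the five Boolean variables of the Bob story.
# Assumption: edge structure inferred from the story text, not from the figure.
parents = {
    "Study": [],
    "Lucky": [],
    "PassExam": ["Study", "Lucky"],
    "Win": ["Lucky"],
    "Happy": ["PassExam", "Win"],
}


def cpt_rows(node):
    """One free probability per truth-value combination of the node's parents."""
    return 2 ** len(parents[node])


# Total numbers stored by the network versus a full joint table.
network_params = sum(cpt_rows(node) for node in parents)
full_joint_params = 2 ** len(parents) - 1  # the last entry is fixed by sum-to-1

print(network_params, full_joint_params)  # 12 31
```

With only five Boolean variables the saving is modest, but the full joint grows exponentially in the number of variables while each table grows only with its node's number of parents.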

  8. Bayesian Network [Figure: the network over Study, Lucky, PassExam, Win, Happy annotated with its probability tables; recoverable entries: P(L) = 0.75, P(S) = 0.2]

  9. Bayesian Network [Figure: a larger example network with nodes labelled a to z]

  10. Chain Rule • Recall the chain rule • $P(A,B) = P(A \mid B)\,P(B)$ • $P(A,B,C) = P(A \mid B,C)\,P(B,C) = P(A \mid B,C)\,P(B \mid C)\,P(C)$ • $P(A_1,\dots,A_n) = P(A_1 \mid A_2,\dots,A_n)\,P(A_2 \mid A_3,\dots,A_n) \cdots P(A_{n-1} \mid A_n)\,P(A_n)$ • $P(A_1,\dots,A_n) = \prod_{i=1}^{n} P(A_i \mid A_{i+1},\dots,A_n)$

  11. Chain Rule • A node is conditionally independent of its predecessors given its parents • If we know the value of a node’s parents, we don’t care about more distant ancestors • Their influence is included through the parents • More generally, a node is conditionally independent of its non-descendants given its parents • Updated chain rule • $P(A_1,\dots,A_n) = \prod_{i=1}^{n} P(A_i \mid \mathrm{parents}(A_i))$ • $\mathrm{parents}(A_i) \subseteq \{A_{i+1},\dots,A_n\}$

  12. Chain Rule Example • Probability that Bob is happy, won the lottery, passed his exam, is lucky, and did not study • $P(H,W,E,L,\neg S) = P(H \mid W,E)\,P(W \mid L)\,P(E \mid L,\neg S)\,P(L)\,P(\neg S)$ $= 0.99 \times 0.4 \times 0.5 \times 0.75 \times 0.8 = 0.1188 \approx 0.12$
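The chain-rule product above can be checked numerically. The following is a minimal sketch, assuming the conditional probability tables implied by the "Add Probabilities" story; the names `P_PASS`, `P_WIN`, `P_HAPPY`, `bern` and `joint` are illustrative, not part of the lecture.

```python
from itertools import product

# Tables taken from the story; the dictionary layout is an assumption.
P_STUDY = 0.2
P_LUCKY = 0.75
P_PASS = {  # P(PassExam = true | lucky, study), keyed (lucky, study)
    (True, True): 0.99, (True, False): 0.5,
    (False, True): 0.9, (False, False): 0.01,
}
P_WIN = {True: 0.4, False: 0.01}  # P(Win = true | lucky)
P_HAPPY = {  # P(Happy = true | win, pass), keyed (win, pass)
    (True, True): 0.99, (False, True): 0.8,
    (True, False): 0.6, (False, False): 0.2,
}


def bern(p_true, value):
    """Probability that a Boolean variable takes `value` when P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(h, w, e, l, s):
    """Updated chain rule: product of P(node | parents) over all five nodes."""
    return (bern(P_HAPPY[(w, e)], h) * bern(P_WIN[l], w)
            * bern(P_PASS[(l, s)], e) * bern(P_LUCKY, l) * bern(P_STUDY, s))


p = joint(h=True, w=True, e=True, l=True, s=False)
print(round(p, 4))  # 0.1188

# Sanity check: the 32 joint entries sum to 1.
total = sum(joint(*values) for values in product((True, False), repeat=5))
```

The slide's 0.12 is this value rounded to two decimals.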

  13. Constructing Bayesian Nets [Figure: Bayesian network over Study, Lucky, PassExam, Win, Happy] • Build from the top down • Start with root nodes • Add children • Go down to leaves

  14. Constructing Bayesian Nets [Figure: network over Study, Lucky, PassExam, Win, Happy built with a different node ordering] • What happens if we build with the wrong order? • Network becomes needlessly complicated • Node ordering is important!

  15. Connections • We can understand dependence in a network by considering how evidence is transmitted through it • Information entered at one node • Propagates to descendants and ancestors through connected nodes • Provided no node in the path already has evidence (in which case we would stop the propagation)

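  16. Serial Connection [Figure: Bayesian network over Study, Lucky, PassExam, Win, Happy] • Study and Happy are dependent • Considering the serial connection Study → PassExam → Happy alone, Study and Happy are independent given PassExam • Intuitively, the only way Study can affect Happy is through PassExam • In the full network, observing PassExam also makes Study and Lucky dependent (converging connection), so Study can still reach Happy via Lucky and Win unless one of them is observed too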

  17. Diverging Connection [Figure: Bayesian network over Study, Lucky, PassExam, Win, Happy] • Win and PassExam are dependent • Win and PassExam are independent given Lucky • Intuitively, Lucky can explain both Win and PassExam. Win and PassExam can affect each other by changing the belief in Lucky

  18. Converging Connection [Figure: Bayesian network over Study, Lucky, PassExam, Win, Happy] • Lucky and Study are independent • Lucky and Study are dependent given PassExam • Intuitively, Lucky can be used to explain away Study

  19. D-Separation • Determine if two variables are independent given some other variables • X is independent of Y given evidence E if X and Y are d-separated given E • X is d-separated from Y if every (undirected) path between X and Y contains a node Z for which either: • The connection at Z is serial or diverging and there is evidence for Z • The connection at Z is converging and there is no evidence for Z or any of its descendants
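The definition above can be applied literally on a small network by enumerating every undirected path and testing each intermediate node. This is a brute-force sketch for illustration, not the linear-time algorithm; the edge list is an assumption inferred from the Bob story, and all function names are hypothetical.

```python
# Directed edges (parent, child) of the Bob network, inferred from the story.
EDGES = [("Study", "PassExam"), ("Lucky", "PassExam"), ("Lucky", "Win"),
         ("PassExam", "Happy"), ("Win", "Happy")]


def children(node):
    return {c for p, c in EDGES if p == node}


def parents_of(node):
    return {p for p, c in EDGES if c == node}


def descendants(node):
    """All nodes reachable from `node` by following edges forward."""
    seen, stack = set(), [node]
    while stack:
        for child in children(stack.pop()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def undirected_paths(x, y, path=None):
    """Yield every simple path from x to y, ignoring edge direction."""
    path = path or [x]
    if path[-1] == y:
        yield path
        return
    for nxt in children(path[-1]) | parents_of(path[-1]):
        if nxt not in path:
            yield from undirected_paths(x, y, path + [nxt])


def blocked(path, evidence):
    """A path is blocked if some intermediate node satisfies the slide's rule."""
    for a, z, b in zip(path, path[1:], path[2:]):
        converging = a in parents_of(z) and b in parents_of(z)
        if converging:
            if z not in evidence and not (descendants(z) & evidence):
                return True
        elif z in evidence:  # serial or diverging connection with evidence
            return True
    return False


def d_separated(x, y, evidence=frozenset()):
    evidence = set(evidence)
    return all(blocked(path, evidence) for path in undirected_paths(x, y))


print(d_separated("Study", "Lucky"))                # True
print(d_separated("Study", "Lucky", {"PassExam"}))  # False
```

Under these assumed edges, `d_separated("Win", "PassExam", {"Lucky"})` is true, while `d_separated("Study", "Happy", {"PassExam"})` is false because observing PassExam opens the converging connection to Lucky.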

  20. D-Separation [Figure: paths between X and Y through a node Z. Serial and diverging connections: Z blocks the path if in evidence. Converging connection: Z blocks the path if not in evidence, and likewise its descendant Z2]

  21. D-Separation • Can be computed in linear time using a depth-first search algorithm • Fast algorithm to know if two nodes are independent • Allows us to infer whether learning the value of a variable might give us information about another variable given what we already know • All d-separated variables are independent, but not all independent variables are d-separated

  22. D-Separation Exercise [Figure: example network with nodes a to j] • If we observe a value for node e • Nodes that are not d-separated need to be updated • The graph becomes split into two independent, d-separated areas

  23. D-Separation Exercise [Figure: example network with nodes a to j] • If we observe a value for node g, what other nodes are updated? • Nodes f, h and i • If we observe a value for node a, what other nodes are updated? • Nodes b, c, d, e, f

  24. D-Separation Exercise [Figure: example network with nodes a to j] • Given an observation of c, are nodes a and f independent? • Yes • Given an observation of i, are nodes g and j independent? • No

  25. Other Independence Criteria [Figure: example network with lettered nodes] • A node is conditionally independent of its non-descendants given its parents • Recall from updated chain rule

  26. Other Independence Criteria [Figure: example network with lettered nodes] • A node is conditionally independent of all others in the network given its parents, children, and children’s parents • Markov blanket
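As an illustration of the Markov blanket definition, the sketch below computes it for nodes of the Bob network, used here as a stand-in because the lettered example network is not recoverable from the transcript. The edge list is an assumption inferred from the story, and `markov_blanket` is a hypothetical helper name.

```python
# Directed edges (parent, child) of the Bob network, inferred from the story.
EDGES = [("Study", "PassExam"), ("Lucky", "PassExam"), ("Lucky", "Win"),
         ("PassExam", "Happy"), ("Win", "Happy")]


def markov_blanket(node):
    """Parents, children, and the children's other parents of `node`."""
    node_parents = {p for p, c in EDGES if c == node}
    node_children = {c for p, c in EDGES if p == node}
    co_parents = {p for p, c in EDGES if c in node_children}
    return (node_parents | node_children | co_parents) - {node}


print(sorted(markov_blanket("PassExam")))  # ['Happy', 'Lucky', 'Study', 'Win']
print(sorted(markov_blanket("Study")))     # ['Lucky', 'PassExam']
```

Lucky appears in the blanket of Study only because the two share the child PassExam, which is the converging-connection effect described earlier.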

  27. Inference in Bayesian Network • Compute the posterior probability of a query variable given an observed event • $P(A_1,\dots,A_n) = \prod_{i=1}^{n} P(A_i \mid \mathrm{parents}(A_i))$ • Observed evidence variables $E = E_1,\dots,E_m$ • Query variable $X$ • Between them: nonevidence (hidden) variables $Y = Y_1,\dots,Y_l$ • Belief network is $X \cup E \cup Y$

  28. Inference in Bayesian Network • Goal: $P(X \mid E)$ • Recall the definition of conditional probability: $P(A \mid B) = P(A,B) / P(B)$, so $P(X \mid E) = \alpha\,P(X,E)$ with $\alpha = 1/P(E)$ • Recall marginalization: $P(A_i) = \sum_{j} P(A_i,B_j)$, so $P(X \mid E) = \alpha \sum_{Y} P(X,E,Y)$ • Recall chain rule: $P(A_1,\dots,A_n) = \prod_{i=1}^{n} P(A_i \mid \mathrm{parents}(A_i))$, so $P(X \mid E) = \alpha \sum_{Y} \prod_{A \in X \cup E \cup Y} P(A \mid \mathrm{parents}(A))$

  29. Inference Example [Figure: the network over Study, Lucky, PassExam, Win, Happy annotated with its probability tables; recoverable entries: P(L) = 0.75, P(S) = 0.2]

  30. Inference Example #1 • With only the information from the network (and no observations), what’s the probability that Bob won the lottery? • $P(W) = \sum_{l} P(W,l) = \sum_{l} P(W \mid l)\,P(l)$ $= P(W \mid L)\,P(L) + P(W \mid \neg L)\,P(\neg L) = 0.4 \times 0.75 + 0.01 \times 0.25 = 0.3025$
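The marginalization over Lucky is short enough to reproduce directly. A minimal sketch, using the two probabilities from the story; the variable names are illustrative.

```python
# Values from the story: P(Lucky) and P(Win = true | Lucky).
P_LUCKY = 0.75
P_WIN_GIVEN_LUCKY = {True: 0.4, False: 0.01}

# P(W) = sum over l of P(W | l) * P(l)
p_win = sum(
    P_WIN_GIVEN_LUCKY[lucky] * (P_LUCKY if lucky else 1.0 - P_LUCKY)
    for lucky in (True, False)
)

print(round(p_win, 4))  # 0.3025
```

Only Lucky has to be summed out here: Win's single parent is Lucky, so the terms for Study, PassExam and Happy each sum to 1 and drop out.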

  31. Inference Example #2 • Given that we know that Bob is happy, what’s the probability that Bob won the lottery? • From the network, we know • $P(h,e,w,s,l) = P(l)\,P(s)\,P(e \mid l,s)\,P(w \mid l)\,P(h \mid w,e)$ • We want to find • $P(W \mid H) = \alpha \sum_{l} \sum_{s} \sum_{e} P(l)\,P(s)\,P(e \mid l,s)\,P(W \mid l)\,P(H \mid W,e)$ • $P(\neg W \mid H)$ is also needed to normalize

  32. Inference Example #2 • P(W|H) = α 0.2516493

  33. Inference Example #2 • P(¬W|H) = α 0.328878

  34. Inference Example #2 • P(W|H) = α <0.2516493, 0.328878> = <0.4335, 0.5665> • Note that P(¬W|H) > P(W|H) because winning is unlikely even for a lucky person: P(W|L) = 0.4 < P(¬W|L) = 0.6 • The probability of Bob having won the lottery has increased by 13.1 percentage points (from 0.3025 to 0.4335) thanks to our knowledge that he is happy!
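The two unnormalized values and the normalized posterior can be reproduced by brute-force enumeration over the hidden variables. This is a sketch under the assumption that the tables are those implied by the "Add Probabilities" story; the names `joint` and `posterior_win_given_happy` are illustrative, and real systems would use a more efficient algorithm than full enumeration.

```python
from itertools import product

# Tables taken from the story; the dictionary layout is an assumption.
P_STUDY = 0.2
P_LUCKY = 0.75
P_PASS = {(True, True): 0.99, (True, False): 0.5,   # keyed (lucky, study)
          (False, True): 0.9, (False, False): 0.01}
P_WIN = {True: 0.4, False: 0.01}                    # keyed by lucky
P_HAPPY = {(True, True): 0.99, (False, True): 0.8,  # keyed (win, pass)
           (True, False): 0.6, (False, False): 0.2}


def bern(p_true, value):
    """Probability that a Boolean variable takes `value` when P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(h, w, e, l, s):
    """P(h, w, e, l, s) as the product of P(node | parents)."""
    return (bern(P_HAPPY[(w, e)], h) * bern(P_WIN[l], w)
            * bern(P_PASS[(l, s)], e) * bern(P_LUCKY, l) * bern(P_STUDY, s))


def posterior_win_given_happy():
    """Sum out e, l, s with Happy fixed to true, then normalize over Win."""
    unnormalized = {
        w: sum(joint(True, w, e, l, s)
               for e, l, s in product((True, False), repeat=3))
        for w in (True, False)
    }
    alpha = 1.0 / sum(unnormalized.values())
    return unnormalized, {w: alpha * v for w, v in unnormalized.items()}


unnormalized, posterior = posterior_win_given_happy()
print(round(unnormalized[True], 7), round(unnormalized[False], 6))  # 0.2516493 0.328878
print(round(posterior[True], 4), round(posterior[False], 4))        # 0.4335 0.5665
```

The normalizing constant α is 1 / P(H), where P(H) is the sum of the two unnormalized values.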

  35. Expert Systems • Bayesian networks are used to implement expert systems • Diagnostic systems that contain subject-specific knowledge • Knowledge (nodes, relationships, probabilities) typically provided by human experts • System gathers evidence by asking the user questions, then infers the most likely conclusion

  36. Pathfinder • Expert system for medical diagnosis of lymph-node diseases • Very large Bayesian network • Over 60 diseases • Over 100 features of lymph nodes • Over 30 features for clinical information • A lot of work from medical experts • 8 hours to define features and diseases • 35 hours to build network topology • 40 hours to assess probabilities

  37. Pathfinder • One node for each disease • Assumes the diseases are mutually exclusive and exhaustive • Large domain, hard to handle • Several small networks for diagnostic tasks built individually • Then combined into a single large network

  38. Pathfinder • Testing the network • 53 test cases (real diagnoses) • Diagnostic accuracy as good as a medical expert

  39. Assumptions • Learning agent • Environment • Fully observable / Partially observable • Deterministic / Strategic / Stochastic • Sequential • Static / Semi-dynamic • Discrete / Continuous • Single agent / Multi-agent

  40. Assumptions Updated • We can handle a new combination! • Fully observable & Deterministic • No uncertainty (map of Romania) • Fully observable & Stochastic • Games of chance (Monopoly, Backgammon) • Partially observable & Deterministic • Logic (Wumpus World) • Partially observable & Stochastic
