1. CS479/679 Pattern Recognition, Spring 2006 – Prof. Bebis: Bayesian Belief Networks
Chapter 2 (Duda et al.)
2. Statistical Dependences Between Variables Many times, the only knowledge we have about a distribution is which variables are or are not dependent.
Such dependencies can be represented graphically using a Bayesian Belief Network (or Belief Net).
In essence, Bayesian Nets allow us to represent a joint probability density p(x,y,z,…) efficiently using dependency relationships.
p(x,y,z,…) could be either discrete or continuous.
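Concretely, if each variable xi has parents parents(xi) in the graph, the joint factors into a product of local conditional densities (the standard belief-net factorization, stated here for completeness):

p(x1, x2, …, xn) = p(x1/parents(x1)) p(x2/parents(x2)) … p(xn/parents(xn))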
3. Example of Dependencies State of an automobile:
Engine temperature
Brake fluid pressure
Tire air pressure
Wire voltages
Etc.
NOT causally related variables:
Engine oil pressure
Tire air pressure
Causally related variables:
Coolant temperature
Engine temperature
4. Representative Applications Bill Gates said (LA Times, 10/28/96): "Microsoft's competitive advantage is its expertise in Bayesian Nets."
Current Microsoft products:
Answer Wizard
Print Troubleshooter
Excel Workbook Troubleshooter
Office 95 Setup Media Troubleshooter
Windows NT 4.0 Video Troubleshooter
Word Mail Merge Troubleshooter
5. Representative Applications (cont’d) US Army: SAIP (Battalion Detection from SAR, IR etc.)
NASA: Vista (DSS for Space Shuttle)
GE: Gems (real-time monitor for utility generators)
Intel: (infers possible processing problems)
6. Definitions and Notation A belief net is usually a Directed Acyclic Graph (DAG)
Each node represents one of the system variables.
Each variable can assume certain values (i.e., states) and each state is associated with a probability (discrete or continuous).
7. Relationships Between Nodes A link joining two nodes is directional and represents a causal influence (e.g., X depends on A or A influences X)
Influences could be direct or indirect (e.g., A influences X directly and A influences C indirectly through X).
8. Parent/Children Nodes Parent nodes P of X:
the nodes before X (i.e., nodes with links pointing into X)
Children nodes C of X:
the nodes after X (i.e., nodes that X points into)
9. Conditional Probability Tables Every node is associated with a set of weights which represent the prior/conditional probabilities (e.g., P(xi/aj), i=1,2, j=1,2,3,4)
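A minimal sketch of how a node and its conditional probability table might be stored in code; the states follow the slide's indexing (x1, x2 and a1..a4), but all probability values are made up for illustration:

# Prior for a root node A with four states, and a CPT for X given A.
prior_A = {"a1": 0.25, "a2": 0.25, "a3": 0.25, "a4": 0.25}   # illustrative values

cpt_X = {                      # cpt_X[a][x] stores P(x/a)
    "a1": {"x1": 0.9, "x2": 0.1},
    "a2": {"x1": 0.6, "x2": 0.4},
    "a3": {"x1": 0.4, "x2": 0.6},
    "a4": {"x1": 0.2, "x2": 0.8},
}

# The prior and every row of the CPT are distributions, so each must sum to 1.
assert abs(sum(prior_A.values()) - 1.0) < 1e-9
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in cpt_X.values())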
10. Learning There exist algorithms for learning these probabilities from data…
11. Computing Joint Probabilities We can compute the probability of any configuration of variables in the joint density distribution:
e.g., P(a3, b1, x2, c3, d2)=P(a3)P(b1)P(x2/a3,b1)P(c3/x2)P(d2/x2)=
0.25 x 0.6 x 0.4 x 0.5 x 0.4 = 0.012
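A short sketch of this product in code, using only the five probabilities quoted above (the factor names mirror the slide's factorization):

# Each factor is one node's probability given its parents' values
# in the configuration (a3, b1, x2, c3, d2).
factors = {
    "P(a3)":       0.25,
    "P(b1)":       0.60,
    "P(x2/a3,b1)": 0.40,
    "P(c3/x2)":    0.50,
    "P(d2/x2)":    0.40,
}

joint = 1.0
for p in factors.values():
    joint *= p
print(joint)   # 0.012 (up to float rounding)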
12. Computing the Probability at a Node
E.g., determine the probability at D
13. Computing the Probability at a Node (cont’d)
E.g., determine the probability at H:
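Both examples refer to a figure that is not reproduced here; as a generic sketch, the probability at a node is obtained by summing its CPT against the distribution over its parent(s). The parent distribution and CPT below are hypothetical:

# P(D) = sum over parent states x of P(D/x) * P(x)
p_X = {"x1": 0.55, "x2": 0.45}                       # hypothetical parent distribution
cpt_D = {"x1": {"d1": 0.3, "d2": 0.7},
         "x2": {"d1": 0.6, "d2": 0.4}}

p_D = {d: sum(p_X[x] * cpt_D[x][d] for x in p_X) for d in ("d1", "d2")}
print(p_D)   # {'d1': 0.435, 'd2': 0.565} -- sums to 1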
14. Computing Probability Given Evidence (Bayesian Inference) Determine the probability of some particular configuration of variables given the values of some other variables (evidence).
e.g., compute P(b1/a2, x1, c1)
15. Computing Probability Given Evidence (Bayesian Inference) (cont’d) In general, if X denotes the query variables and e denotes the evidence, then
P(X/e) = a P(e/X) P(X)
where a = 1/P(e) is a constant of proportionality.
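A minimal sketch of this rule as inference by enumeration, using a made-up joint table over one query variable X and one evidence variable E (none of these numbers come from the slides):

# Hypothetical joint P(X, E); in a belief net each entry would itself be
# a product of CPT factors, as in the earlier configuration example.
joint = {("x1", "e1"): 0.20, ("x1", "e2"): 0.30,
         ("x2", "e1"): 0.10, ("x2", "e2"): 0.40}

evidence = "e1"

# Keep the joint entries consistent with the evidence, then normalize;
# a = 1/P(e) is exactly the constant of proportionality above.
scores = {x: p for (x, e), p in joint.items() if e == evidence}
a = 1.0 / sum(scores.values())
posterior = {x: a * p for x, p in scores.items()}
print(posterior)   # {'x1': 0.666..., 'x2': 0.333...}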
16. An Example Classify a fish given only the evidence that the fish is light (c1) and was caught in the south Atlantic (b2) -- no evidence about the time of year the fish was caught or about its thickness.
17. An Example (cont’d)
18. An Example (cont’d)
19. An Example (cont’d) Similarly,
P(x2/c1,b2) = a x 0.066
Normalize the probabilities (not strictly necessary):
P(x1/c1,b2) + P(x2/c1,b2) = 1 (a = 1/0.18)
P(x1/c1,b2) = 0.63
P(x2/c1,b2) = 0.37
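A quick check of this normalization; the unnormalized value a x 0.114 for x1 is not printed above but is implied by a = 1/0.18 together with a x 0.066 for x2:

unnormalized = {"x1": 0.114, "x2": 0.066}   # 0.114 inferred so that the sum is 0.18
a = 1.0 / sum(unnormalized.values())        # a = 1/0.18
posterior = {x: round(a * p, 2) for x, p in unnormalized.items()}
print(posterior)   # {'x1': 0.63, 'x2': 0.37}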
20. Another Example: Medical Diagnosis
Uppermost nodes: biological agents (bacteria, virus)
Intermediate nodes: diseases
Lowermost nodes: symptoms
Given some evidence (biological agents, symptoms), find most likely disease.
21. Naïve Bayes’ Rule When dependency relationships among features are unknown, we can assume that features are conditionally independent given the category:
P(a,b/x)=P(a/x)P(b/x)
Naïve Bayes rule: P(x/a,b) = a P(a/x) P(b/x) P(x), where a = 1/P(a,b).
Simple assumption but … usually works well in practice.
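A small sketch of a classifier built on this assumption; the two features (a, b), their states, and all probability values below are illustrative, not taken from the slides:

priors = {"x1": 0.6, "x2": 0.4}                         # P(x), made-up values
likelihoods = {                                         # likelihoods[x][f][v] = P(v/x)
    "x1": {"a": {"a1": 0.7, "a2": 0.3}, "b": {"b1": 0.2, "b2": 0.8}},
    "x2": {"a": {"a1": 0.4, "a2": 0.6}, "b": {"b1": 0.5, "b2": 0.5}},
}

def classify(observed):
    # Score each class x by P(x) * product of P(feature value / x),
    # i.e., treat features as conditionally independent given x.
    scores = {}
    for x, prior in priors.items():
        score = prior
        for feature, value in observed.items():
            score *= likelihoods[x][feature][value]
        scores[x] = score
    return max(scores, key=scores.get)

print(classify({"a": "a1", "b": "b2"}))   # -> 'x1' for these made-up numbers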