Advances in Bayesian Learning: Learning and Inference in Bayesian Networks
Irina Rish
IBM T.J. Watson Research Center
rish@us.ibm.com
“Road map”
• Introduction and motivation: what are Bayesian networks and why use them?
• How to use them
  • Probabilistic inference
• How to learn them
  • Learning parameters
  • Learning graph structure
• Summary
Bayesian Networks
[Figure: example network over Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D), with edges S→C, S→B, S→X, C→X, C→D, B→D]
Example query: P(lung cancer = yes | smoking = no, dyspnoea = yes) = ?
What are they good for?
• Diagnosis: P(cause|symptom) = ?
• Prediction: P(symptom|cause) = ?
• Classification: P(class|data)
• Decision-making (given a cost function)
Application areas: medicine, bio-informatics, speech recognition, text classification, computer troubleshooting, stock market prediction.
Bayesian Networks: Representation
The joint distribution factors according to the graph:
P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)
Each node stores a conditional probability distribution (CPD) given its parents; for example, P(D|C,B):

C B | D=0 D=1
0 0 | 0.1  0.9
0 1 | 0.7  0.3
1 0 | 0.8  0.2
1 1 | 0.9  0.1

Conditional independencies make this an efficient representation.
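The factorization can be written out directly in code. A minimal sketch (Python, binary variables): only the P(D|C,B) table is taken from the slide, and every other CPT number is a hypothetical placeholder.

```python
# A minimal sketch of the factored joint (binary variables).
# Only the P(D|C,B) table is taken from the slide; every other CPT
# number below is a hypothetical placeholder.
P_S1 = 0.3                                    # hypothetical prior P(S=1)
P_C1_given_S = {0: 0.01, 1: 0.10}             # hypothetical P(C=1|S)
P_B1_given_S = {0: 0.05, 1: 0.30}             # hypothetical P(B=1|S)
P_X1_given_CS = {(0, 0): 0.02, (0, 1): 0.05,  # hypothetical P(X=1|C,S)
                 (1, 0): 0.90, (1, 1): 0.95}
P_D1_given_CB = {(0, 0): 0.9, (0, 1): 0.3,    # P(D=1|C,B) from the CPD table
                 (1, 0): 0.2, (1, 1): 0.1}

def bern(p1, v):
    """P(V=v) for a binary V with P(V=1) = p1."""
    return p1 if v == 1 else 1.0 - p1

def joint(s, c, b, x, d):
    """P(S,C,B,X,D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)."""
    return (bern(P_S1, s)
            * bern(P_C1_given_S[s], c)
            * bern(P_B1_given_S[s], b)
            * bern(P_X1_given_CS[(c, s)], x)
            * bern(P_D1_given_CB[(c, b)], d))
```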
Bayesian Networks: Inference — P(X|evidence) = ?
Example: P(s|d=1). Variable elimination pushes the summations into the factorized joint:
Σ_{c,b,x} P(s) P(c|s) P(b|s) P(x|c,s) P(d|c,b) = P(s) Σ_b P(b|s) Σ_c P(c|s) P(d|c,b) Σ_x P(x|c,s)
Complexity is exponential in w*, the "induced width" (max clique size of the triangulated "moral" graph); here w* = 4.
Efficient inference depends on good variable orderings, conditioning, and approximations.
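A sketch of this query in code, reusing joint() from the sketch above. Brute-force enumeration over the hidden variables stands in for a real elimination ordering, which is fine for a network this small:

```python
# A sketch of the query P(S | D=1): sum the joint over the hidden
# variables C, B, X, then normalize. Brute-force enumeration is fine
# for 5 binary nodes; variable elimination reorders these sums so the
# cost stays exponential only in the induced width w*.
def posterior_S_given_D1():
    unnorm = [sum(joint(s, c, b, x, 1)
                  for c in (0, 1) for b in (0, 1) for x in (0, 1))
              for s in (0, 1)]
    z = sum(unnorm)
    return [p / z for p in unnorm]   # [P(S=0|D=1), P(S=1|D=1)]

print(posterior_S_given_D1())
```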
“Road map”
• Introduction and motivation: what are Bayesian networks and why use them?
• How to use them
  • Probabilistic inference
• Why and how to learn them
  • Learning parameters
  • Learning graph structure
• Summary
Why learn Bayesian networks?
• Efficient representation and inference
• Combining domain expert knowledge with data: incremental learning starts from a prior P(H) and updates it as records arrive
• Handling missing data, e.g. records like <1.3 2.8 ?? 0 1> and <?? 5.6 0 10 ??>
• Learning causal relationships (e.g., S → C)
Learning Bayesian Networks
• Known graph, complete data: parameter estimation (ML, MAP)
• Known graph, incomplete data: non-linear parametric optimization (gradient descent, EM)
• Unknown graph, complete data: optimization (search in the space of graphs)
• Unknown graph, incomplete data: structural EM, mixture models
Learning Parameters: Complete Data
• ML-estimate: maximize log P(D|Θ) — decomposable into independent per-family terms; with multinomial counts N(x, pa), the estimate is θ_{x|pa} = N(x, pa) / N(pa)
• MAP-estimate (Bayesian statistics): conjugate Dirichlet priors add pseudo-counts α, giving θ_{x|pa} = (N(x, pa) + α(x, pa)) / (N(pa) + α(pa)); the prior's equivalent sample size encodes the strength of prior knowledge
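A tiny numeric sketch of both estimates for a single CPT entry; all counts and pseudo-counts below are hypothetical:

```python
# A tiny numeric sketch: estimating theta = P(C=1 | S=1) from complete
# data. All counts and Dirichlet pseudo-counts are hypothetical.
N_c1_s1, N_c0_s1 = 12, 88        # hypothetical counts N(C=1,S=1), N(C=0,S=1)
a_c1, a_c0 = 1.0, 1.0            # hypothetical Dirichlet pseudo-counts

N_s1 = N_c1_s1 + N_c0_s1
theta_ml = N_c1_s1 / N_s1                             # N(x,pa) / N(pa)
theta_map = (N_c1_s1 + a_c1) / (N_s1 + a_c1 + a_c0)   # Dirichlet-smoothed
print(theta_ml, theta_map)       # 0.12 vs. 0.1274... (prior pulls estimate)
```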
Learning Parameters: Incomplete Data
With hidden nodes or missing values, the marginal likelihood is non-decomposable. The EM algorithm iterates until convergence, starting from initial parameters:
• Expectation: run inference in the current model to compute expected counts, e.g. P(S | X=0, D=1, C=0, B=1) for an incomplete record <? 0 1 0 1>
• Maximization: update the parameters (ML, MAP) from the expected counts
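A toy EM sketch under deliberately simplified assumptions: a two-node model S → D with P(D|S) known and held fixed, estimating only θ = P(S=1) from records where S is sometimes missing; all numbers are hypothetical:

```python
# A toy EM sketch, deliberately simplified: two-node model S -> D with
# P(D=1|S) known and held fixed; we estimate only theta = P(S=1) from
# records (s, d) where s is sometimes missing (None). All numbers are
# hypothetical.
P_D1_GIVEN_S = {0: 0.3, 1: 0.8}
data = [(1, 1), (0, 0), (None, 1), (None, 0), (1, 1), (None, 1)]

theta = 0.5                                  # initial parameter guess
for _ in range(50):                          # iterate E/M to convergence
    exp_s1 = 0.0
    for s, d in data:
        if s is not None:
            exp_s1 += s                      # observed: count directly
        else:                                # E-step: P(S=1 | D=d)
            p1 = theta * (P_D1_GIVEN_S[1] if d else 1 - P_D1_GIVEN_S[1])
            p0 = (1 - theta) * (P_D1_GIVEN_S[0] if d else 1 - P_D1_GIVEN_S[0])
            exp_s1 += p1 / (p1 + p0)
    theta = exp_s1 / len(data)               # M-step: ML from expected counts
print(theta)
```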
Learning Graph Structure
Finding the best graph is an NP-hard optimization problem, so heuristic search is used:
• Greedy local search over single-edge moves (e.g., add S→B, delete S→B, reverse S→B); a skeleton appears below
• Best-first search
• Simulated annealing
With complete data the score decomposes, so each move requires only local computations; with incomplete data the score is non-decomposable, and Structural EM is used instead.
Constraint-based methods take a different route: the data impose independence relations (constraints) from which the graph is read off.
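A skeleton of greedy local search over add/delete/reverse moves; score(graph, data) and is_acyclic(graph) are caller-supplied placeholders, not any particular library's API:

```python
# A skeleton of greedy local search over graphs. One move = add, delete,
# or reverse a single edge. score(graph, data) and is_acyclic(graph) are
# caller-supplied placeholders (e.g., an MDL score as on the next slide).
from itertools import permutations

def neighbors(edges, nodes):
    """All graphs one edge-move away from `edges` (a frozenset of (u, v))."""
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            yield edges - {(u, v)}                   # delete u -> v
            yield (edges - {(u, v)}) | {(v, u)}      # reverse u -> v
        elif (v, u) not in edges:
            yield edges | {(u, v)}                   # add u -> v

def greedy_search(nodes, data, score, is_acyclic):
    g = frozenset()                                  # start from empty graph
    while True:
        moves = [n for n in neighbors(g, nodes) if is_acyclic(n)]
        best = max(moves, key=lambda n: score(n, data), default=None)
        if best is None or score(best, data) <= score(g, data):
            return g                                 # local optimum reached
        g = best
```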
Scoring Functions: Minimum Description Length (MDL)
MDL casts learning as data compression: the score is DL(Model) + DL(Data|Model), trading the cost of encoding the model against how well it compresses the data.
Other scores: MDL = −BIC (Bayesian Information Criterion); the Bayesian score (BDe) is asymptotically equivalent to MDL.
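A sketch of the BIC-form MDL score for one family X | Pa(X) under complete data; the layout of the precomputed counts is my own illustrative choice:

```python
# A sketch of the BIC-form MDL score for one family X | Pa(X), given
# complete data. `counts` maps each parent configuration to the counts
# N(x, pa); its layout is my own illustrative choice. Lower is better.
import math

def family_mdl(counts, n_states_x, N):
    loglik = 0.0
    for x_counts in counts.values():
        n_pa = sum(x_counts.values())
        for n in x_counts.values():
            if n > 0:
                loglik += n * math.log(n / n_pa)     # ML plug-in likelihood
    k = len(counts) * (n_states_x - 1)               # free CPT parameters
    return 0.5 * k * math.log(N) - loglik            # DL(Model) + DL(Data|Model)

# Hypothetical counts for a binary X with one binary parent, N = 100 records:
counts = {(0,): {0: 30, 1: 10}, (1,): {0: 5, 1: 55}}
print(family_mdl(counts, n_states_x=2, N=100))
```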
Summary
• Bayesian networks: graphical probabilistic models with efficient representation and inference
• Combine expert knowledge with learning from data
• Learning: parameters (parameter estimation, EM) and structure (optimization with scoring functions, e.g., MDL)
• Applications/systems: collaborative filtering (MSBN), fraud detection (AT&T), classification (AutoClass (NASA), TAN-BLT (SRI))
• Future directions: causality, time, model evaluation criteria, approximate inference/learning, on-line learning, etc.