Bayesian Inference and Networks: Why should a biologist care? Paul E. Anderson, Ph.D.
Why should we care? • It lets us answer the questions we really want answered! http://www.sciencemag.org/content/294/5550/2310.full.pdf P. Anderson, College of Charleston
Introduction • Suppose you are trying to determine if a patient has pneumonia. You observe the following symptoms: • The patient has a cough • The patient has a fever • The patient has difficulty breathing
Introduction You would like to determine how likely it is that the patient has pneumonia, given that the patient has a cough, a fever, and difficulty breathing. These symptoms alone do not make us 100% certain that the patient has pneumonia. We are dealing with uncertainty!
Introduction Now suppose you order a chest x-ray and the results are positive. Your belief that the patient has pneumonia is now much higher.
Introduction • In the previous slides, what you observed affected your belief that the patient has pneumonia • This is called reasoning with uncertainty • Wouldn’t it be nice if we had some methodology for reasoning with uncertainty? Why, in fact, we do...
Bayesian Networks • Bayesian networks help us reason with uncertainty • In the opinion of many AI researchers, Bayesian networks are the most significant contribution to AI in the last 10 years • They are used in many applications, e.g.: • Spam filtering / text mining • Speech recognition • Robotics • Diagnostic systems • Syndromic surveillance
Bayesian Networks (An Example) From: Aronsky, D. and Haug, P.J., Diagnosing community-acquired pneumonia with a Bayesian network, In: Proceedings of the Fall Symposium of the American Medical Informatics Association, (1998) 632-636.
The intuition behind the statistics Rephrase the questions in ways we can answer! P. Anderson, College of Charleston
Answering questions about _____ • Fruit on an assembly line • Oranges, grapefruit, lemons, cherries, apples • Sensors measure: • Red intensity • Yellow intensity • Mass (kg) • Approximate volume • At the end of the line, a gate switches to deposit the fruit into the correct bin
Training the algorithm Sensors, scales, etc… Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21 → Apple
Training (2) Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21 + label “Apple” → Classifier M. Raymer – WSU, FBS
Testing Red = 2.125, Yellow = 6.143, Mass = 134.32, Volume = 24.21 + unknown label (??) → Classifier → predicted class (!) M. Raymer – WSU, FBS
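A minimal sketch of this train-then-classify pipeline, using scikit-learn's GaussianNB (a naive Bayes classifier, developed over the following slides). All sensor readings beyond the slide's example row are invented:

```python
# Train a classifier on labeled fruit, then classify an unlabeled one.
from sklearn.naive_bayes import GaussianNB

# Each row: [red, yellow, mass, volume]
X_train = [
    [2.125, 6.143, 134.32, 24.21],  # apple (the slide's example)
    [2.30, 5.90, 140.10, 25.02],    # apple (hypothetical)
    [9.10, 3.20, 180.50, 28.40],    # orange (hypothetical)
    [8.90, 3.50, 175.20, 27.90],    # orange (hypothetical)
]
y_train = ["apple", "apple", "orange", "orange"]

clf = GaussianNB()
clf.fit(X_train, y_train)  # training: estimate per-class feature distributions

# Testing: classify a new, unlabeled fruit
print(clf.predict([[2.0, 6.0, 130.0, 25.0]]))  # -> ['apple']
```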
Pattern Matrix M. Raymer – WSU, FBS
Distributions • Bayesian classifiers start with an estimate of the distribution of the features, e.g. a Gaussian distribution (continuous) or a binomial distribution (discrete) M. Raymer – WSU, FBS
Density Estimation • Parametric • Assume a Gaussian (e.g.) distribution • Estimate the parameters (μ, σ) • Non-parametric • Histogram sampling • Bin size is critical • Gaussian smoothing can help M. Raymer – WSU, FBS
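A short sketch of both estimation routes on made-up apple-diameter samples (the data and bin count are assumptions):

```python
# Parametric vs. non-parametric density estimation on hypothetical data.
import numpy as np
from scipy import stats

diameters = np.array([3.1, 3.4, 3.3, 3.6, 3.2, 3.5, 3.8, 3.0])  # made up

# Parametric: assume a Gaussian, estimate its parameters
mu, sigma = diameters.mean(), diameters.std(ddof=1)

# Non-parametric: a histogram (the bin size/count is critical) ...
counts, edges = np.histogram(diameters, bins=4, density=True)

# ... or Gaussian (kernel) smoothing of the samples
kde = stats.gaussian_kde(diameters)
print(mu, sigma, kde(3.4))  # density estimate at a 3.4" diameter
```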
The Gaussian distribution
Univariate: $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
Multivariate (d-dimensional): $p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$
A parametric Bayesian classifier must estimate $\mu$ and $\sigma$ (or $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$) from the training samples. M. Raymer – WSU, FBS
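As a sanity check, the univariate formula can be evaluated by hand and compared with scipy's built-in density; all parameter values below are hypothetical:

```python
# Evaluate the Gaussian density formulas directly and via scipy.
import numpy as np
from scipy import stats

mu, sigma, x = 3.4, 0.3, 4.0
by_hand = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
print(by_hand, stats.norm.pdf(x, mu, sigma))  # the two values agree

# Multivariate version for two features (hypothetical mean and covariance)
mean = np.array([9.5, 6.0])
cov = np.array([[1.0, 0.2], [0.2, 1.0]])
print(stats.multivariate_normal.pdf([8.2, 7.6], mean=mean, cov=cov))
```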
Making decisions • Once you have the distributions for • Each feature and • Each class • You can ask questions like… If I have an apple, what is the probability that the diameter will be between 3.2 and 3.5 inches? M. Raymer – WSU, FBS
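A sketch of how that question reduces to a difference of Gaussian CDF values; the apple parameters are invented for illustration:

```python
# P(3.2" <= diameter <= 3.5" | apple) as a difference of CDFs.
from scipy import stats

mu, sigma = 3.4, 0.3  # hypothetical apple-diameter parameters
p = stats.norm.cdf(3.5, mu, sigma) - stats.norm.cdf(3.2, mu, sigma)
print(p)  # ~0.38 with these made-up parameters
```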
More decisions… Non-parametric Parametric Count Diameter M. Raymer – WSU, FBS
A Simple Example • You are given a fruit with a diameter of 4” – is it a pear or an apple? • To begin, we need to know the distributions of diameters for pears and apples. M. Raymer – WSU, FBS
Maximum Likelihood Class-Conditional Distributions [Figure: class-conditional densities P(x) for apples and pears over diameters 1”–6”] M. Raymer – WSU, FBS
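A minimal sketch of the maximum-likelihood decision this sets up: pick the class whose class-conditional density is highest at 4”. Both Gaussians are invented for illustration:

```python
# Maximum-likelihood classification with hypothetical class-conditionals.
from scipy import stats

x = 4.0
like_apple = stats.norm.pdf(x, loc=3.4, scale=0.3)  # p(4" | apple) ~ 0.18
like_pear = stats.norm.pdf(x, loc=4.2, scale=0.4)   # p(4" | pear)  ~ 0.88
print("pear" if like_pear > like_apple else "apple")
```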
What are we asking? • If the fruit is an apple, how likely is itto have a diameter of 4”? • If the fruit is a xenofruit from planet Xircon, how likely is it to have a diameter of 4”? Is this the right question to ask? M. Raymer – WSU, FBS
A Key Problem • We based this decision on the class-conditional probability, $p(4'' \mid \text{apple})$ • What we really want to use is the posterior probability, $P(\text{apple} \mid 4'')$ • What if we found the fruit in a pear orchard? • We need to know the prior probability of finding an apple or a pear! M. Raymer – WSU, FBS
Statistical decisions… • If a fruit has a diameter of 4”, how likely is it to be an apple? [Figure: Venn diagram of 4” fruit vs. apples] M. Raymer – WSU, FBS
“Inverting” the question Given an apple, what is the probability that it will have a diameter of 4”? Given a 4” diameter fruit, what is the probability that it is an apple? M. Raymer – WSU, FBS
Prior Probabilities • Prior probability + Evidence → Posterior Probability • Without evidence, what is the “prior probability” that a fruit is an apple? M. Raymer – WSU, FBS
The heart of it all • Bayes Rule: $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$ M. Raymer – WSU, FBS
Bayes Rule $P(\text{apple} \mid 4'') = \dfrac{p(4'' \mid \text{apple})\,P(\text{apple})}{p(4'')}$ or, in words, $\text{posterior} = \dfrac{\text{likelihood} \times \text{prior}}{\text{evidence}}$ M. Raymer – WSU, FBS
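A quick numeric check of the rule, reusing the hypothetical apple/pear numbers from the sketch above:

```python
# Bayes' rule with made-up numbers: all three inputs are hypothetical.
prior = 0.8        # P(apple), assumed
likelihood = 0.18  # p(4" | apple), from the assumed Gaussian above
evidence = 0.32    # p(4") = 0.18*0.8 + 0.88*0.2 (apple + pear terms)

posterior = likelihood * prior / evidence
print(posterior)   # 0.45 = P(apple | 4")
```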
Example Revisited • Is it an ordinary apple or an uncommon pear? M. Raymer – WSU, FBS
Bayes Rule Example M. Raymer – WSU, FBS
Bayes Rule Example M. Raymer – WSU, FBS
Solution M. Raymer – WSU, FBS
Marginal Distributions M. Raymer – WSU, FBS
Combining Marginals • Assuming independent features: $p(x_1, \ldots, x_d \mid \text{class}) = \prod_{i=1}^{d} p(x_i \mid \text{class})$ • If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier). M. Raymer – WSU, FBS
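A sketch of the naive combination, multiplying independent per-feature Gaussians; the parameters are invented:

```python
# Naive combination: the joint class-conditional likelihood is the
# product of per-feature (marginal) densities.
from scipy import stats

def naive_likelihood(features, params):
    """p(x | class) = product over features of p(x_i | class)."""
    p = 1.0
    for x, (mu, sigma) in zip(features, params):
        p *= stats.norm.pdf(x, mu, sigma)
    return p

apple_params = [(3.4, 0.3), (140.0, 15.0)]  # (mu, sigma) for diameter, mass
print(naive_likelihood([4.0, 150.0], apple_params))
```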
Bayes Decision Rule • Provably optimal when the true class-conditional distributions and priors are used; here, that means the features (evidence) really are independent and Gaussian. M. Raymer – WSU, FBS
Likelihood Ratios • When deciding between two possibilities, we don’t need the exact probabilities; we only need to know which one is greater. • The denominator (the evidence) is the same for all classes • It can be eliminated • Useful when there are many possible classes M. Raymer – WSU, FBS
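A sketch of the comparison, reusing the hypothetical apple/pear numbers from the earlier sketches:

```python
# Only the numerators matter: the evidence p(x) is the same for every
# class and cancels out of the comparison.
num_apple = 0.18 * 0.8  # p(4" | apple) * P(apple)
num_pear = 0.88 * 0.2   # p(4" | pear)  * P(pear)
ratio = num_pear / num_apple
print("pear" if ratio > 1 else "apple", ratio)  # pear, ratio ~1.22
```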
Likelihood Ratio Example M. Raymer – WSU, FBS
Likelihood Ratio Example M. Raymer – WSU, FBS
In-class example: Oranges vs. Grapefruit M. Raymer – WSU, FBS
Example (cont’d) • After watching several hundred fruit pass down the assembly line, we observe that • 72% are oranges • 28% are grapefruit • Fruit ‘x’ • Red intensity = 8.2 • Mass = 7.6 What shall we predict for the class of fruit ‘x’? M. Raymer – WSU, FBS
The whole enchilada $P(\text{orange} \mid \text{red}, \text{mass}) = \dfrac{p(\text{red}, \text{mass} \mid \text{orange})\,P(\text{orange})}{p(\text{red}, \text{mass})}$ and… (naïve assumption) $p(\text{red}, \text{mass} \mid \text{orange}) = p(\text{red} \mid \text{orange})\,p(\text{mass} \mid \text{orange})$ Repeat for grapefruit and predict the more probable class. M. Raymer – WSU, FBS
The whole enchilada (2) M. Raymer – WSU, FBS
The whole enchilada (3) M. Raymer – WSU, FBS
Conclusion Predict that fruit ‘x’ is a grapefruit, despite the relative scarcity of grapefruits on the conveyor belt. M. Raymer – WSU, FBS
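A sketch reproducing the fruit-‘x’ decision end to end. The class-conditional Gaussian parameters are invented (the original slide’s numbers were in a figure that did not survive); only the priors and the observation come from the text:

```python
# End-to-end naive Bayes decision for fruit 'x' with assumed parameters.
from scipy import stats

x_red, x_mass = 8.2, 7.6
priors = {"orange": 0.72, "grapefruit": 0.28}
params = {  # per class: [(mu, sigma) for red intensity, (mu, sigma) for mass]
    "orange": [(9.5, 1.0), (6.0, 1.0)],
    "grapefruit": [(7.5, 1.5), (8.0, 1.0)],
}

scores = {}
for fruit, [(mu_r, s_r), (mu_m, s_m)] in params.items():
    likelihood = stats.norm.pdf(x_red, mu_r, s_r) * stats.norm.pdf(x_mass, mu_m, s_m)
    scores[fruit] = likelihood * priors[fruit]  # Bayes-rule numerator

print(max(scores, key=scores.get))  # grapefruit, despite the 72% orange prior
```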
Abbreviated • Since the denominator is the same for all classes, we can just compare: $p(x \mid \text{orange})\,P(\text{orange})$ and $p(x \mid \text{grapefruit})\,P(\text{grapefruit})$ M. Raymer – WSU, FBS
Likelihood comparison M. Raymer – WSU, FBS
What if we want more complexity? Bayesian Networks P. Anderson, College of Charleston
Bayesian Networks are built upon Independence Variables A and B are independent if any of the following hold: • P(A,B) = P(A)P(B) • P(A | B) = P(A) • P(B | A) = P(B) This says that knowing the outcome of A does not tell me anything new about the outcome of B.
Independence How is independence useful? • Suppose you have n coin flips and you want to calculate the joint distribution P(C1, …, Cn) • If the coin flips are not independent, you need 2^n values in the table • If the coin flips are independent, then P(C1, …, Cn) = P(C1)P(C2)⋯P(Cn) • Each P(Ci) table has 2 entries, and there are n of them, for a total of 2n values
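A quick check of the table-size arithmetic, plus the product form for independent flips (the per-coin probabilities are made up):

```python
# 2^n entries for the full joint table vs. 2n entries with independence.
from math import prod

n = 10
print(2 ** n)  # 1024 entries in the full joint table
print(2 * n)   # 20 entries when the flips are independent

# With independence, any joint probability is a product of marginals:
p_heads = [0.5, 0.6, 0.7]  # hypothetical per-coin P(heads), n = 3
print(prod(p_heads))       # P(C1=H, C2=H, C3=H) = 0.21
```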
Conditional Independence Variables A and B are conditionally independent given C if any of the following hold: • P(A, B | C) = P(A | C)P(B | C) • P(A | B, C) = P(A | C) • P(B | A, C) = P(B | C) Knowing C tells me everything A could tell me about B; I gain nothing more by also knowing A (either because A doesn’t influence B, or because knowing C already provides all the information knowing A would give).
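A tiny numeric illustration: build a joint distribution that satisfies the first definition, then verify the second. All probabilities are made up:

```python
# Construct P(A, B, C) under conditional independence given C, then
# check that P(A | B, C) equals P(A | C).
p_c = {0: 0.4, 1: 0.6}          # P(C)
p_a_given_c = {0: 0.2, 1: 0.7}  # P(A=1 | C)
p_b_given_c = {0: 0.5, 1: 0.9}  # P(B=1 | C)

def joint(a, b, c):
    """P(A=a, B=b, C=c) under the conditional-independence assumption."""
    pa = p_a_given_c[c] if a else 1 - p_a_given_c[c]
    pb = p_b_given_c[c] if b else 1 - p_b_given_c[c]
    return p_c[c] * pa * pb

c = 1
p_a_given_bc = joint(1, 1, c) / (joint(1, 1, c) + joint(0, 1, c))
print(p_a_given_bc, p_a_given_c[c])  # both equal 0.7 (up to float rounding)
```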