METHODS FOR DUMMIES: BAYES FOR BEGINNERS
Any given Monday at 12.31 pm “I’m sure this makes sense, but you lost me about here…”
Bayes for Beginners • What can Bayes do for you? • Conditional Probability • What is Bayes Theorem? • Bayes in SPM2
What can Bayes do for you? • Problems with classical statistics approach • All inferences relate to disproving the null hypothesis • Never fully reject H0, only say that the effect you see is unlikely to occur by chance • Corrections for multiple comparisons • Very small effects can be declared significant with enough data • Bayesian Inference offers a solution
What can Bayes do for you? • Classical • ‘What is the likelihood of getting these data, given that no activation occurred (b = 0)?’ • p(y|b) • Bayesian • ‘What is the chance of getting these parameters, given these data?’ • p(b|y) • p(b|y) ≠ p(y|b)
Conditional Probability • Last year you were at a conference in Japan • You happened to notice that rather a lot of the professors smoke • At one of the socials you met someone at the bar & had a few drinks • The next morning you wake up & it dawns on you that you told the person you were talking to in the bar something rather indiscreet about your supervisor • You remember that the person you were talking to kept stealing your cigarettes, and you start to worry that they might have been a professor (and therefore a friend of your supervisor) • You decide to do a calculation to work out what the chances are that the person you were talking to is a professor, given that you know they are a smoker • You phone the hotel reception & they give you the following information: • 100 delegates • 40 requested non-smoking rooms • 10 professors • 4 of the professors requested non-smoking rooms
Given that the person you were talking to last night was a smoker, what is the probability of them being a professor?
• 100 delegates • 40 non-smokers • 10 professors • 4 of the professors are non-smokers
• p(P) = 0.1 • p(P’) = 0.9
• p(S|P) = 0.6 • p(S’|P) = 0.4
• p(S|P’) = 0.6 • p(S’|P’) = 0.4
• ‘AND’ = multiply: p(S and P) = p(S|P)*p(P) = 0.06 • p(S and P’) = p(S|P’)*p(P’) = 0.54
• ‘OR’ = add: p(S) = p(S and P) or p(S and P’) = 0.6*0.1 + 0.6*0.9 = 0.6
• p(P|S) = p(P and S) / p(S) = p(S|P)*p(P) / p(S) = 0.06/0.6 = 0.1
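The same answer can be read straight off the head counts, without going through the probability tree. The following is a minimal sketch, assuming (as the story does) that everyone who did not request a non-smoking room is a smoker; the variable names are illustrative only.

```python
from fractions import Fraction

# Counts given by the hotel reception in the example.
delegates = 100
non_smokers = 40
professors = 10
non_smoking_professors = 4

# Derived counts (assumption: not requesting a non-smoking room = smoker).
smokers = delegates - non_smokers                          # 60
smoking_professors = professors - non_smoking_professors   # 6

# p(P and S): fraction of all delegates who are smoking professors.
p_P_and_S = Fraction(smoking_professors, delegates)
# p(S): fraction of all delegates who smoke.
p_S = Fraction(smokers, delegates)

# Conditional probability: p(P|S) = p(P and S) / p(S).
p_P_given_S = p_P_and_S / p_S

print(p_P_given_S)         # 1/10
print(float(p_P_given_S))  # 0.1
```

Exact fractions are used so that the result is 1/10 rather than a rounded float: 6 of the 60 smokers are professors.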
The following night you are introduced to a professor who you would very much like to work for after you have finished your PhD. You want to make a good impression. Given that this person is a professor, what are the chances that they are also a smoker, in which case offering them a cigarette won’t harm your career prospects?
• p(P) = 0.1 • p(P’) = 0.9
• p(S|P) = 0.6 • p(S’|P) = 0.4
• p(S|P’) = 0.6 • p(S’|P’) = 0.4
• p(S|P) = p(S and P) / p(P) = 0.6 * 0.1 / 0.1 = 0.6
• Compare with last night: p(P|S) = p(S|P)*p(P) / p(S) = 0.06/0.6 = 0.1
• p(P|S) ≠ p(S|P)
What is Bayes Theorem? • p(P|S) = p(S|P)*p(P) / p(S) • Posterior = Likelihood * Prior / Evidence • p(P|S) • Degree of belief in ‘P’, given the data ‘S’ depends on what the data tell you ‘p(S|P)’ and any prior information ‘p(P)’
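As a short worked derivation (standard probability, not taken from the slides): the theorem follows from the ‘AND’ = multiply rule, because the joint probability of P and S can be factorised in either order.

```latex
\begin{align}
p(P \text{ and } S) &= p(S \mid P)\, p(P) = p(P \mid S)\, p(S) \\
\Rightarrow \quad p(P \mid S) &= \frac{p(S \mid P)\, p(P)}{p(S)}, \qquad p(S) > 0
\end{align}
```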
This year, the conference is held in New York, where smokers have to pay extra. This doesn’t deter the professors, but lots of the other participants decide to give up for the week! If you repeat your indiscretion this year, what are the chances of the smoker at the bar being a professor?
• 100 participants • 80 non-smokers • 10 professors • 4 of the professors are non-smokers
• p(P) = 0.1 • p(P’) = 0.9
• p(S|P) = 0.6 • p(S’|P) = 0.4
• p(S|P’) = 0.2 • p(S’|P’) = 0.8
• ‘AND’ = multiply: p(S and P) = p(S|P)*p(P) = 0.06 • p(S and P’) = p(S|P’)*p(P’) = 0.18
• ‘OR’ = add: p(S) = p(S and P) or p(S and P’) = 0.6*0.1 + 0.2*0.9 = 0.24
• p(P|S) = p(P and S) / p(S) = p(S|P)*p(P) / p(S) = 0.06/0.24 = 0.25
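To see how the answer shifts between the two conferences, the probability tree can be wrapped in a small function. This is a sketch only: the function name `posterior_professor` is hypothetical, and the inputs are the conditional probabilities quoted on the slides, p(S|P’) = 0.6 in Japan and 0.2 in New York.

```python
def posterior_professor(p_P, p_S_given_P, p_S_given_notP):
    """Return p(P|S) from the prior p(P) and the two smoking rates."""
    # 'AND' = multiply: joint probability along each branch of the tree.
    p_S_and_P = p_S_given_P * p_P
    p_S_and_notP = p_S_given_notP * (1 - p_P)
    # 'OR' = add: total probability that a random delegate smokes.
    p_S = p_S_and_P + p_S_and_notP
    # Bayes: p(P|S) = p(S and P) / p(S).
    return p_S_and_P / p_S


japan = posterior_professor(0.1, 0.6, 0.6)
new_york = posterior_professor(0.1, 0.6, 0.2)

print(round(japan, 3))     # 0.1
print(round(new_york, 3))  # 0.25
```

The prior p(P) and the professors' smoking rate are unchanged; only the smoking rate among non-professors drops, which makes a smoker at the bar more likely to be a professor.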
Bayes in SPM2 • p(b|y) = p(y|b)*p(b) / p(y) • p(b|y) ∝ p(y|b)*p(b) • p(b|y): Posterior Distribution, displayed as a Posterior Probability Map (PPM) • p(y|b): Likelihood Function (equivalent to normal SPM) • p(b): Prior Probabilities of parameters • PPM = SPM * priors
Bayes in SPM2 • Deciding on the priors • Fully specified priors in DCM • Estimating priors in PEB • Computing the Posterior Distribution • Making inferences • Shrinkage priors • Thresholds
Priors • Everyone has prior beliefs about their data • In the Bayesian framework, priors are formally tested
Bayesian Inference
• Full Bayes: priors come from previous empirical data, e.g. biophysics of the haemodynamic response
• Empirical Bayes: mean & variance of priors estimated from the data • Hierarchical model: parameters from one level become the priors at the next level • Between-voxel variance over all voxels used as prior on variance at each voxel
• PPMs: 1st level = within voxel of interest • 2nd level = between all brain voxels
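Computing the Posterior Probability Distribution
• y = w + e • Likelihood: p(y|w) = N(M_d, λ_d^-1)
• w = m + z • Prior: p(w) = N(M_p, λ_p^-1)
• Posterior: p(w|y) ∝ p(y|w)*p(w) = N(M_post, λ_post^-1)
• λ_post = λ_d + λ_p
• M_post = (λ_d M_d + λ_p M_p) / λ_post
• [Figure: prior, likelihood and posterior curves centred on M_p, M_d and M_post, with widths λ_p^-1, λ_d^-1 and λ_post^-1]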
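A sketch of why the precisions add, assuming (as on this slide) that both the likelihood and the prior are Gaussian in w. Multiplying the two densities and completing the square in w gives another Gaussian; the symbols follow the slide's mean and precision terms.

```latex
\begin{align}
p(w \mid y) &\propto p(y \mid w)\, p(w) \\
&\propto \exp\!\left[-\tfrac{1}{2}\lambda_d (w - M_d)^2\right]
         \exp\!\left[-\tfrac{1}{2}\lambda_p (w - M_p)^2\right] \\
&\propto \exp\!\left[-\tfrac{1}{2}\left\{(\lambda_d + \lambda_p)\, w^2
         - 2(\lambda_d M_d + \lambda_p M_p)\, w\right\}\right] \\
&\propto \exp\!\left[-\tfrac{1}{2}\lambda_{post}\,(w - M_{post})^2\right]
\end{align}
\begin{align}
\lambda_{post} = \lambda_d + \lambda_p, \qquad
M_{post} = \frac{\lambda_d M_d + \lambda_p M_p}{\lambda_{post}}
\end{align}
```

The posterior mean is therefore a precision-weighted average of the data mean and the prior mean.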
The effects of different precisions • [Figure panels: λ_p > λ_d; λ_p = λ_d; λ_p ≈ 0; λ_p < λ_d]
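The four cases can be tried numerically. This is a minimal sketch with made-up numbers (a data estimate of 2.0, a prior centred on 0, and arbitrary precisions); the function name `gaussian_posterior` is hypothetical and simply applies the precision-weighted update for a Gaussian prior and likelihood.

```python
def gaussian_posterior(m_d, lam_d, m_p, lam_p):
    """Combine a Gaussian likelihood and prior; return (mean, precision)."""
    lam_post = lam_d + lam_p                         # precisions add
    m_post = (lam_d * m_d + lam_p * m_p) / lam_post  # precision-weighted mean
    return m_post, lam_post


# Hypothetical values: the data suggest an effect of 2.0, the prior says 0.
m_d, lam_d, m_p = 2.0, 1.0, 0.0

# Prior precision for each case (illustrative numbers only).
cases = {
    "lp > ld": 4.0,
    "lp = ld": 1.0,
    "lp < ld": 0.25,
    "lp ~ 0": 1e-9,
}

results = {}
for label, lam_p in cases.items():
    results[label] = gaussian_posterior(m_d, lam_d, m_p, lam_p)
    m_post, lam_post = results[label]
    print(f"{label}: posterior mean = {m_post:.2f}, precision = {lam_post:.2f}")
```

With these numbers the posterior mean moves from 0.40 (precise prior dominates) through 1.00 (equal weight) and 1.60 to 2.00 (flat prior, the data estimate is returned unchanged).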
Shrinkage Priors • [Figure panels: large, variable effect; small, variable effect; large, consistent effect; small, consistent effect]
Reporting PPMs • Posterior Distribution describes the probability of getting an effect, given the data • Posterior distribution is different for every voxel • Size of effect (Mean) & Variability (Precision) • 2 Steps • Decide what size of effect is physiologically relevant • Show each voxel where we are 95% certain that the effect size is greater than the threshold • Special extension of Bayesian Inference for PPMs
Thresholding • [Figure panels, each with the threshold γ marked: large, variable effect; small, variable effect; large, consistent effect; small, consistent effect] • p(b > γ | y) = 0.95
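The thresholding rule can be sketched for a single voxel, assuming its posterior is Gaussian with a known mean and precision. All numbers below are invented for illustration, `prob_effect_exceeds` is a hypothetical helper name, and this is not SPM2's own code.

```python
import math


def prob_effect_exceeds(m_post, lam_post, gamma):
    """p(b > gamma | y) for a Gaussian posterior N(m_post, 1/lam_post)."""
    sd = 1.0 / math.sqrt(lam_post)
    z = (gamma - m_post) / sd
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


gamma = 1.0  # hypothetical effect-size threshold

# (posterior mean, posterior precision) per voxel, illustrative only.
voxels = {
    "large, variable": (3.0, 0.25),
    "small, variable": (0.5, 0.25),
    "large, consistent": (3.0, 25.0),
    "small, consistent": (0.5, 25.0),
}

for name, (m, lam) in voxels.items():
    p = prob_effect_exceeds(m, lam, gamma)
    print(f"{name}: p(b > gamma | y) = {p:.3f}, shown in PPM = {p >= 0.95}")
```

With these made-up values only the large, consistent effect reaches the 0.95 criterion: a large but variable effect falls short, and a small effect is excluded however consistent it is.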
Bayesian Inference in SPM2 • Bayesian Inference offers a solution
• Problem: all inferences relate to disproving the null hypothesis • There is no null hypothesis • Different hypotheses can be tested formally
• Problem: multiple comparisons & false positives • Voxel-wise inferences are independent • Posterior probabilities don’t change with search volume • Use of shrinkage priors
• Problem: very small effects can be declared significant with enough data • Thresholding of effect size