Scientific Methods 1

Scientific Methods 1: ‘Scientific evaluation, experimental design & statistical methods’
COMP80131 Lecture 8: Statistical Methods - Significance tests & confidence limits
Barry & Goran
www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131
COMP80131-SEEDSM8

Introduction
• Statistical significance testing has so far been applied on the assumption of either:
  • a discrete population with binomial distribution, or
  • a continuous population with known normal pdf & stdev.
• Before proceeding further, take a quick look at a few more probability distributions & pdfs.
• Significance testing can be adapted to any of these.

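Exponential pdf
• Lifetimes, e.g. of light bulbs, are often modelled by an exponential distribution: $f(x) = (1/\mu)\, e^{-x/\mu}$ for $x \ge 0$.
• Mean = μ; Stdev = μ also.
    mu = 2;              % mean lifetime (named mu so the built-in mean() is not shadowed)
    x = 0:0.1:10;
    y = exppdf(x,mu);
    plot(x,y);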
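
A quick numerical check that the mean and stdev of the exponential pdf are equal. This is only a sketch: it assumes the Statistics Toolbox function exppdf used on the slide, and the names mu, m and s are illustrative.

```matlab
mu = 2;                                  % mean parameter, as on the slide
x = 0:0.001:60;                          % fine grid; the tail beyond 60 is negligible for mu = 2
y = exppdf(x, mu);                       % pdf values (1/mu)*exp(-x/mu)
m = trapz(x, x .* y);                    % numerical mean: integral of x*pdf(x)
s = sqrt(trapz(x, (x - m).^2 .* y));     % numerical stdev about that mean
fprintf('mean = %.4f, stdev = %.4f\n', m, s);   % both should be close to mu
```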

Poisson Distribution
• For applications that involve counting the number of times a random event occurs in a given amount of time, e.g. the number of people walking into a store in an hour.
• λ is both the mean & the variance of the distribution.
• Poisson & exponential distributions are related: if the number of counts follows a Poisson distribution, then the interval between individual counts follows an exponential distribution.
• As λ gets larger, the Poisson pdf → normal with µ = λ, σ² = λ.
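
The last two points can be checked numerically. The sketch below (the λ values and variable names are illustrative choices, not from the lecture) computes the mean and variance of the Poisson probabilities and the largest gap to a normal pdf with µ = λ, σ² = λ.

```matlab
lambda = [5 20 100];                     % illustrative values of lambda
maxdiff = zeros(size(lambda));
for k = 1:numel(lambda)
    L = lambda(k);
    x = 0:(4*L + 20);                    % counts covering essentially all the probability
    pp = poisspdf(x, L);                 % Poisson probabilities
    pn = normpdf(x, L, sqrt(L));         % normal pdf with mu = lambda, sigma^2 = lambda
    m = sum(x .* pp);                    % mean of the Poisson distribution
    v = sum((x - m).^2 .* pp);           % variance of the Poisson distribution
    maxdiff(k) = max(abs(pp - pn));      % largest gap between the two curves
    fprintf('lambda = %3d: mean = %.3f, var = %.3f, max diff = %.4f\n', L, m, v, maxdiff(k));
end
```

The gap should shrink as λ grows, while mean and variance both stay equal to λ.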

Poisson distributions in MATLAB
    x = 0:60; y = poisspdf(x,20); stem(x,y);            % λ = 20
    figure; x = 0:16; y = poisspdf(x,5); stem(x,y);     % λ = 5, in a new window so the first plot is kept

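Chi-squared distribution
• Given a population of normally distributed random variables with mean = 0 & stdev = 1.
• Randomly choose a sample of V observations of them.
• Let x be the sum of their squares.
• Then the pdf of x is the χ² distribution with V degrees of freedom: $f(x) = \dfrac{x^{V/2-1}\, e^{-x/2}}{2^{V/2}\,\Gamma(V/2)}$ for $x > 0$.
• (The ‘Gamma function’ Γ generalises the factorial to non-integers: Γ(n) = (n-1)! for a positive integer n.)
• If s = stdev of the V observations (measured about the known mean 0, so s² = x/V), then s² is distributed as (1/V) × a χ² variable with V degrees of freedom.
• If the pop mean = μ & stdev = σ (with s measured about μ), then s² is distributed as (σ²/V) × a χ² variable with V degrees of freedom.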
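
As a sanity check, the standard closed form of the χ² pdf, x^(V/2-1) e^(-x/2) / (2^(V/2) Γ(V/2)), can be compared with the toolbox function chi2pdf. This is a sketch; V = 4 and the grid are illustrative choices.

```matlab
V = 4;                                   % degrees of freedom (illustrative)
x = 0.2:0.2:15;                          % start above 0 so the formula is safe for any V
f = x.^(V/2 - 1) .* exp(-x/2) ./ (2^(V/2) * gamma(V/2));   % chi-squared pdf from its formula
g = chi2pdf(x, V);                       % toolbox version
maxerr = max(abs(f - g));                % should be at rounding-error level
fprintf('max |formula - chi2pdf| = %.2e\n', maxerr);
```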

Plot chi2 pdf with V = 4
    x = 0:0.2:15;
    y = chi2pdf(x,4);
    plot(x,y)

Student’s t-distribution pdf
• Depends on a single parameter V (degrees of freedom).
• As V → ∞, the t-pdf approaches the standard normal distribution.
• If x is a random sample of size n from a normal distribution with mean μ, then the t-statistic $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ (with $\bar{x}$ the sample mean & s the sample stdev) has Student's t-pdf with V = n - 1 degrees of freedom.

Compare t-pdf (V=5) with normal
    x = -5:0.1:5;
    y = tpdf(x,5);
    z = normpdf(x,0,1);
    plot(x,y,'b',x,z,'r');
[Figure: T-pdf (blue) and Norm-pdf (red) for x from -5 to 5]
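
To see how quickly the t-pdf approaches the standard normal as V grows, the largest difference between the two curves can be tabulated. A small sketch; the values of V are illustrative.

```matlab
x = -5:0.1:5;
z = normpdf(x, 0, 1);                    % standard normal pdf
Vs = [1 5 30 1000];                      % illustrative degrees of freedom
gap = zeros(size(Vs));
for k = 1:numel(Vs)
    gap(k) = max(abs(tpdf(x, Vs(k)) - z));   % largest difference from the normal curve
    fprintf('V = %4d: max difference = %.4f\n', Vs(k), gap(k));
end
```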

MATLAB functions for t-dist
• pdf for t-distribution with V degrees of freedom: y = tpdf(t,V); (for samples with n values, V = n-1)
• Cumulative df with V degrees of freedom: p = tcdf(t,V), the prob of rand var being ≤ t
• Complementary df (area under ‘tail’ from t to ∞): p = 1 - tcdf(t,V), the prob of rand var being > t

Inverse-cdf in MATLAB
• Inverse of the cumulative distribution function.
• If p = tcdf(t,V) then t = tinv(p,V): value of t such that prob of rand var being ≤ t is p.
• If p = normcdf(z,m,σ) then z = norminv(p,m,σ): value of z such that prob of rand var being ≤ z is p.
• Complementary version: t = tinv(1-p,V): value of t such that prob of rand var being > t is p.
• Similarly for the complementary version of norminv.
[Figures: two t-pdf plots, each marking the area p and the point t on the x axis]
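
A short round-trip check of the cdf and inverse-cdf functions named above. The numbers (V = 9, t = 2) are made up for illustration.

```matlab
V = 9;                                   % degrees of freedom, e.g. a sample of n = 10 values
t = 2.0;                                 % illustrative t value
p = tcdf(t, V);                          % prob of rand var being <= t
t_back = tinv(p, V);                     % inverse cdf recovers t
p_tail = 1 - tcdf(t, V);                 % prob of rand var being > t
t_tail = tinv(1 - p_tail, V);            % complementary version also recovers t
z95 = norminv(0.95, 0, 1);               % value exceeded with prob 0.05 under the standard normal
fprintf('p = %.4f, t_back = %.4f, t_tail = %.4f, z95 = %.4f\n', p, t_back, t_tail, z95);
```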

Significance testing: z-test
• Assume Normal population with known stdev = σ.
• Null-hypothesis: pop-mean = μ0
• Alternative hyp: pop-mean > μ0
• Take one sample of n values & calculate the z-statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
• If pop-mean = μ0, dist of z will be standard Normal (mean=0, std=1).
• If mean of z is 0, how likely is a value ≥ z as just calculated?
• p-value = prob(x ≥ z) = 1 - normcdf(z,0,1)
• If p-value < significance level alpha (α), reject null-hyp.
[Figure: Std Normal pdf with the calculated z marked on the horizontal axis]

Alternative formulation
• Assuming we need 95% confidence, α = 0.05.
• Let z(α) = norminv(1-α, 0, 1) = 1.65
• Prob of getting rand var ≥ 1.65 is just under 0.05.
• If z ≥ 1.65, it is outside our 95% ‘confidence limit’ that the null-hyp may be true. So reject null-hyp.
• Confidence limit for z is -∞ to 1.65.
• Neglect possibility that z may be negative (1-tailed test).
• Confidence limit for sample-mean is -∞ to μ0 + 1.65σ/√n.
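
A worked 1-tailed z-test, as a sketch only: the sample x, the null-hypothesis mean mu0 and the known stdev sigma are made-up numbers, not data from the course. It shows that the p-value rule and the critical-value rule give the same decision.

```matlab
x = [5.2 4.9 5.6 5.8 5.1 5.4 5.7 5.3];   % made-up sample
mu0 = 5.0;                               % pop-mean under the null-hyp (assumed)
sigma = 0.4;                             % known pop stdev (assumed)
alpha = 0.05;                            % significance level
n = numel(x);
z = (mean(x) - mu0) / (sigma / sqrt(n)); % z-statistic
p_value = 1 - normcdf(z, 0, 1);          % 1-tailed p-value: prob of a value >= z
z_alpha = norminv(1 - alpha, 0, 1);      % critical value, about 1.65
upper = mu0 + z_alpha * sigma / sqrt(n); % upper confidence limit for the sample-mean
reject = p_value < alpha;                % same decision as z > z_alpha
fprintf('z = %.3f, p = %.4f, upper limit = %.3f, reject = %d\n', z, p_value, upper, reject);
```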

2-tailed test
• Assuming we need 95% confidence, α = 0.05.
• Allowing the possibility that z < 0, the extreme portions of the tails are for z > z(α/2) and for z < -z(α/2).
• prob(z ≥ z(α/2)) + prob(z ≤ -z(α/2)) = 2 prob(z ≥ z(α/2))
• Now, z(α/2) = norminv(1-α/2,0,1) = 1.96
• Prob of getting rand var ≥ 1.96 or ≤ -1.96 is 0.05.
• If z > 1.96 or z < -1.96, it is outside our 95% ‘confidence limit’ that the null-hyp may be true. So reject null-hyp.
• Confidence limits for z are -1.96 to 1.96.
• Confidence limits for sample-mean are: μ0 - 1.96σ/√n to μ0 + 1.96σ/√n.
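
The 2-tailed version can be sketched the same way. All numbers here (xbar, mu0, sigma, n) are illustrative assumptions; mu0 stands for the pop-mean under the null-hyp.

```matlab
xbar = 5.375; mu0 = 5.0; sigma = 0.4; n = 8;   % made-up sample-mean, null mean, known stdev, sample size
alpha = 0.05;
z = (xbar - mu0) / (sigma / sqrt(n));          % z-statistic
z_half = norminv(1 - alpha/2, 0, 1);           % about 1.96
p_value = 2 * (1 - normcdf(abs(z), 0, 1));     % area in both tails beyond |z|
lo = mu0 - z_half * sigma / sqrt(n);           % lower confidence limit for the sample-mean
hi = mu0 + z_half * sigma / sqrt(n);           % upper confidence limit for the sample-mean
reject = (xbar < lo) || (xbar > hi);           % outside the limits: reject null-hyp
fprintf('z = %.3f, p = %.4f, limits = [%.3f, %.3f], reject = %d\n', z, p_value, lo, hi, reject);
```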

Significance testing: t-test
• Assume Normal population with unknown stdev.
• Null-hypothesis: pop-mean = μ0
• Alternative hyp: pop-mean > μ0
• Take one sample of n values & calculate the t-statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, where s is the sample stdev.
• If pop-mean = μ0, dist of t will be standard t-pdf (blue) with V = n-1.
• How likely is the calculated value of t?
• ‘1-tailed’ p-value = prob(x ≥ t) = 1 - tcdf(t, n-1)
• If p-value < significance level alpha (α), reject null hyp.
[Figure: T-pdf (blue) and Norm-pdf (red) for x from -5 to 5, with the calculated t marked]
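
A worked 1-tailed t-test on a made-up sample (the data and mu0, the pop-mean under the null-hyp, are assumptions for illustration). The only change from a z-test is that the sample stdev s replaces the known σ and tcdf replaces normcdf.

```matlab
x = [5.2 4.9 5.6 5.8 5.1 5.4 5.7 5.3];   % made-up sample
mu0 = 5.0;                               % pop-mean under the null-hyp (assumed)
alpha = 0.05;                            % significance level
n = numel(x);
s = std(x);                              % sample stdev (divides by n-1)
t = (mean(x) - mu0) / (s / sqrt(n));     % t-statistic
p_value = 1 - tcdf(t, n - 1);            % 1-tailed p-value with V = n-1
reject = p_value < alpha;
fprintf('t = %.3f, p = %.4f, reject = %d\n', t, p_value, reject);
```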

Alternative formulation (2-tailed)
• Null-Hyp is that pop-mean is μ0.
• Assuming we need 95% confidence, α = 0.05.
• Confidence limits for the pop-mean are: $\bar{x} - t(\alpha/2)\, s/\sqrt{n}$ to $\bar{x} + t(\alpha/2)\, s/\sqrt{n}$, where t(α/2) = tinv(1-α/2, n-1).
• If the value of μ0 is outside these limits, reject the null-hyp that the population mean is μ0.
• If μ0 is within these confidence limits, cannot reject null-hyp.
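
A sketch of the 2-tailed formulation: confidence limits for the pop-mean are built as sample-mean ± tinv(1-α/2, n-1)·s/√n and compared with the Statistics Toolbox function ttest. The sample and mu0 (the pop-mean under the null-hyp) are made-up values.

```matlab
x = [5.2 4.9 5.6 5.8 5.1 5.4 5.7 5.3];   % made-up sample
mu0 = 5.0;                               % pop-mean under the null-hyp (assumed)
alpha = 0.05;
n = numel(x); xbar = mean(x); s = std(x);
t_half = tinv(1 - alpha/2, n - 1);       % 2-tailed critical value
lo = xbar - t_half * s / sqrt(n);        % lower confidence limit for the pop-mean
hi = xbar + t_half * s / sqrt(n);        % upper confidence limit for the pop-mean
reject = (mu0 < lo) || (mu0 > hi);       % mu0 outside the limits: reject null-hyp
t = (xbar - mu0) / (s / sqrt(n));
p_value = 2 * (1 - tcdf(abs(t), n - 1)); % 2-tailed p-value
[h, p_builtin, ci] = ttest(x, mu0);      % toolbox cross-check (2-tailed, alpha = 0.05 by default)
fprintf('limits = [%.3f, %.3f], p = %.4f, reject = %d\n', lo, hi, p_value, reject);
```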

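Difference betw z-test & t-test (2-tailed)
• With z-test pop-std (σ) is known; with t-test σ is unknown.
• For z-test, 1-tailed p-value = prob(x ≥ z) = 1 - normcdf(z,0,1); for a 2-tailed test, double the tail area beyond |z|.
• For t-test, 1-tailed p-value = prob(x ≥ t) = 1 - tcdf(t,n-1); for a 2-tailed test, double the tail area beyond |t|.
• Same Null-hyp: pop-mean = μ0: reject if μ0 is outside the conf limits.
• Confidence limits for z-test: $\bar{x} \pm z(\alpha/2)\, \sigma/\sqrt{n}$
• Confidence limits for t-test: $\bar{x} \pm t(\alpha/2)\, s/\sqrt{n}$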
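
One practical consequence of the difference is that the t critical value is larger than the z critical value for the same α, more so for small samples. A brief sketch; the sample sizes are illustrative.

```matlab
alpha = 0.05;
z_half = norminv(1 - alpha/2, 0, 1);     % z critical value, about 1.96
ns = [5 10 30 100 1000];                 % illustrative sample sizes
t_half = tinv(1 - alpha/2, ns - 1);      % t critical values with V = n-1
for k = 1:numel(ns)
    fprintf('n = %4d: t(alpha/2) = %.3f, z(alpha/2) = %.3f\n', ns(k), t_half(k), z_half);
end
```

The t limits are therefore wider than the z limits, and the two agree closely once n is large.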

Non-Gaussian populations
• If samples of size n are ‘randomly’ chosen from a pop with mean μ & std σ, the pdf of their sample-means approaches a Normal (Gaussian) pdf with mean μ & stdev σ/√n as n → ∞.
• Regardless of whether the population is Gaussian or not!
• This is the Central Limit Theorem.
• Tests can be made to work for non-Gaussian populations provided n is ‘large enough’.
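
The Central Limit Theorem can be illustrated by simulation. In this sketch the population is exponential (clearly non-Gaussian) with mean = stdev = 2; the sample size, number of samples and seed are arbitrary choices.

```matlab
rng(1);                                  % fixed seed so the run is repeatable
mu = 2;                                  % exponential population: mean = stdev = 2
n = 50;                                  % size of each sample
N = 20000;                               % number of samples drawn
samples = exprnd(mu, n, N);              % each column is one sample of n values
means = mean(samples, 1);                % the N sample-means
fprintf('mean of sample-means  = %.4f (pop mean = %.4f)\n', mean(means), mu);
fprintf('stdev of sample-means = %.4f (sigma/sqrt(n) = %.4f)\n', std(means), mu/sqrt(n));
hist(means, 50);                         % histogram should look close to a Normal pdf
```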

Meaning of confidence limits
• If α = 0.05, there is 95% probability that the confidence limits for a randomly chosen sample will contain the true population statistic, μ say.

A really subtle point
• Does this mean that there is a 95% probability that μ lies within the 95% confidence limits for the given sample?
• No! A common mistake!
• We have just one sample: we have no idea whether it is one whose confidence limits contain μ or not.
• Only 95% of possible samples will have conf limits which contain μ.
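
The ‘95% of possible samples’ statement can be checked by simulation. This sketch assumes a Normal population with made-up mean 10 and stdev 3, draws many samples, and counts how often each sample's 95% t-based confidence limits contain the true mean.

```matlab
rng(2);                                  % fixed seed so the run is repeatable
mu = 10; sigma = 3;                      % assumed true population mean & stdev
n = 20; N = 10000; alpha = 0.05;         % sample size, number of samples, significance level
x = mu + sigma * randn(n, N);            % each column is one sample
xbar = mean(x, 1);                       % sample-means
s = std(x, 0, 1);                        % sample stdevs
half = tinv(1 - alpha/2, n - 1) * s / sqrt(n);          % half-width of each sample's limits
covered = (xbar - half <= mu) & (mu <= xbar + half);    % does this sample's interval contain mu?
coverage = mean(covered);                % fraction of samples whose limits contain mu
fprintf('fraction of intervals containing mu = %.3f\n', coverage);
```

Any single interval either contains the true mean or it does not; the 95% describes the procedure over many samples.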

P-values & confidence limits in MATLAB
• Come for free with most measurements. For example:
    x = [1;2;3;4;5;6];
    y = [1.1;3;2;4;6;4];
    [R, p_value, Rlo, Rup] = corrcoef(x,y)
• Returns Pearson corr coeff R = 0.79,
• p_value = 0.061,
• Also 95% confidence limits: Rlo = -0.06, Rup = 0.98
• Limits computed this way contain the true corr for 95% of possible samples.
• “Returns p-values for testing the hypothesis of no correlation. Each p-value is probability of getting a correlation as large as the observed value by random chance, when the true correlation is zero. If p_value is small, say < 0.05, then the correlation is significant”.
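
When corrcoef is called with two vectors, each output is a 2-by-2 matrix, and the values quoted above are the off-diagonal elements. A small sketch of extracting them; the scalar names r, p, r_lo and r_up are illustrative.

```matlab
x = [1;2;3;4;5;6];
y = [1.1;3;2;4;6;4];
[R, P, RLO, RUP] = corrcoef(x, y);       % each output is a 2-by-2 matrix
r = R(1,2);                              % Pearson corr coeff between x and y
p = P(1,2);                              % p-value for the hypothesis of no correlation
r_lo = RLO(1,2);                         % lower 95% confidence limit
r_up = RUP(1,2);                         % upper 95% confidence limit
fprintf('r = %.2f, p = %.3f, limits = [%.2f, %.2f]\n', r, p, r_lo, r_up);
```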

Credibility limits
• Bayesian equivalent of ‘confidence limits’.
• If limits are C1 to C2, & α = 0.05:
• Now there is 95% probability of the statistic, μ say, lying between C1 & C2.
• ‘Confidence limits’ are ‘frequentist’.
• Jonas explained why many people distrust the frequentist approach and consider the Bayesian approach to be much more reliable.

Reminder: Binomial distribution
• If p = prob(Heads), the prob of getting Heads exactly r times in n independent coin-tosses is: ${}^{n}C_{r}\, p^{r} (1-p)^{n-r}$
• For a fair coin, p = 0.5, so this becomes ${}^{n}C_{r} / 2^{n}$
[Figure: true probability of getting that no of heads (vertical axis) against no of heads obtainable with n coin-tosses (horizontal axis, 0 to 20)]

Binomial dist with n=6
[Figure: true probability of getting that no of heads against no of heads obtainable with n coin-tosses (0 to 6); the value 0.0156 is marked]

MATLAB Script
    p = 0.5;                 % for coin tossing
    n = 6;
    for r = 0:n
        nCr = prod(n:-1:(n-r+1))/prod(1:r);       % n!/(r!(n-r)!)
        Prob(1+r) = nCr * (p^r) * (1-p)^(n-r);    % prob of exactly r heads
    end;
    Prob                     % display the probabilities
    figure(2); stem(0:n,Prob);
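
The same probabilities can be cross-checked against built-in functions. This sketch uses nchoosek for nCr and the Statistics Toolbox function binopdf; the variable name check is illustrative.

```matlab
p = 0.5;                                 % fair coin
n = 6;                                   % number of tosses
r = 0:n;                                 % possible numbers of heads
Prob = zeros(1, n + 1);
for k = r
    Prob(k + 1) = nchoosek(n, k) * p^k * (1 - p)^(n - k);   % prob of exactly k heads
end
check = binopdf(r, n, p);                % toolbox binomial probabilities
fprintf('sum = %.4f, max difference from binopdf = %.2e\n', sum(Prob), max(abs(Prob - check)));
```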

Geometric distribution
• $p(x) = (1-p)\, p^{x-1}$ (p = prob of success).
• x is the number of trials (coin tosses) up to & including that in which the first failure occurs.
    p = 0.5;
    x = 1:10;
    prob = (1-p)*p.^(x-1);
    stem(x,prob);
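
A few checks on this distribution, as a sketch. Note that the toolbox function geopdf(k,q) uses a different convention (k = number of failures before the first success, with success probability q), so it matches the formula above when called as geopdf(x-1, 1-p). The range 1:60 is an arbitrary truncation.

```matlab
p = 0.5;                                 % prob of success on each trial
x = 1:60;                                % trial at which the first failure occurs
prob = (1 - p) * p.^(x - 1);             % formula from the slide
total = sum(prob);                       % should be very close to 1
mean_trials = sum(x .* prob);            % should be close to 1/(1-p)
check = geopdf(x - 1, 1 - p);            % toolbox convention: x-1 earlier outcomes, stopping prob 1-p
fprintf('sum = %.6f, mean = %.6f, prob(5) = %.4f, prob(6) = %.4f\n', total, mean_trials, prob(5), prob(6));
```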

Geometric distribution (again)
• prob(5) = 0.0313
• prob(6) = 0.0156
[Figure: prob of first failure at x (vertical axis) against x: number of trials (horizontal axis, 1 to 10)]

Barry’s Assignment
• Deadline 20 Dec 2012
• Email to barry@man.ac.uk with ‘SEEDSM12’ in title, or hand in paper copy to SSO
• Exam statistics are in examdata.dat and examdata.xls in www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131 (or navigate from www.cs.man.ac.uk/~barry)
