
Scientific Methods 1

Scientific Methods 1. ‘Scientific evaluation, experimental design & statistical methods’ COMP80131 Lecture 4: Statistical Methods-Probability. Barry & Goran. www.cs.man.ac.uk/~barry/mydocs/myCOMP80131. Probability. There are two useful definitions of probability:

Presentation Transcript


  1. Scientific Methods 1 ‘Scientific evaluation, experimental design & statistical methods’ COMP80131 Lecture 4: Statistical Methods-Probability Barry & Goran www.cs.man.ac.uk/~barry/mydocs/myCOMP80131 COMP80131-SEEDSM2-4

  2. Probability There are two useful definitions of probability: 1. Bayesian probability is a person’s belief in the truth of a statement S, quantified on a scale from 0 (definitely not true) to 1 (definitely true). 2. Experimental (or frequentist) probability is determined by the number, M, of times that a statement S is found to be true when it is tested a large number, N, of times. The probability, P(S), may then be defined as the limit of M/N as more and more experiments are carried out & N tends to infinity.

  3. Different language • By either definition, probability P(S) is a number in range 0 to 1. • Multiply by 100 to express as a percentage. • Or express as odds: e.g. ‘4 to 1 against’ means 1/5 = 0.2 = 20%. • What do odds of ‘4 to 1 on’ mean? • What does ‘50-50’ mean?
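The conversion between odds and probability can be sketched in a few lines of MATLAB. This is only an illustration: the helper names `odds_against` and `odds_on` are hypothetical, and it assumes the usual betting convention that ‘a to b on’ is the reverse of ‘a to b against’.

```matlab
% Convert betting odds to a probability (helper names are illustrative).
% 'a to b against' means a failures expected for every b successes.
odds_against = @(a, b) b / (a + b);
% 'a to b on' is assumed to mean a successes for every b failures.
odds_on = @(a, b) a / (a + b);

p_against = odds_against(4, 1);   % '4 to 1 against'
p_on      = odds_on(4, 1);        % '4 to 1 on'
p_evens   = odds_against(1, 1);   % '50-50', i.e. evens

fprintf('4 to 1 against: P = %.2f\n', p_against);
fprintf('4 to 1 on:      P = %.2f\n', p_on);
fprintf('50-50:          P = %.2f\n', p_evens);
```

Under that convention, ‘4 to 1 on’ corresponds to a probability of 4/5 and ‘50-50’ to a probability of 1/2.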

  4. Calculating probability • The 2 definitions of probability usually mean the same thing. • By examining a coin, we could give ourselves good reason for believing that tossing it just once will give an even chance of getting heads, i.e. that the Bayesian definition gives P(S) = 0.5 where S = ‘get heads’. • If the coin is then tossed N = 100 times we would expect about M = 50 occurrences of heads, meaning that M/N ≈ 0.5. • Increasing N to 1000 and then to 1000000 would be expected to produce closer & closer approximations to P(S) = 0.5. • If this does not happen, our ‘a-priori’ belief may be wrong. • The coin may be ‘weighted’ after all.
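The way M/N settles towards P(S) as N grows can be tried out directly. The sketch below assumes a fair coin (`p_true = 0.5` is an assumption used to drive the simulation) and uses the three values of N mentioned on the slide; the variable names are illustrative.

```matlab
% Frequentist estimate M/N for increasing N, assuming a fair coin.
p_true = 0.5;                       % assumed probability of heads
Ns = [100 1000 1000000];            % numbers of tosses to try
est = zeros(size(Ns));
for k = 1:numel(Ns)
    tosses = rand(1, Ns(k)) < p_true;   % 1 = heads, 0 = tails
    est(k) = sum(tosses) / Ns(k);       % M / N
    fprintf('N = %7d   M/N = %.4f\n', Ns(k), est(k));
end
```

The printed values differ from run to run, but the estimate for the largest N should normally lie closest to 0.5.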

  5. Random process • Tossing a coin is a random process. • It generates a ‘random variable’: Heads or Tails. • It is random because the outcome cannot be predicted exactly. • If 1 = heads and 0 = tails we have a random binary number. • Throwing a die generates a random integer in range 1-6. • Spinning a Roulette wheel generates a random no. in range 0-36. • Setting & marking an exam produces random nos in range 0-100. • These are all random processes producing discrete variables. • Some random processes produce continuous variables, e.g. measuring people’s heights.

  6. Simulating random process • MATLAB has functions that generate pseudo-random numbers. • ‘rand’ produces a pseudo-random number ‘uniformly distributed’ in the range 0 to 1. • May be considered ‘continuous’ since floating-point precision is very high. • Calling ‘rand’ repeatedly produces numbers evenly distributed across the range 0 to 1. • They are ‘pseudo-random’ because if we know the algorithm used, we can predict the numbers. • So we pretend we do not know the algorithm. • ‘rand’ may be considered to simulate some random process that generates truly random numbers, uniformly distributed.

  7. Simulating coin tossing in MATLAB
     for n=1:20
       R = rand;   % Unif random number between 0 & 1
       if R > 0.5, Heads(n) = 1; else Heads(n) = 0; end;
     end; % of n loop
     Heads
     Output: 10110001110101011101 - 12 heads & 8 tails
     • When I changed 20 to 10,000, I got 5066 heads: P(Heads) ≈ 0.5066
     • When I ran it again, I got 4918 heads: P(Heads) ≈ 0.4918

  8. Using an unfair coin
     for n=1:20
       R = rand;   % Unif random number between 0 & 1
       if R > 0.4, Heads(n) = 1; else Heads(n) = 0; end;   % heads with probability 0.6
     end; % of n loop
     Heads
     Output: 00101001110101010101 - 10 heads & 10 tails
     • When I changed 20 to 10,000, I got 6012 heads: P(Heads) ≈ 0.6012
     • When I ran it again, I got 5979 heads: P(Heads) ≈ 0.5979
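The same coin-tossing experiment can also be written without a loop. The following is a minimal vectorised sketch, not the lecturers' code; `p_heads` and `N` are illustrative names, and setting `p_heads = 0.5` gives the fair coin.

```matlab
% Vectorised coin tossing: one call to rand generates all N tosses.
p_heads = 0.6;                          % 0.5 for the fair coin, 0.6 for the unfair one
N = 10000;                              % number of tosses
Heads = rand(1, N) > (1 - p_heads);     % same test as 'R > 0.4' when p_heads = 0.6
M = sum(Heads);                         % number of heads observed
est = M / N;                            % estimate of P(Heads)
fprintf('%d heads out of %d: P(Heads) is approximately %.4f\n', M, N, est);
```

Each run gives a slightly different estimate, as with the two runs reported on the slides.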

  9. Estimating probability experimentally • We cannot measure probability with 100% accuracy. • All measurements are estimates that may be slightly or totally wrong. • According to the experimental definition, we have to perform an experiment an infinite number of times to measure a probability. • This is clearly impossible. • In practice, we have to perform the experiment a finite number of times. • (Cannot spend all our lives tossing coins.) • Accept resulting measurement as an estimate of true probability.

  10. Bayesian Definition • According to the Bayesian definition of probability, a person’s belief in the truth of a statement may be affected by one or more assumptions (hypotheses). • “I assume it is a fair coin” • Different people may have different beliefs. • Can only estimate probability using information we have at hand, though we can modify this estimate later if we get new information.

  11. Conditional probability • P(S | S1) means the probability of ‘statement S’ being true given that we know that another statement, S1, is definitely true. • If S stands for ‘get heads’ we may at first believe that P(S) = 0.5. • But what if someone tells us that the statement S1: ‘coin is weighted with heavier metal on one side’, is true? • We may change our measurement of probability to P(S | S1). • P(S) is then referred to as the ‘prior’ probability. • P(S | S1) is the ‘conditional’ or ‘posterior’ probability.

  12. Bayes Theorem • Expresses the probability of some fact ‘A’ being true when we know that some other fact ‘B’ is true: P(A | B) = P(B | A) × P(A) / P(B) • P(A) is ‘prior’ as it does not take into account any information about B. • Similarly P(B) is ‘prior’. • P(A | B) and P(B | A) are conditional or ‘posterior’ probabilities. • Example: let A = ‘coin is fair’ & B = ‘getting 12 heads out of 20’.

  13. What is prob of getting 12 heads out of 20?
      clear all; % WITH FAIR COIN
      HIS = zeros(21,1);
      for rep=1:1000
        for n=1:20
          R = rand; % Unif random number between 0 & 1
          if R > 0.5, Heads(n)=0; else Heads(n)=1; end;
        end; % of n loop
        Count = sum(Heads);
        HIS(1+Count) = HIS(1+Count)+1;
      end; % of rep loop
      figure(1); stem(0:20,HIS);

  14. Histogram for 1000 trials FAIR COIN

  15. Estimate of probability distribution FAIR COIN

  16. Probability estimate (fair coin) Estimated probabilities:
      for 0:9 heads:   0 0 0 0 0.008 0.011 0.024 0.087 0.119 0.160
      for 10:19 heads: 0.194 0.157 0.115 0.076 0.03 0.012 0.003 0.003 0.001 0
      for 20 heads:    0
      So our estimate of the probability of getting 12 heads out of 20 with a fair coin is 0.115.

  17. What is prob of getting 12 heads out of 20?
      clear all; % WITH 60-40 WEIGHTED COIN
      HIS = zeros(21,1);
      for rep=1:1000
        for n=1:20
          R = rand; % Unif random number between 0 & 1
          if R > 0.4, Heads(n)=1; else Heads(n)=0; end;
        end; % of n loop
        Count = sum(Heads);
        HIS(1+Count) = HIS(1+Count)+1;
      end; % of rep loop
      figure(1); stem(0:20,HIS);

  18. HISTOGRAM for ‘60-40’ weighted coin

  19. Prob distribution estimate for ‘60-40’ weighted coin

  20. Estimate Cumulative Prob Distrib • Easily derived from a Histogram or Prob Distribution. • Estimates prob of getting between 0 and n Heads.
      CDF(1) = HIS(1)/1000;
      for n=2:21, CDF(n) = CDF(n-1) + HIS(n)/1000; end;
      figure(3); stem(0:20,CDF);
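As a self-contained variation, the histogram and the cumulative distribution can both be built without explicit loops. This is a sketch assuming a fair coin and 1000 repetitions of 20 tosses; it swaps the nested loops for `accumarray` and `cumsum`, and the names `reps` and `CDF` are illustrative.

```matlab
% Histogram and cumulative distribution of the number of heads in 20 tosses.
reps = 1000;                                   % number of repetitions
n = 20;                                        % tosses per repetition
Count = sum(rand(reps, n) < 0.5, 2);           % heads in each repetition (fair coin assumed)
HIS = accumarray(Count + 1, 1, [n + 1, 1]);    % HIS(1+k) = number of repetitions with k heads
CDF = cumsum(HIS) / reps;                      % CDF(1+k) estimates P(0 to k heads)
fprintf('Estimated P(at most 12 heads) = %.3f\n', CDF(1 + 12));
```

Calling `stem(0:n, CDF)` afterwards would plot the same kind of figure as the loop version.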

  21. Estimate of Cumulative Prob Dist FAIR COIN Usually an S shaped function

  22. 4 coin-tosses: how many possible outcomes? 16 in total: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111 • How many with 0 heads? 1 • How many with 1 head? 4 = 4C1 • How many with 2 heads? 6 = 4C2 = 4×3 / (2!) • How many with 3 heads? 4 = 4C3 • How many with 4 heads? 1 • Combinations: nCr = no of ways of choosing r from n = n(n-1)…(n-r+1) / (r!)
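The counts on this slide can be checked by listing every outcome and counting heads. The sketch below uses the built-in functions `dec2bin` and `nchoosek`; the variable names are illustrative, and 1 is taken to mean heads.

```matlab
% Enumerate all outcomes of n coin tosses and count those with r heads.
n = 4;                                   % number of tosses
outcomes = dec2bin(0:2^n - 1) - '0';     % one row per outcome, 1 = heads
heads = sum(outcomes, 2);                % number of heads in each outcome
ways = zeros(1, n + 1);
for r = 0:n
    ways(r + 1) = sum(heads == r);       % outcomes with exactly r heads
    fprintf('%d heads: %d outcomes, nCr = %d\n', r, ways(r + 1), nchoosek(n, r));
end
```

The enumerated counts agree with nCr for every r, which is the point of the combinations formula.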

  23. Binomial Prob Distribution • Distributions have up to now been estimated. • For random processes with just 2 outcomes, we can derive a true distribution: • If p = prob(Heads), prob of getting Heads exactly r times in n independent coin-tosses is: nCr × p^r × (1-p)^(n-r) • For a fair coin, p = 0.5, so this becomes nCr / 2^n • For a fair die, the prob of throwing 3 sixes in five throws is: [5×4×3 / (3×2×1)] × (1/6)^3 × (5/6)^2
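The die example can be evaluated numerically as a quick check of the binomial formula. This is a small sketch using the built-in `nchoosek`; the variable names are illustrative.

```matlab
% Probability of exactly 3 sixes in 5 throws of a fair die.
n = 5;                                          % number of throws
r = 3;                                          % number of sixes wanted
p = 1/6;                                        % probability of a six on one throw
P = nchoosek(n, r) * p^r * (1 - p)^(n - r);     % 10 * (1/6)^3 * (5/6)^2
fprintf('P(3 sixes in 5 throws) = %.4f\n', P);
```

The result is 250/7776, a little over 3%.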

  24. Implementing formula (fair coin)
      p = 0.5; % for fair coin tossing
      n = 20;
      for r=0:n
        nCr = prod(n:-1:(n-r+1))/prod(1:r);
        P(1+r) = nCr * (p^r) * (1-p)^(n-r);
      end;
      figure(4); stem(0:20,P);
      axis([0 20 0 0.2]); grid on;

  25. True prob distribution (n=20) Fair coin

  26. True probability from formula
      For 0-9 heads:   0 0 0.0002 0.0011 0.0046 0.015 0.037 0.074 0.12 0.16
      For 10-19 heads: 0.176 0.16 0.12 0.074 0.037 0.015 0.0046 0.0011 0.0002 0
      For 20 heads:    0
      True prob of getting 12 heads with a fair coin is 0.12. Changing p to 0.6 (the probability of heads for the ‘60-40’ weighted coin), we find that the true probability of getting 12 heads out of 20 with that coin is 0.18.
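Both values can be reproduced from the binomial formula. The sketch below assumes P(Heads) = 0.6 for the ‘60-40’ weighted coin, as in the simulation where `R > 0.4` counts as heads; the helper name `binom12` is hypothetical.

```matlab
% True probability of 12 heads out of 20 for the fair and the weighted coin.
n = 20;                                                  % number of tosses
r = 12;                                                  % number of heads
binom12 = @(p) nchoosek(n, r) * p^r * (1 - p)^(n - r);   % binomial formula
P_fair     = binom12(0.5);                               % fair coin
P_weighted = binom12(0.6);                               % '60-40' coin, P(Heads) = 0.6 assumed
fprintf('Fair coin:     %.4f\n', P_fair);
fprintf('Weighted coin: %.4f\n', P_weighted);
```

Rounded to two decimal places these are 0.12 and 0.18. Using p = 0.4 instead gives a much smaller value (about 0.04), so 0.18 corresponds to p = 0.6.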

  27. Back to Bayes Theorem • There are 2 coins: a fair one & a ‘60-40’ weighted one. • We choose a coin at random & toss it 20 times. • What is the probability of having the weighted coin when we get 12 heads out of 20? • A = ‘coin is weighted 60-40’ & B = ‘get 12 heads out of 20’ • We know that P(B | fair coin) is 0.12 & P(B | A) is 0.18. • Each coin is equally likely to be chosen, so P(A) = 0.5 and P(B) is the average of 0.12 & 0.18 = 0.15 • P(A | B) = P(B | A) × P(A) / P(B) = 0.18 × 0.5 / 0.15 = 0.6
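The two-coin calculation can be written out step by step. The sketch below uses the rounded values 0.12 and 0.18 from the slides and assumes each coin is equally likely to be picked; the variable names are illustrative.

```matlab
% Bayes' theorem for the two-coin example.
P_B_given_fair = 0.12;     % P(12 heads out of 20 | fair coin)
P_B_given_A    = 0.18;     % P(12 heads out of 20 | weighted coin)
P_A            = 0.5;      % prior: weighted coin chosen at random from two coins
% P(B) is the weighted average over the two possible coins.
P_B = P_B_given_A * P_A + P_B_given_fair * (1 - P_A);
P_A_given_B = P_B_given_A * P_A / P_B;     % posterior probability of the weighted coin
fprintf('P(B) = %.2f, P(A|B) = %.2f\n', P_B, P_A_given_B);
```

Seeing 12 heads raises the probability that the coin is the weighted one from 0.5 to 0.6.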

  28. Further illustration of Bayes Theorem • At a college there are:
      10 students from France: 5 girls & 5 boys
      15 from UK:              5 girls & 10 boys
      20 from Canada:          5 girls & 15 boys

  29. Calculation • If we choose a student at random, the a-priori probability that this student is French is P(French) = 10/45 = 2/9 ≈ 0.22 • Now if we notice that this student is a boy, how does this change the probability that the student is French? • Use Bayes’ Theorem as follows: P(French | Boy) = P(Boy | French) × P(French) / P(Boy) = 0.5 × (10/45) / (30/45) = 1/6 ≈ 0.167 • The fact that we notice that the chosen student is a boy gives us additional information that changes the probability that the student chosen at random will be French.

  30. Check the calculation • We can check the previous result by common sense, noticing that out of 30 boys in the college, 5 are from France. Therefore, P(French | Boy) = 5/30 = 1/6.
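The college example is small enough to compute both ways from the table of student numbers. This is an illustrative sketch; the matrix layout (rows for countries, columns for girls and boys) is an assumption made for the example.

```matlab
% Student counts: rows = France, UK, Canada; columns = girls, boys.
counts = [5  5;
          5 10;
          5 15];
total = sum(counts(:));                               % 45 students
P_French = sum(counts(1, :)) / total;                 % prior, 10/45
P_Boy = sum(counts(:, 2)) / total;                    % 30/45
P_Boy_given_French = counts(1, 2) / sum(counts(1, :));   % 5/10
% Bayes' theorem
P_French_given_Boy = P_Boy_given_French * P_French / P_Boy;
% Direct check: French boys divided by all boys
direct = counts(1, 2) / sum(counts(:, 2));
fprintf('Bayes: %.4f   direct count: %.4f\n', P_French_given_Boy, direct);
```

Both routes give 1/6, as on the slides.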

  31. Usefulness of Bayes Theorem • In general Bayes’ theorem allows us to take additional information into account when calculating probabilities. Without the additional information, we have a ‘prior’ probability and with it we have a ‘conditional’ or ‘posterior’ probability.

  32. Bayes Theorem in medicine • A patient goes to a doctor with a bad cough & a fever. The doctor needs to decide whether he has ‘swine flu’. • Let statement S = ‘has bad cough and fever’ and statement F = ‘has swine flu’. • The doctor consults his medical books and finds that about 40% of patients with swine-flu have these same symptoms. • Assuming that, currently, about 1% of the population is suffering from swine-flu and that currently about 5% have bad cough and fever (due to many possible causes including swine-flu), we can apply Bayes theorem to estimate the probability of this particular patient having swine-flu.
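With the figures given on this slide, the estimate follows directly from Bayes' theorem. The sketch below simply substitutes those three numbers; the variable names are illustrative.

```matlab
% Bayes' theorem for the swine-flu example: P(F|S) = P(S|F) * P(F) / P(S).
P_S_given_F = 0.40;      % 40% of swine-flu patients have bad cough and fever
P_F         = 0.01;      % about 1% of the population has swine-flu
P_S         = 0.05;      % about 5% of the population has bad cough and fever
P_F_given_S = P_S_given_F * P_F / P_S;
fprintf('P(swine-flu | bad cough and fever) = %.2f\n', P_F_given_S);
```

Under these assumptions the patient has about an 8% chance of having swine-flu, eight times the prior of 1% but still far from certain.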

  33. Another problem to solve • A doctor in another country knows from his text-books that for 40% of patients with swine-flu, the statement S, ‘has bad cough and fever’, is true. He sees many patients and comes to believe that the probability that a patient with ‘bad cough and fever’ actually has swine-flu is about 0.1 or 10%. If there were reason to believe that, currently, about 1% of the population have a bad cough and fever, what percentage of the population is likely to be suffering from swine-flu?

  34. Some questions from Lecture 2 • Analyse the fictitious exam results & comment on features. • Compute means, stds & vars for each subject & histograms for the distributions. • Make observations about performance in each subject & overall. • Do marks support the hypothesis that people good at Music are also good at Maths? • Do they support the hypothesis that people good at English are also good at French? • Do they support the hypothesis that people good at Art are also good at Maths? • If you have access to only 50 rows of this data, investigate the same hypotheses. • What conclusions could you draw, and with what degree of certainty?

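  35. Continuous random processes • Characterised by probability density functions (pdf). • Uniform pdf: constant between its lower and upper limits [figure: pdf(x) against x]. • Prob of the random variable x lying between a and b is the area under pdf(x) between x = a and x = b, i.e. the integral of pdf(x) from a to b. • Gaussian (Normal) pdf with mean m & std dev σ [figure: pdf(x) against x]: about 68% of the area lies within m ± σ, 95.5% within m ± 2σ, and 99.7% within m ± 3σ.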

  36. pdf & Histograms
      Ru = rand(10000,1);  % 10000 unif samples
      hist(Ru,20);
      Rg = randn(10000,1); % Gaussian with m=0, std=1
      hist(Rg,20);

  37. Converting histogram to estimate of pdf • Divide each column by number of samples. • Then multiply by number of bins (this assumes the histogram spans a range of width 1, as for ‘rand’; in general, divide by the bin width instead). • For better approximation, increase number of bins.
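The conversion can be sketched for the Gaussian samples, where the range is not of width 1 and the bin width has to be used explicitly. This is an illustration under stated assumptions: `hist` is called with two outputs to get the bin centres, and the names `binwidth` and `pdf_est` are illustrative.

```matlab
% Turn a histogram into an estimate of the pdf.
N = 10000;                              % number of samples
nbins = 20;                             % number of histogram bins
Rg = randn(N, 1);                       % Gaussian samples with m = 0, std = 1
[counts, centres] = hist(Rg, nbins);    % column heights and bin centres
binwidth = centres(2) - centres(1);     % bins are equally spaced
pdf_est = counts / (N * binwidth);      % divide by samples and by bin width
area = sum(pdf_est) * binwidth;         % total area under the estimated pdf
fprintf('Area under estimated pdf = %.4f\n', area);
```

For samples from `rand` the bin width is close to 1/nbins, so dividing by the bin width is the same as multiplying by the number of bins. Calling `bar(centres, pdf_est)` would plot the estimate.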

  38. Concept of a ‘null-hypothesis’ • A null-hypothesis is an assumption that is made and then tested by a set of experiments designed to reveal that it is likely to be false, if it is false. • Testing is done by considering how probable the results are, assuming the null hypothesis is true. • If the results appear very improbable the researcher may conclude that the null-hypothesis is likely to be false. • This is usually the outcome the researcher hopes for when he or she is trying to prove that a new technique is likely to have some value.

  39. An example • Assume we wish to find out if a proposed technique designed to benefit users of a system is likely to have any value. • Divide the users into two groups and offer the proposed technique to one group and something different to the other group. • The null-hypothesis would be that the proposed technique offers no measurable advantage over the other techniques.

  40. The testing • This would be carried out by looking for differences between the sets of results obtained for each of the two groups. • Careful experimental design will try to eliminate differences not caused by the techniques being compared. • Must take a large number of users in each group & randomize the way the users are assigned to groups. • Once other differences have been eliminated as far as possible, any remaining difference will hopefully be indicative of the effectiveness of the techniques being investigated. • The vital question is whether the remaining differences are likely to be due to the advantages of the new technique, or to the inevitable random variations that arise from the other factors. • Are the differences statistically significant? • Can employ a statistical significance test to find out.

  41. Failure of the experiment • If the results are not found to look improbable under the null-hypothesis, i.e. if the differences between the two groups are not statistically significant, then no conclusion can be made. • The null-hypothesis could be true, or it could still be false. • It would be a mistake to conclude that the ‘null-hypothesis’ has been proved likely to be true in this circumstance. • It is quite possible that the results of the experiment give insufficient evidence to make any conclusions at all.

  42. P-Value • Probability of obtaining a test result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. • Reject the null hypothesis if the p-value is less than some value α (significance level) which is often 0.05 or 0.01. • When the null-hypothesis is rejected, the result is said to be statistically significant.

  43. Checking whether a coin is fair • Suppose we obtain heads 14 times out of 20 flips. • The p-value for this test result would be the probability of a fair coin landing on heads at least 14 times out of 20 flips. • This is: (20C14 + 20C15 + 20C16 + 20C17 + 20C18 + 20C19 + 20C20) / 2^20 = 0.058 • This is the probability that a fair coin would give a result as extreme or more extreme than 14 heads out of 20.

  44. Significance test • Reject null hypothesis if p-value ≤ α. • If α = 0.05, the rejection of the null hypothesis is at the 5% (significance) level. • The probability of wrongly rejecting the null-hypothesis (Type 1 error) will be equal to α. • This is considered sufficiently low. • In this case, p-value > 0.05, therefore the observation is consistent with the null hypothesis and we cannot reject it. • Cannot conclude that coin is likely to be unfair. • But we have NOT proved that coin is likely to be fair. • 14 heads out of 20 flips can be ascribed to chance alone. • It falls within the range of what could happen 95% of the time with a fair coin.
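The p-value for 14 heads out of 20 and the resulting decision can be computed directly from the combinations. This is a minimal sketch of the one-sided test described on the last two slides, with α = 0.05; the variable names are illustrative.

```matlab
% One-sided p-value for observing at least k heads in n flips of a fair coin.
n = 20;                          % number of flips
k = 14;                          % number of heads observed
tail = 0;
for r = k:n
    tail = tail + nchoosek(n, r);    % outcomes with exactly r heads
end
p_value = tail / 2^n;            % each of the 2^n outcomes is equally likely
alpha = 0.05;                    % significance level
reject = p_value <= alpha;       % true would mean 'reject the null hypothesis'
fprintf('p-value = %.4f, reject null hypothesis: %d\n', p_value, reject);
```

The p-value is 60460/1048576, just under 0.058, so the null hypothesis of a fair coin is not rejected at the 5% level.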
