
Medical Statistics as a science


Pat_Xavi


Presentation Transcript


  1. Medical Statistics as a science

  2. Why Do Statistics? • Extrapolate from the data collected to draw general conclusions about the larger population from which the sample was derived • Allows general conclusions to be made from limited amounts of data • To do this we must assume that the data are randomly sampled from an infinitely large population, then analyse the sample and use the results to make inferences about the population

  3. Statistical Analysis in a Simple Experiment • Define the population of interest • Randomly select a sample of subjects to study (clinical trials do not enrol a randomly selected sample of patients, because of inclusion/exclusion criteria, but they do define a precise patient population) • Half the subjects receive one treatment and the other half another treatment (usually placebo) • Measure baseline variables in each group (e.g. age, APACHE II score, to check that randomisation was successful) • Measure trial outcome variables in each group (e.g. mortality) • Use statistical techniques to make inferences about the distribution of the variables in the general population and about the effect of the treatment
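
The "half receive one treatment, half the other" step can be illustrated with a minimal sketch of simple random allocation. The subject IDs, the `randomise` helper and the seed are hypothetical; real trials normally use more elaborate schemes (blocking, stratification), which this does not attempt.

```python
import random

def randomise(subject_ids, seed=None):
    """Randomly split subjects into two equal-sized arms (hypothetical helper)."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 20 made-up subject IDs, split into treatment and placebo arms
treatment, placebo = randomise(range(1, 21), seed=42)
print("treatment:", sorted(treatment))
print("placebo:  ", sorted(placebo))
```

Because allocation is random, baseline variables such as age should be similar in both arms, which is what the baseline check on this slide is for.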

  4. Data • Categorical data: values belong to categories • Nominal data: there is no natural order to the categories, e.g. blood groups • Ordinal data: there is a natural order, e.g. Adverse Events (Mild/Moderate/Severe/Life Threatening) • Binary data: there are only two possible categories, e.g. alive/dead • Numerical data: the value is a number (either measured or counted) • Continuous data: measurement is on a continuum, e.g. height, age, haemoglobin • Discrete data: a "count" of events, e.g. number of pregnancies

  5. Descriptive Statistics: concerned with summarising or describing a sample, e.g. mean, median • Inferential Statistics: concerned with generalising from a sample to make estimates and inferences about a wider population, e.g. T test, Chi Square test

  6. Statistical Terms • Mean: the average of the data; sensitive to outlying data • Median: the middle value of the ordered data; not sensitive to outlying data • Mode: the most commonly occurring value • Range: the difference between the largest and smallest values • Interquartile (IQ) range: the spread of the middle 50% of the data; commonly used for skewed data • Standard deviation: a single number which measures how much the observations vary around the mean • Symmetrical data: data that follow a normal distribution (mean = median = mode); report mean, standard deviation and n • Skewed data: not normally distributed (mean ≠ median ≠ mode); report median and IQ range
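
As a rough illustration of these terms, the sketch below computes them with Python's standard `statistics` module. The length-of-stay values are made up for the example, with one deliberate outlier so that the mean and median visibly disagree; quartile conventions differ between packages, so the IQ range may vary slightly elsewhere.

```python
import statistics

# Hypothetical length-of-stay values in days; the 30 is a deliberate outlier
stay_days = [2, 3, 3, 4, 5, 6, 30]

mean = statistics.mean(stay_days)            # pulled upwards by the outlier
median = statistics.median(stay_days)        # middle value, unaffected by it
mode = statistics.mode(stay_days)            # most common value
value_range = max(stay_days) - min(stay_days)
sd = statistics.stdev(stay_days)             # sample standard deviation
q1, _, q3 = statistics.quantiles(stay_days, n=4)  # quartile cut points

print(f"mean={mean:.1f} median={median} mode={mode}")
print(f"range={value_range} SD={sd:.1f} IQ range={q1}-{q3}")
```

With skewed data like this the mean sits well above the median, which is why the slide recommends reporting the median and IQ range instead.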

  7. Standard Normal Distribution

  8. Standard Normal Distribution • Mean +/- 1 SD encompasses about 68% of observations • Mean +/- 2 SD encompasses about 95% of observations • Mean +/- 3 SD encompasses about 99.7% of observations
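
These percentages can be checked from the normal cumulative distribution function. A small sketch using the standard library's `NormalDist` (the `coverage` helper is introduced here only for illustration):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def coverage(k):
    """Fraction of a normal distribution lying within mean +/- k SD."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"mean +/- {k} SD covers {coverage(k):.1%}")

# Multiplier that gives exactly 95% coverage (the familiar 1.96)
z95 = z.inv_cdf(0.975)
print(f"exact 95% multiplier: {z95:.2f}")
```

The "2 SD" rule is a rounding of 1.96 SD, which is the multiplier normally used for 95% confidence intervals.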

  9. Steps in Statistical Testing • Null hypothesis Ho: there is no difference between the groups • Alternative hypothesis H1: there is a difference between the groups • Collect data • Perform a statistical test, e.g. T test, Chi Square test • Interpret P value and confidence intervals: P value ≤ 0.05, reject Ho; P value > 0.05, do not reject Ho (often loosely called "accepting" Ho) • Draw conclusions

  10. Meaning of P • P value: the probability of observing a result as extreme as, or more extreme than, the one actually observed if chance alone were operating (i.e. if the null hypothesis were true) • Lets us decide whether or not to reject the null hypothesis • P > 0.05 Not significant • P = 0.01 to 0.05 Significant • P = 0.001 to 0.01 Very significant • P < 0.001 Extremely significant

  11. T Test • T test checks whether two samples are likely to have come from the same or different populations • Used on continuous variables • Example: age of patients in the APC study (APC vs placebo)
                          PLACEBO      APC
       mean age (years)   60.6         60.5
       SD                 16.5         17.2
       n                  840          850
       95% CI             59.5-61.7    59.3-61.7
     • What is the P value? 0.01 / 0.05 / 0.10 / 0.90 / 0.99 • P = 0.903 → not significant → patients are from the same population (groups are designed to be matched by randomisation, so no surprise!)
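
The P value in the age example can be approximated from the summary figures alone. The sketch below is an approximation under stated assumptions: it uses the unequal-variance (Welch) standard error and, because the groups are large, replaces the t distribution with the normal distribution so that only the standard library is needed. The function name is hypothetical, and the slides do not say which variant of the T test was actually used.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_p(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided P value for a difference in means (large-sample approximation)."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)    # standard error of the difference
    t = (mean1 - mean2) / se                # test statistic
    p = 2 * (1 - NormalDist().cdf(abs(t)))  # normal approximation to t
    return t, p

# Age in the placebo and APC groups, taken from the slide
t, p = two_sample_p(60.6, 16.5, 840, 60.5, 17.2, 850)
print(f"t = {t:.2f}, P = {p:.2f}")
```

For groups of several hundred patients the normal and t distributions are almost identical; for small groups a proper t distribution (for example from a statistics package) should be used instead.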

  12. T Test: SAFE "Serum Albumin"
                 PLACEBO      ALBUMIN
       n         3500         3500
       mean      28           30
       SD        10           10
       95% CI    27.7-28.3    29.7-30.3
     • Q: Are these albumin levels different? • Ho: levels are the same (any difference is there by chance) • H1: levels are too different to have occurred purely by chance • Statistical test: T test → P < 0.0001 (extremely significant) • Reject the null hypothesis (Ho) and accept the alternative hypothesis (H1), i.e. if both samples came from the same overall group, a difference this large would arise by chance less than 1 time in 10 000, so we can say they are very likely to be different

  13. Effect of Sample Size Reduction
                 PLACEBO      ALBUMIN
       n         350          350
       mean      28           30
       SD        10           10
       95% CI    27.0-29.0    29.0-31.0
     • Smaller sample size (one tenth of the original) • Causes wider CI (less confident where the mean is) • P = 0.008 (i.e. approx 0.01 → P is significant but less so) • The influence of sample size on the ability to find any particular difference statistically significant is a major consideration in study design
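
The widening of the confidence interval follows directly from the formula mean +/- 1.96 × SD / √n. A minimal sketch for the placebo group at the two sample sizes (the `ci95` helper is hypothetical, and the normal multiplier is an approximation that is only adequate for large samples such as these):

```python
from math import sqrt
from statistics import NormalDist

def ci95(mean, sd, n):
    """Approximate 95% confidence interval for a mean (large-sample formula)."""
    z = NormalDist().inv_cdf(0.975)   # about 1.96
    half_width = z * sd / sqrt(n)     # shrinks as n grows
    return mean - half_width, mean + half_width

# Placebo group: mean 28, SD 10, at the full and one-tenth sample sizes
for n in (3500, 350):
    lo, hi = ci95(28, 10, n)
    print(f"n = {n}: 95% CI {lo:.1f}-{hi:.1f}")
```

Dividing the sample size by ten multiplies the interval width by √10, roughly 3.2, even though the mean and SD are unchanged.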

  14. Reducing Sample Size (again)
                 PLACEBO      ALBUMIN
       n         35           35
       mean      28           30
       SD        10           10
       95% CI    24.6-31.4    26.6-33.4
     • Using an even smaller sample size (now 1/100 of the original) • Much wider confidence intervals • P = 0.41 (not significant any more) • A SMALLER STUDY has LOWER POWER to find any particular difference to be statistically significant (mean and SD unchanged) • POWER: the ability of a study to detect an actual effect or difference
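
Power can be put into numbers for the albumin example. The sketch below assumes the true difference really is 2 with SD 10 in both groups, tests at the 0.05 level, and uses a normal approximation; the function name and these assumptions are illustrative and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means."""
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)      # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    shift = diff / se                    # true difference in SE units
    # chance that the test statistic lands beyond either critical value
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (3500, 350, 35):
    print(f"n = {n} per group: power = {power_two_sample(2, 10, n):.2f}")
```

Under these assumptions a study with 35 patients per group would detect the difference only occasionally, which matches the non-significant result on this slide.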

  15. Chi Square Test • Proportions or frequencies • Binary data, e.g. alive/dead • PROWESS Study: primary endpoint: 28 day all cause mortality
                  ALIVE           DEAD           TOTAL          % DEAD
       PLACEBO    581 (69.2%)     259 (30.8%)    840 (100%)     30.8
       APC        640 (75.3%)     210 (24.7%)    850 (100%)     24.7
       TOTAL      1221 (72.2%)    469 (27.8%)    1690 (100%)
     • Reduction in death rate = 30.8% - 24.7% = 6.1%, i.e. an absolute reduction of 6.1 percentage points in the risk of death in the APC group • Perform Chi Square test → P = 0.006 (very significant) • If there were truly no difference between the groups, a result this extreme would be expected only about 6 times in 1000

  16. Reducing Sample Size • Same proportions but using a much smaller sample size (one tenth, counts rounded)
                  ALIVE          DEAD          TOTAL         % DEAD
       PLACEBO    58 (69.2%)     26 (30.8%)    84 (100%)     30.8
       APC        64 (75.3%)     21 (24.7%)    85 (100%)     24.7
       TOTAL      122 (72.2%)    47 (27.8%)    169 (100%)
     • Reduction in death rate = 6.1% (still the same) • Perform Chi Square test → P = 0.39 • If there were truly no difference, a difference in mortality this large would be expected about 39 times in 100, therefore the results are not significant • Again, the power of a study to find a difference depends a lot on sample size, for binary data as well as continuous data
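
The Chi Square calculation for a 2x2 table is short enough to sketch by hand. The version below is an illustration, not necessarily the procedure used in the study: it applies Yates' continuity correction by default (the slides do not say whether a correction was used) and relies on the fact that, with 1 degree of freedom, the P value equals erfc(√(χ²/2)). The function name is hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi Square test for the 2x2 table [[a, b], [c, d]]; returns (chi2, P)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)     # Yates' continuity correction
    chi2 = n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))              # upper tail of chi square, 1 df
    return chi2, p

# Rows: placebo, APC; columns: alive, dead (counts from slides 15 and 16)
chi2_full, p_full = chi_square_2x2(581, 259, 640, 210)
chi2_small, p_small = chi_square_2x2(58, 26, 64, 21)
print(f"full study:  chi2 = {chi2_full:.2f}, P = {p_full:.3f}")
print(f"one tenth:   chi2 = {chi2_small:.2f}, P = {p_small:.2f}")
```

For the full table this gives P of about 0.006, as on slide 15. For the small table the exact figure depends on whether the correction is applied and on rounding of the counts, so it need not match the slide's 0.39 exactly, but it is clearly non-significant either way.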

  17. Summary • Size matters = BIGGER IS BETTER • Spread matters = SMALLER IS BETTER • Bigger difference = EASIER TO FIND • Smaller difference = MORE DIFFICULT TO FIND • To find a small difference you need a big study
