Basic Statistics. “I always find that statistics are hard to swallow and impossible to digest. The only one I can remember is that if all the people who go to sleep in church were laid end to end they would be a lot more comfortable.” [Mrs Robert A Taft].
“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay.” [Sherlock Holmes]
Qualitative a) Nominal data (dead/alive, blood group O,A,B,AB) b) Ordered categorical/ranked data (mild/moderate/severe)
Quantitative a) Numerical discrete (no. of deaths in a hospital per year) b) Numerical continuous (age, weight, blood pressure)
Presenting data • Graphs • Summary statistics • Tables
Graphical methods Piechart Barchart Histogram Scattergram
Summary statistics Qualitative data • Percentages • Numbers
Summary Statistics: Quantitative data • Non-normal: median, range, inter-quartile range • Normal: mean, standard deviation, variance
Summary Statistics: Normal data Approximately 95% of observations lie within 2 standard deviations of the mean (mean ± 2 sd)
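The 95% rule can be checked on any sample by counting how many observations fall inside mean ± 2 sd. The following is a minimal sketch; the function names and the simulated values (mean 120, sd 15) are illustrative assumptions, not data from these slides.

```python
import random
import statistics


def fraction_within_2sd(values):
    """Return the fraction of values lying within mean ± 2 sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    low, high = mean - 2 * sd, mean + 2 * sd
    inside = sum(1 for v in values if low <= v <= high)
    return inside / len(values)


def simulated_normal_sample(n, mu=120.0, sigma=15.0, seed=0):
    """Draw n values from a Normal distribution (hypothetical parameters)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]
```

For a large Normal sample, fraction_within_2sd(simulated_normal_sample(10000)) should come out close to 0.95; for strongly skewed data the figure can differ noticeably.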
How to check for Normality • Mean ≈ Median • (mean − 2sd, mean + 2sd) is a plausible range for the data • −1 < skewness < 1 • −1 < kurtosis < 1 • Histogram shows symmetric bell shape
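The numerical checks in this list can be computed in a few lines. This sketch assumes moment-based skewness and excess kurtosis (kurtosis minus 3, so a Normal distribution scores 0); statistical packages use slightly different small-sample formulas, so values may not match them exactly. The function name is hypothetical.

```python
import statistics


def normality_summary(values):
    """Summarise the quantities used in the informal Normality checks."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    # Central moments about the mean (divided by n, not n - 1).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range_2sd": (mean - 2 * sd, mean + 2 * sd),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }
```

A sample such as [1, 1, 1, 1, 10] gives a skewness of 1.5 and a mean (2.8) well above the median (1), so it would fail these checks.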
Secondary prevention of coronary heart disease [* Median (range)]
Natural log transformation • Can transform positively skewed data to approximately ‘Normal’ data • Use transformed data in analysis • Transform the resulting mean back (using e^x) to give the geometric mean • Present geometric mean and range
Effect of log_e transformation [Geometric mean = e^2.2 = 9.0]
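The back-transformation can be sketched as follows: take natural logs, average them, then exponentiate the result. The function name is an illustrative assumption; the only figure taken from the slides is the mean log value of 2.2.

```python
import math


def geometric_mean(values):
    """Geometric mean via the natural log transformation (values must be > 0)."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation needs strictly positive values")
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    # Transform the mean of the logged data back to the original scale.
    return math.exp(mean_log)
```

With a mean of 2.2 on the log scale, math.exp(2.2) is about 9.03, which matches the geometric mean of 9.0 quoted above.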
Secondary prevention of coronary heart disease [* Median (range), # Geometric mean (range)]
Confidence Interval “The estimated mean difference in systolic blood pressure between 100 diabetic and 100 non-diabetic men was 6.0 mmHg with 95% confidence interval (1.1 mmHg, 10.9 mmHg)”
Confidence Interval • Contains information about the (im)precision of the estimated effect size • Presents a range of values, on the basis of the sample data, in which the population value for such an effect size may lie
Confidence Interval • 95% CI for mean = mean ± 1.96 SEM • 90% CI for mean = mean ± 1.64 SEM • SEM = sd / sqrt(n)
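These formulas translate directly into code. The sketch below uses the Normal multipliers from the slide (1.96 or 1.64), which is a large-sample approximation; for small samples a t-distribution multiplier would normally be used instead. Function names are illustrative assumptions.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) for mean ± z * SEM, where SEM = sd / sqrt(n)."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * sem, mean + z * sem


def implied_standard_error(lower, upper, z=1.96):
    """Recover the standard error from a reported confidence interval."""
    return (upper - lower) / (2 * z)
```

Applied to the blood pressure example, implied_standard_error(1.1, 10.9) gives 2.5 mmHg, so the reported interval is 6.0 ± 1.96 × 2.5.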
Confidence Interval • The 95% CI is a range of values which we are 95% confident covers the true population mean • If the study were repeated many times, about 5% of the resulting 95% CIs would fail to cover the ‘true’ mean
Significance/hypothesis tests Measure the strength of evidence provided by the data for or against some proposition of interest, e.g. is the survival rate after X better than after Y?
Significance/hypothesis tests Null hypothesis: “Effects of X and Y are the same” Alternative hypothesis: “Effects of X and Y are different”
Significance/hypothesis tests One-sided: “X is better than Y” Two-sided: “X and Y have different effects”
P-value P is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true
P-value P <= 0.05 • evidence against the null hypothesis, which is rejected • the data suggest a difference between X and Y • result is statistically significant
P-value P > 0.05 • insufficient evidence to reject the null hypothesis • a difference between X and Y has not been demonstrated (this is not proof of no difference) • result is not statistically significant
P-value Power of study • probability of rejecting null hypothesis when false • increased by increasing sample size • increased if true difference between treatments is large
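How power depends on sample size and on the true difference can be illustrated with a Normal approximation for comparing two group means. This is a rough sketch under stated assumptions (equal group sizes, a common known sd, two-sided test); it is not taken from the slides, and the function name is hypothetical.

```python
from statistics import NormalDist


def approximate_power(true_difference, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    std_normal = NormalDist()
    # Standard error of the difference between two group means.
    se = sd * (2 / n_per_group) ** 0.5
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(true_difference) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
```

For a true difference of half a standard deviation, approximate_power(0.5, 1.0, 64) is roughly 0.8; doubling the sample size or the true difference raises it, in line with the bullets above.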
P-value Statistical significance does not imply clinical significance
A statistician is a person whose lifetime ambition is to be wrong 5% of the time
Types of significance tests Chi-square test: “28 out of 70 smokers have a cough compared with 5 out of 50 non-smokers - is there a significant difference?” [28/70 = 40% compared with 5/50=10%]
Chi-square test result “P=0.001” There is a significant relationship between smoking and cough
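The smoking and cough example can be reproduced with the standard 2×2 chi-square formula. The sketch below is illustrative; for 1 degree of freedom the p-value can be obtained from the complementary error function, so no external library is needed. Whether the slide's result used a continuity correction is not stated, so the correction is offered as an option.

```python
import math


def chi_square_2x2(a, b, c, d, continuity_correction=False):
    """Chi-square test for the 2x2 table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity_correction:
        # Yates' continuity correction.
        diff = max(diff - n / 2, 0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With 28 coughers and 42 non-coughers among smokers, and 5 and 45 among non-smokers, chi_square_2x2(28, 42, 5, 45) gives a chi-square of about 13.2 (p below 0.001); with the continuity correction it is about 11.7, and the p-value rounds to 0.001.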
Types of significance tests Two-sample t-test: “Is there a difference in the 24 hour energy expenditure between groups of lean and obese women?”
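A minimal sketch of the two-sample t statistic, assuming equal variances in the two groups (the pooled form). The function name is an assumption, and the slides give no energy expenditure data, so any values passed in are the reader's own. The statistic is then compared with a t distribution on the returned degrees of freedom to get a p-value.

```python
import math
import statistics


def two_sample_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / se, n_a + n_b - 2
```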
Types of significance tests Mann-Whitney U-test: “Is there a difference in the nausea score between chemo patients receiving an active anti-emetic treatment and those receiving placebo?”
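For ordinal scores such as nausea ratings, the Mann-Whitney U statistic counts, over all pairs of patients from the two groups, how often one group's score exceeds the other's. The sketch below is illustrative: the function name is hypothetical, ties count as half, and the p-value uses a Normal approximation without tie correction, which is only rough for small samples.

```python
from statistics import NormalDist


def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-sided p-value) for two independent samples."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5  # ties contribute half a count
    n_a, n_b = len(group_a), len(group_b)
    u_b = n_a * n_b - u_a
    # Normal approximation to the null distribution of U.
    mean_u = n_a * n_b / 2
    sd_u = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (u_a - mean_u) / sd_u
    p_value = 2 * NormalDist().cdf(-abs(z))
    return min(u_a, u_b), p_value
```

A U of 0 means the two groups do not overlap at all; a U near n_a × n_b / 2 means the scores are thoroughly intermixed.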