Statistical Power • The power of a test is the probability of detecting a difference or relationship if such a difference or relationship really exists. Anything that decreases the probability of a Type II error increases the power.
Statistical Power • Understanding statistical power helps you interpret scientific studies more objectively. Power analysis, as it is called in research reports, helps you judge whether a study was capable of detecting an effect of the intervention. In other words, it helps you decide whether a finding of no difference between groups reflects a true absence of difference or a study too weak to detect one. Power is determined by significance level, sample size, and effect size.
Statistical Power-Significance Level • Significance level: The fixed probability of wrongly rejecting the null hypothesis, called α, is usually set at 0.05 or 0.01, e.g., α = 0.05. Wrongly rejecting a true null hypothesis is termed a Type I error. If the researcher fails to reject the null hypothesis when it is false, a Type II error occurs. Nurses tend to think that far more Type II errors occur. The probability of making a Type II error is called β. The power of a test is 1 - β. β is usually set between .05 and .20. Using a power of 1 - .20, or .80, means that the researcher is willing to accept a 20% chance of finding no group difference when the intervention actually did have an effect.
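The link between α, β, and power can be made concrete with a small calculation. The sketch below is an illustration only: it assumes a two-tailed comparison of two equal-sized groups, a standardized effect size, and a normal approximation (none of which the slides specify), and the function name `power_two_group` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def power_two_group(effect, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed, two-group z-test.

    effect      -- assumed true difference in standard-deviation units
    n_per_group -- subjects in each group (equal groups assumed)
    alpha       -- Type I error rate (significance level)
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    shift = effect * sqrt(n_per_group / 2)      # expected size of the test statistic
    # Probability that the statistic lands beyond either cutoff
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


power = power_two_group(effect=0.5, n_per_group=64)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")

# With no true effect, the rejection probability is just alpha (the Type I error rate)
print(f"rejection rate under the null = {power_two_group(0.0, 64):.3f}")
```

Under these assumptions, a medium effect with 64 subjects per group gives a power close to the conventional .80, so β is close to .20.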
Statistical Power-Sample Size • Sample size: The number of subjects in the study must be large enough so that the sample outcomes are truly representative of the outcomes that would occur if the entire population were studied. Increasing the sample size increases the power of the study: the larger the sample, the more likely a false null hypothesis is to be rejected.
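A quick simulation can show the effect of sample size. This is a sketch under stated assumptions rather than a method from the slides: outcomes are drawn from normal distributions with a known standard deviation of 1, the test is a two-tailed z-test at the .05 level, and the function name `rejection_rate` is hypothetical.

```python
import random
from math import sqrt


def rejection_rate(effect, n, trials=1000, z_crit=1.96, seed=0):
    """Fraction of simulated studies that reject the null hypothesis.

    effect -- assumed true difference between groups, in sd units
    n      -- subjects per group
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n)]
        diff = sum(treated) / n - sum(control) / n
        z = diff / sqrt(2.0 / n)  # standard error of the difference is sqrt(2/n)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


for n in (20, 50, 100):
    with_effect = rejection_rate(0.5, n)  # estimated power
    no_effect = rejection_rate(0.0, n)    # estimated Type I error rate
    print(f"n = {n:3d}: rejects {with_effect:.2f} with an effect, {no_effect:.2f} without")
```

When a real effect exists, the rejection rate climbs with n. When there is no effect, it stays near the significance level whatever the sample size, which is why larger samples help only in rejecting a null hypothesis that is actually false.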
Statistical Power – Effect Size • Effect size: The extent of the differences in the effect of an intervention on an experimental group as compared to a control group, or the effect of the independent variable on a dependent variable. Since the researcher does not know this before the study, he or she must estimate it by: • a. evaluating similar existing studies and making an informed guess • b. selecting the smallest effect size thought to be clinically significant • c. conducting a pilot study to get an idea of what it might be • d. using conventions: post hoc power analyses of a large number of nursing studies showed that the effect size averaged .35.
Statistical Power – Effect Size • Cohen’s text (1988), the benchmark for conventions in determining differences between two groups, sets the effect size, measured in standard deviation units, at .20 for small effects, .50 for medium effects, and .80 for large effects. Note: If you expect large effects, you can use a smaller sample size. Medium effects should be detectable by the “naked eye.” • Using test scores with a standard deviation of 100, a small effect would be 100 x 0.20 or 20 points, a medium effect would be 100 x 0.50 or 50 points, and a large effect would be 100 x 0.80 or 80 points.
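The arithmetic on this slide runs in both directions: a benchmark can be converted into raw points, and an observed group difference can be converted into a standardized effect size. The sketch below assumes the pooled-standard-deviation form of Cohen's d; the function names and the sample scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Cohen's conventional benchmarks, as quoted on the slide
BENCHMARKS = {"small": 0.20, "medium": 0.50, "large": 0.80}


def raw_difference(effect_size, sd):
    """Convert a standardized effect size into raw score points."""
    return effect_size * sd


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


for label, d in BENCHMARKS.items():
    print(f"{label}: {raw_difference(d, sd=100):.0f} points on a test with sd = 100")

# Hypothetical scores, invented purely to exercise the function
treated = [520, 610, 480, 555, 590]
control = [500, 560, 450, 530, 540]
print(f"observed d = {cohens_d(treated, control):.2f}")
```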
How Many Subjects? From H.C. Kraemer & S. Thiemann, Sage Publications
Introduction • Science requires that researchers proposing a hypothesis put that hypothesis to a test. • The researcher’s hypothesis is considered false until demonstrated “beyond a reasonable doubt” to be true. • The reasonable doubt is the significance level: usually a doubt of .05 or .01 is allowed to remain. • How likely the evidence is to be “convincing” depends on the tests used and the number of subjects, that is, on the power.
Introduction cont. • To compute the power, the researchers must develop a “critical effect size,” a measure of how strong the hypothesized effect must be to be “important to society.” This comes from analysis of the literature and knowledge of the subject, the population, and the measurement tool or data source. • Remember, “post hoc” power calculations do not change the outcomes.
The Power Table • These authors present a Master Table that has four sections: • For one-tailed tests with α = .05 • For one-tailed tests with α = .01 • For two-tailed tests with α = .05 • For two-tailed tests with α = .01
The Power Table cont. • The columns of each section relate to various levels of power (10%-99%). • The rows relate to the critical effect size, Δ, which ranges from 0 (the researcher’s hypothesis is false) to 1.0 (there is no doubt about the truth of the researcher’s hypothesis). • The body of the table has numbers, v, that correspond to the sample size as determined by the design.
The Power Table cont. • More subjects are needed for a .01 significance level than for a .05 level. • More subjects are needed for a two-tailed test than for a one-tailed test. • The smaller the critical effect size, the larger the necessary sample size. • The larger the power required, the larger the necessary sample size. • The smaller the sample size, the smaller the power (the greater the chance of failure).
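These patterns can be checked numerically. The sketch below does not reproduce the Master Table or its Δ and v quantities. As a stand-in, it assumes a comparison of two equal groups, a standardized effect size, and the common normal-approximation formula n = 2((z_α + z_power) / effect)² per group. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()


def n_per_group(effect, alpha=0.05, power=0.80, tails=2):
    """Approximate subjects needed per group (normal approximation).

    Not the Master Table: a stand-in formula for a two-group
    comparison with a standardized effect size.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / tails)  # cutoff set by the significance level
    z_power = STD_NORMAL.inv_cdf(power)              # margin needed for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect) ** 2)


baseline = n_per_group(0.5)
print("baseline (alpha .05, two-tailed, power .80):", baseline)
print("stricter alpha of .01:", n_per_group(0.5, alpha=0.01))
print("one-tailed test:", n_per_group(0.5, tails=1))
print("smaller effect of .20:", n_per_group(0.2))
print("higher power of .90:", n_per_group(0.5, power=0.90))
```

Each printed line changes one input from the baseline, so the direction of the change in sample size can be compared with the corresponding bullet.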
Disciplines and Sample Size • Minimum sample size differs from field to field. Some may be chosen based on custom. • An opinion survey usually has at least 1,000 subjects. • Sociological or epidemiological studies usually have several hundred subjects. • Medical clinical trials may have as few as 10-20. • Behavioral studies may have one subject.
Values in the Master Table • Different statistical tests require different ways of calculating power, and the results may be a little different from those in the table. • The values in the Master Table should be regarded as approximate. Use any two of the three entries (power, critical effect size, and the tabled sample-size value) to find the third. • An estimate of the critical effect size is required, that is, the minimum effect considered important to detect, such as a 20% decrease in a person’s mental aptitude in less than six months.
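The "any two give the third" idea can be sketched with a single equation solved three ways. As before, this is an assumption-laden illustration and not the Master Table: it uses a normal approximation for two equal groups with a standardized effect size, ignores the negligible far tail of a two-tailed test, and the function name `solve_third` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def solve_third(alpha=0.05, tails=2, power=None, effect=None, n=None):
    """Given exactly two of power, effect, and n (per group), return the third.

    Based on the approximation  z_alpha + z_power = effect * sqrt(n / 2).
    """
    if sum(value is None for value in (power, effect, n)) != 1:
        raise ValueError("supply exactly two of power, effect, and n")
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / tails)
    if power is None:
        return STD_NORMAL.cdf(effect * sqrt(n / 2) - z_alpha)
    if effect is None:
        return (z_alpha + STD_NORMAL.inv_cdf(power)) * sqrt(2 / n)
    return 2 * ((z_alpha + STD_NORMAL.inv_cdf(power)) / effect) ** 2


needed_n = solve_third(power=0.80, effect=0.5)
print(f"n per group for power .80 and effect .50: {needed_n:.1f}")
print(f"power with that n and effect .50: {solve_third(effect=0.5, n=needed_n):.2f}")
print(f"smallest detectable effect with n = 30 at power .80: {solve_third(power=0.80, n=30):.2f}")
```

The last call is the form most useful when the sample size is fixed in advance: it reports the smallest effect the study could reasonably expect to detect, which can then be compared with the critical effect size.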