Clinical Research Management 512

Clinical Research Management 512 Leslie McIntosh lmcintosh at path.wustl.edu

Lecture 7 • Homework • Complete problems from slides • Review demo • Part I • Presentations • Part II • p-value and statistical significance • Part III • Hypothesis testing • CI & Statistical Significance

Homework • Homework Demonstration

Distribution of Weight • Mean weight (kg) = 51.3 • Median weight (kg) = 49.5 • Minimum weight (kg) = 25.7 • Maximum weight (kg) = 98.1

Samples from Distribution

Statistics from Samples

Confidence Intervals from Samples Population Mean = 51.3 Sample A Sample B Sample C
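
The idea on this slide can be sketched in a few lines of Python: draw several samples, then compute a mean and an approximate 95% confidence interval for each. Only the population mean (51.3 kg) comes from the slides. The normal shape, the standard deviation of 12 kg, the sample size of 30, and the names `sample_ci` and `ASSUMED_SD` are assumptions made for illustration, and the interval uses the normal approximation (z = 1.96).

```python
import random
import statistics

# Only the population mean is taken from the slides. The normal shape and the
# spread below are assumptions chosen purely for illustration.
POPULATION_MEAN = 51.3
ASSUMED_SD = 12.0  # hypothetical value, not from the slides


def sample_ci(sample, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% CI of the mean."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5  # standard error
    return mean, mean - z * se, mean + z * se


rng = random.Random(512)  # fixed seed so the run is repeatable
results = {}
for name in ("Sample A", "Sample B", "Sample C"):
    sample = [rng.gauss(POPULATION_MEAN, ASSUMED_SD) for _ in range(30)]
    results[name] = sample_ci(sample)

for name, (mean, lower, upper) in results.items():
    covers = lower <= POPULATION_MEAN <= upper
    print(f"{name}: mean={mean:.1f}, 95% CI=({lower:.1f}, {upper:.1f}), "
          f"contains 51.3: {covers}")
```

Each sample gives a different interval. Over many repetitions, roughly 95% of such intervals would be expected to contain the population mean.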

Demonstrations • http://www.amstat.org/publications/jse/v16n3/pvalueapplet.html • Schulz, Eric. "Decisions Based on P-Values and Significance Levels," from the Wolfram Demonstrations Project. http://demonstrations.wolfram.com/DecisionsBasedOnPValuesAndSignificanceLevels/

Part I Presentations

Part II p-values and statistical significance

Statistical Significance • The statistical significance of a result is the probability that a relationship (e.g., between variables) or a difference (e.g., between means) at least as strong as the one observed in a sample would occur by pure chance ("luck of the draw") if, in the population from which the sample was drawn, no such relationship or difference existed. • The statistical significance of a result tells us something about the degree to which the result is "true" (in the sense of being "representative of the population").

p-values • The p-value is a decreasing index of the reliability of a result (see Brownlee, 1960). • The higher the p-value, the less we can believe that the relation observed between variables in the sample is a reliable indicator of the relation between those variables in the population. • The p-value is often described as the probability of error involved in accepting our observed result as valid, that is, as "representative of the population." Strictly, it is the probability of obtaining a result at least as extreme as ours if no relation existed in the population.

p-values (example) • A p-value of .05 (i.e., 1/20) indicates that there is a 5% probability that the relation between the variables found in our sample is a "fluke." • Meaning: assuming that in the population there was no relation between those variables whatsoever, and we were repeating experiments like ours one after another, we could expect that in approximately 1 of every 20 replications the relation between the variables in question would be equal to or stronger than in ours. • Note that this is not the same as saying that, given that there IS a relationship between the variables, we can expect to replicate the results 5% of the time or 95% of the time.
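
The "1 in 20 replications" reading can be checked with a small simulation. The sketch below repeats an experiment many times in a population where the null hypothesis is true by construction, and counts how often the p-value falls below .05. The two-sided z-test with known standard deviation, the sample size, the number of replications, and the function names are all assumptions made for illustration.

```python
import math
import random


def two_sided_p(sample, mu0, sigma):
    """Two-sided z-test p-value for H0: population mean equals mu0 (sigma known)."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))


def fraction_significant(n_experiments=2000, n=25, alpha=0.05, seed=7):
    """Share of replications with p < alpha when H0 is actually true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # The null hypothesis is true by construction: the mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(sample, mu0=0.0, sigma=1.0) < alpha:
            hits += 1
    return hits / n_experiments


print(fraction_significant())
```

The printed fraction should land close to 0.05, that is, about 1 replication in 20, even though no relation exists in the simulated population.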

Conclusions from P-values • If the p-value is less than α: • The difference between samples is statistically significant. • Reject the null hypothesis (H0). • If the p-value is greater than α: • The difference between samples is not statistically significant. • Do not reject the null hypothesis (H0).
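
The decision rule above can be written as a short function. This is a minimal sketch: the name `conclusion` and the default α of 0.05 are illustrative choices, and because the slide does not say what to do when the p-value equals α, that case is treated here as not significant.

```python
def conclusion(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        return "statistically significant: reject H0"
    # Assumption: p equal to alpha is treated as not significant here.
    return "not statistically significant: do not reject H0"


print(conclusion(0.03))
print(conclusion(0.20))
```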

Pros of Saying “Statistically Significant” • It is sometimes necessary to give a concise answer. • An exact p-value is not always obtainable. • Sounds less ambiguous than saying, “Random sampling would create a difference this big or bigger in 5% of experiments if the null hypothesis were true.”

Part III Hypothesis Testing CI & Statistical Significance

Hypothesis • What are you trying to answer? • Do you have a secondary question of interest? • What variables will you need to answer your question?

Error Types H0 = Null Hypothesis

Error Types • Type I • False Positive • Occurs when the null hypothesis is rejected, but it should not have been rejected • Type II • False Negative • Occurs when the null hypothesis is not rejected and it should have been rejected
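
The two error types depend on two facts: whether H0 is actually true and whether it was rejected. A minimal sketch of that two-by-two logic, using the hypothetical helper name `classify_outcome`:

```python
def classify_outcome(h0_is_true, h0_rejected):
    """Label the result of a test given the truth about H0 and the decision."""
    if h0_rejected and h0_is_true:
        return "Type I error (false positive)"
    if not h0_rejected and not h0_is_true:
        return "Type II error (false negative)"
    return "correct decision"


# Walk through all four combinations of truth and decision.
for h0_is_true in (True, False):
    for h0_rejected in (True, False):
        print(f"H0 true={h0_is_true}, rejected={h0_rejected}: "
              f"{classify_outcome(h0_is_true, h0_rejected)}")
```

In practice the truth about H0 is unknown, so the function describes what kind of error is possible, not something that can be computed from a single study.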

Analogies for Hypothesis Test (trial → hypothesis test) • Defendant is innocent → Null hypothesis • Defendant is guilty → Alternative hypothesis • Gathering of evidence → Gathering of data • Summary of evidence → Calculation of test statistic • Jury deliberation and decision → Application of the decision rule

Analogies for Hypothesis Test (trial → hypothesis test) • Verdict → Decision • Verdict is to acquit → Failure to reject the null hypothesis • Verdict is to convict → Rejection of the null hypothesis • Presumption of innocence → Assumption that the null hypothesis is true

Analogies for Hypothesis Test (trial → hypothesis test) • Conviction of an innocent person → Type I error (false positive) • Acquittal of a guilty person → Type II error (false negative) • Beyond reasonable doubt → Fixed (small) probability of Type I error • High probability of convicting a guilty person → High power

Relationship Between: Confidence Interval & Statistical Significance When a null hypothesis contains a value… • If a 95% CI does not contain the H0 value, then the result must be statistically significant with p < 0.05. • If a 95% CI does contain the H0 value, then the result must not be statistically significant (p > 0.05).
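
The link between a 95% CI and p < 0.05 can be illustrated numerically. The sketch below assumes a two-sided one-sample z-test with a known standard deviation, so the interval and the test are built from the same standard error. The function name `z_test_summary` and the input numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_summary(sample_mean, sigma, n, h0_value, confidence=0.95):
    """Compare a CI for the mean with a two-sided z-test of the H0 value."""
    alpha = 1.0 - confidence
    se = sigma / math.sqrt(n)  # standard error of the mean
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = sample_mean - z_crit * se
    upper = sample_mean + z_crit * se
    z = (sample_mean - h0_value) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return {
        "ci": (lower, upper),
        "p_value": p_value,
        "ci_contains_h0": lower <= h0_value <= upper,
        "significant": p_value < alpha,
    }


# Hypothetical inputs: the H0 value is 51.3 and sigma is assumed known.
far = z_test_summary(sample_mean=54.0, sigma=10.0, n=100, h0_value=51.3)
near = z_test_summary(sample_mean=52.0, sigma=10.0, n=100, h0_value=51.3)
print(far)
print(near)
```

In the first call the interval excludes 51.3 and the p-value is below 0.05. In the second the interval contains 51.3 and the p-value is above 0.05. The two checks agree because both compare the same distance, measured in standard errors, against the same critical value.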

Relationship Between: Confidence Interval & Statistical Significance When a null hypothesis contains a value… • If a 90% CI does not contain the H0 value, then the result must be statistically significant with p < _____. • If a 90% CI does contain the H0 value, then the result must not be statistically significant (p > ____).
