Objective of Statistical Analysis. To answer research questions using observed data, via data reduction and analysis of variability. To make an inference about a population based on information contained in a sample from that population. To provide an associated measure of how good the inference is.
1. Probability and Statistical Inference (Gehlbach: Chapter 8)
3. Basic Concepts of Statistics
4. General Approach to Statistical Analysis
5. Outline Probability
Definition
Probability Laws
Random Variable
Probability Distributions
Statistical Inference
Definition
Sample vs. Population
Sampling Variability
Sampling Problems
Central Limit Theorem
Hypothesis Testing
Test Statistics
P-value Calculation
Errors in Inference
P-value Adjustments
Confidence Intervals
6. We disagree with Stephen: a working understanding of P-values is not difficult to come by.
For the most part, statistics and clinical research can work well together.
Good collaborations result when researchers have some knowledge of design and analysis issues.
7. Probability
8. Probability and the P-value You need to understand what a P-value means
P-value represents a probabilistic statement
Need to understand concept of probability distributions
More on P-values later
9. Definition of Probability An experiment is any process by which an observation is made
An event (E or Ei) is any outcome of an experiment
The sample space (S) is the set of all possible outcomes of an experiment
Probability: a measure defined on the sample space S; in the simplest case it is estimated empirically as (# times event occurs) / (total # trials)
E.g.: Pr(red car) = (# red cars seen) / (total # cars seen)
Probability is the basis for statistical inference
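The relative-frequency idea above can be illustrated with a short simulation. This is an illustrative sketch, not part of the original slides; the 30% figure for red cars is a made-up assumption:

```python
import random

random.seed(0)  # reproducible run

# Hypothetical assumption: 30% of passing cars are red.
TRUE_P_RED = 0.30
n_trials = 100_000

# Count how often the event "red car" occurs across many trials.
red_count = sum(random.random() < TRUE_P_RED for _ in range(n_trials))
p_hat = red_count / n_trials  # (# red cars seen) / (total # cars)

# The relative frequency settles near the underlying probability.
assert abs(p_hat - TRUE_P_RED) < 0.01
```

With more trials, the empirical estimate p_hat drifts ever closer to the true probability, which is the sense in which relative frequency grounds the definition.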
10. Axiomatic Probability (laying down “the laws”)
For any sample space S containing events E1, E2, E3,…; we assign a number, P(Ei), called the probability of Ei such that:
0 ≤ P(Ei) ≤ 1
P(S) = 1
If E1, E2, E3, … are pairwise mutually exclusive events in S, then P(E1 ∪ E2 ∪ E3 ∪ …) = P(E1) + P(E2) + P(E3) + …
11. Union and Intersection: Venn Diagrams
12. Laws of Probability (the sequel)
Let E′ (“E complement”) be the set of outcomes in S not in E; then P(E′) = 1 − P(E)
P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2)
The conditional probability of E1 given that E2 has occurred: P(E1 | E2) = P(E1 ∩ E2) / P(E2), provided P(E2) > 0
Events E1 and E2 are independent if
P(E1 ∩ E2) = P(E1)P(E2)
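These laws can be checked by direct enumeration on a small sample space. A minimal sketch using one roll of a fair die (the two events chosen here are arbitrary examples, not from the slides):

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

def P(event):
    # Equally likely outcomes: probability = |event| / |S|
    return Fraction(len(event & S), len(S))

E1 = {2, 4, 6}  # "roll is even"
E2 = {4, 5, 6}  # "roll is greater than 3"

# Complement rule: P(E') = 1 - P(E)
assert P(S - E1) == 1 - P(E1)

# Addition rule: P(E1 ∪ E2) = P(E1) + P(E2) - P(E1 ∩ E2)
assert P(E1 | E2) == P(E1) + P(E2) - P(E1 & E2)

# Conditional probability: P(E1 | E2) = P(E1 ∩ E2) / P(E2)
cond = P(E1 & E2) / P(E2)
assert cond == Fraction(2, 3)

# Independence test: P(E1 ∩ E2) = P(E1)P(E2)?  Here it fails,
# so these two particular events are dependent.
assert P(E1 & E2) != P(E1) * P(E2)
```

Using exact fractions (rather than floats) makes each law hold identically, not just approximately.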
13. Conditional Probability Restrict yourself to a “subspace” of the sample space
14. Conditional Probability Examples
Categorical data analysis: the odds ratio is a ratio of the odds from two conditional probabilities
Survival analysis: conditional probabilities of the form P(alive at time t1+t2 | survived to t1)
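The survival-analysis form of conditional probability can be made concrete. A minimal sketch under an assumed exponential survival model (the model and rate are illustrative choices, not from the slides):

```python
import math

# Hypothetical exponential survival model: S(t) = exp(-lam * t)
# is the probability of surviving beyond time t.
lam = 0.1  # assumed hazard rate

def S(t):
    return math.exp(-lam * t)

t1, t2 = 5.0, 3.0

# P(alive at t1 + t2 | survived to t1) = P(alive at t1+t2) / P(alive at t1)
cond = S(t1 + t2) / S(t1)

# For the exponential model this equals S(t2): the "memoryless" property.
assert abs(cond - S(t2)) < 1e-12
```

Conditioning restricts attention to the subspace of subjects still alive at t1, exactly as the "subspace" slide describes.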
15. Random Variables (where the math begins)
A random variable is a (set) function with domain S and range the real line (i.e., a real-valued function defined over a sample space)
E.g.: tossing a coin, let X = 1 if heads, X = 0 if tails
P(X=0) = P(X=1) = ½
Many times the random variable of interest will be the realized value of the experiment (e.g., if X is the b-segment PSV from RDS)
Random variables have probability distributions
16. Probability Distributions Two types:
Discrete distributions (and discrete random variables) are represented by a finite (or countable) number of values
P(X=x) = p(x)
Continuous distributions (and continuous random variables) are represented over a real-valued interval
P(x1<X<x2) = F(x2) − F(x1), where F is the cumulative distribution function
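The two cases can be sketched side by side. An illustrative example (fair die for the discrete case, Uniform(0, 1) for the continuous case; neither is specified in the slides):

```python
# Discrete: fair die, p(x) = 1/6 for x in 1..6
p = {x: 1 / 6 for x in range(1, 7)}
assert abs(sum(p.values()) - 1.0) < 1e-12  # probabilities sum to 1
assert abs(p[3] - 1 / 6) < 1e-12           # P(X = 3) = p(3)

# Continuous: Uniform(0, 1), whose CDF is F(x) = x on [0, 1].
def F(x):
    return min(max(x, 0.0), 1.0)

# P(x1 < X < x2) = F(x2) - F(x1)
assert abs((F(0.75) - F(0.25)) - 0.5) < 1e-12
# For a continuous r.v., any single point has probability zero:
assert F(0.4) - F(0.4) == 0.0
```

Note the contrast: discrete variables assign probability to individual values via p(x), while continuous variables assign probability only to intervals via F.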
17. Expected Value & Variance Random variables are typically described using two quantities:
Expected value = E(X) (the mean, usually “µ”)
Variance = V(X) (usually “σ²”)
Discrete Case:
E(X) = Σ x·p(x); V(X) = Σ (x − µ)²·p(x)
Continuous Case:
E(X) = ∫ x·f(x) dx; V(X) = ∫ (x − µ)²·f(x) dx
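The discrete-case formulas can be computed directly. A sketch for a fair die (an illustrative distribution, not from the slides):

```python
# Discrete case: E(X) = sum of x * p(x); V(X) = sum of (x - mu)^2 * p(x)
# Example distribution: one roll of a fair six-sided die.
p = {x: 1 / 6 for x in range(1, 7)}

mu = sum(x * px for x, px in p.items())                # E(X)
var = sum((x - mu) ** 2 * px for x, px in p.items())   # V(X)

assert abs(mu - 3.5) < 1e-12       # mean of a fair die is 3.5
assert abs(var - 35 / 12) < 1e-12  # variance is 35/12 ≈ 2.92
```

The continuous case replaces the sums with integrals over the density f(x) but is otherwise identical in structure.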
18. Discrete Distribution Example Binomial:
Experiment consists of n identical trials
Each trial has only 2 outcomes: success (S) or failure (F)
P(S) = p for a single trial; P(F) = 1-p = q
Trials are independent
R.V. X = the number of successes in n trials
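The binomial probability mass function follows directly from the assumptions above: C(n, k) ways to place k successes, each arrangement having probability p^k·q^(n−k). A minimal sketch with illustrative n and p:

```python
from math import comb

# Binomial pmf: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # illustrative values: 10 fair-coin tosses
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

assert abs(sum(pmf) - 1.0) < 1e-12                     # pmf sums to 1
assert abs(binom_pmf(5, 10, 0.5) - 252 / 1024) < 1e-12  # C(10,5)/2^10

# The mean of Binomial(n, p) is n * p:
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
assert abs(mean - n * p) < 1e-12
```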
19. Continuous Distribution Example Normal (Gaussian):
The normal distribution is defined by its probability density function,
f(x) = (1 / (σ√(2π))) · exp(−(x − µ)² / (2σ²)),
for parameters µ and σ, where σ > 0.
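The density formula translates line for line into code. A sketch (defaults µ = 0, σ = 1 give the standard normal):

```python
import math

# Normal density: f(x) = 1/(sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2*sigma^2))
def normal_pdf(x, mu=0.0, sigma=1.0):
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-((x - mu) ** 2) / (2.0 * sigma**2))

# The standard normal peaks at the mean, with height 1/sqrt(2*pi) ≈ 0.399.
assert abs(normal_pdf(0.0) - 1 / math.sqrt(2 * math.pi)) < 1e-12

# The density is symmetric about the mean.
assert abs(normal_pdf(1.3) - normal_pdf(-1.3)) < 1e-12
```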
22. Statistical Inference
23. Statistical Inference Is there a difference in the population?
You do not know about the population. Just the sample you collected.
Develop a Probability model
Infer characteristics of a population from a sample
How likely is it that the observed sample data would arise if the null hypothesis were true?
24. Statistical Inference
Mean = ?
25. Definition of Inference Infer a conclusion/estimate about a population based on a sample from that population
If you collect data from the whole population, you don’t need to infer anything
Inference = conducting hypothesis tests (for P-values) and estimating 95% CIs
26. Sample vs. Population (example) “The primary sample [involved] students in the 3rd through 5th grades in a community bordering a major urban center in North Carolina… The sampling frame for the study was all third through fifth-grade students attending the seven public elementary schools in the community (n=2,033). From the sampling frame, school district evaluation staff generated a random sample of 700 students.”
Source: Bowen, NK. (2006) Psychometric properties of Elementary School Success Profile for Children. Social Work Research, 30(1), p. 53.
27. Philosophy of Science Idea: We posit a paradigm and attempt to falsify that paradigm.
Science progresses faster via attempting to falsify a paradigm than attempting to corroborate a paradigm.
(Thomas S. Kuhn. 1970. The Structure of Scientific Revolutions. University of Chicago Press.)
28. Philosophy of Science It is easier to collect evidence that contradicts a claim than to prove that a claim is true.
The fastest way to progress in science under a paradigm of falsification is through perturbation experiments.
In epidemiology, we are often unable to do perturbation experiments,
so science becomes a process of accumulating evidence
Statistical testing provides a rigorous, data-driven framework for falsifying hypotheses
29. What is Statistical Inference? A generalization made about a larger group or population from the study of a sample of that population.
Sampling variability: repeat your study (sample) over and over again. Results from each sample would be different.
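Sampling variability is easy to see by simulation: draw many samples from one fixed population and watch the sample mean change. A sketch with a made-up population (mean 100, SD 15 are arbitrary illustrative values):

```python
import random
import statistics

random.seed(1)  # reproducible

# A fixed hypothetical "population" of 10,000 measurements.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Repeat the "study" 200 times: each sample of 50 gives its own mean.
sample_means = [
    statistics.mean(random.sample(population, 50)) for _ in range(200)
]

# Each repetition yields a different result (sampling variability)...
assert len(set(round(m, 6) for m in sample_means)) > 1

# ...but the sample means cluster around the population mean.
pop_mean = statistics.mean(population)
assert abs(statistics.mean(sample_means) - pop_mean) < 2.0
```

The spread of these sample means is exactly the "sampling distribution" that the Central Limit Theorem, below, describes.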
30. Sampling Variability
Mean = ?
31. Sampling Variability
Mean = ?
32. Sampling Problems Low Response Rate
Refusals to Participate
Attrition
33. Low Response Rate Response rate = % of targeted sample that supply requested information
Statistical inferences extend only to individuals who are similar to completers
A low response rate does not guarantee nonresponse bias, but it is a possible symptom of it
34. Low Response Rate (examples) “One hundred six of the 360 questionnaires were returned, a response rate of 29%.”
Source: Nordquist, G. (2006) Patient insurance status and do-not-resuscitate orders: Survival of the richest? Journal of Sociology & Social Welfare, 33(1), p. 81.
“At the 7th week, we sent a follow-up letter to thank the respondents and to remind the nonrespondents to complete and return their questionnaires. The follow-up letter generated 66 additional usable responses.”
Source: Zhao JJ, Truell AD, Alexander MW, Hill IB. (2006) Less success than meets the eye? The impact of Master of Business Administration education on graduates’ careers. Journal of Education for Business, 81(5), p. 263.
“The response rate, however, was below our expectation. We used 2 procedures to explore issues related to non-response bias. First, there were several identical items that we used in both the onsite and mailback surveys. We compared the responses of the non-respondents to those of respondents for [both surveys]. No significant differences between respondents and non-respondents were observed. We then conducted a follow-up telephone survey of non-respondents to test for potential non-response bias as well as to explore reasons why they had not returned their survey instruments…”
Source: Kyle GT, Mowen AJ, Absher JD, Havitz ME. (2006) Commitment to public leisure service providers: A conceptual and psychometric analysis. Journal of Leisure Research, 38(1), 86-87.
35. Refusals to Participate Similar kind of problem to having low response rates
Statistical inferences may extend only to those who agreed to participate, not to all asked to participate
Compare those who agree to participate with those who refuse
36. Refusals to Participate (example) “Participants were 38 children aged between 7 and 9 years. Children were from working- or middle-class backgrounds, and were drawn from 2 primary schools in the north of England. Letters were sent to the parents of all children between 7 and 9 in both schools seeking consent to participate in the study. Around 40% of the parents approached agreed for their children to take part.”
Source: Meins E, Fernyhough C, Johnson F, Lidstone J. (2006) Mind-mindedness in children: Individual differences in internal-state talk in middle childhood. British Journal of Developmental Psychology, 24(1), p. 184.
37. Attrition Individuals who drop out before study’s end (not an issue for every study design)
Systematic differences between those who drop out and those who stay in produce attrition bias.
Possible checks: conduct a follow-up study on the dropouts
Compare the baseline data of dropouts and completers
38. Attrition (example) “…Of the 251 men who completed an assigned intervention, about a fifth (19%) failed to return for a 1-month assessment and more than half (54%) for a 3-month assessment… Conclusions also cannot be generalized beyond the sample [partly because] attrition in the evaluation study was relatively high and it was not random. Therefore, findings cannot be generalized to those least likely to complete intervention sessions or follow-up assessments.”
Source: Williams ML, Bowen AM, Timpson SC, Ross MW, Atkinson JS. (2006) HIV prevention and street-based male sex workers: An evaluation of brief interventions. AIDS Education & Prevention, 18(3), pp.207-214.
“The 171 participants who did not return for their two follow-up visits represent a significant attrition rate (34%). A comparison of demographic and baseline measures indicated that [those who stayed in the study versus those who did not] differed on age, BMI, when diagnosed, language, ethnicity, HbA1c, PCS, MCS and symptoms of depression (CES-D).”
Source: Maljanian R, Grey N, Staff I, Conroy L. (2005) Intensive telephone follow-up to a hospital-based disease management model for patients with diabetes mellitus. Disease Management, 8(1), p. 18.
39. Back to Inference….
40. Motivation Typically you want to see if there are differences between groups (e.g., Treatment vs. Control)
Approach this by looking at the “typical” value, or the “difference on average,” between groups
Thus we look at differences in central tendency to quantify group differences
E.g., test whether two sample means differ (assuming equal variances) in an experiment
42. Central Limit Theorem The CLT states that, regardless of the distribution of the original data, the average of the data is approximately Normally distributed for large samples
Why such a big deal?
Allows for hypothesis testing (p-values) and CI’s to be estimated
43. Central Limit Theorem If a random sample is drawn from a population, a statistic (like the sample average) follows a distribution called a “sampling distribution”.
CLT tells us the sampling distribution of the average is a Normal distribution, regardless of the distribution of the original observations, as the sample size increases.
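The CLT can be demonstrated by simulation: start from strongly skewed data and check that sample means nevertheless behave like a Normal distribution. A sketch using an exponential distribution (an illustrative choice; any non-Normal distribution would do):

```python
import random
import statistics

random.seed(2)  # reproducible

# Skewed original data: exponential with rate 1 (mean 1, sd 1).
def sample_mean(n):
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# 2,000 replications of "draw a sample of 100, take its mean".
means = [sample_mean(100) for _ in range(2000)]

# CLT prediction: means ~ approx Normal(mu, sigma/sqrt(n)) = Normal(1, 0.1).
assert abs(statistics.mean(means) - 1.0) < 0.02
assert abs(statistics.stdev(means) - 0.1) < 0.02

# About 95% of sample means fall within mu ± 1.96 * sigma/sqrt(n):
within = sum(abs(m - 1.0) < 1.96 * 0.1 for m in means) / len(means)
assert 0.90 < within < 0.99
```

The original observations are heavily right-skewed, yet their averages show the symmetric, bell-shaped behavior the CLT predicts, which is what licenses Normal-theory P-values and CIs.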
45. What is the P-value? The P-value is the probability of getting a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true
That is, the P-value is the chance of obtaining results like yours under the assumption that your null hypothesis is true.
If this probability is low (say P<0.05), then you conclude your data do not support the null being true and “reject the null hypothesis.”
46. Hypothesis Testing & P-value P-value is: Pr(observed data results | null hypothesis is true)
If P-value is low, then conclude null hypothesis is not true and reject the null (“in data we trust”)
How low is low?
47. Statistical Significance If the P-value is as small as or smaller than the pre-determined Type I error rate (size) α, we say that the data are statistically significant at level α.
What value of α is typically assumed?
50. Why P-value < 0.05? This arbitrary cutoff has hardened over time largely through precedent.
In legal matters, courts typically require statistical significance at the 5% level.
51. The P-value The P-value lies on a continuum of evidence against the null hypothesis.
It is not just a dichotomous indicator of significance.
Would you change your standard of care surgery procedure for p=0.049999 vs. p=0.050001?
52. Gehlbach’s beefs with P-value Size of P-value does not indicate the [clinical] importance of the result
Results may be statistically significant but practically unimportant
Differences not statistically significant are not necessarily unimportant ***
53. Any difference can become statistically significant if N is large enough
Even if there is statistical significance is there clinical significance?
54. Controversy around HT and P-value “A methodological culprit responsible for spurious theoretical conclusions”
(Meehl, 1967; see Greenwald et al, 1996)
“The p-value is a measure of the credibility of the null hypothesis. The smaller the P-value is, the less likely one feels the null hypothesis can be true.”
55. HT and p-value “It cannot be denied that many journal editors and investigators use P-value < 0.05 as a yardstick for the publishability of a result.”
“This is unfortunate because not only P-value, but also the sample size and magnitude of a physically important difference determine the quality of an experimental finding.”
56. HT and p-value “[We] endorse the reporting of estimation statistics (such as effect sizes, variabilities, and confidence intervals) for all important hypothesis tests.”
Greenwald et al (1996)
57. Test Statistics Each hypothesis test has an associated test statistic.
A test statistic measures compatibility between the null hypothesis and the data.
A test statistic is a random variable with a certain distribution.
A test statistic is used to calculate probability (P-value) for the test of significance.
58. How a P-value is calculated A data summary statistic is estimated (like the sample mean)
A “test” statistic is calculated which relates the data summary statistic to the null hypothesis about the population parameter (the population mean)
The observed/calculated test statistic is compared to what is expected under the null hypothesis using the Sampling Distribution of the test statistic
The Probability of finding the observed test statistic (or more extreme) is calculated (this is the P-value)
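The four steps above can be sketched with a one-sample test of a mean. The data and null value here are made up for illustration; with only n = 10 observations a t distribution would be more appropriate than the Normal approximation used in this sketch:

```python
import math
import statistics

# Hypothetical data; null hypothesis H0: population mean = 50.
data = [52.1, 49.8, 53.4, 51.2, 50.9, 52.8, 48.7, 51.5, 52.2, 50.4]
mu0 = 50.0

# Step 1: data summary statistic (the sample mean).
xbar = statistics.mean(data)

# Step 2: test statistic relating the summary to H0
# (a z-like statistic using the sample sd and standard error).
se = statistics.stdev(data) / math.sqrt(len(data))
z = (xbar - mu0) / se

# Steps 3-4: compare z to its sampling distribution under H0
# (approximated as standard Normal) and compute the two-sided
# probability of a result this extreme or more -- the P-value.
phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # Normal CDF at |z|
p_value = 2 * (1 - phi)

assert z > 0                 # sample mean lies above the null value
assert 0.0 < p_value < 0.05  # evidence against H0 at the 0.05 level
```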
59. Hypothesis Testing Set up a null and alternative hypothesis
Calculate test statistic
Calculate the P-value for the test statistic
Based on P-value make a decision to reject or fail to reject the null hypothesis
Make your conclusion
60. Errors in Statistical Inference
61. The Four Possible Outcomes in Hypothesis Testing
62. The Four Possible Outcomes in Hypothesis Testing
63. Type I Errors
64. Type II Errors
65. P-value adjustments
66. P-value adjustments Sometimes adjustments for multiple testing are made
Bonferroni: adjusted α = α / (# of tests)
α is usually 0.05 (the P-value cutoff)
Bonferroni is a common (but conservative) adjustment; many others exist
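The Bonferroni rule is a one-line calculation. A sketch with illustrative numbers (the raw P-values below are made up):

```python
# Bonferroni adjustment: adjusted alpha = alpha / (number of tests)
alpha = 0.05
n_tests = 5
adjusted_alpha = alpha / n_tests
assert abs(adjusted_alpha - 0.01) < 1e-12  # matches the 0.05/5 example

# Equivalent view: multiply each raw P-value by the number of tests
# (capped at 1) and compare to the original alpha.
raw_p = [0.004, 0.03, 0.2]
adj_p = [min(p * n_tests, 1.0) for p in raw_p]
expected = [0.02, 0.15, 1.0]
assert all(abs(a - b) < 1e-9 for a, b in zip(adj_p, expected))
```

Both views give the same decisions; the second is how statistical software usually reports "Bonferroni-adjusted P-values."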
67. P-value adjustments (example) “An alpha of .05 was used for all statistical tests. The Bonferroni correction was used, however, to reduce the chance of committing a Type I error. Therefore, given that five statistical tests were conducted, the adjusted alpha used to reject the null hypothesis was .05/5 or alpha = .01.”
Source: Cumming-McCann A. (2005) An investigation of rehabilitation counselor characteristics, white racial attitudes, and self-reported multicultural counseling competencies. Rehabilitation Counseling Bulletin, 48(3), 170-171.
70. Confidence Intervals
73. Bayesian vs. Classical Inference There are 2 main camps of Statistical Inference:
Frequentist (classical) statistical inference
Bayesian statistical inference
Bayesian inference incorporates “past knowledge” about the probability of events using “prior probabilities”
Bayesian paradigm assumes parameters of interest follow a statistical distribution of their own; Frequentist inference assumes parameters are fixed
Statistical inference is then performed to ascertain the “posterior probability” of outcomes, depending on:
the data
the assumed prior probabilities
74. Schedule