Session 8: Paired Samples (Zar, Chapters 9 and 24)
General: One population of subjects, x1, x2, …, xn, but with a pair of data points on each.
Examples:
• Before and after treatment
• Left and right
• Evaluator 1 and Evaluator 2 on the same subject
• Method 1 vs. Method 2
Paired t-test:
• Two-Sided Test:
• One-Sided Tests:
Example 9.1:
• H0: Hindleg length = Foreleg length
• HA: Hindleg length ≠ Foreleg length
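The paired t-test works on the within-subject differences only. The sketch below is a minimal illustration, not the slide's own calculation: the differences `d` are an assumption, reconstructed so that they reproduce the signs and ranks reported for this example later in the session (the raw leg lengths are not shown in these slides), and the critical value 2.262 is the standard two-tailed 5% value of t for 9 degrees of freedom.

```python
import math
import statistics

# ASSUMPTION: paired differences d = hindleg - foreleg, reconstructed to match
# the sign counts (8 positive, 2 negative) and ranks reported in this session.
d = [4, 4, -3, 5, -1, 5, 6, 5, 6, 2]

n = len(d)
d_bar = statistics.mean(d)      # mean of the paired differences
s_d = statistics.stdev(d)       # sample SD of the differences (n - 1 denominator)
se = s_d / math.sqrt(n)         # standard error of the mean difference
t = d_bar / se                  # paired t statistic with n - 1 degrees of freedom

T_CRIT = 2.262                  # two-tailed 5% critical value of t, 9 d.f. (table value)
reject = abs(t) >= T_CRIT       # two-sided decision rule

print(f"mean d = {d_bar:.2f}, SE = {se:.3f}, t = {t:.3f}, df = {n - 1}, reject H0: {reject}")
```

For a one-sided test the same statistic would be compared with a one-tailed critical value, keeping the sign of t instead of its absolute value.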
Sign Test: Chapter 24.7
• H0: same # increases as decreases
• HA: increases ≠ decreases
• HA1: increases < decreases
• HA2: increases > decreases
Form:
• S+ = # positive signs
• S- = # negative signs
Two-Sided Test:
• H0: increases = decreases
• HA: increases ≠ decreases
• Compare: if Min{S+, S-} ≤ Table B.27[n*, α(2)], reject H0.
• Note: n* = # pos + # neg = S+ + S-. Do not include zeros!
• The sign test is the same as the binomial test of H0: P+ = 0.5 vs. HA: P+ ≠ 0.5, where P+ = true proportion of positives.
One-Sided Tests:
• If S ≤ Table B.27[n*, α(1)], reject H0, where S is whichever of S+ or S- the alternative predicts to be small.
• Note: n* = # pos + # neg = S+ + S-.
• Do not include zeros!
• Some statisticians would include zeros in a one-sided test, as the zeros represent non-support for the alternative.
Example 24.11
The sign test for the paired-sample data of Examples 9.1 and 9.3.
• H0: No difference between hindleg and foreleg length.
• HA: Difference between hindleg and foreleg length.
• n* = 10; S+ = 8; S- = 2; Min{S+, S-} = 2.
• B.27[α(2), n*] = B.27[0.05(2), 10] = 1.
• Since 2 > 1, do not reject H0.
• Using Table B.26b for n = 10 and p = 0.5: P(X ≤ 2 or X ≥ 8) = 2(0.00098 + 0.00977 + 0.04395) = 0.1094.
• Since the probability is greater than 0.05, do not reject H0.
1-Sided Test:
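Both the two-sided and the one-sided version of this example can be checked with exact binomial probabilities instead of table lookups. This is a minimal sketch; the function name `sign_test` is hypothetical, and the one-sided P it returns applies only to the alternative that points in the observed direction.

```python
from math import comb

def sign_test(s_plus, s_minus):
    """Exact sign test under H0: P+ = 0.5. Zeros must already be dropped."""
    n = s_plus + s_minus                  # n* = number of non-zero differences
    k = min(s_plus, s_minus)              # the smaller of the two sign counts
    # Lower-tail probability P(X <= k) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_one_sided = tail                    # alternative in the observed direction
    p_two_sided = min(1.0, 2 * tail)      # both tails, by symmetry at p = 0.5
    return p_one_sided, p_two_sided

p1, p2 = sign_test(8, 2)
print(f"one-sided P = {p1:.4f}, two-sided P = {p2:.4f}")
```

With S+ = 8 and S- = 2, both probabilities exceed 0.05, which agrees with the table-based decision not to reject H0.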
Wilcoxon Signed-Rank Test:
• H0: Ranks decreases = Ranks increases
• HA: decreases ≠ increases
• HA1: decreases < increases
• HA2: decreases > increases
Procedure:
• Rank the differences (di's) without regard to sign, from smallest to largest, handling ties as in the Mann-Whitney test (tied values share the mean of their ranks).
• Form T+ = sum of + ranks and T- = sum of - ranks.
• If n* ≤ 100, use Table B.12 (App 101), where n* = number of non-zero differences (n* = n+ + n-).
• For HA: decreases ≠ increases: if Min{T+, T-} ≤ B.12[α(2), n*], reject H0.
Example 9.3
The Wilcoxon paired-sample test applied to the data of Example 9.1.
• H0: Deer hindleg length is the same as foreleg length.
• HA: Deer hindleg length is not the same as foreleg length.
• n = 10
• T+ = 4.5 + 4.5 + 7 + 7 + 9.5 + 7 + 9.5 + 2 = 51
• T- = 3 + 1 = 4
• Min{4, 51} = 4
• From Table B.12: T0.05(2),10 = 8
• Since T- < T0.05(2),10, H0 is rejected.
• 0.01 < P(T- or T+ ≤ 4) < 0.02
In general:
• H0: ranks+ = ranks-
• HA: ranks+ ≠ ranks-
• If Min{T+, T-} ≤ Table B.12[α(2), n*] = Tα(2),n*, reject H0.
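The rank sums in this example can be reproduced in a few lines. The sketch below assumes the differences `d`, which are reconstructed to match the ranks listed above rather than taken from the slides, and uses a hypothetical helper name, `signed_rank_sums`. The critical value 8 is the Table B.12 value quoted in the example.

```python
def signed_rank_sums(diffs):
    """Return (T+, T-, n*) using average ranks for tied absolute differences."""
    nonzero = [d for d in diffs if d != 0]        # zeros are dropped
    ordered = sorted(nonzero, key=abs)            # rank without regard to sign
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Tied values occupy ranks i+1 .. j; give each the mean of those ranks.
        rank_of[abs(ordered[i])] = (i + 1 + j) / 2
        i = j
    t_plus = sum(rank_of[abs(d)] for d in nonzero if d > 0)
    t_minus = sum(rank_of[abs(d)] for d in nonzero if d < 0)
    return t_plus, t_minus, len(nonzero)

# ASSUMPTION: differences reconstructed to match the ranks shown in the example.
d = [4, 4, -3, 5, -1, 5, 6, 5, 6, 2]
t_plus, t_minus, n_star = signed_rank_sums(d)

T_CRIT = 8                                        # T0.05(2),10 as quoted from Table B.12
reject = min(t_plus, t_minus) <= T_CRIT
print(f"T+ = {t_plus}, T- = {t_minus}, n* = {n_star}, reject H0: {reject}")
```

A useful check on any hand calculation is that T+ + T- equals n*(n* + 1)/2, here 55.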
Note: T+ + T- = n*(n* + 1)/2, so T- = n*(n* + 1)/2 - T+ and T+ = n*(n* + 1)/2 - T-.
One-Sided Tests: If we use d = x1 - x2, for one-tailed testing we use one-tailed critical values from Table B.12 and either T+ or T- as follows.
• For the hypotheses H0: Measurements in reading 1 ≤ measurements in reading 2 and HA2: Measurements in reading 1 > measurements in reading 2 (a decrease from reading 1 to reading 2): reject H0 if T- ≤ Tα(1),n*.
• For the opposite hypotheses, H0: Measurements in reading 1 ≥ measurements in reading 2 and HA1: Measurements in reading 1 < measurements in reading 2 (an increase from reading 1 to reading 2): reject H0 if T+ ≤ Tα(1),n*.
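If we use d = x2 - x1, the roles of T+ and T- are reversed.
Normal Approximation:
• No Ties: μT = n*(n* + 1)/4, σT = sqrt(n*(n* + 1)(2n* + 1)/24), Z = |T - μT| / σT
• With diff = x1 - x2, for HA: 1 ≠ 2 use either T- or T+ for T. If diff = x2 - x1, reverse the sides.
• If Z > Kα(sides), reject H0 (Table B.2).
• For Ties:
• For Zero adjustment: (m = # zeros)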
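The formulas for this slide did not survive in the text, so the following is a sketch of the standard textbook normal approximation rather than a reconstruction of the slide. It uses mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24, with the usual tie correction that subtracts Σ(t³ - t)/48, where t is the size of each group of tied absolute differences. The zero adjustment is not implemented, the function name `wilcoxon_z` is hypothetical, and the differences are the same assumed values used for the deer example.

```python
import math
from collections import Counter

def wilcoxon_z(t_stat, abs_diffs, correct_ties=True):
    """Normal-approximation Z for a Wilcoxon rank sum T (non-zero differences only)."""
    n = len(abs_diffs)
    mu = n * (n + 1) / 4                           # mean of T under H0
    var = n * (n + 1) * (2 * n + 1) / 24           # variance of T under H0, no ties
    if correct_ties:
        # Each group of t tied |d| values reduces the variance by (t^3 - t)/48.
        tie_term = sum(t ** 3 - t for t in Counter(abs_diffs).values())
        var -= tie_term / 48
    return abs(t_stat - mu) / math.sqrt(var)

# ASSUMPTION: reconstructed differences for the deer example; T- = 4.
d = [4, 4, -3, 5, -1, 5, 6, 5, 6, 2]
abs_d = [abs(x) for x in d if x != 0]

z_no_ties = wilcoxon_z(4, abs_d, correct_ties=False)
z_ties = wilcoxon_z(4, abs_d, correct_ties=True)
print(f"Z without tie correction = {z_no_ties:.3f}, with tie correction = {z_ties:.3f}")
```

With n* = 10 the approximation is only illustrative; the table-based test is the appropriate one at this sample size.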
McNemar's Test:
• Analysis of preference tests, or "which do you like better: Coke or Pepsi?"
• Many product tests use this technique.
Example 9.4 Comparison of Lotions:
• H0: The proportion of persons experiencing relief is the same with both lotions.
• HA: The proportion of persons experiencing relief is not the same with both lotions.
Principle: (Relief, Relief) and (No Relief, No Relief) give no information as to which is better!
Under H0, f12 and f21 estimate the same quantity:

                   f12              f21              Total
Observed           f12              f21              f12 + f21
Estimated value    (f12 + f21)/2    (f12 + f21)/2    f12 + f21

• Chi-square = (f12 - f21)² / (f12 + f21)
• Degrees of freedom: 2 cells - 1 = 1
• Test: if Chi-Square > χ²α,1, reject H0.
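Because only the two discordant counts enter the test, McNemar's chi-square is easy to compute directly. The sketch below uses hypothetical counts (15 and 5), not the lotion data, and a hypothetical function name. The optional continuity correction is a common variant and is an addition to what the slide states; the P value uses the closed form for 1 degree of freedom.

```python
import math

def mcnemar(f12, f21, correction=False):
    """McNemar chi-square (1 d.f.) from the two discordant cell counts."""
    diff = abs(f12 - f21)
    if correction:
        diff = max(diff - 1, 0)           # optional continuity correction
    chi2 = diff ** 2 / (f12 + f21)        # sum of (observed - expected)^2 / expected
    # Upper-tail probability of chi-square with 1 d.f.: P = erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical discordant counts, for illustration only.
chi2, p = mcnemar(15, 5)
chi2_c, p_c = mcnemar(15, 5, correction=True)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}; corrected = {chi2_c:.2f}, P = {p_c:.4f}")
```

The concordant cells never appear in the function, which mirrors the principle that they carry no information about which treatment is better.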
Biomedical Applications
1) Examiner vs Examiner
Comparing Against Truth (the diagnostic test):
• Test: X-ray, MR, CT, CEA, PSA, TGF, …
• Truth (the gold standard): pathology (biopsy, FNA, surgical section), time and observation, panel of experts
Other names and parameters:
• True Positive fraction = TPF = sensitivity
• True Negative fraction = TNF = specificity
• False Positive fraction = FPF = 1 - TNF
• False Negative fraction = FNF = 1 - TPF
• Positive Predictive Value = positive accuracy
• Negative Predictive Value = negative accuracy
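These quantities all come from one 2 x 2 table of test result against truth. A minimal sketch, with hypothetical counts and a hypothetical function name, shows how each is computed:

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Accuracy measures from a 2 x 2 table of test decision vs. truth."""
    return {
        "TPF (sensitivity)": tp / (tp + fn),   # truly abnormal cases called abnormal
        "TNF (specificity)": tn / (tn + fp),   # truly normal cases called normal
        "FPF": fp / (fp + tn),                 # equals 1 - TNF
        "FNF": fn / (fn + tp),                 # equals 1 - TPF
        "PPV": tp / (tp + fp),                 # positive calls that are truly abnormal
        "NPV": tn / (tn + fn),                 # negative calls that are truly normal
    }

# Hypothetical counts: 50 truly abnormal and 100 truly normal cases.
summary = diagnostic_summary(tp=45, fn=5, fp=10, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity condition on the truth, while the predictive values condition on the test decision, so the predictive values change with the mix of abnormal and normal cases in the study.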
Comparisons:
• Two sensitivities from two different studies:
• Two sensitivities from the same study: select only the True Positive or True Negative cases and compare the two tests to each other.
In Summary:
1) Individual McNemar chi-squares
2) Above versus Below: 1 d.f.
Heterogeneity chi-square = (sum of the individual chi-squares) - (above vs. below chi-square), i.e. (1) - (2); d.f. = # chi-squares - 1
Ex: Mildness Study
• Exam 1: after cleaning
• Exam 2: one month later

Hypothesis           Chi-Square   D.F.   P
H0: f12 = f21        6.68         1      0.0098
H0: f13 = f31        0.605        1      0.437
H0: f23 = f32        0.600        1      0.439
Total                7.88         3      0.048
H0: below = above    3.62         1      0.057
Heterogeneity        4.27         2      0.119

Conclusions:
• f12 ≠ f21.
• Above versus Below not significantly different.
• No heterogeneity: homogeneous.
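The arithmetic behind the Total and Heterogeneity rows can be sketched as follows, using the chi-square values from the table. The helper `chi2_sf` is hypothetical and covers only 1, 2 or 3 degrees of freedom, where the upper-tail probability has a closed form.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate for df = 1, 2 or 3."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("only df = 1, 2 or 3 are supported in this sketch")

pairwise = [6.68, 0.605, 0.600]        # individual McNemar chi-squares, 1 d.f. each
pooled = 3.62                          # above vs. below chi-square, 1 d.f.

total = sum(pairwise)                  # total chi-square, d.f. = number of chi-squares
heterogeneity = total - pooled         # d.f. = number of chi-squares - 1

p_pairwise = [chi2_sf(x, 1) for x in pairwise]
p_total = chi2_sf(total, len(pairwise))
p_pooled = chi2_sf(pooled, 1)
p_het = chi2_sf(heterogeneity, len(pairwise) - 1)

print("pairwise P:", [round(p, 4) for p in p_pairwise])
print(f"total = {total:.3f} (P = {p_total:.3f}), pooled P = {p_pooled:.3f}")
print(f"heterogeneity = {heterogeneity:.3f} (P = {p_het:.3f})")
```

Small differences from the tabled values in the last digit are expected, since the table entries are themselves rounded.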
5) Rating Scale Data
• Comparison of rater to "truth"
• Examples:
  a) Diagnostic Radiology Systems
     Diagnostic value of MR, Xerox, and screen film in detection of breast cancer
  b) Pathology
     Comparison of staining systems to predict relapse (early vs. late)
     Monoclonal stains or micro-satellite probes to predict stage of cancer
  c) Laboratory Medicine
     1) Comparison of machine classification of cells
  d) Training
     1) Comparison of novice to standard diagnosis
a) Often created from raters looking "blinded" at packets of cases.
b) Easy to set up, but requires "truth" from (1) another method, (2) a gold standard, or (3) a team of raters.
ROC (Receiver Operating Characteristic) Analysis
(a) Calculate 2 x 2 tables:
• 1: Make cut point after "Very Likely"
  [2 x 2 table with columns Decide Abnormal and Decide Normal]
• Repeat with the next cut point
  [2 x 2 table with columns Decide Abnormal and Decide Normal]
• And so on to get:
(b) Plot the following points to create an ROC curve:
(0, 0)
1: (FP1, TP1)
2: (FP2, TP2)
3: (FP3, TP3)
4: (FP4, TP4)
(1, 1)
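The cut-point procedure can be sketched in code. The example assumes a five-category rating scale, ordered from most to least suspicious, with hypothetical counts for truly abnormal and truly normal cases. Each cut point pools every category at or above it into "Decide Abnormal", which yields one (FP, TP) pair.

```python
def roc_points(abnormal_counts, normal_counts):
    """ROC points (FPF, TPF) from rating counts ordered most to least suspicious."""
    n_abnormal = sum(abnormal_counts)
    n_normal = sum(normal_counts)
    points = [(0.0, 0.0)]                  # strictest rule: nothing called abnormal
    tp = fp = 0
    # One cut point after each category except the last.
    for a, b in zip(abnormal_counts[:-1], normal_counts[:-1]):
        tp += a                            # abnormal cases rated at or above the cut
        fp += b                            # normal cases rated at or above the cut
        points.append((fp / n_normal, tp / n_abnormal))
    points.append((1.0, 1.0))              # laxest rule: everything called abnormal
    return points

# Hypothetical counts per rating category, for illustration only.
abnormal = [30, 10, 5, 3, 2]               # truly abnormal cases
normal = [2, 3, 5, 10, 30]                 # truly normal cases

for fpf, tpf in roc_points(abnormal, normal):
    print(f"FP = {fpf:.2f}, TP = {tpf:.2f}")
```

Five rating categories give four interior points, matching the four numbered points on the slide.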
2×2 and k×k tables
• McNemar tests on Likert scales:
  (1) Pairwise
  (2) Pooled (above vs below)
  (3) Heterogeneity chi-square