A Unified Approach for Assessing Agreement Lawrence Lin, Baxter Healthcare A. S. Hedayat, University of Illinois at Chicago Wenting Wu, Mayo Clinic
Outline • Introduction • Existing approaches • A unified approach • Simulation studies • Examples
Introduction • Different situations for agreement • Two raters, each with a single reading • More than two raters, each with a single reading • More than two raters, each with multiple readings • Agreement within a rater • Agreement among raters based on means • Agreement among raters based on individual readings
Existing Approaches (1) • Agreement between two raters, each with a single reading • Categorical data: • Kappa and weighted kappa • Continuous data: • Concordance Correlation Coefficient (CCC) • Intraclass Correlation Coefficient (ICC)
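For the two-rater, single-reading case, the CCC can be computed directly from sample moments. The sketch below uses the usual moment form of Lin's coefficient, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), with 1/n moments; the function name and the readings are made up for illustration and are not taken from the slides.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's CCC for paired readings x and y, using 1/n sample moments."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two readings")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # A shift in means (or a difference in spread) lowers the CCC even when
    # the Pearson correlation is perfect.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical readings: rater_b repeats rater_a shifted by one unit.
rater_a = [10.0, 12.0, 14.0, 16.0]
rater_b = [11.0, 13.0, 15.0, 17.0]
ccc = concordance_correlation(rater_a, rater_b)
```

Here the two raters are perfectly correlated, yet the CCC is below 1 because of the constant one-unit shift. The function does not guard against two constant, identical series, for which the denominator is zero.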
Existing Approaches (2) • Agreement among more than two raters, each with a single reading • Lin (1989): no inference • Barnhart, Haber and Song (2001, 2002): GEE • King and Chinchilli (2001, 2001): U-statistics • Carrasco and Jover (2003): variance components
Existing Approaches (3) • Agreement among more than two raters, each with multiple readings • Barnhart (2005) • Intra-rater / inter-rater (based on means) / total (based on individual observations) agreement • GEE method to model the first and second moments
Unified Approach • Agreement among k (k≥2) raters, with each rater measuring each of the n subjects multiple (m) times • Separate intra-rater agreement and inter-rater agreement • Measure relative agreement, precision, and accuracy, as well as absolute agreement via the Total Deviation Index (TDI) and Coverage Probability (CP)
Unified Approach - summary • The GEE method is used to estimate all agreement indices and to draw inferences on them • All agreement indices are expressed as functions of variance components • Data: continuous/binary/ordinal • Most currently popular methods become special cases of this approach
Unified Approach - model • Set up • subject effect • subject by rater effect • error effect • rater effect
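One model consistent with the four effects listed above is a two-way mixed model. The notation below is an assumption made for illustration (the symbols are not taken from the slide): subject i, rater j, replicate l, with random subject, subject-by-rater, and error effects and a fixed rater effect.

```latex
y_{ijl} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijl},
\qquad i = 1,\dots,n,\quad j = 1,\dots,k,\quad l = 1,\dots,m,
\\[4pt]
\alpha_i \sim (0,\ \sigma_\alpha^2)\ \text{(subject)},\quad
\gamma_{ij} \sim (0,\ \sigma_\gamma^2)\ \text{(subject by rater)},\quad
e_{ijl} \sim (0,\ \sigma_e^2)\ \text{(error)},
\\[4pt]
\sum_{j=1}^{k} \beta_j = 0,\qquad
\sigma_\beta^2 = \frac{1}{k-1} \sum_{j=1}^{k} \beta_j^2 \ \text{(rater)}.
```

Under this assumed setup, every agreement index can be written as a function of the four variance components.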
Unified Approach - targets • Intra-rater agreement: • overall, are the k raters consistent with themselves? • Inter-rater agreement: • Inter-rater agreement (agreement based on means): overall, do the k raters agree with each other based on the average of the m readings? • Total agreement (agreement based on individual readings): overall, do the k raters agree with each other based on the individual readings?
Unified Approach – agreement(intra) • CCC(intra): over all k raters, how well does each rater reproduce his or her own readings?
Unified Approach – precision(intra) and MSD • Precision(intra): for any rater j, the proportion of the variance that is attributable to the subjects (same as CCC(intra)) • MSD(intra): examines the absolute agreement independent of the total data range
Unified Approach – TDI(intra) • TDI(intra): for each rater j, a specified percentage of observations are within TDI(intra) units of their replicate readings from the same rater. Φ(·) is the cumulative normal distribution and |·| is the absolute value
Unified Approach – CP(intra) • CP(intra): for each rater j, the proportion of observations that are within a specified number of units of their replicate readings from the same rater
Unified Approach – agreement(inter) • CCC(inter): over all k raters, how well do the raters reproduce each other's readings, based on the averages of the multiple readings?
Unified Approach – precision(inter) • Precision(inter): for any two raters, the proportion of the variance that is attributable to the subjects, based on the averages of the m readings
Unified Approach – accuracy(inter) • Accuracy(inter): how close are the means of the different raters?
Unified Approach – TDI(inter) • TDI(inter): over all k raters, a specified percentage of the averaged readings are within TDI(inter) units of the averaged readings from another rater
Unified Approach – CP(inter) • CP(inter): for each rater j, the proportion of averaged readings that are within a specified number of units of the averaged readings from another rater
Unified Approach – agreement(total) • CCC(total): over all k raters, how well do the raters reproduce each other's readings, based on the individual readings?
Unified Approach – precision(total) • Precision(total): for any two raters, the proportion of the variance that is attributable to the subjects, based on the individual readings
Unified Approach – accuracy(total) • Accuracy(total): how close are the means of the different raters?
Unified Approach – TDI(total) • TDI(total): over all k raters, a specified percentage of the readings are within TDI(total) units of the readings from another rater
Unified Approach – CP(total) • CP(total): for each rater j, the proportion of readings that are within a specified number of units of the readings from another rater
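The intra, inter, and total indices described above can all be expressed through four variance components (subject, rater, subject by rater, error). The sketch below assumes one common set of variance-component expressions; the formulas, the function name `agreement_indices`, and the numeric inputs are illustrative assumptions, not values or definitions quoted from the slides.

```python
def agreement_indices(var_subject, var_rater, var_interaction, var_error, m):
    """CCC, precision, accuracy and MSD at the intra, inter and total levels.

    All expressions are assumed forms based on a two-way mixed model with
    m replicate readings per subject-rater cell.
    """
    a, b, g, e = var_subject, var_rater, var_interaction, var_error
    e_mean = e / m  # error variance of an average of m readings
    return {
        "intra": {
            # Within a rater there is no mean shift, so CCC equals precision.
            "ccc": (a + g) / (a + g + e),
            "precision": (a + g) / (a + g + e),
            "msd": 2 * e,
        },
        "inter": {
            "ccc": a / (a + b + g + e_mean),
            "precision": a / (a + g + e_mean),
            "accuracy": (a + g + e_mean) / (a + b + g + e_mean),
            "msd": 2 * (b + g + e_mean),
        },
        "total": {
            "ccc": a / (a + b + g + e),
            "precision": a / (a + g + e),
            "accuracy": (a + g + e) / (a + b + g + e),
            "msd": 2 * (b + g + e),
        },
    }


# Hypothetical variance components with m = 2 readings per cell.
idx = agreement_indices(8.0, 0.5, 1.0, 0.5, m=2)
```

With these assumed expressions, CCC factors into precision times accuracy at the inter and total levels, which mirrors the way the slides separate the two components.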
Unified Approach • Φ⁻¹(·) is the inverse cumulative normal distribution • χ²(·) is a central chi-square distribution with df = 1
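The two functions named on this slide are enough to turn a mean squared deviation (MSD) into a TDI or a CP. The sketch below assumes the usual normal approximation (differences roughly normal with a negligible mean shift): TDI = Φ⁻¹(1 − (1 − π)/2)·√MSD and CP = χ²₁(δ²/MSD). The names `tdi`, `cp`, `pi`, and `delta` are illustrative, and the chi-square CDF with df = 1 is evaluated through the identity P(Z² ≤ x) = erf(√(x/2)).

```python
from math import erf, sqrt
from statistics import NormalDist


def tdi(msd, pi=0.90):
    """Approximate bound that captures a proportion pi of absolute differences."""
    quantile = NormalDist().inv_cdf(1 - (1 - pi) / 2)  # inverse normal CDF
    return quantile * sqrt(msd)


def cp(msd, delta):
    """Approximate proportion of absolute differences within delta units."""
    x = delta ** 2 / msd
    # Central chi-square CDF with 1 degree of freedom, evaluated at x.
    return erf(sqrt(x / 2))


# Hypothetical MSD of 4.0 (squared measurement units).
bound_90 = tdi(4.0, pi=0.90)
coverage = cp(4.0, delta=bound_90)
```

The two indices are inverses of each other under this approximation: plugging the TDI for a given percentage back into the CP returns that percentage.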
Estimation and Inference • Estimate all means, variance components, and their variances and covariances by the GEE method • Estimate all indices using the above estimates • Estimate the variances of all indices using the above estimates and the delta method
Estimation and Inference (2) • Between-rater covariance: the covariance of two replications, one coming from rater j and the other from rater j′ (j ≠ j′)
Estimation and Inference (3) • Cell variance: the variance within each combination of (i, j), i.e., each cell. The overall within-cell variance is thus the average of all cells’ variances.
Estimation and Inference (4) • Rater variance: the variance of a replication from rater j • Within-rater covariance: the covariance of two replications, both coming from rater j
Estimation and Inference (5) • The GEE method is used to estimate all indices by estimating the means and all variance components
Estimation and Inference (8) • The working variance-covariance structure: “working” means that a normal distribution is assumed • The derivative matrix: the derivatives of the expectation with respect to all the parameters
Estimation and Inference (9) • The GEE method provides: • estimates of all means • estimates of all variance components • estimates of variances for all variance components • estimates of covariances between any two variance components
Estimation and Inference (10) • The delta method is used to estimate the variances of all indices
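The delta method approximates the variance of an index f(θ̂) by ∇f(θ)ᵀ Σ ∇f(θ), where Σ is the covariance matrix of the parameter estimates. The sketch below is a generic illustration with a numerical gradient; the ratio index, the estimates, and the covariance matrix are made-up values, and the helper name `delta_method_variance` is hypothetical rather than part of the authors' software.

```python
def delta_method_variance(func, theta, cov, step=1e-6):
    """Approximate Var[func(theta_hat)] as grad^T * cov * grad.

    The gradient is taken by central finite differences at theta.
    """
    size = len(theta)
    grad = []
    for i in range(size):
        up, down = list(theta), list(theta)
        up[i] += step
        down[i] -= step
        grad.append((func(up) - func(down)) / (2 * step))
    return sum(grad[i] * cov[i][j] * grad[j]
               for i in range(size) for j in range(size))


def ratio_index(theta):
    """Illustrative index: share of variance not due to error."""
    between, error = theta
    return between / (between + error)


# Hypothetical estimates and their (diagonal) covariance matrix.
theta_hat = [9.0, 0.5]
cov_hat = [[4.0, 0.0], [0.0, 0.01]]
var_index = delta_method_variance(ratio_index, theta_hat, cov_hat)
```

An analytic gradient would normally be preferred when the index has a closed form; the numerical version is used here only to keep the example short.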
Estimation and Inference (18) • Transformations for variances • Z-transformation: CCC-indices and precision indices • Logit-transformation: accuracy and CP indices • Log-transformation: TDI indices
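Each transformation maps a bounded index onto an unbounded scale, where a normal-theory limit is computed and then mapped back. The sketch below assumes one-sided 95% limits and propagates the standard error to the transformed scale with the delta method; the function names and the one-sided choice are assumptions for illustration.

```python
from math import atanh, exp, log, tanh
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)  # one-sided 95% normal quantile (assumed level)


def lower_limit_z(est, se):
    """Lower limit for a CCC or precision index via Fisher's Z-transformation."""
    se_z = se / (1 - est ** 2)  # d atanh(x)/dx = 1 / (1 - x^2)
    return tanh(atanh(est) - Z95 * se_z)


def lower_limit_logit(est, se):
    """Lower limit for an accuracy or CP index via the logit transformation."""
    se_logit = se / (est * (1 - est))  # d logit(x)/dx = 1 / (x (1 - x))
    bound = log(est / (1 - est)) - Z95 * se_logit
    return 1 / (1 + exp(-bound))


def upper_limit_log(est, se):
    """Upper limit for a TDI index via the log transformation."""
    se_log = se / est  # d log(x)/dx = 1 / x
    return exp(log(est) + Z95 * se_log)


# Hypothetical estimates and standard errors.
ccc_lower = lower_limit_z(0.90, 0.02)
cp_lower = lower_limit_logit(0.95, 0.01)
tdi_upper = upper_limit_log(12.0, 1.5)
```

Working on the transformed scale keeps each limit inside the index's valid range: below 1 for the CCC-type and probability-type indices, above 0 for TDI.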
Simulation Study • Three types of data: binary/ordinal/normal • Three cases for each type of data • k=2, m=1 / k=4, m=1 / k=2, m=3 • For each case: 1000 random samples with sample size n=20 • For binary and ordinal data: inferences obtained with transformation vs. without transformation • For normal data: transformation
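One random sample for the normal-data case could be generated as in the sketch below, shown for k=2, m=3, n=20. The generating model (subject, rater, subject-by-rater, and error effects) and every parameter value are assumptions chosen for illustration; the slides do not state the values used in the study.

```python
import random


def simulate_readings(n, k, m, sd_subject, sd_interaction, sd_error,
                      rater_shifts, seed=0):
    """Return data[i][j][l]: reading l of rater j on subject i (normal data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        subject_effect = rng.gauss(0.0, sd_subject)
        row = []
        for j in range(k):
            # Cell mean: subject effect + fixed rater shift + interaction.
            cell_mean = (subject_effect + rater_shifts[j]
                         + rng.gauss(0.0, sd_interaction))
            row.append([cell_mean + rng.gauss(0.0, sd_error) for _ in range(m)])
        data.append(row)
    return data


# One hypothetical sample; a study like the one described would repeat this
# 1000 times with different seeds and summarize the estimates.
sample = simulate_readings(n=20, k=2, m=3, sd_subject=3.0, sd_interaction=1.0,
                           sd_error=0.7, rater_shifts=[-0.5, 0.5], seed=1)
```

Binary or ordinal samples could be obtained by thresholding such normal readings, though that is only one possible construction.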
Simulation Study (2) • Conclusions: • The algorithm works well for all three types of data, both in estimates and in inferences • For binary and ordinal data: no need for transformation • For normal data, Carrasco’s method is superior to ours, but for categorical data, ours is superior • For ordinal data, Carrasco’s method and ours perform similarly
Example One • Sigma method vs. HemoCue method in measuring the DCLHb level in patients’ serum • 299 samples: each sample collected twice by each method • Range: 50-2000 mg/dL
Example One – HemoCue method • Figure: HemoCue method, first readings vs. second readings
Example One – Sigma method • Figure: Sigma method, first readings vs. second readings
Example One – HemoCue vs. Sigma • Figure: HemoCue’s averages vs. Sigma’s averages
Example One – analysis result (2) • *: for all CCC, precision, accuracy and CP indices, the 95% lower limits are reported. For all TDI indices, the 95% upper limits are reported.