Remarks on uncertainty calculations for key comparisons with a few examples from CCEM key comparisons

Thomas J. Witt, Bureau International des Poids et Mesures (BIPM)

Purpose:
• Call attention to some common errors seen in the statistical analysis of CCEM key comparison results
• Bad analysis slows the processing of key comparisons and can reduce the credibility of the key comparison scheme
• Intended for:
  • Authors of key comparison reports
  • Participants in key comparisons
  • Reviewers of CCEM key comparison reports

Outline
• Motivation: Appendix B of the MRA; degrees of equivalence with respect to the KCRV and between pairs of participants
• Underlying concept: covariance
• Application to key comparisons, KCRV is weighted mean
  • General case with correlations among results
  • Mutually independent results
  • Example: CCEM-K4 (10 pF)
• KCRV is weighted mean with equal weights
• KCRV is unweighted mean with unequal variances
• Examples of treatment of correlations
  • QHE-derived results in CCEM-K4
  • Correlations via PTB calibrations in CCEM-K6a (ac/dc)

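Degrees of equivalence with respect to the KCRV: $d_i = x_i - x_{\mathrm{KCRV}}$, with an uncertainty derived from $\mathrm{var}(x_i - x_{\mathrm{KCRV}})$.
Pairwise degrees of equivalence: $d_{ij} = x_i - x_j$, with an uncertainty derived from $\mathrm{var}(x_i - x_j)$.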

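Underlying concept: covariance
General formulas for covariance, where the operator E is the expectation operator:
$\mathrm{cov}(x, y) = E[(x - E(x))(y - E(y))]$ (1)
$\mathrm{var}(x) = \mathrm{cov}(x, x) = E[(x - E(x))^2]$ (2)
If a, b and c are constants and x, y and z are variables:
$\mathrm{cov}(ax + by, cz) = ac\,\mathrm{cov}(x, z) + bc\,\mathrm{cov}(y, z)$ (3)

$\mathrm{var}(ax + by) = a^2\,\mathrm{var}(x) + b^2\,\mathrm{var}(y) + 2ab\,\mathrm{cov}(x, y)$ (4)
In GUM notation and from (13) of GUM section 5.2.2, if $q = y - z = f$:
$u^2(q) = u^2(y) + u^2(z) - 2u(y, z)$ (5)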

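Applications to key comparisons; key comparison reference value (KCRV) is the weighted mean
• Weights, $g_i$, are proportional to the reciprocal of the variance and normalized: $g_i = (1/s_i^2) / \sum_j (1/s_j^2)$, so that $\sum_i g_i = 1$.
• The experimental standard deviation is denoted by $s$.
• General case: mutual correlations assumed.
$x_{\mathrm{KCRV}} = \sum_{i=1}^{n} g_i x_i$ (6)
This relation will be described shortly:
$\mathrm{var}(x_{\mathrm{KCRV}}) = \sum_{i=1}^{n} \sum_{j=1}^{n} g_i g_j\,\mathrm{cov}(x_i, x_j)$ (7)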

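Illustration: assume only two participants in a key comparison and that their results are correlated:
$\mathrm{var}(g_1 x_1 + g_2 x_2) = g_1^2 s_1^2 + g_2^2 s_2^2 + 2 g_1 g_2\,\mathrm{cov}(x_1, x_2)$
Illustration: for three participants in a key comparison and for correlated results:
$\mathrm{var}(g_1 x_1 + g_2 x_2 + g_3 x_3) = g_1^2 s_1^2 + g_2^2 s_2^2 + g_3^2 s_3^2 + 2 g_1 g_2\,\mathrm{cov}(x_1, x_2) + 2 g_1 g_3\,\mathrm{cov}(x_1, x_3) + 2 g_2 g_3\,\mathrm{cov}(x_2, x_3)$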

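In general, for n participants in a key comparison with mutually correlated results:
Variance of the weighted mean of n mutually correlated results:
$\mathrm{var}(x_{\mathrm{KCRV}}) = \sum_{i=1}^{n} g_i^2 s_i^2 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} g_i g_j\,\mathrm{cov}(x_i, x_j)$ (8)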

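Applications to key comparisons (KCRV is weighted mean)
• B. Next assume mutually independent results; i.e., $\mathrm{cov}(x_i, x_j) = 0$ for all $i \neq j$.
Variance of the weighted mean of n mutually independent results:
$\mathrm{var}(x_{\mathrm{KCRV}}) = \sum_{i=1}^{n} g_i^2 s_i^2 = 1 / \sum_{i=1}^{n} (1/s_i^2)$ (9)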
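As a numerical illustration of the variance of a weighted mean, the following Python sketch evaluates the double sum over $g_i g_j\,\mathrm{cov}(x_i, x_j)$ for a hypothetical three-participant comparison, first with mutually independent results and then with one correlated pair. The standard deviations and the correlation coefficient of 0.5 are assumed values chosen for illustration, not CCEM data, and the function names are hypothetical. The weights are kept at their inverse-variance values in both cases simply to show the effect of the covariance term.

```python
def inverse_variance_weights(s):
    """Return normalized weights g_i proportional to 1 / s_i**2."""
    inv = [1.0 / si**2 for si in s]
    total = sum(inv)
    return [v / total for v in inv]


def weighted_mean_variance(g, cov):
    """Return var(sum_i g_i x_i) = sum_i sum_j g_i g_j cov[i][j]."""
    n = len(g)
    return sum(g[i] * g[j] * cov[i][j] for i in range(n) for j in range(n))


# Hypothetical standard deviations for three participants (arbitrary units).
s = [1.0, 2.0, 2.0]
g = inverse_variance_weights(s)
n = len(s)

# Mutually independent results: the covariance matrix is diagonal.
cov_indep = [[s[i] ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
var_indep = weighted_mean_variance(g, cov_indep)

# Same data, but the second and third participants are assumed
# to be correlated with correlation coefficient r = 0.5.
cov_corr = [row[:] for row in cov_indep]
cov_corr[1][2] = cov_corr[2][1] = 0.5 * s[1] * s[2]
var_corr = weighted_mean_variance(g, cov_corr)

# For independent results the double sum reduces to 1 / sum(1 / s_i**2).
print(var_indep, 1.0 / sum(1.0 / si**2 for si in s))
# A positive covariance increases the variance of the weighted mean.
print(var_corr)
```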

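C. Uncertainty in the degree of equivalence between the value, $x_i$, of a participant whose (independent) result contributed to the KCRV, and the KCRV itself. The KCRV is the weighted mean of the results from all participants having independent reference standards. Since laboratory i contributes to the KCRV, its value is correlated with the KCRV. Example: key comparison CCEM-K4 (10 pF capacitance).
$\mathrm{var}(x_i - x_{\mathrm{KCRV}}) = \mathrm{var}(x_i) + \mathrm{var}(x_{\mathrm{KCRV}}) - 2\,\mathrm{cov}(x_i, x_{\mathrm{KCRV}})$ (10)
Since $x_i$ and $x_j$ are uncorrelated, $\mathrm{cov}(x_i, x_j) = 0$ if $i \neq j$. Then
$\mathrm{cov}(x_i, x_{\mathrm{KCRV}}) = \sum_j g_j\,\mathrm{cov}(x_i, x_j) = g_i s_i^2 = 1 / \sum_j (1/s_j^2) = \mathrm{var}(x_{\mathrm{KCRV}})$ (11)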

Variance of difference from weighted mean, contributor to weighted mean, mutually independent results:
$\mathrm{var}(x_i - x_{\mathrm{KCRV}}) = \mathrm{var}(x_i) - \mathrm{var}(x_{\mathrm{KCRV}}) = s_i^2 - \mathrm{var}(x_{\mathrm{KCRV}})$ (12)
• Discussion:
  • Simple, easy to remember.
  • One cannot directly generate the variance of the pairwise degrees of equivalence between two participants, $\mathrm{var}(x_i - x_j)$, from the variances of the degrees of equivalence with respect to the KCRV.
  • There may be some dispute if one participant's uncertainty is small enough to dominate the KCRV. In that case possible solutions are:
    • use that participant's value to define the KCRV;
    • set a "state of the art" uncertainty value defining the minimum acceptable uncertainty.
  • It is not always true that results from all participants in a key comparison contribute to the KCRV (e.g., CCEM-K4).
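The relation labelled (12) can be checked numerically. The sketch below, with hypothetical standard deviations and hypothetical function names, computes the variance of $x_i - x_{\mathrm{KCRV}}$ directly as the variance of a linear combination of independent results and compares it with the shortcut $s_i^2 - \mathrm{var}(x_{\mathrm{KCRV}})$. It also shows why the pairwise variance cannot be obtained by adding the two variances with respect to the KCRV.

```python
def var_diff_from_weighted_mean(i, s):
    """Direct variance of x_i - sum_j g_j x_j for mutually independent results."""
    inv = [1.0 / sj**2 for sj in s]
    total = sum(inv)
    g = [v / total for v in inv]
    # x_i - x_KCRV is the linear combination sum_j c_j x_j
    # with c_j = delta_ij - g_j.
    c = [(1.0 if j == i else 0.0) - g[j] for j in range(len(s))]
    return sum(cj**2 * sj**2 for cj, sj in zip(c, s))


# Hypothetical standard deviations (arbitrary units), all contributing to the KCRV.
s = [1.0, 2.0, 2.0]
var_kcrv = 1.0 / sum(1.0 / sj**2 for sj in s)

direct = [var_diff_from_weighted_mean(i, s) for i in range(len(s))]
shortcut = [sj**2 - var_kcrv for sj in s]
print(direct)
print(shortcut)  # agrees with the direct calculation

# Pairwise variance for the first two (independent) participants.
var_pair = s[0] ** 2 + s[1] ** 2
# Adding the two variances with respect to the KCRV gives a different number.
print(var_pair, direct[0] + direct[1])
```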

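KCRV is weighted mean with equal weights:
• For mutually independent results and assuming equal weights for all participants, the mean is $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, $s_i = s$ for all i, and (12) yields:
Equal weights, mutual independence:
$\mathrm{var}(x_i - \bar{x}) = s^2 - \frac{s^2}{n} = s^2\,\frac{n-1}{n}$ (13)
since for equal weights $\mathrm{var}(\bar{x}) = s^2/n$.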

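• KCRV is unweighted mean with unequal weights:
• It could possibly be decided to use an unweighted mean as the KCRV but to calculate its variance with unequal weights (i.e., unequal variances $s_i^2$).
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ (14)
In general, for independent x and y and constants a and b,
$\mathrm{var}(ax + by) = a^2\,\mathrm{var}(x) + b^2\,\mathrm{var}(y)$ (15)
and successive applications give:
$\mathrm{var}\left(\sum_{i=1}^{n} a_i x_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{var}(x_i)$ (16)
Unweighted mean, unequal weights, mutual independence:
$\mathrm{var}(\bar{x}) = \frac{1}{n^2} \sum_{i=1}^{n} s_i^2$ (17)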

A common error is to confuse the above expression with the familiar expression for the standard deviation of the mean of n independent, identically distributed observations for which $s_i = s$ for all i. In that case
$s(\bar{x}) = s / \sqrt{n}$ (18)
Expressed as standard deviations, for an unweighted mean with unequal weights (i.e., unequal standard deviations) and mutually independent results:
$s(\bar{x}) = \frac{1}{n} \sqrt{\sum_{i=1}^{n} s_i^2}$ (19)
Note that the factor in the denominator is $n$, not $n^{1/2}$!
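The difference between dividing by $n$ and by $n^{1/2}$ is easy to see numerically. The following sketch uses hypothetical standard deviations and a hypothetical function name; it evaluates the standard deviation of an unweighted mean of independent results with unequal standard deviations, shows the value obtained when $n^{1/2}$ is mistakenly used, and confirms that the correct expression reduces to $s/\sqrt{n}$ when all standard deviations are equal.

```python
import math


def sd_unweighted_mean(s):
    """Standard deviation of the unweighted mean of independent results:
    sqrt(sum of s_i**2) divided by n."""
    return math.sqrt(sum(si**2 for si in s)) / len(s)


# Hypothetical unequal standard deviations (arbitrary units).
s = [3.0, 4.0]
correct = sd_unweighted_mean(s)
# The common error: dividing the root-sum-square by sqrt(n) instead of n.
wrong = math.sqrt(sum(si**2 for si in s)) / math.sqrt(len(s))
print(correct, wrong)

# With equal standard deviations the correct expression gives s / sqrt(n).
s_equal = [2.0, 2.0, 2.0, 2.0]
print(sd_unweighted_mean(s_equal), 2.0 / math.sqrt(4))
```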

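Continuing with the case of an unweighted mean with unequal weights, if the result from participant i contributes to the KCRV, $\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j$, then
$\mathrm{var}(x_i - \bar{x}) = \mathrm{var}(x_i) + \mathrm{var}(\bar{x}) - 2\,\mathrm{cov}(x_i, \bar{x})$ (20)
The results from all participants are assumed to be mutually independent so that $\mathrm{cov}(x_i, x_j) = 0$ for $i \neq j$ and
$\mathrm{cov}(x_i, \bar{x}) = \frac{1}{n} \sum_{j=1}^{n} \mathrm{cov}(x_i, x_j) = \frac{s_i^2}{n}$ (21)

so that
$\mathrm{var}(x_i - \bar{x}) = s_i^2 + \mathrm{var}(\bar{x}) - \frac{2 s_i^2}{n}$ (22)
It was just shown that
$\mathrm{var}(\bar{x}) = \frac{1}{n^2} \sum_{j=1}^{n} s_j^2$ (17)
so that, finally,
Unweighted mean, unequal weights, mutual independence:
$\mathrm{var}(x_i - \bar{x}) = s_i^2 \left(1 - \frac{2}{n}\right) + \frac{1}{n^2} \sum_{j=1}^{n} s_j^2$ (23)
For example, for n = 3 and i = 1, this gives
$\mathrm{var}(x_1 - \bar{x}) = \frac{s_1^2}{3} + \frac{s_1^2 + s_2^2 + s_3^2}{9} = \frac{4 s_1^2 + s_2^2 + s_3^2}{9}$
Check: $x_1 - \bar{x} = \frac{2 x_1 - x_2 - x_3}{3}$, so that $\mathrm{var}(x_1 - \bar{x}) = \frac{4 s_1^2 + s_2^2 + s_3^2}{9}$.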
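The same kind of check can be done numerically for any n. The sketch below, using hypothetical standard deviations and hypothetical function names, compares the closed-form variance of $x_i - \bar{x}$ for an unweighted mean of independent results, $s_i^2 (1 - 2/n) + \frac{1}{n^2} \sum_j s_j^2$, with a direct evaluation of the variance of the linear combination $x_i - \frac{1}{n} \sum_j x_j$.

```python
def var_diff_closed_form(i, s):
    """Closed form: s_i**2 * (1 - 2/n) + (1/n**2) * sum_j s_j**2."""
    n = len(s)
    return s[i] ** 2 * (1.0 - 2.0 / n) + sum(sj**2 for sj in s) / n**2


def var_diff_direct(i, s):
    """Variance of x_i - (1/n) sum_j x_j, written as sum_j c_j**2 s_j**2."""
    n = len(s)
    c = [(1.0 if j == i else 0.0) - 1.0 / n for j in range(n)]
    return sum(cj**2 * sj**2 for cj, sj in zip(c, s))


# Hypothetical standard deviations for n = 3 (arbitrary units).
s = [1.0, 2.0, 3.0]
closed = [var_diff_closed_form(i, s) for i in range(len(s))]
direct = [var_diff_direct(i, s) for i in range(len(s))]
print(closed)
print(direct)  # agrees with the closed form for every participant
```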

Examples of treatment of correlated results:
• CCEM-K4 (10 pF): the KCRV is defined from the weighted mean of the participants having an independent link to a calculable capacitor. BIPM, NPL and BNM had links through the QHR combined with the CODATA value of RK-90 in terms of the ohm, derived from the link between the ohm and the farad.
• In calculating $\mathrm{var}(x_i - x_{\mathrm{KCRV}})$ for these participants, the uncertainty $u(R_{K-90})$ is included.
• When calculating $\mathrm{var}(x_i - x_j)$, the variance of the pairwise degree of equivalence for any pair of these three participants, the uncertainty $u(R_{K-90})$ is "removed from the uncertainty budgets" of both i and j, as is shown formally by
$\mathrm{var}(x_i - x_j) = u_i^2 + u_j^2 - 2 r\,u(R_{K-90})\,u(R_{K-90}) = u_i^2 + u_j^2 - 2 u^2(R_{K-90})$
where $u_i$ and $u_j$ are the combined standard uncertainties of the two results. Here the covariance is written in terms of the correlation coefficient, r (= 1), and the product of the standard deviations associated with the correlated terms.

Another example of correlated results:
• Correlations are common in EUROMET comparisons because some participants may have defined their reference standards via calibrations from a major NMI.
• Example: CCEM-K6a (ac/dc difference). Consider the uncertainty in the pairwise degree of equivalence between two such participants who list rather large type-B uncertainties, $u_i(\mathrm{cal})$ and $u_j(\mathrm{cal})$, associated with the calibration of their standards at the PTB.
• When calculating $\mathrm{var}(x_i - x_j)$, the variance of the pairwise degree of equivalence for any pair of these participants, the effect of the correlation may be treated as follows:
$\mathrm{var}(x_i - x_j) = u_i^2 + u_j^2 - 2 r\,u_i(\mathrm{cal})\,u_j(\mathrm{cal})$
where the covariance is again written in terms of a correlation coefficient, r = 1, and the product of the standard deviations associated with the correlated terms.
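The effect of a fully correlated component on the pairwise variance can be sketched as follows. The function name and the numbers are hypothetical and are not taken from CCEM-K4 or CCEM-K6a; each participant is assumed to have a combined standard uncertainty of 5 (arbitrary units), of which 3 comes from the shared, fully correlated term.

```python
def var_pairwise(u_i, u_j, u_i_corr, u_j_corr, r=1.0):
    """Variance of x_i - x_j when the components u_i_corr and u_j_corr
    of the two uncertainty budgets are correlated with coefficient r."""
    return u_i**2 + u_j**2 - 2.0 * r * u_i_corr * u_j_corr


# Hypothetical combined uncertainties and correlated components.
u_i, u_j = 5.0, 5.0
u_i_cal, u_j_cal = 3.0, 3.0

with_corr = var_pairwise(u_i, u_j, u_i_cal, u_j_cal)        # r = 1
without_corr = var_pairwise(u_i, u_j, u_i_cal, u_j_cal, r=0.0)
print(with_corr, without_corr)

# With r = 1 and equal correlated components, the result equals the sum of
# the variances that remain after removing the shared term from each budget.
remaining = (u_i**2 - u_i_cal**2) + (u_j**2 - u_j_cal**2)
print(remaining)
```

Ignoring the correlation in this hypothetical case would overstate the pairwise variance, which is the point at issue in the examples above.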

Example: CCEM-K6a (ac/dc difference), continued
Accounting for such correlations in the analysis of CCEM-K6a was controversial; some participants thought it excessively lowers the uncertainties of the pairwise degrees of equivalence. To resolve this issue the CCEM agreed to forgo listing the pairwise degrees of equivalence.

Conclusions and recommendations:
• We are on a "learning curve" in the statistical analysis of key comparisons; this slows down agreement on results for Appendix B, but we'll get better!
• It is important to consider correlations, particularly in EUROMET and other RMO comparisons.
• In general, one cannot generate a table of uncertainties in pairwise degrees of equivalence, var(xi - xj), from the column of uncertainties with respect to the KCRV, var(xi - xKCRV).
• Care should be used in applying statistical expressions, particularly the "standard deviation of the mean".
• In situations where uncertainty analysis seems to be intractable, consider the possibility of making simplifying assumptions, provided, of course, that they are stated in the report; the CCEM is flexible.
