
An Introduction to the Statistics of Uncertainty



Presentation Transcript


  1. An Introduction to the Statistics of Uncertainty • Tom LaBone • April 17, 2009 • SRCHPS Technical Seminar • MJW Corporation • University of South Carolina

  2. Introduction • All physical measurements must be reported with some quantitative measure of the quality of the measurement • needed to decide if the measurement is suitable for a particular purpose • The concept of “uncertainty” was developed in metrology to partially fill this need • The US Guide to the Expression of Uncertainty in Measurement, ANSI/NCSL Z540-2-1997 (R2007), provides guidance on calculating and reporting the uncertainty in a measurement • US version of the ISO guide • referred to as the “GUM”

  3. Overview • Illustrate the use of the GUM methodology using a relatively simple physical system • Combined Standard Uncertainty • Type A uncertainty only • Probability Distributions • Expanded Uncertainty • Monte Carlo Methods • Type B Uncertainty • Student’s t Distribution

  4. Example • The SRS Health Physics Instrument Calibration Laboratory “sells” its radiation fields as a product • The uncertainty attached to a radiation field helps the customer decide if the “product” is suitable for their application • This is a rather involved case for such a short talk, so let us work with a less complex example

  5. What is the density of the cube? • Measure the height, width, length, and mass of the cube (single measurements) • Calculate the density ρ using the formula ρ = M / (L × W × H), where M is the mass and L, W, and H are the length, width, and height

  6. Measurand • The measurands we directly measure (mass and dimensions) are called input quantities • The measurand we calculate (the density) is called the output quantity • In this discussion the input quantities are assumed to be uncorrelated • e.g., the measurement of the height does not influence the measurement of the length

  7. Variability • If we repeated the measurements, would we expect to see exactly the same result? • Our measurements of dimension and mass will exhibit variability • if we measure the “same thing” repeatedly we are likely to get a range of answers that vary in a seemingly random fashion

  8. Why Do Measurements Vary? • Every measurement is influenced by a multitude of quantities that are not under our control and of which we may not even be aware (influence quantities) • Random effects • Measurements also vary because the measurand is not and cannot be specified in infinite detail • For example, I did not specify how the linear measurements of the cube should be made

  9. Errors • Using the input and output quantities we have defined the “true” value of the density • The error in a measurement is defined as: error = measured value of density − “true” value of density • The true value and hence the error are unknowable, but errors can be classified by how they influence the measurement • random and systematic errors

  10. Types of Errors • Random errors result from random effects in the measurement • the magnitude and sign of a random error change from measurement to measurement • measurements cannot be corrected for random errors • …but random errors can be quantified and reduced • Systematic errors result from systematic effects in the measurement • the magnitude and sign of a systematic error are constant from measurement to measurement • measurements can be corrected for known systematic errors • …but the correction introduces additional random errors

  11. What can we do about random errors? • Law of Large Numbers • If you repeat measurements many times and take the mean, this sample mean is a good estimator of the true population mean and is taken to be the best estimate of the thing we defined as the measurand • Plug the sample means into the equation to obtain the best estimate of ρ
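
The Law of Large Numbers can be illustrated with a short simulation. This is only a sketch: the "true" length `true_L` and the measurement scatter `sigma_L` are hypothetical values chosen for illustration, not data from the talk.

```r
set.seed(42)                       # reproducible random draws

true_L  <- 1.500                   # hypothetical "true" length, cm
sigma_L <- 0.003                   # hypothetical scatter of a single measurement, cm

# Sample means for increasingly large numbers of repeated measurements
n     <- c(5, 50, 500, 50000)
means <- sapply(n, function(k) mean(rnorm(k, mean = true_L, sd = sigma_L)))

# The absolute error of the sample mean tends to shrink as n grows
print(data.frame(n = n, sample_mean = means, abs_error = abs(means - true_L)))
```

With real measurements the true value is unknown, so the error column cannot be computed; it is available here only because the simulation fixes the true value in advance.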

  12. Repeated Measurements • Sample mean (central tendency)

  13. Precision of Result • Precision is the number of digits with which a value is expressed • The calculations here were performed to the internal precision of the computer (~16 digits) • The density is arbitrarily presented with 9 digits of precision • In which digit do we lose physical significance?

  14. Uncertainty • “…parameter associated with the result of a measurement that characterizes the dispersion of the values that could reasonably be attributed to the measurand….” • an interval that we are reasonably confident contains the true value of the measurand • the terms “random” and “systematic” are used with the term “error” but not with the term “uncertainty” • associated with the measurement, not the measurement process

  15. Evaluation of Uncertainty • Type A evaluation of uncertainty • evaluation of uncertainty by the statistical analysis of repeated measurements • called Type A uncertainty • Type B evaluation of uncertainty • evaluation of uncertainty by any other method • called Type B uncertainty

  16. Repeated Measurements • Sample mean (central tendency) • Sample standard deviation (dispersion)
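
For reference, the two sample statistics named on this slide have the standard definitions below, written for N repeated measurements x_1, …, x_N of one input quantity. The notation is the usual textbook one and is assumed here, since the slide's own formulas did not survive in the transcript.

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i ,
\qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{2}}
```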

  17. Standard Uncertainty of Inputs • The sample standard deviation is a term in statistics with a precise meaning • In metrology the analogous term is standard uncertainty, u • For Type A evaluations the standard deviation is the standard uncertainty • This may not be true for Type B evaluations

  18. Significant Digits • Report uncertainty to 2 digits • round to even number if the last digit is 5 • Round the measurement to agree with the reported uncertainty
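
A minimal sketch of this rounding rule in R. The unrounded values `rho_raw` and `u_raw` are hypothetical; they are chosen only to show the mechanics.

```r
rho_raw <- 1.466627431   # hypothetical unrounded density, g/cm^3
u_raw   <- 0.000882913   # hypothetical unrounded uncertainty, g/cm^3

# Report the uncertainty to 2 significant digits
u_rep <- signif(u_raw, 2)

# Decimal place of the last digit kept in the uncertainty
digits <- 1 - floor(log10(u_rep))

# Round the measurement to the same decimal place
rho_rep <- round(rho_raw, digits)

cat(format(rho_rep, nsmall = digits), "+/-",
    format(u_rep, scientific = TRUE), "g/cm^3\n")
```

The measurement is rounded after the uncertainty, so the two always end at the same decimal place.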

  19. Uncertainty in Density • We have calculated the standard uncertainty in the input quantities (length, mass, etc) • How do we get the standard uncertainty in the output quantity (density)? • the combined standard uncertainty • Propagation of uncertainty

  20. Combined Standard Uncertainty • Sensitivity coefficient (often abbreviated as “c”): given a small change in the length of the cube, how much does the density change? • Units must match up properly!
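
The slide's equation is missing from the transcript. For uncorrelated input quantities, the GUM law of propagation of uncertainty takes the form below. It is applied here to the density ρ = M / (L W H); the sensitivity coefficients c_i are written out as an illustration and are not copied from the slide.

```latex
u_c^{2}(\rho) = \sum_{i} c_i^{2}\, u^{2}(x_i),
\qquad
c_i = \frac{\partial \rho}{\partial x_i}

c_M = \frac{1}{L W H} = \frac{\rho}{M}, \qquad
c_L = -\frac{\rho}{L}, \qquad
c_W = -\frac{\rho}{W}, \qquad
c_H = -\frac{\rho}{H}
```

Each product c_i u(x_i) has the units of density. For example, c_L is in g/cm⁴ and u(L) is in cm, which is the unit check the slide asks for.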

  21. Standard Deviation of the Mean • The standard deviation of the mean describes how repeated estimates of the mean are scattered around their grand mean (mean of the means) • The sample standard deviation describes how individual measurements are scattered around their mean

  22. Which Standard Deviation Should We Use? • Sample standard deviation • If you want to describe how individual measurements are scattered about their mean • Standard deviation of the mean • If you want to describe how multiple estimates of the mean are scattered about their grand mean • also called the standard error of the mean • We need to use the standard deviation of the mean in the propagation of uncertainty
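
The two quantities are related by the standard formula below, where s is the sample standard deviation of N individual measurements. The symbol s(x̄) for the standard deviation of the mean is notation assumed here.

```latex
s(\bar{x}) = \frac{s}{\sqrt{N}}
```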

  23. Combined Standard Uncertainty (Type A uncertainty only) • ρ = 1.46663 g/cm³ with a combined standard uncertainty u_c = 8.8 × 10⁻⁴ g/cm³
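
The Type A calculation can be sketched in R as follows. The repeated measurements in `obs` are hypothetical, because the talk's data are not in the transcript, so the result will not match the slide's numbers. The sketch assumes uncorrelated inputs, as stated earlier in the talk.

```r
# Hypothetical repeated measurements (not the talk's data)
obs <- list(
  M = c(2.7501, 2.7499, 2.7500, 2.7502, 2.7498),  # mass, g
  L = c(1.502, 1.498, 1.500, 1.504, 1.496),       # length, cm
  W = c(1.251, 1.249, 1.250, 1.252, 1.248),       # width, cm
  H = c(1.000, 1.002, 0.998, 1.001, 0.999)        # height, cm
)

est <- sapply(obs, mean)                                 # best estimates (sample means)
u   <- sapply(obs, function(x) sd(x) / sqrt(length(x)))  # standard deviations of the means

# Best estimate of the density, g/cm^3
rho <- unname(est["M"] / (est["L"] * est["W"] * est["H"]))

# Sensitivity coefficients: partial derivatives of rho with respect to each input
cc <- c(M =  rho / est[["M"]],
        L = -rho / est[["L"]],
        W = -rho / est[["W"]],
        H = -rho / est[["H"]])

# Combined standard uncertainty: root sum of squares of c_i * u(x_i)
u_c <- sqrt(sum((cc * u[names(cc)])^2))

print(c(rho = rho, u_c = u_c))
```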

  24. Where We Are • We have calculated the density and its combined standard uncertainty (Type A uncertainty only) • Next, we want to • calculate the expanded uncertainty and • address the Type B uncertainty • But, we need to discuss probability distributions and other such things first

  25. Probability Distributions • Up to this point we have described our data with • the mean (central tendency) • the standard deviation (dispersion) • The mean and standard deviation do not uniquely specify the data • Use a mathematical model that defines the probability of observing any given result • probability density function (pdf)

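  26. Uniform (Rectangular) PDF • f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise • [Figure: rectangular pdf with a = 1, b = 9, and μ = 5, with μ − σ and μ + σ marked]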

  27. Rectangular PDF Notation • f(x) is the rectangular probability density function • the value of the pdf is not the probability • the area under the pdf is probability • note that f(x) has units – probability has no units • μ is the population mean • σ is the population standard deviation • a is the lower bound of the distribution • a is a parameter in the pdf • the probability of observing a value of x less than a is zero • b is the upper bound of the distribution • b is a parameter in the pdf • the probability of observing a value of x greater than b is zero

  28. Probability: the area under the curve • P(x < μ − 1σ) = 0.2113249 • P(x < μ + 1σ) = 0.7886751
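
These areas can be checked with R's built-in uniform distribution function `punif`. The bounds a = 1 and b = 9 come from the slides. The expressions for the mean and standard deviation of a rectangular distribution are the standard ones; they are not quoted from the talk.

```r
a <- 1; b <- 9                        # bounds of the rectangular pdf

mu    <- (a + b) / 2                  # population mean
sigma <- (b - a) / sqrt(12)           # population standard deviation

# Cumulative probabilities: area under the pdf to the left of each point
p_lo <- punif(mu - sigma, min = a, max = b)
p_hi <- punif(mu + sigma, min = a, max = b)

print(c(mu = mu, sigma = sigma, p_lo = p_lo, p_hi = p_hi))
```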

  29. Normal PDF • [Figure: normal pdf with μ = 5, with μ − σ and μ + σ marked] • The population parameters are the parameters in the pdf – this is unusual

  30. Probability: the area under the pdf curve • P(x < μ − 1σ) = 0.1586553 • P(x < μ + 1σ) = 0.8413447

  31. Normal vs Rectangular (same mean and standard deviation) • Rectangular: P(x < μ + 1σ) = 0.7886751 • Normal: P(x < μ + 1σ) = 0.8413447
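
The comparison can be reproduced with `pnorm` and `punif`. The sketch uses μ = 5 and the standard deviation of the rectangular pdf on [1, 9]. Recovering the bounds as μ ± σ√3 is a standard identity, not something taken from the slide.

```r
mu    <- 5
sigma <- 8 / sqrt(12)                 # standard deviation of the rectangular pdf on [1, 9]

# Bounds of the rectangular pdf that has this mean and standard deviation
a <- mu - sigma * sqrt(3)
b <- mu + sigma * sqrt(3)

p_norm <- pnorm(mu + sigma, mean = mu, sd = sigma)   # normal
p_rect <- punif(mu + sigma, min = a, max = b)        # rectangular

print(c(normal = p_norm, rectangular = p_rect))
```

The two distributions share a mean and standard deviation, yet they assign different probabilities to the same interval. This is why the mean and standard deviation alone do not specify the data.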

  32. Sample Statistics and Population Parameters • No matter what the probability distribution is, the sample mean and standard deviation (sample statistics) are the best estimates, based on the observed data, of the population mean and standard deviation (population parameters)

  33. Random Numbers • [Figure: 1000 numbers drawn at random from the rectangular distribution] • [Figure: 1000 numbers drawn at random from the normal distribution]

  34. Uses of PDFs • We use the rectangular pdf to describe a random variable that is bounded on both sides and has equal probability of appearing anywhere between the bounds • The normal distribution has a special place in statistics because of the Central Limit Theorem

  35. Central Limit Theorem • As the sample size N gets “large”, the mean of a sample will be normally distributed regardless of how the individual values are distributed • Theorem provides no guidance on what “large” is • The standard deviation of the mean (aka the standard error of the mean) is equal to σ/√N, where σ is the standard deviation of the individual values
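
A small simulation illustrates the theorem using the rectangular pdf on [1, 9] from the earlier slides. The sample size `N` and the number of replicates are arbitrary choices for this sketch.

```r
set.seed(1)                            # reproducible random draws

N     <- 30                            # size of each sample (arbitrary choice)
sigma <- 8 / sqrt(12)                  # standard deviation of the rectangular pdf on [1, 9]

# 10,000 sample means, each from N rectangular random numbers
means <- replicate(10000, mean(runif(N, min = 1, max = 9)))

grand_mean <- mean(means)              # should be close to 5
sd_of_mean <- sd(means)                # should be close to sigma / sqrt(N)

print(c(grand_mean = grand_mean, sd_of_mean = sd_of_mean,
        predicted = sigma / sqrt(N)))
```

Calling `hist(means)` afterwards should show a roughly bell-shaped histogram even though the individual values are uniformly distributed.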

  36. So What? • No matter what probability distribution you start with, if the sample is large enough the means of data drawn from that distribution are normally distributed • What are the practical implications of this? • All the input quantities (length, etc) are means • The input quantities are normally distributed • The output quantity (density) is normally distributed

  37. Normal Probabilities • The area under the normal curve between μ − 1σ and μ + 1σ = 0.6826895 • The area under the normal curve between μ − 1.96σ and μ + 1.96σ = 0.95
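
These areas follow from the standard normal cumulative distribution function, available in R as `pnorm`. The inverse function `qnorm` recovers the 1.96 multiplier.

```r
# Area under the standard normal curve within +/- 1 and +/- 1.96 standard deviations
p_1sd   <- pnorm(1)    - pnorm(-1)
p_196sd <- pnorm(1.96) - pnorm(-1.96)

# Multiplier that leaves 2.5% in each tail (the 0.975 quantile)
k_95 <- qnorm(0.975)

print(c(p_1sd = p_1sd, p_196sd = p_196sd, k_95 = k_95))
```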

  38. Expanded Uncertainty • It is often desirable to express the uncertainty as an interval around the measurement result that contains a large fraction of results that might reasonably be observed • This is accomplished by using multiples of the standard uncertainty • the multiplier is called the coverage factor

  39. Intervals • Confidence interval • interval constructed with standard deviations from known probability distributions • the interval has an exact probability of covering the mean value of the measurand • Coverage interval • interval constructed with uncertainties • the interval does not have an exact probability of covering the mean value of the measurand • only an approximation • uses a coverage factor rather than a standard normal quantile (e.g., the 1.96) • coverage factor of 2 (~95%) or 3 (~99%) is typically used
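
As a sketch, the expanded uncertainty U is the coverage factor k times the combined standard uncertainty. The inputs below are the rounded Type A values quoted on slide 23, so the interval may differ slightly from one computed with unrounded values.

```r
rho <- 1.46663     # density from slide 23, g/cm^3
u_c <- 8.8e-4      # combined standard uncertainty from slide 23, g/cm^3

k <- 2             # coverage factor for roughly 95% coverage
U <- k * u_c       # expanded uncertainty, g/cm^3

coverage_interval <- c(lower = rho - U, upper = rho + U)
print(U)
print(coverage_interval)
```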

  40. Expanded Uncertainty for Density (expanded uncertainty, Type A uncertainty only) • 68% Confidence Interval • 95% Confidence Interval • 95% Coverage Interval

  41. Monte Carlo Methods • Evaluation of measurement data – Supplement 1 to the “Guide to the expression of uncertainty in measurement” – Propagation of distributions using a Monte Carlo method OIML G 1-101 (2008) • Run a statistics experiment using random numbers

  42. Calculate the sample mean and standard deviation of the 10^6 densities • 95% empirical CI

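  43. Implementation in R
      # M, L, W, H (means) and s.M, s.L, s.W, s.H (standard deviations) must already be defined
      d <- numeric(10^6)                     # preallocate the vector of densities
      for (i in 1:(10^6)) {
        # draw a random mass, length, width, and height, then calculate a density
        d[i] <- rnorm(1, M, s.M) / (rnorm(1, L, s.L) * rnorm(1, W, s.W) * rnorm(1, H, s.H))
      }
      quantile(d, probs = c(0.025, 0.975))   # empirical 95% confidence interval
      mean(d)                                # mean of the simulated densities
      sd(d)                                  # standard deviation of the simulated densities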
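
A vectorized variant of the same experiment is sketched below. It is a swapped-in formulation, not the author's code. The means and standard uncertainties are hypothetical placeholders, since the talk's values are not in the transcript, and the number of draws is reduced to 10^5 to keep the run short.

```r
set.seed(2009)                         # reproducible random draws
n <- 1e5                               # number of simulated experiments

# Hypothetical means and standard uncertainties (not the talk's values)
M <- 2.7500; s.M <- 7.1e-5             # mass, g
L <- 1.500;  s.L <- 1.4e-3             # length, cm
W <- 1.250;  s.W <- 7.1e-4             # width, cm
H <- 1.000;  s.H <- 7.1e-4             # height, cm

# Draw all inputs at once and compute n densities in one step
d <- rnorm(n, M, s.M) / (rnorm(n, L, s.L) * rnorm(n, W, s.W) * rnorm(n, H, s.H))

ci_95 <- quantile(d, probs = c(0.025, 0.975))   # empirical 95% interval
print(ci_95)
print(c(mean = mean(d), sd = sd(d)))
```

Passing `n` to `rnorm` avoids the explicit loop and the need to preallocate `d`. The statistical experiment is the same.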

  44. Advantages of Monte Carlo • Intuitive • Set up the experiment in the computer just like it occurs in the lab • Able to handle • very complex problems • asymmetric probability distributions • No need to mess with the t distribution or effective degrees of freedom • you will see what I am talking about shortly

  45. Type B Uncertainty Assessment • Calipers • Used to measure length, height, and width • “Accuracy” of ± 0.02 mm (± 0.002 cm) for measurements <100 mm • Scale • Used to measure “mass” • “Accuracy” of ± 0.0001 gram • What do they mean by “accuracy” and how do I use this information?

  46. Calipers • The “accuracy” of ± 0.002 cm is taken to mean that if I moved the calipers from 1.500 cm to 1.502 cm the reading could be anywhere from 1.500 cm to 1.504 cm • Assume rectangular distribution with an upper limit of X + 0.002 cm and a lower limit of X − 0.002 cm • The standard uncertainty of this distribution is the half-width divided by √3: 0.002 cm / √3 ≈ 0.0012 cm

  47. Scale • The “accuracy” of ± 0.0001 gram is taken to mean that if the weight increased from 5.0000 grams to 5.0001 grams the reading could be anywhere from 5.0000 grams to 5.0002 grams • Assume rectangular distribution with an upper limit of X + 0.0001 grams and a lower limit of X − 0.0001 grams • The standard uncertainty of this distribution is the half-width divided by √3: 0.0001 g / √3 ≈ 0.000058 g
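
For a rectangular distribution with bounds X ± h, the standard deviation is (b − a)/√12 = h/√3. A minimal sketch follows. The helper name `u_rect` is hypothetical and introduced only for illustration.

```r
# Standard uncertainty of a rectangular distribution with the given half-width
# (u_rect is a hypothetical helper name, not from the talk)
u_rect <- function(half_width) half_width / sqrt(3)

u_B_caliper <- u_rect(0.002)    # cm, from the calipers' "accuracy" of +/- 0.002 cm
u_B_scale   <- u_rect(0.0001)   # g,  from the scale's "accuracy" of +/- 0.0001 g

print(c(caliper_cm = u_B_caliper, scale_g = u_B_scale))
```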

  48. Standard Uncertainty for Length • Combine the Type A and Type B uncertainties in quadrature (i.e., add the variances) • The notation u_c(L) is used here to indicate that the uncertainty includes both Type A and Type B uncertainties
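
Written out, adding the variances gives the expression below. The subscripts A and B on the two component uncertainties are labels assumed here for clarity; the slide's own equation is not in the transcript.

```latex
u_c(L) = \sqrt{u_A^{2}(L) + u_B^{2}(L)}
```

The same combination applies to the width, height, and mass before they enter the propagation formula for the density.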

  49. Combined Standard Uncertainty for Density (Type A and B uncertainty) • ρ = 1.4666 g/cm³ with a combined standard uncertainty u_c = 2.1 × 10⁻³ g/cm³
