Today’s lesson • Case study involving a quality control (QC) application using the material that we have covered to date. • In class demonstration of how to use SPSS to get the numbers out. • Discussion of meaning of results.
Today’s lesson • Confidence interval for the mean of a normal distribution using standard normal and t-distribution. • On Thursday, we finish Chapter Eleven with one sample t and z tests and a review (yet again) of the structure of tests of statistical hypotheses.
Case Study • QC application in which the objective is to specify a probability model for the “manufacturing” process. • That is, specify the A part in an A vs. B comparison. • See Finch et al., Statistics in Medicine, 1999, pages 1279-1289.
Context • Application is the filtering of red blood cells to remove white blood cells. • Your own white blood cells in your blood are good. • My white blood cells in the unit of blood that I donated to be transfused to you are bad for you. • Transfusion reaction • Possible infection vector
Practical Context • Health regulations are that a unit of filtered blood cannot contain more than so many “residual white blood cells” (RWBC). • US standard is 5×10^6 RWBC in a unit. • European standard is 1×10^6 RWBC in a unit.
Practical Context • Measured RWBC is product of three numbers. • Constant scaling factor • volume of blood donated in the unit • Nageotte count, which is the number of white blood cells observed in a small volume of sampled filtered blood.
First Question • What should be the dependent variable monitored in the QC application? • Answer is to use the Nageotte count rather than the RWBC. • American standard is then a Nageotte count of 167. • European standard is then a Nageotte count of 33.
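The conversion from an RWBC limit to a Nageotte-count limit can be sketched as below. The slides do not state the scaling factor or the unit volume, so the nominal value of 30,000 RWBC per counted cell is an assumption, inferred only from the two thresholds quoted above (5×10^6 → 167 and 1×10^6 → 33).

```python
# ASSUMPTION: (scaling factor x unit volume) is treated as one fixed nominal value.
# 30_000 is inferred from the slide's thresholds; it is not given in the source.
NOMINAL_RWBC_PER_COUNT = 30_000


def nageotte_limit(rwbc_limit, rwbc_per_count=NOMINAL_RWBC_PER_COUNT):
    """Translate a regulatory RWBC limit into a Nageotte-count limit."""
    return round(rwbc_limit / rwbc_per_count)


us_limit = nageotte_limit(5e6)  # US standard
eu_limit = nageotte_limit(1e6)  # European standard
print("US Nageotte limit:", us_limit)
print("European Nageotte limit:", eu_limit)
```

Fixing the volume at a nominal value is what lets the count itself be monitored instead of the RWBC product.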
Components of Variance • Here used Fisher’s fundamental idea of finding “components of variance.” • Specifically, variance of RWBC has a non-quality related component of variance from the variation of the volume of blood in the donated unit.
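One way to make the decomposition concrete, as a sketch only: write RWBC as c·V·N, with c the constant scaling factor, V the donated volume, and N the Nageotte count. Assuming V and N are independent (an assumption that the slides do not state), the variance of the product splits as follows.

```latex
\operatorname{Var}(\mathrm{RWBC})
  = c^{2}\left[
      \underbrace{E(V)^{2}\,\operatorname{Var}(N)}_{\text{filtration quality}}
      + \underbrace{E(N)^{2}\,\operatorname{Var}(V)
      + \operatorname{Var}(V)\,\operatorname{Var}(N)}_{\text{due to variation in donated volume}}
    \right]
```

Only the first term reflects how well the filter worked, which is the argument for monitoring N directly.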
Second Question • What actually happened in the QC process when the manufacturer’s staff did the work? • That is, use the descriptive options in a statistical package to describe the data.
Getting numbers out • Enter SPSS • Access correct file (CAREFUL, CAREFUL, CAREFUL!!!) • Statistics menu • Descriptive submenu • Frequencies option
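For readers without SPSS, a rough Python analogue of the Frequencies output is sketched here. The data values are hypothetical, invented for illustration; they are not the case-study data.

```python
from collections import Counter
import statistics

# HYPOTHETICAL Nageotte counts, invented for illustration only.
counts = [0, 0, 1, 1, 1, 2, 3, 5, 8, 40]

n = len(counts)
freq = Counter(counts)

# Frequency table rows: (value, frequency, percent, cumulative percent)
table = []
cumulative = 0
for value in sorted(freq):
    cumulative += freq[value]
    table.append((value, freq[value], 100 * freq[value] / n, 100 * cumulative / n))

summary = {
    "n": n,
    "mean": statistics.mean(counts),
    "median": statistics.median(counts),
    "variance": statistics.variance(counts),  # unbiased (n - 1) estimate
    "min": min(counts),
    "max": max(counts),
}

for row in table:
    print("value=%d  freq=%d  pct=%.1f  cum_pct=%.1f" % row)
print(summary)
```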
Third Question • Then, specify a probabilistic model that fits the data reasonably well so that predictions can be made. • Nageotte count variable is a ratio scale of measurement. • Nageotte count is discrete, not continuous. • Variance is very much larger than the mean. • Hence focused on the negative binomial distribution (NBD).
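A minimal sketch of how a negative binomial model could be fitted when the variance exceeds the mean. It uses the method of moments, which is a choice made here for simplicity; the slides do not say which fitting method was used. The data are hypothetical.

```python
import math
import statistics

# HYPOTHETICAL Nageotte counts, invented for illustration only.
counts = [0, 0, 1, 1, 1, 2, 3, 5, 8, 40]


def fit_nbd_moments(data):
    """Method-of-moments fit: mean = r(1-p)/p, variance = r(1-p)/p**2."""
    m = statistics.mean(data)
    v = statistics.variance(data)
    if v <= m:
        raise ValueError("variance must exceed the mean for an NBD fit")
    p = m / v
    r = m * m / (v - m)
    return r, p


def nbd_pmf(k, r, p):
    """P(X = k) for a negative binomial with real-valued r, via log-gamma."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


r, p = fit_nbd_moments(counts)
print("r =", round(r, 4), " p =", round(p, 4))
print("P(count > 33) =", 1 - sum(nbd_pmf(k, r, p) for k in range(34)))
```

The last line illustrates the kind of prediction the model is meant to support: the chance that a unit exceeds a given Nageotte limit.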
Fourth Question • ASS-U-ME a negative binomial distribution. • How well does it fit the data observed? • Use a goodness of fit test (we won’t cover this in detail until after your first exam).
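As a preview, one common goodness-of-fit measure is Pearson's chi-square statistic, sketched below. The slides do not say which goodness-of-fit test was used, and the observed and expected bin counts here are hypothetical.

```python
def pearson_chi_square(observed, expected):
    """Sum over bins of (observed - expected)**2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL bin counts, invented for illustration only.
observed = [10, 20, 30]
expected = [20, 20, 20]

chi_sq = pearson_chi_square(observed, expected)
print("chi-square statistic:", chi_sq)
```

A large value relative to the chi-square reference distribution signals poor fit. The degrees of freedom are the number of bins minus one, minus the number of parameters estimated from the data.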
Interpretation • One observation violated the American rule, and two violated the European rule. • NBD model does not predict the maximum observed value of 205. • NBD model does not fit well but captures the rough order of variation (up to a Nageotte count of 33). • Choose a nonparametric test procedure because the null distribution is not obvious.
My Most Common Three Mistakes in Making Predictions • Eliminate “outliers” from the historical data that I am using to make my prediction. • Predict a ratio scale variable without anticipating that the variance of the variable will increase when the mean increases. • ASS-U-ME independence of observations when predicting a time series with autocorrelation.
Fifth Question • How big a sample is necessary to determine whether a user of this product is “in control”? • Simulation study suggested that under optimistic conditions 20 is a minimal sample size but that 80 may be required. • Client’s practice has evolved to use about 50.
Chapter Eleven: Testing a Hypothesis about a Single Mean • Definition of Student’s t Distribution • Using tables of Student’s t distribution. • Using the observed significance level from Student’s t. • Using the confidence limit from Student’s t.
Historical Background of Student’s t • Origin is quality control in the brewing industry (Guinness). • How can statistical procedures be applied with very small samples? • Nature of the advance is to describe the null distribution of the statistic that is actually used.
Definition of the Student’s t distribution • The pdf is also a bell-shaped curve • Continuous distribution, unimodal, symmetric, less rapid fall-off of probability for values far from mean. • Appendix C table (546-547) gives two sided tail probability by degrees of freedom. • Most statistics texts give percentile points.
Basic numeric facts of Student’s t percentiles • Student t 95-th percentiles are larger than the 95-th standard normal percentile. • Same holds for any percentile greater than 50. • The difference becomes larger as the “degrees of freedom” is smaller.
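These facts can be checked numerically. In this sketch the Student's t values are typed in from standard published tables (the Python standard library has no t distribution), while the normal percentile is computed.

```python
from statistics import NormalDist

# 95th percentile of the standard normal, about 1.645.
z95 = NormalDist().inv_cdf(0.95)

# 95th percentiles of Student's t, copied from standard tables (not computed here).
t95_by_df = {1: 6.314, 3: 2.353, 10: 1.812, 30: 1.697}

for df, t95 in t95_by_df.items():
    print("df=%2d  t95=%.3f  excess over normal=%.3f" % (df, t95, t95 - z95))
```

The excess over the normal percentile shrinks as the degrees of freedom grow.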
Example One-Sample Z test Problem • Test the null hypothesis H0: E(Y)=500 with level of significance 0.01 against the alternative hypothesis H1: E(Y)<500. ASS-U-ME Y is normal with known standard deviation 100 using the sample mean of a random sample of four observations. This statistic has value 360.
Solution • Determine the side of the test, here left-sided. • Determine the standard error of the statistic, here 100/4^0.5 = 50. • Determine the critical value of the test statistic. • In original form, 500 - 2.326(50) = 383.7. • In standard unit form, -2.326.
Solution Continued • Compare the statistic to the critical value: • In original units, the observed mean of 360 is to the left of the critical value of 383.7. • In standard-score form, the z value of the mean is (statistic - hypothesized expected value)/(se of statistic) = (360 - 500)/50 = -2.8, to the left of the critical value -2.326. • Make decision: reject H0 at the 0.01 level of significance.
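The one-sample z test just worked by hand can be reproduced with a short sketch. The variable names are illustrative; the numbers come from the problem statement.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 500      # hypothesized mean under H0
sigma = 100    # known standard deviation
n = 4          # sample size
xbar = 360     # observed sample mean
alpha = 0.01   # level of significance, left-sided test

se = sigma / sqrt(n)                    # standard error of the mean
z_crit = NormalDist().inv_cdf(alpha)    # left-tail critical value, about -2.326
crit_mean = mu0 + z_crit * se           # critical value in original units
z = (xbar - mu0) / se                   # observed z value
p_value = NormalDist().cdf(z)           # observed significance level
reject = z < z_crit

print("se =", se)
print("critical mean = %.1f, critical z = %.3f" % (crit_mean, z_crit))
print("z = %.1f, p-value = %.4f" % (z, p_value))
print("reject H0" if reject else "accept H0")
```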
Example One-Sample T test Problem • Test the null hypothesis H0: E(Y)=500 with level of significance 0.01 against the alternative hypothesis H1: E(Y)<500. ASS-U-ME Y is normal with unknown standard deviation. The mean of a random sample of four observations has value 360, and the unbiased estimate of the variance is 6400. Note that the corresponding estimate of the standard deviation is 80.
Solution • Determine the side of the test, here left-sided. • Determine the estimated standard error of the statistic, here 80/4^0.5 = 40. • Student’s contribution: determine the degrees of freedom. For a one-sample t-test, it is the number of observations minus one, here 4 - 1 = 3.
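Solution Continued • Determine the critical value of the test statistic. Don’t forget Student’s stretch of the critical value. • In original form, 500 - t_{3, 2.326}(40) = 500 - 4.541(40) = 318.36, where t_{3, 2.326} = 4.541 is the normal critical value 2.326 stretched for 3 degrees of freedom. • In standard unit form, -4.541.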
Solution Continued • Compare the statistic to the critical value: • In original units, the observed mean of 360 is to the right of the critical value of 318.36. • In standard-score form, the t value of the mean is (statistic - hypothesized expected value)/(se of statistic) = (360 - 500)/40 = -3.5, to the right of the critical value -4.541. • Make decision: accept H0 at the 0.01 level of significance.
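The one-sample t test can be sketched the same way. The critical value 4.541 is the table value quoted in the solution (one-sided 0.01, 3 degrees of freedom); it is typed in rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

mu0 = 500        # hypothesized mean under H0
n = 4            # sample size
xbar = 360       # observed sample mean
s2 = 6400        # unbiased estimate of the variance

s = sqrt(s2)                 # estimated standard deviation, 80
se = s / sqrt(n)             # estimated standard error of the mean
df = n - 1                   # degrees of freedom
t_crit = -4.541              # table value: left-sided 0.01 level, 3 df
crit_mean = mu0 + t_crit * se
t = (xbar - mu0) / se        # observed t value
reject = t < t_crit

print("se =", se, " df =", df)
print("critical mean = %.2f, critical t = %.3f" % (crit_mean, t_crit))
print("t = %.1f" % t)
print("reject H0" if reject else "accept H0 (do not reject)")
```

With the same sample mean, the conclusion flips relative to the z test because three degrees of freedom stretch the critical value from 2.326 to 4.541.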
Major points covered • Review of material using a case study that applies descriptive statistics. • Introduction (review) of Student’s t. • The one sample standard normal test. • The one sample Student’s t test.
To come • Finish Chapter 11 with one sample confidence intervals. • Begin Chapter 12, the paired t-test.