
Design of Statistical Investigations

Design of Statistical Investigations. Introduction to Sampling. Stephen Senn. Representative Inference. So far in the course we have been interested in comparisons with some sort of causal investigation. We now look at the case where we are interested in collecting representative material.


Presentation Transcript


  1. Design of Statistical Investigations Introduction to Sampling Stephen Senn SJS SDI_15

  2. Representative Inference • So far in the course we have been interested in comparisons • with some sort of causal investigation • We now look at the case where we are interested in collecting representative material • samples to describe populations • First we consider some possible applications

  3. Applications of Sampling Methods • Quality control of manufacturing processes • Financial audit • Opinion polls • Clinical audit • Anthropology • Social surveys • Ecological surveys • capture/recapture

  4. An Important Practical Distinction All of these application areas require sampling theory and careful consideration as to how samples are drawn. However, some of them have a further difficulty, which is that the opinions of human beings have to be ascertained. In what follows we shall often take opinion polls/social surveys as typical examples of sampling problems. (But our first example is not of this sort.) This will enable us to discuss also the further problems that arise in these contexts. However, first we shall review some very elementary statistical concepts.

  5. Standard Deviation/Standard Error • There is common confusion between standard deviation and standard error • The standard deviation describes the spread of original values • The standard error is a measure of reliability of some statistic based on the original values

  6. An Illustration of This Difference • This will now be illustrated using a simple example • This example is again a medical one • My apologies! • I need a large data set • This one will have to do

  7. Example Surv_2 • Cross-over trial in asthma • 790 baseline FEV1 readings • Since baselines are unaffected by treatment • Regard as homogeneous sample • Ignore fact that they are repeated measures • The following slide shows distribution of readings

  8. [Figure: distribution of the 790 baseline FEV1 readings]

  9. Distribution • Curve skewed to the right • Clearly not Normal • Statistics • Mean 1.965 • Median 1.820 • Variance 0.462

  10. Sampling • Suppose that we take simple random samples of size 10 • Take these at random from original distribution • With replacement • Calculate mean of each sample • Study distribution of these means • This is what is called a sampling distribution • Illustrated on next slide

  11. [Figure: sampling distribution of the means of samples of size 10]

  12. Distribution • Curve less obviously skewed to the right • Approximation to Normal is closer • Distribution is narrower • Statistics • Mean 1.961 (very similar to previously) • Median 1.948 (now much closer to mean) • Variance 0.043 (approximately 1/10 of previous value)
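
The sampling experiment on slides 10 and 12 can be sketched in a few lines of Python. The trial data are not reproduced in this transcript, so the sketch assumes a synthetic right-skewed population (a lognormal with arbitrarily chosen parameters) as a stand-in for the 790 FEV1 baselines; the function name `sampling_distribution_of_mean` and the seed are illustrative only.

```python
import random
import statistics

def sampling_distribution_of_mean(population, sample_size, n_samples, rng):
    """Draw n_samples simple random samples (with replacement) and return their means."""
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

rng = random.Random(15)
# Hypothetical stand-in for the 790 FEV1 baselines: a right-skewed lognormal population.
population = [rng.lognormvariate(0.6, 0.33) for _ in range(790)]
means = sampling_distribution_of_mean(population, sample_size=10, n_samples=5000, rng=rng)

pop_var = statistics.pvariance(population)   # variance of the original values
var_of_means = statistics.pvariance(means)   # variance of the sample means
print(f"variance of original values: {pop_var:.3f}")
print(f"variance of sample means:    {var_of_means:.3f}")
print(f"ratio: {var_of_means / pop_var:.3f} (theory for n = 10: 0.100)")
```

Under these assumptions the ratio of the two variances should come out close to 1/10, mirroring the drop from 0.462 to 0.043 reported on the slides, while the mean of the sample means stays close to the population mean.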

  13. [Figure not included in transcript]

  14. The Different Variances • Case 1 • Variance of original values • The square root of this is the standard deviation • Case 2 • Variance of means • Square root of this is the standard error of the mean (SEM) • In general • Square root of the variance of a statistic (e.g. a mean) is a standard error
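
The two cases are linked by a standard result for independent draws. The block below is a worked check using the variances reported on slides 9 and 12; the symbols $X_i$, $\bar{X}$, $\sigma^2$ and $n$ are notation assumed for this sketch rather than taken from the slides.

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n},
\qquad
\operatorname{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}} .
\]
\[
\text{Case 1: } \sqrt{0.462} \approx 0.680 \ (\text{SD}),
\qquad
\text{Case 2: } \sqrt{0.043} \approx 0.207 \ (\text{SEM}),
\qquad
\frac{0.462}{10} \approx 0.046 .
\]
```

The observed variance of the means (0.043) is close to the theoretical value of about 0.046 for samples of size 10; the small gap is what one would expect from a finite number of simulated samples.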

  15. Standard Deviation v Standard Error • Standard deviation • Used to describe variation of original values • Can be population • Can be sample • Standard error • Used to describe reliability of a statistic. For example • SE of mean • SE of treatment differences

  16. Estimating the Standard Error The standard error of the mean of a simple random sample of size $n$ drawn from a population with variance $\sigma^2$ is $\sigma/\sqrt{n}$. In practice $\sigma^2$, being a population parameter, is unknown, so we estimate it using the sample variance, $s^2$. Hence we estimate the standard error of the mean by $s/\sqrt{n}$.
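
As a minimal sketch of this estimate, the following Python computes the sample standard deviation and divides by the square root of the sample size. The function name `standard_error_of_mean` and the five readings are hypothetical and are not taken from the trial data.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the SEM as s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    s = statistics.stdev(sample)  # sample SD, using the n - 1 divisor
    return s / math.sqrt(n)

# Hypothetical readings, not taken from the trial data.
readings = [1.0, 2.0, 3.0, 4.0, 5.0]
sem = standard_error_of_mean(readings)
print(f"s = {statistics.stdev(readings):.3f}, SEM = {sem:.3f}")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, which matches the usual definition of the sample variance $s^2$.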

  17. Transformations • Can be very valuable • Improve accuracy of analysis • Under-utilised • Previous FEV1 example follows • log-transformation • data more nearly Normal • But will not deal with all problems • Outliers (in particular “bad” values)
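
The effect of a log-transformation on skewness can be illustrated with a small sketch. It assumes synthetic, strictly positive, right-skewed data (a lognormal with arbitrary parameters) in place of the FEV1 readings, and uses a simple moment-based skewness coefficient; the helper `skewness` is illustrative.

```python
import math
import random
import statistics

def skewness(values):
    """Moment-based skewness: mean cubed deviation divided by the SD cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([(v - m) ** 3 for v in values]) / sd ** 3

rng = random.Random(17)
# Hypothetical right-skewed, strictly positive data standing in for the FEV1 readings.
raw = [rng.lognormvariate(0.6, 0.33) for _ in range(790)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before log: {skew_raw:.2f}")
print(f"skewness after log:  {skew_log:.2f}")
```

For data of this kind the skewness should move much closer to zero after the transformation. A single erroneous value would still distort either version, which is the limitation noted in the last bullet.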

  18. [Figure: distribution of the log-transformed FEV1 readings]

  19. Normal Distribution • Ideal mathematical representation • Rarely applies in practice to original data • However, many sampling distributions have approximately Normal form • This increases its utility considerably • A combination of transformation of original data plus averaging can frequently make it applicable

  20. Technical Terms (Scheaffer, Mendenhall and Ott) • Element • Object on which a measurement is taken • Population • A collection of elements about which we wish to make an inference • Sampling units • Nonoverlapping collections of elements from the population that cover the entire population • Sampling frame • A list of sampling units • Sample • Collection of sampling units drawn from a frame

  21. Probability Sampling • Well-defined sampling frame • Probabilistic rule for drawing sample • Knowledge of rule and sampling frame enables probabilistic statements about the population • There are various types of such sample • simple, cluster, stratified

  22. Simple Random Sample We shall encounter this in more detail in the next lecture. For the moment we note a definition: “Sampling in which every member of the population has an equal chance of being chosen and successive drawings are independent” (Marriott, A Dictionary of Statistical Terms). Only for simple random sampling is the standard error of the mean equal to $s/\sqrt{n}$.
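
A minimal sketch of drawing such a sample from a sampling frame follows. The frame of 790 unit identifiers, the function name `simple_random_sample` and the seed are all hypothetical. Draws are made with replacement so that they are independent, as in the quoted definition.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units with equal probability and independent draws (with replacement)."""
    return rng.choices(frame, k=n)

rng = random.Random(22)
# Hypothetical sampling frame: a list of 790 unit identifiers.
frame = [f"unit-{i:03d}" for i in range(1, 791)]
sample = simple_random_sample(frame, n=10, rng=rng)
print(sample)

# Drawing without replacement, common in survey practice, would instead be
# rng.sample(frame, 10); successive draws are then no longer independent.
```

The choice between `choices` and `sample` is the design decision here: the first matches the independence requirement literally, the second guarantees that no unit is selected twice.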

  23. Quota Sampling • Sampling frame not used • May have rough idea of population composition • Sampling carries on until various quotas are fulfilled • e.g. 100 males, 100 females • Difficult to make probabilistic statements about population
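
The quota rule above can be sketched as a loop that accepts respondents in arrival order until each quota is full. Everything in the example is assumed for illustration: the stream of respondents, the `sex` field, the function name `quota_sample` and the quotas of 100 per group.

```python
import random
from collections import Counter

def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every quota is filled."""
    counts = Counter()
    selected = []
    for person in respondents:
        group = key(person)
        if counts[group] < quotas.get(group, 0):
            selected.append(person)
            counts[group] += 1
        # Stop as soon as all quotas are met.
        if all(counts[g] >= q for g, q in quotas.items()):
            break
    return selected

rng = random.Random(23)
# Hypothetical stream of passers-by; no sampling frame is involved.
stream = [{"id": i, "sex": rng.choice(["male", "female"])} for i in range(2000)]
sample = quota_sample(stream, {"male": 100, "female": 100}, key=lambda p: p["sex"])
tally = Counter(p["sex"] for p in sample)
print(tally)
```

Who ends up in the sample depends on the order in which people happen to arrive, not on a known selection probability, which is why probabilistic statements about the population are hard to justify.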
