
Chapter 5: Producing Data



Presentation Transcript


  1. Chapter 5: Producing Data “An approximate answer to the right question is worth a good deal more than the exact answer to an approximate question.” John Tukey

  2. 5.1 Designing Samples (p. 245-261) (Overview) • The sampling process must be designed carefully in order to obtain reliable statistical information. • Good sampling techniques, many of which rely on chance, produce meaningful and useful results. • Bad sampling techniques produce worthless data.

  3. Definitions • Voluntary response sample: consists of people who choose themselves by responding to a general appeal. • Example: listeners who call in to respond to a talk show question. • Two variables are confounded when their effects on a response variable cannot be distinguished from one another. • See Example 5.2 in the textbook, in which the explanatory variable (the reading of favorable propaganda) and the events of history are confounded.

  4. Definitions (cont’d.) • Statistical inference: provides ways to give “reasonable” answers to specific questions by examining data. • Population: the entire group about which information is desired. • Sample: the part of the population that is examined in order to obtain information about the whole population.

  5. Definitions (cont’d.) • Sampling Frame: the list of individuals from which a sample is actually selected. • Example: • Population: adult residents of Delaware County • Sampling Frame: voter registration roll • Design: the method that is used to select the sample.

  6. Definitions (cont’d.) • Convenience sample: a sample made up of the individuals who are easiest to reach. • Examples: • Opinions offered by shoppers entering or leaving a WaWa or Borders in Springfield (used by Daily Times) • Opinions offered by students of a Catholic school (used by Catholic Standard and Times) • Biased sample: a sample produced by a design that systematically favors certain outcomes.

  7. Definitions (cont’d.) • Simple random sample (SRS) of size n: a sample chosen in such a way that every set of n individuals has an equal chance of being the sample actually selected. • This is easier said than done! It can be tricky to obtain an SRS. • Probability sample: a sample chosen by chance, in which each member of the population has a known chance of being chosen.
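As an illustration of the SRS definition, the sketch below draws a sample of size n from a sampling frame using Python's standard `random` module. The frame of 500 made-up voter labels, the function name `simple_random_sample`, and the seed are assumptions chosen for the example and are not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw an SRS of size n from a sampling frame (a list of individuals)."""
    rng = random.Random(seed)
    # random.sample selects without replacement, so every set of n
    # individuals in the frame is equally likely to be the sample.
    return rng.sample(frame, n)

# Hypothetical sampling frame: 500 labeled entries from a voter roll.
frame = [f"voter_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Labeling every individual and letting the generator choose plays the same role as reading labels from a table of random digits.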

  8. Definitions (cont’d.) • Stratified random sample:
     • Steps:
       • The population is divided into groups called strata.
       • An SRS is chosen from each stratum.
       • The SRSs are combined into one sample.
     • Reasons:
       • To reduce the variation of the estimators
       • Administrative convenience
       • Less expensive
       • Estimates are needed for “subgroups” of the population
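The three steps of a stratified random sample can be sketched as follows. This is a minimal illustration, assuming the strata are already known; the strata names, their sizes, and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take an SRS from each stratum and combine them into one sample.

    strata: dict mapping stratum name -> list of individuals
    sizes:  dict mapping stratum name -> number to draw from that stratum
    """
    rng = random.Random(seed)
    combined = []
    for name, members in strata.items():
        # Step 2: an SRS within this stratum (drawn without replacement).
        combined.extend(rng.sample(members, sizes[name]))
    # Step 3: the per-stratum SRSs together form the full sample.
    return combined

# Step 1 (assumed already done): the population divided into strata.
strata = {
    "urban": [f"urban_{i}" for i in range(300)],
    "suburban": [f"suburban_{i}" for i in range(150)],
    "rural": [f"rural_{i}" for i in range(50)],
}
sizes = {"urban": 6, "suburban": 3, "rural": 1}
sample = stratified_sample(strata, sizes, seed=3)
print(sample)
```

Here the sample sizes are proportional to the stratum sizes, which is one common choice; other allocations are possible.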

  9. Definitions (cont’d.) • Multistage sample design: the selection of successively smaller groups within a population in stages. • Undercoverage occurs when some groups in the population are left out of the process of choosing the sample. • Nonresponse occurs when an individual chosen for the sample cannot be contacted or refuses to cooperate. • Response bias refers to a variety of things that can lead to an incorrect or false response.

  10. Final Thoughts: • The wording of the question can greatly influence the response. • A poorly worded question can confuse those who are attempting to answer it.

  11. 5.2 Designing Experiments (p. 265-284) An Overview • There are good and bad techniques for producing data. • Random sampling and randomized comparative experiments are important and effective statistical practices. • The use of chance is vital in statistical design.

  12. Concepts and Definitions • In an observational study, NO treatment is imposed on the individuals in the study. • Variables of interest are measured, usually over a period of time. • In an experiment, a treatment is imposed on the individuals in the study. • Responses to the treatment are observed.

  13. Definitions (cont’d.) • Experimental units are the individuals on which the experiment is performed. • That is, the participants in the experiment (called subjects when they are people). • A treatment is a specific experimental condition that is applied to the experimental units. • A placebo is a dummy treatment that can have no physical effect on an experimental unit. • Commonly called a “sugar pill.”

  14. Definitions (cont’d.) • The control group receives the placebo. • This group helps the experimenter to control the effects of any lurking variables. • The treatment group receives the treatment.

  15. Definitions (cont’d.) • Completely randomized experimental design: all experimental units are allocated at random among the treatments. • Statistically significant result: an observed effect so large that it would rarely occur by chance alone.
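A completely randomized design can be sketched in a few lines: shuffle all the experimental units, then deal them out to the treatments. The subject labels, the two treatment names, and the function name `completely_randomized` are assumptions made for this example.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Allocate every experimental unit to a treatment purely by chance."""
    rng = random.Random(seed)
    shuffled = units[:]          # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    # Deal the shuffled units out in turn so group sizes differ by at most one.
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 20 subjects, treatment versus placebo.
units = [f"subject_{i:02d}" for i in range(1, 21)]
groups = completely_randomized(units, ["treatment", "placebo"], seed=7)
for name, members in groups.items():
    print(name, sorted(members))
```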

  16. Three Principles of Experimental Design
     • CONTROL
       • Needed to counter the effects of lurking variables.
       • Comparison is the simplest form of control.
       • Experiments should compare two or more treatments in order to avoid confounding the effect of the treatment with some other influence.
     • RANDOMIZATION
       • Subjects are assigned to treatments by pure chance.
       • Creates groups that are similar (except for chance variation).
       • A table of random digits can be used to choose the units for each group.
     • REPLICATION
       • The experiment should be done on many subjects to reduce chance variation in the results.

  17. Definitions (cont’d.) • In a double-blind experiment, neither the subjects nor the people who have contact with them know which treatment a subject is receiving. • A block design • Minimizes variation. • Block: a group of experimental units or subjects that are similar in ways that are expected to affect the response to the treatments. • Treatments are assigned randomly within each block. • A form of control.
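The difference between a block design and a completely randomized design is where the randomization happens: treatments are shuffled separately inside each block. The sketch below assumes two hypothetical blocks and two treatment names; `randomized_block` is an illustrative function name, not one from the slides.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block.

    blocks: dict mapping block name -> list of similar experimental units
    Returns a dict mapping each unit -> (block name, treatment).
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = members[:]
        rng.shuffle(shuffled)    # chance decides the order within this block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks: units grouped by a variable expected to affect the response.
blocks = {
    "block_A": [f"A{i}" for i in range(1, 9)],
    "block_B": [f"B{i}" for i in range(1, 7)],
}
assignment = randomized_block(blocks, ["treatment", "placebo"], seed=11)
for unit, (block, treatment) in sorted(assignment.items()):
    print(unit, block, treatment)
```

Because each block is split evenly, the blocking variable cannot be confounded with the treatment.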

  18. Definitions (cont’d.) • Matched pairs:
     • Common form of blocking
     • Compares two treatments
     • The pairs are “alike”
     • Common forms:
       • Using a random process
         • In each pair, one receives the treatment, the other receives the placebo
         • Pairs are observed at a later time to see if the treatment had any effect
       • Test scores from a before-after situation
         • Individual
           • Takes a before-test
           • Receives some type of treatment
           • Takes an after-test
         • Purpose: to see if the treatment improves test performance

  19. 5.3 Simulation Experiments (p. 286-296) An Overview • Empirical probabilities relating to real-life situations can be obtained by simulation. • Chance outcomes can be imitated by using • Random number generators • Tables • Calculators • Computers • Dice • Cards • Spinners

  20. Simulation • The imitation of chance behavior in an attempt to gain information about a real-life situation. • randInt( can be used on your TI-84 Plus to generate random integers.
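For readers working on a computer rather than a calculator, Python's `random.randint(a, b)` returns a random integer with both endpoints included, which makes it a rough counterpart to generating random integers on the TI-84 Plus. The seed and the two examples (die rolls and single digits) are assumptions chosen for illustration.

```python
import random

rng = random.Random(2024)   # fixed seed so the run can be repeated

# Ten rolls of a fair die: integers from 1 to 6, both ends included.
rolls = [rng.randint(1, 6) for _ in range(10)]

# Twenty single digits, like a short line from a table of random digits.
digits = [rng.randint(0, 9) for _ in range(20)]

print("rolls: ", rolls)
print("digits:", digits)
```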

  21. Steps in Creating a Simulation Model • State the problem or describe the experiment. • State the assumptions. • Assign digits to represent outcomes. • Simulate many repetitions. • State your conclusions.

  22. When Trials are Completed • Determine the empirical probability by calculating a ratio: • The number of trials in which the outcome of interest occurred divided by the total number of trials.
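The simulation steps and the final ratio can be put together in one short sketch. The problem chosen here (the chance of at least one 6 in four rolls of a fair die) is an assumption made for illustration and does not come from the slides; it is convenient because the exact answer, 1 - (5/6)^4, is known and can be compared with the empirical result.

```python
import random

def estimate_probability(trials, seed=None):
    """Estimate P(at least one 6 in four rolls of a fair die) by simulation."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        # Assumptions: the die is fair and the rolls are independent.
        # Each outcome is represented by an integer from 1 to 6.
        rolls = [rng.randint(1, 6) for _ in range(4)]
        if 6 in rolls:
            successes += 1
    # Empirical probability: trials of interest divided by total trials.
    return successes / trials

estimate = estimate_probability(10_000, seed=42)
exact = 1 - (5 / 6) ** 4
print(f"empirical: {estimate:.3f}   exact: {exact:.3f}")
```

With more trials the empirical probability tends to settle closer to the exact value, which is why many repetitions are needed.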
