
A Mathematical View of Our World

A Mathematical View of Our World. 1st ed. Parks, Musser, Trimpe, Maurer, and Maurer. Chapter 9: Collecting and Interpreting Data. Section 9.1: Populations, Samples, and Data. Goals: Study populations and samples. Study data (quantitative data, qualitative data). Study bias.


Presentation Transcript


  1. A Mathematical View of Our World 1st ed. Parks, Musser, Trimpe, Maurer, and Maurer

  2. Chapter 9 Collecting and Interpreting Data

  3. Section 9.1 Populations, Samples, and Data • Goals • Study populations and samples • Study data • Quantitative data • Qualitative data • Study bias • Study simple random sampling

  4. 9.1 Initial Problem • How can a professor choose 5 students from among 25 volunteers in a fair way? • The solution will be given at the end of the section.

  5. Populations and Samples • The entire set of objects being studied is called the population. • A population can consist of: • People or animals • Plants • Inanimate objects • Events • The members of a population are called elements.

  6. Populations and Samples, cont’d • Any characteristic of elements of the population is called a variable. • When we collect information from a population element, we say that we measure the variable being studied. • A variable that is naturally numerical is called quantitative. • A variable that is not numerical is called qualitative.

  7. Populations and Samples, cont’d • A census measures the variable for every element of the population. • A census is time-consuming and expensive, unless the population is very small. • Instead of dealing with the entire population, a subset, called a sample, is usually selected for study.

  8. Example 1 • Suppose you want to determine voter opinion on a ballot measure. You survey potential voters among pedestrians on Main Street during lunch. • What is the population? • What is the sample? • What is the variable being measured?

  9. Example 1, cont’d • Solution: The population consists of all the people who intend to vote on the ballot measure.

  10. Example 1, cont’d • Solution: The sample consists of all the people you interviewed on Main Street who intend to vote on the ballot measure.

  11. Example 1, cont’d • Solution: The variable being measured is the voter’s intent to vote “yes” or “no” on the ballot measure.

  12. Data • The measurement information recorded from a sample is called data. • Quantitative data are measurements of a quantitative variable. • Qualitative data are measurements of a qualitative variable.

  13. Data, cont’d • Qualitative data with a natural ordering is called ordinal. • For example, a ranking of a pizza on a scale of “Excellent” to “Poor” is ordinal. • Qualitative data without a natural ordering is called nominal. • For example, eye color is nominal.

  14. Data, cont’d • The types of data are illustrated below.

  15. Example 2 • Suppose you survey potential voters among the people on Main Street during lunch to determine their political affiliation and age, as well as their opinion on the ballot measure. • Classify the variables as quantitative or qualitative.

  16. Example 2, cont’d • Solution: • Political affiliation is a qualitative variable. • Age is a quantitative variable. • Opinion on the ballot measure is a qualitative variable.

  17. Question: Suppose you survey potential voters among the people on Main Street during lunch to determine their political affiliation and age, as well as their opinion on the ballot measure. Classify the qualitative variables political affiliation and opinion on the ballot measure as ordinal or nominal. a. Both are ordinal. b. Both are nominal. c. Political affiliation is ordinal and opinion is nominal. d. Political affiliation is nominal and opinion is ordinal.

  18. Samples, cont’d • Statistical inference is used to make an estimation or prediction for the entire population based on data collected from the sample. • If a sample has characteristics that are typical of the population as a whole, we say it is a representative sample. • A bias is a flaw in the sampling that makes it more likely the sample will not be representative.

  19. Common Sources of Bias • Faulty sampling: The sample is not representative. • Faulty questions: The questions are worded to influence the answers. • Faulty interviewing: Interviewers fail to survey the entire sample, misread questions, and/or misinterpret answers.

  20. Common Sources of Bias, cont’d • Lack of understanding or knowledge: The person being interviewed does not understand the question or needs more information. • False answers: The person being interviewed intentionally gives incorrect information.

  21. Example 3 • Suppose you wish to determine voter opinion regarding eliminating the capital gains tax. You survey potential voters on a street corner near Wall Street in New York City. • Identify a source of bias in this poll.

  22. Example 3, cont’d • Solution: One source of bias in choosing the sample is that people who work on Wall Street would benefit from the elimination of the tax and are more likely to favor the elimination than the average voter may be. • This is faulty sampling.

  23. Example 4 • Suppose a car manufacturer wants to test the reliability of 1000 alternators. They will test the first 30 from the lot for defects. • Identify any potential sources of bias.

  24. Example 4, cont’d • Solution: One source of bias could be that the first 30 alternators are chosen for the sample. It may be that defects are either much more likely at the beginning of a production run or much less likely at the beginning. In either case, the sample would not be representative. • This is potentially faulty sampling.

  25. Simple Random Samples • Representative samples are usually chosen randomly. • Given a population and a desired sample size, a simple random sample is any sample chosen in such a way that all samples of the same size are equally likely to be chosen.
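
The definition above can be illustrated with a minimal Python sketch. It assumes a population of 12 labeled elements and a sample size of 5; the function name `simple_random_sample` and the `seed` parameter are illustrative choices, not part of the slides. The standard library's `random.sample` draws without replacement so that every subset of the requested size is equally likely, and `math.comb` counts how many such subsets exist.

```python
import math
import random

def simple_random_sample(population, size, seed=None):
    """Draw `size` distinct elements; every subset of that size is equally likely."""
    rng = random.Random(seed)  # a seed is optional and only makes the draw repeatable
    return rng.sample(list(population), size)

population = list(range(12))   # assumed labels 0..11 for a 12-element population
n_possible = math.comb(len(population), 5)
print(n_possible)              # 792 possible samples of size 5
print(simple_random_sample(population, 5, seed=1))
```

Under simple random sampling, each of the 792 possible samples here has probability 1/792 of being the one chosen.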

  26. Simple Random Samples, cont’d • One way to choose a simple random sample is to use a random number generator or table. • A random number generator is a computer or calculator program designed to produce numbers with no apparent pattern. • A random number table is a table produced with a random number generator. • An example of the first few rows of a random number table is shown on the next slide.

  27. Random Number Table

  28. Example 5 • Choose a simple random sample of size 5 from 12 semifinalists: Astoria, Beatrix, Charles, Delila, Elsie, Frank, Gaston, Heidi, Ian, Jose, Kirsten, and Lex.

  29. Example 5, cont’d • Solution: Assign numerical labels to the population elements, in any order, as shown below:

  30. Example 5, cont’d • Solution, cont’d: Choose a random spot in the table to begin. • In this case, we could choose to start at the top of the third column and to read down, looking at the last 2 digits in each number. This choice is arbitrary. • Numbers that correspond to population labels are recorded, ignoring duplicates, until 5 such numbers have been found.

  31. Example 5, cont’d

  32. Example 5, cont’d • Solution, cont’d: The numbers located are 01, 06, 10, 11, and 07. • The simple random sample consists of Beatrix, Gaston, Heidi, Kirsten, and Lex.
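
The label table for this example is not reproduced in the transcript. The sketch below assumes the semifinalists were labeled 00 through 11 in alphabetical order, which is consistent with the sample the slide reports; the variable names are illustrative.

```python
names = ["Astoria", "Beatrix", "Charles", "Delila", "Elsie", "Frank",
         "Gaston", "Heidi", "Ian", "Jose", "Kirsten", "Lex"]

# Assumed labeling: 00 = Astoria, 01 = Beatrix, ..., 11 = Lex.
labels = {f"{i:02d}": name for i, name in enumerate(names)}

# Numbers located in the random number table, as given on the slide.
located = ["01", "06", "10", "11", "07"]

sample = [labels[number] for number in located]
print(sorted(sample))  # ['Beatrix', 'Gaston', 'Heidi', 'Kirsten', 'Lex']
```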

  33. Question: Choose a different simple random sample of size 5 from the 12 semifinalists: Astoria, Beatrix, Charles, Delila, Elsie, Frank, Gaston, Heidi, Ian, Jose, Kirsten, and Lex.

  34. Question, cont’d Use the first 2 digits of each number, reading across the row starting in row 128 of the random number table. a. Delila, Beatrix, Lex, Kirsten, Jose b. Frank, Jose, Elsie, Delila, Ian c. Charles, Ian, Frank, Beatrix, Gaston d. Jose, Beatrix, Ian, Heidi, Lex

  35. Example 6 • Choose a simple random sample of size 8 from the states of the United States of America.

  36. Example 6, cont’d • Solution: Numerical labels can be assigned to the population elements in any order. • In this example we choose to order the states by area. • The labels are shown on the next slide.

  37. Example 6, cont’d

  38. Example 6, cont’d • Solution, cont’d: We randomly choose to start at the top row, left column of the number table and read the last 2 digits of each entry across the row. • The entries are 03918 77195 47772 21870 87122 99445 10041 31795 63857 64569 34893 20429 43537 25368 95237 17707 34280 04755 64301 66836 12201…

  39. Example 6, cont’d • Solution, cont’d: • The numbers obtained from the table are 18, 22, 45, 41, 29, 37, 07, 01. • The states selected for the sample are Washington, Florida, Vermont, West Virginia, Arkansas, Kentucky, Nevada, and Alaska.
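
The table-reading procedure in this example can be sketched in Python using the entries quoted on the previous slide. The sketch assumes the states carry labels 01 through 50; the helper name `read_labels` is hypothetical. It takes the last 2 digits of each entry, skips values outside the label range, ignores duplicates, and stops once 8 labels have been found.

```python
entries = ("03918 77195 47772 21870 87122 99445 10041 31795 63857 64569 "
           "34893 20429 43537 25368 95237 17707 34280 04755 64301 66836 "
           "12201").split()

def read_labels(entries, low, high, k):
    """Collect the first k distinct labels in [low, high] from the last 2 digits."""
    chosen = []
    for entry in entries:
        label = int(entry[-2:])           # last 2 digits of the table entry
        if low <= label <= high and label not in chosen:
            chosen.append(label)          # valid label, not a duplicate
        if len(chosen) == k:
            break
    return chosen

print(read_labels(entries, 1, 50, 8))  # [18, 22, 45, 41, 29, 37, 7, 1]
```

Note that 37 appears twice in the entries (43537 and 95237) and is counted only once.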

  40. 9.1 Initial Problem Solution • To fairly select 5 students from 25 volunteers, a professor could choose a simple random sample. • Solution: Assign the students labels of 00 through 24 according to some ordering. • Pick a starting place in a random number table and read until 5 students have been selected.

  41. Initial Problem Solution, cont’d • Suppose the first 2 digits of each entry in the last column are used. • The first 5 numbers that are 24 or less are 20, 04, 16, 07, and 06. • The students that were assigned these labels are fairly chosen from the 25 volunteers.

  42. Section 9.2 Survey Sampling Methods • Goals • Study sampling methods • Independent sampling • Systematic sampling • Quota sampling • Stratified sampling • Cluster sampling

  43. 9.2 Initial Problem • You need to interview at least 800 people nationwide. • You need a different interviewer for each county. • Each interviewer costs $50 plus $10 per interview. • Your budget is $15,000. • Which is better, a simple random sample of all adults in the U.S. or a simple random sample of adults in randomly-selected counties? • The solution will be given at the end of the section.
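
The cost figures in this problem can be explored with a short Python sketch. It is only the arithmetic implied by the stated prices, not the solution given in the slides; the function and constant names are illustrative, and the worst case shown assumes every respondent in a nationwide sample lives in a different county.

```python
INTERVIEWER_FEE = 50     # dollars per interviewer (one interviewer per county)
PER_INTERVIEW = 10       # dollars per interview
BUDGET = 15_000
MIN_INTERVIEWS = 800

def total_cost(n_counties, n_interviews):
    """Cost of n_interviews spread over n_counties, one interviewer per county."""
    return INTERVIEWER_FEE * n_counties + PER_INTERVIEW * n_interviews

# Largest number of counties affordable while still doing 800 interviews.
max_counties = (BUDGET - PER_INTERVIEW * MIN_INTERVIEWS) // INTERVIEWER_FEE
print(max_counties)            # 140
print(total_cost(140, 800))    # 15000
print(total_cost(800, 800))    # 48000, assumed worst case: 800 different counties
```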

  44. Sample Survey Design • Simple random sampling can be expensive and time-consuming in practice. • Statisticians have developed sample survey design to provide less expensive alternatives to simple random sampling.

  45. Independent Sampling • In independent sampling, each member of the population has the same fixed chance of being selected for the sample. • The size of the sample is not fixed ahead of time. • For example, in a 50% independent sample, each element of the population has a 50% chance of being selected.
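
A minimal Python sketch of independent sampling follows, assuming a pseudo-random generator in place of a random number table; the function name `independent_sample` and the `seed` parameter are illustrative. Each element is included with the same probability `p`, independently of the others, so the sample size varies from draw to draw.

```python
import random

def independent_sample(population, p, seed=None):
    """Include each element independently with probability p."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so the test succeeds with probability p.
    return [element for element in population if rng.random() < p]

# Sample sizes from five separate 50% independent samples of 12 elements.
sizes = [len(independent_sample(range(12), 0.5, seed=s)) for s in range(5)]
print(sizes)
```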

  46. Example 1 • Find a 50% independent sample of the 12 semifinalists: Astoria, Beatrix, Charles, Delila, Elsie, Frank, Gaston, Heidi, Ian, Jose, Kirsten, and Lex.

  47. Example 1, cont’d • Solution: Because each position in a random number table is equally likely to hold any of the 10 digits 0 through 9, there is a 50% chance that a given digit is one of the five digits 0, 1, 2, 3, or 4. • Let the digits 0, 1, 2, 3, and 4 represent “select this contestant” and let the remaining digits represent “do not select this contestant”.

  48. Example 1, cont’d • Solution, cont’d: We randomly choose column 6 in the random number table and look at the first 12 digits: 99445 20429 04. • The first 9 indicates that Astoria is not selected. • The second 9 indicates that Beatrix is not selected. • The 4 indicates that Charles is selected, and so on… • The 50% independent sample is Charles, Delila, Frank, Gaston, Heidi, Ian, Kirsten, and Lex.
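
The digit-by-digit selection in this example can be reproduced with a short Python sketch. It pairs the 12 digits quoted on the slide with the semifinalists in the listed order and keeps those whose digit is 0 through 4; the helper name `digit_sample` is hypothetical.

```python
names = ["Astoria", "Beatrix", "Charles", "Delila", "Elsie", "Frank",
         "Gaston", "Heidi", "Ian", "Jose", "Kirsten", "Lex"]

def digit_sample(names, digits, select_digits):
    """Keep each name whose paired digit is one of the selecting digits."""
    digits = digits.replace(" ", "")      # drop the spacing used in the table
    return [name for name, d in zip(names, digits) if d in select_digits]

sample = digit_sample(names, "99445 20429 04", set("01234"))
print(sample)
# ['Charles', 'Delila', 'Frank', 'Gaston', 'Heidi', 'Ian', 'Kirsten', 'Lex']
```

Although each contestant had a 50% chance of selection, this particular draw selected 8 of the 12, which shows that the sample size is not fixed in advance.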

  49. Question: Choose a 40% independent sample from the 12 semifinalists: Astoria, Beatrix, Charles, Delila, Elsie, Frank, Gaston, Heidi, Ian, Jose, Kirsten, and Lex. Use the first 12 digits of row 145 of the random number table and use digits 0, 1, 2, 3 for selection.

  50. Question, cont’d Use the first 12 digits of row 145 of the random number table and use digits 0, 1, 2, 3 for selection. a. Astoria, Beatrix, Charles, Delila b. Charles, Elsie, Frank, Gaston c. Charles, Elsie, Frank, Gaston, Heidi, Jose, Kirsten, Lex. d. Beatrix, Charles, Delila, Frank, Heidi, Ian, Lex.
