Introduction to Statistics

cmarcial

Presentation Transcript


  1. Introduction to Statistics MATH 124 Sections 29.1-29.3

  2. The name • Data analysis usually refers to an informal approach to statistics. It is a relatively new term in mathematics. • Statistics once referred to numerical information about state or political territories; it comes from the Latin statisticus, meaning “of the state.” Today, much of statistics involves making sense of data.

  3. Statistic and statistics • A statistic is a numerical value of a quantity. • Statistics is the science of obtaining, organizing, describing, and analyzing data for the purpose of making decisions, as well as making predictions.

  4. Note from the book (p.658) • In most instances, statistics are reliable and useful... But statistics can also be unreliable or misleading. Just as it is possible to lie when giving an account of an event, it is possible to “lie” with statistics by inappropriately manipulating data, withholding information that is crucial to the interpretation of data, or presenting statistical information in a way that hides important information about the data. A person who understands fundamental ideas in statistics is more likely to recognize these unethical uses of data than a person who does not have this understanding.

  5. More from p.658 • On the surface, statistics as a discipline appears to be a precise science that yields only one “right answer” to a question. One reason for this belief is that statistics is associated with mathematics, which is in turn associated with precision. Another reason is that a statistical analysis produces apparently precise numerical results. (For example, you’ve probably encountered statements such as “Families in that community have on average 2.6 children.”) There are, however, some gray areas in interpreting statistics. As a rule of thumb, expect precision in some aspects of statistics, such as computing statistics or making honest graphs, but expect gray areas in aspects such as interpreting graphs and interpreting statistics.

  6. Relationship between probability and statistics • There is a close relationship between probability and statistics. Much statistical decision making is based on probabilities, as statistics typically deals with population samples instead of entire populations, and therefore there is an element of chance involved.

  7. Conducting data analysis • The following framework holds for statistical problem solving: • Formulate the questions • Collect the data • Analyze the data • Interpret the results

  8. 1. Formulating questions • Conceiving the object to be measured clearly enough to imagine a way to measure it is often the most difficult part of a statistical study of a new concept. Curricula for grades K-8 often have children decide what data should be collected to answer the questions they have suggested for a statistics project. For example, how should we measure parents’ tolerance of violence on television?

  9. 2. Collecting data • It is of utmost importance to pick an unbiased sample. • A sample is biased if the process of gathering the sample makes it likely that the sample will not reflect the population of interest. • There are different types of sampling. The book only discusses random sampling, in which every member of the population has an equal chance of being selected for the sample. • What can go wrong if a sample is biased? Consider this famous example: the 1936 U.S. presidential election, when a very large Literary Digest poll predicted a win for Alf Landon, but Franklin Roosevelt won in a landslide.

  10. Types of data • Measurement (or quantitative) • Categorical (or qualitative) For example: test scores are quantitative; favorite colors are qualitative.

  11. 3. & 4. Analyzing and interpreting data • In this class, we will discuss the following ways to analyze data: • Creating statistical graphs • Finding averages • We will use these to draw conclusions about the data.

  12. Some terminology • A population is the entire group that is of interest. • A sample is the part of the population that is actually used to collect data. A sample is biased if the process of gathering the sample makes it likely that the sample will not reflect the population of interest. • A sample statistic is the result of a calculation or count based on data gathered from the sample. • A population parameter is the same calculation or count based on the entire population.
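
The difference between a population parameter and a sample statistic can be illustrated with a short sketch. The population below is hypothetical (600 students, 210 of whom would say yes); the numbers are invented for illustration and do not come from the book.

```python
import random

# Hypothetical population: 600 students, 210 of whom would buy a ticket.
# These numbers are invented for illustration only.
population = [True] * 210 + [False] * 390


def proportion_yes(responses):
    """Fraction of responses that are True ("yes")."""
    return sum(responses) / len(responses)


# Population parameter: computed from every member of the population.
parameter = proportion_yes(population)

# Sample statistic: the same calculation on a random sample of 60.
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.sample(population, 60)
statistic = proportion_yes(sample)

print(f"population parameter: {parameter:.2f}")
print(f"sample statistic:     {statistic:.2f}")
```

The two values are usually close but rarely identical, which is why a sample statistic is treated as an estimate of the parameter.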

  13. Some types of bias: • self-selected or voluntary sample • convenience sample.

  14. Problem to consider (p.660) • An elementary school with grades 1 through 6 has 100 students in each grade. A fifth-grade class is trying to raise some money to go on a field trip to Disneyland. They are considering several options to raise money and decide to do a survey to help them determine the best way to raise the most money. One option is to sell raffle tickets for a Wii U. How could they find out whether or not students were interested in buying a raffle ticket to win the game system?

  15. Proposed surveys • Raffi asked 60 friends. (75% yes, 25% no) • Marta got the names of all 600 students in the school, put them in a hat, and pulled out 60 of them. (35% yes, 65% no) • Spence had blond hair so he asked the first 60 students he found who also had blond hair. (55% yes, 45% no) • Jinfa asked 60 students at an after-school meeting of the Games Club. The Games Club met once a week and played different games, especially computerized ones. Anyone who was interested in games could join. (90% yes, 10% no)

  16. Abby sent out a questionnaire to every student in the school and then used the first 60 that were returned to her. (50% yes, 50% no) • SuLin set up a booth outside the lunchroom, and anyone who wished could stop by and fill out her survey. To advertise her survey, she posted a sign that said WIN A Wii U. She stopped collecting surveys when 60 students had completed the survey. (100% yes) • Jazmine asked the first 60 students she found whose telephone number ended in a 3 because 3 is her favorite number. (25% yes, 75% no)

  17. Dong wanted the same number of boys and girls and some students from each grade. So he asked 5 boys and 5 girls from each grade to get his total of 60 students. (30% yes, 70% no) • Paula didn’t know many boys, so she decided to ask 60 girls. But she wanted to make sure she got some young girls and some older ones, so she asked 10 girls from each grade. (10% yes, 90% no)

  18. Questions • What is the population in this case? • What is the sample? • What is the sample statistic we are trying to find? • What is the population parameter we are trying to find? • What would your estimate for the population parameter be based on these nine surveys?

  19. Discussion 2 (p. 660) • For each sample in Activity 1, why do you think the percentages came out the way they did? • What kinds of biases could show up in the students’ samples? • Do you think the percentages would have changed if the sample size had changed?
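
One way to explore the sample-size question is a small simulation. The sketch below assumes a hypothetical school in which exactly 35% of the 600 students would say yes (an invented figure), draws many simple random samples of several sizes, and measures how much the results vary from sample to sample.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical school: 600 students, 35% of whom would say yes.
population = [1] * 210 + [0] * 390


def sample_percent_yes(size):
    """Percent of yes answers in one simple random sample of the given size."""
    return 100 * sum(rng.sample(population, size)) / size


spread = {}
for size in (10, 60, 300):
    # Repeat the survey 1000 times and record how much the results vary.
    results = [sample_percent_yes(size) for _ in range(1000)]
    spread[size] = statistics.pstdev(results)
    print(f"n = {size:>3}: results vary by about {spread[size]:.1f} percentage points")
```

Larger random samples give results that cluster more tightly around the population value. Note that a larger sample does not remove bias: asking 300 Games Club members would still overstate interest.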

  20. Types of samples (p. 664) • Random sampling is sampling in which every member of the population has an equal chance of being selected for the sample. • A simple random sample is one in which every possible sample of a particular size has an equal chance of being selected. • A stratified random sample is used when the population is made up of different groups, or strata (e.g., by age, race, or gender); a random sample is drawn from each stratum.

  21. Systematic sampling is used when the population is already organized in some way that is not related to the study; for example, every tenth name on an alphabetical list is selected. • A cluster sample is one where a cluster is selected at random and all of its members are surveyed. • For examples of non-random sampling, consider convenience and self-selected samples.
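
The four random sampling methods can be sketched side by side. The roster below is hypothetical and mirrors the school example (6 grades, 100 students each). Taking 10 students per grade and choosing every 10th student are illustrative assumptions, not rules from the book.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical roster: 6 grades with 100 students each.
roster = [(grade, number) for grade in range(1, 7) for number in range(100)]

# Simple random sample: every set of 60 students is equally likely.
simple = rng.sample(roster, 60)

# Stratified random sample: draw 10 students at random from each grade (stratum).
stratified = []
for grade in range(1, 7):
    stratum = [student for student in roster if student[0] == grade]
    stratified.extend(rng.sample(stratum, 10))

# Systematic sample: random starting point, then every 10th student on the list.
step = len(roster) // 60
start = rng.randrange(step)
systematic = roster[start::step]

# Cluster sample: pick one grade at random and survey everyone in it.
chosen_grade = rng.choice(range(1, 7))
cluster = [student for student in roster if student[0] == chosen_grade]

for name, chosen in [("simple", simple), ("stratified", stratified),
                     ("systematic", systematic), ("cluster", cluster)]:
    print(f"{name:<11}{len(chosen):>4} students")
```

Here the cluster sample has 100 students rather than 60, because a cluster sample surveys every member of the chosen cluster.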

  22. One more look at the surveys • Can you recognize any of the random sampling methods among the nine surveys?

  23. Questions • Why do we need a sample? Why don’t we just use the entire population? • How do we choose a sample? • How big should a sample be? • Can you think of examples of parameters that are estimated by collecting data from a sample? • Can you think of examples of parameters that are calculated/counted by using the entire population? • Why can we assume that the sample statistic is a good estimate for the parameter?

  24. Reasons for using a sample • It is not always possible to include all of the population. • Gathering information is costly in terms of both time and money. • Results are more timely. • The discipline of statistics allows us to interpret results from samples and to make assertions about the whole population. A result from a relatively small, but carefully chosen, sample can give information about the whole population. In this statement, “relatively small” does not necessarily mean small in number; rather, it means far fewer than the whole population.

  25. Random sampling • When statisticians want to find a random sample, they do not usually draw names from a hat, spin a spinner, or toss a die, although these are legitimate ways to sample randomly. To obtain a large sample, these methods would be very time-consuming. Instead, statisticians might use computer simulation software, a table of random numbers, or a computer or calculator with the capability of providing random numbers. • This is a common middle school topic, but we will skip it in this class.
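
As a minimal sketch of the computer approach, the students can be numbered 1 to 600 and a program asked for 60 distinct numbers. This uses Python's standard `random` module; the seed value is an arbitrary choice made only so the run is repeatable.

```python
import random

rng = random.Random(2024)  # arbitrary seed; omit it for a fresh draw each run

# Number the 600 students 1..600 and draw 60 distinct numbers, the
# computer equivalent of pulling 60 names out of a hat.
chosen = sorted(rng.sample(range(1, 601), 60))
print(chosen)
```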

  26. Conducting a survey • Now it’s your turn. Turn to page 672 and work with your group to answer questions 1-5 in Section 29.5. Be creative! • It will also help to look at the eight considerations at the beginning of the section. • Note that I will ask you to conduct your survey at PLU over the weekend, so be reasonable in your question, population, and sampling method.

  27. Types of data: qualitative

  28. Frequency tables
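
A frequency table lists each category together with the number of times it occurs. A minimal sketch, using invented favorite-color responses:

```python
from collections import Counter

# Invented responses to "What is your favorite color?"
colors = ["red", "blue", "blue", "green", "red",
          "blue", "yellow", "green", "blue", "red"]

frequency = Counter(colors)

# Print the table, most common category first.
print(f"{'Color':<8}{'Count':>5}{'Percent':>9}")
for color, count in frequency.most_common():
    print(f"{color:<8}{count:>5}{count / len(colors):>9.0%}")
```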

  29. Bar graphs

  30. Pie charts
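
In a pie chart, each category's slice takes up the same share of the 360-degree circle as the category's share of the data. A minimal sketch of that arithmetic, with invented counts:

```python
# Invented counts of favorite colors among 10 students.
counts = {"red": 3, "blue": 4, "green": 2, "yellow": 1}
total = sum(counts.values())

# Each slice's central angle is its share of the 360 degrees in a circle.
angles = {color: count * 360 / total for color, count in counts.items()}

for color, angle in angles.items():
    print(f"{color:<7}{counts[color] / total:>5.0%}{angle:>7.0f} degrees")
```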

  31. Types of data: quantitative

  32. Stem and leaf plots • Stem and leaf plots can be used to represent measurement data.
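
A stem and leaf plot splits each value into a stem (here, all digits but the last) and a leaf (the last digit). A minimal sketch with invented test scores:

```python
from collections import defaultdict

# Invented test scores (measurement data).
scores = [67, 72, 75, 78, 81, 81, 84, 88, 90, 93, 95, 100]

# Stem = all digits but the last; leaf = the last digit.
stems = defaultdict(list)
for score in sorted(scores):
    stems[score // 10].append(score % 10)

# Print every stem in range, including any stem that has no leaves.
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Sorting the scores first keeps the leaves on each row in increasing order.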

  33. Line plots • These are prominent in the Common Core Standards and are also used to represent measurement data. An example is below:
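
A line plot can also be sketched in plain text: a number line with one X stacked above a value for each time it occurs. The following is a minimal illustration with invented data, not the figure from the slide.

```python
from collections import Counter

# Invented data: number of siblings reported by 12 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2, 4, 1]

counts = Counter(siblings)
values = range(min(siblings), max(siblings) + 1)
tallest = max(counts.values())

# Build the plot from the top row down: one X per observation.
rows = []
for level in range(tallest, 0, -1):
    rows.append(" ".join("X" if counts[v] >= level else " " for v in values))
rows.append("-" * (2 * len(values) - 1))       # the number line
rows.append(" ".join(str(v) for v in values))  # labels under the line

print("\n".join(rows))
```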

  34. Line graphs • Usually represent the change of a quantity over time. These graphs are commonly used, but the book doesn’t give too much attention to them, nor do the CCSM.

  35. Histograms • A histogram resembles a bar graph, but the two are not the same. The main differences are that the data categories have to be quantitative, the bars have to follow the order of the categories, and the widths of the bars must have specific meaning. • At the K-12 level, it is assumed that all the bars have the same width.
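
The equal-width case can be sketched as follows: each score is assigned to an interval (bin) of width 10, and the bars are drawn in the order of the intervals. The scores, the bin width, and the starting value are invented for illustration.

```python
# Invented test scores (quantitative data).
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 80, 82, 85, 88, 91, 97]

bin_width = 10  # every bar covers the same width
low = 50        # left edge of the first bin
n_bins = 5      # bins: 50-59, 60-69, 70-79, 80-89, 90-99

# Count how many scores fall into each bin.
counts = [0] * n_bins
for score in scores:
    counts[(score - low) // bin_width] += 1

# Draw the bars in the order of the intervals.
for i, count in enumerate(counts):
    left = low + i * bin_width
    print(f"{left}-{left + bin_width - 1} | {'#' * count}")
```

Unlike the categories in a bar graph, the bins cannot be reordered, because they represent consecutive intervals on a number line.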
