Elementary Statistics Chapter 1 Introduction to Statistics
Statistics • Method of analysis • a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data
What is Statistics? • USA Today, December 10, 1997--The biggest study ever of the health effects of alcohol concludes that a drink a day can cut your risk of death by 20%…The researchers gave questionnaires to 490,000 men and women and then followed up nine years later, after 46,000 of them had died…[However], the benefits decreased as people drank more. Among those who averaged four or five drinks a day, the risk of death among men was 10% lower, while among women it was 7% lower.
What is Statistics? • New York Times, September 17, 1996--Millions of Americans routinely ignore one of mom’s most important pieces of advice: Wash your hands after you go to the bathroom. This unsettling item of news was gathered in the only way possible--by actually watching what people do (or don’t do) in public restrooms. The researchers--if that’s what they should be called--hid in stalls or pretended to comb their hair while observing 6,333 men and women do their business in five cities…Just 60% of those using restrooms in Penn Station (New York City) washed up afterward.
Statistics is the science of data. It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting data.
Population • A population is the complete collection of all elements (scores, people, measurements, and so on) to be studied. The collection is complete in the sense that it includes all subjects to be studied. A population is the totality of all subjects possessing certain common characteristics that are being studied. In a statistical study, the researcher must define the population being studied.
Population and Sample • Potential advertisers value television’s well-known Nielsen ratings as a barometer of a TV show’s popularity among viewers. The Nielsen rating of a certain TV program is an estimate of the proportion of viewers, expressed as a percentage, who tune their sets to the program on a given night at a given time. A typical Nielsen survey consists of 165 families selected nationwide who regularly watch television. Suppose we are interested in the Nielsen ratings for the latest episode of ER. • Identify the population of interest. • Describe the sample.
Definitions • Census • the collection of data from every element in a population
Parameters and Statistics • A parameter is a numerical measurement describing some characteristic of the population. Example: When Lincoln was first elected to the presidency, he received 39.82% of the 1,865,908 votes cast. If we consider the collection of all of those votes to be the population being considered, then 39.82% is a parameter, not a statistic. • A statistic is a numerical measurement describing some characteristic of the sample. Example: Based on a sample of 877 surveyed executives, it was found that 45% of them would not hire anyone whose job application contained a typographical error.
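The distinction can be sketched in a few lines of Python using the Lincoln figures above. This is only an illustration: the vote count is reconstructed by rounding 39.82% of 1,865,908, and the sample size of 1,000 and the seed are arbitrary assumptions, not part of the original example.

```python
import random

# Population from the slide: 1,865,908 votes, 39.82% of them for Lincoln.
TOTAL_VOTES = 1_865_908
# Reconstructed count (an assumption): the slide gives only the percentage.
LINCOLN_VOTES = round(0.3982 * TOTAL_VOTES)


def population_proportion() -> float:
    """Parameter: computed from every element of the population."""
    return LINCOLN_VOTES / TOTAL_VOTES


def sample_proportion(n: int, seed: int = 0) -> float:
    """Statistic: computed from a random sample of n votes."""
    rng = random.Random(seed)
    # Draw n distinct vote indices; indices below LINCOLN_VOTES
    # stand for votes cast for Lincoln.
    picked = rng.sample(range(TOTAL_VOTES), n)
    return sum(1 for i in picked if i < LINCOLN_VOTES) / n


parameter = population_proportion()
statistic = sample_proportion(1000)
print(f"parameter = {parameter:.4f}, statistic = {statistic:.4f}")
```

The parameter is a fixed number because it uses the whole population; the statistic changes from sample to sample (try a different seed) and only estimates the parameter.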
Definitions • Statistic • a numerical measurement describing some characteristic of a sample
Population, Sample, and Inference • Are state lottery winners who win big payoffs likely to quit their jobs within one year of winning? No, according to a study published in the Journal of the Institute for Socioeconomic Studies (Sept. 1985). The researcher mailed questionnaires to over 2,000 lottery winners who won at least $50,000 between 1975 and 1985. Of the 576 who responded, only 11% had quit their jobs during the first year after striking it rich. In this study, identify: • the population • the sample • the inference made about the population
Data • Data are obtained by measuring some characteristic or property of the objects (usually people or things) of interest to us. A variable is a characteristic (property) that differs or varies from one observation to the next. All data (and, consequently, the variables we measure) are either quantitative or qualitative.
Qualitative/Quantitative Data • Quantitative data are observations measured on a natural numerical scale. Nonnumeric data that can only be classified into one of a group of categories are qualitative data. State whether each of the following variables measured on graduating high school students is quantitative or qualitative. • National Honor Society member or not • Scholastic Assessment Test (SAT) score • Number of colleges applied to • Part-time job status
Definitions • Quantitative data • numbers representing counts or measurements • Qualitative (or categorical or attribute) data • can be separated into different categories that are distinguished by some nonnumeric characteristics
Definitions • Quantitative data • the incomes of college graduates • Qualitative (or categorical or attribute) data • the genders (male/female) of college graduates
Discrete vs Continuous Data • Discrete data result when the number of possible values is either a finite number or a “countable” number. Continuous (numerical) data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps. Continuous data are measurable.
Definitions • Discrete data • result when the number of possible values is either a finite number or a ‘countable’ number (0, 1, 2, 3, . . .)
Definitions • Discrete data • result when the number of possible values is either a finite number or a ‘countable’ number (0, 1, 2, 3, . . .) • Continuous (numerical) data • result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps
Determine whether the given values are from a discrete or continuous data set. • A statistics professor counts 3 absent students. • A statistics professor finds that on the first test, the first paper is turned in 39.627 minutes after the test began. • In a survey of 1068 Americans, 73 state that they own answering machines. • A manufacturer of rechargeable calculator batteries finds that one batch consists of 850 good batteries and 7 that are defective.
Definitions • nominal level of measurement • characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high) • Example: survey responses yes, no, undecided
Definitions • ordinal level of measurement • involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless • Example: Course grades A, B, C, D, or F
Definitions • interval level of measurement • like the ordinal level, with the additional property that the difference between any two data values is meaningful. However, there is no natural zero starting point (where none of the quantity is present) • Example: Years 1000, 2000, 1776, and 1492
Definitions • ratio level of measurement • the interval level modified to include the natural zero starting point (where zero indicates that none of the quantity is present). For values at this level, differences and ratios are meaningful. • Example: Prices of college textbooks
Levels of Measurement • Nominal- categories only • Ordinal- categories with some order • Interval- differences but no natural starting point • Ratio- differences and a natural starting point
Uses of Statistics • Almost all fields of study benefit from the application of statistical methods
Abuses of Statistics • Bad Samples
Definitions • self-selected survey • (or voluntary response sample) • one in which the respondents themselves decide whether to be included
Abuses of Statistics • Bad Samples • Small Samples • Loaded Questions • Misleading Graphs
Figure 1-1 Salaries of People with Bachelor’s Degrees and with High School Diplomas [Two bar charts, (a) and (b), of the same values: bachelor’s degree $40,500, high school diploma $24,400. One panel uses a vertical axis from $20,000 to $40,000; the other uses a vertical axis from 0 to $40,000.]
We should analyze the numerical information given in the graph instead of being misled by its general shape.
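A quick calculation shows how much the shape can mislead. The sketch below uses the two salary values from Figure 1-1 and assumes the truncated panel starts its vertical axis at $20,000, which is what the surviving tick labels suggest; the function name is illustrative only.

```python
BACHELOR = 40_500     # salary values shown in Figure 1-1
HIGH_SCHOOL = 24_400


def drawn_height_ratio(baseline: float) -> float:
    """Ratio of the two bar heights as drawn when the axis starts at `baseline`."""
    return (BACHELOR - baseline) / (HIGH_SCHOOL - baseline)


true_ratio = drawn_height_ratio(0)            # axis starts at zero
truncated_ratio = drawn_height_ratio(20_000)  # assumed truncated baseline

print(f"axis from 0:       bars differ by a factor of {true_ratio:.2f}")
print(f"axis from $20,000: bars differ by a factor of {truncated_ratio:.2f}")
```

With the axis starting at zero, the taller bar is about 1.66 times the shorter one, matching the actual salaries. With the axis starting at $20,000, the taller bar is drawn about 4.66 times as tall, although the data have not changed.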
Abuses of Statistics • Bad Samples • Small Samples • Loaded Questions • Misleading Graphs • Pictographs
Figure 1-2 Double the length, width, and height of a cube, and the volume increases by a factor of eight.
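The factor of eight follows from volume = side³, so (2 × side)³ = 8 × side³. A minimal check, with an arbitrarily chosen side length:

```python
def cube_volume(side: float) -> float:
    """Volume of a cube with the given side length."""
    return side ** 3


side = 1.5  # arbitrary side length, chosen only for illustration
ratio = cube_volume(2 * side) / cube_volume(side)
print(ratio)  # 8.0
```

This is why a pictograph that doubles every dimension of a three-dimensional object to show a value that has only doubled exaggerates the change.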
Abuses of Statistics • Bad Samples • Small Samples • Loaded Questions • Misleading Graphs • Pictographs • Precise Numbers • Distorted Percentages • Partial Pictures
“Ninety percent of all our cars sold in this country in the last 10 years are still on the road.”
Abuses of Statistics • Bad Samples • Small Samples • Loaded Questions • Misleading Graphs • Pictographs • Precise Numbers • Distorted Percentages • Partial Pictures • Deliberate Distortions
Definitions • Observational Study • observing and measuring specific characteristics without attempting to modify the subjects being studied
Definitions • Experiment • apply some treatment and then observe its effects on the subjects
Designing an Experiment • Identify your objective • Collect sample data • Use a random procedure that avoids bias • Analyze the data and form conclusions
Definitions • Confounding • occurs in an experiment when the effects from two or more variables cannot be distinguished from each other
Definitions • Replication • used when an experiment is repeated on a sample of subjects that is large enough so that we can see the true nature of any effects (instead of being misled by erratic behavior of samples that are too small)
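The "erratic behavior of samples that are too small" can be seen in a small simulation. This is a sketch under stated assumptions: the true proportion of 0.5, the sample sizes of 10 and 1000, the 200 repetitions, and the seed are all arbitrary choices, not values from the slides.

```python
import random
import statistics


def spread_of_sample_proportions(n: int, true_p: float = 0.5,
                                 repeats: int = 200, seed: int = 1) -> float:
    """Standard deviation of the sample proportion across repeated samples of size n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(repeats):
        # Each subject shows the effect with probability true_p.
        hits = sum(1 for _ in range(n) if rng.random() < true_p)
        proportions.append(hits / n)
    return statistics.pstdev(proportions)


small = spread_of_sample_proportions(10)
large = spread_of_sample_proportions(1000)
print(f"spread with n=10:   {small:.3f}")
print(f"spread with n=1000: {large:.3f}")
```

The results from samples of 10 vary far more from one repetition to the next than the results from samples of 1000, so an effect seen in a small sample may reflect chance variation instead of the treatment.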
Definitions • Random Sample • members of the population are selected in such a way that each has an equal chance of being selected
Definitions • Random Sample • members of the population are selected in such a way that each has an equal chance of being selected • Simple Random Sample (of size n) • subjects selected in such a way that every possible sample of size n has the same chance of being chosen
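A simple random sample can be sketched with Python's standard `random` module, whose `sample` method draws without replacement so that every subset of size n is equally likely. The population of 30 student labels and the seed are hypothetical, used only for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical population of 30 students.
students = [f"student_{i:02d}" for i in range(1, 31)]
chosen = simple_random_sample(students, 5, seed=42)
print(chosen)
```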
Random Sampling - selection so that each member has an equal chance of being selected
Systematic Sampling - select some starting point and then select every kth element in the population
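A minimal sketch of systematic sampling follows. The slide says only to "select some starting point"; choosing that starting point at random among the first k elements is an assumption made here, and the population of 100 ID numbers is hypothetical.

```python
import random


def systematic_sample(population, k, seed=None):
    """Pick a starting point among the first k elements, then take every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed rule: random start in positions 0..k-1
    return list(population)[start::k]


# Hypothetical population: ID numbers 1 through 100.
population_ids = list(range(1, 101))
every_tenth = systematic_sample(population_ids, k=10, seed=0)
print(every_tenth)
```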
Convenience Sampling - use results that are readily available. Example: "Hey! Do you believe in the death penalty?"
Stratified Sampling - subdivide the population into subgroups that share the same characteristic, then draw a sample from each stratum
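Stratified sampling might be sketched as below. The function name, the `stratum_of` callback, the equal number drawn from each stratum, and the freshman/senior population are all assumptions for illustration; other allocation rules (such as sampling in proportion to stratum size) are equally possible.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Group members by stratum, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for label in sorted(strata):  # fixed order keeps results reproducible
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample


# Hypothetical population: 40 freshmen and 20 seniors.
people = [("freshman", i) for i in range(40)] + [("senior", i) for i in range(20)]
picked = stratified_sample(people, stratum_of=lambda p: p[0], per_stratum=3, seed=7)
print(picked)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the whole population does not guarantee.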
Cluster Sampling - divide the population into sections (or clusters); randomly select some of those clusters; choose all members from selected clusters
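The three steps of cluster sampling can be sketched as follows. The classroom clusters, member names, and seed are hypothetical and serve only as an illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters clusters, then keep every member of each one."""
    rng = random.Random(seed)
    chosen_labels = rng.sample(sorted(clusters), n_clusters)
    members = [m for label in chosen_labels for m in clusters[label]]
    return chosen_labels, members


# Hypothetical population already divided into clusters (classrooms).
classrooms = {
    "room_A": ["Ana", "Ben", "Cy"],
    "room_B": ["Dee", "Eli"],
    "room_C": ["Fay", "Gus", "Hal", "Ivy"],
    "room_D": ["Jo", "Kai"],
}
chosen_rooms, surveyed = cluster_sample(classrooms, n_clusters=2, seed=3)
print(chosen_rooms, surveyed)
```

Unlike stratified sampling, which draws some members from every subgroup, cluster sampling takes all members from only some subgroups.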
Methods of Sampling • Random • Systematic • Convenience • Stratified • Cluster