Explore the art of statistics, sampling methods & common misuses. Learn how to analyze data, draw conclusions, & avoid misleading information.
Chapter 13 Statistics
13.1 Sampling Techniques
Statistics • Statistics is the art and science of gathering, analyzing, and making inferences from numerical information (data) obtained in an experiment. • Statistics is divided into two main branches. • Descriptive statistics is concerned with the collection, organization, and analysis of data. • Inferential statistics is concerned with making generalizations or predictions from the data collected.
Statisticians • A statistician’s interest lies in drawing conclusions about possible outcomes through observations of only a few particular events. • The population consists of all items or people of interest. • The sample includes some of the items in the population. • When a statistician draws a conclusion from a sample, there is always the possibility that the conclusion is incorrect.
Types of Sampling • A random sample is obtained if the sample is drawn in such a way that each time an item is selected, each item in the population has an equal chance of being drawn. • When a sample is obtained by drawing every nth item on a list or production line, the sample is a systematic sample. • A cluster sample is sometimes referred to as an area sample because it is frequently applied on a geographical basis.
Types of Sampling continued • Stratified sampling involves dividing the population by characteristics such as gender, race, religion, or income, and then drawing a random sample from each group (stratum). • Convenience sampling uses data that are easily obtained and can be extremely biased.
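The five sampling techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 60 numbered items, the group labels A, B, and C, the sample sizes, and the seed are all made up for the sketch and do not come from the slides.

```python
import random

# Hypothetical population: 60 numbered items, each tagged with a made-up group label.
population = [{"id": i, "group": "ABC"[i % 3]} for i in range(60)]
rng = random.Random(13)  # fixed seed so the sketch is repeatable

# Random sample: every item has the same chance of being drawn.
random_sample = rng.sample(population, 6)

# Systematic sample: every nth item on the list (here n = 10).
systematic_sample = population[9::10]

# Stratified sample: divide by a characteristic, then sample within each stratum.
strata = {}
for item in population:
    strata.setdefault(item["group"], []).append(item)
stratified_sample = [pick for members in strata.values() for pick in rng.sample(members, 2)]

# Cluster sample: split the list into blocks (areas) and keep whole blocks chosen at random.
clusters = [population[i:i + 10] for i in range(0, 60, 10)]
cluster_sample = [item for block in rng.sample(clusters, 2) for item in block]

# Convenience sample: whatever is easiest to reach, here simply the first six items.
convenience_sample = population[:6]
```

Note that only the random, stratified, and cluster samples involve a chance mechanism; the convenience sample always returns the same easily reached items, which is why it can be badly biased.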
Example: Identifying Sampling Techniques • Identify the sampling technique used in each case. • a) A raffle ticket is drawn by a blindfolded person at a festival to win a grand prize. • b) Students at an elementary school are classified according to their present grade level. Then, a random sample of three students from each grade is chosen to represent their class. • c) Every sixth car on a highway is stopped for a vehicle inspection.
Example: Identifying Sampling Techniques continued • d) Voters are classified based on their polling location. A random sample of four polling locations is selected. All the voters from the selected polling locations are included in the sample. • e) The first 20 people entering a water park are asked if they are wearing sunscreen. Solution: a) Random b) Stratified c) Systematic d) Cluster e) Convenience
13.2 The Misuses of Statistics
Misuses of Statistics • Many individuals, businesses, and advertising firms misuse statistics to their own advantage. • When examining statistical information, consider the following: • Was the sample used to gather the statistical data unbiased and of sufficient size? • Is the statistical statement ambiguous; that is, could it be interpreted in more than one way?
Example: Misleading Statistics • An advertisement says, “Fly Speedway Airlines and Save 20%.” Not enough information is given: the “Save 20%” could be off the original ticket price, off the ticket price when you buy two tickets, or off another airline’s ticket price. • A help-wanted ad reads, “Salesperson wanted for Ryan’s Furniture Store. Average Salary: $32,000.” The word “average” can be very misleading. If most of the salespeople earn $20,000 to $25,000 and the owner earns $76,000, this “average salary” is not a fair representation.
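To see how one large salary can distort an "average", here is a small sketch. The individual salaries are hypothetical: they are chosen only to be consistent with the ad (salespeople between $20,000 and $25,000, an owner at $76,000, and a mean of $32,000); the slides do not give the actual payroll.

```python
from statistics import mean, median

# Hypothetical payroll consistent with the ad: four salespeople earning
# $20,000 to $25,000 plus the owner at $76,000 (the individual figures are assumed).
salaries = [20_000, 21_000, 21_000, 22_000, 76_000]

average_salary = mean(salaries)    # pulled upward by the owner's salary
typical_salary = median(salaries)  # the middle value of the ranked salaries

print(average_salary, typical_salary)
```

With these assumed figures the mean matches the advertised $32,000, yet no salesperson earns more than $22,000, which is the sense in which the ad is misleading.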
Charts and Graphs • Charts and graphs can also be misleading. • Even when the data are displayed correctly, adjusting the vertical scale of a graph can give a different impression. • A circle graph can be misleading if the parts of the graph do not add up to 100%.
Example: Misleading Graphs While each graph presents identical information, the vertical scales have been altered.
13.3 Frequency Distributions
Example • The number of pets per family is recorded for 30 families surveyed. Construct a frequency distribution of the following data:
0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 4 4
Solution
Number of Pets    Frequency
0                 6
1                 10
2                 8
3                 4
4                 2
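The tally above can be reproduced with a short Python sketch. The variable names `pets` and `frequency` are chosen for this illustration only.

```python
from collections import Counter

# Number of pets per family for the 30 families surveyed.
pets = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4]

# Counter tallies how many times each observed value occurs.
frequency = Counter(pets)

for value in sorted(frequency):
    print(value, frequency[value])
```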
Rules for Data Grouped by Classes • The classes should be of the same “width.” • The classes should not overlap. • Each piece of data should belong to only one class.
Definitions • The midpoint of a class is found by adding the lower and upper class limits and dividing the sum by 2.
Example • The following set of data represents the distance, in miles, that 15 randomly selected second grade students live from school. Construct a frequency distribution with the first class 0 - 2.
6.8 5.3 9.7 3.8 8.7 0.5 5.9 0.8 5.7 1.3 4.8 9.6 1.5 7.4 0.2
Solution • First, rearrange the data from lowest to highest.
0.2 0.5 0.8 1.3 1.5 3.8 4.8 5.3 5.7 5.9 6.8 7.4 8.7 9.6 9.7
# of miles from school    Frequency
0 - 2                     5
2.1 - 4.1                 1
4.2 - 6.2                 4
6.3 - 8.3                 2
8.4 - 10.4                3
Total                     15
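A grouped frequency distribution like this one can be checked with a short sketch. The helper name `class_frequencies` is made up for this illustration, and both class limits are treated as inclusive, which is an assumption that matches the non-overlapping classes used above.

```python
# Distances, in miles, for the 15 students in the example.
distances = [6.8, 5.3, 9.7, 3.8, 8.7, 0.5, 5.9, 0.8, 5.7, 1.3, 4.8, 9.6, 1.5, 7.4, 0.2]

# Class limits taken from the solution table (lower limit, upper limit).
classes = [(0.0, 2.0), (2.1, 4.1), (4.2, 6.2), (6.3, 8.3), (8.4, 10.4)]

def class_frequencies(data, class_limits):
    """Count how many values fall within each (lower, upper) class, limits inclusive."""
    return [sum(lower <= x <= upper for x in data) for lower, upper in class_limits]

frequencies = class_frequencies(distances, classes)

for (lower, upper), count in zip(classes, frequencies):
    print(f"{lower} - {upper}: {count}")
```

Because the classes do not overlap and every value falls in exactly one of them, the frequencies add up to the number of pieces of data.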
13.4 Statistical Graphs
Circle Graphs • Circle graphs (also known as pie charts) are often used to compare one or more parts of the whole to the whole.
Example • According to a recent hospital survey of 200 patients, the following table indicates how often hospitals used four different kinds of painkillers. Use the information to construct a circle graph illustrating the percent each painkiller was used.
Painkiller       Number of Patients
Aspirin          56
Ibuprofen        104
Acetaminophen    16
Other            24
Total            200
Solution • Determine the percent of the total for each painkiller and the measure of the corresponding central angle.
Painkiller       Number of Patients    Percent of Total    Measure of Central Angle
Aspirin          56                    28%                 0.28 × 360° = 100.8°
Ibuprofen        104                   52%                 0.52 × 360° = 187.2°
Acetaminophen    16                    8%                  0.08 × 360° = 28.8°
Other            24                    12%                 0.12 × 360° = 43.2°
Total            200                   100%                360°
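The percents and central angles in the table can be recomputed with a brief sketch. The dictionary names `usage` and `central_angles` are chosen for this illustration only.

```python
# Number of patients given each painkiller, from the survey table.
usage = {"Aspirin": 56, "Ibuprofen": 104, "Acetaminophen": 16, "Other": 24}
total = sum(usage.values())

central_angles = {}
for name, count in usage.items():
    percent = 100 * count / total   # share of the whole, as a percent
    angle = 360 * count / total     # the same share of a full 360-degree circle
    central_angles[name] = angle
    print(f"{name}: {percent:.0f}% -> {angle:.1f} degrees")
```

Checking that the angles sum to 360 degrees (and the percents to 100%) is a quick way to catch an arithmetic slip before drawing the graph.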
Solution continued • Use a protractor to construct a circle graph and label it properly.
Histogram • A histogram is a graph with observed values on its horizontal scale and frequencies on its vertical scale. • Example: Construct a histogram of the frequency distribution.
# of pets    Frequency
0            6
1            10
2            8
3            4
4            2
Solution • [Histogram: number of pets (0 to 4) on the horizontal scale, frequency on the vertical scale; bar heights 6, 10, 8, 4, and 2.]
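As a rough stand-in for the drawn histogram, the sketch below prints one text bar per observed value. It is rotated relative to the definition above (values run down the page and bar length shows frequency), and the helper name `text_histogram` is invented for this illustration.

```python
# Frequency distribution of the number of pets per family.
frequency = {0: 6, 1: 10, 2: 8, 3: 4, 4: 2}

def text_histogram(freq, symbol="#"):
    """Return one line per observed value; bar length equals the frequency."""
    return [f"{value} | {symbol * count} ({count})" for value, count in sorted(freq.items())]

for line in text_histogram(frequency):
    print(line)
```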
Stem-and-Leaf Display • A stem-and-leaf display is a tool that organizes and groups the data while allowing us to see the actual values that make up the data. • The left group of digits is called the stem. • The right group of digits is called the leaf.
Example • The table below indicates the number of miles 20 workers have to drive to work. Construct a stem-and-leaf display.
12 18 3 8 12 25 21 3 15 4
17 27 43 21 16 12 26 35 14 9
Solution
Data
12 18 3 8 12 25 21 3 15 4
17 27 43 21 16 12 26 35 14 9
Stem | Leaves
0 | 3 3 4 8 9
1 | 2 2 2 4 5 6 7 8
2 | 1 1 5 6 7
3 | 5
4 | 3
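A stem-and-leaf display for two-digit data can be built by splitting each value into its tens digit (stem) and ones digit (leaf). The sketch below assumes non-negative whole numbers below 100; the function name `stem_and_leaf` is chosen for this illustration.

```python
from collections import defaultdict

# Miles driven to work by the 20 workers in the example.
miles = [12, 18, 3, 8, 12, 25, 21, 3, 15, 4, 17, 27, 43, 21, 16, 12, 26, 35, 14, 9]

def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted leaves (ones digits)."""
    display = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)  # e.g. 27 -> stem 2, leaf 7
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf(miles)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```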
13.5 Measures of Central Tendency
Definitions • An average is a number that is representative of a group of data. • The arithmetic mean, or simply the mean, is symbolized by $\bar{x}$ (read “x bar”) or by the Greek letter mu, $\mu$.
Mean • The mean, $\bar{x}$, is the sum of the data divided by the number of pieces of data. The formula for calculating the mean is $\bar{x} = \frac{\sum x}{n}$ • where $\sum x$ represents the sum of all the data and $n$ represents the number of pieces of data.
Example: Find the Mean • Find the mean amount of money parents spent on new school supplies and clothes if 5 parents randomly surveyed replied as follows: $327 $465 $672 $150 $230
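Applying the definition of the mean (sum of the data divided by the number of pieces of data) to the five replies gives the following worked calculation, with the dollar signs omitted inside the formula:

```latex
\bar{x} = \frac{\sum x}{n}
        = \frac{327 + 465 + 672 + 150 + 230}{5}
        = \frac{1844}{5}
        = 368.80
```

The mean amount spent is therefore $368.80.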
Median • The median is the value in the middle of a set of ranked data. • Example: Determine the median of $327 $465 $672 $150 $230. Rank the data from smallest to largest: $150 $230 $327 $465 $672. The middle value, $327, is the median.
Example: Median (even data) • Determine the median of the following set of data: 8, 15, 9, 3, 4, 7, 11, 12, 6, 4. Rank the data: 3 4 4 6 7 8 9 11 12 15. There are 10 pieces of data, so the median lies halfway between the two middle pieces, 7 and 8. The median is (7 + 8)/2 = 7.5.
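The two cases (odd and even number of pieces of data) can be sketched in one small function. The name `median_of` is made up for this illustration; Python's standard library also offers `statistics.median`, which follows the same rule.

```python
def median_of(data):
    """Median of a list: the middle value, or the average of the two middle values."""
    ranked = sorted(data)          # rank the data from smallest to largest
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:                 # odd count: a single middle value exists
        return ranked[middle]
    # even count: halfway between the two middle values
    return (ranked[middle - 1] + ranked[middle]) / 2

spending = [327, 465, 672, 150, 230]            # odd number of values
even_data = [8, 15, 9, 3, 4, 7, 11, 12, 6, 4]   # even number of values
print(median_of(spending), median_of(even_data))
```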
Mode • The mode is the piece of data that occurs most frequently. • Example: Determine the mode of the data set: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15. • The mode is 4 since it occurs twice and the other values occur only once.
Midrange • The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data. • Midrange = (L + H)/2 • Example: Find the midrange of the data set $327, $465, $672, $150, $230.
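For this data set the lowest value is L = 150 and the highest is H = 672, so the midrange works out as follows (dollar signs omitted inside the formula):

```latex
\text{Midrange} = \frac{L + H}{2} = \frac{150 + 672}{2} = \frac{822}{2} = 411
```

The midrange is $411.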
Example • The weights of eight Labrador retrievers, rounded to the nearest pound, are 85, 92, 88, 75, 94, 88, 84, and 101. • Determine the a) mean b) median c) mode d) midrange. • e) Rank the measures of central tendency from lowest to highest.
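Example: Dog Weights 85, 92, 88, 75, 94, 88, 84, 101 • Mean: $\bar{x} = \frac{85 + 92 + 88 + 75 + 94 + 88 + 84 + 101}{8} = \frac{707}{8} = 88.375$ • Median: rank the data • 75, 84, 85, 88, 88, 92, 94, 101 • The two middle values are both 88, so the median is 88.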
Example: Dog Weights 85, 92, 88, 75, 94, 88, 84, 101 • Mode: the number that occurs most frequently. The mode is 88. • Midrange = (L + H)/2 = (75 + 101)/2 = 88 • Rank the measures from lowest to highest: median, mode, and midrange (each 88), then the mean (88.375).
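All four measures for the dog weights can be checked together. This sketch uses Python's standard `statistics` module for the mean, median, and mode; the midrange is computed directly from the lowest and highest values, and the names `measures` and `ranked` are chosen for this illustration.

```python
from statistics import mean, median, mode

# Weights of the eight Labrador retrievers, in pounds.
weights = [85, 92, 88, 75, 94, 88, 84, 101]

measures = {
    "mean": mean(weights),                             # sum of the data / number of pieces
    "median": median(weights),                         # halfway between the two middle values
    "mode": mode(weights),                             # most frequently occurring value
    "midrange": (min(weights) + max(weights)) / 2,     # (L + H) / 2
}

# Rank the measures from lowest to highest.
ranked = sorted(measures.items(), key=lambda pair: pair[1])
for name, value in ranked:
    print(name, value)
```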
Measures of Position • Measures of position are often used to make comparisons. • Two measures of position are percentiles and quartiles.
To Find the Quartiles of a Set of Data • Order the data from smallest to largest. • Find the median, or 2nd quartile, of the set of data. If there are an odd number of pieces of data, the median is the middle value. If there are an even number of pieces of data, the median will be halfway between the two middle pieces of data.
To Find the Quartiles of a Set of Data continued • The first quartile, Q1, is the median of the lower half of the data; that is, Q1 is the median of the data less than Q2. • The third quartile, Q3, is the median of the upper half of the data; that is, Q3 is the median of the data greater than Q2.
Example: Quartiles • The weekly grocery bills for 23 families are as follows. Determine Q1, Q2, and Q3. 170 210 270 270 280 330 80 170 240 270 225 225 215 310 50 75 160 130 74 81 95 172 190
Example: Quartiles continued • Order the data: 50 74 75 80 81 95 130 160 170 170 172 190 210 215 225 225 240 270 270 270 280 310 330 • Q2 is the median of the entire data set, which is 190. • Q1 is the median of the 11 numbers from 50 to 172, which is 95. • Q3 is the median of the 11 numbers from 210 to 330, which is 270.
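The quartile procedure can be sketched as a median-of-halves function. Two assumptions are worth flagging: the halves are split by position in the ranked list (with the middle value left out when the count is odd), and the name `quartiles` is invented here. Library routines such as `statistics.quantiles` use interpolation rules that may give different values for the same data.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method described above."""
    ranked = sorted(data)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]            # values positioned below the median
    upper = ranked[half + n % 2:]    # values positioned above the median
    return median(lower), median(ranked), median(upper)

# Weekly grocery bills for the 23 families in the example.
bills = [170, 210, 270, 270, 280, 330, 80, 170, 240, 270, 225, 225,
         215, 310, 50, 75, 160, 130, 74, 81, 95, 172, 190]

q1, q2, q3 = quartiles(bills)
print(q1, q2, q3)
```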
13.6 Measures of Dispersion
Measures of Dispersion • Measures of dispersion are used to indicate the spread of the data. • The range is the difference between the highest and lowest values; it indicates the total spread of the data.
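As a small illustration of the range, the sketch below reuses the weekly grocery bills from the quartiles example earlier in the chapter; the variable name `data_range` is chosen for this sketch.

```python
# Weekly grocery bills for the 23 families in the quartiles example.
bills = [170, 210, 270, 270, 280, 330, 80, 170, 240, 270, 225, 225,
         215, 310, 50, 75, 160, 130, 74, 81, 95, 172, 190]

# Range = highest value - lowest value.
data_range = max(bills) - min(bills)
print(data_range)
```

The range depends only on the two extreme values, so it says nothing about how the remaining data are spread between them.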