Explore statistical analysis methods such as Modified Boxplot, Chebyshev’s Rule, and Empirical Rule. Understand Normal Distributions and their significance in data analysis. Practice problems and scenarios included.
8, 10, 22, 24, 25, 25, 26, 27, 45, 72 Graph & Describe
Modified Boxplot • Mild outliers are represented by shaded circles. • Extreme outliers are represented by open circles. • Whiskers extend only to the smallest and largest values that are not outliers.
An article on peanut butter reported the following scores (quality ratings on a scale of 0 to 100) for various brands. Construct a comparative stem-and-leaf plot and compare the graphs. Creamy: 56 44 62 36 39 53 50 65 45 40 56 68 41 30 40 50 56 30 22 Crunchy: 62 53 75 42 47 40 34 62 52 50 34 42 36 75 80 47 56 62
20, 22, 23, 24, 24, 25, 25, 27, 35 • Are there any outliers? • Draw a skeleton boxplot. • Draw a modified boxplot.
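The outlier check for this data set can be sketched in Python. The slides do not state the cutoffs, so this sketch assumes the common textbook convention: a value is a mild outlier if it lies more than 1.5 IQR beyond the nearest quartile and an extreme outlier if it lies more than 3 IQR beyond it. It also assumes the quartiles are the medians of the lower and upper halves, with the overall median left out when the number of values is odd. The function names are illustrative only.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    left out of both halves.
    """
    values = sorted(data)
    half = len(values) // 2
    return median(values[:half]), median(values[-half:])

def classify_outliers(data):
    """Return (mild, extreme) outliers using the assumed 1.5 IQR and 3 IQR cutoffs."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    mild, extreme = [], []
    for x in sorted(data):
        distance = max(q1 - x, x - q3)  # how far x lies outside the box
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            mild.append(x)
    return mild, extreme

def whisker_ends(data):
    """Return the smallest and largest values that are not outliers."""
    mild, extreme = classify_outliers(data)
    outliers = set(mild) | set(extreme)
    inside = [x for x in data if x not in outliers]
    return min(inside), max(inside)

data = [20, 22, 23, 24, 24, 25, 25, 27, 35]
print("Q1, Q3:", quartiles(data))
print("mild, extreme:", classify_outliers(data))
print("whiskers end at:", whisker_ends(data))
```

Under these assumptions, 35 is flagged as a mild outlier, so the modified boxplot's upper whisker stops at 27 while the skeleton boxplot's whisker runs all the way to 35.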
Describing Data in terms of the Standard Deviation. Test Mean = 80 St. Dev. = 5
Chebyshev’s Rule The percent of observations that are within k standard deviations of the mean is at least 100(1 − 1/k²)%, for any k > 1.
Facts about Chebyshev • Applicable to any data set, whether it is symmetric or skewed. • The rule gives a very conservative estimate: for example, there are often many more than 75% of the observations within 2 st. dev. of the mean.
# St. Dev. (k): 2, 3, 4, 4.472, 5, 10 • For each k, find the % w/in k st. dev. of mean.
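The table can be filled in with a few lines of Python. This is a minimal sketch that assumes the standard form of Chebyshev's Rule, at least 100(1 − 1/k²)% within k standard deviations; the function name is made up for illustration.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k st. dev. of the mean."""
    if k <= 1:
        # The rule gives no useful guarantee for k <= 1.
        return 0.0
    return 100 * (1 - 1 / k**2)

for k in [2, 3, 4, 4.472, 5, 10]:
    print(f"k = {k:<6} at least {chebyshev_min_percent(k):.2f}%")
```

The value k = 4.472 is close to the square root of 20, which is why it gives a guarantee of about 95%.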
Interpret using Chebyshev Test Mean = 80 St. Dev. = 5 • What percent are between 75 and 85? • What percent are between 60 and 100?
Collect wrist measurements (in.) • Create a distribution. • Find the st. dev. and mean. • What percent is within 1 st. dev. of the mean?
Practice Problems • Using Chebyshev, solve the following problem for a distribution with a mean of 80 and a st. dev. of 10. • a. At least what percentage of values will fall between 60 and 100? • b. At least what percentage of values will fall between 65 and 95?
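One way to check answers to interval questions like these is to convert the interval to a number of standard deviations k and then apply Chebyshev's Rule, at least 100(1 − 1/k²)%. The sketch below assumes the interval is centered on the mean; the function name is hypothetical.

```python
def chebyshev_interval_percent(mean, st_dev, low, high):
    """Minimum percent of values between low and high by Chebyshev's Rule."""
    k = (high - mean) / st_dev
    # The rule only covers intervals centered on the mean.
    if abs((mean - low) / st_dev - k) > 1e-9:
        raise ValueError("interval must be symmetric about the mean")
    if k <= 1:
        # No useful guarantee within 1 st. dev. or less.
        return 0.0
    return 100 * (1 - 1 / k**2)

print(chebyshev_interval_percent(80, 10, 60, 100))  # k = 2
print(chebyshev_interval_percent(80, 10, 65, 95))   # k = 1.5
```

Note that k does not have to be a whole number: the interval from 65 to 95 is 1.5 standard deviations on each side of the mean.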
Normal Distributions • These are special density curves. • They all have the same overall shape: • Symmetric • Single-peaked • Bell-shaped • Each one is completely described by giving its mean (μ) and its standard deviation (σ). • We abbreviate it N(μ, σ).
Normal Curves…. • Changing the mean without changing the standard deviation simply moves the curve horizontally. • The Standard deviation controls the spread of a Normal Curve.
Standard Deviation • It’s the natural measure of spread for Normal distributions. • It can be located by eye on a Normal curve. • The points at which the curve changes from concave down to concave up lie one standard deviation on either side of the mean.
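The change in concavity one standard deviation from the mean can be checked numerically. The sketch below uses Python's `statistics.NormalDist` and borrows the test-score values (mean 80, st. dev. 5) from the earlier slide purely as an example; the `curvature` helper is an illustrative finite-difference approximation, not an exact derivative.

```python
from statistics import NormalDist

def curvature(dist, x, h=0.01):
    """Approximate the second derivative of the density curve at x."""
    return (dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)) / h**2

curve = NormalDist(mu=80, sigma=5)

# Just inside one st. dev. the curve bends downward (negative curvature);
# just outside it bends upward (positive curvature).
inside = curvature(curve, 80 + 0.9 * 5)
outside = curvature(curve, 80 + 1.1 * 5)
print(inside < 0, outside > 0)
```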
Why Are Normal Curves Important? • They are good descriptions for some real data such as • Test scores like SAT, IQ • Repeated careful measurements of the same quantity • Characteristics of biological populations (height) • They are good approximations to the results of many kinds of chance outcomes. • They are used in many statistical inference procedures.
Empirical Rule • Can only be used if the data can be reasonably described by a normal curve. • Approximately • 68% of the data is within 1 st. dev. of mean • 95% of the data is within 2 st. dev. of mean • 99.7% of data is within 3 st. dev. of mean
Empirical Rule • What percent do you think…… • www.whfreeman.com/tps4e
Empirical Rule (68-95-99.7 Rule) • In the Normal distribution with mean (μ) and standard deviation (σ): • Within 1σ of μ ≈ 68% of the observations • Within 2σ of μ ≈ 95% of the observations • Within 3σ of μ ≈ 99.7% of the observations
The distribution of batting average (proportion of hits) for the 432 Major League Baseball players with at least 100 plate appearances in the 2009 season is Normally distributed, N(0.261, 0.034). • Sketch a Normal density curve for this distribution of batting averages. Label the points that are 1, 2, and 3 standard deviations from the mean. • What percent of the batting averages are above 0.329? • What percent are between 0.227 and 0.295?
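The Empirical Rule answers can be compared with areas computed from the Normal model itself. This is a sketch using Python's `statistics.NormalDist` with the N(0.261, 0.034) model from the problem; the variable names are illustrative.

```python
from statistics import NormalDist

batting = NormalDist(mu=0.261, sigma=0.034)

# 0.329 is 2 st. dev. above the mean, so the Empirical Rule gives
# about (100 - 95) / 2 = 2.5% above it.
above_329 = 100 * (1 - batting.cdf(0.329))

# 0.227 to 0.295 is 1 st. dev. on either side of the mean, so the
# Empirical Rule gives about 68%.
between = 100 * (batting.cdf(0.295) - batting.cdf(0.227))

print(round(above_329, 1), round(between, 1))
```

The computed areas come out near 2.3% and 68.3%, which shows that the 68-95-99.7 figures are rounded approximations.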
Scores on the Wechsler Adult Intelligence Scale (a standard IQ test) for the 20 to 34 age group are approximately Normally distributed, N(110, 25). • What percent are between 85 and 135? • What percent are below 185? • What percent are below 60?
A sample of the hourly wages of employees who work in restaurants in a large city has a mean of $5.02 and a st. dev. of $0.09. • a. Using Chebyshev’s Rule, find the range in which at least 75% of the data will fall. • b. Using the Empirical Rule, find the range in which about 68% of the data will fall.
The mean of a distribution is 50 and the standard deviation is 6. Using the empirical rule, find the percentage that will fall between 38 and 62.
A sample of the labor costs per hour to assemble a certain product has a mean of $2.60 and a standard deviation of $0.15. Using Chebyshev’s Rule, find the values between which at least 88.89% of the data will lie.
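Problems that start from a percentage run Chebyshev's Rule backwards: solving 1 − 1/k² = p for k gives k = 1/√(1 − p), and the range is the mean plus or minus k standard deviations. A minimal sketch, with a hypothetical function name:

```python
from math import sqrt

def chebyshev_range(mean, st_dev, percent):
    """Range that contains at least `percent` of the data by Chebyshev's Rule."""
    k = 1 / sqrt(1 - percent / 100)  # number of st. dev. needed
    return mean - k * st_dev, mean + k * st_dev

# Labor costs: mean $2.60, st. dev. $0.15, at least 88.89% (k is about 3).
low, high = chebyshev_range(2.60, 0.15, 88.89)
print(f"${low:.2f} to ${high:.2f}")

# Restaurant wages: mean $5.02, st. dev. $0.09, at least 75% (k = 2).
low, high = chebyshev_range(5.02, 0.09, 75)
print(f"${low:.2f} to ${high:.2f}")
```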
Measures of Position • Percentiles • Z-scores
The following represents my results when playing an online sudoku game at www.websudoku.com. [Graph of results, scale from 0 min to 30 min]
Introduction • A student gets a test back with a score of 78 on it. • A 10th-grader scores 46 on the PSAT Writing test. • Isolated numbers don’t always provide enough information. What we want to know is where we stand.
Where Do I Stand? • Let’s make a dotplot of our heights from 58 to 78 inches. • How many people in the class have heights less than yours? • What percent of the students in the class have heights less than yours? • This is your percentile in the distribution of heights.
Finishing…. • Calculate the mean and standard deviation. • Where does your height fall in relation to the mean: above or below? • How many standard deviations above or below the mean is it? • This is the z-score for your height.
Let’s discuss • What would happen to the class’s height distribution if you converted each data value from inches to centimeters? (2.54 cm = 1 in.) • How would this change of units affect the measures of center, spread, and location (percentile & z-score) that you calculated?
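The effect of the unit change can be explored with a small experiment. The heights below are made up for illustration (they are not class data), and the helper names are hypothetical. The sketch uses the sample standard deviation and the "percent of values below" definition of percentile.

```python
from math import isclose
from statistics import mean, stdev

def z_scores(values):
    """z-score of every value relative to the list's own mean and st. dev."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]

def percentiles(values):
    """Percent of values strictly below each value."""
    n = len(values)
    return [100 * sum(other < x for other in values) / n for x in values]

heights_in = [60, 62, 65, 66, 68, 70, 72]      # hypothetical heights in inches
heights_cm = [2.54 * h for h in heights_in]    # same heights in centimeters

# Center and spread are multiplied by 2.54 ...
print(mean(heights_cm) / mean(heights_in))
print(stdev(heights_cm) / stdev(heights_in))

# ... but location within the distribution does not change.
print(all(isclose(a, b) for a, b in zip(z_scores(heights_in), z_scores(heights_cm))))
print(percentiles(heights_in) == percentiles(heights_cm))
```

Multiplying every value by the same positive constant scales the mean and standard deviation by that constant, while z-scores and percentiles stay the same.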
National Center for Health Statistics • Look at Clinical Growth Charts at www.cdc.gov/nchs
Percentiles • The rth percentile is the value such that r% of the observations in the data set fall at or below it. • If you are at the 75th percentile, then 75% of the students had heights less than yours.
Test scores on last AP Test. Jenny made an 86. How did she perform relative to her classmates?
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Her score was greater than 21 of the 25 observations. Since 21 of the 25, or 84%, of the scores are below hers, Jenny is at the 84th percentile in the class’s test score distribution.
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Find the percentiles for the following students. • Mary, who earned a 74. • Two students who earned scores of 80.
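The percentile calculation from the stemplot can be sketched as follows. The 25 scores are read off the stemplot, and the sketch follows the same approach as the Jenny example, counting the percent of scores strictly below the given score. The function name is illustrative.

```python
# Scores read from the stemplot, one row of the plot per line.
scores = [67,
          72, 73, 73, 74,
          75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84,
          85, 86, 89,
          90, 93]

def percentile_of(score, data):
    """Percent of observations strictly below the given score."""
    below = sum(x < score for x in data)
    return 100 * below / len(data)

for name, score in [("Jenny", 86), ("Mary", 74), ("Students with 80", 80)]:
    print(f"{name}: {percentile_of(score, scores):.0f}th percentile")
```

Both students who scored 80 get the same percentile, because neither score counts as below the other.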
Interpreting… Why does it get very steep beginning at age 50? When does it slow down? Why? What percent were inaugurated before age 70? What’s the IQR? Obama was 47….
Interpreting Cumulative Relative Frequency Graphs (Describing Location in a Distribution) • Use the graph from page 88 to answer the following questions. • Was Barack Obama, who was inaugurated at age 47, unusually young? • Estimate and interpret the 65th percentile of the distribution. [Graph annotations: 65, 11, 58, 47]
Z-Score (standardized score) • It represents the number of standard deviations from the mean. • If it’s positive, then the value is above the mean. • If it’s negative, then the value is below the mean. • It standardizes measurements since it’s in terms of st. deviations.
Discovery: Mean = 90, St. dev. = 10 • Find the z-score for 80, 95, and 73.
Compare using z-scores. • History Test: Mean = 92, St. Dev. = 3, My Score = 95 • Math Test: Mean = 80, St. Dev. = 5, My Score = 90
Compare • Math: mean = 70, x = 62, s = 6 • English: mean = 80, x = 72, s = 3
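The z-score comparisons in the last few slides can be worked through with one small function. This is a sketch using the usual definition, z = (value − mean) / st. dev.; the function name is illustrative.

```python
def z_score(x, mean, st_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / st_dev

# Discovery: mean 90, st. dev. 10
for x in [80, 95, 73]:
    print(x, round(z_score(x, 90, 10), 2))

# History (mean 92, st. dev. 3, score 95) vs. Math (mean 80, st. dev. 5, score 90)
print("History:", z_score(95, 92, 3), "Math:", z_score(90, 80, 5))

# Math (mean 70, s 6, x 62) vs. English (mean 80, s 3, x 72)
print("Math:", round(z_score(62, 70, 6), 2), "English:", round(z_score(72, 80, 3), 2))
```

In the last comparison both scores are 8 points below the mean, but the English score is farther below in standard deviation terms because the English scores are less spread out.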
Be Careful! Being better is relative to the situation. What if I wanted to compare race times?
Find the following percentiles. • 40th percentile? • 17th percentile? • 70th percentile? • 25th percentile?
Homework • Worksheet