AP Statistics Overview
What is Statistics? Statistics is the science of learning from data. Ex. Take a sample of 50 seniors and record the number of AP classes each is taking. Use this to make a prediction, or educated guess, about how many AP classes ALL seniors are taking. Parameter – summary measurement (ex: p, µ) that describes the population. Statistic – summary measurement (ex: p̂, x̄) that describes the sample.
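The parameter/statistic distinction can be sketched in a few lines of Python. This is only an illustration: the population of 400 seniors and its AP class counts are made up, and the names `population`, `mu`, and `x_bar` are assumptions, not part of the slides.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: number of AP classes for each of 400 seniors.
population = [random.choice([0, 1, 1, 2, 2, 3, 4]) for _ in range(400)]

# Parameter: a summary of the WHOLE population (usually unknown in practice).
mu = mean(population)

# Statistic: the same summary computed from a random sample of 50 seniors.
sample = random.sample(population, 50)
x_bar = mean(sample)

print(f"population mean (parameter) = {mu:.2f}")
print(f"sample mean (statistic)     = {x_bar:.2f}")
```

The sample mean is the educated guess for the population mean; a different sample of 50 would give a slightly different guess.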
AP Statistics – At a Glance • Exploring Data (Chapters 1 – 4) • Create Distributions (graph of data) • Describe / Compare Distributions • Observational Studies and Experiments (Ch 5) • Anticipating Patterns (Chapters 6 – 9) • Statistical Inference (Chapters 10 – 15)
The key to AP Stats: THINK, SHOW, TELL Think first! Know where you’re headed and why. It will save you a lot of work. Show is what most people think Statistics is about. The mechanics of calculating statistics and making displays are important, but not the most important part of Statistics. Tell what you’ve learned. Until you’ve explained your results so that someone else can understand your conclusions, the job is not done. STAY FOCUSED!
WHO is being described? How many? Individuals are the objects described by a set of data. These individuals go by different names depending on the situation.
WHAT are the variables? Units? Variables – characteristics recorded about each individual
CHAPTER 1 Exploring Data
Summarize Categorical Data using a Bar Chart or Pie Chart [Bar chart "AP Scores": x-axis AP SCORES (1 to 5), y-axis FREQUENCY (scale up to 30)]
Stemplot for Quantitative Data
Ages of Death of U.S. First Ladies
Key: 3 | 4 indicates 34 years old
Stem | Leaf
3 | 4, 6
4 | 3
5 | 2, 4, 5, 7, 8
6 | 0, 0, 1, 2, 4, 4, 4, 5, 6, 9
7 | 0, 1, 3, 4, 6, 7, 8, 8
8 | 1, 1, 2, 3, 3, 6, 7, 8, 9, 9
9 | 7
Leaf – single digit. Do not skip stems. Leaves – smallest to largest.
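The stemplot rules above (single-digit leaves, no skipped stems, leaves in increasing order) can be sketched in Python. This is a minimal version that assumes two-digit whole numbers; the function name `stemplot` is made up for illustration, and the data are the first ladies' ages read off the plot.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a basic stemplot for two-digit whole numbers."""
    leaves = defaultdict(list)
    for v in sorted(values):                 # sorting puts leaves smallest to largest
        leaves[v // 10].append(v % 10)       # stem = tens digit, leaf = ones digit
    lines = []
    # Walk every stem from lowest to highest so that no stem is skipped.
    for stem in range(min(leaves), max(leaves) + 1):
        row = ", ".join(str(d) for d in leaves[stem])
        lines.append(f"{stem} | {row}".rstrip())
    return lines

first_lady_ages = [
    34, 36, 43, 52, 54, 55, 57, 58,
    60, 60, 61, 62, 64, 64, 64, 65, 66, 69,
    70, 71, 73, 74, 76, 77, 78, 78,
    81, 81, 82, 83, 83, 86, 87, 88, 89, 89, 97,
]

for line in stemplot(first_lady_ages):
    print(line)
```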
Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
1 | 7
1 | 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1
2 | 2, 2, 2, 3, 3
2 | 4, 5
2 |
2 | 8
3 | 0, 1
Stem is split for every 2 leaves: (0, 1), (2, 3), (4, 5), (6, 7), and (8, 9)
Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
1 |
1 | 7, 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4
2 | 5, 8
3 | 0, 1
3 |
Stem is split for every 5 leaves: (0 thru 4) AND (5 thru 9)
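Both ways of splitting stems can be sketched with one small Python function. This is an illustration only: `split_stemplot` and its `parts` argument are hypothetical names, the data are the 27 student ages read off the plots, and unlike the slide this sketch does not print empty rows before the first or after the last data value.

```python
def split_stemplot(values, parts):
    """Stemplot lines with each stem split into `parts` rows (use 2 or 5)."""
    width = 10 // parts                       # how many leaf digits share a row
    rows = {}
    for v in sorted(values):
        key = (v // 10) * parts + (v % 10) // width
        rows.setdefault(key, []).append(v % 10)
    lines = []
    for key in range(min(rows), max(rows) + 1):   # keep empty rows in between
        stem = key // parts
        leaves = ", ".join(str(d) for d in rows.get(key, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

ages = ([17, 18] + [19] * 5 + [20] * 4 + [21] * 6 + [22] * 3
        + [23] * 2 + [24, 25, 28, 30, 31])

print("\n".join(split_stemplot(ages, 5)))   # rows (0, 1), (2, 3), ..., (8, 9)
print()
print("\n".join(split_stemplot(ages, 2)))   # rows (0 thru 4) and (5 thru 9)
```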
Back-to-back Stemplot
Number of home runs in a season
           Babe Ruth |   | Roger Maris
                     | 0 | 8
                     | 1 | 3, 4, 6
                5, 2 | 2 | 3, 6, 8
                5, 4 | 3 | 3, 9
 9, 7, 6, 6, 6, 1, 1 | 4 |
             9, 4, 4 | 5 |
                   0 | 6 | 1
Frequency – # of times something occurs • Cumulative Frequency – keep adding • Relative Frequency – percents • Cumulative Relative Frequency – add percents (its graph is AKA an ogive) See graphs on page 62
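The four kinds of frequency can be computed side by side. The sketch below reuses the 27 student ages from the split stemplot slides as its data; the variable names are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import accumulate

ages = ([17, 18] + [19] * 5 + [20] * 4 + [21] * 6 + [22] * 3
        + [23] * 2 + [24, 25, 28, 30, 31])

counts = Counter(ages)
values = sorted(counts)
n = len(ages)

freq = [counts[v] for v in values]          # number of times each value occurs
cum_freq = list(accumulate(freq))           # "keep adding" the frequencies
rel_freq = [f / n for f in freq]            # frequency as a share of the total
cum_rel_freq = [c / n for c in cum_freq]    # running total of the shares

print("age freq  cum    rel  cum rel")
for row in zip(values, freq, cum_freq, rel_freq, cum_rel_freq):
    print("{:>3} {:>4} {:>4} {:>6.1%} {:>7.1%}".format(*row))
```

Plotting `cum_rel_freq` against `values` would give the ogive.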
Histogram: Univariate Quantitative data • Classes should be equal width • Reasonable width • Reasonable starting point • Roughly 7 bars • Bars should touch • This is not a bar graph! [Histogram: x-axis Age (the univariate variable), y-axis frequency count]
Histograms: Discrete vs. Continuous [Two histograms, one labeled Continuous and one labeled Discrete]
Location: pth Percentile The pth percentile of a distribution (set of data) is the value such that p percent of the observations fall at or below it. Suppose your Math SAT score is at the 80th percentile of all Math SAT scores. This means 80% of all Math SAT scores were at or below your score.
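The "at or below" definition translates directly into a count. In this sketch the ten scores are made up for illustration, and `percent_at_or_below` is a hypothetical helper name.

```python
def percent_at_or_below(data, value):
    """Percent of observations in `data` that fall at or below `value`."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

# Hypothetical Math SAT scores for ten test takers.
scores = [400, 450, 500, 520, 550, 580, 600, 640, 700, 760]

# 8 of the 10 scores are at or below 640, so 640 sits at the 80th percentile.
print(percent_at_or_below(scores, 640))   # 80.0
```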
5 Number Summary: Minimum, Q1, Median, Q3, Maximum Q1 (Quartile 1) is the 25th percentile of ordered data, or the median of the lower half of ordered data. Median (Q2) is the 50th percentile of ordered data. Q3 (Quartile 3) is the 75th percentile of ordered data, or the median of the upper half of ordered data. Range = Maximum – Minimum. IQR = Interquartile Range (Q3 – Q1), the spread of the middle 50%.
Calculating OUTLIERS “1.5 IQR above Q3 or below Q1” IQR (Interquartile Range) = Q3 – Q1 Any point that falls outside the interval from Q1 – 1.5(IQR) to Q3 + 1.5(IQR) is considered an outlier.
Calculate the 5 Number Summary and Check for Outliers
Data set 1: 121, 132, 134, 154, 164, 175, 188, 192, 201, 203, 203
Data set 2: 3, 4, 4, 5, 10, 12, 13, 24
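One way to check the two data sets above is a short Python sketch. It assumes the "median of each half" definition of Q1 and Q3 given on the 5 Number Summary slide (calculators and software sometimes use slightly different quartile rules), and the function names are made up for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # values below the median position
    upper = xs[half + n % 2:]     # values above it (middle value left out if n is odd)
    return min(xs), median(lower), median(xs), median(upper), max(xs)

def outliers(data):
    """Values outside the interval from Q1 - 1.5(IQR) to Q3 + 1.5(IQR)."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

set1 = [121, 132, 134, 154, 164, 175, 188, 192, 201, 203, 203]
set2 = [3, 4, 4, 5, 10, 12, 13, 24]

print(five_number_summary(set1), outliers(set1))
print(five_number_summary(set2), outliers(set2))
```

Under this rule neither data set has an outlier: in the second set the upper cutoff is 12.5 + 1.5(8.5) = 25.25, so 24 stays inside.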
Boxplot: Using the 5 Number Summary 5# Summary of Computers: min = 250, Q1 = 1000, median = 2950, Q3 = 5400, Max = 8600 [Boxplot with these five values labeled]
Boxplot and Modified Boxplot Modified – show outliers 25% of data in each section
Robust (Resistant) Statistic Median is resistant to extreme values (outliers) in data set. Mean is NOT robust against extreme values. Mean is pulled away from the center of the distribution toward the extreme value (“tails of graph”).
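A quick numerical check shows the difference. This sketch takes the small data set 19, 20, 20, 21, 22 and swaps in one extreme value; the value 220 is made up for illustration.

```python
from statistics import mean, median

original = [19, 20, 20, 21, 22]
with_extreme = [19, 20, 20, 21, 220]   # hypothetical: 22 replaced by an extreme value

print(mean(original), median(original))           # mean 20.4, median 20
print(mean(with_extreme), median(with_extreme))   # mean 60, median 20
```

One extreme value pulls the mean from 20.4 up to 60, while the median stays at 20.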
Of the 2 segments, where is the Mean with respect to the Median? Remember the mean is pulled toward extreme values.
Describing Spread: Standard Deviation Roughly speaking, standard deviation is the average distance values fall from the mean (center of graph).
Population and Sample Standard Deviation σ² – population variance, s² – sample variance What is Variance???
What is Variance? Variance = (Standard deviation)²
Calculated Standard Deviation is a measure of Variation in data
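The link between variance and standard deviation can be checked with Python's standard `statistics` module. The sketch below uses the data set 19, 20, 20, 21, 22 and assumes the usual definitions: the population variance divides the sum of squared deviations by n, the sample variance divides it by n - 1.

```python
from math import sqrt
from statistics import mean, pvariance, variance, pstdev, stdev

data = [19, 20, 20, 21, 22]
n = len(data)
center = mean(data)

# Sum of squared distances from the mean.
squared_deviations = sum((x - center) ** 2 for x in data)

pop_var = squared_deviations / n           # population variance (divide by n)
samp_var = squared_deviations / (n - 1)    # sample variance (divide by n - 1)

print(pop_var, pvariance(data))            # both about 1.04
print(samp_var, variance(data))            # both about 1.3
print(sqrt(samp_var), stdev(data))         # sample standard deviation, about 1.14
print(sqrt(pop_var), pstdev(data))         # population standard deviation, about 1.02
```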
To describe a distribution: LET’S CUSS! Center Unusual Features Spread Shape
CENTER Mean (µ, x̄) – add up data values and divide by number of data values. Median – list data values in order, locate middle data value. Data Set: 19, 20, 20, 21, 22 Mean is 102 / 5 = 20.4. Median is 20 since it is the middle number of the ranked (ordered) data values.
UNUSUAL FEATURES Clusters, Gaps, Potential Outliers
SHAPE Skewed Right – “tail” points to the right. Normal – bell-shaped. The shape can also be skewed left, symmetric, or uniform.
SPREAD The spread can be described using: Standard Deviation (about 10) or Range (80 – 150 or 70) or IQR (about 100 – 130)
Summary Features of Quantitative Variables Center – Location Unusual Features – Outliers, Gaps, Clusters Spread – Variability Shape – Distribution Pattern
How to Choose Measures of Center and Spread? NON-SKEWED DISTRIBUTIONS – use mean and standard deviation SKEWED DISTRIBUTIONS – use 5# Summary (median and IQR)
Comparing Distributions • CUSS • COMPARE in CONTEXT • GENERAL CONCLUSION
Linear Transformations using the height of all LHS Seniors (in inches) What happens to center and spread if everyone is put in 3-inch heels (add 3 inches)? What happens to the center and spread if we change everyone's height to feet (divide by 12)?
Summary of Linear Transformations • Multiplying each observation by a positive number b multiplies both measures of center (mean and median) and measures of spread (IQR and standard deviation) by b. • Adding the same number a (either positive, negative, or zero) to each observation adds a to measures of center and to quartiles but does not change measures of spread. • NOTE: The shape NEVER changes!
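Both rules can be verified numerically. The heights below are made up for illustration (they are not the LHS seniors' data), and the helper `describe` is a hypothetical name.

```python
from statistics import mean, median, stdev

# Hypothetical heights in inches.
heights = [62, 64, 65, 66, 68, 70, 72]

plus_heels = [h + 3 for h in heights]    # add a = 3 to every observation
in_feet = [h / 12 for h in heights]      # multiply every observation by b = 1/12

def describe(data):
    """Return (mean, median, standard deviation, range) of the data."""
    return mean(data), median(data), stdev(data), max(data) - min(data)

print(describe(heights))
print(describe(plus_heels))   # center goes up by 3, spread is unchanged
print(describe(in_feet))      # center and spread are both divided by 12
```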