AP Statistics Review Analyzing Data (C2-5 BVD) C5: Numerical Descriptions of Data
SOCS by the Numbers
• Shape – Note any particular concentrations or large spreads of data visible in box plots.
• Outliers – If the distribution is skewed, flag values more than 1.5 × IQR above Q3 or below Q1. If the distribution is unimodal and symmetric, you may instead flag values more than 2 or 3 standard deviations from the mean, or use the 1.5 × IQR rule. When in doubt, use 1.5 × IQR.
• Center – Use the mean if the distribution is unimodal and symmetric; otherwise use the median.
• Spread – Use the standard deviation if the distribution is unimodal and symmetric; otherwise use the IQR.
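The 1.5 × IQR outlier rule can be sketched in a few lines of Python. The function names `iqr_fences` and `flag_outliers` are made up for this illustration, and Q1 and Q3 are assumed to have been found already.

```python
def iqr_fences(q1, q3):
    """Return the (lower, upper) cutoffs for the 1.5 * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def flag_outliers(data, q1, q3):
    """Return the values that fall below the lower fence or above the upper fence."""
    lower, upper = iqr_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical data with Q1 = 10 and Q3 = 20, so IQR = 10.
print(iqr_fences(10, 20))                          # (-5.0, 35.0)
print(flag_outliers([1, 12, 15, 18, 40], 10, 20))  # [40]
```

A value exactly on a fence is not flagged here; that boundary choice is an assumption of this sketch.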
Making Box Plots
Box plots are useful for displaying distributions of quantitative data. The TI is good at making comparative box plots.
• 1. Order data from least to greatest.
• 2. Find the median (the middle number, or the average of the middle two).
• 3. The median of the data below the median is Q1. The median of the data above the median is Q3.
• 4. Five-number summary: low, Q1, median, Q3, high.
• 5. The box runs from Q1 to Q3 with a bar at the median. Whiskers go from low to Q1 and from Q3 to high.
• TI Tip – Choose the box plot that shows outliers.
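The five-number summary steps can be sketched in Python as follows. The helper names are made up for this illustration. The sketch assumes the median itself is left out of both halves when the number of data points is odd, which matches "data below the median" and "data above the median"; software packages may use other quartile conventions.

```python
def median(sorted_values):
    """Middle value of an already sorted list, or the average of the middle two."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(data):
    """Return (low, Q1, median, Q3, high) using the median-of-halves method."""
    if len(data) < 2:
        raise ValueError("need at least two data points")
    values = sorted(data)                     # step 1: order the data
    n = len(values)
    half = n // 2
    lower = values[:half]                     # data below the median
    # Skip the middle value itself when n is odd.
    upper = values[half + 1:] if n % 2 == 1 else values[half:]
    return (values[0], median(lower), median(values), median(upper), values[-1])


print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))   # (1, 3, 7, 11, 13)
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))  # (1, 2.5, 4.5, 6.5, 8)
```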
Interpreting Box Plots
Remember: a box plot divides the data into four groups, each holding about one fourth of the values. If each fourth is about equal in length, the data are not concentrated over any particular subrange.
• If a fourth is particularly skinny, 1/4 of the data is concentrated in a small range. “Lots” of data points are about the same.
• If a fourth is particularly long, 1/4 of the data is very spread out.
• Q1 corresponds to the 25th percentile, the median to the 50th percentile, and Q3 to the 75th percentile.
• Therefore the “box,” whose length is the IQR, holds the middle 50% of the data.
Parallel Box Plots
Plot two or more box plots on the same number line to compare and contrast.
• What portion/percent of the data in one box plot is above/below the other? Or above/below the median of the other?
• Is one more consistent or more spread out than the other?
• Look for big-picture similarities and differences between the two groups.
Percentiles
Calculating percentiles:
• 1. Order data from least to greatest.
• 2. The percentile for a particular value is the number of values below it divided by the total number of data points, expressed as a percent.
• 3. It is not possible to be at the 100th percentile even if you’re the best, because your own value is never counted as being below itself.
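A minimal Python sketch of this percentile calculation, assuming the "count of values strictly below, divided by the number of data points" definition given above. The function name `percentile_rank` and the scores are made up for this illustration.

```python
def percentile_rank(data, value):
    """Percent of data points that are strictly below the given value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


# Hypothetical test scores; 100 is the best score in the group.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 100))  # 90.0, not 100: the top score is not below itself
print(percentile_rank(scores, 55))   # 0.0
```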
Cumulative Relative Frequency Graphs (Ogives)
The percentile graph
• Data values go on the x-axis.
• Percentiles (cumulative relative frequency) go on the y-axis.
• Example: if 50 mph is the speed at the 45th percentile, put a point at (50, 45).
• The shape is typically an S-curve.
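The points for such a graph can be computed with a short Python sketch. One assumption to note: this version uses the usual cumulative relative frequency, the percent of data at or below each value, so the last point reaches 100. That differs slightly from a strict "values below" percentile count. The function name `ogive_points` and the speeds are made up for this illustration.

```python
from collections import Counter


def ogive_points(data):
    """Return (value, cumulative percent at or below value) for each distinct value."""
    counts = Counter(data)
    n = len(data)
    points = []
    running = 0
    for value in sorted(counts):
        running += counts[value]              # running count of data <= value
        points.append((value, 100 * running / n))
    return points


# Hypothetical speeds in mph.
print(ogive_points([40, 45, 45, 50]))  # [(40, 25.0), (45, 75.0), (50, 100.0)]
```

Plotting these pairs with data values on the x-axis and connecting them gives the ogive.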
Standard Deviation
Standard deviation is a preferred measure of spread, particularly for unimodal/symmetric data. It is roughly the typical distance of a value in the data set from the mean.
• To calculate standard deviation:
• 1. Find the mean of the data.
• 2. Subtract the mean from each data point. Each result is called a deviation.
• 3. Square all the deviations.
• 4. Sum them up.
• 5. Divide by (n − 1), where n is the number of data points.
• 6. Take the square root.
• 1-Var Stats gives sigma for populations and Sx for samples. When in doubt use Sx.
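The six steps translate directly into a short Python sketch. This is the sample version (dividing by n − 1, the Sx value); the function name and the data are made up for this illustration.

```python
import math


def sample_standard_deviation(data):
    """Sample standard deviation (Sx), following the six steps one at a time."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(data) / n                      # 1. find the mean
    deviations = [x - mean for x in data]     # 2. each data point minus the mean
    squared = [d ** 2 for d in deviations]    # 3. square the deviations
    total = sum(squared)                      # 4. sum them up
    variance = total / (n - 1)                # 5. divide by (n - 1)
    return math.sqrt(variance)                # 6. take the square root


# Hypothetical data: mean is 5 and the squared deviations sum to 32.
print(round(sample_standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]), 3))  # 2.138
```

Dividing by n instead of n − 1 in step 5 would give the population value (sigma).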