Learn how to analyze and interpret data using graphs, including stemplots, boxplots, histograms, and bar charts. Understand the measures of center, shape, and spread, and how to identify outliers.
1st Semester Final Review Day 1: Exploratory Data Analysis
Hardest title, easiest problems!
Graphs – Quantitative Data: Stemplot (parts per million ClO2), Back-to-Back Stemplot (calf weight: supplement vs. no supplement)
Graphs – Quantitative Data: Boxplot, Histogram, Modified Boxplot
Graphs – Quantitative Data: Side-by-Side Boxplots
Graphs – Categorical Data: Pie Chart (slices of 29.0%, 24.8%, and 46.2%), Bar Chart (frequency by political identification: Liberal, Moderate, Conservative)
Graphs – Categorical Data: Side-by-Side Bar Chart, Segmented Bar Chart
Describing or comparing distributions
• Center: mean, median (generally use the median when looking at a graph)
• Shape: symmetry/skew, modes/peaks, unusual features
• Spread: standard deviation, range, IQR (generally use the range or IQR when looking at a graph)
Center
• Mean: add up the values and divide by n
  • Strongly affected by outliers and skew (pulled in the direction of the skew or outliers)
• Median: order the values and find the middle
  • Resistant to skew and outliers (not strongly affected)
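A quick sketch of that resistance, using Python's standard `statistics` module. The data values are made up for illustration only.

```python
from statistics import mean, median

# Made-up data set, then the same data with one extreme value added.
data = [2, 3, 4, 5, 6]
with_outlier = data + [40]

print(mean(data), median(data))                  # 4 4
print(mean(with_outlier), median(with_outlier))  # 10 4.5
```

The single outlier pulls the mean from 4 up to 10, while the median barely moves.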
Shape: Approximately Symmetric, Skewed Left, Skewed Right
Shape: Unimodal, Bimodal, Uniform, Multimodal
Spread
• Range: Max – Min (affected by outliers)
• Quartiles: (resistant to skew/outliers)
  Q1 = 25th percentile (median of the bottom half)
  Median = 50th percentile
  Q3 = 75th percentile (median of the top half)
• IQR: Interquartile range = Q3 – Q1
• Standard Deviation: roughly the typical distance of individual values from the mean
  • Strongly affected by outliers/skew (not resistant)
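The measures of spread can be sketched in Python as follows. The helper `quartiles` and the data are hypothetical. It uses the "median of each half" rule from the slide and assumes that, for an odd number of values, the overall median is left out of both halves. Calculators and software may use slightly different quartile rules.

```python
from statistics import median, stdev

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [1, 3, 4, 6, 7, 9, 12]  # made-up values
q1, med, q3 = quartiles(data)
data_range = max(data) - min(data)
iqr = q3 - q1
sd = stdev(data)  # sample standard deviation (divides by n - 1)

print(data_range, q1, med, q3, iqr, round(sd, 2))  # 11 3 6 9 6 3.74
```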
Are there outliers in a data set?
• Outliers are values above the upper fence or below the lower fence
• Upper fence = Q3 + 1.5(IQR)
• Lower fence = Q1 – 1.5(IQR)
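A minimal sketch of the fence check in Python. The function names `fences` and `find_outliers` and the data are made up for illustration. Quartiles are computed as the median of each half, assuming the middle value is excluded when the count is odd.

```python
from statistics import median

def fences(values):
    """Return (lower fence, upper fence) from the 1.5 * IQR rule."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    # Assumption: with an odd count, the middle value belongs to neither half.
    q3 = median(ordered[half + 1:] if len(ordered) % 2 else ordered[half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values):
    """Return the values that fall outside the fences."""
    lower, upper = fences(values)
    return [x for x in values if x < lower or x > upper]

data = [1, 3, 4, 6, 7, 9, 30]  # made-up values
print(fences(data))         # (-6.0, 18.0)
print(find_outliers(data))  # [30]
```

Here Q1 = 3 and Q3 = 9, so IQR = 6 and the fences sit at 3 – 9 = –6 and 9 + 9 = 18. Only 30 falls outside them.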
What happens to summary statistics when we…
• Add a constant to a data set?
  • Measures of position (center, minimum, maximum, quartiles) change by that constant
  • Measures of spread (standard deviation, range, IQR) do not change
• Multiply a data set by a constant?
  • Measures of position and measures of spread are both multiplied by that constant (spread by its absolute value)
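These rules are easy to check numerically. The sketch below uses made-up data, adds 10 to every value, then multiplies every value by 3.

```python
from statistics import mean, stdev

def spread(values):
    """Return (range, sample standard deviation rounded to 3 places)."""
    return max(values) - min(values), round(stdev(values), 3)

data = [2, 4, 6, 8]               # made-up values
shifted = [x + 10 for x in data]  # add a constant
scaled = [x * 3 for x in data]    # multiply by a constant

print(mean(data), spread(data))        # 5 (6, 2.582)
print(mean(shifted), spread(shifted))  # 15 (6, 2.582)
print(mean(scaled), spread(scaled))    # 15 (18, 7.746)
```

Adding 10 moves the mean from 5 to 15 but leaves the range and standard deviation alone. Multiplying by 3 triples the mean, the range, and the standard deviation.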
Standardized Score
• z-score = the number of standard deviations a value falls above or below the mean
• For an individual: z = (value – mean) / standard deviation
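As a small worked example, the z-score for an individual value is its distance from the mean divided by the standard deviation. The function name `z_score` and the numbers are made up for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical test scores: mean 75, standard deviation 5.
print(z_score(85, 75, 5))  # 2.0  (two standard deviations above the mean)
print(z_score(70, 75, 5))  # -1.0 (one standard deviation below the mean)
```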
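Normal Distribution (68-95-99.7 Rule)
• Also called the Empirical Rule
• About 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3
• The curve's inflection points sit one standard deviation from the mean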
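The percentages in the rule can be checked with `statistics.NormalDist` from Python's standard library (Python 3.8+). The helper `within` is a made-up name. It gives the proportion of a standard normal distribution that lies within k standard deviations of the mean.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 3))
# 1 0.683
# 2 0.954
# 3 0.997
```

The rule's 68, 95, and 99.7 are rounded versions of these values.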