Outliers and Measures of Central Tendency / Dispersion • Wib Leonard, Megan Marchini, Nick Pajewski
Learning Objectives • Explain how outlying observations affect numeric summary measures of central tendency and dispersion • Explain why rank-based statistics are more robust to outlying observations • Recognize outlying observations graphically through boxplots
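The third objective can be previewed numerically. The slides do not say which rule the boxplot uses to mark outliers; the sketch below assumes the common convention of fences at 1.5 × IQR beyond the quartiles. The ages and the function name `boxplot_outliers` are hypothetical, chosen only for illustration.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR.

    The factor k=1.5 is an assumed convention, not taken from the slides.
    """
    # "inclusive" treats the data as the whole population of interest;
    # other quartile conventions can give slightly different fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical ages of students and siblings, without and with a grandparent.
ages = [16, 18, 19, 19, 20, 21, 22, 24]
ages_with_grandparent = ages + [85]

print(boxplot_outliers(ages))
print(boxplot_outliers(ages_with_grandparent))
```

With these made-up ages, only the grandparent's age falls outside the fences, which is the point a boxplot would draw separately from the whiskers.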
Context • Calculate summary measures of central tendency (mean, median, and mode) and dispersion (standard deviation, interquartile range) • Graphically represent data distributions in the form of a boxplot
Basic Activity Description • Students are put in groups of 4-5 • Each group records the ages of its members and of any siblings • The group then adds an outlying observation, the age of the oldest grandparent in the group, to create a second dataset • NOTE: The activity could be adapted to use coins, dice, etc.
Activity cont’d. • The “hands-on” portion then involves computing, for each of the two datasets, the mean, median, mode, standard deviation, and IQR, and constructing a boxplot • See attached worksheet • Group results are then collected (on the board, etc.) to illustrate the major objectives and to discuss how the effect of outliers diminishes with sample size
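The worksheet computations can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the ages are hypothetical, and the helper name `summarize` and the "inclusive" quartile method are assumptions, since the slides do not specify a quartile convention.

```python
import statistics

def summarize(values):
    """Return the summary measures from the activity for one dataset."""
    # Quartile convention is an assumption; textbooks differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),  # list, since ties are possible
        "stdev": statistics.stdev(values),     # sample standard deviation
        "iqr": q3 - q1,
    }

# Hypothetical ages for one group of students and their siblings.
ages = [18, 19, 19, 20, 21, 22, 16, 24]
# Second dataset: the same ages plus a hypothetical oldest grandparent.
ages_with_outlier = ages + [85]

for label, data in [("without outlier", ages), ("with outlier", ages_with_outlier)]:
    print(label, summarize(data))
```

For these made-up ages, the mean and standard deviation change substantially when the grandparent is added, while the median and IQR each move by only half a year.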
Formal Computer Presentations • After the “hands-on” portion, the major concepts could be formalized using an example like the Sharks dataset (Agresti, page 45) • The data contain shark attacks worldwide • Florida is an outlier: 289 attacks vs. 64 for the next highest • Construct an Excel spreadsheet that contains the data and automatically computes the summary measures
Applet Presentation http://standards.nctm.org/document/eexamples/chap6/6.6/index.htm#inst1
Summary of Objectives • When using the ages of the group members and their siblings, the mean, median, and mode should be similar. However, after adding the age of the grandparent, we would expect the mean to be greater than the median. • Summary measures like the mean and sample standard deviation are more sensitive to outliers than rank-based measures like the median and IQR. • The effect of outliers diminishes as the sample size increases.
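The last point can be illustrated by appending the same outlier to samples of increasing size. In this sketch the ages are hypothetical, `mean_shift` is an illustrative helper, and repeating the base sample is only a stand-in for pooling the results of several groups. Adding one value x to a sample of size n with mean m moves the mean by (x - m) / (n + 1), so the shift shrinks as n grows.

```python
import statistics

def mean_shift(sample, outlier):
    """How far the mean moves when a single outlier is appended to the sample."""
    return statistics.mean(sample + [outlier]) - statistics.mean(sample)

base = [18, 19, 20, 21, 22]  # hypothetical ages with mean 20
grandparent = 85             # hypothetical outlying age

for copies in (1, 4, 20):
    sample = base * copies   # stand-in for pooling several groups' data
    print(len(sample), round(mean_shift(sample, grandparent), 2))
```

The same comparison could be repeated with the median to show that it barely moves at any sample size.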
Follow-up Topics • Dealing with outliers: exclusion, data transformations • Hypothesis testing: parametric tests vs. rank-based statistics • Regression models: residual analysis, influential observations, etc.