Statistical Analysis

How do we make sense of the data we collect during a study or an experiment?

Two Kinds of Statistical Analysis
Descriptive Statistics:
• Organize and summarize data
Inferential Statistics:
• Interpret data and draw conclusions
• Used to test the validity of a hypothesis

Descriptive Statistics
• Numbers that summarize a set of research data obtained from a sample
• Organized into a frequency distribution (an orderly arrangement of scores)
• Can be pictured as a histogram (bar graph)
• Can be pictured as a frequency polygon (a line graph that replaces the bars with single points and connects the points with a line)
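As a rough sketch of how a frequency distribution and a histogram relate, the snippet below counts how often each score occurs and draws one row of marks per score. The quiz scores and the function names `frequency_distribution` and `text_histogram` are invented for illustration and do not come from the slides.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration only.
scores = [7, 8, 8, 9, 10, 8, 7, 9, 8, 10]


def frequency_distribution(values):
    """Return (score, count) pairs ordered by score."""
    return sorted(Counter(values).items())


def text_histogram(values):
    """Draw one row per distinct score, with one '#' per occurrence."""
    rows = []
    for score, count in frequency_distribution(values):
        rows.append(f"{score:>3} | {'#' * count}")
    return "\n".join(rows)


print(text_histogram(scores))
```

Connecting the ends of the rows with a line instead of drawing the marks would give the frequency polygon for the same data.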

Measures of Central Tendency
• Central tendency: a single score that represents a whole set of scores
• Describes the average or most typical score for a set of research data
• Three measures are commonly used: the mode, the median, and the mean

Measures of Central Tendency
• Mode – the most frequently occurring score (the least used measure of central tendency)
• Bimodal – two scores tie for most frequent
• Multimodal – three or more scores tie for most frequent
• Median – the middle score when the set of data is ordered by size
• Mean – the arithmetic average of the set of scores (the most commonly used measure)
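The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch on a made-up set of scores; the helper `describe_modes` is a hypothetical name, and it assumes Python 3.8 or later for `statistics.multimode`.

```python
import statistics

# Hypothetical scores, invented for illustration: 3 and 7 each occur twice.
scores = [2, 3, 3, 5, 7, 7, 9]

modes = statistics.multimode(scores)   # every score tied for most frequent
median = statistics.median(scores)     # middle score once the data are sorted
mean = statistics.mean(scores)         # sum of the scores divided by their count


def describe_modes(values):
    """Label a data set by how many scores tie for most frequent.

    Caveat: if every score occurs exactly once, multimode returns all of
    them, so this simple labeling is not meaningful for such data.
    """
    count = len(statistics.multimode(values))
    if count == 1:
        return "unimodal"
    if count == 2:
        return "bimodal"
    return "multimodal"


print(modes, median, mean, describe_modes(scores))
```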

Normal Distribution (also called normal curve or bell curve)
• A “normal distribution” of data means that most of the examples in a set of data are close to the mean (average), while relatively few examples tend toward one extreme or the other.
• Scores are often normally distributed. When this happens, the mode, median, and mean are all the same (in this case, 100).

Measures of Central Tendency in Dunder Mifflin Salaries
• Watch out for extreme scores or outliers.
• Let’s look at the salaries of the employees of the Dunder Mifflin Paper Company in Scranton:
  $25,000 – Pam
  $25,000 – Kevin
  $25,000 – Angela
  $100,000 – Andy
  $100,000 – Dwight
  $200,000 – Jim
  $300,000 – Michael
• The median salary looks good at __________
• The mean salary also looks good at about _______
• But the mode salary is _______
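One way to check your answers for the three blanks is to compute them from the salary list above. The sketch below uses only the salaries given on the slide; the variable names are illustrative.

```python
import statistics

# Salaries exactly as listed on the slide.
salaries = {
    "Pam": 25_000,
    "Kevin": 25_000,
    "Angela": 25_000,
    "Andy": 100_000,
    "Dwight": 100_000,
    "Jim": 200_000,
    "Michael": 300_000,
}

values = list(salaries.values())

median_salary = statistics.median(values)  # middle of the seven sorted salaries
mean_salary = statistics.mean(values)      # total payroll divided by headcount
mode_salary = statistics.mode(values)      # salary held by the most employees

print(f"median: ${median_salary:,.0f}")
print(f"mean:   ${mean_salary:,.0f}")
print(f"mode:   ${mode_salary:,.0f}")
```

Comparing the three printed values shows how much the choice of measure changes the picture of a "typical" salary.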

Skewed Distributions
• Skewed distribution: a distribution in which a few extreme scores (called outliers) significantly affect the mean.
• Distributions where most of the scores are squeezed into one end are skewed.
• In very skewed distributions, the median is a better measure of central tendency than the _______.

Central Tendency: 1968 Topps Baseball Cards

Nolan Ryan         $1500
Billy Williams     $8
Luis Aparicio      $5
Harmon Killebrew   $5
Orlando Cepeda     $3.50
Maury Wills        $3.50
Jim Bunning        $3
Tony Conigliaro    $3
Tony Oliva         $3
Lou Piniella       $3
Mickey Lolich      $2.50
Elston Howard      $2.25
Jim Bouton         $2
Rocky Colavito     $2
Boog Powell        $2
Luis Tiant         $2
Tim McCarver       $1.75
Tug McGraw         $1.75
Joe Torre          $1.50
Rusty Staub        $1.25
Curt Flood         $1

With Ryan: Median = $2.50, Mean = $74.14
Without Ryan: Median = $2.38, Mean = $2.85
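The effect of the single outlier can be reproduced from the card prices listed above. In this sketch the prices are taken from the slide, while the function name `median_and_mean` is an illustrative choice.

```python
import statistics

# Card prices in dollars, as listed on the slide.
ryan_price = 1500
other_prices = [
    8, 5, 5, 3.5, 3.5, 3, 3, 3, 3, 2.5,
    2.25, 2, 2, 2, 2, 1.75, 1.75, 1.5, 1.25, 1,
]


def median_and_mean(prices):
    """Return the (median, mean) pair for a list of prices."""
    return statistics.median(prices), statistics.mean(prices)


with_ryan = median_and_mean([ryan_price] + other_prices)
without_ryan = median_and_mean(other_prices)

print("With Ryan:    median = %.3f, mean = %.2f" % with_ryan)
print("Without Ryan: median = %.3f, mean = %.2f" % without_ryan)
```

Dropping one card out of 21 barely moves the median (from $2.50 to $2.375, which the slide rounds to $2.38) but cuts the mean from about $74 to under $3, which is why the median describes this price list better.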

Skews
• A few of the scores stretch out away from the group like a tail. The skew is named for the direction of the tail.
• Tail going to the left – negatively skewed
• Tail going to the right – positively skewed

Look at the figure above and note that when a variable is normally distributed, the mean, median, and mode are the same number.
• When the data are negatively skewed: mean < median < mode.
• When the data are positively skewed: mean > median > mode.
• Start at the end of the curve where the tail is pulled out the most. For both negatively and positively skewed curves, the order is mean, then median, then mode as you “walk up the curve.”
• The following two rules give some information about skewness even when you cannot see a line graph of the data (all you need is the mean and the median):
  Rule One. If the mean is less than the median, the data are skewed to the left.
  Rule Two. If the mean is greater than the median, the data are skewed to the right.
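The two rules can be written as a short function. This is a sketch of the rule of thumb only, with hypothetical data and a made-up function name `skew_direction`; the rule is a quick guide and can give the wrong answer for some unusual data sets.

```python
import statistics


def skew_direction(values):
    """Apply the mean-versus-median rule of thumb to a list of scores."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean < median:
        return "negatively skewed (tail to the left)"   # Rule One
    if mean > median:
        return "positively skewed (tail to the right)"  # Rule Two
    return "no skew indicated"


# Hypothetical examples, invented for illustration.
print(skew_direction([1, 2, 3, 4, 100]))     # one very high score
print(skew_direction([1, 97, 98, 99, 100]))  # one very low score
print(skew_direction([1, 2, 3]))             # symmetric scores
```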
