Statistics Organizing and Analyzing Data
Types of statistical analysis • DESCRIPTIVE STATISTICS: Organizes data • measures of central tendency • mean, median, mode • measures of variability • range, standard deviation • INFERENTIAL STATISTICS: Analyzes data • measures of association • correlation coefficient • measures of causation • Statistical significance
Descriptive Statistics: Measures of Central Tendency • Mode • the most frequently occurring score in a distribution • Median • the middle score in a distribution • half the scores are above it and half are below it • Mean • the arithmetic average of a distribution • obtained by adding the scores and then dividing by the number of scores
Central Tendency PRACTICE Using the data set below, compute the (3) measures of central tendency. 2, 4, 5, 7, 7, 8, 9, 9, 9, 10, 11
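One way to check the practice answers is with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and not part of the slides.

```python
import statistics

# Practice data set from the slide
scores = [2, 4, 5, 7, 7, 8, 9, 9, 9, 10, 11]

mean = statistics.mean(scores)      # add the scores, then divide by how many there are
median = statistics.median(scores)  # middle score of the sorted list
mode = statistics.mode(scores)      # most frequently occurring score

print(f"mean   = {mean:.2f}")
print(f"median = {median}")
print(f"mode   = {mode}")
```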
Descriptive Statistics: Measures of Variability • Range - the difference between the highest and lowest score in a set of data • Example data: 2, 4, 5, 7, 7, 8, 9, 9, 9, 10, 11 • Standard deviation - a computed measure of how much scores vary around the mean • calculated by finding the square root of the variance • Determines how wide or narrow the normal distribution curve is
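Both measures can be sketched for the example data using the standard library. Note that `statistics.variance` and `statistics.stdev` compute the sample versions (dividing by n - 1); the variable names are illustrative.

```python
import math
import statistics

# Example data set from the slide
scores = [2, 4, 5, 7, 7, 8, 9, 9, 9, 10, 11]

data_range = max(scores) - min(scores)  # highest score minus lowest score
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

print(f"range              = {data_range}")
print(f"variance           = {variance:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```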
The red area represents the first standard deviation from the mean. About 68% of the data falls within this area. The green area represents the second standard deviation. About 95% of the data falls within the green PLUS the red area. The blue area represents the third standard deviation. About 99.7% of the data falls within the blue PLUS the green PLUS the red area.
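These percentages can be checked against a standard normal curve. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```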
Two classes took a recent quiz. There were 10 students in each class, and each class had an average score of 81.5
Since the averages are the same, can we assume that the students in both classes all did pretty much the same on the exam? Why or why not?
The answer is… No. The average (mean) does not tell us anything about the distribution or variation in the grades.
So, we need to come up with some way of measuring not just the average, but also the spread of the distribution of our data.
Why not just give an average and the range of data (the highest and lowest values) to describe the distribution of the data?
Well, for example, let's say that for a set of data, the average is 17.95 and the range is 23. But what if the data looked like this:
[Figure: the data plotted along a number line, with the average and the range marked.] Most of the numbers are clustered in one area and are not evenly distributed throughout the range.
The Standard Deviation is a number that measures how far, on average, the numbers in a set of data are from their mean.
If the Standard Deviation is large, it means the numbers are spread out from their mean. If the Standard Deviation is small, it means the numbers are close to their mean.
Here are the scores on the math quiz for Team A: Average: 81.5
The Standard Deviation measures how far away each number in a set of data is from their mean. For example, start with the lowest score, 72. How far away is 72 from the mean of 81.5? 72 - 81.5 = -9.5
Or, take a score above the mean, 89. How far away is 89 from the mean of 81.5? 89 - 81.5 = 7.5
So, the first step to finding the Standard Deviation is to find all the distances from the mean.
Next, you need to square each of the distances to turn them all into positive numbers.
Add up all of the squared distances. Sum: 214.5
Divide by (n - 1), where n is the number of scores you have. 214.5 / (10 - 1) = 23.8
Finally, take the square root of that average squared distance: √23.8 = 4.88
This is the Standard Deviation: 4.88
The Standard Deviation for the other class's grades is 15.92. Sum of squared distances: 2280.5; 2280.5 / (10 - 1) = 253.4; √253.4 = 15.92
Now, let's compare the two classes again. Class A: average 81.5, St. Dev = 4.88. Class B: average 81.5, St. Dev = 15.92
Which is the “smarter” class and why? Class A St. Dev = 4.88 Class B St. Dev = 15.92
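The steps walked through above (distances from the mean, squaring, summing, dividing by n - 1, square root) can be sketched as a small function. The individual quiz scores did not survive in this transcript, so the function is run on the earlier practice data set instead, and the last two steps are applied to the sums of squared distances reported on the slides (214.5 and 2280.5). The function names are illustrative assumptions.

```python
import math

def sample_standard_deviation(values):
    """Standard deviation computed step by step, as in the walkthrough."""
    n = len(values)
    mean = sum(values) / n
    distances = [x - mean for x in values]  # step 1: distance of each score from the mean
    squared = [d ** 2 for d in distances]   # step 2: square each distance
    total = sum(squared)                    # step 3: add up the squared distances
    variance = total / (n - 1)              # step 4: divide by (n - 1)
    return math.sqrt(variance)              # step 5: take the square root

def sd_from_sum_of_squares(sum_of_squares, n):
    """Steps 4 and 5 only, for when the sum of squared distances is already known."""
    return math.sqrt(sum_of_squares / (n - 1))

# Practice data set from the central tendency slide
practice = [2, 4, 5, 7, 7, 8, 9, 9, 9, 10, 11]
practice_sd = sample_standard_deviation(practice)

# Sums of squared distances reported on the slides, 10 students per class
class_a_sd = sd_from_sum_of_squares(214.5, 10)
class_b_sd = sd_from_sum_of_squares(2280.5, 10)

print(f"practice data: {practice_sd:.2f}")
print(f"class A:       {class_a_sd:.2f}")
print(f"class B:       {class_b_sd:.2f}")
```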
INFERENTIAL STATISTICS Correlational design Correlation Coefficient: How strong is the relationship between the two variables? As one goes up does the other go slightly or more extremely up or down? Experimental design Statistical Significance: How confident am I that the difference between my experimental group and control group is a result of the treatment?
Correlation Coefficient • A statistic that quantifies a relation between two variables • Can be either positive or negative • Falls between -1.00 and 1.00 • The size of the number (its absolute value, not the sign) indicates the strength of the relation
Positive Correlation Association between variables such that high scores on one variable tend to have high scores on the other variable A direct relation between the variables
Negative Correlation Association between variables such that high scores on one variable tend to have low scores on the other variable An inverse relation between the variables
Correlational Research • The correlation technique indicates the degree of association between 2 variables • Correlations vary in direction: • Positive association: increases in the value of variable X are associated with increases in the value of variable Y • Negative association: increases in the value of variable X are associated with decreases in the value of variable Y • No relation: values of variable X are not related to values of variable Y
Correlation • Correlation Coefficient • a statistical measure of the extent to which two factors vary together, and thus how well either factor predicts the other • Example: r = +.37 • The sign (+) indicates the direction of the relationship (positive or negative) • The number (.37) indicates the strength of the relationship (0.00 to 1.00)
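A correlation coefficient can be computed by hand. The sketch below implements the Pearson correlation coefficient, which is one common choice; the slides do not name a specific formula. All of the data values are hypothetical, made up only to show one positive and one negative relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

# Hypothetical data, invented for illustration only
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [55, 60, 70, 72, 85]  # tends to rise with hours studied
absences = [9, 7, 6, 3, 1]         # tends to fall with hours studied

r_positive = pearson_r(hours_studied, quiz_score)
r_negative = pearson_r(hours_studied, absences)

print(f"hours studied vs. quiz score: r = {r_positive:+.2f}")
print(f"hours studied vs. absences:   r = {r_negative:+.2f}")
```

The sign of each result gives the direction of the relationship, and `abs(r)` gives its strength.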
Check Your Learning • Which is stronger? • A correlation of 0.25 or -0.74?
Misleading Correlations: Correlation is NOT Causation • Something to think about • There is a 0.91 correlation between ice cream consumption and drowning deaths. • Does eating ice cream cause drowning? • Does grief cause us to eat more ice cream?
Correlation • Correlation is NOT causation • e.g., arm span and height
The Limitations of Correlation • Correlation is not causation. • Invisible third variables Three Possible Causal Explanations for a Correlation
Inferential statistics Statistical Significance: Computation that determines degree of confidence that your experimental results occurred due to the treatment and not other factors • How likely/probable are results like mine to occur by chance? • a statistical computation and statement of how likely it is that an obtained result occurred by chance
Statistical Significance • Statistical significance is calculated by determining the probability that differences as large as those between the sets of data would occur by chance alone, rather than as a result of the experimental treatment. • The significance level (α) is the pre-determined cutoff for that probability. • Most common pre-determined value = 5% (.05), which means results are called significant only when there is a 5% chance or less that they would be obtained by chance alone.
Statistical Significance and the Null Hypothesis • Two hypotheses need to be formed: • Research hypothesis - the one being tested by the researcher. • Null hypothesis - the one that assumes that any differences within the set of data are due to chance and are not significant.
The Null Hypothesis • Instead of testing to find the intended result, researchers test the “Null,” which is the OPPOSITE of one’s hypothesis. • If there is ANY difference between the control and the experimental group, and the researcher is confident it’s because of the IV, he/she REJECTS THE NULL. • Example null hypothesis 1: Caffeine has NO effect on students’ ability to stay awake past 2 a.m. • Example null hypothesis 2: Music has NO effect on subjects’ memory
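One way to estimate how likely a difference between groups is to occur by chance is a permutation test, sketched below. This technique is not named in the slides; it is used here only because it needs nothing beyond the standard library. The group scores are hypothetical, and the function name, trial count, and seed are illustrative assumptions.

```python
import random

def average(values):
    return sum(values) / len(values)

def permutation_p_value(treatment, control, trials=10_000, seed=0):
    """Estimate how often a difference at least as large as the observed one
    appears when group labels are shuffled at random (the null hypothesis)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(average(treatment) - average(control))
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # pretend the treatment made no difference
        diff = abs(average(pooled[:n_treatment]) - average(pooled[n_treatment:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            at_least_as_large += 1
    return at_least_as_large / trials

# Hypothetical scores, invented for illustration only
experimental_group = [8, 9, 7, 9, 10, 8]  # received the treatment
control_group = [5, 6, 4, 6, 5, 7]        # did not receive the treatment

alpha = 0.05  # the common pre-determined significance level
p_value = permutation_p_value(experimental_group, control_group)
reject_null = p_value <= alpha

print(f"p = {p_value:.4f}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```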