CHAPTER 7 COMPARING TWO MEANS Difference between two groups
7.2. Revision of Experimental Research
Often in the social sciences we are not just interested in which variables covary or predict an outcome. Instead, we want to look at the effect of one variable on another by systematically changing some aspect of that variable. Rather than collecting naturally occurring data, as in correlation and regression, we manipulate one variable to observe its effect on the other.
Example: Effect of encouragement on learning
Group 1: Positive reinforcement
Group 2: Negative reinforcement
Teaching method (positive or negative) is the independent variable. It has two levels.
The outcome, statistical ability, is the dependent variable.
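7.2.1 Two methods of data collection
• Between-group, between-subject or independent design: different participants take part in each condition.
• Within-subject or repeated measures design (dependent design): the same participants take part in every condition.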
7.2.2. Two types of variation
Repeated Measures Design (Dependent)
Example: Train monkeys to run the economy.
Training Phase 1 (no feedback): The monkeys are given user-friendly computers with press buttons that change various parameters of the economy. Once they change the parameters, a figure appears on the screen indicating the economic growth.
Training Phase 2: The same monkeys are given the computers again. If the economic growth is good, they get a banana.
If there were no experimental manipulation (i.e. no bananas), we would expect the behaviour of the monkeys to be similar in both conditions. We expect this because other factors, e.g. age, gender, IQ and motivation, are the same for both conditions. Therefore monkeys who score high in condition 1 should also score high in condition 2, and vice versa. But the performance will not be identical: there will be small differences created by unknown factors. This variation is called unsystematic variation.
Independent Design (Different Participants)
Example: Train monkeys to run the economy.
Training: 5 monkeys receive training without feedback. 5 different monkeys receive training with feedback.
If there were no experimental manipulation (i.e. no bananas), we would still find some variation in behaviour between the groups, because they contain different monkeys who vary in ability, motivation, IQ, etc. The factors that were held constant in the repeated measures design are free to vary in the independent design. Therefore the unsystematic variation will be bigger than for the repeated measures design.
7.2.2 Two Types of Variation (Continued)
Unsystematic variation: variation that results from random factors that exist between the experimental conditions (such as natural differences in ability).
Systematic variation: variation due to the experimenter doing something to all of the participants in one condition but not in the other condition.
• The role of statistics is to discover how much variation there is in performance, and then to work out how much of it is systematic and how much is unsystematic.
• Repeated measures design: differences between two conditions can be caused by only two things:
  • the manipulation that was carried out on the participants;
  • any other factor that might affect the way in which a person performs from one time to the next.
• Independent design: differences between two conditions can be caused by:
  • the manipulation that was carried out on the participants;
  • differences between the characteristics of the people allocated to each of the groups.
The second factor in the independent design is likely to create considerable random variation both within each condition and between them.
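The contrast between the two designs can be illustrated with a small simulation. This is a rough sketch rather than anything taken from the chapter: the ability scores (mean 50, SD 10), the trial-to-trial noise (SD 3) and the function name `simulate_mean_differences` are all invented for illustration. With no manipulation at all, it measures how much the difference between the two condition means fluctuates from one imaginary experiment to the next in each design.

```python
import random
import statistics


def simulate_mean_differences(n_experiments=2000, n_monkeys=5, seed=1):
    """Spread of the difference between condition means with NO manipulation.

    All numbers are illustrative assumptions: ability ~ N(50, 10) and
    trial-to-trial noise ~ N(0, 3).
    """
    rng = random.Random(seed)
    repeated_diffs = []
    independent_diffs = []
    for _ in range(n_experiments):
        # Repeated measures: the SAME monkeys are measured in both conditions,
        # so each monkey's ability cancels out of the difference.
        abilities = [rng.gauss(50, 10) for _ in range(n_monkeys)]
        cond1 = [a + rng.gauss(0, 3) for a in abilities]
        cond2 = [a + rng.gauss(0, 3) for a in abilities]
        repeated_diffs.append(statistics.mean(cond2) - statistics.mean(cond1))

        # Independent design: DIFFERENT monkeys in each group, so differences
        # in ability add to the unsystematic variation.
        group1 = [rng.gauss(50, 10) + rng.gauss(0, 3) for _ in range(n_monkeys)]
        group2 = [rng.gauss(50, 10) + rng.gauss(0, 3) for _ in range(n_monkeys)]
        independent_diffs.append(statistics.mean(group2) - statistics.mean(group1))

    return statistics.stdev(repeated_diffs), statistics.stdev(independent_diffs)


sd_repeated, sd_independent = simulate_mean_differences()
print(f"Repeated measures:  SD of mean difference = {sd_repeated:.2f}")
print(f"Independent design: SD of mean difference = {sd_independent:.2f}")
```

Under these assumptions the independent design should show the larger spread, because individual differences between monkeys no longer cancel out.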
7.3.1. Error bar graphs for between-group designs
SPSS example: spiderBG.sav
Figure 7.3
7.3.2. Error bar graphs for repeated measures designs
Figure 7.9
7.4 Testing Differences between Means: The t Test
Example: Does viewing an advertisement lead to more purchases?
• Independent-means t test: different participants in each condition.
• Dependent-means t test: the same participants in both conditions.
7.4.1 Rationale for the t test
• Two samples of data are collected and the sample means calculated. These means might differ by either a little or a lot.
• If the samples come from the same population, then we expect their means to be roughly equal. Although it is possible for their means to differ by chance alone, we would expect large differences between sample means to occur very infrequently.
• Under the null hypothesis we assume that the experimental manipulation (the advertisement) has no effect on the participants, so we expect the sample means to be very similar. H0: mean of sample 1 = mean of sample 2.
• We compare the difference between the sample means that we collected with the difference between sample means that we would expect to obtain by chance. We use the standard error as the gauge of the variability between sample means. If the standard error is small, we expect most samples to have similar means. If the standard error is large, we expect to obtain large differences in sample means by chance alone.
7.4.1 Rationale for the t test (Continued)
If the difference between the sample means we have collected is larger than we would expect based on the standard error, then we can assume one of two things:
• Sample means in our population fluctuate a lot by chance alone and we have, by chance, collected two samples that are atypical (not representative) of the population from which they came.
• The two samples come from different populations but are typical of their respective parent populations. In this scenario, the difference between samples represents a genuine difference between the samples (the null hypothesis is incorrect).
The larger the observed difference between the sample means, the more confident we become that the second explanation is correct.
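The rationale above can be made concrete by asking how often a difference of a given size turns up when both samples really do come from the same population. The sketch below is only an illustration: the population (mean 40, SD 10), the group size of 12 and the function name `chance_of_difference` are assumptions, not values from the chapter.

```python
import random
import statistics


def chance_of_difference(observed_diff, n_per_group=12, pop_mean=40.0,
                         pop_sd=10.0, n_sims=5000, seed=7):
    """Proportion of simulated experiments, with BOTH samples drawn from the
    same population, whose mean difference is at least as large as observed."""
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_sims):
        sample1 = [rng.gauss(pop_mean, pop_sd) for _ in range(n_per_group)]
        sample2 = [rng.gauss(pop_mean, pop_sd) for _ in range(n_per_group)]
        diff = statistics.mean(sample1) - statistics.mean(sample2)
        if abs(diff) >= abs(observed_diff):
            at_least_as_large += 1
    return at_least_as_large / n_sims


# Hypothetical observed differences between two sample means.
p_small = chance_of_difference(1.0)
p_large = chance_of_difference(12.0)
print(f"Difference of 1 or more arises by chance in {p_small:.1%} of simulations")
print(f"Difference of 12 or more arises by chance in {p_large:.1%} of simulations")
```

A small difference is common under the null hypothesis, whereas a large one is rare, which is why a large observed difference makes the second explanation more believable.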
7.4.2. Assumptions of the t test
Both the independent and the dependent t test are parametric tests, so they assume that:
• Data are from normally distributed populations.
• Data are measured at least at the interval level.
The independent t test, because it is used to test different groups of people, also assumes that:
• Variances in these populations are roughly equal (homogeneity of variance).
• Scores are independent (because they come from different people).
7.5 The Dependent t Test
The dependent t test is used to analyse whether differences between group means are statistically meaningful. The equation compares the mean difference between our samples ($\bar{D}$) with the difference we would expect to find between population means ($\mu_D$), and then takes into account the standard error of the differences ($s_D / \sqrt{N}$):
$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{N}}$
7.5.1 Sampling distributions and the standard error
Sampling distributions have several important properties:
• If the population is normally distributed, then so is the sampling distribution. In fact, if the sample size is more than about 30, the sampling distribution is approximately normal whatever the shape of the population.
• The mean of the sampling distribution is equal to the mean of the population.
• The standard deviation of a sampling distribution (the standard error) is equal to the standard deviation of the population divided by the square root of the sample size.
We can extend this idea to differences between sample means:
• If you take several pairs of samples from a population and calculate their means, then you can also calculate the difference between their means.
• See the explanation on page 289.
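The second and third properties can be checked with a short simulation. This is a minimal sketch under assumed values (a normal population with mean 100 and SD 15, samples of 25 scores); `simulate_sampling_distribution` is a name made up for this example.

```python
import math
import random
import statistics


def simulate_sampling_distribution(pop_mean=100.0, pop_sd=15.0,
                                   sample_size=25, n_samples=5000, seed=42):
    """Draw many samples and return the mean and SD of their sample means."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        sample_means.append(statistics.mean(sample))
    return statistics.mean(sample_means), statistics.stdev(sample_means)


mean_of_means, sd_of_means = simulate_sampling_distribution()
theoretical_se = 15.0 / math.sqrt(25)  # population SD / sqrt(sample size) = 3.0

print(f"Mean of the sample means: {mean_of_means:.2f} (population mean is 100)")
print(f"SD of the sample means:   {sd_of_means:.2f} (theoretical SE is {theoretical_se:.2f})")
```

The simulated values should land close to the population mean and to the theoretical standard error, though they will not match exactly because the number of simulated samples is finite.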
7.5.2 The dependent t test equation explained
Test statistic = variation explained by the model / variation not explained by the model
• If we calculate the difference between each person's scores in the two conditions and add these differences, we get the total amount of difference. If we divide this by the number of participants, we get the average difference.
• This average difference is D bar, an indicator of the systematic variation due to the experimental effect.
• We need to be sure that the observed difference is due to our experimental manipulation (and not a chance result).
• Knowing the mean difference alone is not useful because it depends on the scale of measurement, so we standardise the value.
• We can standardise it by dividing it by the sample standard deviation of the differences. The standard deviation is a measure of how much variation there is between participants' difference scores. Thus the standard deviation of the differences represents the unsystematic variation.
7.5.2 The dependent t test equation explained (continued)
• Dividing by the standard deviation is a useful means of standardising the average difference between conditions.
• We are interested in knowing how the difference between sample means compares with what we would expect to find had we not imposed an experimental manipulation.
• We can use the properties of the sampling distribution: instead of dividing the average difference between conditions by the standard deviation of the differences, we divide it by the standard error of the differences.
• Dividing by the standard error standardises the average difference between conditions, but also tells us how the difference between the sample means compares in magnitude with what we would expect by chance alone.
• If the standard error is large, then large differences between samples are more common (because the distribution of differences is more spread out). Conversely, if the standard error is small, then large differences between sample means are uncommon (because the distribution of differences is very narrow and centred around zero).
• Therefore, if the average difference between our samples is large and the standard error of the differences is small, then we can be confident that the difference we observed in our sample is not a chance result. If the difference is not a chance result, it must have been caused by the experimental manipulation.
7.5.2 The dependent t test equation explained (continued)
t = variation explained by the model / variation not explained by the model
• The t statistic is simply the ratio of the systematic variation in the experiment to the unsystematic variation.
• If the experimental manipulation creates an effect, then we expect the systematic variation to be much greater than the unsystematic variation, i.e. t should be greater than 1.
• If the experimental manipulation is unsuccessful, then we might expect the variation caused by individual differences to be much greater than that caused by the experiment, so t will be less than 1.
• We can compare the obtained t value against the maximum value we would expect to get by chance alone in a t distribution with the same degrees of freedom. If the value we obtain exceeds this critical value, we can be confident that this reflects an effect of our independent variable.
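The steps described for the dependent t test (difference scores, their mean, their standard deviation, the standard error, then the ratio) can be sketched in a few lines. The scores below are made up for illustration and are not the contents of spiderRM.sav; `dependent_t` is a hypothetical helper, and the null hypothesis is assumed to be a population mean difference of zero.

```python
import math
import statistics


def dependent_t(condition1, condition2):
    """Return (t, df) for paired scores, testing a mean difference of zero."""
    if len(condition1) != len(condition2):
        raise ValueError("paired data need the same number of scores per condition")
    differences = [b - a for a, b in zip(condition1, condition2)]
    n = len(differences)
    mean_diff = statistics.mean(differences)   # D bar: systematic variation
    sd_diff = statistics.stdev(differences)    # spread of the difference scores
    se_diff = sd_diff / math.sqrt(n)           # standard error of the differences
    return mean_diff / se_diff, n - 1


# Invented scores for six participants measured in both conditions.
no_feedback = [12, 15, 11, 18, 14, 16]
feedback = [15, 17, 12, 22, 15, 21]

t_value, df = dependent_t(no_feedback, feedback)
CRITICAL_T_DF5 = 2.571  # two-tailed critical value, alpha = .05, df = 5 (t tables)

print(f"t({df}) = {t_value:.2f}")  # t(5) = 4.00
print("Exceeds critical value:", t_value > CRITICAL_T_DF5)
```

Here the difference scores are 3, 2, 1, 4, 1 and 5, giving a mean difference of about 2.67 and a standard error of about 0.67, so the ratio is 4. In practice SPSS reports an exact p value, so a critical value need not be looked up by hand.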
7.5.2 The dependent t test using SPSS
Open the file: spiderRM.sav
Analyze > Compare Means > Paired-Samples T Test
7.6 The Independent t Test
Instead of looking at differences between pairs of scores, we look at the difference between the overall means of the two samples and compare it with the difference we would expect to get between the means of the two populations from which the samples come. Under the null hypothesis the population means are equal, so the expected difference is zero and the equation becomes:
$t = \frac{\bar{X}_1 - \bar{X}_2}{\text{estimate of the standard error}}$
If we took several pairs of samples, the differences between the sample means would be similar across pairs.
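As a rough sketch of the independent t test calculation, the code below divides the difference between two group means by an estimate of its standard error. It assumes the standard pooled-variance (equal variances) form, which the slides do not spell out; the scores are invented rather than taken from spiderBG.sav, and `independent_t` is a hypothetical helper name.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) for two independent groups using the pooled variance."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Estimated standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2) / se_diff, df


# Invented scores for two groups of different participants.
group_a = [12, 15, 11, 18, 14, 16]
group_b = [15, 17, 12, 22, 15, 21]

t_value, df = independent_t(group_a, group_b)
print(f"t({df}) = {t_value:.2f}")  # t(10) = -1.41
```

With equal group sizes the pooled form gives the same t as using the two variances separately; the pooling only matters when the groups differ in size.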
7.6.2 The independent t test using SPSS
Open the file spiderBG.sav
Analyze > Compare Means > Independent-Samples T Test
Open spiderBG.sav and run a simple linear regression, using group as the predictor and anxiety as the outcome.
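The point of this exercise is that a regression with a two-level group predictor reproduces the independent t test. The sketch below shows this with ordinary least squares written out by hand. The scores are invented and are not the contents of spiderBG.sav, the 0/1 coding of group is an assumption, and `simple_regression` is a hypothetical helper.

```python
import math
import statistics


def simple_regression(x, y):
    """Least squares with one predictor: returns (intercept, slope, t for slope)."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope: sqrt(residual mean square / Sxx).
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope


# Invented anxiety scores; group is coded 0 for the first group, 1 for the second.
group0_scores = [12, 15, 11, 18, 14, 16]
group1_scores = [15, 17, 12, 22, 15, 21]
group = [0] * len(group0_scores) + [1] * len(group1_scores)
anxiety = group0_scores + group1_scores

intercept, slope, t_slope = simple_regression(group, anxiety)
print(f"intercept = {intercept:.2f} (mean of the group coded 0)")
print(f"slope     = {slope:.2f} (difference between the group means)")
print(f"t for the slope = {t_slope:.2f}")
```

With this coding the intercept equals the mean of the group coded 0, the slope equals the difference between the two group means, and the t for the slope has the same magnitude as the pooled-variance independent t test on the same scores.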