Chapter 7: Advanced Correlational Strategies
Regression: Predict scores on one variable from scores on another variable • Use GRE scores to predict success in grad school • Regression equation: predict one score on the basis of another score. • Goal is to find an equation for a regression line that best fits the data.
A regression line is a straight line that summarizes the linear relationship between two variables. • The regression line minimizes the sum of the squared deviations around the line. • It describes how an outcome variable y changes as a predictor variable x changes. • A regression line is often used as a model to predict the value of the response y for a given value of the explanatory variable x.
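The least-squares idea above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own method: the function names and the temperature and attendance data are hypothetical. It computes the slope and intercept that minimize the sum of squared deviations, then checks that a slightly shifted line fits worse.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-products of deviations / sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_deviations(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: outside temperature and class attendance.
temps = [60, 65, 70, 75, 80, 85]
attendance = [78, 75, 71, 69, 65, 63]

b0, b1 = fit_line(temps, attendance)
best = sum_squared_deviations(temps, attendance, b0, b1)
# Moving the line up by one unit can only increase the squared deviations.
shifted = sum_squared_deviations(temps, attendance, b0 + 1, b1)
```

Any other intercept or slope would give a larger sum of squared deviations than `best`, which is what "best fit" means here.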
The regression equation is expressed by: y = β₀ + β₁x • y is the variable you are predicting (dependent variable, criterion variable, or outcome variable) • x is the predictor variable that we are using to predict y • β₀ is the regression constant (beta-zero), which is the y-intercept of the line that best fits the data in the scatter plot • β₁ is the regression coefficient, which is the slope of the line that best represents the relationship between x and y
Example: Correlation between outside temperature and how many students attend class. The regression equation values are: β₀ is 114.35 and β₁ is -.61 • If it is supposed to be 82 degrees on Friday, how many students would you expect to attend class that day?
y = β₀ + β₁x
Attendance = 114.35 - .61(82)
Attendance = 114.35 - 50.02
Attendance = 64.33
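The same arithmetic can be checked with a short Python sketch. The constant (114.35) and coefficient (-.61) come from the example; the function name is only illustrative.

```python
def predict_attendance(temp_f, intercept=114.35, slope=-0.61):
    """Predicted attendance from the example's regression equation."""
    return intercept + slope * temp_f


predicted = predict_attendance(82)
print(round(predicted, 2))  # 64.33
```

In practice the prediction would be read as roughly 64 students.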
Multiple Regression is used when there is more than one predictor variable. • If you are predicting success in grad school you may use three predictor variables: GRE scores, University GPA, and IQ scores. • Then you can predict success in grad school based on all three predictors, which usually is more accurate than one predictor. • Allows the researcher to simultaneously consider the influence of all the predictor variables on the outcome variable.
Types of Multiple Regression
1. Standard multiple regression (simultaneous multiple regression): enter all the predictor variables at the same time. • You can predict grad school success by entering GPA, GRE, and IQ score simultaneously. • You will get one regression constant (β₀) and a separate regression coefficient (β₁, β₂, ...) for each predictor variable, which is based on the correlation between each predictor variable and the outcome variable. • Grad school success = -2.14 + .29(GPA) + .98(GRE) + 1.21(IQ)
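As a rough sketch, the grad school equation above can be applied to one applicant in Python. The constant and coefficients are the ones given in the example. The applicant's scores are hypothetical, and treating them as standardized scores is an assumption, since the slides do not state the units.

```python
# Constant and coefficients taken from the example equation.
CONSTANT = -2.14
COEFFICIENTS = {"GPA": 0.29, "GRE": 0.98, "IQ": 1.21}


def predict_success(scores):
    """Apply the equation: constant + sum of (coefficient * predictor score)."""
    return CONSTANT + sum(COEFFICIENTS[name] * scores[name] for name in COEFFICIENTS)


# Hypothetical applicant (scores assumed to be standardized, not raw).
applicant = {"GPA": 1.0, "GRE": 0.5, "IQ": 0.0}
predicted_success = predict_success(applicant)  # -2.14 + .29 + .49 + 0
```

Each predictor contributes its own term, so all three are considered simultaneously.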
2. Stepwise Multiple Regression: enter the predictor variables one at a time. • First enter the predictor variable that correlates the highest with the outcome variable. • Next, enter the variable that relates the strongest to the outcome variable after the first variable is entered. • It will account for the highest amount of variance in the outcome variable after the first predictor variable is entered. • This may or may not be the variable with the second highest correlation. If that variable is highly correlated with the first variable, then it may not predict a unique amount of the variance in the outcome variable.
[Figure: predictors of grad school success (y): Motivation .40, GPA .68, GRE .50]
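A small Python sketch can show why the second variable entered in a stepwise regression is not always the one with the second highest correlation. It assumes the figure's values (.68, .50, .40) are each predictor's correlation with grad school success. The correlations between GPA and the other predictors are hypothetical, and the semipartial correlation is used here as one common way to express unique contribution.

```python
from math import sqrt


def semipartial(r_y2, r_y1, r_12):
    """Correlation of predictor 2 with y after predictor 1 is removed from predictor 2."""
    return (r_y2 - r_y1 * r_12) / sqrt(1 - r_12 ** 2)


# Assumed to be each predictor's correlation with the outcome (from the figure).
r_with_outcome = {"GPA": 0.68, "GRE": 0.50, "Motivation": 0.40}
# Hypothetical correlations between GPA and the remaining predictors.
r_with_gpa = {"GRE": 0.70, "Motivation": 0.10}

# Step 1: enter the predictor with the highest correlation with the outcome.
first = max(r_with_outcome, key=r_with_outcome.get)

# Step 2: enter the predictor with the largest unique contribution.
unique = {
    name: semipartial(r_with_outcome[name], r_with_outcome[first], r_with_gpa[name])
    for name in r_with_gpa
}
second = max(unique, key=lambda name: abs(unique[name]))
```

Under these assumed values GRE overlaps heavily with GPA, so Motivation enters second even though its correlation with the outcome is lower.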
3. Hierarchical Multiple Regression: enter the predictor variables in a predetermined order, based on hypotheses the researcher wants to test. • Can partial out the effects of predictor variables entered in early steps to see whether other predictor variables still contribute uniquely to the variance in the outcome variable. • Confounding variables: variables that tend to occur together, making it hard to determine their unique effects.
E.g., we want to determine the relation between drinking while pregnant and the child's IQ score. • But we know that mothers who drink while pregnant also tend to smoke and use other drugs while pregnant, which could also decrease the child's IQ. • We can enter smoking and other drug use into the regression equation first and then enter drinking, • to see whether drinking uniquely predicts IQ scores above and beyond smoking and other drug use once they are accounted for (partialled out).
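The "above and beyond" logic can be sketched as a change in R² between steps. This is a simplified illustration: it uses a single control variable (smoking) rather than both smoking and other drug use, and all three correlations are hypothetical. It relies on the standard formula for R² with two predictors.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-squared for y predicted from two predictors, given the three correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical correlations (not from the slides).
r_iq_smoking = -0.30        # child's IQ with smoking
r_iq_drinking = -0.35       # child's IQ with drinking
r_smoking_drinking = 0.50   # the two predictors tend to occur together

# Step 1: smoking alone.
r2_step1 = r_iq_smoking ** 2
# Step 2: smoking plus drinking.
r2_step2 = r_squared_two_predictors(r_iq_smoking, r_iq_drinking, r_smoking_drinking)
# Variance in IQ that drinking explains beyond smoking.
r2_change = r2_step2 - r2_step1
```

A change in R² close to zero would suggest drinking adds little once smoking is partialled out; whether a given change is reliable would still require a significance test.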
Mediation Effects: occur when the effect of x on y is actually occurring because of a third variable, z. • First enter the possible mediator variables. • Then you can see if x uniquely predicts variance in y after z is accounted for and partialled out (statistically removed). • Example: there is a correlation between drowning and eating ice cream, but this relation may be due to a mediator variable called summer (heat). • We could first enter heat into the regression to determine how strongly heat is uniquely related to drowning; then, after heat is removed, we can determine whether eating ice cream is actually uniquely related to drowning.
Multiple correlation coefficient (R) • The ability of all the predictor variables together to predict the outcome variable. • Represents the degree of the relationship between the outcome variable and the set of predictor variables. • Ranges from .00 to 1.00; the larger the R, the better the set of predictor variables accounts for the variance in the outcome variable. • R can be squared to show the proportion of the variance in the outcome variable (y) that is accounted for by the set of predictor variables. • E.g., R = .50 accounts for 25% of the variance in y.
Studying the correlations between happiness and various predictor variables
Predictor Variable    Happiness
Self-Esteem           .15
Social Network        .33
Money                 .02
Life Satisfaction     .20
• In a stepwise regression, which variable would be entered first? Which will enter the equation second? • Which variable is least likely to be included as a predictor in the final equation? • If a standard regression was done, what is the smallest that the multiple correlation between the 4 predictor variables and the criterion variable could possibly be?
Cross-Lagged Panel Design • The correlation between two variables is calculated at two different points in time. • Then calculate the correlation between the two variables across time. • If we want to determine whether watching violent TV leads to aggressive behavior, OR if aggressive children prefer to watch more violent TV we can use a cross-lagged panel design • Look at correlation between TV violence (x) at time 1 and aggression (y) at time 2. • Look at correlation between aggression (y) at time 1 and TV violence (x) at time 2.
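If TV violence leads to aggression, then the correlation between x at time 1 and y at time 2 should be stronger than the correlation between y at time 1 and x at time 2.
[Figure: cross-lagged panel diagram, Time 1 to Time 2. TV violence (Time 1) with TV violence (Time 2): r = .05. Aggression (Time 1) with Aggression (Time 2): r = .38. Other correlations shown in the diagram: r = .31, r = .01, r = .21, r = -.05.]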
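The comparison rule above can be written as a small Python sketch. The function name, the `margin` threshold, and the example values are hypothetical; a real analysis would test whether the two cross-lagged correlations differ significantly rather than rely on a fixed margin.

```python
def compare_cross_lags(r_x1_y2, r_y1_x2, margin=0.10):
    """Compare the two cross-lagged correlations and suggest a direction.

    r_x1_y2: correlation between x at time 1 and y at time 2
    r_y1_x2: correlation between y at time 1 and x at time 2
    margin:  hypothetical minimum difference treated as meaningful
    """
    difference = abs(r_x1_y2) - abs(r_y1_x2)
    if difference > margin:
        return "x -> y"
    if difference < -margin:
        return "y -> x"
    return "no clear direction"


# Hypothetical values: x at time 1 predicts y at time 2 much better than the reverse.
direction = compare_cross_lags(0.30, 0.05)
```

When both cross-lagged correlations are similar in size, the design cannot favor one causal direction over the other.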
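In this cross-lagged panel design, does x appear to cause y, does y appear to cause x, do both variables influence each other, or are x and y unrelated?
[Figure: cross-lagged panel diagram, Time 1 to Time 2. Energy (Time 1) with Energy (Time 2): r = .65. Exercise (Time 1) with Exercise (Time 2): r = .23. Other correlations shown in the diagram: r = .45, r = .37, r = .51, r = .49.]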
Structural Equation Modeling • Allows you to test hypotheses about the pattern of correlations. • Researcher makes precise predictions about how three or more variables are causally related. • E.g., x causes y, which causes z. • Then you can compare your hypothesized correlation matrix against the real correlation matrix. • This analysis determines the degree to which the pattern of correlations observed matches or fits the researcher's predictions or model. • Can also test two different models against each other to see which one fits best with the observed correlation matrix.
Factor Analysis • Analyze the interrelationships among a number of variables. • Look for a pattern in the correlation matrix; look for correlations among the correlations. • Can determine if some variables are all highly correlated with each other but not with other variables that may only correlate with each other.
      A      B      C      D
A   1.00    .69    .04   -.03
B    --    1.00    .09    .10
C    --     --    1.00    .75
D    --     --     --    1.00
Present the data in a factor matrix with factor loadings, which represent the correlations of the variables with the factors.
Variable   Factor 1   Factor 2
A            .90        .02
B            .87       -.01
C            .03        .92
D            .07        .93
• Then you can identify labels for the factors. This is usually related to the researcher's underlying hypotheses and theory, but can be subjective.
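As a rough illustration of how two factors emerge, the sketch below takes the correlation matrix among A, B, C, and D and looks at its eigenvalues with NumPy. This is an eigen-decomposition (a principal-components style shortcut), not a full factor analysis, and the "eigenvalue greater than 1" rule used to count factors is an assumption that the slides do not mention.

```python
import numpy as np

# Correlation matrix among variables A, B, C, D (symmetric, 1.00 on the diagonal).
R = np.array([
    [1.00, 0.69, 0.04, -0.03],
    [0.69, 1.00, 0.09, 0.10],
    [0.04, 0.09, 1.00, 0.75],
    [-0.03, 0.10, 0.75, 1.00],
])

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Assumed rule of thumb: keep factors whose eigenvalue exceeds 1.
n_factors = int(np.sum(eigenvalues > 1.0))
```

Two large eigenvalues are consistent with the pattern in the factor matrix: A and B load on one factor, C and D on the other.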
Factor analysis can be used to: • Study the underlying structure of psychological constructs (personality traits). • Reduce a large number of variables to a smaller, more manageable set. • A study may include 40 measures of three different types of working memory, knowing that there are only a few basic constructs. • Develop self-report measures of attitudes and personality. • Researchers want to ensure certain measures are measuring the same construct.
Chapter 8: Basic Issues in Experimental Research
• Experimental designs allow researchers to make cause and effect conclusions. Three characteristics: • Researcher must vary at least one independent variable and assess its effects on a dependent variable. • Researcher must assign participants to experimental conditions in a way that ensures initial equivalence. • Researcher must control extraneous variables that may influence the participants’ behavior.
Manipulating the independent variable: • Independent variable is the variable that is manipulated by the researcher. • Must have two or more levels (conditions) • Different doses of a drug (100, 200, or 300 mg) • Quantitative differences (numerical differences in amount of drug, amount of time, etc.) • Qualitative differences (in one condition people study with background noise and in another with no background noise)
Types of independent variables: • Environmental manipulations: experimentally modify the physical or social environment • Different levels of lighting, group size. • Instructional manipulations: vary the instructions that participants receive. • One condition may tell participants the task will be very difficult, in another may tell them it will be easy • Invasive manipulations: invoke changes in the participant's body through surgery or drugs. • Different doses of a drug, rats with parts of their brain damaged.
Experimental groups: participants who receive some level of the independent variable. Control group: participants who do not receive a level of the independent variable. • Helps to identify the baseline level of performance. To ensure their independent variable is strong enough to produce an effect, researchers may: • Pilot test: test the independent variable on a small sample of participants to ensure the levels of the independent variable are different enough to produce an effect. • Manipulation check: ask the participants if they noticed the difference in the independent variable.
Subject variable: reflects existing characteristics of the participant (age, gender) Dependent variable: response being measured in the study
Assigning participants to conditions: • Want to ensure that the participants are the same before they are assigned to conditions, so effects are due to the manipulation of the independent variable and not due to pre-existing participant characteristics. Between-subjects designs: • Simple random assignment: Each participant has an equal probability of being placed in each condition. • Matched random assignment: test the participants on a measure related to the dependent variable and then assign to conditions by matching, to ensure you have the same number of people who are high and low on the measure in each condition.
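Both assignment procedures can be sketched in Python. This is one possible reading of the two definitions, not a prescribed procedure: the function names, participant labels, and pretest scores are hypothetical, and matching is done by ranking participants on the pretest and randomly assigning within each block of similar scorers.

```python
import random


def simple_random_assignment(participants, conditions, rng):
    """Each participant has an equal chance of landing in each condition."""
    return {p: rng.choice(conditions) for p in participants}


def matched_random_assignment(scores, conditions, rng):
    """Rank participants on a pretest, then randomly assign within matched blocks.

    scores: dict mapping participant -> pretest score
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    block_size = len(conditions)
    assignment = {}
    for start in range(0, len(ranked), block_size):
        block = ranked[start:start + block_size]
        shuffled = list(conditions)
        rng.shuffle(shuffled)  # random order of conditions within this block
        for participant, condition in zip(block, shuffled):
            assignment[participant] = condition
    return assignment


# Hypothetical pretest scores on a measure related to the dependent variable.
pretest = {"P1": 92, "P2": 88, "P3": 75, "P4": 71, "P5": 60, "P6": 55}
conditions = ["experimental", "control"]
rng = random.Random(0)  # fixed seed only so the sketch is repeatable

simple = simple_random_assignment(pretest, conditions, rng)
matched = matched_random_assignment(pretest, conditions, rng)
```

With matching, each pair of similar scorers is split across the two conditions, so high and low scorers are balanced. Simple random assignment only balances them on average.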
Within-subjects design • Repeated measures design: each participant completes all conditions • No need for random assignment • Participants may participate in the experimental and control group or in all the different levels of the independent variable • More powerful than between-subjects designs • Because the participants serve as their own controls • Requires fewer participants (can have 30 who participate in all three conditions, instead of 30 per condition, making 90).
Order effects: the order in which the levels of the independent variable are received may affect the participant’s behavior • If studying memory for words under different lighting conditions (each condition has more light) participants may be tired by the last condition which may reduce performance. • Participants may show a practice effect in that they get better at the task in subsequent conditions.
Counterbalancing: A procedure in which the order of conditions in a repeated-measures design is arranged so that each condition occurs equally often in each order. • Latin square design: each condition occurs once at each ordinal position and also follows equally often after each of the other conditions • Carryover effects: occur when the effects at one level of the independent variable are still present at another level (condition). • Must ensure a drug of one dosage wears off before the next condition starts
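The two ordering schemes can be sketched in Python. The function names are illustrative. The construction shown for the Latin square is one standard way to build a square in which each condition also follows every other condition equally often, and it only works for an even number of conditions.

```python
from itertools import permutations


def full_counterbalancing(conditions):
    """Every possible order of the conditions."""
    return list(permutations(conditions))


def balanced_latin_square(conditions):
    """One order per row; each condition appears once in each ordinal position
    and immediately follows each other condition exactly once (even n only)."""
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # First row follows the index pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each later row shifts every index in the first row by one more step.
    return [[conditions[(i + shift) % n] for i in first] for shift in range(n)]


all_orders = full_counterbalancing(["A", "B", "C"])
square = balanced_latin_square(["A", "B", "C", "D"])
```

Full counterbalancing grows quickly (3 conditions give 6 orders, 4 give 24), which is why a Latin square with only as many orders as conditions is often used instead.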
Experimental Control: • Eliminate or hold constant the effects of other extraneous variables that may affect the dependent variable. • Systematic variance (between-groups variance): the part of the total variance that reflects the differences among the experimental groups or conditions. Systematic variance = treatment variance + confound variance • Treatment variance: is due to the independent variable • Confound variance: is due to extraneous variables that differ between the groups and not due to the independent variable
Error variance: reflects unsystematic differences among the participants • Random variations in the setting (temp, lighting) and procedure (experimenter’s mood), or due to differences among participants within the group. • Can remove error variance from treatment variance using statistics. Total variance = treatment variance + confound variance + error variance • Want to maximize treatment variance, eliminate confound variance, and minimize error variance.
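The variance equation can be illustrated with sums of squares in Python. This is a sketch with hypothetical scores. It splits total variability into a between-groups part (systematic variance) and a within-groups part (error variance). Note that the between-groups part bundles treatment and confound variance together; statistics alone cannot separate the two, which is why confounds have to be eliminated through the design.

```python
def partition_variance(groups):
    """Return (ss_total, ss_between, ss_within) for a dict of condition -> scores."""
    all_scores = [score for scores in groups.values() for score in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total: every score's squared deviation from the grand mean.
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    # Between groups: group means' squared deviations, weighted by group size.
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Within groups: each score's squared deviation from its own group mean.
    ss_within = sum(
        (score - sum(scores) / len(scores)) ** 2
        for scores in groups.values()
        for score in scores
    )
    return ss_total, ss_between, ss_within


# Hypothetical scores on the dependent variable.
data = {"control": [4, 5, 6], "treatment": [7, 8, 9]}
ss_total, ss_between, ss_within = partition_variance(data)
```

Here most of the variability lies between the groups (13.5 of 17.5), which is the pattern a researcher hopes for when confounds have been ruled out.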
Sources of Error Variance • Individual differences: participants may differ cognitively, physiologically, and behaviorally. • Get participants that are homogeneous (more alike). • Transient states: participants may differ in transient states (mood, tiredness, health) • Environmental factors: differences in the study environment (noise, time of day). • Researchers should try to hold the environment constant • Differential treatment: researchers should treat all participants the same. The experimenter’s mood or health can influence how they treat some participants, or the participants’ behavior (friendly, mean, etc.) may affect their treatment • Measurement error: error in measuring. Try to use reliable measures.
Eliminating Confounds • Internal validity: the extent to which changes in the dependent variable can be attributed to the influence of the independent variable rather than to confounding variables. • Degree to which researchers can draw accurate conclusions about the effects of the independent variable. Internal validity threats: • Biased assignment of participants to conditions: participants in each condition differ at the beginning, so differences in the dependent variable may reflect pre-existing differences among the participants rather than differences due to the independent variable
[Figure: two panels, "Random Assignment" and "Biased Assignment", each showing 36 participants labeled A, B, or C]
Differential attrition: attrition refers to participants who do not continue in the study (drop out). Attrition can occur at a different rate in the different conditions • Problematic when more participants drop out of one condition than the other condition • People who drop out may be different from those who stay (more scared of the experiment, less motivated). • Pretest sensitization: taking a pretest may affect how participants behave in the experiment, so it is hard to determine whether an effect is due to the pretest or the independent variable.
History: history effects can affect the dependent variable • When testing anxiety in participants, a participant in one group may have just gone through a very stressful situation and may already be more anxious for reasons unrelated to the experiment. • Maturation: Participants may change over time in a longitudinal experiment. It may be difficult to distinguish the effect of the independent variable from maturation changes over time. • More problematic in research with children. • Miscellaneous design confounds: due to participants being treated differently in different conditions, which results in confounding.
Experimenter expectancy effects: researchers may observe behavior in a biased way that reflects what they expect to happen. • Their expectations can distort the results • Demand characteristics: participants may behave differently because of noticeable aspects of the experiment • They may be able to guess what the researchers are researching and act accordingly. • Double-blind procedure: neither the participant nor the researcher knows which condition a participant is in. • Helps to eliminate experimenter expectancy effects and demand characteristics
Placebo Effects: an artifact that occurs when participants' expectations about what effect an experimental manipulation is supposed to have influence the dependent variable • If participants think they are in a drug group, they may be more likely to say the drug produced an effect. • Placebo control group: receives a pill with no drug, so participants do not know if they are truly receiving the drug
External validity: the extent to which the results of the study can be generalized beyond the study to other places, people, times, and procedures. • Experimenter's dilemma: the more the researcher controls the study setting, the more internal validity the experiment has, but the lower the external validity. • Most researchers prefer strong internal validity over external validity, because they must ensure their effects are due to the independent variable. • Usually researchers are testing a theory about the relation between variables, so their relations should hold under different conditions and settings • Replication is important.