PSYC 221: Statistics
Science Fiction, Hocus Pocus and Brute Force Methodology
Linear Regression
• Goal is prediction of Y from knowledge of X
• Y (predictED) = a + bX
• Where a is the intercept, b is the slope of the regression line, and X is the predictOR variable
• So, for example, Y = first-year success (GPA) and X = ACT score
• What about additional predictors? Can we improve our guessing by adding variables (high school GPA, extracurricular activities)?
A Quick Review: Regression
a = regression constant (Y intercept)
b = regression coefficient (slope)
X = score on predictor variable (known)
Y(hat) = score to be predicted (solved for)
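As a minimal sketch of how a and b can be obtained, the following computes the least-squares intercept and slope and then solves for Y(hat). The ACT and GPA values and the function name fit_line are made up for illustration; they are not course data.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y(hat) = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of X and Y divided by variation in X
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the line always passes through (mean_x, mean_y)
    a = mean_y - b * mean_x
    return a, b


# Hypothetical scores, for illustration only
act = [18, 21, 24, 27, 30]
gpa = [2.0, 2.6, 2.9, 3.3, 3.7]

a, b = fit_line(act, gpa)
y_hat = a + b * 25  # predicted first-year GPA for an ACT score of 25
print(f"a = {a:.3f}, b = {b:.3f}, Y(hat) at X=25: {y_hat:.2f}")
```

The known X goes in on the right-hand side and Y(hat) is what gets solved for, matching the roles listed above.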
Multiple Regression
• Determining the association (relationship/correlation) between a criterion variable (Y predicted) and two or more predictor variables (X1, X2, … Xk scores)
• Y (predictED) = a + b1X1 + b2X2 + b3X3
• Yields multiple correlation coefficient R
• R² = proportion of variance in Y accounted for by all of the predictor variables
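To make the two-predictor case concrete, here is a rough sketch that fits Y (predictED) = a + b1X1 + b2X2 by least squares and reports R². It uses the closed-form solution for exactly two predictors; statistical software generalizes this to any number. All data values and the function name fit_two_predictors are hypothetical.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares a, b1, b2 and R^2 for Y(hat) = a + b1*X1 + b2*X2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the predictors are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    # R^2 = 1 - (unexplained variation / total variation in Y)
    ss_res = sum((w - (a + b1 * u + b2 * v)) ** 2 for u, v, w in zip(x1, x2, y))
    ss_tot = sum((w - my) ** 2 for w in y)
    return a, b1, b2, 1 - ss_res / ss_tot


# Hypothetical scores, for illustration only
act = [18, 21, 24, 27, 30, 22]
hs_gpa = [2.5, 3.4, 2.8, 3.9, 3.6, 3.0]
college_gpa = [2.0, 2.8, 2.7, 3.6, 3.7, 2.6]

a, b1, b2, r2 = fit_two_predictors(act, hs_gpa, college_gpa)
print(f"a = {a:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}, R^2 = {r2:.3f}")
```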
Issues in Adding Variables in Multiple Regression Formula
• Potential inter-correlation of predictors: MULTICOLLINEARITY
• If predictors are inter-correlated, then the order of addition into the equation matters
• How do you determine which to add in first, second, third, and so on?
• Simultaneous – all predictors entered at the same time
• Sequential (hierarchical) – predictors entered one at a time in an order determined by the researcher
• Stepwise – computer selects which predictor to enter first to maximize fit
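A small sketch of why entry order matters when predictors are inter-correlated. It uses the standard two-predictor formula for R² in terms of the three pairwise correlations, then compares how much variance each predictor is credited with when entered first versus second. The correlation values and the function name r_squared_two are assumptions chosen for illustration.

```python
def r_squared_two(r_y1, r_y2, r_12):
    """R^2 for two predictors, from the three pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical correlations, for illustration only
r_y1 = 0.5   # Y with X1
r_y2 = 0.4   # Y with X2
r_12 = 0.6   # X1 with X2 (the inter-correlation)

total = r_squared_two(r_y1, r_y2, r_12)

# Entered first, a predictor is credited with its full squared correlation.
# Entered second, it only gets what it adds beyond the first predictor.
x2_first = r_y2 ** 2
x2_second = total - r_y1 ** 2
x1_first = r_y1 ** 2
x1_second = total - r_y2 ** 2

print(f"Total R^2 with both predictors: {total:.4f}")
print(f"X1 credited: {x1_first:.4f} if first, {x1_second:.4f} if second")
print(f"X2 credited: {x2_first:.4f} if first, {x2_second:.4f} if second")
```

If r_12 is set to 0, the two increments no longer depend on order and R² is simply the sum of the squared correlations.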
The Big Picture
• The General Linear Model (GLM) – the general formula on which all the statistics we have used are based
• Desert Island formula
  Y = a + b1X1 + b2X2 + e
  Variance in DV = fixed factors + manipulated factors + error
• Can derive r, t, F
• Why did we do all those?
• Why are we talking about the GLM?
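One way to see how a two-group comparison falls out of the same formula: code group membership as X = 0 or 1 and fit Y = a + bX. The intercept a comes out as the mean of the group coded 0, and the slope b as the difference between the two group means, which is the quantity a t test evaluates. This is a sketch with made-up scores; the group labels and the function name fit_line are illustrative assumptions.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y(hat) = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical exam scores, for illustration only
control = [70, 75, 80, 85]
treatment = [78, 82, 88, 92]

# Dummy-code the manipulated factor: 0 = control, 1 = treatment
x = [0] * len(control) + [1] * len(treatment)
y = control + treatment

a, b = fit_line(x, y)
mean_control = sum(control) / len(control)
mean_treatment = sum(treatment) / len(treatment)

print(f"a = {a:.2f} (control mean = {mean_control:.2f})")
print(f"b = {b:.2f} (mean difference = {mean_treatment - mean_control:.2f})")
```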
The Issue of Inter-correlation
• Partial Correlation – computing the amount of correlation above and beyond the influence of some third variable
• e.g. the association between study time and grade in statistics class while holding constant (above and beyond) general mathematical ability
• “partialing out” the effect of math ability
• Often used as a way to bolster the conclusions of a study (by ruling out alternative explanations)
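The study-time example can be sketched with the standard first-order partial correlation formula, which needs only the three pairwise correlations. The correlation values below and the function name partial_correlation are hypothetical, picked to illustrate the calculation.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z held constant (partialed out)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations, for illustration only
r_study_grade = 0.5  # study time with statistics grade
r_study_math = 0.3   # study time with math ability
r_grade_math = 0.6   # statistics grade with math ability

r_partial = partial_correlation(r_study_grade, r_study_math, r_grade_math)
print(f"Study time and grade, ignoring math ability: {r_study_grade:.2f}")
print(f"Study time and grade, math ability partialed out: {r_partial:.2f}")
```

If the partial correlation stays well above zero, math ability alone is a less plausible alternative explanation for the study time and grade relationship.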
Brute Force Use of Correlations
• Reliability
• Cronbach’s alpha
• Multilevel (hierarchical) Modeling
• Looking at correlations with data that are grouped in some way
• e.g. correlation between study time and exam score across all sections of PSYC 221 offered
• Differ in instructor, time, textbook, etc.
• Compute correlation for each group, then average
• Could also “layer” the situation
• e.g. look at instructor experience as predictor of performance
• Initial correlation (time and performance) = lower level variables
• Instructor experience and performance = upper level variables
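The "compute correlation for each group, then average" step can be sketched as follows. This shows only that lower-level step, not a full multilevel model, which would also bring in upper-level predictors such as instructor experience. The section names, the scores, and the function name pearson_r are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: (study hours, exam scores) for each section
sections = {
    "section_A": ([2, 4, 6, 8], [65, 70, 80, 85]),
    "section_B": ([1, 3, 5, 7], [60, 72, 70, 90]),
    "section_C": ([2, 3, 6, 9], [75, 70, 82, 88]),
}

# Lower level: correlate study time with exam score within each section
per_section = {name: pearson_r(hours, scores)
               for name, (hours, scores) in sections.items()}

# Then average the within-section correlations
average_r = sum(per_section.values()) / len(per_section)

for name, r in per_section.items():
    print(f"{name}: r = {r:.2f}")
print(f"Average within-section r = {average_r:.2f}")
```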
Internal Validity
• Internal validity refers to the ability to infer cause and effect in experimental research
• Correlations have very low internal validity because no IV is manipulated
• The pattern of correlations can sometimes be used to test specific predictions about what is causing what
Causal Modeling
• The pattern is everything…
• Media violence and aggression
• Famous Bobo doll experiment
• Inability to directly manipulate IV
• CAN look at pattern (path) of correlations
• Examples of systems provided in the text
• Path analysis
• Mediational analysis
• Structural equation modeling
• Latent variable – an unmeasured, theoretical variable (pg 631)
When in doubt, don’t panic!
• What to do with unfamiliar statistics
• Common features of statistics
• p values
• Effect sizes
• Expressions of association or difference
• Don’t check out!
• Focus on what you DO know
• Look for further information
• Ask questions
• Understand WHY – What is the question? What is the answer?