Prediction II Assumptions and Interpretive Aspects
Assumptions of Regression • Normal Distribution • Both variables should be normally distributed • For non-normal distributions we use non-parametric tests • Continuous Variables • Variables must be measured on an interval or ratio scale • Non-parametric tests are better suited to scores collected on nominal or ordinal scales • Linearity • The relationship between the two variables should be linear • Homoscedasticity • The variability of the actual Y values about the predicted value Y′ must be the same for all values of X.
Linearity • When the relationship is not linear, r is lower than the real strength of the relationship • So, prediction is less successful • Some characteristics in nature are curvilinearly related • For such variables, we need to use more advanced techniques • For instance, the relationship between anxiety and success is curvilinear • When anxiety is low, success is low (motivation is low) • When anxiety is moderate, success is high (motivation is high and anxiety does not have a detrimental effect) • When anxiety is high, success is low (the organism is shocked)
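To illustrate how a curvilinear relationship hides from r, here is a minimal sketch with made-up data. The anxiety and success values, the optimum at 5, and the helper `pearson_r` are all assumptions for illustration, not data from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical inverted-U data: success peaks at moderate anxiety (5).
anxiety = list(range(11))
success = [100 - 4 * (a - 5) ** 2 for a in anxiety]

# A straight line cannot capture the inverted U, so r is close to zero.
r_linear = pearson_r(anxiety, success)

# Using squared distance from the optimum as the predictor recovers
# the (here perfect) relationship.
distance_sq = [(a - 5) ** 2 for a in anxiety]
r_curved = pearson_r(distance_sq, success)

print(f"r(anxiety, success) = {r_linear:.3f}")
print(f"r(squared distance from optimum, success) = {r_curved:.3f}")
```

Success is completely determined by anxiety in this toy data, yet the linear r is essentially zero. Transforming the predictor is only one of several possible "advanced techniques"; the slides do not say which one is intended.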
Interpretive Aspects: Factors Influencing r • Range of Talent • When the range of Y, X, or both is restricted, r is lower than its real value • This is because r depends on both $S^2_{YX}$ and $S^2_Y$ • That is, on the ratio $S^2_{YX}/S^2_Y$ in formula B • If we restrict the variance of Y, for instance, the standard error of prediction stays the same while $S^2_Y$ shrinks, so the ratio grows and r gets lower • See figure 11.1 on page 195 • This is what we call the ceiling and floor effect
Interpretive Aspects: Factors Influencing r • Range of Talent
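The range-of-talent effect can be checked numerically. The sketch below uses made-up scores (Y equals X plus an alternating error of 10 points) and a hypothetical helper `summarize`. It restricts the range of X, which also narrows the spread of Y, and compares r and the standard error of estimate before and after.

```python
import math

def summarize(xs, ys):
    """Return (r, standard error of estimate) for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    s_y = math.sqrt(syy / n)            # standard deviation of Y
    s_yx = s_y * math.sqrt(1 - r ** 2)  # standard error of estimate
    return r, s_yx

# Hypothetical full-range data: Y = X plus an alternating error of +/-10.
x_full = list(range(100))
y_full = [x + (10 if x % 2 == 0 else -10) for x in x_full]

# Restricted sample: keep only the middle of the X range.
restricted = [(x, y) for x, y in zip(x_full, y_full) if 40 <= x < 60]
x_res = [x for x, _ in restricted]
y_res = [y for _, y in restricted]

r_full, se_full = summarize(x_full, y_full)
r_res, se_res = summarize(x_res, y_res)

print(f"full range:       r = {r_full:.2f}, S_YX = {se_full:.2f}")
print(f"restricted range: r = {r_res:.2f}, S_YX = {se_res:.2f}")
```

In this toy data the scatter about the regression line is about the same in both samples, but the restricted sample has a much smaller spread in Y, so its r drops sharply.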
Interpretive Aspects: Factors Influencing r • Heterogeneity of Samples • When samples are pooled, the correlation for the aggregated data depends on where the sample values lie relative to one another in both the X and Y dimensions • Let’s say professors Aktan and Göktürk prepared final exams for two courses: Statistics and Int. Resch. Methd.
Interpretive Aspects: Factors Influencing r • Heterogeneity of Samples • Students always score 20 points higher on Göktürk’s exams
Interpretive Aspects: Factors Influencing r • Heterogeneity of Samples • Aktan insists on giving his own Statistics exam
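The pooling effect can be sketched with invented scores. The numbers below, and the reading of the two scenarios (a second class shifted 20 points on both exams, versus shifted on the research exam only because everyone takes Aktan's Statistics exam), are assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical class A: statistics (X) and research methods (Y) scores.
stats_a = [50, 54, 58, 62, 66, 70]
research_a = [56, 48, 52, 68, 72, 64]
r_within = pearson_r(stats_a, research_a)

# Scenario 1: class B has the same pattern, but both exams run 20 points higher.
stats_b = [x + 20 for x in stats_a]
research_b = [y + 20 for y in research_a]
r_both_shifted = pearson_r(stats_a + stats_b, research_a + research_b)

# Scenario 2: both classes take the same statistics exam, so only the
# research scores of class B run 20 points higher.
r_y_shifted = pearson_r(stats_a + stats_a, research_a + research_b)

print(f"within one class:         r = {r_within:.2f}")
print(f"pooled, X and Y shifted:  r = {r_both_shifted:.2f}")
print(f"pooled, only Y shifted:   r = {r_y_shifted:.2f}")
```

The within-class relationship is identical in every case. Only the position of the second sample relative to the first changes, and that alone moves the pooled r above or below the within-class value.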
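Interpretive Aspects: Regression Equation • The β coefficient shows the slope of the regression line. • General equation of a straight line • Y = bX + c • Regression of Y on X • $Y' = r\frac{S_Y}{S_X}X + \left(\bar{Y} - r\frac{S_Y}{S_X}\bar{X}\right)$, where the coefficient of X is the slope (β) and the term in parentheses is the intercept (c)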
Interpretive Aspects: Regression Equation • The β coefficient shows the slope of the regression line. • To see this, let’s use two z-score distributions, in which the mean is 0 and the SD is 1 • Now the means of $z_X$ and $z_Y$ are both 0, so c = 0 • $S_{z_Y}/S_{z_X}$ is equal to 1/1, so $z'_Y = r(1/1)z_X = r z_X$ • As you can see, β is equal to r for z-score distributions
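The claim that the slope equals r for z scores can be verified on any data set. The sketch below uses a small invented sample; the helper names (`sd`, `z_scores`, `least_squares`) are hypothetical and the SD is the population form (dividing by n).

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Population standard deviation (divides by n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def z_scores(values):
    m, s = mean(values), sd(values)
    return [(v - m) / s for v in values]

def least_squares(xs, ys):
    """Return (slope, intercept) of the regression of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical raw scores.
x = [2, 4, 5, 7, 9, 12]
y = [10, 14, 13, 20, 22, 25]

# r is the mean cross-product of z scores.
zx, zy = z_scores(x), z_scores(y)
r = mean([a * b for a, b in zip(zx, zy)])

slope_raw, intercept_raw = least_squares(x, y)
slope_z, intercept_z = least_squares(zx, zy)

print(f"r = {r:.4f}")
print(f"raw-score slope = {slope_raw:.4f}, r * S_Y / S_X = {r * sd(y) / sd(x):.4f}")
print(f"z-score slope = {slope_z:.4f}, z-score intercept = {intercept_z:.4f}")
```

The raw-score slope matches r times the ratio of the standard deviations, and once both variables are standardized that ratio is 1 and the intercept vanishes, leaving the slope equal to r.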
Interpretive Aspects: Regression Equation • Now, let’s say we calculated r between statistics and research scores for students of Çağ University, ODTÜ, and Mersin University • For Çağ University, r = .82 • For Mersin University, r = .62 • For ODTÜ, r = .35
Interpretive Aspects: Proportion of Variance in Y Associated with Variance in X • The correlation coefficient has a special meaning • The squared correlation coefficient is equal to the proportion of variance in Y that is explained by the variance in X • That is, the explained variance • $r^2$ = proportion of explained variance • $1 - r^2$ = proportion of unexplained variance • Let’s say the correlation between depression and GPA is .67 • r = .67, so $r^2$ = .45 • So, variance in depression explains 45% of the variance in GPA
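As a rough check of this interpretation, the sketch below fits a regression line to invented data and compares the ratio of explained to total variation with the squared correlation. The data and variable names are assumptions for illustration only.

```python
import math

# Hypothetical paired scores.
x = [2, 4, 5, 7, 9, 12]
y = [10, 14, 13, 20, 22, 25]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

r = sxy / math.sqrt(sxx * syy)

# Least-squares predictions Y' = bX + c.
b = sxy / sxx
c = mean_y - b * mean_x
predicted = [b * a + c for a in x]

# Partition the total variation in Y into explained and unexplained parts.
ss_total = syy
ss_explained = sum((p - mean_y) ** 2 for p in predicted)
ss_unexplained = sum((obs - p) ** 2 for obs, p in zip(y, predicted))

print(f"r squared              = {r ** 2:.4f}")
print(f"explained / total      = {ss_explained / ss_total:.4f}")
print(f"unexplained / total    = {ss_unexplained / ss_total:.4f}")

# The slide's own arithmetic: r = .67 gives r squared of about .45.
r_slide_squared = round(0.67 ** 2, 2)
```

The explained share equals r squared and the unexplained share equals 1 minus r squared, which is the partition the slide describes.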
Interpretive Aspects: Proportion of Variance in Y Associated with Variance in X • We can see the meaning of this in the figure below