LESSON 4.4. MULTIPLE LINEAR REGRESSION. Residual Analysis. Design and Data Analysis in Psychology II. Susana Sanduvete Chaves, Salvador Chacón Moscoso
Type of residuals • Ordinary residuals: the difference between the observation (Y) and the prediction (Ŷ), e_i = Y_i − Ŷ_i. • The residual e_i is a random variable whose mean and variance follow from the model assumptions. • Under the assumption of normality, e_i is itself normally distributed.
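For reference, the standard results under the usual regression assumptions can be written as below. The symbols σ² (error variance) and h_ii (leverage of observation i, the i-th diagonal element of the hat matrix) are notation assumed here, not taken from the slides.

```latex
e_i = Y_i - \hat{Y}_i, \qquad
E(e_i) = 0, \qquad
\operatorname{Var}(e_i) = \sigma^2 (1 - h_{ii}), \qquad
e_i \sim N\bigl(0,\ \sigma^2 (1 - h_{ii})\bigr) \ \text{under normality.}
```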
Type of residuals • Standardized residuals: the residuals after standardization (zero mean and variance close to 1). • They help identify unusually large residuals.
Type of residuals • Outlier: an observation with a large residual. • The criterion is subjective. The most common one treats an observation as an outlier when the absolute value of its standardized residual is greater than 2. • The larger the standardized residual, the more unusual the observation.
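The rule of thumb above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure. It assumes a single predictor for brevity, uses the common convention r_i = e_i / (s_R · sqrt(1 − h_ii)) for the standardized residual, and runs on made-up data; the function name `standardized_residuals` is hypothetical.

```python
import math

# Hypothetical data: roughly y = 2x + 1, with one aberrant value at x = 6.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 19.0, 15.1, 16.8, 19.2, 20.9]

def standardized_residuals(x, y):
    """Return r_i = e_i / (s_R * sqrt(1 - h_ii)) for a one-predictor fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx                                      # slope
    b0 = my - b1 * mx                                   # intercept
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]   # ordinary residuals
    s2 = sum(ei ** 2 for ei in e) / (n - 2)             # residual variance
    h = [1 / n + (xi - mx) ** 2 / sxx for xi in x]      # leverages h_ii
    return [ei / math.sqrt(s2 * (1 - hi)) for ei, hi in zip(e, h)]

r = standardized_residuals(x, y)
# Indices whose standardized residual exceeds 2 in absolute value.
outliers = [i for i, ri in enumerate(r) if abs(ri) > 2]
print(outliers)
```

With this data only the observation at index 5 (x = 6) is flagged. The threshold of 2 is the conventional cut-off mentioned above and can be adjusted.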
Type of residuals • Outliers are important because including or excluding them from the sample can greatly change the estimated regression line. • Raw scores with high standardized residuals need to be examined. Outliers can arise for many reasons. Some of them are: • The observed point is an error (in measurement, in data transcription, etc.), but the fitted model is adequate. • The observed point is correct but the model fit is not, for one of several possible reasons: • The relationship between the two variables is linear over a certain range but not at the point where the observation lies. • There is strong heteroscedasticity, with some observations departing from the general pattern. • There is a classification variable that has not been taken into account.
Type of residuals • Studentized residual: calculated in the same way as the standardized residual, but with the residual variance (s_R) estimated from the whole sample except the observation under study. • This removes the dependence between numerator and denominator.
Type of residuals • If n is large, the standardized and studentized residuals take similar values. • Under the normality hypothesis, t_i follows a t distribution with n − 3 degrees of freedom when there is a single predictor (n − k − 2 in general, with k predictors).
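As a rough sketch of the studentized residual, the code below refits the line without each observation in turn to get the leave-one-out residual variance. It assumes a single predictor, the convention t_i = e_i / (s_R(i) · sqrt(1 − h_ii)), and made-up data; the helper names are hypothetical, and the brute-force refit is chosen for clarity rather than efficiency.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def residual_variance(x, y):
    """Residual variance of the fitted line: SSE / (n - 2)."""
    b0, b1 = fit_line(x, y)
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return sse / (len(x) - 2)

def studentized_residuals(x, y):
    """t_i uses the residual variance estimated without observation i."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)          # fit on the whole sample
    t = []
    for i in range(n):
        e_i = y[i] - (b0 + b1 * x[i])                  # ordinary residual
        h_i = 1 / n + (x[i] - mx) ** 2 / sxx           # leverage
        s2_i = residual_variance(x[:i] + x[i + 1:],    # variance without i
                                 y[:i] + y[i + 1:])
        t.append(e_i / math.sqrt(s2_i * (1 - h_i)))
    return t

# Hypothetical data: roughly y = 2x + 1, with one aberrant value at x = 6.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 19.0, 15.1, 16.8, 19.2, 20.9]
t = studentized_residuals(x, y)
print([round(v, 2) for v in t])
```

Because the aberrant point no longer inflates its own variance estimate, its studentized residual comes out far larger than its standardized one, while the other observations change little.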
Type of residuals • Deleted (eliminated) residuals: • The difference between the observed value of the response and the prediction obtained when the whole sample is used except the observation under study. • If the observation has a large influence on the calculation of the regression line, the ordinary and deleted residuals differ; otherwise, the two values are similar.
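The contrast between ordinary and deleted residuals can be illustrated with a small sketch. It assumes a single predictor and made-up data in which the last point lies far from the others on the x axis; the function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def ordinary_and_deleted(x, y):
    """Return (ordinary residuals, deleted residuals)."""
    b0, b1 = fit_line(x, y)
    ordinary = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    deleted = []
    for i in range(len(x)):
        # Refit without observation i, then predict observation i.
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        deleted.append(y[i] - (c0 + c1 * x[i]))
    return ordinary, deleted

# Hypothetical data: the last point is far from the rest in x,
# so it pulls the fitted line towards itself.
x = [1, 2, 3, 4, 5, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 25.0]
ordinary, deleted = ordinary_and_deleted(x, y)
for i, (e, d) in enumerate(zip(ordinary, deleted)):
    print(i, round(e, 3), round(d, 3))
```

For the last observation the ordinary residual is small, because the line has been drawn towards that point, while the deleted residual is large. For the remaining observations the two values stay much closer to each other.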
Graphics of residuals • The box plot and the histogram of the standardized residuals provide information about their distribution. • If the sample size is small, a dot plot or a stem-and-leaf plot is used instead of the histogram of residuals; they are interpreted in the same way.
Graphics of residuals • [Figure: plot of the residuals] This pattern implies the existence of a hidden variable.
Graphics of residuals Dot-plot of a group of residuals.
Graphics of residuals-predictions • [Figure: residuals plotted against predictions] No problem is detected.
Graphics of residuals-predictions • [Figure: residuals plotted against predictions] The linear fit is not adequate.
Graphics of residuals-predictions • [Figure: residuals plotted against predictions] The linear fit was calculated incorrectly.
Graphics of residuals-predictions • [Figure: residuals plotted against predictions] There is heteroscedasticity.
Graphics of residuals-predictions • [Figure: residuals plotted against predictions] Non-linear fit and heteroscedasticity.
Graphics of residuals-predictions • [Figure: residuals plotted against predictions] There are some outliers.
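The residuals-predictions plots are read by eye, but two of the patterns can be roughly approximated with numbers. The sketch below is an assumption of this rewrite, not a method from the slides. It uses a single predictor and made-up data. A correlation between the residuals and the squared centred predictions stands in for a curved pattern, and a correlation between the absolute residuals and the predictions stands in for a fan-shaped (heteroscedastic) pattern. All function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

def residual_diagnostics(x, y):
    """Return (curvature, spread) indicators for a fitted line."""
    b0, b1 = fit_line(x, y)
    pred = [b0 + b1 * xi for xi in x]
    res = [yi - pi for yi, pi in zip(y, pred)]
    mp = sum(pred) / len(pred)
    # Curved pattern: residuals track the squared centred predictions.
    curvature = pearson(res, [(p - mp) ** 2 for p in pred])
    # Fan-shaped pattern: residual size grows with the prediction.
    spread = pearson([abs(e) for e in res], pred)
    return curvature, spread

# Hypothetical data 1: a quadratic relationship fitted with a line.
xq = list(range(11))
yq = [xi ** 2 for xi in xq]
curv_q, spread_q = residual_diagnostics(xq, yq)

# Hypothetical data 2: y = 2x with errors whose size grows with x.
xh = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
yh = [2.2, 3.6, 6.6, 7.2, 11.0, 10.8, 15.4, 14.4, 19.8, 18.0]
curv_h, spread_h = residual_diagnostics(xh, yh)

print(round(curv_q, 3), round(spread_q, 3))
print(round(curv_h, 3), round(spread_h, 3))
```

In the first data set the curvature indicator is close to 1 and the spread indicator close to 0, which points to an inadequate linear fit. In the second the spread indicator is high, which points to heteroscedasticity. These numbers only complement the plots; they do not replace looking at them.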