
SESSION 49 - 52

SESSION 49 - 52. Last Update 17th June 2011. Regression. Learning Objectives: XY-Scatter Diagrams, Plotting the Regression Line, Coefficient Estimates, Pearson Coefficient of Correlation, Spearman Rank Correlation Coefficient.

jagger

Presentation Transcript


  1. SESSION 49 - 52 Last Update 17th June 2011 Regression

  2. Learning Objectives • XY-Scatter Diagrams • Plotting the Regression Line • Coefficient Estimates • Pearson Coefficient of Correlation • Spearman Rank Correlation Coefficient

  3. XY-Scatter Diagram To draw a scatter diagram we need data for two variables. In applications where one variable depends to some degree on the other variable, the dependent variable is labeled Y and the other, called the independent variable, X. The values for X and Y are combined into a single data point using the observations for X and Y as coordinates.
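As a minimal sketch of how observations become data points, the snippet below pairs X and Y values into coordinates. The numbers are hypothetical and are not the temperature-truck data from the slides.

```python
# Hypothetical observations (illustration only, not the slide data).
x = [18, 22, 25, 30]  # independent variable X
y = [8, 10, 13, 16]   # dependent variable Y

# Each observation becomes one (x, y) coordinate of the scatter diagram.
points = list(zip(x, y))
print(points)
```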

  4. Example Temperature - Truck

  5. Regression Analysis Regression analysis is used to predict the value of one variable on the basis of other variables. The first-order linear model describes the relationship between the dependent variable Y and the independent variable X. The regression model with a as the y-intercept and b as the slope coefficient is of the form: y = a + bx

  6. Example Temperature - Truck The estimators of the intercept a and slope coefficient b are based on drawing a straight line through the sample data:

  7. Intercept and Slope The intercept a is the y-coordinate of the point where the linear function intersects the y-axis. The slope coefficient b is defined as the change in y for a unit change in x.
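A short sketch of these two definitions, using the coefficient values quoted later in the deck for the temperature-truck example (a = -3.914, b = 0.654). The function name `line` is an illustrative assumption.

```python
A = -3.914  # intercept quoted in the deck
B = 0.654   # slope coefficient quoted in the deck

def line(x, a=A, b=B):
    """Value of the linear function y = a + b*x."""
    return a + b * x

# The intercept is the y-value where the line crosses the y-axis (x = 0).
intercept = line(0)

# The slope is the change in y for a one-unit change in x, at any x.
change_per_unit = line(11) - line(10)

print(intercept, change_per_unit)
```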

  8. Fitted Line With Residuals The line drawn through the points is called the regression line.

  9. Residuals Squared The regression or least squares line is the line that minimizes the sum of the squared differences between the points and the line.
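The minimizing property can be checked numerically. The sketch below uses a small hypothetical data set (not the slide data) whose least squares line works out to a = 2.2, b = 0.6, and compares its sum of squared residuals with that of two nearby lines.

```python
# Hypothetical data (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def sse(a, b):
    """Sum of squared vertical distances between the points and y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

best = sse(2.2, 0.6)          # least squares line for this data
steeper = sse(2.2, 0.7)       # same intercept, slightly larger slope
lower = sse(2.0, 0.6)         # same slope, slightly smaller intercept

print(best, steeper, lower)
```

Any other choice of a and b gives a sum at least as large as the one for the least squares line.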

  10. Calculating Coefficients Raw Data (y-variable as dependent and x as independent variable):

  11. Solution Step 1: Calculate the gradient (beta):
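The formula on this slide was not captured in the transcript. One standard form of the least squares slope is given here, written with the symbol b used on the other slides; n denotes the number of observations, and the index notation is an assumption rather than the slide's own.

```latex
b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}
```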

  12. Solution Step 2: Calculate the intercept (alpha):
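The slide's own working is not in the transcript. The sketch below walks through Steps 1 and 2 with the standard raw-sum formulas, b = (n Σxy − Σx Σy) / (n Σx² − (Σx)²) and a = (Σy − b Σx) / n. The data are hypothetical, not the raw data from the slides.

```python
# Hypothetical raw data (y dependent, x independent); illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Step 1: gradient (slope) from the raw sums.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Step 2: intercept, equivalent to mean(y) - b * mean(x).
a = (sum_y - b * sum_x) / n

print(b, a)
```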

  13. Interpreting the Coefficients The slope coefficient b may be interpreted as the change in the dependent variable y for a one-unit change in x. In the previous example, a one-unit increase in temperature results in b = 0.654 additional truckloads of cool drinks sold. The intercept a is the point at which the regression line and the y-axis intersect. If x = 0 lies far outside the range of sample values of x, the interpretation of the intercept is not straightforward. In the temperature-truck example, x = 0 lies outside the range between the smallest and largest values of x in the sample. Taking the intercept literally would imply that at a temperature of x = 0, the soft-drink sales decline to negative 3.914!

  14. Point Prediction Upon obtaining the coefficient estimates we can predict the outcome for various x (point prediction) between the minimum and maximum sample observations using the regression function y = a + bx. For example:
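The slide's own example is not in the transcript. The sketch below uses the coefficients quoted in the deck (a = -3.914, b = 0.654); the sample range of 15 to 35 and the input temperature of 20 are hypothetical values chosen only for illustration.

```python
A = -3.914  # intercept quoted in the deck
B = 0.654   # slope coefficient quoted in the deck

# Hypothetical sample range of x; the real minimum and maximum are not in the transcript.
X_MIN, X_MAX = 15, 35

def predict(x, x_min=X_MIN, x_max=X_MAX):
    """Point prediction y = a + b*x, restricted to the observed range of x."""
    if not x_min <= x <= x_max:
        raise ValueError("x lies outside the sample range; prediction not supported")
    return A + B * x

prediction = predict(20)
print(round(prediction, 3))
```

Restricting x to the sample range reflects the caution about interpreting the line far outside the observed data, such as at x = 0.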

  15. Pearson Coefficient of Correlation The Pearson coefficient of correlation R may be used to test for linear association between variables. The coefficient is useful to determine whether or not a linear relationship exists between y and x. Note that variables may be positively or negatively correlated. R = 1 denotes perfect positive correlation, R = -1 signifies perfect negative correlation. R is defined for:

  16. Type of Relationship • Direct linear relationship (small or wide dispersion): positive linear correlation exists, 0 < r < +1 • Inverse linear relationship (small or wide dispersion): negative linear correlation exists, -1 < r < 0 • No linear relationship: no correlation, r = 0
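A minimal sketch of the three cases, computing the Pearson coefficient with the standard raw-sum formula. The function name `pearson_r` and all data sets are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson coefficient of correlation from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

xs = [1, 2, 3, 4, 5]
r_direct = pearson_r(xs, [2, 4, 5, 4, 5])    # y tends to rise with x
r_inverse = pearson_r(xs, [5, 4, 5, 4, 2])   # y tends to fall with x
r_none = pearson_r(xs, [1, 2, 3, 2, 1])      # no linear trend
r_perfect = pearson_r(xs, [3, 5, 7, 9, 11])  # points exactly on a rising line

print(r_direct, r_inverse, r_none, r_perfect)
```

Note that the last hypothetical data set with r = 0 still has a clear (non-linear) pattern; r only measures linear association.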

  17. Coefficient of Determination Squaring the Pearson coefficient of correlation delivers the coefficient of determination R2 in regression. It may be interpreted as the proportion of variation in the dependent variable y that is explained by the variation in the explanatory variable x. R2 is a measure of strength of the linear relationship between y and x.

  18. Solution Step 3: Calculate R and R2
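The slide's working is not in the transcript. The sketch below, on hypothetical data, computes R with the raw-sum formula and then obtains R2 in two ways: by squaring R, and as the explained share of variation, 1 − SSE/SST, from the fitted line. For simple linear regression the two agree.

```python
from math import sqrt

# Hypothetical data (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Pearson coefficient of correlation R.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Least squares line, needed for the residuals.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

mean_y = sum_y / n
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
sst = sum((y - mean_y) ** 2 for y in ys)                   # total variation

r2_from_r = r ** 2
r2_from_fit = 1 - sse / sst

print(r, r2_from_r, r2_from_fit)
```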

  19. Spearman Rank Correlation The standard coefficient of correlation allows for determining whether there is evidence of a linear relationship between two interval variables. When the variables are ordinal, or when both variables are interval but the normality requirement is not satisfied, a nonparametric test statistic called the Spearman Rank Correlation Coefficient may be used instead.

  20. Objective: Comparing 2 Variables Analyzing the relationship between two variables. Data type? • Interval: check the population distribution. If the error is normal, or x and y are bivariate normal, use simple linear regression; if x and y are not bivariate normal, use Spearman Rank Correlation • Ordinal: Spearman Rank Correlation • Nominal: Chi-Square test of a contingency table
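The decision flow can be sketched as a small helper. This assumes the first data-type branch is interval data, as the Spearman slide suggests; the function name `choose_method` and its parameters are hypothetical.

```python
def choose_method(data_type, bivariate_normal=None):
    """Pick a technique for analyzing the relationship between two variables.

    data_type: "interval", "ordinal", or "nominal".
    bivariate_normal: for interval data only, whether the normality
    requirement (normal error, or x and y bivariate normal) is satisfied.
    """
    kind = data_type.lower()
    if kind == "nominal":
        return "Chi-square test of a contingency table"
    if kind == "ordinal":
        return "Spearman rank correlation"
    if kind == "interval":
        if bivariate_normal is None:
            raise ValueError("interval data needs a normality assessment")
        if bivariate_normal:
            return "Simple linear regression"
        return "Spearman rank correlation"
    raise ValueError(f"unknown data type: {data_type!r}")

print(choose_method("interval", bivariate_normal=True))
print(choose_method("ordinal"))
```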

  21. Example The list below shows organizational strengths that were ranked independently by management and by staff. The managing director wished to know how closely the two sets of assessments were correlated:

  22. Calculating RS
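The slide's table and working are not in the transcript. The sketch below uses the common shortcut formula RS = 1 − 6 Σd² / (n(n² − 1)), where d is the difference between the two ranks of each item; this form assumes there are no tied ranks. The rankings and the function name `spearman_rs` are hypothetical.

```python
def spearman_rs(rank_a, rank_b):
    """Spearman rank correlation from two rankings without ties."""
    if len(rank_a) != len(rank_b):
        raise ValueError("both rankings must cover the same items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("at least two ranked items are required")
    d_squared = sum((ra - rb) ** 2 for ra, rb in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical rankings of five strengths (1 = most important).
management = [1, 2, 3, 4, 5]
staff = [2, 1, 4, 3, 5]

rs = spearman_rs(management, staff)
print(rs)
```

Identical rankings give RS = 1 and exactly reversed rankings give RS = -1.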
