Explore linear regression, correlation, outliers, and the coefficient of determination (R²). Learn applications, interpretations, and warnings with examples and explanations.
Linear Regression MARE 250 Dr. Jason Turner
Linear Regression
Linear regression investigates and models the linear relationship between a response (Y) and one or more predictors (X). Both the response and the predictors are continuous variables (“Responses”).
Linear regression analysis is used to:
- determine how the response variable changes as a particular predictor variable changes
- predict the value of the response variable for any value of the predictor variable
Regression vs. Correlation
Linear regression investigates and models the linear relationship between a response (Y) and one or more predictors (X). Both the response and the predictors are continuous variables (“Responses”).
Correlation coefficient (Pearson): measures the extent of a linear relationship between two continuous variables (“Responses”).
When Regression vs. Correlation?
Linear regression: used to predict relationships, extrapolate data, and quantify the change in one variable relative to the other; the relationship is directional.
Correlation coefficient (Pearson): used to determine whether there is a relationship or not.
If regression, then it matters which variable is the response (Y) and which is the predictor (X):
Y: dependent variable; X: independent variable.
X is assumed to cause change in Y (the Y outcome is dependent upon X).
Y does not cause change in X (X is independent).
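The asymmetry described above can be checked numerically. The sketch below is only an illustration: the `coral_pct` and `urchins` values are made-up numbers, and the helper names `pearson_r` and `slope` are hypothetical, not part of the course material. It shows that Pearson's r is the same whichever variable comes first, while the regression slope depends on which variable is treated as the response.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)

def slope(predictor, response):
    """Least-squares slope of `response` regressed on `predictor`."""
    n = len(predictor)
    mean_p, mean_r = sum(predictor) / n, sum(response) / n
    s_pr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    s_pp = sum((p - mean_p) ** 2 for p in predictor)
    return s_pr / s_pp

# Hypothetical data, invented for illustration only.
coral_pct = [10.0, 20.0, 30.0, 40.0, 50.0]
urchins = [12.0, 15.0, 21.0, 24.0, 28.0]

# Correlation is symmetric: the order of the variables does not matter.
r_xy = pearson_r(coral_pct, urchins)
r_yx = pearson_r(urchins, coral_pct)

# Regression is not: swapping response and predictor changes the slope.
slope_y_on_x = slope(coral_pct, urchins)
slope_x_on_y = slope(urchins, coral_pct)
```

Swapping the arguments leaves r unchanged but gives a different slope, which is why the choice of response and predictor matters only for regression.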
Linear Regression
Regression provides a line that "best" fits the data (from response and predictor).
The least-squares criterion (the method used to draw this "best" line) requires that the best-fitting regression line is the one with the smallest sum of squared error terms (the vertical distances of the points from the line).
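A minimal sketch of the least-squares criterion, assuming a single predictor and the standard closed-form solution for the intercept and slope. The data values and the function names `fit_line` and `sum_squared_error` are hypothetical. The code fits the line, then confirms that nudging the intercept or slope away from the fitted values always gives a larger sum of squared errors.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def sum_squared_error(x, y, b0, b1):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, invented for illustration only.
coral_pct = [10.0, 20.0, 30.0, 40.0, 50.0]
urchins = [12.0, 15.0, 21.0, 24.0, 28.0]

b0, b1 = fit_line(coral_pct, urchins)
best_sse = sum_squared_error(coral_pct, urchins, b0, b1)

# Try eight nearby lines; each should fit worse than the least-squares line.
nudged_sse = [
    sum_squared_error(coral_pct, urchins, b0 + d0, b1 + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.05, 0.0, 0.05)
    if (d0, d1) != (0.0, 0.0)
]
fitted_line_is_best = all(best_sse < s for s in nudged_sse)
```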
Linear Regression
The R² and adjusted R² values represent the proportion of variation in the response data explained by the predictors.
Adjusted R² is a modified R² that has been adjusted for the number of terms in the model. If you include unnecessary terms, R² can be artificially high.
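The slide does not give a formula for the adjustment. The sketch below assumes the commonly used definition, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The function name `adjusted_r2` and the R² values fed to it are hypothetical, chosen only to show the penalty for extra terms.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 from R^2 (as a proportion), sample size, and predictor count."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical scenario: 20 observations.
# Model A uses 1 predictor and reaches R^2 = 0.80.
# Model B adds 4 unnecessary predictors and R^2 creeps up to 0.81.
adj_a = adjusted_r2(0.80, n_obs=20, n_predictors=1)
adj_b = adjusted_r2(0.81, n_obs=20, n_predictors=5)

# Raw R^2 favours model B, but adjusted R^2 favours the simpler model A.
simpler_model_preferred = adj_a > adj_b
```

In this made-up case the raw R² rises when terms are added while the adjusted value falls, which is the behaviour the slide warns about.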
Linear Regression
y = b₀ + b₁x
y = dependent variable
b₀ and b₁ are constants
b₀ = y-intercept
b₁ = slope
x = independent variable
Example: Urchin density = b₀ + b₁(% coral)
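To make the roles of the intercept and slope concrete, here is a small sketch of the urchin equation as a function. The coefficient values `B0` and `B1` are placeholders invented for illustration, not estimates from the course data, and `predict_urchin_density` is a hypothetical name.

```python
# Placeholder coefficients, assumed for illustration only.
B0 = 7.7   # y-intercept: predicted urchin density at 0% coral
B1 = 0.41  # slope: change in urchin density per 1-point rise in % coral

def predict_urchin_density(pct_coral):
    """Evaluate urchin density = b0 + b1 * (% coral)."""
    return B0 + B1 * pct_coral

# At x = 0 the prediction equals the intercept.
density_at_zero = predict_urchin_density(0.0)

# Raising x by one unit changes the prediction by exactly the slope.
change_per_unit = predict_urchin_density(31.0) - predict_urchin_density(30.0)
```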
Effects of Outliers
Outliers may be influential observations.
An influential observation is a data point whose removal causes the regression equation (line) to change considerably.
Consider removing it, much as you would an outlier.
If there is no explanation for the point, the decision is up to the researcher.
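One simple way to look for an influential observation, sketched here under the assumption of a single predictor, is to refit the line with each point left out in turn and see how far the slope moves. The data are invented (the last point is deliberately unusual), and the helper name `slope` is hypothetical.

```python
def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx

# Hypothetical data; the final point (90, 5) is far from the rest.
coral_pct = [10.0, 20.0, 30.0, 40.0, 50.0, 90.0]
urchins = [12.0, 15.0, 21.0, 24.0, 28.0, 5.0]

slope_all = slope(coral_pct, urchins)

# Leave each point out in turn and record how much the slope changes.
slope_changes = []
for i in range(len(coral_pct)):
    x_rest = coral_pct[:i] + coral_pct[i + 1:]
    y_rest = urchins[:i] + urchins[i + 1:]
    slope_changes.append(abs(slope(x_rest, y_rest) - slope_all))

# Index of the point whose removal moves the slope the most.
most_influential = max(range(len(slope_changes)), key=slope_changes.__getitem__)

# Slope after dropping that single point.
slope_without = slope(
    coral_pct[:most_influential] + coral_pct[most_influential + 1:],
    urchins[:most_influential] + urchins[most_influential + 1:],
)
```

With these numbers, dropping the single unusual point flips the slope from negative to positive. Whether to remove such a point remains a judgment call, as the slide says.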
Warning on Regression
Regression is based upon the assumption that the data points are scattered about a straight line.
What can we do to determine whether a regression is warranted?
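The slide leaves this question open. One widely used check, offered here as a suggestion and not as the slide's own answer, is to fit the line and inspect the residuals: if the points really scatter about a straight line, the residual signs should show no systematic pattern. The sketch uses invented data that follow a curve (y = x²), and the names are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return mean_y - b1 * mean_x, b1

# Hypothetical curved data: y = x squared, which is not a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [xi ** 2 for xi in x]

b0, b1 = fit_line(x, y)

# Residual = observed value minus value predicted by the fitted line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# A run of same-signed residuals in the middle, flanked by the opposite
# sign at both ends, suggests curvature that a straight line cannot capture.
residual_signs = ["+" if r > 0 else "-" for r in residuals]
```

Here the residuals are positive at both ends and negative in the middle, a pattern that hints a straight-line model is not appropriate for these data. A scatterplot of the raw data would show the same thing visually.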
Coefficient of Determination (R²)
Coefficient of determination (R²): the proportion of the total variability in the response(s) that is attributable to dependence on all of the factors (predictors) in the model.
R² is used for assessing the “goodness of fit” of a regression model.
Coefficient of Determination (R²)
Use adjusted R², as it is a more conservative measure.
R² values range from 0 to 100%. An R² of 100% means that all of the variability in the data can be explained by the model.
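As a rough illustration of "proportion of variability explained", R² can be computed as 1 − SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squares about the mean of the response. The sketch below assumes a single predictor; the data and the function name `r_squared_percent` are hypothetical.

```python
def r_squared_percent(x, y):
    """R^2 of the least-squares line of y on x, expressed as a percentage."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - mean_y) ** 2 for yi in y)                      # total
    return 100.0 * (1.0 - sse / sst)

# Hypothetical data with some scatter about the line.
partial_fit = r_squared_percent(
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [12.0, 15.0, 21.0, 24.0, 28.0],
)

# Points lying exactly on y = 1 + 2x: the model explains all the variability.
perfect_fit = r_squared_percent([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
```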
Coefficient Relationships
In simple linear regression (one predictor), the coefficient of determination (r²) is the square of the linear correlation coefficient (r).
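This relationship can be verified numerically for a single-predictor model. The sketch below computes Pearson's r directly, computes R² separately as 1 − SSE/SST from the fitted line, and compares the two. The data are invented and the variable names are hypothetical.

```python
from math import sqrt

# Hypothetical data, invented for illustration only.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [12.0, 15.0, 21.0, 24.0, 28.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

# Linear correlation coefficient (Pearson's r).
r = s_xy / sqrt(s_xx * s_yy)

# Coefficient of determination from the fitted line: 1 - SSE / SST.
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1.0 - sse / s_yy

# The two routes agree up to floating-point rounding.
difference = abs(r ** 2 - r_squared)
```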
In Lab… Regression Analysis: _ Urchins versus % Coral