Simple Linear Regression
If it’s simple, how bad can it be?
Linear Functions
• Does the slope-intercept form of a line look familiar?
• We use something very similar: $y = a + bx$
• Where:
  • b is the slope
  • a is the y-intercept
  • x is the independent (predictor) variable
  • y is the dependent (response) variable
Regression Line
• Use the regression line to make a prediction about the dependent variable
• $\hat{y}$, which is typically called “y-hat,” means an estimate of y
• For example, x might be the ACT score of entering freshmen
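A prediction from the regression line is just the intercept plus the slope times the predictor. A minimal sketch, assuming made-up coefficients (the values of `a` and `b` below are hypothetical and do not come from the slides):

```python
def predict(a: float, b: float, x: float) -> float:
    """Return y-hat = a + b * x, the estimate of y for a given x."""
    return a + b * x


# Hypothetical intercept and slope, chosen only for illustration.
a, b = 1.0, 0.1

# Estimated response for a predictor value of 24.
y_hat = predict(a, b, 24)
print(f"y-hat = {y_hat:.2f}")
```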
Regression Example
• Made-up example comparing cigarette smoking to health issues
• Y = number of health problems experienced by people between the ages of 65 and 70
• X = number of packs of cigarettes smoked per day between the ages of 20 and 50
• Our goal is to create an equation that will help us predict the value of y from the value of x
Source: http://www.hippocampus.org/course_locator?skinPath=http://www.hippocampus.org/hippocampus.skins/default&course=Statistics%20for%20Social%20Sciences&lesson=18&topic=3&topicTitle=Regression%20Examples
Correlations • http://www.duxbury.com/authors/mcclellandg/tiein/johnson/correlation.htm • http://www.rossmanchance.com/applets/guesscorrelation/GuessCorrelation.html
Least Squares Method
• Once again, we’re back to variances
• Recall: variance squares the deviation from the mean
• The line with the smallest sum of squared distances from the data points is our least squares line
• How do we do that?
• We have values for y and x; we need a and b:
  • $b = r \, \dfrac{s_y}{s_x}$
  • $a = \bar{y} - b\,\bar{x}$
• Where r is the correlation coefficient for the line, and $s_x$ and $s_y$ are the standard deviations of x and y
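The calculation of a and b can be sketched in a few lines. This is an illustration only: it assumes the standard textbook formulas (slope = r times the ratio of the standard deviations, intercept = mean of y minus slope times mean of x), and the data set is hypothetical, not the cigarette data from the slides. The function name `least_squares_fit` is likewise made up for this example.

```python
from statistics import mean, stdev


def least_squares_fit(xs, ys):
    """Return (a, b, r) for the line y-hat = a + b * x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

    # Pearson correlation coefficient r.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = cross / ((n - 1) * s_x * s_y)

    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b, r


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = least_squares_fit(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x  (r = {r:.3f})")
```

For this small data set the slope works out to 0.6 and the intercept to 2.2, which can be checked by hand from the means (3 and 4) and the sums of squares.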
Regression Equation
• Substituting the values we just calculated, our regression equation is:
• If we enter a value for x into this equation, we will get a predicted value for y
Least Square Error
• The formula gives us an approximation for the value of the dependent variable
• The actual value of y is obtained by measurement
• If the regression line were perfect, every actual value of y would equal its predicted value: $y = \hat{y}$
Least Square Error
• The deviation of the actual score from the predicted score is the error term: $e = y - \hat{y}$
Cigarette Study
• Sum of errors = 0
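The errors around a least squares line always sum to zero (up to floating-point rounding), because positive and negative deviations cancel. A minimal sketch on hypothetical data (not the cigarette study values, which are not reproduced here), using the equivalent sums-of-squares form of the slope:

```python
from statistics import mean

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)

# Least squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Error term for each observation: actual minus predicted.
errors = [y - (a + b * x) for x, y in zip(xs, ys)]

print("errors:", [round(e, 2) for e in errors])
print("sum of errors:", round(sum(errors), 10))
```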
Error Variance
• Sum of squared errors = 108.571
• Error variance = sum of squared errors / df = 108.571 / 5 ≈ 21.7
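The error variance calculation can be sketched as follows. The degrees of freedom are assumed to be n − 2 (two estimated parameters, a and b), which is the usual convention for simple regression. The function name `error_variance` and the small data set are hypothetical. Only the final lines use the slide's own figures.

```python
def error_variance(ys, y_hats, n_params=2):
    """Sum of squared errors divided by degrees of freedom (n - n_params)."""
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    df = len(ys) - n_params
    return sse / df


# Hypothetical actual and predicted values, for illustration only.
ys = [2, 4, 5, 4, 5]
y_hats = [2.8, 3.4, 4.0, 4.6, 5.2]

print("error variance:", round(error_variance(ys, y_hats), 3))

# The slide's arithmetic: sum of squared errors 108.571 with df = 5.
slide_variance = 108.571 / 5
print("slide value:", round(slide_variance, 1))
```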