Simple Linear Regression

Simple Linear Regression. Lecture for Statistics 509 November-December 2000. Correlation and Regression. Study of association and/or relationship between variables.

Presentation Transcript


  1. Simple Linear Regression Lecture for Statistics 509 November-December 2000

  2. Correlation and Regression
     • Study of association and/or relationship between variables.
     • Useful for determining the effect of changes in one variable (called the independent or control variable) on another variable (called the dependent or response variable).
     • Regression models could be utilized to determine optimal operating conditions [these conditions specified by the control variables] in order to achieve a certain specified value or yield on the response variable.
     • Regression models could also be utilized to predict the value of the response given a value of the independent variable, or could be used for “calibrating” the value of the independent variable to achieve a certain response.
     Stat 509 - Regression Lecture

  3. Some Examples
     • Control variable is X = average speed of a car, while the response variable is Y = fuel efficiency of the car. Goal is to determine the speed that optimizes the efficiency of the car.
     • Control variable is X = temperature, while the response variable is Y = yield in a chemical reaction.
     • Control variable is X = amount of fertilizer applied on a plant, while the response variable is Y = yield of this plant.
     • Control variable is X = thickness of a stack of bond paper, while the response variable is Y = number of sheets in this stack.
     • Control variable is X = average time of studying, while the response variable is Y = GPA.

  4. Population Model
     • Each member of the population will have a value for the independent variable X and the response variable Y, usually represented by the vector (X, Y).
     • For a given value X = x, the variable Y has a certain distribution whose conditional mean is $\mu(x)$ and whose conditional variance is $\sigma^2(x)$.
     • This could be visualized as follows: When you consider the subpopulation consisting of units whose values of X equal x, then their Y-values have a certain distribution whose mean is $\mu(x)$ and whose variance is $\sigma^2(x)$. When you pick a unit from this subpopulation, then the Y-value that you will observe is governed by this particular distribution.
     • In particular, this observation could be expressed via $Y = \mu(x) + \varepsilon$, where $\varepsilon$ is some “error term.”

  5. Assumptions for Simple Linear Regression
     • $\mu(x) = E(Y \mid X = x) = \alpha + \beta x$. This means that the mean of Y, given X = x, is a linear function of x.
     • $\beta$ is called the regression coefficient or the slope of the regression line; $\alpha$ is the y-intercept.
     • $\sigma^2(x) = \sigma^2$ does not depend on x. This is the assumption of “equal variances” or homoscedasticity.
     • Furthermore, for the sample data $(x_1, Y_1), (x_2, Y_2), \ldots, (x_n, Y_n)$: $Y_1, Y_2, \ldots, Y_n$ are independent observations, and their conditional distributions are all normal.
     • In shorthand notation: $Y_i = \mu(x_i) + \varepsilon_i = \alpha + \beta x_i + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent and identically distributed (IID) $N(0, \sigma^2)$.
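
The assumptions on this slide can be made concrete by simulating data that satisfy them. The following is a minimal sketch, not part of the lecture: the function name `simulate_responses` and the parameter values (intercept 2.0, slope 0.5, error standard deviation 1.0) are hypothetical and chosen only for illustration. Each response is the linear mean at its x plus an independent normal error with the same variance at every x.

```python
import random


def simulate_responses(xs, alpha, beta, sigma, seed=None):
    """Draw one response per x from Y = alpha + beta * x + error.

    The errors are independent N(0, sigma^2) draws, so the variance is the
    same at every x (homoscedasticity), as the model assumes.
    """
    rng = random.Random(seed)  # private generator so a seed gives repeatable draws
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]


# Hypothetical settings, chosen only for illustration (not from the lecture).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = simulate_responses(xs, alpha=2.0, beta=0.5, sigma=1.0, seed=509)

for x, y in zip(xs, ys):
    print(f"x = {x:.1f}  simulated Y = {y:.3f}")
```

Setting `sigma=0.0` removes the error term, so the simulated responses fall exactly on the line; this is a quick way to check that the mean function is coded as intended.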

  6. Regression Problem
     Given the sample (bivariate) data $(x_1, Y_1), (x_2, Y_2), \ldots, (x_n, Y_n)$, satisfying the linear regression model $Y_i = \alpha + \beta x_i + \varepsilon_i$ with $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ IID $N(0, \sigma^2)$, we would like to address the following questions:
     • How should the data be summarized graphically?
     • What are the estimators of the parameters $\alpha$, $\beta$, and $\sigma^2$?
     • What will be an estimate of the prediction line?
     • How do we test whether the fitted regression model is a significant model?
     • What are the properties of the estimators of the model parameters?
     • How do we construct CIs or test hypotheses concerning parameters?
     • How do we perform prediction using the prediction model?
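
Two of these questions, the estimators of the parameters and the estimated prediction line, have closed-form answers under least squares. The sketch below uses the standard textbook formulas, not necessarily the notation of the later slides: the slope estimate is Sxy / Sxx, the intercept estimate is the mean of y minus the slope times the mean of x, and the error variance is estimated by SSE / (n - 2). The function names and the small data set are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of the intercept, slope, and error variance."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Corrected sum of squares of x and corrected sum of cross products.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("the x values must not all be equal")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual (error) sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return {"intercept": intercept, "slope": slope, "sigma2_hat": sse / (n - 2)}


def predict(fit, x):
    """Point prediction of the response at x from the fitted line."""
    return fit["intercept"] + fit["slope"] * x


# Made-up data for illustration only (not the lecture's data set).
fit = fit_simple_linear_regression([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
print(fit)
print(predict(fit, 4.0))
```

The divisor n - 2 rather than n reflects the two parameters (intercept and slope) estimated from the data before the residuals are formed.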

  7. Illustrative Example: On Plasma Etching
     • Plasma etching is essential to the fine-line pattern transfer in current semiconductor processes. The paper “Ion Beam-Assisted Etching of Aluminum with Chlorine” in J. Electrochem. Soc. (1985) gives the data below on chlorine flow (x, in SCCM) through a nozzle used in the etching mechanism, and etch rate (y, in 100 Å/min).

  8. The Scatterplot

  9. Least-Squares Prediction Line

  12. Analysis of Variance Table
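
The table on this slide is not reproduced in the transcript. As a rough sketch of what such a table usually contains for simple linear regression, the code below computes the standard decomposition SST = SSR + SSE, the degrees of freedom (1 for regression, n - 2 for error), the mean squares, and the F statistic. The function name, the dictionary keys, and the data are hypothetical, and the p-value is omitted because it requires an F distribution that the Python standard library does not provide.

```python
def anova_for_simple_regression(xs, ys):
    """Sums of squares, mean squares, and F statistic for a fitted line."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)  # total sum of squares
    if sxx == 0:
        raise ValueError("the x values must not all be equal")
    slope = sxy / sxx
    ssr = slope * sxy      # regression sum of squares (1 degree of freedom)
    sse = sst - ssr        # error sum of squares (n - 2 degrees of freedom)
    msr = ssr / 1
    mse = sse / (n - 2)
    f_stat = msr / mse if mse > 0 else float("inf")
    return {
        "ssr": ssr, "sse": sse, "sst": sst,
        "df_regression": 1, "df_error": n - 2,
        "msr": msr, "mse": mse, "f": f_stat,
        "r_squared": ssr / sst if sst > 0 else float("nan"),
    }


# Made-up data for illustration only (not the lecture's data set).
table = anova_for_simple_regression([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
for name, value in table.items():
    print(f"{name:>14}: {value}")
```

A large F statistic relative to the F distribution with 1 and n - 2 degrees of freedom is evidence that the slope is not zero, which is one common way to judge whether the fitted model is significant.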

  14. Excel Worksheet for Regression Computations

  15. Regression Analysis from Minitab

  16. Fitted Line in Scatterplot with Bands
