ENGN 2226 ENGINEERING SYSTEMS ANALYSIS L24: Linear regression


Linear Models
• Very many causal relationships can be exactly described by linear models.
• Many causal relationships can be adequately modelled by linear models.
• Linear models are a basic, widely useful tool and are fundamental to engineering systems analysis.

Identifying linear models in data. Start with a set of data that has several measured variables for each instance of the population.

Scatter plot
Look for structure in the scatter plot of the two variables. In this case we are looking for linear structure.

Linear predictive model
The model y = a x + b can be used to predict values of the response variable that have not been measured.
Question: How does one compute the parameters a and b that define the model?
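A minimal sketch of using the model for prediction, assuming the form y = a x + b used later in these slides. The function name `predict` and the parameter values are hypothetical, chosen only for illustration.

```python
def predict(x, a, b):
    """Predicted response of the linear model y = a*x + b."""
    return a * x + b


# Hypothetical parameters, not taken from the lecture data.
a, b = 2.0, 1.0

# Predict the response at a value of x that was never measured.
y_hat = predict(3.5, a, b)
```

Once a and b are known, prediction is a single evaluation of the line; the remaining slides concern how to estimate a and b from data.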

Graphical representation of error
We will deal exclusively with the measurement noise case in this course.
[Figure: two panels. Measurement noise case: Gaussian noise, least squares error. Noise equally in both variables: Gaussian noise, total least squares cost.]

Measurement Noise

yk+1 yk k+2 k+1 k yk+2 Linear Model Estimators • The mean response of the model is a straight line function, the population regression line, of the explanatory variable. • The measurement yk, yk+1, yk+2 are sampled from a normal distribution with variance 2 around the point a xk+ b .

yk+1 yk k+2 k+1 k yk+2 Linear Model Estimators • We can see the difference between the actual measurements and the estimated measurements using our linear model. Now we want to try and understand the error between the two caused by our model

Linear Model Estimators
• When choosing an estimator we first need to determine the objective we want to meet. This determines the form (and the mathematical formula) of the estimator.
• There are many possible objectives, but the most common is to minimise the sum of squared errors (SSE) between our linear model and the observations we have.

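Least squares error
The error term is the difference between the measured output and the output predicted by the model: e_k = y_k - (a x_k + b).
The least squares error is the sum of the squared error terms: SSE(a, b) = Σ_k e_k².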
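As a small illustration of the least squares error, the sketch below evaluates the sum of squared differences between measured and predicted outputs for a candidate line. The data and the function name `sum_squared_error` are hypothetical.

```python
def sum_squared_error(xs, ys, a, b):
    """Sum over all data pairs of (measured y - predicted y)^2."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: the first two points lie on y = 2x, the third does not.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]

# Residuals for the candidate line y = 2x are 0, 0 and 1.
sse = sum_squared_error(xs, ys, a=2.0, b=0.0)
```

Different choices of a and b give different values of this cost; the least squares estimator is the pair that makes it smallest.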

Effect of minimising LS cost
Minimising the LS cost reduces the squared residuals with equal weight at every point along the predictive linear model.

Least Squares Estimator

Proof of LSE • On the Board
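The board proof is not reproduced in these slides. The following is a sketch of the standard derivation, assuming the model y = a x + b and N data pairs; the symbols \bar{x} and \bar{y} (sample means) are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\mathrm{SSE}(a,b) &= \sum_{k=1}^{N} \bigl(y_k - a x_k - b\bigr)^2 \\
\frac{\partial\,\mathrm{SSE}}{\partial b}
  &= -2\sum_{k=1}^{N} \bigl(y_k - a x_k - b\bigr) = 0
  \;\Rightarrow\; b = \bar{y} - a\,\bar{x} \\
\frac{\partial\,\mathrm{SSE}}{\partial a}
  &= -2\sum_{k=1}^{N} x_k \bigl(y_k - a x_k - b\bigr) = 0 \\
\Rightarrow\; a
  &= \frac{\sum_{k=1}^{N} (x_k - \bar{x})(y_k - \bar{y})}
          {\sum_{k=1}^{N} (x_k - \bar{x})^2}
\end{aligned}
```

The last step substitutes b = \bar{y} - a\bar{x} into the second condition and uses the fact that the deviations y_k - \bar{y} and x_k - \bar{x} each sum to zero.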

Least Squares Estimator

Least Squares Example (The Data)
• We have a set of data with explanatory and response variables.
• How many explanatory and response variable pairs are needed to find an estimate of the linear model?

Least Squares Example (Data Mean)

Least Squares Example (Data Variance)

Least Squares Example (The LSE)
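The worked numbers for this example did not survive in the slides. The sketch below follows the same three steps (data means, data variation, then the estimate) on hypothetical data, assuming the usual closed-form least squares estimator for y = a x + b. The function name `least_squares_fit` and the data values are illustrative only.

```python
def least_squares_fit(xs, ys):
    """Least squares estimates (a, b) for the model y = a*x + b."""
    n = len(xs)

    # Step 1: data means.
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Step 2: variation of x and co-variation of x with y about the means.
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

    # A line cannot be estimated unless x takes at least two distinct values.
    if s_xx == 0:
        raise ValueError("need at least two distinct x values")

    # Step 3: the estimates.
    a = s_xy / s_xx
    b = y_mean - a * x_mean
    return a, b


# Hypothetical noise-free data lying on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = least_squares_fit(xs, ys)
```

The guard on `s_xx` reflects that at least two pairs with different x values are needed; with exactly two such pairs the fitted line passes through both points, and additional pairs let the noise average out.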
