C22: The Method of Least Squares

CIS 2033, based on Dekking et al., A Modern Introduction to Probability and Statistics, 2007. Instructor: Longin Jan Latecki

22.1 – Least Squares

Given is a bivariate dataset (x1, y1), …, (xn, yn), where x1, …, xn are nonrandom and Yi = α + βxi + Ui are random variables for i = 1, 2, …, n. The random variables U1, U2, …, Un have zero expectation and variance σ².

Method of Least Squares: Choose values for α and β such that

$$S(\alpha, \beta) = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$

is minimal.

22.1 – Regression

[Figure: the observed value yi corresponding to xi, and the value α + βxi on the regression line y = α + βx.]

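22.1 – Estimation

Method of Least Squares: Choose values for α and β such that

$$S(\alpha, \beta) = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$

is minimal. To find the least squares estimates, we differentiate S(α, β) with respect to α and β, and we set the derivatives equal to 0:

$$\frac{\partial S}{\partial \alpha} = -2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i) = 0, \qquad \frac{\partial S}{\partial \beta} = -2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)\, x_i = 0$$

• Dividing by −2 and collecting terms, we get two equations to estimate α and β:

$$n\alpha + \beta \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i, \qquad \alpha \sum_{i=1}^{n} x_i + \beta \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$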

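22.1 – Estimation

• Solving the two equations for α and β, we obtain:

$$\hat{\beta} = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \quad \text{(slope)}$$

$$\hat{\alpha} = \bar{y}_n - \hat{\beta}\, \bar{x}_n \quad \text{(intercept)}$$

Here $\bar{x}_n$ and $\bar{y}_n$ denote the sample means of the xi and the yi.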
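As a minimal sketch of how the least squares estimates can be computed, the following Python function evaluates the closed-form slope and intercept from the sums over the data. The function name `least_squares` and the sample points are illustrative assumptions, not part of the slides.

```python
def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat) for the fitted line y = alpha + beta * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    # Slope: (n * sum(x*y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)
    beta_hat = (n * sum_xy - sum_x * sum_y) / denom
    # Intercept: mean(y) - beta_hat * mean(x)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


# Hypothetical data lying exactly on the line y = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
alpha_hat, beta_hat = least_squares(xs, ys)
print(alpha_hat, beta_hat)  # prints: 1.0 2.0
```

Because the sample points lie exactly on a line, the estimates recover its intercept and slope; with noisy data they give the line that minimizes the sum of squared vertical distances.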

[Figure: the regression line y = 0.25x − 2.35 fitted to a set of points.]

22.1 – Least Squares Estimators are Unbiased

• The estimators for α and β are unbiased.
• For the simple linear regression model, the random variable

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat{\alpha} - \hat{\beta} x_i)^2$$

is an unbiased estimator for σ².
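The estimate of σ² can be sketched in Python as the sum of squared deviations from the fitted line divided by n − 2 (two degrees of freedom are used up by estimating α and β). The helper names and the four data points below are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat) for the fitted line y = alpha + beta * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


def sigma2_unbiased(xs, ys):
    """Estimate sigma^2 as the sum of squared residuals divided by n - 2."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two points to estimate sigma^2")
    alpha_hat, beta_hat = least_squares(xs, ys)
    ss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    return ss / (n - 2)


# Hypothetical data: fitted line is y = 1.3 + 0.8x, squared residuals sum to 1.8
xs = [0, 1, 2, 3]
ys = [1, 3, 2, 4]
s2 = sigma2_unbiased(xs, ys)
print(round(s2, 6))  # prints: 0.9
```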

22.2 – Residuals

A way to explore whether the linear regression model is appropriate to model a given bivariate dataset is to inspect a scatter plot of the so-called residuals ri against the xi. The ith residual ri is defined as the vertical distance between the ith point and the estimated regression line:

$$r_i = y_i - \hat{\alpha} - \hat{\beta} x_i, \quad i = 1, 2, \ldots, n$$

We always have

$$\sum_{i=1}^{n} r_i = 0$$
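A short Python sketch of the residual computation follows; it also checks numerically that the residuals sum to zero, which follows from setting the derivative of S with respect to α equal to 0. The function names and data are illustrative assumptions.

```python
def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat) for the fitted line y = alpha + beta * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


def residuals(xs, ys):
    """Return the vertical distances y_i - alpha_hat - beta_hat * x_i."""
    alpha_hat, beta_hat = least_squares(xs, ys)
    return [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]


# Hypothetical data
xs = [0, 1, 2, 3]
ys = [1, 3, 2, 4]
res = residuals(xs, ys)
total = sum(res)
# The sum is zero up to floating-point rounding
print(abs(total) < 1e-9)  # prints: True
```

In practice one would plot `res` against `xs` and look for patterns such as curvature or a changing spread.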

22.2 – Heteroscedasticity

• Homoscedasticity: the assumption of equal variance of the Ui (and therefore of the Yi).
• In case the variance of Yi depends on the value of xi, we speak of heteroscedasticity. For instance, heteroscedasticity occurs when Yi with a large expected value have a larger variance than those with small expected values. This produces a "fanning out" effect, which can be observed in the figure.
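To make the "fanning out" effect concrete, here is a small simulation sketch in Python. It assumes, purely for illustration, that the standard deviation of Ui grows in proportion to xi; the parameter values, the seed, and the variable names are all hypothetical. It then compares the mean squared residual for the smaller and the larger xi.

```python
import random


def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat) for the fitted line y = alpha + beta * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


rng = random.Random(0)          # fixed seed so the run is repeatable
alpha, beta = 1.0, 2.0          # hypothetical true parameters
xs = list(range(1, 201))
# Assumed heteroscedastic noise: standard deviation 0.1 * x grows with x
ys = [alpha + beta * x + rng.gauss(0.0, 0.1 * x) for x in xs]

alpha_hat, beta_hat = least_squares(xs, ys)
res = [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]

half = len(xs) // 2
ms_low = sum(r * r for r in res[:half]) / half                # small x
ms_high = sum(r * r for r in res[half:]) / (len(xs) - half)   # large x
print(ms_low, ms_high)
```

Under this assumed noise model the second number should come out clearly larger than the first, which is the numerical counterpart of residuals that spread out as xi increases.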

22.3 – Relation with Maximum Likelihood

• What are the maximum likelihood estimates for α and β?
• To apply the method of least squares, no assumption is needed about the type of distribution of the Ui. In case the type of distribution of the Ui is known, the maximum likelihood principle can be applied. In particular, suppose the Ui are independent with an N(0, σ²) distribution. Then Yi has an N(α + βxi, σ²) distribution, with probability density function

$$f_i(y) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(y - \alpha - \beta x_i)^2}{2\sigma^2}\right), \quad -\infty < y < \infty$$

When the Yi are independent and each Yi has an N(α + βxi, σ²) distribution, and assuming that the linear model is appropriate to model a given bivariate dataset, the residuals ri should look like the realization of a random sample from a normal distribution. An example is shown in the figure below:

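22.3 – Maximum Likelihood

For fixed σ > 0, the loglikelihood ℓ(α, β, σ) attains its maximum when

$$\sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$

is minimal. Hence, when the random variables Ui are independent with an N(0, σ²) distribution, the maximum likelihood principle and the least squares method return the same estimators for α and β. The maximum likelihood estimator for σ² is:

$$\hat{\sigma}^2_{ML} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{\alpha} - \hat{\beta} x_i)^2$$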
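The agreement between the two methods can be checked numerically with a short Python sketch. It assumes normal errors, writes the loglikelihood as −n ln σ − (n/2) ln(2π) − Σ(yi − α − βxi)² / (2σ²), and verifies that moving α or β away from the least squares estimates lowers it for a fixed σ. The function names, the data, and the perturbation size 0.1 are illustrative assumptions.

```python
import math


def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat) for the fitted line y = alpha + beta * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


def log_likelihood(alpha, beta, sigma, xs, ys):
    """Loglikelihood of independent Y_i ~ N(alpha + beta * x_i, sigma^2)."""
    n = len(xs)
    ss = sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))
    return (-n * math.log(sigma)
            - 0.5 * n * math.log(2 * math.pi)
            - ss / (2 * sigma ** 2))


# Hypothetical data
xs = [0, 1, 2, 3]
ys = [1, 3, 2, 4]
alpha_hat, beta_hat = least_squares(xs, ys)

sigma = 1.0  # any fixed sigma > 0 works for this comparison
best = log_likelihood(alpha_hat, beta_hat, sigma, xs, ys)
steps = (-0.1, 0.0, 0.1)
perturbed = [
    log_likelihood(alpha_hat + da, beta_hat + db, sigma, xs, ys)
    for da in steps for db in steps if (da, db) != (0.0, 0.0)
]
print(all(p < best for p in perturbed))  # prints: True

# Maximum likelihood estimate of sigma^2: divide by n rather than n - 2
n = len(xs)
ss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
sigma2_ml = ss / n
print(round(sigma2_ml, 6))  # prints: 0.45
```

Note that the maximum likelihood estimate of σ² divides by n, so it is smaller than the estimate that divides by n − 2 by the factor (n − 2)/n.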
