Linear regression in matrix terms (continued)
Least squares estimates in simple linear regression setting

    soap   suds   soap*suds   soap²
     4.0     33       132.0    16.00
     4.5     42       189.0    20.25
     5.0     45       225.0    25.00
     5.5     51       280.5    30.25
     6.0     53       318.0    36.00
     6.5     61       396.5    42.25
     7.0     62       434.0    49.00
    ----    ---      ------   ------
    38.5    347      1975.0   218.75   (column totals)
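As a quick check, here is a minimal Python sketch (not code from the lecture) that plugs the column totals above into the usual simple-linear-regression formulas for the slope and intercept:

    # Least squares estimates for the soap/suds data, computed from the
    # column totals in the table above.
    n = 7
    sum_x, sum_y = 38.5, 347.0          # soap, suds
    sum_xy, sum_x2 = 1975.0, 218.75     # soap*suds, soap²

    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    b0 = sum_y / n - b1 * sum_x / n

    print(b1, b0)   # slope ≈ 9.5, intercept ≈ -2.68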
The inverse of (X'X)
The main thing is to understand what the inverse matrix is. It's very messy (and not very informative) to determine inverses by hand; just about everyone lets a computer do the dirty work.
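For instance, a minimal numpy sketch (not code from the lecture) that lets the computer invert X'X for the soap/suds data and recover the least squares estimates b = (X'X)⁻¹ X'Y:

    import numpy as np

    soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
    suds = np.array([33, 42, 45, 51, 53, 61, 62], dtype=float)

    X = np.column_stack([np.ones_like(soap), soap])   # design matrix [1, soap]
    XtX_inv = np.linalg.inv(X.T @ X)                   # the inverse of (X'X)

    b = XtX_inv @ X.T @ suds
    print(XtX_inv)
    print(b)    # ≈ [-2.68, 9.5]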
“Linear dependence (and rank) is not always obvious”
The rank of this matrix is 2, not 3 as I claimed in the last class. The (column) rank of a matrix is the maximum number of linearly independent columns in the matrix. The (row) rank of a matrix is the maximum number of linearly independent rows in the matrix. And, rank = column rank = row rank.
Row dependency: Column dependency:
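As a concrete check, here is a numpy sketch with a hypothetical matrix (the matrix from the slide is not reproduced here) whose third column is the sum of the first two, so its columns are linearly dependent and the rank is 2 rather than 3:

    import numpy as np

    # Third column = first column + second column, so rank is 2, not 3.
    A = np.array([[1.0, 4.0, 5.0],
                  [2.0, 1.0, 3.0],
                  [3.0, 2.0, 5.0]])

    print(np.linalg.matrix_rank(A))   # 2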
The main point • If the columns of the X matrix (that is, if two or more of your predictor variables) are linearly dependent (or nearly so), you will run into trouble when trying to estimate the regression function.
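A hypothetical numpy illustration of the trouble: if one predictor is (nearly) a copy of another, X'X is (nearly) singular, so its inverse, and hence the coefficient estimates, become numerically unreliable. The values below are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    x1 = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
    x2 = x1 + 1e-6 * rng.standard_normal(7)    # essentially a copy of x1

    X = np.column_stack([np.ones_like(x1), x1, x2])
    print(np.linalg.cond(X.T @ X))   # enormous condition number -> unstable (X'X)^-1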
Sum of squares
In general, if you pre-multiply a vector by its transpose, you get a sum of squares.
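For instance, for a generic n×1 vector a (a worked identity added here for concreteness):

a'a =
\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
= a_1^2 + a_2^2 + \cdots + a_n^2 = \sum_{i=1}^{n} a_i^2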
Total sum of squares
Previously, we'd write: SSTO = Σ (Yi − Ȳ)². But, it can be shown that, equivalently, SSTO = Y'Y − (1/n) Y'JY, where J is a (square) n×n matrix containing all 1's.
Example: Total sum of squares
If n = 2, we can compute SSTO directly from the definition Σ (Yi − Ȳ)². But, note that we get the same answer by using the matrix form Y'Y − (1/n) Y'JY.
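A worked check with n = 2 and hypothetical values Y = (1, 3)' (chosen here only for illustration):

\bar{Y} = 2, \qquad \sum_{i=1}^{2} (Y_i - \bar{Y})^2 = (1-2)^2 + (3-2)^2 = 2

Y'Y - \tfrac{1}{n}\, Y'JY = (1^2 + 3^2) - \tfrac{1}{2}(1 + 3)^2 = 10 - 8 = 2

Both routes give SSTO = 2, as claimed.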
Error term assumptions
• We used to say that the error terms εi, for i = 1, …, n, are:
  • independent
  • normally distributed
  • with mean E(εi) = 0
  • with variance σ²(εi) = σ².
• Now, how can we say the same thing using matrices and vectors?
Error terms as a random vector
The n×1 random error term vector, denoted as ε, is:
ε = [ε1, ε2, …, εn]'
The mean (expectation) of the random error term vector
By definition, the n×1 mean error term vector, denoted as E(ε), is:
E(ε) = [E(ε1), E(ε2), …, E(εn)]'
and, by assumption, E(εi) = 0 for each i, so E(ε) = [0, 0, …, 0]' = 0.
The variance of the random error term vector
The n×n variance-covariance matrix, denoted as σ²(ε), is defined as the matrix whose (i, j) entry is the covariance σ(εi, εj). That is, the diagonal elements are just the variances of the error terms, while the off-diagonal elements are the covariances between the error terms.
The ASSUMED variance of the random error term vector
BUT, we assume the variances of the error terms are constant (σ²), and we assume the error terms are independent (which, for normal error terms, is equivalent to assuming the covariances are 0). That is:
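Written out (reconstructed here from the stated assumptions), the assumed variance-covariance matrix is diagonal with σ² in every diagonal position:

\sigma^2(\varepsilon) =
\begin{bmatrix}
\sigma^2 & 0        & \cdots & 0 \\
0        & \sigma^2 & \cdots & 0 \\
\vdots   & \vdots   & \ddots & \vdots \\
0        & 0        & \cdots & \sigma^2
\end{bmatrix}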
An aside: multiplying a matrix by a scalar
When you multiply a matrix by a scalar, you just multiply each element of the matrix by the scalar. For example:
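A simple illustration (the numbers are chosen here; the slide's own example is not reproduced):

2 \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}
=
\begin{bmatrix} 2 & 6 \\ 8 & 4 \end{bmatrix}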
An alternative way of expressing the ASSUMED variance of the random error term vector is σ²(ε) = σ²I, where I is the n×n identity matrix.
The general linear regression model
Putting the regression function and assumptions all together, we get: Y = Xβ + ε,
• where:
• Y is an (n × 1) vector of response values
• β is a (p × 1) vector of unknown parameters
• X is an (n × p) matrix of known constants (predictor values)
• ε is an (n × 1) vector of independent, normal error terms with mean E(ε) = 0 and variance σ²(ε) = σ²I.
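To make the pieces concrete, here is a small hypothetical numpy simulation sketch of the model (the dimensions, coefficients, and σ below are made up for illustration, not taken from the lecture):

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 50, 3
    X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, p - 1))])  # n x p design matrix
    beta = np.array([2.0, 1.5, -0.7])                                       # p x 1 parameter vector
    sigma = 2.0

    eps = rng.normal(0.0, sigma, size=n)   # independent normal errors, variance σ² (i.e., σ²I)
    Y = X @ beta + eps                     # n x 1 response vector

    b = np.linalg.inv(X.T @ X) @ X.T @ Y   # least squares estimate of β
    print(b)                               # close to [2.0, 1.5, -0.7]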