880 likes | 1.1k Views
Session 5. Outline. Creating the Full Model “From Scratch” Excel Array Functions Qualitative Variables Dummy Variables Interaction Effects. How does Excel calculate these coefficients?. Matrix Algebra!. 4 Basic Matrix Operations. Sum Product Transpose Matrix Multiplication
E N D
Outline • Creating the Full Model “From Scratch” • Excel Array Functions • Qualitative Variables • Dummy Variables • Interaction Effects Applied Regression -- Prof. Juran
How does Excel calculate these coefficients? Matrix Algebra! Applied Regression -- Prof. Juran
4 Basic Matrix Operations • Sum Product • Transpose • Matrix Multiplication • Inverting Matrices Applied Regression -- Prof. Juran
Sum Product Applied Regression -- Prof. Juran
Notes: r = i and c = j. In other words, the two matrices must have the same number of rows as each other and the same number of columns as each other. They do not need to be square matrices (where r = c and i = j). Applied Regression -- Prof. Juran
A B C D 1 7 4 6 2 5 2 11 3 4 9 3 8 5 10 12 1 6 =SUMPRODUCT(A1:C2,A4:C5) 7 208 8 Applied Regression -- Prof. Juran
Transpose For matrix A, Applied Regression -- Prof. Juran
Notes: • If A is an r x c matrix, then must be a c x r matrix. • A does not need to be a square matrix. Applied Regression -- Prof. Juran
There is an Excel function for this purpose, called TRANSPOSE. This function is one of a special class of functions called array functions. In contrast with most other Excel functions, array functions have two important differences: • They are entered into ranges of cells, not single cells • You enter them by pressing Shift+Ctrl+Enter, not just Enter Applied Regression -- Prof. Juran
Using the spreadsheet above as an example, we start by selecting the entire range A4:C6. Then type into the formula bar =TRANSPOSE(A1:C2) Press Shift+Ctrl+Enter, and curly brackets will appear round the formula (you can’t type them in). Applied Regression -- Prof. Juran
Matrix Multiplication For matrices A and B, Applied Regression -- Prof. Juran
Inverting Matrices First, define a square matrix Ij as a matrix with j rows and j columns, completely filled with zeroes, except for ones on the diagonal: This special matrix is called the identity matrix. Applied Regression -- Prof. Juran
Now, for a square matrix A with j rows and j columns, there may exist a matrix called A-inverse (symbolized ) such that: Not all square matrices can be inverted, a fact that has implications for regression analysis. Applied Regression -- Prof. Juran
Remember: • Select the entire range A5:B6 before typing the formula. • Press Shift+Ctrl+Enter. Applied Regression -- Prof. Juran
Least-Squares Regression “By Hand” Combinations of: • Basic arithmetic • These four matrix operations • The T and F functions we already know Applied Regression -- Prof. Juran
Definitions Observed dependent variable data: Observed independent variable data: Applied Regression -- Prof. Juran
Unknown true coefficients: Unknown true errors: Applied Regression -- Prof. Juran
Estimated coefficients: Residual errors from : Applied Regression -- Prof. Juran
The linear model in matrix form: A.1 A.2 A.4 Applied Regression -- Prof. Juran
Regression Formulas Two special matrices: “C” matrix: A.5 “Projection” or “Hat” matrix: A.3 Applied Regression -- Prof. Juran
Example: Supervisor Data Applied Regression -- Prof. Juran
Example: Supervisor Data(Set Intercept to Zero) Applied Regression -- Prof. Juran
Bottom Part of the Output Applied Regression -- Prof. Juran
The “C” Matrix Applied Regression -- Prof. Juran
The “P” Matrix Applied Regression -- Prof. Juran
We’ll come back to the rest of the “bottom” part of the output later (slide 41). Applied Regression -- Prof. Juran
Middle Part of the Output Applied Regression -- Prof. Juran
Top Part of the Output Applied Regression -- Prof. Juran
Now that we have the standard error of the estimate , we can finish the “bottom” part of the output. Applied Regression -- Prof. Juran
cjj refers to the jth row and jth column in the C matrix; each independent variable’s standard error is calculated with a different cjj value. We used VLOOKUP here as a fancy way of doing it. Applied Regression -- Prof. Juran
Juran’s Lazy Portfolio • Invest in Vanguard mutual funds under university retirement plan • No shorting • Max 8 mutual funds • Rebalance once per year • Tools used: • Excel Solver • Basic Stats (mean, stdev, correl, beta, crude version of CAPM) Decision Models -- Prof. Juran 42
Categorical Variables Frequently in regression analysis we wish to study the effects of independent variables that are measured as nominal data (also known as categorical data). For example, many demographic factors such as political party, gender, ethnicity, marital status, or credit rating are measured in this way. Although regression analysis is generally designed to work with continuous data, these categorical types of variables can be studied with regression. Applied Regression -- Prof. Juran
The Binary Case: The most basic version of this situation is when a variable has two possible values, such as gender. We can set up a single “dummy” variable, such as: Why not use 1 for Male and 2 for Female? Applied Regression -- Prof. Juran
Multi-Category Cases: When there are k categories, we can model them with k – 1 dummy variables. For example, if we were studying the effects of different days of the week on sales, we could set up six dummy variables: Why not have a seventh variable for Saturday? Applied Regression -- Prof. Juran
Example: Interest Rates Data from Session 3: Applied Regression -- Prof. Juran
ANOVA results from Session 3: Applied Regression -- Prof. Juran
Regression Model for Interest Rates Applied Regression -- Prof. Juran