125.785 Research Methods in Finance

125.785 Research Methods in Finance Seminar 2: The Simple OLS Model

Introduction Tough guys don’t do math. Tough guys fry chicken for a living. -- Jaime Escalante (Stand and Deliver, 1988).

Outline • Administration • Computer Labs • Quiz 1 (31 July, 3-3.30pm) • The Simple Regression Model • Readings: Chapters 2 and 4, Studenmund • Eviews Demonstration

Recap: Simple OLS Model • We begin with the Ordinary Least Squares (OLS) regression model. • This generates a ‘straight line’ between two variables. • The line ‘approximates’ the relationship between the two variables. • The variables are: Dependent (Y) and Independent or explanatory (X).

Why Use OLS? • OLS is relatively easy to use (unlike MLE) • The goal of minimising the sum of squared residuals is appropriate and theoretically appealing. • It punishes large deviations from the regression line • OLS estimates have some useful characteristics • Under certain conditions (the classical assumptions) it generates the best linear unbiased estimators (BLUE) of the coefficients.

Aside • OLS uses a least squares method. • It minimises the sum of the squared residuals. • The estimate of Y is built from two coefficients • Constant (beta-nought, β0) • Slope (beta-one, β1)
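The least squares idea above can be sketched in a few lines of Python. This is a minimal illustration, not the seminar's Eviews procedure: the data are made up, and the helper names `ols_fit` and `rss` are hypothetical. It uses the standard closed-form solution for the simple (one-regressor) model.

```python
def ols_fit(x, y):
    """Return (intercept, slope) that minimise the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of X
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(x, y, intercept, slope):
    """Sum of squared residuals for a given line."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data (not from the seminar)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
best = rss(x, y, b0, b1)
# Any other line should give a larger sum of squared residuals
worse = rss(x, y, b0 + 0.1, b1 - 0.05)
print(b0, b1, best, worse)
```

Moving the intercept or slope away from the fitted values raises the sum of squared residuals, which is what "least squares" means in practice.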

The Classical Assumptions • Assumption 1: The regression model is linear, is correctly specified, and has an additive error term. • Assumption 2: The error term has a zero population mean. • Assumption 3: All explanatory variables are uncorrelated with the error term.

Classical Assumptions II • Assumption 4: Observations of the error term are uncorrelated with each other (no serial correlation). We can’t predict future errors on the basis of past errors. • Assumption 5: The error term has a constant variance (homoscedastic). • Assumption 6: No explanatory variable is a perfect linear function of any other explanatory variable.

Illustration: Violation of Homoscedasticity

Analysing the OLS Model • The model has a deterministic part: that part of the value of Y we can successfully predict. • It also has a stochastic part: that part of Y we can’t explain or predict. • The stochastic part contains: Measurement errors, Missing variables, Genuine random effects, Non-linearities in the relationship.

General Equation • Yi = β0 + β1Xi + ei • Dependent Variable: Yi • Constant: β0 • Explanatory Variable: Xi • Deterministic portion: β0 + β1Xi • Stochastic portion: ei

Analysing a Model • Estimate the value of the parameter (e.g. R2) • Test the value for significance (e.g. F-test) • Select the preferred hypothesis

Evaluating the Model Overall Model- R2 Coefficients- β

Evaluating the Model • This is very important: models are evaluated at two levels. • The high level is the overall model. • The low level is the individual components of the model. • Overall Model: we measure how good it is (goodness of fit) with the R2. • Hypotheses are: EITHER [H0] the model does not have a good fit (R2=0) OR [H1] the model has a good fit (R2>0)

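Hypothesis Tests • We test each measure. • Goodness of fit uses an F-test. • A high R2 implies a high F-statistic. • A high F-statistic implies a low p-value. • The p-value is the probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis is true. • Rejecting the null hypothesis when it is true is a mistake (Type I error); rejecting only when the p-value is below the chosen significance level caps the probability of that mistake. • Conventional significance levels range from 1% to 10%.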

Evaluating the coefficients • The Simple OLS model generates two coefficient estimates: Constant (Y-intercept) and Slope (beta). • These are the measures of value. • Each is tested with a t-test. • The default null hypothesis is βi = 0. • For the slope, this hypothesis implies that X has no effect on Y. • Use the p-value for evaluation.

Causality • It can be easy to find relationships between economic variables. • A relationship does not prove that X causes Y. • E.g. Storks and Babies • Modellers prefer to use economic theory to generate a model, then test the model.

Summary- R2

Summary- β

Recap- Regression Coefficients • The key point is that X → Y. • Causality is in one direction. • The OLS model estimates 2 coefficients • Slope and • Constant

The Regression Equation • Yi = β0 + β1Xi + ei • This is the general Cartesian equation for a straight line. • Note- the values of the coefficients are estimates. • We cannot know their true value with certainty.

The Slope • One way to describe β is the slope. • A better way is • ΔY / ΔX or • dY/dX • Intuitively it is • How much Y changes IF we change X by one unit, holding ALL other factors constant

Slopes and Natural Logs • The expression ΔY/ΔX approximates the derivative dY/dX. • Note: the derivative of f(x) = ln g(x) is f′(x) = g′(x)/g(x). • So the slope of ln(y) = f(x) is d ln(y)/dx = (dy/y)/dx (the numerator is now a growth rate). • The slope of ln(y) = f(ln(x)) is d ln(y)/d ln(x) = (dy/y)/(dx/x) = (dy/dx)(x/y) (the expression is an elasticity).
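The two log results can be checked numerically. The sketch below uses made-up data and a hypothetical helper `ols_slope`; it assumes Y is generated exactly from a known growth rate or elasticity, so the regression on logs should recover that number.

```python
import math


def ols_slope(x, y):
    """Slope of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Semi-log case: ln(y) regressed on x gives a growth rate.
# Assumed (made-up) process: y grows at 5% per period.
periods = [0.0, 1.0, 2.0, 3.0, 4.0]
y_growth = [100.0 * math.exp(0.05 * t) for t in periods]
growth_rate = ols_slope(periods, [math.log(v) for v in y_growth])

# Log-log case: ln(y) regressed on ln(x) gives an elasticity.
# Assumed (made-up) process: y = 3 * x^0.5, so the elasticity is 0.5.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y_elastic = [3.0 * xi ** 0.5 for xi in x]
elasticity = ols_slope([math.log(v) for v in x],
                       [math.log(v) for v in y_elastic])

print(growth_rate, elasticity)
```

With real data the fit is not exact, so the slope is an estimate of the growth rate or elasticity rather than the true value.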

Testing Coefficients • If β has a true value of 0, the “explanatory” variable has no explanatory power. • This is tested with the t-ratio: H0: β = 0 vs. H1: β ≠ 0. • We can test other hypotheses about β, e.g. test the elasticity with respect to X. • The null is (usually) that β = 1. • If we reject, we may conclude Elastic (β > 1) vs. Inelastic (β < 1).

t-test for β • The t-statistic for the estimated β has a Student’s t-distribution under the null hypothesis • The t-statistic is a function of • Estimated β • Hypothesised β • Standard Error • Tests are easily done with programs like Eviews.
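The three ingredients listed above combine as t = (estimated β - hypothesised β) / standard error. A minimal sketch for the simple model follows, using made-up data and a hypothetical function name `slope_t_statistic`; the standard error formula is the usual one for a single regressor with n - 2 degrees of freedom.

```python
import math


def slope_t_statistic(x, y, beta_hypothesised=0.0):
    """Return (slope, standard error, t-statistic) for the simple OLS slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 because two coefficients were estimated
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    se_slope = math.sqrt(sigma2 / s_xx)
    t_stat = (slope - beta_hypothesised) / se_slope
    return slope, se_slope, t_stat


# Made-up illustrative data (not from the seminar)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, se, t_zero = slope_t_statistic(x, y, beta_hypothesised=0.0)
_, _, t_one = slope_t_statistic(x, y, beta_hypothesised=1.0)

# 3.182 is the two-sided 5% critical value for 3 degrees of freedom,
# taken from standard t-tables.
CRITICAL_5PCT_DF3 = 3.182
print(abs(t_zero) > CRITICAL_5PCT_DF3, abs(t_one) > CRITICAL_5PCT_DF3)
```

The same function covers both the default null (β = 0) and other nulls such as β = 1; only the hypothesised value changes.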

ANOVA • ANOVA is an abbreviation for • ANalysis Of VAriance. • In order to calculate ‘goodness of fit’ we need a comparison point. • This is the Total Sum of Squares (TSS).

Measuring TSS • TSS is a measure of “total scatter”. • It is measured as the sum of squared differences between: • Each observed yi and the mean of y

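ANOVA II • The TSS is a variance measure (spread term). • It has two components: Explained Sum of Squares (ESS) and Residual Sum of Squares (RSS), so TSS = ESS + RSS. • ESS is the part of the TSS our model “explains”. • R2 = ESS/TSS is the proportion of the TSS explained. • The F-statistic built from ESS and RSS has an F-distribution under the null hypothesis. • RSS is the part of the TSS we can’t explain. • OLS minimises this term.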
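The decomposition can be worked through by hand for a small data set. The sketch below is illustrative only: the data are made up, the function name `anova_simple` is hypothetical, and the F formula shown is the standard one for a simple regression with an intercept (1 and n - 2 degrees of freedom).

```python
def anova_simple(x, y):
    """Return TSS, ESS, RSS, R2 and F for a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    tss = sum((yi - y_bar) ** 2 for yi in y)               # total scatter
    ess = sum((fi - y_bar) ** 2 for fi in fitted)          # explained part
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained part
    r2 = ess / tss
    # One explanatory variable, so 1 and n - 2 degrees of freedom
    f_stat = (ess / 1) / (rss / (n - 2))
    return {"TSS": tss, "ESS": ess, "RSS": rss, "R2": r2, "F": f_stat}


# Made-up illustrative data (not from the seminar)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

table = anova_simple(x, y)
print(table)
```

Because F can also be written as (R2 / (1 - R2)) × (n - 2) in this model, a high R2 and a high F-statistic go together.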

Statistical Tests

Eviews Tool Bar • The Eviews window has: • Tool bar • Top-pane (for manipulating variables) • Lower Scroll-bar (output may appear there)

Data New pane opens

Lab Introduction • The data file is trade.wf1. • We will estimate an ‘Absorption Model’ for NZ. • This uses National Income Accounting procedures. • Y = C+I+G+X-M • Hence Y-C-I-G = X-M • Add and subtract TX on both sides: (Y-TX-C)-I+(TX-G) = (X-M)+TX-TX

Absorption Model • Note- Y-TX-C equals Savings (S) • So • (S-I)+(TX-G)=(X-M) • Net Private Savings + Net Public Savings = Current Account Balance • Net National Savings (NNS) = CAB • Implication: Economies with low savings rates (deficits) will run trade deficits.
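The identity can be checked with a quick arithmetic example. The figures below are invented purely for illustration (they are not NZ data and not from trade.wf1); the variable names follow the symbols used in the slides.

```python
# Made-up national accounts figures, chosen so that Y = C + I + G + X - M holds
C, I, G = 60.0, 20.0, 25.0   # consumption, investment, government spending
X, M = 30.0, 35.0            # exports, imports
TX = 22.0                    # taxes
Y = C + I + G + X - M        # national income from the accounting identity

S = Y - TX - C                       # savings
net_private_savings = S - I
net_public_savings = TX - G
current_account_balance = X - M

# Net private savings plus net public savings equals the current account balance
print(net_private_savings + net_public_savings, current_account_balance)
```

In this example both net savings terms are negative, and the current account balance is in deficit by the same total amount.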

Definitions • C- Household consumption of final goods and services • I- Investment • G- Government spending on goods and services • TX- Taxes • X- Exports • M- Imports (removed for double-counting purposes) • S- Household income after tax (Y-TX), less consumption

Return on an asset • We have data for an asset (Grange Wine) • Price at 3 auctions • Vintage of each bottle • This exercise requires the data be manipulated in 3 ways: • The price has to be expressed as a growth rate. • The Vintage of the bottle has to be expressed as ‘age’ • Years with missing values need to be omitted.
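The three manipulations can be sketched outside Eviews as follows. Everything here is hypothetical: the field names, auction years and prices are invented for illustration and are not the seminar's Grange data, and only two of the three auction prices are used to keep the example short.

```python
# Hypothetical auction years and records (not the seminar's data)
FIRST_AUCTION_YEAR = 2000
LAST_AUCTION_YEAR = 2002

records = [
    {"vintage": 1980, "price_first": 200.0, "price_last": 242.0},
    {"vintage": 1981, "price_first": None, "price_last": 180.0},  # missing price
    {"vintage": 1985, "price_first": 150.0, "price_last": 165.0},
]


def prepare(rows):
    """Drop rows with missing prices, then compute age and price growth."""
    prepared_rows = []
    for row in rows:
        # Step 3: omit observations with a missing value
        if row["price_first"] is None or row["price_last"] is None:
            continue
        # Step 1: express the price as a growth rate between the two auctions
        growth = (row["price_last"] - row["price_first"]) / row["price_first"]
        # Step 2: express the vintage as an age at the last auction
        age = LAST_AUCTION_YEAR - row["vintage"]
        prepared_rows.append({"age": age, "growth": growth})
    return prepared_rows


prepared = prepare(records)
print(prepared)
```

The resulting `age` and `growth` columns are the kind of variables that would then enter the regression as X and Y.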

New Series • EITHER • New variables can be created using: • Quick • Generate Series… • OR • Expression is included in regression estimate

Conclusion • Key Numbers • R-square, F-statistics and p-value • Coefficient estimates • t-statistics and their p-values.
