Statistics and Data Analysis. Part 24 - Statistical Tests: 3. Hypothesis Tests. Hypothesis Tests in the Regression Model. Tests of Independence of Random Variables. Application: Monet Paintings. Does the size of the painting really explain the sale prices of Monet's paintings?
1. Statistics and Data Analysis Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
2. Statistics and Data Analysis
3. Hypothesis Tests Hypothesis Tests in the Regression Model
Tests of Independence of Random Variables
4. Application: Monet Paintings Does the size of the painting really explain the sale prices of Monet’s paintings?
Investigate: Compute the regression
Hypothesis: The slope is actually zero.
Rejection region: Slope estimates that are very far from zero.
5. Regression Analysis Investigate: Is the coefficient in a regression model really nonzero?
Testing procedure:
Model: y = α + βx + ε
Hypothesis: H0: β = 0.
Rejection region: Least squares coefficient is far from zero.
Test:
α level for the test = 0.05 as usual
Compute t = b/StandardError
Reject H0 if |t| is above the critical value
1.96 if large sample
Value from t table if small sample.
Reject H0 if reported P value is less than α level
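The testing procedure above can be sketched in a few lines. This is a minimal illustration, not the course's software: the function name `slope_t_test` and the data are made up, and the critical value and P value use the large-sample normal approximation (a small sample would call for the t table instead).

```python
import math
from statistics import NormalDist

def slope_t_test(x, y, alpha=0.05):
    """Fit y = a + b*x by least squares and test H0: slope = 0 (large-sample version)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # least squares slope
    a = y_bar - b * x_bar              # least squares intercept
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    t = b / se_b
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)     # about 1.96 when alpha = 0.05
    p_value = 2 * (1 - z.cdf(abs(t)))       # two-sided, normal approximation
    return {"b": b, "se": se_b, "t": t, "critical": critical,
            "p_value": p_value, "reject": abs(t) > critical}

# Made-up illustrative data (not the Monet data).
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
result = slope_t_test(x, y)
print(result["t"], result["reject"])
```

Both rejection rules agree: |t| exceeds the critical value exactly when the two-sided P value falls below the α level.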
6. An Equivalent Test Is there a relationship?
H0: No correlation
Rejection region: Large R².
Test: F = (N-2)R²/(1-R²)
Reject H0 if F > 4
Math result: F = t².
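The math result can be checked numerically. The sketch below assumes the standard simple-regression form F = (N-2)R²/(1-R²); the function name and the data are hypothetical, chosen only for illustration.

```python
import math

def simple_regression_stats(x, y):
    """Return (R squared, t ratio for the slope, F statistic) for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    sse = syy - b * sxy                      # residual sum of squares
    r2 = 1 - sse / syy
    t = b / math.sqrt(sse / (n - 2) / sxx)   # slope divided by its standard error
    f = (n - 2) * r2 / (1 - r2)
    return r2, t, f

# Made-up illustrative data.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 2.9, 2.7, 4.8, 4.1, 6.5, 6.0, 8.3]
r2, t, f = simple_regression_stats(x, y)
print(f, t ** 2)   # the two numbers agree up to rounding
```

This is also why the rule of thumb "F > 4" matches "|t| > 1.96": 1.96 squared is about 3.84.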
7. Partial Effect Hypothesis: If we include the signature effect, size does not explain the sale prices of Monet paintings.
Test: Compute the multiple regression; then H0: β1 = 0.
α level for the test = 0.05 as usual
Rejection Region: Large absolute value of b1 (coefficient)
Test based on t = b1/StandardError
8. Investigation: Regression Airline costs depend on output
Do they also depend on airline structure variables such as
Load factor = percentage of seats filled
Stage length = average length of flights
Points = the number of cities served by the airline.
We use multiple regression to determine the joint influence of all the variables.
9. Airlines Cost Data
10. Testing “The Regression”
11. Cost “Function” Regression
12. Investigation: Gasoline Market Model: Gasoline demand depends on price and income.
Does it also depend on the prices of (1) new cars, (2) used cars, (3) public transportation?
Model: I will use logs and compute elasticities.
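Why logs give elasticities: in a regression of log(y) on log(x), the slope is the percentage change in y per one percent change in x. A minimal sketch with one regressor follows; the function name and the data are hypothetical, and the data are generated from an exact power law so the slope is known in advance.

```python
import math

def log_log_slope(x, y):
    """Least squares slope of log(y) on log(x); interpreted as an elasticity."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx

# Made-up data: quantity = 3 * price^(-0.5), so the price elasticity is -0.5.
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
quantities = [3.0 * p ** -0.5 for p in prices]
elasticity = log_log_slope(prices, quantities)
print(elasticity)
```

The gasoline model in these slides has several regressors; the same interpretation applies to each log coefficient in a multiple regression.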
13. Application: Part of a Regression Model Regression model includes variables x1, x2,… I am sure of these variables.
Maybe variables z1, z2,… I am not sure of these.
Model: y = α + β1x1 + β2x2 + δ1z1 + δ2z2 + ε
Hypothesis: δ1 = 0 and δ2 = 0.
Strategy: Start with model including x1 and x2. Compute R². Compute new model that also includes z1 and z2.
Rejection region: R² increases a lot.
14. Test Statistic
15. Gasoline Market
16. Gasoline Market
17. Gasoline Market
18. Improvement in R2
19. Is Genre Significant?
20. Application Health satisfaction depends on many factors:
Age, Income, Children, Education, Marital Status
Do these factors figure differently in a model for women compared to men?
Investigation: Multiple regression
Hypothesis: The regressions are the same.
Rejection Region: Estimated regressions that are very different.
21. Equal Regressions Setting: Two groups of observations (men/women, countries, two different periods, firms, etc.)
Regression Model: y = α + β1x1 + β2x2 + … + ε
Hypothesis: The same model applies to both groups
Rejection region: Large values of F
22. Procedure: Equal Regressions There are N1 observations in Group 1 and N2 in Group 2.
There are K variables and the constant term in the model.
This test requires you to compute three regressions and retain the sum of squared residuals from each:
SS1 = sum of squares from N1 observations in group 1
SS2 = sum of squares from N2 observations in group 2
SSALL = sum of squares from NALL=N1+N2 observations when the two groups are pooled.
The hypothesis of equal regressions is rejected if F is larger than the critical value from the F table (K+1 numerator and NALL-2K-2 denominator degrees of freedom).
24. Computing the F Statistic
25. A Test of Independence In the credit card example, are Own/Rent and Accept/Reject independent?
Hypothesis: Ownership and Acceptance are independent
Formal hypothesis, based only on the laws of probability: Prob(Own,Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities).
Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.
26. A Contingency Table Analysis
27. Independence Test
28. Comparing Actual to Expected
29. When is Chi Squared Large? For a 2x2 table, the critical chi squared value for α = 0.05 is 3.84.
(Not a coincidence, 3.84 = 1.96²)
Our 103.33 is large, so the hypothesis of independence between the acceptance decision and the own/rent status is rejected.
30. Computing the Critical Value
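For a 2x2 table (one degree of freedom), the critical value can be obtained from the normal distribution, because a chi squared variable with one degree of freedom is a squared standard normal. The sketch below covers only that case and uses Python's standard library; the function name is hypothetical. Larger tables need the chi squared table or statistical software.

```python
from statistics import NormalDist

def chi2_critical_1df(alpha=0.05):
    """Critical chi squared value for one degree of freedom at level alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided normal critical value
    return z ** 2

critical = chi2_critical_1df(0.05)
print(round(critical, 2))
```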
31. Analyzing Default Do renters default more often (at a different rate) than owners?
To investigate, we study the cardholders (only)
We have the raw observations in the data set.
32. Cross Tabulating the Raw Data
33. Subset of the Data Data > Subset Worksheet
34. Contingency Table Analyzer
35. Specifying the Table
36. Hypothesis Test
37. Treatment Effects in Clinical Trials Does Phenogyrabluthefentanoel (Zorgrab) work?
Investigate: Carry out a clinical trial.
N+0 = “The placebo effect”
N+T – N+0 = “The treatment effect”
Is N+T > N+0 (significantly)?
38. Confounding Effects
39. What About Confounding Effects?
40. Multiple Choices: Travel Mode 210 Travelers between Sydney and Melbourne
4 available modes: air, train, bus, car
Among the observed variables is income.
Does income help to explain mode choice?
Hypothesis: Mode choice and income are independent.
41. Travel Mode Choices
42. Travel Mode Choices and Income
43. Contingency Table Analysis
44. Computing Chi Squared
45. Chi Squared Test Results