
Statistical Inference and Regression Analysis: GB.3302.30



Presentation Transcript


  1. Statistical Inference and Regression Analysis: GB.3302.30 Professor William Greene Stern School of Business IOMS Department Department of Economics

  2. Statistics and Data Analysis Part 10 – Advanced Topics

  3. Advanced topics • Nonlinear Least Squares • Nonlinear Models – ML Estimation • Poisson Regression • Binary Choice • End of course.

  4. Statistics and Data Analysis Nonlinear Least Squares

  5. Nonlinear Least Squares

  6. Lanczos 1 Data

  7. Nonlinear Regression

  8. Nonlinear Least Squares There are no explicit solutions to these equations: b cannot be written in closed form as a function of (y, x).
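For concreteness (the equations referred to appear only as images in the transcript), the least squares problem and its first-order conditions have the standard form

$$\min_b \sum_{i=1}^n \left[ y_i - f(x_i, b) \right]^2 \quad\Rightarrow\quad \sum_{i=1}^n \left[ y_i - f(x_i, b) \right] \frac{\partial f(x_i, b)}{\partial b} = 0,$$

and because f is nonlinear in b, these conditions have no closed-form solution.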

  9. Strategy for Nonlinear LS

  10. NLS Strategy • Pick b • A. Compute y_i^0 and x_i^0 • B. Regress y_i^0 on x_i^0 • This obtains a new b • Return to step A, or exit if the new b is the same as the old b

  11. Lanczos 1: First Iteration. Now repeat the iteration using this as b.

  12. This is the correct answer

  13. Gauss-Marquardt Algorithm • Starting with b^0 • A. Compute regressors x_i^0 and residuals e_i^0 = y_i − f(x_i, b^0) • B. New b^1 = b^0 + slopes in the regression of e_i^0 on x_i^0 • Return to A, or exit if the estimates have converged. • This is equivalent to our earlier method.
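The Lanczos iterations on the surrounding slides are shown as images, so here is a minimal runnable sketch of the linearized-regression iteration described above. The single-exponential model f(x, b) = b1·exp(−b2·x) and the synthetic data are illustrative assumptions, not the Lanczos 1 values:

```python
import numpy as np

# Illustrative model (assumed): f(x, b) = b1 * exp(-b2 * x)
def f(x, b):
    return b[0] * np.exp(-b[1] * x)

def jacobian(x, b):
    # Columns are the pseudo-regressors x_i^0 = df/db evaluated at b
    return np.column_stack([np.exp(-b[1] * x),
                            -b[0] * x * np.exp(-b[1] * x)])

def gauss_newton(x, y, b, tol=1e-10, max_iter=100):
    for _ in range(max_iter):
        e = y - f(x, b)                  # residuals e_i^0 = y_i - f(x_i, b^0)
        X0 = jacobian(x, b)              # regressors x_i^0
        # Step B: slopes in the regression of e^0 on x^0
        step, *_ = np.linalg.lstsq(X0, e, rcond=None)
        b = b + step
        if np.max(np.abs(step)) < tol:   # exit when the estimates converge
            return b
    return b

# Usage on synthetic data (illustrative, not the Lanczos 1 data):
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 24)
y = 2.5 * np.exp(-1.3 * x) + rng.normal(0.0, 0.01, x.size)
print(gauss_newton(x, y, b=np.array([1.0, 1.0])))
```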

  14. Statistics and Data Analysis Maximum Likelihood: Poisson

  15. Application: Doctor Visits • German Individual Health Care data: N=27,236 • Model for number of visits to the doctor: • Poisson regression • Age, Health Satisfaction, Marital Status, Income, Kids
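The estimation output on the following slides is an image; below is a hedged sketch of how such a model could be fit in Python. The variable names and the synthetic stand-in data are assumptions, not the actual German health-care file:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-ins for the covariates named on the slide
rng = np.random.default_rng(1)
n = 1000
age = rng.uniform(25, 65, n)
hsat = rng.integers(0, 11, n).astype(float)    # health satisfaction, 0-10
married = rng.integers(0, 2, n).astype(float)
income = rng.normal(0.35, 0.15, n)
kids = rng.integers(0, 2, n).astype(float)

X = sm.add_constant(np.column_stack([age, hsat, married, income, kids]))
mu = np.exp(X @ np.array([0.5, 0.01, -0.2, -0.1, -0.3, -0.1]))
docvis = rng.poisson(mu)                       # number of doctor visits

# Poisson regression by maximum likelihood
result = sm.Poisson(docvis, X).fit(disp=0)
print(result.params)
```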

  16. Poisson Regression

  17. Nonlinear Least Squares

  18. Maximum Likelihood Estimation This defines a class of estimators based on the particular distribution assumed to have generated the observed random variable. The main advantage of ML estimators is that, among all consistent and asymptotically normal estimators, MLEs have optimal asymptotic properties.

  19. Setting up the MLE The distribution of the observed random variable is written as a function of the parameters to be estimated: P(y_i | data, β) = probability density given the parameters. The likelihood function is constructed from this density: it is the joint probability density function of the observed sample of data, which is generally the product of the individual densities when the data are a random sample.
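In symbols (the slide's formulas are images in the transcript, so this is the standard construction): for a random sample,

$$L(\beta \mid \text{data}) = \prod_{i=1}^n f(y_i \mid x_i, \beta), \qquad \ln L(\beta) = \sum_{i=1}^n \ln f(y_i \mid x_i, \beta).$$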

  20. Likelihood for the Poisson Regression
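The formulas on this slide are images; the standard Poisson regression likelihood they refer to is built from

$$P(y_i \mid x_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}, \qquad \lambda_i = e^{x_i'\beta},$$

so the log-likelihood is

$$\ln L(\beta) = \sum_{i=1}^n \left[ -\lambda_i + y_i \, x_i'\beta - \ln y_i! \right].$$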

  21. Newton’s Method
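As a reference for the iteration this slide illustrates, Newton's method updates the estimate using the gradient and Hessian of the log-likelihood:

$$\hat\beta_{t+1} = \hat\beta_t - H_t^{-1} g_t, \qquad g_t = \left.\frac{\partial \ln L}{\partial \beta}\right|_{\hat\beta_t}, \qquad H_t = \left.\frac{\partial^2 \ln L}{\partial \beta \, \partial \beta'}\right|_{\hat\beta_t}.$$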

  22. Properties of the MLE • Consistent: Not necessarily unbiased, however • Asymptotically normally distributed: Proof based on central limit theorems • Asymptotically efficient: Among the possible estimators that are consistent and asymptotically normally distributed • Invariant: The MLE of g(θ) is g(the MLE of θ)

  23. Computing the Asymptotic Variance We want to estimate {-E[H]}^{-1}. Three ways: (1) Just compute the negative of the actual second derivatives matrix and invert it. (2) Insert the maximum likelihood estimates into the known expected values of the second derivatives matrix. Sometimes (1) and (2) give the same answer (for example, in the Poisson regression model). (3) Since E[H] is the variance of the first derivatives, estimate this with the sample variance (i.e., mean square) of the first derivatives. This will almost always be different from (1) and (2). Since they are estimating the same thing, in large samples all three will give the same answer.
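A minimal sketch of the three estimators for the Poisson model (the function and variable names are illustrative; beta_hat, y, and X are assumed to come from an estimation like the one above):

```python
import numpy as np

def poisson_covariances(beta_hat, y, X):
    lam = np.exp(X @ beta_hat)
    # (1) Negative inverse of the actual Hessian, H = -sum_i lam_i x_i x_i'
    H = -(X * lam[:, None]).T @ X
    V1 = np.linalg.inv(-H)
    # (2) Expected Hessian: identical to (1) for the Poisson model, since
    #     H does not involve y (the coincidence noted on the slide)
    V2 = V1.copy()
    # (3) BHHH / outer product of the first derivatives g_i = (y_i - lam_i) x_i
    G = X * (y - lam)[:, None]
    V3 = np.linalg.inv(G.T @ G)
    return V1, V2, V3
```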

  24. Poisson Regression Iterations

  25. MLE NLS

  26. Using the Model. Partial Effects
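For the Poisson model, the partial effects the next slides plot take the standard form

$$\frac{\partial E[y_i \mid x_i]}{\partial x_i} = \lambda_i \beta = e^{x_i'\beta} \beta,$$

so the effect of any one variable (income) is scaled by λ_i, which depends on the levels of all the others (age); hence the next two slides.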

  27. Effect of Income Depends on Age

  28. Effect of Income | Age

  29. Statistics and Data Analysis Binary Choice

  30. Case Study: Credit Modeling • 1992 American Express analysis of • Application process: Acceptance or rejection; Y = 0 (reject) or 1 (accept). • Cardholder behavior • Loan default (D = 0 or 1). • Average monthly expenditure (E = $/month) • General credit usage/behavior (C = number of charges) • 13,444 applications in November, 1992

  31. Proportion for Bernoulli • In the AmEx data, the true population acceptance rate is 0.7809 = π • Y = 1 if application accepted, 0 if not. • E[y] = π • E[(1/N) Σ_i y_i] = p_accept = π. • This is the estimator
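Reconstructing the derivation the slide summarizes (its formulas are images): the Bernoulli log-likelihood and its maximizer are

$$\ln L(\pi) = \sum_{i=1}^N \left[ y_i \ln \pi + (1 - y_i) \ln(1 - \pi) \right], \qquad \hat\pi_{ML} = \frac{1}{N} \sum_{i=1}^N y_i = \bar y,$$

so the sample proportion is the maximum likelihood estimator of π.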

  32. Some Evidence: Homeowners. Does the acceptance rate depend on home ownership?

  33. A Test of Independence • In the credit card example, are Own/Rent and Accept/Reject independent? • Hypothesis: Prob(Ownership) and Prob(Acceptance) are independent • Formal hypothesis, based only on the laws of probability: Prob(Own, Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities). • Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.

  34. Contingency Table Analysis The data (frequencies):

             Reject   Accept    Total
    Rent      1,845    5,469    7,314
    Own       1,100    5,030    6,130
    Total     2,945   10,499   13,444

  Step 1: Convert to actual proportions:

             Reject   Accept    Total
    Rent    0.13724  0.40680  0.54404
    Own     0.08182  0.37414  0.45596
    Total   0.21906  0.78094  1.00000

  35. Independence Test Step 2: Expected proportions assuming independence. If the factors are independent, then the joint proportions should equal the products of the marginal proportions:

    [Rent, Reject]  0.54404 × 0.21906 = 0.11918
    [Rent, Accept]  0.54404 × 0.78094 = 0.42486
    [Own, Reject]   0.45596 × 0.21906 = 0.09988
    [Own, Accept]   0.45596 × 0.78094 = 0.35606

  36. Comparing Actual to Expected It appears that the acceptance rate is dependent on home ownership
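A quick check of this conclusion (a sketch using the frequencies from slide 34; scipy's chi2_contingency performs the Pearson test that slide 37 then calibrates):

```python
from scipy.stats import chi2_contingency

table = [[1845, 5469],   # Rent: Reject, Accept
         [1100, 5030]]   # Own:  Reject, Accept

# correction=False gives the plain Pearson statistic (no Yates correction)
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p-value = {p:.4g}")
# dof = (R-1)(C-1) = 1, so the statistic is compared with 3.84 at the 5%
# level (next slide); independence is clearly rejected here.
```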

  37. When is the Chi Squared Large? • Critical values from the chi squared table • Degrees of freedom = (R-1)(C-1).

    Critical chi squared
    D.F.    .05     .01
      1     3.84    6.63
      2     5.99    9.21
      3     7.81   11.34
      4     9.49   13.28
      5    11.07   15.09
      6    12.59   16.81
      7    14.07   18.48
      8    15.51   20.09
      9    16.92   21.67
     10    18.31   23.21

  38. Analyzing Default • Do renters default more often (at a different rate) than owners? • To investigate, we study the cardholders only. Counts, with percentages of the 10,499 cardholders (OWNRENT: 0 = rent, 1 = own):

                 DEFAULT
    OWNRENT       0        1      All
    0          4854      615     5469
              46.23     5.86    52.09
    1          4649      381     5030
              44.28     3.63    47.91
    All        9503      996    10499
              90.51     9.49   100.00

  39. Hypothesis Test
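The test on this slide is an image; one standard way to formalize the renters-vs-owners comparison is a two-proportion z-test on the counts from slide 38 (a sketch, not necessarily the test shown in class):

```python
import numpy as np
from scipy.stats import norm

d_rent, n_rent = 615, 5469    # renter defaults / renter cardholders
d_own, n_own = 381, 5030      # owner defaults / owner cardholders
p1, p2 = d_rent / n_rent, d_own / n_own

# Pooled standard error under H0: equal default rates
p_pool = (d_rent + d_own) / (n_rent + n_own)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_rent + 1 / n_own))
z = (p1 - p2) / se
print(f"renter rate = {p1:.4f}, owner rate = {p2:.4f}, "
      f"z = {z:.2f}, p-value = {2 * norm.sf(abs(z)):.4g}")
```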

  40. More Formally: Model Acceptance and Default

  41. Probability Models

  42. Likelihood Function
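The likelihood on this slide is an image; for binary choice models the standard form is

$$L(\beta) = \prod_{i=1}^n F(x_i'\beta)^{y_i} \left[ 1 - F(x_i'\beta) \right]^{1 - y_i},$$

where F is the logistic CDF Λ(z) = e^z / (1 + e^z) for the logit model and the standard normal CDF Φ(z) for the probit model on the slides that follow.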

  43. American Express, 1992

  44. Logistic Model for Acceptance

  45. Probit Default Model
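The estimates on these two slides are images; below is a hedged sketch of the pair of models (logit for acceptance, probit for default among cardholders). The AmEx data are not in the transcript, so synthetic data and illustrative names stand in:

```python
import numpy as np
from scipy.stats import norm
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5000
income = rng.normal(4.0, 1.0, n)
own = rng.integers(0, 2, n).astype(float)
X = sm.add_constant(np.column_stack([income, own]))

# Logistic model for acceptance
accept = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ [-1.0, 0.4, 0.5]))))
logit_res = sm.Logit(accept, X).fit(disp=0)

# Probit model for default, estimated on the cardholders only
card = accept == 1
default = rng.binomial(1, norm.cdf(X[card] @ [-1.0, -0.05, -0.2]))
probit_res = sm.Probit(default, X[card]).fit(disp=0)

print(logit_res.params)
print(probit_res.params)
```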

  46. Think statistically. Build models. Thank you.
