1 / 95

Session 7

Session 7. Outline. Chi-square Goodness-of-Fit Tests Fit to a Normal Simulation Modeling Autocorrelation , serial correlation Runs test Durbin-Watson Model Building Variable Selection Methods Minitab. Goodness-of-Fit Tests.

lisle
Download Presentation

Session 7

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Session 7

  2. Outline • Chi-square Goodness-of-Fit Tests • Fit to a Normal • Simulation Modeling • Autocorrelation, serial correlation • Runs test • Durbin-Watson • Model Building • Variable Selection Methods • Minitab Applied Regression -- Prof. Juran

  3. Goodness-of-Fit Tests • Determine whether a set of sample data have been drawn from a hypothetical population • Same four basic steps as other hypothesis tests we have learned • An important tool for simulation modeling; used in defining random variable inputs Applied Regression -- Prof. Juran

  4. Example: Warren Sapp Financial analyst Warren Sapp wants to run a simulation model that includes the assumption that the daily volume of a specific type of futures contract traded at U.S. commodities exchanges (represented by the random variable X) is normally distributed with a mean of 152 million contracts and a standard deviation of 32 million contracts. (This assumption is based on the conclusion of a study conducted in 1998.) Warren wants to determine whether this assumption is still valid. Applied Regression -- Prof. Juran

  5. He studies the trading volume of these contracts for 50 days, and observes the following results (in millions of contracts traded): Applied Regression -- Prof. Juran

  6. Applied Regression -- Prof. Juran

  7. Here is a histogram showing the theoretical distribution of 50 observations drawn from a normal distribution with μ = 152 and σ = 32, together with a histogram of Warren Sapp’s sample data: Applied Regression -- Prof. Juran

  8. The Chi-Square Statistic Applied Regression -- Prof. Juran

  9. Essentially, this statistic allows us to compare the distribution of a sample with some expected distribution, in standardized terms. It is a measure of how much a sample differs from some proposed distribution. A large value of chi-square suggests that the two distributions are not very similar; a small value suggests that they “fit” each other quite well. Applied Regression -- Prof. Juran

  10. Like Student’s t, the distribution of chi-square depends on degrees of freedom. In the case of chi-square, the number of degrees of freedom is equal to the number of classes (a.k.a. “bins” into which the data have been grouped) minus one, minus the number of estimated parameters. Applied Regression -- Prof. Juran

  11. Applied Regression -- Prof. Juran

  12. Applied Regression -- Prof. Juran

  13. Note: It is necessary to have a sufficiently large sample so that each class has an expected frequency of at least 5.We need to make sure that the expected frequency in each bin is at least 5, so we “collapse” some of the bins, as shown here. Applied Regression -- Prof. Juran

  14. The number of degrees of freedom is equal to the number of bins minus one, minus the number of estimated parameters. We have not estimated any parameters, so we have d.f. = 4 – 1 – 0 = 3. The critical chi-square value can be found either by using a chi-square table or by using the Excel function: =CHIINV(alpha, d.f.) = CHIINV(0.05, 3) = 7.815 We will reject the null hypothesis if the test statistic is greater than 7.815. Applied Regression -- Prof. Juran

  15. Our test statistic is not greater than the critical value; we cannot reject the null hypothesis at the 0.05 level of significance. It would appear that Warren is justified in using the normal distribution with μ = 152 and σ = 32 to model futures contract trading volume in his simulation. Applied Regression -- Prof. Juran

  16. The p-value of this test has the same interpretation as in any other hypothesis test, namely that it is the smallest level of alpha at which H0 could be rejected. In this case, we calculate the p-value using the Excel function: = CHIDIST(test stat, d.f.) = CHIDIST(7.439,3) = 0.0591 Applied Regression -- Prof. Juran

  17. Example: Catalog Company If we want to simulate the queueing system at this company, what distributions should we use for the arrival and service processes? Applied Regression -- Prof. Juran

  18. Arrivals Applied Regression -- Prof. Juran

  19. Applied Regression -- Prof. Juran

  20. Applied Regression -- Prof. Juran

  21. Applied Regression -- Prof. Juran

  22. Applied Regression -- Prof. Juran

  23. Services Applied Regression -- Prof. Juran

  24. Applied Regression -- Prof. Juran

  25. Decision Models -- Prof. Juran

  26. Decision Models -- Prof. Juran

  27. Decision Models -- Prof. Juran

  28. Decision Models -- Prof. Juran

  29. Decision Models -- Prof. Juran

  30. Decision Models -- Prof. Juran

  31. Decision Models -- Prof. Juran

  32. Decision Models -- Prof. Juran

  33. Decision Models -- Prof. Juran

  34. Decision Models -- Prof. Juran

  35. Decision Models -- Prof. Juran

  36. Decision Models -- Prof. Juran

  37. Other uses for the Chi-Square statistic The chi-square technique can often be employed for purposes of estimation or hypothesis testing when the z or t statistics are not appropriate. In addition to the goodness-of-fit application described above, there are at least three other important uses for chi-square: • Tests of the independence of two qualitative population variables. • Tests of the equality or inequality of more than two population proportions. • Inferences about a population variance, including the estimation of a confidence interval for a population variance from sample data. Applied Regression -- Prof. Juran

  38. Serial Correlation (A.k.a. Autocorrelation) Are the residuals independent of each other? What if there’s evidence that sequential residuals have a positive correlation? Applied Regression -- Prof. Juran

  39. Applied Regression -- Prof. Juran

  40. Applied Regression -- Prof. Juran

  41. There seems to be a relationship between each observation and the ones around it. In other words, there is some positive correlation between the observations and their successors. If true, this suggests that a lot of the variability in observation Yi can be explained by observation Yi – 1. In turn, this might suggest that the importance of Money Stock is being overstated by our original model. Applied Regression -- Prof. Juran

  42. Applied Regression -- Prof. Juran

  43. Applied Regression -- Prof. Juran

  44. Applied Regression -- Prof. Juran

  45. Applied Regression -- Prof. Juran

  46. Applied Regression -- Prof. Juran

  47. Applied Regression -- Prof. Juran

  48. Applied Regression -- Prof. Juran

  49. Applied Regression -- Prof. Juran

More Related