Goodness of Fit Tests

Goodness of Fit Tests. The goal of χ² goodness-of-fit tests is to test whether the data come from a specified distribution. There are various situations to which these tests apply. The first situation we will explore is when we observe count data in k different categories.

suttong

Presentation Transcript


  1. Goodness of Fit Tests • The goal of χ² goodness-of-fit tests is to test whether the data come from a specified distribution. • There are various situations to which these tests apply. • The first situation we will explore is when we observe count data in k different categories. • The aim is to test the null hypothesis that the probabilities of the k categories are p1, p2, …, pk. • We distinguish between two cases. STA261 week 12

  2. Chi-Squared Test - Case 1 • The null hypothesis completely specifies the probabilities of each of the k categories. • For each category we calculate the expected count Ei = npi, where n is the total number of observations. • The test statistic and its distribution are…
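
For reference, the standard Pearson statistic for a fully specified null hypothesis is sketched below. The symbol O_i (the observed count in category i) is notation assumed here, and the slide's own notation may differ.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = n p_i,
\qquad \chi^2 \;\overset{\text{approx.}}{\sim}\; \chi^2_{k-1} \quad \text{under } H_0 .
```

Large values of the statistic count as evidence against the null hypothesis, and the χ² approximation is a large-sample one.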

  3. Example • The statistics department at U of T offers introductory courses for students from other disciplines. The department believes that 40% of the students are math majors, 30% are computer science majors, 20% are biology majors and 10% are chemistry majors. A random sample of 120 students contained 52, 38, 21, and 9 students from these four majors, respectively. Do these data support the department's claim?
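
The example above can be worked through with a short script. This is a minimal sketch using the standard Pearson statistic; the 5% significance level and the variable names are assumptions made here, not part of the slide.

```python
# Chi-squared goodness-of-fit statistic for the majors example (Case 1).
observed = [52, 38, 21, 9]              # math, computer science, biology, chemistry
null_probs = [0.40, 0.30, 0.20, 0.10]   # probabilities claimed by the department
n = sum(observed)                       # 120 students

expected = [n * p for p in null_probs]  # E_i = n * p_i
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                  # k - 1 = 3

# Assumption: 5% level. 7.815 is the tabulated upper 5% point of chi-squared(3).
CRITICAL_5PCT_DF3 = 7.815
reject_null = statistic > CRITICAL_5PCT_DF3

print(f"statistic = {statistic:.3f}, df = {df}, reject H0 at 5%: {reject_null}")
```

The expected counts are 48, 36, 24 and 12, which gives a statistic of about 1.57 on 3 degrees of freedom. That is well below 7.815, so at the assumed 5% level the data are consistent with the department's claim.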

  4. Chi-Squared Test - Case 2 • The null hypothesis does not fully specify the probabilities. • In this case the probabilities of the different categories may be functions of other, unknown parameters. • First use the sample data to estimate the r unknown parameters. • Then use the estimated parameters to estimate the k probabilities. • For each category, calculate the estimated expected count. • The test statistic is…
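
The steps above can be sketched for a Poisson null hypothesis, where r = 1 parameter (λ) is estimated. This is an illustrative sketch, not the slide's own computation: the function name `poisson_gof`, the choice to pool all values at or above `max_cell` into one cell, and the sample data are all hypothetical. It uses the conventional k − 1 − r degrees of freedom.

```python
import math
from collections import Counter

def poisson_gof(observations, max_cell):
    """Chi-squared statistic for H0: data ~ Poisson(lambda), lambda unknown.

    Cells are the values 0, 1, ..., max_cell - 1 plus one pooled cell
    for every value >= max_cell.
    """
    n = len(observations)
    lam = sum(observations) / n          # step 1: estimate lambda by the sample mean

    # Step 2: estimated cell probabilities under Poisson(lam).
    probs = [math.exp(-lam) * lam ** j / math.factorial(j) for j in range(max_cell)]
    probs.append(1.0 - sum(probs))       # pooled upper tail

    # Step 3: observed and estimated expected counts per cell.
    tally = Counter(min(x, max_cell) for x in observations)
    observed = [tally.get(j, 0) for j in range(max_cell + 1)]
    expected = [n * p for p in probs]

    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - 1           # k - 1 - r with r = 1
    return statistic, df, lam

# Hypothetical counts, for illustration only (not the data from the slides).
sample = [0] * 10 + [1] * 20 + [2] * 15 + [3] * 5
stat, df, lam_hat = poisson_gof(sample, max_cell=3)
print(f"lambda_hat = {lam_hat:.2f}, statistic = {stat:.3f}, df = {df}")
```

The statistic is then compared with the upper tail of the χ² distribution with `df` degrees of freedom.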

  5. Example • A farmer believes that the number of eggs a chicken lays per day has a Poisson(λ) distribution. He observed the following data…

  6. Remark • In many cases we will observe data that are not categorized and we will want to test whether the data come from a certain distribution. • If the distribution we are testing is discrete, the values of the variable serve as the categories. • However, if the variable can take infinitely many values, the values should be grouped so that the expected frequency in each category is at least 5. • If the distribution we are testing is continuous, we need to group the measurements of the random variable of interest into k intervals. Very often the choice of cells is arbitrary. • χ² tests have low power when they are applied to continuous data, in which case we can use other tests.
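
One simple way to apply the "expected frequency at least 5" rule is to merge neighbouring cells. The helper below is a hypothetical sketch of one such pooling scheme (left to right, with any leftover tail folded into the last cell); the slide does not prescribe a particular scheme, and the counts are made up for illustration.

```python
def pool_small_cells(observed, expected, min_expected=5.0):
    """Merge adjacent cells until every pooled cell has expected count >= min_expected."""
    pooled_obs, pooled_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:      # cell is large enough: close it
            pooled_obs.append(acc_obs)
            pooled_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    if acc_exp > 0.0:                    # leftover tail that is still too small
        if pooled_exp:
            pooled_obs[-1] += acc_obs    # fold it into the last pooled cell
            pooled_exp[-1] += acc_exp
        else:
            pooled_obs.append(acc_obs)   # nothing to fold into: keep as one cell
            pooled_exp.append(acc_exp)
    return pooled_obs, pooled_exp

# Hypothetical counts: the last cell has expected count 2.2, so it is merged.
obs, exp = pool_small_cells([12, 20, 11, 5, 2], [13.6, 17.7, 11.5, 5.0, 2.2])
print(obs, [round(e, 1) for e in exp])
```

The degrees of freedom are then based on the number of cells after pooling.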

  7. Example

  8. Kolmogorov-Smirnov Goodness-of-Fit Test • The K-S test is also called the Kolmogorov-Smirnov D test. • The K-S goodness-of-fit test tests whether the distribution of the observed data differs significantly from a hypothesized distribution. • For continuous data it is often a more powerful alternative to the chi-square goodness-of-fit test. • The test statistic in the K-S test is based on the largest absolute difference between the cumulative observed proportion and the cumulative proportion expected on the basis of the hypothesized distribution.
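
The statistic described above, D = max |F_n(x) − F_0(x)|, can be computed directly from the sorted sample. This is a minimal sketch for a fully specified continuous null distribution; the function name `ks_statistic` and the sample values are hypothetical, and the critical values or p-value needed to complete the test are not computed here.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF and the hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so the gap is checked on both sides of the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution, used as the hypothesized F_0."""
    return min(max(x, 0.0), 1.0)

# Hypothetical sample, for illustration only.
d = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
print(f"D = {d:.3f}")
```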

  9. Contingency Tables • The goal is to test whether two categorical variables are independent. • The row variable has r categories and the column variable has c categories. • The data are the counts of observations in the cells of the r × c table… • The null hypothesis states that the row variable and the column variable are independent. The alternative states that the variables are dependent. • To conduct the test, we calculate the expected count for each cell… • The test statistic and its distribution are…
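
The standard version of this test uses expected counts E_ij = (row i total × column j total) / n and compares the Pearson statistic with a χ² distribution on (r − 1)(c − 1) degrees of freedom. The sketch below follows that standard form; the function name `chi2_independence` and the 2 × 2 counts are hypothetical and are not taken from the slides.

```python
def chi2_independence(table):
    """Return (statistic, df, expected) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical 2 x 2 counts, for illustration only.
stat, df, expected = chi2_independence([[10, 20], [30, 40]])
print(f"statistic = {stat:.4f}, df = {df}")
```

For these made-up counts the expected table is [[12, 18], [28, 42]] and the statistic is 200/252 ≈ 0.794 on 1 degree of freedom.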

  10. Example
