
Mobile Computing Group


braunk

Presentation Transcript


  1. Mobile Computing Group A quick-and-dirty tutorial on the chi2 test for goodness-of-fit testing

  2. Outline • The presentation follows the pyramid schema (top to base): • Chi2 tests for GoF • Goodness-of-fit (GoF) • Background concepts

  3. Background • Descriptive vs. inferential statistics • Descriptive : data are used only for descriptive purposes (tables, graphs, measures of variability etc.) • Inferential : data are used for drawing inferences, making predictions etc. • Sample vs. population • A sample is drawn from a population, which is assumed to have some characteristics • The sample is often used to make inferences about the population (inferential statistics) : • Hypothesis testing • Estimation of population parameters

  4. Background • Statistic vs. parameter • A statistic is related to (estimated from) a sample. It can be used for both descriptive and inferential purposes • A parameter refers to the whole population. A sample statistic is often used to infer a population parameter • Example : the sample mean may be used to infer the population mean (expected value) • Hypothesis testing • A procedure where sample data are used to evaluate a hypothesis regarding the population • A hypothesis may refer to several things : properties of a single population, the relation between two populations etc. • Two statistical hypotheses are defined: a null hypothesis H0 and an alternative hypothesis H1 • H0 is often a statement of no effect or no difference. It is the hypothesis the researcher seeks to reject

  5. Background • Inferential statistical test • Hypothesis testing is carried out via an inferential statistical test : • Sample data are manipulated to yield a test statistic • The obtained value of the test statistic is evaluated with respect to a sampling distribution, i.e., a theoretical probability distribution for the possible values of the test statistic • The theoretical values of the statistic are usually tabulated and let one assess the statistical significance of the test result • Goodness-of-fit testing is a type of hypothesis testing • Devise inferential statistical tests, apply them to the sample, and infer how well a theoretical distribution matches the population distribution

  6. GoF as hypothesis testing • Hypothesis H0: • The sample is derived from a theoretical distribution F() • The sample data are manipulated to derive a test statistic • In the case of the chi2 statistic this includes aggregation of data into bins and some computations • The statistic, as computed from data, is checked against the sampling distribution • For the chi2 test, the sampling distribution is the chi2 distribution, hence the name

  7. Goodness-of-fit • Statistical tests and statistics : the big picture • EDF-based tests (e.g., KS test, Anderson-Darling test) • Chi2 type tests • Classical chi2 statistics : Pearson chi2 statistic, modified chi2 statistic, log-likelihood ratio statistic • Generalized chi2 statistics • Specialized tests (e.g., Shapiro-Wilk test for normality)

  8. Pearson chi2 statistic • M : number of bins • Oi (Ni) : observed frequency in bin i • n : sample size • Ei (npi) : expected frequency in bin i according to the theoretical distribution F() • If X1, X2, X3, …, Xn is the random sample and F() the theoretical distribution under test, the Pearson chi2 statistic is computed as: $X^2 = \sum_{i=1}^{M} \frac{(O_i - E_i)^2}{E_i} = \sum_{i=1}^{M} \frac{(N_i - n p_i)^2}{n p_i}$
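The definitions on this slide can be sketched in a few lines of Python. This is only a minimal illustration, assuming the bin counts have already been obtained; the function name `pearson_chi2` and the coin-flip counts are hypothetical and do not come from the presentation.

```python
def pearson_chi2(observed, probs):
    """Pearson chi2 statistic from observed bin counts O_i and bin probabilities p_i."""
    if len(observed) != len(probs):
        raise ValueError("need one probability per bin")
    n = sum(observed)                   # sample size n
    expected = [n * p for p in probs]   # expected frequencies E_i = n * p_i
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 coin flips, 60 heads and 40 tails, tested against a fair coin.
x2 = pearson_chi2([60, 40], [0.5, 0.5])
print(x2)  # (60-50)^2/50 + (40-50)^2/50 = 4.0
```

Passing probabilities rather than expected counts keeps the relation Ei = npi explicit, which is the form used in the bin-selection rules later on.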

  9. Interpretation of chi2 statistic • Theory says that the Pearson chi2 statistic follows a chi2 distribution, whose df are • M-1, when the parameters of the fitted distribution are given a priori (case 0 test) • Somewhere between M-1 and M-1-q, when the q parameters of the distribution are estimated from the sample data • Usually, the df for this case are taken to be M-1-q • Having computed the value of the chi2 statistic X2, I check the chi2 distribution with M-1 (or M-1-q) df to find • The probability of getting a value equal to or greater than the computed value X2, called the p-value • If p < a, where a is the significance level of my test, the hypothesis is rejected, otherwise it is retained • Standard values for a are 0.1, 0.05, 0.01 – the lower a is, the more conservative I am in rejecting the hypothesis H0
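The lookup and decision step can be sketched as follows, using only the Python standard library. The helper names `chi2_sf` and `gof_decision` and the input X2 = 12.0 are hypothetical; the upper-tail probability is computed with a standard series for the incomplete gamma function instead of a printed table, and H0 is rejected when the p-value falls below the significance level a.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi2 distribution with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z)
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

def gof_decision(x2, n_bins, n_estimated=0, alpha=0.05):
    """Return (df, p_value, reject_H0), taking df = M - 1 - q."""
    df = n_bins - 1 - n_estimated
    if df < 1:
        raise ValueError("too few bins for the number of estimated parameters")
    p_value = chi2_sf(x2, df)
    return df, p_value, p_value < alpha

# Hypothetical input: X2 = 12.0 computed from M = 6 bins, no parameters estimated (q = 0).
df, p, reject = gof_decision(12.0, n_bins=6)
print(df, round(p, 3), reject)
```

The series converges for any positive argument but slowly for very large X2; a statistics library would normally be used in practice.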

  10. Example • A die is rolled 120 times • 1 comes up 20 times, 2 comes up 14 times, 3 comes up 18 times, 4 comes up 17 times, 5 comes up 22 times and 6 comes up 29 times • The question is: “Is the die biased?” – or better: “Do these data suggest that the die is biased?” • Hypothesis H0 : the die is not biased • Therefore, according to the null hypothesis, these counts should be distributed uniformly • F() : the discrete uniform distribution

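  11. Example – cont. • Computations: with Ei = 120/6 = 20 for every face, $X^2 = \frac{(20-20)^2 + (14-20)^2 + (18-20)^2 + (17-20)^2 + (22-20)^2 + (29-20)^2}{20} = \frac{134}{20} = 6.7$ • Interpretation • The distribution of the test statistic has 5 df • The probability of getting a value smaller than or equal to 6.7 under a chi2 distribution with 5 df is about 0.75, so the p-value (the probability of a value equal to or greater than 6.7) is about 0.25, which is > a for all a in {0.01, 0.05, 0.1} • Therefore the hypothesis that the die is not biased cannot be rejected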

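  12. Interpretation of Pearson chi2 • Graphical illustration (density of the chi2 distribution with 5 df) • Computed statistic : X2 = 6.7, p-value : 0.25 • Critical values : 9.24 (a = 0.1), 11.07 (a = 0.05), 15.09 (a = 0.01) • At the 10% significance level, I would reject the hypothesis if the computed X2 > 9.24 (10% of the area under the curve lies to the right of 9.24)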
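The die example can be reproduced with a short Python sketch. The counts and the critical values 9.24, 11.07 and 15.09 are the ones quoted in the slides; the variable names are only illustrative.

```python
observed = [20, 14, 18, 17, 22, 29]   # counts for faces 1..6 from the example
n = sum(observed)                      # 120 rolls
expected = n / len(observed)           # 20 per face under H0 (unbiased die)

x2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1                 # M - 1 = 5, no parameters estimated

# Tabulated critical values of the chi2 distribution with 5 df, as quoted in the slides
critical = {0.10: 9.24, 0.05: 11.07, 0.01: 15.09}

print(f"X2 = {x2:.2f}, df = {df}")
for alpha, threshold in critical.items():
    verdict = "reject H0" if x2 > threshold else "cannot reject H0"
    print(f"a = {alpha}: critical value {threshold} -> {verdict}")
```

Comparing X2 with a critical value is equivalent to comparing the p-value with a: X2 = 6.7 lies below all three thresholds, so H0 is retained at every listed significance level.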

  13. Properties of Pearson chi2 statistic • It can be computed for both discrete and continuous variables • This holds for all chi2 statistics. Maximum flexibility, but it fails to make use of all available information for continuous variables • It is perhaps the simplest one from a computational point of view • As with all chi2 statistics, one needs to define the number and borders of the bins • These are generally a function of the sample size and the theoretical distribution under test

  14. Bin selection • How many and which? • Different opinions in the literature, no rigorous proof of optimality • There seems to be convergence on the following aspects • Probability of bins • The bins should be chosen equiprobable with respect to the theoretical distribution under test • Minimum expected frequencies npi : • (Cramer, 46) : npi > 10 for all bins • (Cochran, 54) : npi > 1 for all bins, npi >= 5 for 80% of bins • (Roscoe and Byars, 71)
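The minimum-expected-frequency rules quoted above translate directly into checks on the list of npi values. The sketch below is one possible reading of the two rules as stated on the slide; the function names and the example with n = 50 and 10 bins are hypothetical.

```python
def expected_frequencies(n, probs):
    """Expected frequencies n * p_i for sample size n and bin probabilities p_i."""
    return [n * p for p in probs]

def cramer_ok(expected):
    """(Cramer, 46): every expected frequency exceeds 10."""
    return all(e > 10 for e in expected)

def cochran_ok(expected):
    """(Cochran, 54): every n*p_i > 1 and at least 80% of the bins have n*p_i >= 5."""
    share_ge_5 = sum(e >= 5 for e in expected) / len(expected)
    return all(e > 1 for e in expected) and share_ge_5 >= 0.8

# Hypothetical case: n = 50 observations in 10 equiprobable bins, so n*p_i = 5 in each bin.
expected = expected_frequencies(50, [0.1] * 10)
print(cramer_ok(expected), cochran_ok(expected))  # False True
```

With equiprobable bins every npi equals n/M, so both checks reduce to a bound on the number of bins for a given sample size.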

  15. Bin selection • Relation of the number of bins M to the sample size n • (Mann and Wald, 42), (Schorr, 74) : for large sample sizes, $1.88\,n^{2/5} < M < 3.76\,n^{2/5}$ • (Koehler and Larntz, 80) : for small sample sizes, M >= 3, n >= 10 and $n^2/M \geq 10$ • (Roscoe and Byars, 71) • Equi-probable bins hypothesis : n > M when a = 0.01 and a = 0.05 • Non-equiprobable bins : n > 2M (a = 0.05) and n > 4M (a = 0.01)
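As a rough illustration of these rules, the following Python sketch evaluates the large-sample bounds and the small-sample conditions as quoted on the slide. The function names and the sample sizes are hypothetical.

```python
def mann_wald_range(n):
    """Bounds 1.88 * n^(2/5) < M < 3.76 * n^(2/5) quoted for large sample sizes."""
    scale = n ** 0.4
    return 1.88 * scale, 3.76 * scale

def koehler_larntz_ok(n, m):
    """Small-sample conditions: M >= 3, n >= 10 and n^2 / M >= 10."""
    return m >= 3 and n >= 10 and n * n / m >= 10

low, high = mann_wald_range(1000)    # hypothetical large sample
print(f"n = 1000: {low:.1f} < M < {high:.1f}")
print(koehler_larntz_ok(n=20, m=5))  # hypothetical small sample: 20^2 / 5 = 80 >= 10
```

Because the bounds grow like n^(2/5), multiplying the sample size by ten raises the suggested number of bins by a factor of only about 2.5.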

  16. Bin selection • Bins vs. sample size according to Mann and Wald (plot)

  17. Bin selection : continuous vs. discrete • Continuous distribution (CDF plot, vertical axis from 0 to 1.0, bin i marked) : equi-probable bins are easy to select • Discrete distribution (step CDF over the values 1 to 7) : it is less straightforward to define equi-probable bins
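The contrast between the two cases can be sketched in Python. For a continuous distribution, equi-probable bin borders follow from the inverse CDF; the exponential distribution and its rate are assumptions made only for this illustration, and the function names are hypothetical. For a discrete distribution, bins can only be unions of whole outcomes, so only some bin counts are feasible.

```python
import math

def exponential_cdf(x, rate):
    """CDF of the exponential distribution (assumed here as the theoretical F)."""
    return 1.0 - math.exp(-rate * x)

def equiprobable_edges_exponential(rate, m):
    """Interior bin borders giving m equi-probable bins under exponential(rate)."""
    # Inverse CDF: F^-1(u) = -ln(1 - u) / rate, evaluated at u = 1/m, 2/m, ...
    return [-math.log(1.0 - i / m) / rate for i in range(1, m)]

edges = equiprobable_edges_exponential(rate=2.0, m=4)
bounds = [0.0] + edges + [math.inf]
probs = [exponential_cdf(b, 2.0) - exponential_cdf(a, 2.0)
         for a, b in zip(bounds, bounds[1:])]
print(probs)  # each bin has probability close to 1/4

# Discrete case: an unbiased die has 6 outcomes of probability 1/6 each, so exactly
# equi-probable bins exist only when the number of bins divides 6.
feasible = [m for m in range(2, 7) if 6 % m == 0]
print(feasible)  # [2, 3, 6]
```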

  18. References : Textbooks • D.J. Sheskin, Handbook of parametric and nonparametric statistical procedures • Introduction (descriptive vs. inferential statistics, hypothesis testing, concepts and terminology) • Test 8 (chap. 8) – The Chi-Square Goodness-of-Fit Test (high-level description with examples and discussion of several aspects) • R. D'Agostino, M. Stephens, Goodness-of-fit techniques • Chapter 3 – Tests of Chi-square type • Reviews the theoretical background and looks more generally at chi2 tests, not only the Pearson test

  19. References : Papers • S. Horn, Goodness-of-Fit tests for discrete data: A review and an Application to a Health Impairment scale • Good discussion of the properties and pros/cons of most goodness-of-fit tests for discrete data • Accessible, tutorial-like
