Chi-Square Test. Chapter 7.
Content
• χ² test of fourfold data
• χ² test of paired fourfold data
• Fisher exact probabilities in fourfold data
• χ² test of R×C table
• Multiple comparison of sample rates
• χ² test of goodness of fit
Objectives of the χ² test:
• to infer whether the rates or constituent ratios differ between two populations or among more than two populations
• to make multiple comparisons among the rates of several samples
• to infer whether two categorical variables are associated
• to test goodness of fit
Test statistic: χ²
Suitable for: qualitative data
Objective: to judge whether the rates or constituent ratios of two populations differ (equivalent to the u-test). Requirement: the individuals of the two samples, each classified into two categories, must be arranged as fourfold (2×2) table data.
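1. The χ² distribution
(1) The χ² distribution is a continuous distribution with density
$$f(\chi^2) = \frac{1}{2\,\Gamma(\nu/2)} \left(\frac{\chi^2}{2}\right)^{\nu/2 - 1} e^{-\chi^2/2}, \qquad \chi^2 > 0$$
(2) One of its basic properties is additivity: if χ²₁ and χ²₂ are independent χ² variables with ν₁ and ν₂ degrees of freedom, then
$$\chi_1^2 + \chi_2^2 \sim \chi^2(\nu_1 + \nu_2)$$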
2. The basic idea of the χ² test
Example 7-1: A hospital wants to compare the curative effect of drug A (experimental group) and drug B (control group) in lowering intracranial pressure. 200 patients with high intracranial pressure were randomly assigned to the two groups; the results are shown in Table 7-1. Do the effective rates of the two groups differ?
Table 7-1 Comparison of the effective rates of the two groups in lowering intracranial pressure
These data can be arranged in the form of chart 7-2: there are two treatment groups, and the count in each group is made up of two parts, event occurred and event not occurred. The table contains four basic counts (a, b, c, d), and all other figures can be derived from them, which is why it is called fourfold table data.
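Basic idea: it can be understood through the basic formula of the χ² test (formula 7-1):
$$\chi^2 = \sum \frac{(A - T)^2}{T}, \qquad \nu = (\text{rows} - 1)(\text{columns} - 1)$$
where A is the actual frequency and T is the theoretical frequency.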
The theoretical (expected) frequencies can be calculated by the following formula:
$$T_{RC} = \frac{n_R \, n_C}{n}$$
where T_RC is the theoretical frequency in row R and column C, n_R is the total of the corresponding row, n_C is the total of the corresponding column, and n is the total sample size.
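The calculation of the theoretical frequencies from the row and column totals can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` and the counts in `observed` are hypothetical and are not the data of Table 7-1.

```python
def expected_frequencies(table):
    """Return T_RC = n_R * n_C / n for every cell of an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[n_r * n_c / n for n_c in col_totals] for n_r in row_totals]


# Hypothetical counts: rows = two groups, columns = effective / not effective.
observed = [[30, 10],
            [20, 20]]
expected = expected_frequencies(observed)
print(expected)
```

Because both rows of this made-up table have the same total, both groups receive the same theoretical frequencies, which is what the hypothesis of equal rates implies.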
The theoretical frequency is determined under the hypothesis H0: π1 = π2, using the pooled rate of the two samples.
The test statistic: the value of χ² reflects how closely the actual frequencies agree with the theoretical frequencies.
From formula 7-1 we can see that the value of χ² also depends on the number of terms (A − T)²/T (strictly speaking, on the degrees of freedom ν). ν is determined by the number of cells that can take values freely, not by the sample size n.
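As a rough sketch of how the statistic and its degrees of freedom are obtained together, the following Python function sums (A − T)²/T over all cells and returns ν = (rows − 1)(columns − 1). The function name `chi_square` and the counts are hypothetical illustrations, not values from the examples in this chapter.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an R x C count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(table):
        for c, actual in enumerate(row):
            theoretical = row_totals[r] * col_totals[c] / n  # T_RC = n_R * n_C / n
            stat += (actual - theoretical) ** 2 / theoretical
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical fourfold table: the degrees of freedom are 1 whatever the sample size.
stat, df = chi_square([[30, 10], [20, 20]])
print(round(stat, 4), df)
```

Multiplying every count by ten changes the statistic but leaves the degrees of freedom at 1, which illustrates that ν depends on the table layout and not on n.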
3. The process of the hypothesis test
(1) Establish the hypotheses and set the test level.
H0: π1 = π2, the population effective rates of the experimental group and the control group in lowering intracranial pressure are equal
H1: π1 ≠ π2, the two population effective rates are not equal
α = 0.05
The χ² distribution is continuous, whereas fourfold table data are discrete, so the χ² value calculated from them is also discrete. To improve the approximation to the continuous distribution, a continuity correction is needed.
The conditions for choosing the test formula for fourfold table data:
• n ≥ 40 and all T ≥ 5: special (uncorrected) formula
• n ≥ 40 but some 1 ≤ T < 5: corrected formula
• n < 40 or some T < 1: Fisher exact probabilities method
The continuity correction of the χ² test is only suitable for fourfold table data, where ν equals 1; when ν is more than 1, no correction should be made.
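The choice of formula can be written as a small decision function. The sketch below assumes the customary thresholds (n ≥ 40 and T ≥ 5 for the uncorrected formula, n ≥ 40 with 1 ≤ T < 5 for the corrected formula, otherwise Fisher's method) and the usual shortcut form of the corrected statistic for a fourfold table, χ²c = (|ad − bc| − n/2)² n / [(a+b)(c+d)(a+c)(b+d)]. The function names and all numbers are hypothetical.

```python
def choose_fourfold_method(n, min_expected):
    """Pick a test for a fourfold table from n and the smallest theoretical frequency."""
    if n < 40 or min_expected < 1:
        return "fisher"
    if min_expected < 5:
        return "corrected"
    return "uncorrected"


def corrected_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = (abs(a * d - b * c) - n / 2) ** 2 * n
    return numerator / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical inputs, chosen only to exercise each branch.
print(choose_fourfold_method(200, 12.0))
print(choose_fourfold_method(78, 4.0))
print(choose_fourfold_method(33, 6.0))
print(round(corrected_chi_square(30, 10, 20, 20), 2))
```

The correction always makes the statistic smaller than the uncorrected value, so a borderline result can move from one side of the critical value to the other.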
Example 7-2: A doctor wants to compare the effect of drug A and drug B in treating cerebrovascular diseases. 78 patients with such diseases were randomly assigned to two groups; the results are shown in Table 7-2. Is the curative effect of the two drugs the same?
Table 7-2 Comparison of the effective rates of the two drugs in treating cerebrovascular diseases
In this case the conditions for the corrected formula are met, so the corrected formula is used. From the critical value table of χ², P > 0.05. At the test level α = 0.05, H0 cannot be rejected, so we cannot conclude that the effective rates of the two drugs in treating cerebrovascular diseases differ.
If the correction is not applied, then P < 0.05 and H0 would be rejected, so the conclusion is the opposite.
Section 2 χ² test of paired fourfold table
As with measurement data, inference about the difference between two population rates (proportions) from count data can follow either a group design or a paired design. These give fourfold table data and paired fourfold table data respectively.
Example 7-3: A laboratory measured serum antinuclear antibodies in 58 patients with suspected systemic lupus erythematosus by both latex agglutination and immunofluorescence; the results are shown in Table 7-3. Do the two methods differ?
In a paired design there are four possible results of the two treatments for each pair:
① positive by both methods (a)
② negative by both methods (d)
③ positive by immunofluorescence, negative by latex agglutination (b)
④ positive by latex agglutination, negative by immunofluorescence (c)
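a and d are the results on which the two methods agree; b and c are the results on which they disagree.
Statistic:
$$\chi^2 = \frac{(b - c)^2}{b + c}, \quad \nu = 1 \qquad (b + c \ge 40)$$
$$\chi_c^2 = \frac{(|b - c| - 1)^2}{b + c}, \quad \nu = 1 \qquad (b + c < 40)$$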
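A minimal sketch of the paired statistic in Python follows. It assumes the usual McNemar form, which uses only the discordant counts b and c, and applies the continuity correction when b + c < 40. The function name `paired_chi_square` and the counts are hypothetical and are not the data of Table 7-3.

```python
def paired_chi_square(b, c):
    """Chi-square (1 degree of freedom) for a paired fourfold table.

    Only the discordant counts b and c enter the statistic; the concordant
    counts a and d play no part.
    """
    discordant = b + c
    if discordant == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    if discordant >= 40:
        return (b - c) ** 2 / discordant
    return (abs(b - c) - 1) ** 2 / discordant  # continuity-corrected form


# Hypothetical discordant counts.
print(paired_chi_square(30, 20))            # b + c >= 40, uncorrected
print(round(paired_chi_square(10, 3), 4))   # b + c < 40, corrected
```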
Cautions: this method is generally used for samples that are not very large.
Reasons:
1. it considers only the disagreements (b, c)
2. it does not consider the sample size n or the agreements (a, d)
When n, a and d are large and b and c are relatively small, a statistically significant result may have little practical significance.
Section 3 Fisher exact probabilities method in 2×2 table
Conditions: n < 40 or T < 1. Theoretical basis: the hypergeometric distribution; this method is not a χ² test.
Example 7-4: A doctor studied the preventive effect of hepatitis B immunoglobulin against intrauterine infection of the fetus. 33 HBsAg-positive patients were randomized into two groups, a prevention group and a non-prevention group; see Table 7-4. Do the fetal infection rates of the two groups differ?
Table 7-4 Comparison of the fetal HBV infection rates of the two groups
1. Basic idea: when the marginal totals of the fourfold table are fixed, we can calculate the probability of every possible combination of the four actual frequencies, and then make the inference from the cumulative probability and the test level α.
1. Calculate Pi. Number of combinations: smallest marginal total + 1. In Example 7-4 the number of combinations is 9 + 1 = 10.
The sum of all Pi is 1. Calculation formula:
$$P_i = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$
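The enumeration of all fourfold tables with fixed marginal totals, and the probability of each, can be sketched as follows. The probability is written with binomial coefficients, which is algebraically the same as the factorial form of the hypergeometric probability. The function names and the marginal totals (6, 4 and 3) are hypothetical and are not those of Table 7-4.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the fourfold table [[a, b], [c, d]]
    given its marginal totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def enumerate_tables(row1, row2, col1):
    """Yield every (a, b, c, d) compatible with the fixed marginal totals."""
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    for a in range(lowest, highest + 1):
        b = row1 - a
        c = col1 - a
        d = row2 - c
        yield a, b, c, d


# Hypothetical margins: row totals 6 and 4, first column total 3 (n = 10).
tables = list(enumerate_tables(6, 4, 3))
probabilities = [table_probability(*t) for t in tables]
print(len(tables))                    # smallest marginal total (3) + 1
print(round(sum(probabilities), 12))  # the Pi sum to 1
```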
2. Calculation of the cumulative probabilities. Let the cross-product difference of the observed fourfold table be a*d* − b*c* = D*, with probability P*. Di denotes the cross-product difference of each other possible fourfold table, with probability Pi.
(1) One-sided test
• If D* > 0 in the observed fourfold table, accumulate the probabilities of all tables satisfying Di ≥ D* and Pi ≤ P*.
• If D* < 0, accumulate the probabilities of all tables satisfying Di ≤ D* and Pi ≤ P*.
(2) Two-sided test: accumulate the probabilities of all possible fourfold tables satisfying |Di| ≥ |D*| and Pi ≤ P*. If a+b = c+d or a+c = b+d, the sequence of all combinations is symmetric, and the two-sided cumulative probability can be obtained simply as the one-sided cumulative probability × 2.
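The one-sided and two-sided accumulation rules can be sketched as one function. This is a minimal illustration under the assumption that the two-sided rule is |Di| ≥ |D*| together with Pi ≤ P*. The function name `fisher_cumulative`, the small tolerance used when comparing floating-point probabilities, and the example counts are all hypothetical.

```python
from math import comb


def fisher_cumulative(a, b, c, d, two_sided=True):
    """Cumulative exact probability for the observed table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of the table whose upper-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    d_star = a * d - b * c
    p_star = prob(a)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        xb, xc = row1 - x, col1 - x
        xd = row2 - xc
        d_i = x * xd - xb * xc
        p_i = prob(x)
        if p_i > p_star * (1 + 1e-9):   # keep only tables with Pi <= P*
            continue
        if two_sided:
            keep = abs(d_i) >= abs(d_star)
        elif d_star > 0:
            keep = d_i >= d_star
        else:
            keep = d_i <= d_star
        if keep:
            total += p_i
    return total


# Hypothetical observed table.
print(round(fisher_cumulative(3, 3, 0, 4), 4))
print(round(fisher_cumulative(3, 3, 0, 4, two_sided=False), 4))
```

The returned cumulative probability is then compared with the test level α in the usual way.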
Test procedure (in this example n = 33 < 40)
1. Calculate D* and P* of the observed fourfold table, as well as Di of all possible fourfold tables; see Table 7-5.
2. Calculate Pi for all fourfold tables satisfying |Di| ≥ |D*|.
3. Calculate the cumulative probability of the fourfold tables satisfying both |Di| ≥ |D*| and Pi ≤ P*. The cumulative probability is the sum of the Pi of the tables that meet both conditions. At the test level α = 0.05, H0 is not rejected: we cannot conclude that the HBV infection rate of infants whose mothers received the preventive injection differs from that of infants whose mothers did not.
Table 7-5 Calculation of the Fisher exact probabilities for Example 7-4
Example 7-5: A study examined P53 expression in gallbladder adenocarcinoma and gallbladder adenoma. P53 expression was detected by immunohistochemistry in 10 surgically resected specimens of each disease collected over the same period; the data are shown in Table 7-6. Is there a significant difference between the P53 positive rates of gallbladder adenocarcinoma and gallbladder adenoma?
Table 7-6 P53 positive expression rates in gallbladder adenocarcinoma and gallbladder adenoma