Analysis of Categorical Data
Dr Siti Azrin Binti Ab Hamid
Unit of Biostatistics and Research Methodology
Outline • Types of categorical analysis • Steps to analysis
Introduction • Categorical data analysis deals with discrete data that can be organized into categories. • The data are organized into a contingency table.
Contingency table
• A 2 × 2 table consists of two rows and two columns.
• The cells are labeled A through D.
• Extra rows and columns are added for labels.
• Rows: independent variable (exposure or risk factor)
• Columns: dependent variable (outcome)
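As a minimal sketch of how such a table is tallied, the snippet below counts (exposure, outcome) pairs into the four cells. The mapping of A to "exposed with outcome" and the sample records are assumptions made for illustration; they are not taken from the slides' data files.

```python
from collections import Counter

def two_by_two(records):
    """Tally (exposed, outcome) boolean pairs into cells a, b, c, d."""
    counts = Counter(records)
    a = counts[(True, True)]    # exposed, outcome present
    b = counts[(True, False)]   # exposed, outcome absent
    c = counts[(False, True)]   # unexposed, outcome present
    d = counts[(False, False)]  # unexposed, outcome absent
    return a, b, c, d

# Hypothetical records for illustration only.
records = (
    [(True, True)] * 3 + [(True, False)] * 2
    + [(False, True)] * 1 + [(False, False)] * 4
)
a, b, c, d = two_by_two(records)
print(a, b, c, d)
```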
Pearson Chi-square
• Tests the association between two categorical variables
• Independent samples
• Result of the test:
  - Not significant: no evidence of an association
  - Significant: an association
Research Question
• Is estrogen receptor associated with breast cancer status?
• Data: Breast cancer.sav
Step 1: State the hypothesis
• $H_0$: There is no association between estrogen receptor and breast cancer status.
• $H_A$: There is an association between estrogen receptor and breast cancer status.
Step 2: Set the significance level • α = 0.05
Step 3: Check the assumption
• The observations are independent (independent samples)
• Both variables are categorical
• Proportion of cells with an expected count < 5:
  - > 20% of cells: Fisher's exact test
  - < 20% of cells: Pearson Chi-square
• Expected count = (Row total × Column total) / Grand total
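The expected-count rule can be sketched in a few lines. The function names and both example tables are hypothetical, and the slide does not say what to do at exactly 20%, so this sketch assumes Pearson Chi-square in that case.

```python
def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def choose_test(table):
    """Apply the 20% rule to the share of cells with expected count < 5."""
    expected = [e for row in expected_counts(table) for e in row]
    share_small = sum(e < 5 for e in expected) / len(expected)
    # Assumption: exactly 20% is treated as acceptable for Pearson Chi-square.
    if share_small > 0.20:
        return "Fisher's exact test"
    return "Pearson Chi-square"

# Hypothetical tables for illustration only.
small = [[3, 1], [1, 3]]
large = [[20, 30], [25, 25]]
print(expected_counts(large), choose_test(large))
print(expected_counts(small), choose_test(small))
```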
Step 4: Statistical test
• Calculate the Chi-square value: $\chi^2 = \sum \frac{(O - E)^2}{E} = 5.897$
• Degrees of freedom: $df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1$
• From the Chi-square table, the p value lies between 0.01 and 0.02
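A rough sketch of the same calculation without a printed table: the statistic is summed over the cells, and for df = 1 the upper-tail probability of the Chi-square distribution equals `erfc(sqrt(x / 2))`. The function names and the example table are hypothetical; only the value 5.897 comes from the slide, and no continuity correction is applied.

```python
import math

def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an R x C table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

def p_value_df1(chi2):
    """Upper-tail p value of a Chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical table for illustration only.
chi2, df = pearson_chi_square([[20, 30], [25, 25]])
print(chi2, df, p_value_df1(chi2))

# The statistic reported on the slide.
print(p_value_df1(5.897))
```

For the slide's value of 5.897 the result falls between 0.01 and 0.02, which agrees with the table lookup.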
Step 4: Statistical test
[Figure: screenshots of the software procedure, with steps numbered 1 to 10]
Step 5: Interpretation
• p value = 0.016 < 0.05: reject $H_0$ in favor of $H_A$
Step 6: Conclusion
• There is a significant association between estrogen receptor and breast cancer status using the Pearson Chi-square test (p = 0.016).
Fisher’s Exact Test • To test the association between two categorical variables • Independent sample • Sample sizes are small
Research Question
• Is gender associated with coronary heart disease?
• Data: CHD data.sav
Step 1: State the hypothesis
• $H_0$: There is no association between gender and coronary heart disease.
• $H_A$: There is an association between gender and coronary heart disease.
Step 2: Set the significance level • α = 0.05
Step 3: Check the assumption
• The observations are independent (independent samples)
• Both variables are categorical
• Proportion of cells with an expected count < 5:
  - > 20% of cells: Fisher's exact test
  - < 20% of cells: Pearson Chi-square
• Expected count = (Row total × Column total) / Grand total
Step 3: Check the assumption
• 2 cells (50%) have an expected count < 5, so Fisher's exact test applies
Step 4: Statistical test
• Calculate the Chi-square value: $\chi^2 = \sum \frac{(O - E)^2}{E} = 3.0968$
• Degrees of freedom: $df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1$
• From the Chi-square table, the p value lies between 0.05 and 0.1
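When the expected counts are too small for the Chi-square approximation, the p value can be computed exactly from the hypergeometric distribution. The sketch below is one common two-sided convention (sum the probabilities of all tables no more likely than the observed one); the function name and the example table are hypothetical, since the slides do not show the CHD counts.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)

# Hypothetical table for illustration only.
print(fisher_exact_two_sided(3, 1, 1, 3))
```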
Step 4: Statistical test
[Figure: screenshots of the software procedure, with steps numbered 1 to 10]
Step 5: Interpretation
• p value = 0.140 > 0.05: fail to reject $H_0$
Step 6: Conclusion • There is no significant association between gender and coronary heart disease using Fisher’s Exact test (p = 0.140).
McNemar Test
• Categorical data
• Dependent samples:
  - Matched samples
  - Cross-over design
  - Before and after (same subject)
• Determines whether the row and column marginal frequencies are equal (marginal homogeneity)
Hypotheses
• The null hypothesis of marginal homogeneity states that the two marginal probabilities for each outcome are the same.
• $H_0: P_B = P_C$
• $H_A: P_B \neq P_C$
• A and D are concordant pairs; B and C are discordant pairs.
• A discordant pair is a pair with different outcomes.
Research Question
• Is type of mastectomy associated with 5-year survival proportion in patients with breast cancer?
• The sample consisted of breast cancer patients:
  - matched for age (same decade of age)
  - with the same clinical condition
• Data: breast ca.sav
Step 1: State the hypothesis
• $H_0$: There is no association between type of mastectomy and 5-year survival proportion in patients with breast cancer.
• $H_A$: There is an association between type of mastectomy and 5-year survival proportion in patients with breast cancer.
Step 2: Set the significance level • α = 0.05
Step 3: Check the assumption
• The two samples are dependent (matched pairs)
• Both variables are categorical
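Step 4: Statistical test
• $\chi^2 = \frac{(|b - c| - 1)^2}{b + c} = \frac{(|0 - 8| - 1)^2}{0 + 8} = 6.125$
• $df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1$
• Calculated $\chi^2$ > tabulated $\chi^2$ (3.84 for df = 1 at α = 0.05)
* An alternative form uses a correction of 0.5: $\chi^2 = \frac{(|b - c| - 0.5)^2}{b + c}$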
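The McNemar calculation uses only the discordant cells, so it is short to sketch. The function names are hypothetical; b = 0 and c = 8 are the values from the slide. The exact binomial p value is included as an assumption about how a p value might be obtained when b + c is small; the slides do not state which method produced theirs.

```python
from math import comb

def mcnemar_corrected(b, c):
    """Continuity-corrected McNemar statistic: (|b - c| - 1)^2 / (b + c)."""
    return (abs(b - c) - 1) ** 2 / (b + c)

def mcnemar_exact_p(b, c):
    """Two-sided exact p value: discordant pairs ~ Binomial(b + c, 0.5)."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Discordant counts from the slide.
print(mcnemar_corrected(0, 8))  # 6.125
print(mcnemar_exact_p(0, 8))    # 0.0078125
```

With these counts the exact p value rounds to 0.008.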
Step 4: Statistical test
[Figure: screenshots of the software procedure, with steps numbered 1 to 9]
Step 5: Interpretation
• p value = 0.008 < 0.05: reject $H_0$ in favor of $H_A$
Step 6: Conclusion • There is an association between type of mastectomy and 5-year survival proportion in patients with breast cancer using McNemar test (p = 0.008).
Cochran Mantel-Haenszel Test
• Compares the probability of an event among independent groups in stratified samples.
• The stratification factor can be study center, gender, race, age group, obesity status or disease severity.
• Gives a stratified statistical analysis of the relationship between exposure and disease, after controlling for a confounder (the strata variable).
• The data are arranged in a series of associated 2 × 2 contingency tables, one per stratum.
Research Question
• Is the type of treatment associated with response to treatment among migraine patients after controlling for gender?
• Confounder: gender
Step 2: Check the assumption • Random sampling • Stratified sampling
Step 3: State the hypothesis
• $H_0$: There is no association between type of treatment and response to treatment among female and male migraine patients.
• $H_A$: There is an association between type of treatment and response to treatment among female and male migraine patients.
Step 4: Statistical test
• Compute the expected frequency for each stratum: $e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i}$
• Compute the variance for each stratum: $v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)}$
• Compute the Mantel-Haenszel statistic: $\chi^2_{MH} = \frac{(\sum a_i - \sum e_i)^2}{\sum v_i}$
Step 4: Statistical test
• Compute the expected frequency for each stratum: $e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i}$
  - $e_1 = \frac{(16 + 11)(16 + 5)}{52} = 10.9038$
  - $e_2 = \frac{(12 + 16)(12 + 7)}{54} = 9.8519$
Step 4: Statistical test
• Compute the variance for each stratum: $v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)}$
  - $v_1 = \frac{(16 + 11)(5 + 20)(16 + 5)(11 + 20)}{52^2 (52 - 1)} = 3.1865$
  - $v_2 = \frac{(12 + 16)(7 + 19)(12 + 7)(16 + 19)}{54^2 (54 - 1)} = 3.1325$
Step 4: Statistical test
• Compute the Mantel-Haenszel statistic: $\chi^2_{MH} = \frac{(\sum a_i - \sum e_i)^2}{\sum v_i} = \frac{((16 + 12) - (10.9038 + 9.8519))^2}{3.1865 + 3.1325} = 8.3051 \approx 8.31$
Step 4: Statistical test
• Compute the odds ratio: $OR_{MH} = \frac{\sum (a_i d_i / n_i)}{\sum (b_i c_i / n_i)} = \frac{(16 \times 20 / 52) + (12 \times 19 / 54)}{(11 \times 5 / 52) + (16 \times 7 / 54)} = 3.313$
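The stratum-by-stratum arithmetic above can be reproduced with a short sketch. The cell counts are read off the worked formulas in the slides; the function name and the tuple layout (a, b, c, d) are illustrative choices, and the statistic follows the slides' formula without a continuity correction.

```python
def mantel_haenszel(strata):
    """Return (chi-square MH statistic, MH pooled odds ratio).

    Each stratum is a tuple (a, b, c, d) of 2 x 2 cell counts.
    """
    sum_a = sum_e = sum_v = 0.0
    or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_e += (a + b) * (a + c) / n                  # expected a_i
        sum_v += ((a + b) * (c + d) * (a + c) * (b + d)
                  / (n ** 2 * (n - 1)))                  # variance of a_i
        or_num += a * d / n
        or_den += b * c / n
    chi2_mh = (sum_a - sum_e) ** 2 / sum_v
    return chi2_mh, or_num / or_den

# Cell counts taken from the worked example in the slides.
strata = [(16, 11, 5, 20), (12, 16, 7, 19)]
chi2_mh, or_mh = mantel_haenszel(strata)
print(round(chi2_mh, 2), round(or_mh, 3))
```

Carrying full precision through the sums gives the same rounded results as the hand calculation, 8.31 and 3.313.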
Step 4: Statistical test
• Data: Migraine.sav
[Figure: screenshots of the software procedure, with steps numbered 1 to 6]
Step 5: Interpretation
• Mantel-Haenszel statistic: $\chi^2_{MH} = \frac{(\sum a_i - \sum e_i)^2}{\sum v_i} = \frac{((16 + 12) - (10.9038 + 9.8519))^2}{3.1865 + 3.1325} = 8.3051 \approx 8.31$
• Calculated value > tabulated value (3.84 for df = 1 at α = 0.05): reject $H_0$