Appropriate techniques of statistical analysis
Anil C Mathew, PhD
Professor of Biostatistics & General Secretary, ISMS
PSG Institute of Medical Sciences and Research, Coimbatore 641 004
Types of studies
• Case study
• Case series
• Cross sectional studies
• Case control study
• Cohort study
• Randomized controlled trials
• Screening test evaluation
Data analysis - Case series
Measures of averages
• Mean, Median, Mode
• Length of stay for 5 patients: 1, 3, 2, 4, 5 days
• Mean length of stay: 3 days
• Median length of stay: 3 days
• Mode length of stay: no mode
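The three averages for the length-of-stay example can be checked with a short Python sketch. The helper name `describe` is an assumption made for illustration, as is the convention of reporting "no mode" when every distinct value occurs equally often.

```python
from statistics import mean, median, multimode


def describe(values):
    """Return (mean, median, mode); mode is None when no value stands out."""
    modes = multimode(values)  # all values tied for the highest frequency
    # Assumed convention: if several distinct values all tie, report no mode.
    no_mode = len(modes) > 1 and len(modes) == len(set(values))
    return mean(values), median(values), None if no_mode else modes


stay_days = [1, 3, 2, 4, 5]  # length of stay for the 5 patients
avg, med, mode = describe(stay_days)
print(avg, med, mode)  # 3 3 None
```

Each value occurs exactly once here, so the sketch reports no mode, in line with the slide.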
Data analysis-case series • Frequency distribution
Design of Cohort Study
[Figure: from a population of people without the disease, an exposed group and a not exposed group are followed forward in time (the direction of inquiry) to see who develops the disease and who does not.]
Is obesity associated with adverse pregnancy outcomes? Women with a Body Mass Index > 30 delivering singletons. Ref: University of Udine, Italy, 2006
Design of Case Control Study
[Figure: people with the disease and people with no disease are each classified as exposed or not exposed.]
Analysis of Case-control study
Odds ratio = (a × d) / (b × c) = (80 × 70) / (30 × 20) = 9.3
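A minimal Python sketch of the odds ratio calculation follows. The 2×2 table itself is not reproduced in the text, so the cell layout in the comments (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls) is an assumption based on the usual convention; the function name `odds_ratio` is also illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d)/(b*c) of a 2x2 table."""
    # The parentheses matter: a * d / b * c evaluates left to right
    # and would give a different (wrong) number.
    return (a * d) / (b * c)


# Assumed layout: a = exposed cases, b = exposed controls,
#                 c = unexposed cases, d = unexposed controls
or_value = odds_ratio(a=80, b=30, c=20, d=70)
print(round(or_value, 1))  # 9.3
```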
Data Analysis - Screening Test Evaluation
Can plasma levels of BCPF (Breast Carcinoma Promoting Factor) be used to diagnose breast cancer?
Positive criterion: BCPF > 150 units, compared with breast biopsy (the gold standard)
• TP = 570
• FN = 30
• FP = 150
• TN = 850
• Sensitivity = P(T+ | D+) = 570/600 = 95%
• Specificity = P(T− | D−) = 850/1000 = 85%
• False negative rate = 1 − sensitivity
• False positive rate = 1 − specificity
• Prevalence = P(D+) = 600/1600 = 38%
• Positive predictive value = P(D+ | T+) = 570/720 = 79%
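The screening measures above can be reproduced from the four cell counts. The sketch below is illustrative only; the function name `screening_metrics` and the dictionary keys are assumptions, not part of the original slides.

```python
def screening_metrics(tp, fn, fp, tn):
    """Compute screening-test measures from the cells of a 2x2 table."""
    diseased = tp + fn      # D+: positive on the gold standard
    healthy = fp + tn       # D-: negative on the gold standard
    total = diseased + healthy
    sensitivity = tp / diseased
    specificity = tn / healthy
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1 - sensitivity,
        "false_positive_rate": 1 - specificity,
        "prevalence": diseased / total,
        "ppv": tp / (tp + fp),  # P(D+ | T+)
    }


metrics = screening_metrics(tp=570, fn=30, fp=150, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that the prevalence here (600/1600) is the prevalence in the study sample; the positive predictive value would differ in a population with a different prevalence.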
Tradeoffs between sensitivity and specificity
• When the consequences of missing a case are potentially grave, favour high sensitivity
• When a false positive diagnosis may lead to risky treatment, favour high specificity
Data analysis - Case series
Measures of variation
• Range
• Standard deviation
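As a brief illustration, the two measures of variation can be computed for the length-of-stay data used earlier (1, 3, 2, 4, 5 days). The slides do not say whether the sample or population standard deviation is intended; the sketch assumes the sample version (divisor n − 1).

```python
from statistics import stdev

stay_days = [1, 3, 2, 4, 5]  # length of stay for the 5 patients

value_range = max(stay_days) - min(stay_days)  # largest minus smallest
sample_sd = stdev(stay_days)                   # sample SD, divisor n - 1 (assumed)

print(value_range, round(sample_sd, 2))  # 4 1.58
```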
Data analysis- Analytical studies • Tests of significance
Case Study 1: Drug A and Drug B
• Aim: To compare the efficacy of two drugs in lowering serum cholesterol levels
• Method: Drug A, 50 patients; Drug B, 50 patients
• Result: At the end of 6 months, the average serum cholesterol level is lower in those receiving Drug B than in those receiving Drug A
A) Drug B is superior to Drug A in lowering cholesterol levels: Possible/Not possible
B) Drug B is not superior to Drug A; the difference may be due to chance: Possible/Not possible
C) The difference is not due to the drug; uncontrolled differences other than treatment between the samples of men receiving Drug A and Drug B account for it: Possible/Not possible
D) Drug A may have been selectively administered to patients whose serum cholesterol levels were more refractory to drug therapy: Possible/Not possible
An observed difference in a study can be due to
1) Random chance
2) Biased comparison
3) Uncontrolled confounding variables
Solutions: A and B
• Test of significance: p value
• P < 0.05 means that, if there were truly no difference, a difference at least as large as the one observed would arise by random chance less than 5% of the time
• P < 0.01 means that, if there were truly no difference, such a difference would arise by random chance less than 1% of the time
• The p value does not tell us the magnitude of the difference
Solutions: C and D
• Allocate patients to the treatments at random and compare the baseline characteristics of the groups
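“t” Test
Ho: There is no difference in mean birth weight of children from HSE and LSE in the population

Critical ratio (CR):
$$t = \frac{|\bar{X}_1 - \bar{X}_2|}{SD\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Pooled standard deviation:
$$SD = \sqrt{\frac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}}$$

$$SD = \sqrt{\frac{14 \times 0.27^2 + 9 \times 0.22^2}{23}} = 0.25$$

$$t = \frac{|2.91 - 2.26|}{0.25\,\sqrt{\dfrac{1}{15} + \dfrac{1}{10}}} = 6.36$$

DF = n1 + n2 − 2 = 23
Calculated t > table value, so reject Ho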
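The worked example can be checked with a small Python sketch using only the summary statistics on the slide (means 2.91 and 2.26, SDs 0.27 and 0.22, n = 15 and 10). The function name `pooled_t` is an assumption for illustration. The critical value 2.069 is the two-sided 5% value for 23 degrees of freedom from a standard t table; the slide does not state which significance level was used, so that choice is also an assumption.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled standard deviation."""
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    t = abs(mean1 - mean2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, pooled_sd, df


t, pooled_sd, df = pooled_t(2.91, 0.27, 15, 2.26, 0.22, 10)

T_CRITICAL = 2.069  # assumed: two-sided 5% level, 23 df, from a t table
reject_h0 = t > T_CRITICAL

print(round(pooled_sd, 2), df, reject_h0)  # 0.25 23 True
```

Carrying the unrounded pooled SD through the calculation gives a t value of about 6.3, marginally below the slide's 6.36, which uses the SD rounded to 0.25. The conclusion is unchanged.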
GENERAL STEPS IN HYPOTHESIS TESTING
1) State the hypothesis to be tested
2) Select a sample and collect data
3) Calculate the test statistic
4) Evaluate the evidence against the null hypothesis
5) State the conclusion
Commonly used statistical tests
• t test: compare two mean values
• Analysis of variance: compare more than two mean values
• Chi-square test: compare two proportions
• Correlation coefficient: relationship between two continuous variables
Example - Analysis of variance
• Serum zinc level in patients with simple febrile seizures, by duration of seizure
Example Chi-square test • Characteristics of patients in the two groups
Example Correlation
• We found a negative correlation between serum zinc level and simple febrile seizure event (r = −0.86, p < 0.001)
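For readers who want to see how a correlation coefficient is obtained, here is a minimal sketch of Pearson's r. The study data are not given in the slides, so the two lists below are hypothetical values invented purely for illustration; they are not the zinc data and will not reproduce r = −0.86. The function name `pearson_r` is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL illustrative values, not data from the study
x_values = [60, 65, 70, 80, 90]
y_values = [5, 4, 4, 2, 1]

r = pearson_r(x_values, y_values)
print(round(r, 2))  # negative: y tends to fall as x rises
```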
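Type 1 and Type 2 Errors
• Accept Ho when Ho is true: correct decision
• Accept Ho when Ho is false (H1 true): Type 2 error (probability β)
• Reject Ho when Ho is true: Type 1 error (probability α)
• Reject Ho when Ho is false (H1 true): correct decision; Power = 1 − β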
Multivariate problem
• Main outcome
• Continuous variable: linear regression
• Dichotomous variable: logistic regression
Bradford Hill's Questions
• Introduction: Why did you start?
• Methods: What did you do?
• Results: What did you find?
• Discussion: What does it mean?
How to begin writing?
• Data → Tables → Methods, Results → Introduction, Discussion → Abstract → Title, Key words, References