National Naval Medical Center, Directorate for Professional Education, Clinical Investigation Department. RESEARCH COURSE: Statistical Data Analysis and Scientific Research. Dr. Francois O. Tuamokumo, Mathematical Statistician, Darnall Library, NNMC, Bethesda, MD
“The National Naval Medical Center is an approved provider of continuing nursing education by the Navy Medicine Manpower, Personnel, Training and Education Command, an accredited approver by the American Nurses Credentialing Center’s Commission on Accreditation.”
Disclosure Statement: This CE/CME activity does not have commercial support and has no conflicts of interest.
DATA MANAGEMENT: INTRODUCTION
By design of a study we mean planning the study in such a way that appropriate data can be collected and analyzed.
• DATA are measurements collected on some characteristics called variables.
• VARIABLES are the characteristics on which measurements are made.
TYPES OF DATA
• Qualitative data are categories. Examples: gender (male, female); stage of cancer (I, II, III, IV).
• Quantitative data are numbers. Examples: age, height, weight, number of first trimester visits.
Conclusion: Methods of analysis depend on the data.
• Data Management & Quality Assurance: Access, Excel, Minitab, BMDP, SPSS, STATA, SAS
• Identifying & resolving outliers
• Identifying & resolving missing data
• Identifying duplicate records
• Data Dictionary

Variable                                                                              | Abbreviation
Identification code                                                                   | ID
Low birth weight (0 = birth weight ≥ 2500 g, 1 = birth weight < 2500 g)               | LoBrtWt
Age of mother in years                                                                | Age
Weight in pounds at last menstrual period                                             | wtLMst
Race (1 = White, 2 = Black, 3 = Hispanic, 4 = other)                                  | Race
Smoking status during pregnancy (1 = yes, 0 = no)                                     | Smoke
History of premature labor (0 = none, 1 = one, 2 = two, etc.)                         | Prmtrlbo
History of hypertension (1 = yes, 0 = no)                                             | Hptnsion
Presence of uterine irritability (1 = yes, 0 = no)                                    | Utrnirrt
Number of physician visits during first trimester (0 = none, 1 = one, 2 = two, etc.)  | visits
Birth weight in grams                                                                 | brtwt
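The three quality-assurance tasks named above (outliers, missing data, duplicate records) can be sketched in a few lines of Python. This is only an illustration: the records are invented, the plausible ranges are assumptions chosen for the example, and the function name `check_records` is hypothetical. Only the variable abbreviations and the 0/1 smoking codes come from the data dictionary.

```python
# Invented records keyed by the data-dictionary abbreviations (illustration only).
records = [
    {"ID": 1, "Age": 19, "wtLMst": 182, "Smoke": 0, "brtwt": 2523},
    {"ID": 2, "Age": 33, "wtLMst": 155, "Smoke": 0, "brtwt": 2551},
    {"ID": 2, "Age": 33, "wtLMst": 155, "Smoke": 0, "brtwt": 2551},    # repeated ID
    {"ID": 3, "Age": 20, "wtLMst": None, "Smoke": 1, "brtwt": 2557},   # missing value
    {"ID": 4, "Age": 21, "wtLMst": 108, "Smoke": 3, "brtwt": 25940},   # bad code, implausible weight
]

# Codes allowed by the data dictionary (Smoke: 1 = yes, 0 = no).
VALID_CODES = {"Smoke": {0, 1}}
# Plausible ranges are assumptions for this sketch, not values from the course.
PLAUSIBLE_RANGE = {"Age": (10, 60), "wtLMst": (70, 400), "brtwt": (300, 6500)}


def check_records(rows):
    """Return (ID, variable, problem) tuples for every issue found."""
    issues = []
    seen_ids = set()
    for row in rows:
        rid = row["ID"]
        if rid in seen_ids:
            issues.append((rid, "ID", "duplicate record"))
        seen_ids.add(rid)
        for name, value in row.items():
            if value is None:
                issues.append((rid, name, "missing value"))
            elif name in VALID_CODES and value not in VALID_CODES[name]:
                issues.append((rid, name, "invalid code"))
            elif name in PLAUSIBLE_RANGE:
                low, high = PLAUSIBLE_RANGE[name]
                if not low <= value <= high:
                    issues.append((rid, name, "out of range"))
    return issues


issues = check_records(records)
for issue in issues:
    print(issue)
```

Flagged values are only candidates for review; whether an outlier is corrected, kept, or excluded remains a judgment made against the source documents.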
Some Considerations in Research • What are the variables of interest on which data will be collected? • What are the testable research questions of interest? • Are these questions clearly, concisely, and completely stated?
What is the purpose of the study?
a. Descriptive
b. Hypothesis testing, or
c. Modeling
• Descriptive study: To estimate a population parameter.
• Ex: mean arterial blood pressure; proportion (percent) with improved respiratory outcome.
Provide a 95% confidence interval for the estimates.
• Confidence Interval: A range computed from the sample that is expected, with the stated level of confidence (here 95%), to contain the true population value.
Confidence Interval for:
1. population mean
2. population proportion
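As a rough illustration of the two intervals listed above, the sketch below uses the large-sample normal approximation (estimate ± z × standard error). The function names and the sample values are hypothetical; for small samples a t quantile would normally replace z in the interval for the mean.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def ci_mean(values, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(values)
    std_error = stdev(values) / sqrt(len(values))
    return centre - z * std_error, centre + z * std_error


def ci_proportion(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    std_error = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * std_error, p_hat + z * std_error


# Invented mean arterial pressure readings (mmHg), for illustration only.
map_readings = [92, 88, 95, 101, 90, 97, 93, 89, 96, 94]
mean_low, mean_high = ci_mean(map_readings)

# Invented counts: 40 of 100 patients with improved respiratory outcome.
prop_low, prop_high = ci_proportion(40, 100)

print(f"mean: ({mean_low:.1f}, {mean_high:.1f})")
print(f"proportion: ({prop_low:.3f}, {prop_high:.3f})")
```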
How large a sample do I need?
Answer: Depends on the type of study
A. Estimation
B. Testing Hypothesis

A. Estimation of population mean, μ, and population proportion, p
Example • A hospital administrator wishes to estimate the mean weight of babies born in her hospital. How large a sample of birth records should be taken if she wants to be 95% confident that the sample mean weight will be within 0.50 pound of the true mean weight of all babies born in her hospital?
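Assume that a reasonable estimate of σ is 1 pound. Using the formula $n = (z_{\alpha/2}\,\sigma / E)^2$ with $z_{\alpha/2} = 1.96$, $\sigma = 1$ and margin of error $E = 0.50$, we get $n = (1.96 \times 1 / 0.50)^2 \approx 15.4$, so a sample of 16 birth records is needed.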
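The estimation case can be sketched in Python as follows, assuming the usual normal-approximation formulas n = (zσ/E)² for a mean and n = z²p(1−p)/E² for a proportion. The function names are hypothetical, and the proportion example (p = 0.5, margin 0.05) is an invented illustration rather than a case from the course.

```python
from math import ceil
from statistics import NormalDist


def n_for_mean(sigma, margin, confidence=0.95):
    """Sample size so the sample mean is within `margin` of the true mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * sigma / margin) ** 2)


def n_for_proportion(p, margin, confidence=0.95):
    """Sample size so the sample proportion is within `margin` of the true proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hospital example: sigma of about 1 pound, margin of 0.50 pound, 95% confidence.
n_babies = n_for_mean(sigma=1.0, margin=0.50)

# Invented illustration: worst-case proportion 0.5 estimated to within 0.05.
n_prop = n_for_proportion(p=0.5, margin=0.05)

print(n_babies, n_prop)
```

The result is always rounded up, since a fractional record cannot be sampled and rounding down would fall short of the requested precision.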
HYPOTHESIS TESTING AND SAMPLE SIZE
A HYPOTHESIS IS AN ASSERTION ABOUT A POPULATION PARAMETER, SUCH AS THE POPULATION MEAN OR THE POPULATION PROPORTION.
TWO TYPES OF HYPOTHESES: NULL AND ALTERNATIVE. THE RESEARCHER WISHES TO DISCREDIT THE NULL STATEMENT.
ERRORS IN HYPOTHESIS TESTING
• TYPE I ERROR: REJECTING THE NULL HYPOTHESIS WHEN IT IS TRUE
• TYPE II ERROR: FAILING TO REJECT ("ACCEPTING") THE NULL HYPOTHESIS WHEN IT IS FALSE
P-value
• It is the smallest significance level at which the null hypothesis would be rejected for the observed data.
• Compare it to the level of significance, α (normally .05): reject the null hypothesis when the p-value is at most α.
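A small sketch of the comparison between a p-value and α, assuming a test statistic that is approximately standard normal under the null hypothesis. The statistic values 2.10 and 1.00 are invented, and the function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


p_strong = two_sided_p_value(2.10)   # invented test statistic
p_weak = two_sided_p_value(1.00)     # invented test statistic

print(round(p_strong, 4), decide(p_strong))
print(round(p_weak, 4), decide(p_weak))
```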
II. SAMPLE SIZE FOR COMPARISON OF TWO GROUPS: • DATA TYPE: A. NUMERICAL DEPENDENT VARIABLE
PROBLEM: THE RESEARCH QUESTION IS WHETHER THERE IS A DIFFERENCE IN THE EFFICACY OF SALBUTAMOL AND IPRATROPIUM BROMIDE FOR THE TREATMENT OF ASTHMA.
• DESIGN: RANDOMIZED TRIAL TO DETERMINE THE EFFECT OF THESE DRUGS ON FEV1 (FORCED EXPIRATORY VOLUME IN 1 SECOND) AFTER 1 WEEK OF TREATMENT.
• ANALYSIS: DIFFERENCES IN MEANS
• TEST: t-TEST
• SPECIFICATIONS:
1. NULL AND ALTERNATIVE HYPOTHESES:
NULL: MEAN FEV1 AFTER ONE WEEK OF TREATMENT IS THE SAME IN ASTHMATIC PATIENTS TREATED WITH SALBUTAMOL AS IN THOSE TREATED WITH IPRATROPIUM BROMIDE.
ALTERNATIVE (2-SIDED): MEAN FEV1 AFTER ONE WEEK OF TREATMENT DIFFERS BETWEEN THE TWO TREATMENT GROUPS.
2. MEAN FEV1 = 2 LITERS, STD. DEVIATION = 1 LITER FOR IPRATROPIUM (FROM THE LITERATURE)
3. EFFECT SIZE = 0.2 LITERS (10% × 2 LITERS)
STANDARDIZED EFFECT SIZE = EFFECT SIZE / STD. DEV. = 0.2 / 1 = 0.2 (UNITLESS)
4. LEVEL OF SIGNIFICANCE = .05, POWER = .80
THUS SAMPLE SIZE PER GROUP = 393
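The per-group figure of 393 is what the common normal-approximation formula n = 2(z₁₋α/₂ + z₁₋β)² / d² gives, where d is the standardized effect size. The slides do not say which formula or software was used, so treat this Python sketch as an assumption; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(effect, sd, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means
    (normal approximation to the two-sample t-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    d = effect / sd  # standardized effect size (unitless)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


# FEV1 example: effect size 0.2 liters, standard deviation 1 liter.
n_fev1 = n_per_group_means(effect=0.2, sd=1.0)
print(n_fev1)
```

Because n scales with 1/d², halving the standardized effect size roughly quadruples the required sample. Tables based on the t distribution may give a slightly larger figure.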
WHEN THERE IS HIGH BETWEEN-SUBJECT VARIABILITY AMONG OBSERVATIONS:
• DESIGN: RANDOMIZED TRIAL
• ANALYSIS: PRE-POST CHANGES
• TEST: t-TEST
1. HYPOTHESES
H0: THE MEAN CHANGES IN FEV1 ARE THE SAME IN THE TWO GROUPS
HA: THE MEAN CHANGES IN FEV1 ARE DIFFERENT
2. STANDARD DEVIATION OF THE CHANGE = 0.25 LITERS (FROM PILOT)
3. EFFECT SIZE = 0.2 LITERS
STANDARDIZED EFFECT SIZE = 0.2 / 0.25 = .80
4. LEVEL OF SIGNIFICANCE = .05, POWER = .80
FROM FORMULA, n = 25 PER GROUP
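One way to sanity-check n = 25 per group is to compute the approximate power directly. The sketch below assumes the normal approximation power ≈ Φ(d·√(n/2) − z₁₋α/₂), which ignores the negligible rejection probability in the opposite tail; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(n_per_group, effect, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    d = effect / sd  # standardized effect size
    return NormalDist().cdf(d * sqrt(n_per_group / 2) - z_alpha)


# Pre-post design: effect 0.2 liters, SD of the change 0.25 liters.
power_25 = power_two_sample(25, effect=0.2, sd=0.25)
power_24 = power_two_sample(24, effect=0.2, sd=0.25)

print(round(power_25, 3), round(power_24, 3))
```

Under this approximation 25 per group is the smallest size that reaches 80% power. The much smaller requirement compared with the first design comes entirely from the smaller standard deviation of the change scores.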
DATA TYPE: B. CATEGORICAL (BINARY) DEPENDENT VARIABLE
• EXAMPLE: PROPORTION OF MEN WHO DEVELOP CORONARY HEART DISEASE (CHD) WHILE TREATED WITH ASPIRIN COMPARED WITH THE PROPORTION WHO DEVELOP CHD WHILE TAKING A PLACEBO
SPECIFICATION REQUIREMENTS:
1. EFFECT SIZE IS SPECIFIED BY SPECIFYING P1 AND P2.
TYPE OF STUDIES: A. COHORT STUDIES: P1 AND P2 ARE THE PROPORTIONS OF SUBJECTS EXPECTED TO HAVE THE OUTCOME IN THE TWO GROUPS.
2. STATE THE NULL AND ALTERNATIVE HYPOTHESES.
3. SET ALPHA AND BETA.
PROBLEM: THE RESEARCH QUESTION IS WHETHER ELDERLY SMOKERS HAVE GREATER INCIDENCE OF SKIN CANCER THAN NONSMOKERS - COHORT
EXAMPLE: HOW MANY SMOKERS AND NONSMOKERS WILL NEED TO BE STUDIED TO DETERMINE WHETHER THE 5-YEAR SKIN CANCER INCIDENCE IS AT LEAST 30% IN SMOKERS?
1. H0: THE INCIDENCE IS THE SAME
HA: THE INCIDENCE IS DIFFERENT
2. 5-YEAR INCIDENCE OF SKIN CANCER IS ABOUT 20% IN NONSMOKERS (LITERATURE REVIEW).
3. ALPHA = 0.05 AND POWER = 0.80
n = 313, FOR A TWO-SIDED HA
n = 250, FOR A ONE-SIDED HA
** ABOVE PROBLEM MAY BE STATED IN FORM OF RELATIVE RISK.
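The slides do not state the formula behind these numbers. A plausible candidate, assumed here, is the normal-approximation formula for comparing two proportions with the Fleiss continuity correction, which gives values close to 313 and 250 for P1 = 0.30 and P2 = 0.20. The function name is hypothetical, and the function returns the unrounded value so that the rounding choice stays visible.

```python
from math import sqrt
from statistics import NormalDist


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Unrounded per-group sample size for comparing two proportions
    (normal approximation with Fleiss continuity correction)."""
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    # Uncorrected sample size.
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    # Continuity correction.
    return n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2


# Skin cancer example: 30% incidence in smokers versus 20% in nonsmokers.
n_two_sided = n_per_group_proportions(0.30, 0.20)
n_one_sided = n_per_group_proportions(0.30, 0.20, two_sided=False)

print(round(n_two_sided, 1), round(n_one_sided, 1))
```

Under these assumptions the two-sided value rounds to 313 and the one-sided value lies just above 250. Both are per-group figures in this formula.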
EXAMPLE: AN INVESTIGATOR IS INTERESTED IN WHETHER WOMEN WHO USE ORAL CONTRACEPTIVES ARE AT A MUCH HIGHER RISK OF HAVING A MYOCARDIAL INFARCTION WHEN COMPARED TO NON-USERS (PROSPECTIVE)
B. CASE-CONTROL STUDY: • SPECIFICATION REQUIREMENTS: • 1. THE ODDS RATIO TO BE DETECTED IN THE CASE GROUP • 2. P2: THE PROPORTION OF CONTROLS EXPOSED TO THE PREDICTOR VARIABLE
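$P_1 = \dfrac{OR \times P_2}{(1 - P_2) + OR \times P_2}$, WHERE P1 IS THE PROPORTION OF CASES EXPOSED TO THE PREDICTOR VARIABLE, P2 IS THE PROPORTION OF CONTROLS EXPOSED, AND OR IS THE ODDS RATIO TO BE DETECTED.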
EXAMPLE: THE INVESTIGATOR
1. EXPECTS THAT 10% OF CONTROLS WILL BE EXPOSED TO ORAL CONTRACEPTIVES (P2 = 0.10)
2. WISHES TO DETECT AN ODDS RATIO OF 3 ASSOCIATED WITH THE EXPOSURE
FROM FORMULA, P1 = 0.25
HENCE, FOR A TWO-SIDED HYPOTHESIS, n = 112 PER GROUP
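The step from the odds ratio to P1 can be checked with a short Python sketch. It uses the standard identity P1 = OR·P2 / ((1 − P2) + OR·P2), which reproduces the value P1 = 0.25 in the example; the function names are hypothetical.

```python
def p1_from_odds_ratio(odds_ratio, p2):
    """Proportion of cases exposed, given the odds ratio to detect and
    the proportion of controls exposed (p2)."""
    return odds_ratio * p2 / ((1 - p2) + odds_ratio * p2)


def odds_ratio_from_proportions(p1, p2):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


# Oral contraceptive example: 10% of controls exposed, odds ratio of 3.
p1 = p1_from_odds_ratio(odds_ratio=3, p2=0.10)
check = odds_ratio_from_proportions(p1, 0.10)

print(round(p1, 4), round(check, 4))
```

Once P1 and P2 are known, the case-control sample size follows from the same two-proportion calculation used for cohort studies.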
Questions?
Thank You! My contact information: Dr. Francois O. Tuamokumo, Phone: (301) 319 8788, Email: francois.tuamokumo@med.navy.mil