Preparing for data analysis - 2011 MBBS Honours Program
Jenny Zhang, Research Fellow
School of Medicine, The University of Queensland
Overview
• Analytical Plan
• Hypotheses
• Variables
• Data management
• Statistical methods
• Examples of database in Excel and SPSS
• Examples of code book and description of your variables
Hypotheses
Define the most appropriate hypotheses to fully answer the research question.
• What are hypotheses?
  - Hypotheses are statements in quantitative research in which the investigator makes a prediction or a conjecture about the outcome of a relationship among attributes or characteristics.
Examples:
  - There is a correlation between GPA and academic performance. (a correlation study)
  - Socioeconomic position will be related to the utilization of preventive health services. (a cross-sectional study)
  - Average iron status will differ between children whose families cook in iron pots and children whose families cook in aluminum pots. (an experimental study)
Variables
The attributes, characteristics, or dimensions being measured or studied.
• Dependent variables (DVs)
  The outcome variable in which an effect is measured.
• Independent variables (IVs)
  A variable that is presumed to affect or determine a dependent variable.
Example: You are interested in how stress affects heart rate in humans. Your independent variable is stress and your dependent variable is heart rate. You can directly manipulate stress levels in your human subjects and measure how those stress levels change heart rate.
Data management
• Data coding
  Code your questions (code book) and label your variables.
  - HSU Database.doc
• Data entry
  Data are entered into the dataset against each ID number.
  Excel database
  - Jennysur_original_dataanalysis.xls
  - analysis_1.xls
  SPSS database
  - TTDASS_example.sav
  - Jennysur_original_data analysis.sav
• Data verification
  Re-enter 10% of the sample to detect any entry mistakes.
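The verification step can be sketched in Python as a double-entry check. This is a minimal illustration and not part of the original slides: the record layout, the variable names (`age`, `sex`) and the helper names `pick_verification_ids` and `compare_entries` are all hypothetical.

```python
import random

def pick_verification_ids(ids, fraction=0.10, seed=0):
    """Pick a random subset of IDs (at least one) to be re-entered."""
    ids = sorted(ids)
    k = max(1, round(len(ids) * fraction))
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return sorted(rng.sample(ids, k))

def compare_entries(first, second):
    """Return (id, variable, first_value, second_value) for every mismatch."""
    mismatches = []
    for pid, re_entered in second.items():
        original = first[pid]
        for var, value in re_entered.items():
            if original.get(var) != value:
                mismatches.append((pid, var, original.get(var), value))
    return mismatches

# Hypothetical first-pass data entry, keyed by participant ID.
first_entry = {pid: {"age": 20 + pid, "sex": pid % 2 + 1} for pid in range(1, 11)}

# Hypothetical re-entry of one record; the age digits were transposed.
second_entry = {3: {"age": 32, "sex": 2}}

ids_to_check = pick_verification_ids(first_entry.keys())
problems = compare_entries(first_entry, second_entry)
for pid, var, old, new in problems:
    print(f"ID {pid}: {var} was entered as {old}, re-entered as {new}")
```

Each mismatch should be resolved against the paper questionnaire, since the check only shows that the two entries disagree and not which one is correct.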
Data management
• Data cleaning
  1. Run a frequency distribution of all variables to check for
     - Duplicated IDs
     - Invalid values for each variable against the coding protocol
     - Extreme values (determine plausibility and final inclusion or exclusion from analysis)
     TTDASS_example.sav
  2. Consistency checks
     - Cross-checks as determined by your research team, e.g. filter questions (for impossible combinations and inconsistent values and meanings); baseline data against follow-up data; term 1-5.
• Data storage
  Data are de-identified and kept confidential; questionnaires are stored safely in a locked filing cabinet.
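The cleaning checks in step 1 can be sketched in Python as follows. The code book, its codes (1 and 2 for sex), the plausible age range and the sample records are assumptions made for illustration only; they are not taken from the example files named in the slides.

```python
from collections import Counter

# Hypothetical code book: allowed codes or a plausible range per variable.
CODEBOOK = {
    "sex": {"valid": {1, 2}},      # assumed coding: 1 = male, 2 = female
    "age": {"range": (18, 100)},   # assumed plausible range in years
}

# Hypothetical records with deliberate errors.
records = [
    {"id": 1, "sex": 1, "age": 24},
    {"id": 2, "sex": 2, "age": 31},
    {"id": 2, "sex": 2, "age": 31},   # duplicated ID
    {"id": 3, "sex": 3, "age": 27},   # code not in the code book
    {"id": 4, "sex": 1, "age": 240},  # implausible extreme value
]

def frequency(records, var):
    """Frequency distribution of one variable."""
    return Counter(r[var] for r in records)

def duplicated_ids(records):
    """IDs that appear more than once."""
    return sorted(i for i, n in frequency(records, "id").items() if n > 1)

def invalid_values(records, codebook):
    """(id, variable, value) for each value that breaks the code book."""
    problems = []
    for r in records:
        for var, rule in codebook.items():
            value = r[var]
            if "valid" in rule and value not in rule["valid"]:
                problems.append((r["id"], var, value))
            if "range" in rule:
                low, high = rule["range"]
                if not low <= value <= high:
                    problems.append((r["id"], var, value))
    return problems

print(frequency(records, "sex"))
print(duplicated_ids(records))
print(invalid_values(records, CODEBOOK))
```

Flagged extreme values still need a judgment about plausibility before they are included in or excluded from the analysis; the code only lists the candidates.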
Statistical Methods
Choose the best way to analyse the data for each hypothesis, and give the rationale for each choice of analysis.
• Descriptive statistics
  - Describe the characteristics of participants with respect to all available variables, e.g. age, sex, country of birth.
  - Report descriptive statistics for all variables
• Bivariate analysis (between two variables)
  - Comparisons between 2 or more groups
  - The relationship between the outcome variable and each independent variable
• Multivariable analysis (1 outcome variable + 2 or more independent variables)
  - General linear model
  - Logistic regression model
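As a small illustration of the descriptive and bivariate steps, the Python sketch below summarises two variables and computes a Pearson correlation between them. The stress scores and heart rates are invented for the example, and the helper names `describe` and `pearson_r` are hypothetical. In practice these analyses would be run in a statistical package such as SPSS.

```python
from math import sqrt
from statistics import mean, stdev

def describe(values):
    """Basic descriptive statistics for one continuous variable."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented data: stress score (independent variable) and
# heart rate in beats per minute (dependent variable).
stress = [1, 2, 3, 4, 5]
heart_rate = [62, 66, 71, 75, 80]

print(describe(stress))
print(describe(heart_rate))
print(round(pearson_r(stress, heart_rate), 3))
```

A correlation like this describes the relationship between one outcome and one independent variable; adjusting for further independent variables requires a multivariable model.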