
Educational Research 101: How to Manage Your Data and Prepare for the Statistical Consultation

Francis S. Nuthalapaty, MD; H. Lee Higdon III, PhD. 2009 APGO Faculty Development Seminar.


Presentation Transcript


  1. Educational Research 101: How to Manage Your Data and Prepare for the Statistical Consultation. Francis S. Nuthalapaty, MD; H. Lee Higdon III, PhD. 2009 APGO Faculty Development Seminar

  2. Case Study: The wrong way

  3. Case Study: The wrong way • Statistician was consulted after the data had been collected. • Study question was not clearly defined. • Variables were not defined. • Data dictionary was not developed. • Data were not cleaned/validated. • Result: a statistician who is asked to perform a miracle!

  4. Case Study: Lesson • Arrangements to consult with a statistician should be made before you start enrolling patients and collecting data! In fact, they should be made before protocol development to prevent issues downstream.

  5. Learning Objectives • Describe the continuum of data management • List data collection instruments / approaches • Understand how to create a data dictionary • Describe methods to validate data • Describe various data analytic tools • Describe how to decide on statistical approaches

  6. Question Where does data management fit into the research process?

  7. The Research Process • Question • Literature search • Objective / Hypothesis • Study design • IRB • Study conduct • Data analysis • Dissemination of results

  8. Data Management Pearl • “No study is better than the quality of its data” - Friedman • “…get it right the first time” - Crerand

  9. Steps in Data Management • Definition • Acquisition • Data Entry • Validation • Analysis

  10. Data Definitions • Identifying your data • Identifying your data types • Naming your data variables • Creating a data dictionary

  11. Data Types • Types of Variables: • Qualitative: Nominal, Ordinal • Quantitative: Interval, Ratio

  12. Data Definition Exercise

  13. Data Variable Names • Make the name descriptive (easier to remember) • Keep it short (fewer than 10 characters) • Use lower case • Avoid spaces – use an underscore instead • Use numbers to indicate sequences
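
The naming rules on this slide can be checked mechanically before a database is built. The following is a minimal Python sketch and is not part of the original presentation; the helper names `is_valid_variable_name` and `suggest_variable_name`, and the exact pattern used, are assumptions chosen for illustration.

```python
import re

# Hypothetical helpers (not from the presentation). They encode the slide's
# rules: short, lower case, no spaces, underscores allowed, digits allowed
# to mark sequences.
MAX_LENGTH = 9  # the slide suggests names under 10 characters

_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_variable_name(name: str) -> bool:
    """Return True if the name follows the naming rules on the slide."""
    if len(name) > MAX_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


def suggest_variable_name(label: str) -> str:
    """Turn a free-text label into a candidate name (lower case, underscores)."""
    cleaned = re.sub(r"[^a-z0-9]+", "_", label.strip().lower()).strip("_")
    return cleaned[:MAX_LENGTH].rstrip("_")


print(is_valid_variable_name("sbp_1"))        # short, lower case, sequence number
print(is_valid_variable_name("Systolic BP"))  # upper case and a space
print(suggest_variable_name("Visit 1"))
```

A suggested name should still be passed through the checker, since a label that starts with a digit would produce a candidate that fails the pattern.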

  14. Data Variable Formats • Variable formats: • Numeric • String

  15. Data Variable Values • Possible responses for a variable • Numeric format: • 0 = no / 1 = yes • String format: • a = no / b = yes

  16. Data Variable Values

  17. Note on Missing Values • What about variables with no response? • Leave it blank • Assign a period “.” • Assign a value (usually out of the expected response range) • Avoid text
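
To make the missing-value options concrete, here is a small Python sketch (not from the original slides) that reads a yes/no variable coded 0 = no / 1 = yes. The out-of-range code 99 and the function name `parse_yes_no` are assumptions for illustration; a real study would fix its own missing-value code in the data dictionary.

```python
from typing import Optional

# Assumed conventions for illustration only: 0 = no, 1 = yes, and a blank
# cell, a period, or the out-of-range code 99 all mean "missing".
MISSING_MARKERS = {"", ".", "99"}
VALID_CODES = {"0": 0, "1": 1}


def parse_yes_no(raw: str) -> Optional[int]:
    """Convert a raw cell to 0/1, or None when the value is missing."""
    cell = raw.strip()
    if cell in MISSING_MARKERS:
        return None
    if cell in VALID_CODES:
        return VALID_CODES[cell]
    # Free text such as "unknown" or "N/A" is rejected rather than guessed at.
    raise ValueError(f"unexpected value for a yes/no variable: {raw!r}")


raw_column = ["1", "0", "", ".", "99", "1"]
parsed = [parse_yes_no(cell) for cell in raw_column]
n_missing = sum(value is None for value in parsed)
print(parsed, n_missing)
```

Rejecting free text outright mirrors the slide's advice to avoid text as a missing-value marker: any unexpected entry is surfaced instead of being silently treated as missing.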

  18. Data Naming Exercise

  19. Data Dictionaries / Code Books • Brings together all data elements: • Data types / formats • Variable names • Expected response values (range) • Comments • Self-generated vs. computer generated • “Rosetta Stone” for the database
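
One way to picture a data dictionary is as a lookup table keyed by variable name. The Python sketch below is illustrative only: the variables (`age`, `smoker`, `pgy_level`), their ranges, and the helper `is_expected` are invented for this example and do not come from the presentation.

```python
# Hypothetical data dictionary; every entry below is invented for illustration.
DATA_DICTIONARY = {
    "age": {
        "data_type": "ratio",
        "format": "numeric",
        "expected": {"min": 18, "max": 90},
        "comment": "Age in years at enrollment",
    },
    "smoker": {
        "data_type": "nominal",
        "format": "numeric",
        "expected": {"codes": {0: "no", 1: "yes"}},
        "comment": "Self-reported current smoking",
    },
    "pgy_level": {
        "data_type": "ordinal",
        "format": "numeric",
        "expected": {"codes": {1: "PGY-1", 2: "PGY-2", 3: "PGY-3", 4: "PGY-4"}},
        "comment": "Postgraduate training year",
    },
}


def is_expected(variable: str, value) -> bool:
    """Check one value against the expected responses in the dictionary."""
    expected = DATA_DICTIONARY[variable]["expected"]
    if "codes" in expected:
        return value in expected["codes"]
    return expected["min"] <= value <= expected["max"]


print(is_expected("age", 34), is_expected("age", 340), is_expected("smoker", 2))
```

Keeping the expected values next to the variable name, type, and comment means the same table can drive both documentation and validation.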

  20. Data Dictionary Exercise

  21. Data Acquisition • Pick the best method for the environment

  22. Data Acquisition Methods • Interviews • Questionnaires • Assessments • MCQ examinations • OSCE / OSAT • Laboratory studies

  23. Data Acquisition Environments • Observational encounters • Structured research encounters • Self-report

  24. Data Acquisition Problems • Major types of data issues: • Missing data • Incorrect data • Excess variability

  25. Data Acquisition Problems • Reasons for poor data quality: • Researcher-dependent data: • Insufficient time • Inadequate training • Lack of focus on study tasks • Poor communication • Protocol deviation

  26. Data Acquisition Problems • Reasons for poor data quality: • Subject-dependent data: • Inadequate instruction • Poor comprehension • Sensitive or stigmatized behaviors

  27. Data Acquisition Options • Paper forms • Direct entry • Computer assisted data acquisition

  28. Data Acquisition: Paper Forms • Advantages: • Controlled distribution and return • Comments • Double data entry • Disadvantages: • Anonymity • Manual quality checks • Data entry time / errors

  29. Data Acquisition: Direct Entry • Options: • MS Excel, MS Access • Epi Info – free on the web • Direct entry into statistical software • Pros / Cons: • No data transcription • Errors

  30. Data Acquisition • Computer assisted data acquisition: • Automated data collection • OCR forms • Computer-based case report forms / questionnaires • Computer-assisted self-interviews • Mobile computing device diaries

  31. Data Acquisition: CASI • Special Focus: Health Behaviors • Factors which may affect reporting: • Sensitive or stigmatized behaviors • Age discrepancy between participant and interviewer • Lack of privacy • Lack of comprehension of self-administered questionnaires

  32. Data Acquisition: CASI • Computer-assisted self-interview (CASI): • Computer-based interview • Can incorporate audio, video, and text • Respondent listens to or reads questions on screen • Submits answers through keypad or touch screen

  33. Data Acquisition: CASI • Benefits of CASI: • Interview conducted in privacy • Standardized interview • Computer controlled branching • Automated consistency and range checking • Multilingual administration
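
Computer-controlled branching and automated range checking can be sketched in a few lines. The example below is a simplified Python illustration, not a description of any actual CASI product: the question ids, wording, limits, and the function `run_interview` are all hypothetical, and answers are passed in as a dictionary so the logic can be run without a screen or keypad.

```python
# Hypothetical two-question script; ids, wording, and limits are invented.
QUESTIONS = {
    "smoke": {
        "text": "Do you currently smoke? (0 = no, 1 = yes)",
        "min": 0,
        "max": 1,
        # Branching rule: only smokers are asked the follow-up question.
        "next": lambda answer: "cigs_day" if answer == 1 else None,
    },
    "cigs_day": {
        "text": "How many cigarettes per day?",
        "min": 1,
        "max": 100,
        "next": lambda answer: None,
    },
}


def run_interview(answers_by_question, start="smoke"):
    """Walk the branching script, range-checking each supplied answer."""
    recorded = {}
    current = start
    while current is not None:
        question = QUESTIONS[current]
        answer = answers_by_question[current]
        if not question["min"] <= answer <= question["max"]:
            raise ValueError(f"{current}: {answer} is outside the allowed range")
        recorded[current] = answer
        current = question["next"](answer)
    return recorded


print(run_interview({"smoke": 0, "cigs_day": 20}))  # follow-up is skipped
print(run_interview({"smoke": 1, "cigs_day": 20}))  # follow-up is asked
```

Because the script, not the respondent, decides which question comes next, a non-smoker can never end up with a cigarettes-per-day value, which is the kind of consistency the slide refers to.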

  34. Steps in Data Management • Definition • Acquisition • Data Entry • Validation • Analysis

  35. Data Validation • Is all of the data present? • Are the responses within the expected range? • Does the data make sense?

  36. Data Validation • Is all of the data present? • Visually examine the data cells • Frequencies
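
A frequency table is a quick way to see whether all of the data are present. The following Python sketch uses invented responses to a yes/no item; the data and the choice to mark blank cells with an empty string are assumptions for illustration.

```python
from collections import Counter

# Invented responses to a yes/no item (0 = no, 1 = yes). "" marks a blank
# cell, and 3 is an entry error that a frequency table makes easy to spot.
responses = [1, 0, 1, "", 0, 1, 3, "", 1]

frequencies = Counter(responses)
for value, count in sorted(frequencies.items(), key=lambda item: str(item[0])):
    label = "(blank)" if value == "" else str(value)
    print(f"{label:>8}  {count}")
```

Any row in the table other than the expected codes, including the blank row, points to cells that need to be checked against the source documents.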

  37. Data Validation • Are the responses within the expected range? • Frequencies • Maximum / minimum values • Descriptive statistics • Means • Standard deviations
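
The range checks listed above can be run with Python's standard library alone. This is a minimal sketch on invented data: the ages, the plausible range of 15 to 55, and the use of `None` for missing cells are all assumptions made for the example.

```python
import statistics

# Invented example data: ages in years, with None marking a missing cell
# and 250 standing in for a data-entry error.
ages = [24, 31, None, 28, 250, 35, 29, None, 33]
EXPECTED_MIN, EXPECTED_MAX = 15, 55  # assumed plausible range

present = [age for age in ages if age is not None]
n_missing = len(ages) - len(present)

# Record the row number (1-based) of every value outside the expected range.
out_of_range = [
    (row, age)
    for row, age in enumerate(ages, start=1)
    if age is not None and not EXPECTED_MIN <= age <= EXPECTED_MAX
]

summary = {
    "n": len(present),
    "missing": n_missing,
    "min": min(present),
    "max": max(present),
    "mean": statistics.mean(present),
    "sd": statistics.stdev(present),
}

print(summary)
print(out_of_range)
```

The maximum and the inflated mean and standard deviation all flag the same problem, and the row number tells you which record to look up in the source document.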

  38. Data Validation

  39. Once the outlier is found, one can reference the chart for clarification

  40. Descriptive Statistics

  41. Data Distribution Definitions by SPSS 16.0

  42. Data Distribution

  43. Data Distribution

  44. Scatterplots

  45. Who is Represented in the Data? • Sample test of proportions • Percent of gender • Percent of ethnicity • Sample test of means • Age • BMI • Does our data reflect the population at large or a subset?
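
One possible reading of the "sample test of proportions" is a one-sample z test comparing the sample proportion with a known population proportion. The Python sketch below takes that reading as an assumption; the function name and the example counts are invented, and the normal approximation it relies on is only reasonable for moderately large samples.

```python
import math
from statistics import NormalDist


def one_sample_proportion_z(successes: int, n: int, p0: float):
    """Return (z, two-sided p) for H0: the population proportion equals p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Invented example: 60 of 100 participants are women, compared with an
# assumed population proportion of 0.50.
z, p_value = one_sample_proportion_z(successes=60, n=100, p0=0.50)
print(round(z, 2), round(p_value, 4))
```

A small p-value here would suggest that the sample differs from the reference population on that characteristic, which bears on how far the results can be generalized.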

  46. Who is not? • Compare data of the included and excluded individuals • Are they similar for: • Age (continuous – Student t test) • BMI (continuous – Student t test) • Ethnicity (discrete/categorical – Chi-square test) • Gender (discrete/categorical – Chi-square test)
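
The two comparisons named on this slide can be sketched with the standard library. The code below is illustrative and not from the presentation: the function names and data are invented, the t test shown is the pooled-variance (Student) form, and the chi-square version handles only a 2x2 table without a continuity correction. In practice a statistical package would also report the p-value for the t statistic.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # exact for 1 degree of freedom
    return chi2, p_value


# Invented data: ages of included vs. excluded individuals.
included_age = [28, 31, 25, 34, 30]
excluded_age = [29, 33, 27, 35, 31]
t, df = student_t(included_age, excluded_age)

# Invented counts: rows = included / excluded, columns = female / male.
chi2, p_value = chi_square_2x2([[30, 20], [20, 30]])
print(round(t, 3), df, round(chi2, 2), round(p_value, 4))
```

If the included and excluded groups differ noticeably on these characteristics, the analysed sample may not represent everyone who was eligible.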

  47. Steps in Data Management • Definition • Acquisition • Data Entry • Validation • Analysis

  48. Data Analysis • Choose the right tool for the job • Commonly used statistical tests: • If the data are normally distributed (i.e., follow a bell-shaped curve), then we use parametric statistical tests. • If the data (1) are not “bell-shaped”, (2) come from small samples (generally fewer than 30 per group), or (3) contain outliers, then we use nonparametric statistical tests.
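
The rule of thumb on this slide can be written as a simple decision function. The Python sketch below only restates the slide's three conditions; the function name and arguments are hypothetical, and judging whether data "look normal" or contain outliers is left to the analyst and the statistician.

```python
def choose_test_family(looks_normal: bool, smallest_group_n: int,
                       has_outliers: bool) -> str:
    """Apply the slide's rule of thumb for picking a family of tests."""
    # Any one of the three conditions is enough to prefer nonparametric tests.
    if not looks_normal or smallest_group_n < 30 or has_outliers:
        return "nonparametric"
    return "parametric"


print(choose_test_family(looks_normal=True, smallest_group_n=45, has_outliers=False))
print(choose_test_family(looks_normal=True, smallest_group_n=12, has_outliers=False))
```

This is a starting point for the consultation rather than a substitute for it: the final choice of test also depends on the study design and the question being asked.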
