Documentation Tools in the Survey Lifecycle
Outline • What is NSFG Webdoc? • Instrument documentation != Survey documentation • Data Cleaning/Processing in the Data Lifecycle • Unexpected dividends: Documentation Tool as Management Tool
National Survey of Family Growth, Cycle 6 • Conducted by the U.S. National Center for Health Statistics of the Department of Health and Human Services in 2002 • Cases: 7643 female cases (age 15-44), 4928 male cases, 13593 pregnancy cases • 5300 variables • Topics include intended and unintended births, sexual intercourse, contraceptive use, infertility, HIV testing, …
NSFG WebDoc • Previously, for NSFG Cycle 5 • Codebook: 6000 pages • Questionnaires: 600 pages • User Guide: 450 pages • Features of WebDoc for NSFG Cycle 6 • Web browsable codebook • PDFs for printable codebooks • Search functionality • (based on Oracle Text Indexing)
WebDoc Components • Document Assembly Component: DDI from Blaise, Word documents, Excel spreadsheets, text documents, SAS data definition statements, SAS output • Web Display Component: Cocoon, XSLT, Oracle • Split shown on the slide: 80% / 20%
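A minimal sketch of the document-assembly idea, assuming each source (instrument metadata, spreadsheets, SAS statements) has already been parsed into per-variable dictionaries. The `merge_sources` helper, the field names, and the sample variables are hypothetical and are not WebDoc's actual schema.

```python
from collections import defaultdict


def merge_sources(*sources):
    """Combine partial per-variable metadata from several sources.

    Each source maps a variable name to a dict of fields. Earlier sources
    win when two supply the same field, so pass them in priority order.
    """
    merged = defaultdict(dict)
    for source in sources:
        for name, fields in source.items():
            for key, value in fields.items():
                # Keep the first non-empty value seen for each field.
                if value and key not in merged[name]:
                    merged[name][key] = value
    return dict(merged)


# Hypothetical parsed inputs (illustrative values only).
from_instrument = {"EXAMPLE_AGE": {"question_text": "Example question text"}}
from_spreadsheet = {
    "EXAMPLE_AGE": {"description": "Example description"},
    "EXAMPLE_WGT": {"description": "Example weight variable"},
}
from_sas = {
    "EXAMPLE_AGE": {"label": "Example label"},
    "EXAMPLE_WGT": {"label": ""},  # empty label: left out of the merged record
}

codebook = merge_sources(from_instrument, from_spreadsheet, from_sas)
```

Keeping one merged record per variable means a gap in any single source (such as the empty label above) stays visible in the assembled record instead of being hidden in a separate file.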
Instrument Documentation ≠ Survey Documentation • Instrument documentation: Blaise instrument documentation; female questionnaire; male questionnaire; 800+ questions • Survey documentation: final public release; 3 data files (female, male, pregnancy); 5300+ variables (raw, computed, recodes, imputation flags, weights, intermediate) • In-house data files • Full data files • Interviewer observations • Audio CASI files
Documentation in the Lifecycle • Lifecycle stages (diagram): Data Archiving, Study Design, Data Collection, Data Processing, Data Distribution, Data Discovery, Data Analysis • Labels: Instrument Doc, Survey Doc
• Consistent codes for DK, RF, NA • Confidentiality review • Consistency checks • Perturbation • Multiple regression & logical imputation • Recodes (10 levels) • How do you manage this process? A documentation tool can help!
Unexpected Dividend: Documentation Tool as Management/Diagnostic Aid • WebDoc was started near the end of the Data Collection Phase of the NSFG project. • A working version of WebDoc was finished early in the Data Processing/Cleaning Phase. • Small unplanned modifications to the working prototype were used to produce diagnostics/metrics of the data cleaning process.
Documentation in the Lifecycle • Lifecycle stages (diagram): Data Archiving, Study Design, Data Collection, Data Processing, Data Distribution, Data Discovery, Data Analysis • Label: Survey Doc
WebDoc as diagnostic tool • Features • Identifies problems: • Missing documentation pieces: question text, description • Editing errors: missing labels, misspelled variable names • Data errors: missing values, out-of-range values (incorrect computation or recode logic), inconsistent missing codes, SYS-MIS • … • Wiki used to track comments on each variable (almost zero programming necessary)
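The kinds of checks listed above can be sketched as a small report over variable metadata. This is an illustration only, not WebDoc's implementation: the `diagnose` function, the record layout, the sample variables, and the missing codes 98/99 are all assumptions made for the example.

```python
def diagnose(variables, missing_codes):
    """Return (variable, problem) pairs for a set of variable records."""
    problems = []
    for name, meta in variables.items():
        # Missing documentation pieces.
        for field in ("question_text", "description"):
            if not meta.get(field):
                problems.append((name, f"missing {field}"))
        labels = meta.get("labels", {})
        low, high = meta.get("valid_range", (None, None))
        for value in meta.get("values", []):
            if value is None:
                problems.append((name, "system-missing value"))
            elif value in missing_codes:
                continue  # an agreed missing code, not an error
            elif low is not None and not (low <= value <= high):
                problems.append((name, f"out-of-range value {value}"))
            elif labels and value not in labels:
                problems.append((name, f"unlabeled value {value}"))
    return problems


# Hypothetical records; 98 and 99 stand in for agreed missing codes.
variables = {
    "EXAMPLE_AGE": {
        "question_text": "Example question text",
        "description": "Example description",
        "valid_range": (15, 44),
        "values": [15, 30, 47, 98, None],
    },
    "EXAMPLE_CAT": {
        "question_text": "",
        "description": "Example categorical variable",
        "labels": {1: "First", 2: "Second"},
        "values": [1, 2, 3],
    },
}

report = diagnose(variables, missing_codes={98, 99})
for name, problem in report:
    print(f"{name}: {problem}")
```

Because the report is just a list of pairs, it could feed a per-variable comment page or a simple count of open problems per processing step.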
Documentation Tool as a Diagnostic/Management Tool • Do not treat documentation as an afterthought. • Have the documentation tool available at the beginning of the phase so it can serve as a diagnostic/management tool for the process. • Metadata in a database can be repurposed. • Another example: SRC’s BlaiseDoc • Can be used to produce an HTML questionnaire with programmed skip patterns • Non-programmers can use the HTML questionnaire to quickly check the instrument programming.
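As a rough illustration of repurposing stored metadata, the sketch below renders question records as an HTML list with their skip instructions. It is not BlaiseDoc's code or output format; the `render_questionnaire` function, the record fields, and the sample questions are hypothetical.

```python
from html import escape


def render_questionnaire(questions):
    """Render question metadata as an HTML list with skip instructions."""
    items = []
    for q in questions:
        line = f"<b>{escape(q['name'])}</b>: {escape(q['text'])}"
        # Show each programmed skip so a reviewer can check the routing.
        for answer, target in q.get("skips", {}).items():
            line += f"<br><i>If {escape(answer)}, go to {escape(target)}</i>"
        items.append(f"<li>{line}</li>")
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


# Hypothetical question records (illustrative only).
questions = [
    {"name": "Q1", "text": "Example screening question?", "skips": {"No": "Q3"}},
    {"name": "Q2", "text": "Example follow-up (age < 18)?"},
    {"name": "Q3", "text": "Example closing question?"},
]

html_page = render_questionnaire(questions)
print(html_page)
```

Writing the skip targets next to each question is what lets someone who cannot read the instrument source compare the routing against the questionnaire specification.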
Documentation in the Lifecycle • Lifecycle stages (diagram): Data Archiving, Study Design, Data Collection, Data Processing, Data Distribution, Data Discovery, Data Analysis • Labels: Survey Doc, Instrument Doc (shown twice)
Summary • Make Documentation Tools available early. • Use a database so documentation metadata can be easily repurposed. • Documentation tools can be used by non-programmers to inspect/manage the product quality and the process. • NSFG WebDoc • http://www.icpsr.umich.edu/NSFG