
Trials and Tribulations of creating DDI Codebooks at the University of Guelph

A. Michelle Edwards and Carol Perry, Data Resource Centre, University of Guelph, Guelph, Ontario.


Presentation Transcript


  1. Trials and Tribulations of creating DDI Codebooks at the University of Guelph • A. Michelle Edwards and Carol Perry, Data Resource Centre, University of Guelph, Guelph, Ontario

  2. Current Search Function

  3. Search Results

  4. Current Documentation

  5. Identifying Variables

  6. Rationale for Change • 522 datasets to date. • No comprehensive metadata search function. • No current variable search within dataset. • Limits researcher’s autonomy.

  7. XML tags • Started with approximately 30 tags… • As of June 5, 2002: 101 tags, of which 59 are filled. • Information contained inside tags.

  8. Codebook Templates • Used Maddie to develop initial template. • Edited the template to add tags as required. • Filled in fields common to all codebooks.

  9. Codebook Templates • Statistics Canada data • ICPSR data • B2020 data format

  10. Statistics Canada Codebook

  11. Differences between Codebook Templates • Authoring entity • Distributor (DLI vs. ICPSR) • Licenses • Other material: ICPSR abstract link • B2020: no direct link to database, no variables

  12. How do we move our information from an HTML readme file to an XML file???

  13. Readme to XML • Document Description • Study Description • Data Files Description

  14. Readme to XML • Currently: copy and paste information from the Readme (HTML) file into the XML codebook. • Alternative: a script extracts metadata from the HTML and places it into the XML. • Both approaches take about the same amount of time.
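
The script approach on slide 14 might look roughly like the following sketch. It is not the authors' script. It assumes a hypothetical readme that stores its metadata as `<dt>`/`<dd>` pairs, and the field labels (`Title`, `Abstract`, `File`) and the abstract text are made up for illustration. The element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`) follow the DDI Codebook sections listed on slide 13.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical readme fragment; the real readme layout is not shown in the slides.
README_HTML = """
<dl>
  <dt>Title:</dt><dd>Sun Exposure Survey 1996</dd>
  <dt>Abstract:</dt><dd>Illustrative abstract text.</dd>
  <dt>File:</dt><dd>ses96.dat</dd>
</dl>
"""


class DefinitionListParser(HTMLParser):
    """Collect <dt>label</dt><dd>value</dd> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None
        self._label = None

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "dt":
            self._label = text.rstrip(":").lower()
        elif self._tag == "dd" and self._label:
            self.fields[self._label] = text


def add_path(root, path, text):
    """Create (or reuse) nested elements along path and set the leaf text."""
    node = root
    for name in path.split("/"):
        child = node.find(name)
        if child is None:
            child = ET.SubElement(node, name)
        node = child
    node.text = text
    return node


def readme_to_codebook(html_text):
    """Map readme fields onto the three codebook sections."""
    parser = DefinitionListParser()
    parser.feed(html_text)
    fields = parser.fields
    root = ET.Element("codeBook")
    title = fields.get("title", "")
    add_path(root, "docDscr/citation/titlStmt/titl", title + " (codebook)")
    add_path(root, "stdyDscr/citation/titlStmt/titl", title)
    add_path(root, "stdyDscr/stdyInfo/abstract", fields.get("abstract", ""))
    add_path(root, "fileDscr/fileTxt/fileName", fields.get("file", ""))
    return root


codebook = readme_to_codebook(README_HTML)
print(ET.tostring(codebook, encoding="unicode"))
```

A mapping table from readme labels to element paths would keep the per-template differences (Statistics Canada, ICPSR, B2020) out of the parsing code.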

  15. Variable Information

  16. Variable Information • Sources of Variable information • Variable names, labels, and position from the SAS program. • Frequencies for each variable value from SAS output.

  17. Variable Information • Sources of Variable information • Literal questions from questionnaires if available.

  18. Variable Information • Script: • Looks into the SAS program – pulls out the variable names, labels and positions. • Looks into a SAS output file for frequencies and variable value labels.
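
As a rough illustration of the first step on slide 18, the sketch below pulls variable names, column positions and labels out of a SAS program. The SAS fragment is invented for the example and assumes column-style `input` and a single `label` statement with single-quoted labels; the authors' actual script and program layout are not shown in the slides. Frequencies from the SAS output file are not covered here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical SAS program; real programs may use other input styles.
SAS_PROGRAM = """
data ses;
  infile 'ses96.dat';
  input AGE 1-2 SEX 3 PROV 4-5;
  label AGE  = 'Age of respondent'
        SEX  = 'Sex of respondent'
        PROV = 'Province of residence';
run;
"""

NAME = r"([A-Za-z_]\w*)"


def parse_sas_program(text):
    """Return {variable: {"start", "end", "label"}} from INPUT and LABEL statements."""
    variables = {}
    input_stmt = re.search(r"\binput\b(.*?);", text, re.I | re.S)
    if input_stmt:
        # Matches "NAME 1-2", "NAME 3" and character variables such as "NAME $ 4-5".
        pattern = NAME + r"\s+\$?\s*(\d+)(?:\s*-\s*(\d+))?"
        for name, start, end in re.findall(pattern, input_stmt.group(1)):
            variables[name] = {"start": int(start), "end": int(end or start), "label": ""}
    label_stmt = re.search(r"\blabel\b(.*?);", text, re.I | re.S)
    if label_stmt:
        # Assumes labels contain no semicolons or embedded single quotes.
        for name, label in re.findall(NAME + r"\s*=\s*'([^']*)'", label_stmt.group(1)):
            if name in variables:
                variables[name]["label"] = label
    return variables


def to_data_dscr(variables):
    """Build a DDI-style dataDscr element with one var per variable."""
    data_dscr = ET.Element("dataDscr")
    for name, info in variables.items():
        var = ET.SubElement(data_dscr, "var", {"name": name})
        ET.SubElement(var, "location",
                      {"StartPos": str(info["start"]), "EndPos": str(info["end"])})
        ET.SubElement(var, "labl").text = info["label"]
    return data_dscr


parsed = parse_sas_program(SAS_PROGRAM)
print(ET.tostring(to_data_dscr(parsed), encoding="unicode"))
```

The regular expressions only work when programs follow one layout, which is the consistency requirement raised on slide 20.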

  19. Variable Information • Script: • If questionnaire is available – seeks out questions and matches with variables.
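
One way the matching step on slide 19 could work is sketched below. It assumes, purely for illustration, that each question in the questionnaire starts with an identifier such as `Q1.` and that variable names reuse those identifiers. Real questionnaires often do not follow such a convention. The question text is invented.

```python
import re

# Hypothetical questionnaire text; note there is no Q3.
QUESTIONNAIRE = """\
Q1. How old are you?
Q2. What is your sex?
Q4. In which province do you live?
"""


def match_questions(variable_names, questionnaire_text):
    """Pair variables with literal questions; also report variables left unmatched."""
    found = re.findall(r"^(Q\d+[A-Za-z]?)[.)]\s+(.+)$", questionnaire_text, re.M)
    questions = dict(found)
    matched = {name: questions[name] for name in variable_names if name in questions}
    unmatched = [name for name in variable_names if name not in questions]
    return matched, unmatched


matched, unmatched = match_questions(["Q1", "Q2", "Q3"], QUESTIONNAIRE)
print(matched)
print("No question found for:", unmatched)
```

Returning the unmatched variables separately makes it easy to review them by hand.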

  20. Variable Information • Problems with Script: • SAS programs must be consistent in their format. • Matching variables between the SAS output and the questionnaires is difficult.

  21. SAS to XML • SAS 8.2 - XML engine and ODS XML. • Can create XML SAS output. • Variable names, labels, value labels, and frequencies. • Variable positions with the input statement and Proc Print → XML.

  22. SAS to XML Frequency Output

  23. SAS to XML Proc Print output

  24. SAS to XML

  25. SAS to XML • Advantages: • SAS programs do not need to be consistent. • Use one program from start to finish – SAS. • Still in development.

  26. XML to Viewable Document • Saxon – to render our XML documents to HTML using XSL Stylesheets. • XSL – pull out info from XML document and display with HTML tags.
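
The presentation uses Saxon with XSL stylesheets for this step. The sketch below is not XSLT. It is a plain Python stand-in that shows the same idea of pulling information out of the XML document and wrapping it in HTML tags. The sample codebook fragment and the table layout are assumptions made for the example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical codebook fragment using DDI-style var/labl elements.
CODEBOOK = """<codeBook><dataDscr>
<var name="AGE"><labl>Age of respondent</labl></var>
<var name="SEX"><labl>Sex of respondent</labl></var>
</dataDscr></codeBook>"""


def variables_to_html(codebook_xml):
    """Render every var element as one row of an HTML table."""
    root = ET.fromstring(codebook_xml)
    rows = []
    for var in root.iter("var"):
        name = html.escape(var.get("name", ""))
        label = html.escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(variables_to_html(CODEBOOK))
```

An XSL stylesheet expresses the same selection declaratively, with a template matching `var` that emits the table row.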

  27. XSL Templates • Set for each: • Statistics Canada • ICPSR • B2020 • Initial templates from University of Virginia samples.

  28. XSL Templates • Abstract • Study Info • Methodology & File Dimensions • Questions • Variables & Frequencies • Other Documents

  29. XSL Stylesheets

  30. Search • Uses SAS IntrNet to call and run the UNIX SGREP search. • Creates an XML file with results. • Calls Saxon to render the file with the Variable XSL Stylesheet.
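
The search flow on slide 30 (search the codebooks, write the hits to an XML file, render that file) can be sketched as follows. This is a Python stand-in for the search step only; it does not use SAS IntrNet or SGREP. The `results` and `hit` element names, the file names and the sample codebooks are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical codebooks keyed by file name.
CODEBOOKS = {
    "ses96cbk.xml": """<codeBook><dataDscr>
        <var name="AGE"><labl>Age of respondent</labl></var>
        <var name="SUNBURN"><labl>Sunburn in past year</labl></var>
    </dataDscr></codeBook>""",
    "other.xml": """<codeBook><dataDscr>
        <var name="AGEGRP"><labl>Age group</labl></var>
    </dataDscr></codeBook>""",
}


def search_variables(codebooks, term):
    """Return a results element holding one hit per matching variable."""
    needle = term.lower()
    results = ET.Element("results", {"query": term})
    for filename, xml_text in codebooks.items():
        for var in ET.fromstring(xml_text).iter("var"):
            name = var.get("name", "")
            label = var.findtext("labl", default="")
            # Case-insensitive substring match on variable name or label.
            if needle in name.lower() or needle in label.lower():
                hit = ET.SubElement(results, "hit", {"file": filename, "name": name})
                hit.text = label
    return results


print(ET.tostring(search_variables(CODEBOOKS, "age"), encoding="unicode"))
```

The resulting XML could then be passed to a stylesheet for display, as the slide describes.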

  31. “Final Product” • Frames to put it all together. • Links to each component (abstract, etc.). • Returns the rendered HTML on the fly.

  32. “Final Product”

  33. “Final Product” • Sun Exposure Survey 1996 • http://tdr.uoguelph.ca/DATA/WWWDOCS/XML/SES2/ses96cbk.html

  34. “Finished Product” • 522 datasets to date. • 35 Completed DDI-compliant codebooks. • Fall completion ???

  35. “Final Product”

  36. “Final Product”

  37. “Final Product”

  38. “Final Product”

  39. “Final Product”

  40. “Final Product”

  41. “Final Product”
