
Pet Fish and High Cholesterol in the WHI OS: An Analysis Example

Joe Larson, 5/6/09


Presentation Transcript


  1. Pet Fish and High Cholesterol in the WHI OS: An Analysis Example • Joe Larson • 5/6/09

  2. The Question • In the WHI Observational Study, are women with pet fish less likely to have ever taken pills for high cholesterol at baseline?

  3. What We Want to Do • Find the data • Download the appropriate zip files • Load them into SAS • Merge our sets together • Do a basic Chi-Square test
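As a rough preview of where these steps lead, the merge and Chi-Square test might be sketched in SAS as below. This is only an illustration under stated assumptions: the three small datasets are made-up stand-ins for the downloaded files, and the variable names FISH and HICHOL are hypothetical placeholders (the real names come from the Form 37 and Form 30 data dictionaries). Only ID and OSFLAG are names taken from the walkthrough.

```sas
/* Made-up stand-ins for the three loaded files. */
data demographics;
  input ID OSFLAG;
  datalines;
1 1
2 1
3 0
4 1
5 1
6 1
;
run;

data form30;   /* HICHOL: hypothetical name for the cholesterol-pill item */
  input ID HICHOL;
  datalines;
1 1
2 0
3 1
4 0
5 1
6 0
;
run;

data form37;   /* FISH: hypothetical name for the pet-fish item */
  input ID FISH;
  datalines;
1 0
2 1
3 0
4 1
5 0
6 0
;
run;

/* A MERGE with a BY statement expects every input sorted by the key. */
proc sort data=demographics; by ID; run;
proc sort data=form30;       by ID; run;
proc sort data=form37;       by ID; run;

data analysis;
  merge demographics (in=in_dem) form30 form37;
  by ID;
  if in_dem and OSFLAG = 1;   /* keep observational study participants only */
run;

/* Two-way table of fish ownership by cholesterol pills, with a Chi-Square test. */
proc freq data=analysis;
  tables FISH*HICHOL / chisq;
run;
```

With only a handful of toy rows the test result itself means nothing; the point is the shape of the sort, merge, and PROC FREQ steps.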

  4. A Few Notes: • The data files used for this example are subsets of the full form data. • This was done to reduce download time and ease the replication of this analysis • All processes we will go through are identical to what you would do for a normal analysis

  5. Finding the Data • The first step is to figure out what we need to answer our question. We will need: • Pet data • Cholesterol data • Demographic data (to help us select only women in the observational study)

  6. First we want to go to the study operations web site: www.whiops.org

  7. Select the Study Operations Link

  8. Click on the “Data” Tab

  9. The Data Screen

  10. The Data Screen • Data available for both WHI and WHIMS • Images of all forms • Options to look for dictionaries by category • Link to the Data Distribution Agreement - Anyone who uses the data should fill it out - PIs are responsible for data at the clinics

  11. Let’s Look for Our Data • First, let’s hunt for the fish data. Since we don’t know what form it’s on, let’s click on the ‘Data dictionaries by analysis category’ link.

  12. Where Would Fish Be? • Let’s take a look in the Psychosocial/Behavioral subcategory

  13. Since there are 216 variables, it will be easier to right-click on the document and search for “fish”

  14. Searching for Fish

  15. Found It!

  16. The Fish Variable is on Form 37 - We should also note that it is a sub-question of ‘Do you have a pet’ and is a ‘Mark all that apply’ question!

  17. Now Let’s Find High Cholesterol • Going back to the ‘Data Dictionaries by Category’ screen, it will be in the Medical History section

  18. Medical History is Broken Up into Subcategories • It should be under Cardiovascular

  19. It looks like it is on Form 30

  20. Now We Just Need an Indicator to Tell Us Which Participants Are in the OS • All trial flags and indicators are in the Demographics file • Now we’re ready to download the data!

  21. Back to the Data Screen • Click on ‘Datasets’

  22. The Datasets Screen

  23. An Aside: The Datasets Page • All data is arranged by form • In addition to the zip files with the data, the .pdf files of the data dictionaries can also be downloaded separately • For more detailed info on what’s in a zip file, please see the Appendix at the end of the walkthrough

  24. Downloading the Data • For the purposes of this demo, smaller sets have been created that anyone with a WHI password can download • Normally, only PIs can download the actual data files • Scroll down to the bottom of the Datasets page to find these files

  25. WHI Example Files for Downloading

  26. Downloading the Data • When you click on the zip file link, you get a pop up box • Save the file in the directory of your choice

  27. Downloading Data • For my example, I’ve saved all of the data in a directory I created called “DataTraining”

  28. Extracting the Data from the Zip Files • Double-click on the first zip file, the demographics file; you should be able to see the contents • Click on the ‘Extract’ button

  29. Extracting the Data from the Zip Files • Extract the files to the same directory as your zip files

  30. Extracting the Data from the Zip Files • Repeat with the other two zip files. • The resulting directory should look like this:

  31. Analyzing the Data • We now have everything we need to look at the data • For the purposes of this example, I’m going to use SAS • Other software such as S-Plus, Stata, R, SPSS, and others can also be used • Even if using another program, the SAS Load code provided can be used to determine the order of variables in the dataset as well as formats

  32. Loading in the Data • From the Default SAS screen, go up to the File menu and select ‘Open Program’

  33. Loading in the Data • Select all three of the files and click ‘Open’

  34. Loading in the Data • Let’s start with the demographics data • One change needs to be made to each file to let SAS know where the data is located • Find where the actual file is being read in; this is the line in the file that begins with INFILE • We can also change the name of the file we are creating in the line above the INFILE statement

  35. Loading in the Data • In the example, we’ve put the data in ‘S:\DataTraining’ • I’ve also renamed the file ‘demographics’ instead of the default, which was ‘dem_ctos_train’
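The general shape of such a load step can be sketched as follows. This is not the load code shipped in the zip file, which defines the real variable list, delimiters, and formats; it is a minimal stand-in that writes a tiny made-up file to a temporary location so the example runs anywhere. The fileref `rawdem`, the comma-delimited layout, and the header row are all assumptions for illustration.

```sas
/* Temporary stand-in for the downloaded data file (assumption).          */
/* In the real load code, INFILE names a quoted path instead, e.g. a file */
/* under S:\DataTraining.                                                 */
filename rawdem temp;

/* Write a small made-up file so there is something to read. */
data _null_;
  file rawdem;
  put 'ID,OSFLAG';
  put '100001,1';
  put '100002,0';
run;

/* The line above INFILE names the dataset being created;                 */
/* here it is 'demographics' rather than the default 'dem_ctos_train'.    */
data demographics;
  infile rawdem dsd firstobs=2;   /* DSD: comma-delimited; skip header row */
  input ID OSFLAG;
run;
```

The two edits the slides describe correspond to the DATA statement (the output name) and the INFILE statement (the file location); nothing else in the supplied load code should need changing.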

  36. Loading in the Data • Now that the location of the datafile has been updated, we can run the SAS Code • Go to the ‘running man’ icon, which is the button to submit code

  37. Loading in the Data • If you are concerned or unsure whether it worked or not, you can look at the SAS log. The tab is at the bottom of the screen. • Any errors would show up as RED in the log

  38. SAS Log for Loading in Demographics

  39. Loading in the Data • Now we want to repeat the process for the other two files. • First for Form 30

  40. Loading in the Data • Then for Form 37

  41. Looking at What We Have • Let’s make a new SAS program file to look at the data • Go to the File Menu and select ‘New Program’

  42. Looking at What We Have • We can also now close the three files used to load the data into SAS • You should now have your new program, the log, and the output tabs

  43. Looking at What We Have • To know the names of the files we’ve loaded we can use some PROC DATASETS code.
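A minimal version of that PROC DATASETS call might look like the following, assuming the three files were loaded into the default WORK library (the slides do not say which library was used):

```sas
/* List the members of the WORK library; the directory listing */
/* is written to the SAS log, not the output window.           */
proc datasets library=work;
quit;
```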

  44. Looking at What We Have • Once the code is typed in, click the submit button again and then go to the LOG tab

  45. Looking at What We Have • In the log we see the three files we’ve loaded: - DEMOGRAPHICS (The Demographics File) - FORM30 (The Cholesterol Data) - FORM37 (The Pet Fish Data) • Now we need to do some data manipulation to pull this all together

  46. The Demographics File • Let’s look at the demographics file (DEMOGRAPHICS) first • PROC CONTENTS can be used to determine what variables are in a file • Highlight the code and then hit the submit button
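A PROC CONTENTS call of the kind described here could be written as below. The two-row dataset is a made-up stand-in so the example runs on its own; in the walkthrough, DEMOGRAPHICS already exists from the load step and the DATA step would be unnecessary.

```sas
/* Made-up stand-in for the loaded demographics file. */
data demographics;
  input ID OSFLAG;
  datalines;
100001 1
100002 0
;
run;

/* List the variables in the file, with their types and lengths. */
proc contents data=demographics;
run;
```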

  47. The Demographics File • On the output screen we see what variables are available • We only want to keep OS participants, so we will need the OSFLAG variable, which has a value of 1 for participants in the observational study • We also want to keep the ID variable for merging the files later

  48. The Demographics File • Let’s Look at the Code to do this: • We are manipulating the ‘demographics’ file and creating a new file ‘demographics_2’ with our changes • We only want to keep the ID and OSFLAG variables
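The code on this slide is not reproduced in the transcript, but a step matching the description might look roughly like this. The dataset names, ID, OSFLAG, and the value 1 come from the slides; the toy rows are made up, and the choice of a WHERE statement (rather than a subsetting IF) is an assumption.

```sas
/* Made-up stand-in for the loaded demographics file. */
data demographics;
  input ID OSFLAG;
  datalines;
100001 1
100002 0
100003 1
;
run;

/* Create demographics_2 from demographics:                 */
/* keep only OS participants and only the variables needed. */
data demographics_2;
  set demographics;
  where OSFLAG = 1;   /* 1 = in the observational study  */
  keep ID OSFLAG;     /* ID is needed for merging later  */
run;
```

Writing to a new dataset instead of overwriting `demographics` leaves the loaded file untouched, so the step can be rerun if the selection needs to change.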

  49. The Form 30 File • This is our medical history data • Looking at the data dictionary, we see that this file is a baseline file with one row per participant

  50. The Form 30 File • Let’s look at the contents of the Form 30 file
