A workshop on using R to select a sample for EHES
Susie Cooper & Johan Heldal, Statistics Norway
Overview
• What is R and why use it?
• Practical Exercises
  • Installing and loading R and packages
  • Reading external files
  • Calculating sample sizes
  • Stage 1 - Selecting Primary Sampling Units (PSU)
  • Stage 2 - Selecting Secondary Sampling Units (SSU)
• Where to get more information
Why use R for EHES?
• Its use has been agreed with the EU because:
  • It is free, and therefore available to all countries involved.
  • It is very flexible.
  • It is a very powerful and fast tool for sampling and analysis.
However…
• The learning curve can be steep.
• There is no user-friendly interface.
What is EHESsampling?
• A tool for planning the sampling design
  • Can be used to find good stratifications
  • Can calculate cost-variance optimal sample sizes within PSUs
  • Can calculate costs and variances of alternatives
• A tool for taking a probability sample from a sampling frame
Using EHESsampling
• The EHESsampling manual
• Before using EHESsampling you have to prepare some input datasets from the main sampling frame.
For sampling at stage 1 you need:
• A dataset describing the PSUs
• A dataset describing the strata
For stage 2 you need:
• The main sampling frame describing the individual units
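As a rough sketch of that preparation step, the base R code below derives a PSU-level and a stratum-level dataset from a toy sampling frame. The frame and its column names (person_id, psu, stratum, size, n_psu) are made up for illustration; the layout that EHESsampling actually expects is described in its manual.

```r
# Toy sampling frame: one row per individual (all names are hypothetical).
frame.df <- data.frame(
  person_id = 1:12,
  psu       = rep(c("A", "B", "C", "D"), times = c(4, 3, 3, 2)),
  stratum   = rep(c("North", "South"), times = c(7, 5)),
  stringsAsFactors = FALSE
)

# One row per PSU: its stratum and the number of individuals it contains.
PSUs.df <- aggregate(person_id ~ psu + stratum, data = frame.df, FUN = length)
names(PSUs.df)[names(PSUs.df) == "person_id"] <- "size"

# One row per stratum: number of PSUs and total number of individuals.
strata.df <- data.frame(
  stratum = sort(unique(PSUs.df$stratum)),
  n_psu   = as.vector(table(PSUs.df$stratum)),
  size    = as.vector(tapply(PSUs.df$size, PSUs.df$stratum, sum))
)

print(PSUs.df)
print(strata.df)
```

Keeping the main frame as the single source and deriving the two summary datasets from it helps the PSU and stratum counts stay consistent with each other.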
1. Loading Packages
• Load the EHESsampling package and other necessary packages each time you re-open R:
library(EHESsampling)
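library() stops with an error if a package has not been installed yet. A minimal sketch of a more forgiving start-up check is shown below; load_if_available is a hypothetical helper written for this example and is not part of EHESsampling.

```r
# Hypothetical helper: attach a package if it is installed, otherwise say so.
load_if_available <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)  # attach the package by its name
    TRUE
  } else {
    message("Package '", pkg, "' is not installed; install it before continuing.")
    FALSE
  }
}

ok <- load_if_available("EHESsampling")
```

The returned TRUE or FALSE can be used at the top of a script to stop early with a clear message instead of failing later.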
2. Reading External Files
• Open a new script by selecting File and then New script.
2. Reading External Files
• Set the working directory where the data files are stored by typing into the new script:
setwd("X:/120/EHES/R/Data")
(the path is the location on your computer where the data files are stored)
• Then press Ctrl + R to send the line to the console.
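setwd() fails if the folder does not exist, so it can help to check the path first. The sketch below uses a temporary folder as a stand-in for a real data directory; the folder name EHES_data is an assumption made only for this example.

```r
# Hypothetical data folder; a temporary directory stands in for a real path
# such as the "X:/120/EHES/R/Data" used in the workshop.
data_dir <- file.path(tempdir(), "EHES_data")
dir.create(data_dir, showWarnings = FALSE)

# setwd() stops with an error if the folder does not exist, so check first.
if (dir.exists(data_dir)) {
  old_dir <- setwd(data_dir)   # setwd() returns the previous working directory
  print(getwd())               # confirm where R will now look for files
  print(list.files())          # list the files available in that folder
  setwd(old_dir)               # restore the original directory when finished
}
```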
2. Reading External Files
• Read in the chosen file and save it in the working environment:
PSUs.df <- read.table("post1000.csv", sep = ";", dec = ",", header = TRUE)
• The file is now stored as PSUs.df for this session.
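To see what the sep and dec arguments do, the sketch below writes a small semicolon-separated file with decimal commas and reads it back. The file contents and column names are invented for this example and are not taken from post1000.csv.

```r
# Write a small semicolon-separated file with decimal commas (made-up values).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("postcode;population;mean_age",
             "1000;2500;41,5",
             "1001;1800;38,2"), csv_path)

# Same arguments as in the workshop call: ";" separates columns, "," marks decimals.
demo.df <- read.table(csv_path, sep = ";", dec = ",", header = TRUE)

# read.csv2() uses sep = ";", dec = "," and header = TRUE as its defaults.
demo2.df <- read.csv2(csv_path)

str(demo.df)  # mean_age should be read as numeric, not as text
```

If dec = "," is left out, a column such as mean_age is read as text rather than numbers, which is a common source of errors later on.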
2. Reading External Files
• To see the start of the data set, type:
head(PSUs.df)
(this prints the first 6 lines of the data set)
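A few other base R functions are useful for a first look at a data set. The sketch below uses an invented data frame in place of PSUs.df.

```r
# Made-up PSU data standing in for PSUs.df.
demo.df <- data.frame(psu  = sprintf("P%02d", 1:10),
                      size = seq(100, 1000, by = 100))

head(demo.df)          # first 6 rows by default
head(demo.df, n = 3)   # first 3 rows
dim(demo.df)           # number of rows and columns
str(demo.df)           # column names and types
```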
Further Sampling Steps
• Read in the strata dataset
• Calculate the PSU sample sizes
• Take a sample of PSUs (stage 1)
• Merge the selected PSUs with the main sampling frame containing individual units
• Sample individual units (stage 2)
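The steps above can be sketched in base R as follows. This is only an illustration of the two-stage idea, not the EHESsampling functions themselves: it assumes simple random sampling at both stages, fixed sample sizes chosen for the example, and a made-up frame with hypothetical column names.

```r
set.seed(2012)  # fixed seed so the draw can be reproduced

# Toy frame: 8 PSUs in 2 strata, each PSU holding a different number of people.
psu_sizes <- c(40, 60, 30, 70, 50, 20, 80, 50)
frame.df <- data.frame(
  person_id = seq_len(sum(psu_sizes)),
  psu       = rep(paste0("PSU", 1:8), times = psu_sizes),
  stratum   = rep(rep(c("S1", "S2"), each = 4), times = psu_sizes),
  stringsAsFactors = FALSE
)
PSUs.df <- unique(frame.df[, c("psu", "stratum")])

n_psu_per_stratum <- 2   # assumed stage 1 sample size per stratum
n_per_psu         <- 10  # assumed stage 2 sample size per selected PSU

# Stage 1: simple random sample of PSUs within each stratum.
selected_psus <- unlist(lapply(split(PSUs.df$psu, PSUs.df$stratum),
                               sample, size = n_psu_per_stratum))

# Merge: keep only the frame rows that belong to a selected PSU.
stage2_frame <- frame.df[frame.df$psu %in% selected_psus, ]

# Stage 2: simple random sample of individuals within each selected PSU.
sample_ids <- unlist(lapply(split(stage2_frame$person_id, stage2_frame$psu),
                            sample, size = n_per_psu))
sample.df <- frame.df[frame.df$person_id %in% sample_ids, ]

table(sample.df$stratum, sample.df$psu)  # sampled individuals per stratum and PSU
```

In a real design the stage 1 and stage 2 sample sizes would come from the sample size calculation step rather than being fixed by hand.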
Help!
• EHESsampling manual available at: www.ehes.info
• EHES participant manual, Part 1: Chapter 05
• R websites:
  • R official site: www.r-project.org
  • Quick R: www.statmethods.net
• Us:
  • Johan.Heldal@ssb.no
  • Susie.Cooper@ssb.no