Ranked Set Sampling: Improving Estimates from a Stratified Simple Random Sample

Christopher Sroka, Elizabeth Stasny, and Douglas Wolfe
Department of Statistics, The Ohio State University

Alternative Title – Ranked Set Sampling: Where are the Samplers?
• Purpose: Show that ranked set sampling (RSS) can be incorporated into traditional sampling designs
• Compare RSS to traditional sampling designs
• Develop stratified ranked set sampling (SRSS)
• Use computer simulation to evaluate relative standard error

Notation
• Select m random samples (sets), each of size m, with replacement from the population
• Order the m items within each set using an auxiliary variable or visual judgment
• Do this ranking before measuring the variable of interest

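Notation
• Select one ranked unit from each set and quantify it with respect to the variable of interest
• X[i]j is the unit ranked i within set j; in standard RSS the unit ranked i is taken from set i (X[1]1, X[2]2, . . . , X[m]m)

          Set 1   Set 2   Set 3   . . .   Set m
Rank 1    X[1]1   X[1]2   X[1]3   . . .   X[1]m
Rank 2    X[2]1   X[2]2   X[2]3   . . .   X[2]m
Rank 3    X[3]1   X[3]2   X[3]3   . . .   X[3]m
. . .
Rank m    X[m]1   X[m]2   X[m]3   . . .   X[m]m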

Notation
• Repeat k times (cycles) to get a total of mk measurements on our variable of interest

Cycle 1   X[1]1   X[2]1   X[3]1   . . .   X[m]1
Cycle 2   X[1]2   X[2]2   X[3]2   . . .   X[m]2
. . .
Cycle k   X[1]k   X[2]k   X[3]k   . . .   X[m]k

Notation
• Our estimator of the population mean for the variable of interest is the average of our mk quantified observations:

$$\bar{X}_{RSS} = \frac{1}{mk} \sum_{j=1}^{k} \sum_{i=1}^{m} X_{[i]j}$$
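The selection and averaging steps from the Notation slides can be sketched in Python. This is a minimal illustration, not the authors' code: the function names `rss_sample` and `rss_mean`, the `(ranking_value, value_of_interest)` pair layout, and the use of the standard `random` module are all assumptions made for the example.

```python
import random
from statistics import mean


def rss_sample(population, m, k, rng):
    """Draw a ranked set sample: k cycles, each with m sets of size m.

    `population` is a list of (ranking_value, value_of_interest) pairs
    (a hypothetical layout chosen for this sketch). Only one unit per set
    is "measured", so the result holds m * k values of interest.
    """
    measured = []
    for _ in range(k):                            # k cycles
        for i in range(m):                        # set i supplies the unit ranked i + 1
            units = rng.choices(population, k=m)  # set of size m, drawn with replacement
            units.sort(key=lambda u: u[0])        # rank on the auxiliary variable only
            measured.append(units[i][1])          # quantify the selected unit
    return measured


def rss_mean(measured):
    """Average of the m * k quantified observations."""
    return mean(measured)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic population: the ranking variable x and the variable of interest 2x.
    population = [(x, 2.0 * x) for x in range(100)]
    sample = rss_sample(population, m=3, k=10, rng=rng)
    print(len(sample), rss_mean(sample))
```

Ranking uses only the first element of each pair, which mirrors the point that units are ordered before the variable of interest is measured.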

RSS vs. Stratified Sampling
• For fixed sample size n = mk,

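RSS vs. Stratified Sampling
• Expect stratified simple random sampling (SSRS) to be better than RSS, since it uses more population information
• Can we improve on SSRS using RSS?
• Stratified ranked set sampling (SRSS): Use RSS to select units from each stratum
• We estimate the population mean by a weighted sum over strata that combines the stratum weights with the RSS estimator from before, computed within each stratum:

$$\bar{X}_{SRSS} = \sum_{h=1}^{L} W_h \, \bar{X}_{RSS,h}$$

where W_h is the weight of stratum h and \bar{X}_{RSS,h} is the RSS estimator from before applied within stratum h (the symbols L, h, W_h, and \bar{X}_{RSS,h} are notation assumed here).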
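A stratum-weighted RSS estimate of this kind could be computed as follows. This is a sketch under stated assumptions: the stratum weight is taken to be the stratum's share of the population (N_h / N), one RSS cycle is drawn per stratum, and the names `rss_one_cycle` and `srss_mean` are hypothetical.

```python
import random
from statistics import mean


def rss_one_cycle(units, m, rng):
    """One RSS cycle inside a stratum: m sets of size m, one measurement per set."""
    measured = []
    for i in range(m):
        ranked = rng.choices(units, k=m)   # set of size m, drawn with replacement
        ranked.sort(key=lambda u: u[0])    # rank on the auxiliary variable
        measured.append(ranked[i][1])      # measure the unit ranked i + 1
    return measured


def srss_mean(strata, m, rng):
    """Weighted sum of within-stratum RSS means.

    `strata` is a list of strata, each a list of
    (ranking_value, value_of_interest) pairs.
    Assumption: stratum weight = stratum size / population size.
    """
    total = sum(len(units) for units in strata)
    estimate = 0.0
    for units in strata:
        weight = len(units) / total
        estimate += weight * mean(rss_one_cycle(units, m, rng))
    return estimate


if __name__ == "__main__":
    rng = random.Random(0)
    # Three synthetic strata with different levels of the variable of interest.
    strata = [
        [(x, 10.0 + x) for x in range(40)],
        [(x, 50.0 + x) for x in range(40)],
        [(x, 90.0 + x) for x in range(20)],
    ]
    print(srss_mean(strata, m=10, rng=rng))
```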

Simulation
• USDA data on corn production in Ohio
• Treat the data set as a population
• Use computer simulation to estimate the precision of each technique
• Sample from the data using each method
• Estimate the mean accordingly
• Repeat 50,000 times
• Use the variance of the 50,000 mean estimates (via its square root) to approximate the standard error of the estimator
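The repeat-and-summarize loop described on this slide might look like the sketch below. It uses a synthetic normal population as a stand-in, since the USDA corn data are not reproduced here, and simple random sampling as the example design. The helper names `monte_carlo_se` and `make_srs_estimator` are hypothetical.

```python
import random
from statistics import mean, stdev


def monte_carlo_se(draw_estimate, reps, rng):
    """Approximate an estimator's standard error by repeated sampling.

    `draw_estimate(rng)` draws one fresh sample and returns its mean estimate.
    The standard deviation of the `reps` estimates (the square root of their
    variance) approximates the standard error.
    """
    estimates = [draw_estimate(rng) for _ in range(reps)]
    return stdev(estimates)


def make_srs_estimator(values, n):
    """Return a function that draws a simple random sample of size n and averages it."""
    def draw(rng):
        return mean(rng.sample(values, n))   # sampling without replacement
    return draw


if __name__ == "__main__":
    rng = random.Random(2024)
    # Synthetic stand-in population, not the USDA data.
    population = [rng.gauss(100.0, 15.0) for _ in range(1000)]
    se = monte_carlo_se(make_srs_estimator(population, 30), reps=5000, rng=rng)
    print(f"approximate SE of the SRS mean: {se:.3f}")
```

Passing the sampling design in as a function keeps the loop the same for every method, so an RSS, SSRS, or SRSS draw could be swapped in without changing `monte_carlo_se`.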

Simulation
• Performed the simulation multiple times, varying
  • Sample size
  • Number of strata
  • Number of sets
  • Combination of ranking variable and variable of interest (correlations vary)
• Reported standard error as a percent of the standard error under simple random sampling
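To make the reporting metric concrete, the sketch below expresses a simulated RSS standard error as a percent of the simulated SRS standard error at the same sample size. The population is synthetic, with a ranking variable strongly correlated with the variable of interest, and every function name and parameter value is an assumption for illustration. It does not reproduce the slides' results.

```python
import random
from statistics import mean, stdev


def make_population(size, noise_sd, rng):
    """Synthetic (ranking_value, value_of_interest) pairs; smaller noise means higher correlation."""
    pairs = []
    for _ in range(size):
        x = rng.gauss(0.0, 1.0)
        pairs.append((x, 50.0 + 10.0 * x + rng.gauss(0.0, noise_sd)))
    return pairs


def rss_mean_once(pairs, m, k, rng):
    """Mean of one ranked set sample: k cycles of m sets of size m."""
    measured = []
    for _ in range(k):
        for i in range(m):
            units = rng.choices(pairs, k=m)
            units.sort(key=lambda u: u[0])   # rank on the auxiliary variable
            measured.append(units[i][1])
    return mean(measured)


def srs_mean_once(pairs, n, rng):
    """Mean of one simple random sample of size n."""
    return mean([y for _, y in rng.sample(pairs, n)])


def relative_se_percent(pairs, m, k, reps, rng):
    """RSS standard error as a percent of the SRS standard error for n = m * k."""
    n = m * k
    rss = [rss_mean_once(pairs, m, k, rng) for _ in range(reps)]
    srs = [srs_mean_once(pairs, n, rng) for _ in range(reps)]
    return 100.0 * stdev(rss) / stdev(srs)


if __name__ == "__main__":
    rng = random.Random(11)
    pairs = make_population(2000, noise_sd=1.0, rng=rng)
    print(f"{relative_se_percent(pairs, m=3, k=4, reps=5000, rng=rng):.1f}% of SRS")
```

Values below 100 indicate that the design is more precise than simple random sampling at the same number of measured units.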

Simulation
• Number of sets in RSS equals number of strata in SSRS and SRSS
• Only one cycle within strata for SRSS
• For example, for 3 strata and a sample size of 30
  • RSS: 3 sets of 3, repeated for 10 cycles
  • SSRS: 3 strata, 10 observations per stratum
  • SRSS: 3 strata, 10 sets of 10 per stratum, 1 observation per set
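The allocation rule in this example can be written out as arithmetic. The sketch below assumes equal allocation across strata and a sample size divisible by the number of strata. The function name `design_sizes` and the dictionary keys are hypothetical, and the "ranked" counts (units drawn and ordered but not necessarily measured) are derived here rather than taken from the slides.

```python
def design_sizes(n, strata):
    """Set, cycle, and unit counts for RSS, SSRS, and SRSS at total sample size n.

    Follows the slide's rule: the number of sets in RSS equals the number of
    strata, and SRSS uses a single cycle within each stratum.
    """
    if n % strata != 0:
        raise ValueError("this sketch assumes n is divisible by the number of strata")
    per_stratum = n // strata
    return {
        "RSS": {
            "set_size": strata,
            "sets_per_cycle": strata,
            "cycles": per_stratum,
            "measured": strata * per_stratum,
            "ranked": strata * strata * per_stratum,
        },
        "SSRS": {
            "strata": strata,
            "per_stratum": per_stratum,
            "measured": strata * per_stratum,
            "ranked": 0,
        },
        "SRSS": {
            "strata": strata,
            "sets_per_stratum": per_stratum,
            "set_size": per_stratum,
            "cycles": 1,
            "measured": strata * per_stratum,
            "ranked": strata * per_stratum * per_stratum,
        },
    }


if __name__ == "__main__":
    for design, sizes in design_sizes(30, 3).items():
        print(design, sizes)
```

For 3 strata and n = 30, all three designs measure 30 units, but RSS ranks 90 units and SRSS ranks 300, which is relevant to the cost question raised under Future Research.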

Results
• SRSS is more precise than SSRS for almost all combinations of variables, set sizes, and sample sizes
• The precision gain from SRSS is greatest with
  • Strong correlation between the ranking variable and the variable of interest (i.e., accurate rankings)
  • Large sample size
• SRSS is less precise, or not much more precise, than SSRS with
  • Low correlation
  • A large number of strata combined with a small sample size

Results – High Correlation (0.996)
[Figure: SSRS vs. SRSS; WHITE = SSRS, RED = SRSS]

Results – Moderate Correlation (0.620)
[Figure: SSRS vs. SRSS; WHITE = SSRS, RED = SRSS]

Conclusions
• Can improve precision of survey estimation by using RSS in place of SRS
• SRSS can improve estimation for all variables in a survey, even those only weakly correlated with the ranking variable, although the gains shrink as the correlation falls
• SRSS may not require collecting additional information

Future Research
• Use different variables for stratification and ranking
• Performance under optimal strata allocation
• Do results hold for any sampling design that uses SRS in its final stage?
• Cost considerations
