The Importance of Sample Size and Its Varying Effects on Precision in Large-Scale Surveys. Dipankar Roy, PhD Bangladesh Bureau of Statistics dr.droy69@gmail.com Presented at the International Seminar at Rajshahi University 18-19 October 2012 Rajshahi, Bangladesh.
Sample size determination • The act of choosing the number of observations to be surveyed • The method should be statistically sound and formula-based • Samples should be selected with known selection probabilities (which determine the base weights) • Samples should be allocated scientifically, according to need
Sampling • Select units from the population • Study them • Generalize the result (estimate/statistic) back to the population (parameter) • Infer about the population through the sample
A major goal of data analysis • Use the sample mean (or proportion) to estimate the corresponding parameter in the population • Statistical inference is about the population, NOT the sample
Two approaches • Precision-based approach • Power-based approach
How large a sample is needed to • enable statistical judgments that are accurate and reliable, AND • attain a desired level of precision?
Sample size should not be determined • arbitrarily • without solving the sample size equation • An adequate (optimum) sample size supports accurate, precise and reliable estimates • Samples that are too small lack precision • Unnecessarily large samples yield minimal gain
Sampling Error • Standard Error (SE) • Margin of Error (MOE) • Confidence Interval (CI)
MOE • Indicates that, at the stated confidence level, a data user can expect the estimate (statistic) and the population value (parameter) to differ by no more than the value of the MOE
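The three measures of sampling error listed above are closely linked. As a minimal sketch, assuming simple random sampling and a normal approximation (neither is stated on the slide), they can be computed for a sample proportion as follows. The function name `proportion_error_summary` and the example values are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_error_summary(p, n, confidence=0.95):
    """Return (SE, MOE, CI) for a sample proportion p from a sample of size n.

    Assumes simple random sampling and a normal approximation.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p * (1 - p) / n)   # standard error
    moe = z * se                 # margin of error
    ci = (p - moe, p + moe)      # confidence interval
    return se, moe, ci


# Hypothetical example: 30% observed in a sample of 1,000.
se, moe, ci = proportion_error_summary(p=0.30, n=1000)
print(f"SE={se:.4f}  MOE={moe:.4f}  CI=({ci[0]:.4f}, {ci[1]:.4f})")
```

The MOE is the SE scaled by the critical value, and the CI is the estimate plus or minus the MOE.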
Margin of error and risk • There is some margin of error d in the estimated proportion p in relation to the true proportion P • There is some risk α that the actual error is larger than d: Pr(|p - P| > d) = α, or equivalently Pr(|p - P| <= d) = 1 - α
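One way to turn this probability statement into a sample size, assuming simple random sampling and a normal approximation to the distribution of p (an assumption not stated on the slide), is to set d equal to the critical value times the standard error and solve for n. Here z_{α/2} denotes the standard normal critical value, written as z elsewhere in this presentation.

```latex
\Pr\left(|p - P| \le d\right) = 1 - \alpha
\;\Longrightarrow\;
d = z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}
\;\Longrightarrow\;
n = \frac{z_{\alpha/2}^{2}\,P(1-P)}{d^{2}}
```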
n = [z^2 * P * (1 - P)] / d^2 • The sample size depends on: • the level of precision (d), • the level of confidence or risk (z, set by α), and • the degree of variability in the attribute (P(1 - P))
A sample of size n is required to • estimate a proportion p • to within d of its true value P • with a 100(1-α)% confidence level
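The formula n = z^2 * P * (1 - P) / d^2 can be evaluated directly. The sketch below assumes simple random sampling and rounds up to the next whole unit. The function name `sample_size_for_proportion` and the inputs (P = 0.5, d = 0.05, 95% confidence) are illustrative choices, not values from the presentation.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(P, d, confidence=0.95):
    """Sample size needed to estimate a proportion P to within d."""
    # z is the two-sided standard normal critical value for the confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * P * (1 - P) / d ** 2)


# P = 0.5 is the most conservative choice because it maximizes P * (1 - P).
print(sample_size_for_proportion(P=0.5, d=0.05))   # 385
print(sample_size_for_proportion(P=0.5, d=0.025))  # 1537
```

Halving the margin of error from 0.05 to 0.025 roughly quadruples the required sample size.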
HIES (Household Income and Expenditure Survey) • The coefficient of variation should have been used in determining the sample size for a study variable such as income • Household income, by its nature, tends to be heterogeneous within and/or between localities
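A common textbook way to use the coefficient of variation (CV) for a continuous variable such as income is n = (z * CV / r)^2, where r is the acceptable relative margin of error. The sketch below assumes that formula and simple random sampling. It is not the formula used in HIES, and the CV values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(cv, relative_moe, confidence=0.95):
    """Sample size to estimate a mean to within relative_moe of its value.

    cv is the population coefficient of variation (standard deviation / mean).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * cv / relative_moe) ** 2)


# Hypothetical CVs: a more heterogeneous variable needs a larger sample.
for cv in (0.5, 1.0):
    print(cv, sample_size_for_mean(cv=cv, relative_moe=0.05))
```

Because n grows with the square of the CV, doubling the heterogeneity of the study variable roughly quadruples the required sample.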
Interval width • Interval width is equal to twice the margin of error and is directly proportional to 1/sqrt(n) • If the sample size is increased by a factor of 4, the interval width will be reduced by half • Higher levels of precision require larger sample sizes • Higher confidence levels require larger sample sizes
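The factor-of-4 rule can be checked numerically. This is a small sketch for a proportion, assuming simple random sampling and a normal approximation. The function name `interval_width` and the sample sizes 400 and 1,600 are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(p, n, confidence=0.95):
    """Width of the confidence interval for a proportion: 2 * z * SE."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sqrt(p * (1 - p) / n)


w_400 = interval_width(p=0.5, n=400)
w_1600 = interval_width(p=0.5, n=1600)      # sample size multiplied by 4
w_400_99 = interval_width(p=0.5, n=400, confidence=0.99)

print(w_1600 / w_400)    # ratio of widths after quadrupling n
print(w_400_99 > w_400)  # higher confidence gives a wider interval at the same n
```

The ratio is one half because the width scales with 1/sqrt(n). The second comparison shows why a higher confidence level needs a larger sample to keep the same width.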
Other considerations • Sample size depends on the domains for which estimates are required • Sample size does not necessarily depend on how large the population is • Beyond a certain population size, there is no need to increase the sample size as the population grows larger • For any complex design, the sample size should be inflated by the design effect.
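The last two points can be illustrated with the standard finite population correction, n = n0 / (1 + (n0 - 1) / N), and a multiplicative design effect. Both are textbook adjustments assumed here for illustration. The function names, the population sizes, and the design effect of 2.0 are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def srs_sample_size(P, d, confidence=0.95):
    """Unrounded simple random sampling size for a proportion (infinite population)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * P * (1 - P) / d ** 2


def adjust_for_population(n0, N):
    """Apply the finite population correction for a population of size N."""
    return n0 / (1 + (n0 - 1) / N)


def inflate_for_design(n, deff):
    """Inflate a simple random sampling size by the design effect and round up."""
    return ceil(n * deff)


n0 = srs_sample_size(P=0.5, d=0.05)

# The required size levels off quickly as the population grows.
for N in (10_000, 100_000, 10_000_000):
    print(N, round(adjust_for_population(n0, N), 1))

# A hypothetical design effect of 2.0 doubles the requirement.
print(inflate_for_design(n0, deff=2.0))
```

With many estimation domains, a calculation like this would apply to each domain separately, which is why the number of domains tends to drive the total sample size more than the population size does.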