S1 IDAC Research Methodology Module II Census Vs Sample Studies
Objective of Sampling • Types of Sampling • Probability sampling • Non-probability sampling • Sampling Theory • Sampling Distribution • Sample size Determination
What is Sampling? Population (N): a group that includes all the cases (individuals, objects, or groups) in which the researcher is interested. Sample (n): a relatively small subset of the population (n < N). Sampling technique: the process of selecting the sample.
Objective of Sampling • Reduces the cost of research (e.g. political polls) • Generalize about a larger population (e.g., benefits of sampling city with respect to neighborhood) • In some cases (e.g. industrial production) analysis may be destructive, so sampling is needed
Sampling: Advantages • Saves time for the researcher and for those being surveyed. • Saves cost for the group or agency commissioning the survey. • Non-interference with the population. A large sample could alter the nature of the population, e.g. opinion surveys. • Does not destroy the population, e.g. crash-testing only a small sample of automobiles. • Cooperation of respondents: individuals, firms, administrative agencies. • Works when only partial data is available, e.g. fossils and historical records, climate change.
Sample Sources • Friends, family, neighbours, acquaintances. • Students in a class or co-workers in a workplace. • Volunteers.
Sample Design A sample design is a definite plan for obtaining a sample from a given population. It refers to the technique or procedure used to select items for the sample. The main sample design considerations are 1. Objective 2. Population 3. Sampling units (e.g. geographical) and frame (list of sampling units) 4. Size of sample 5. Parameters of interest
Sample design considerations –Cont… 6. Data Collection (Relevant data) 7. Non-respondents 8. Selection of Proper sampling Design 9. Organizing field work 10. Pilot Survey 11. Budgetary Constraint
Sampling and Non-sampling errors The errors involved in the collection of data are classified into sampling errors and non-sampling errors.
Sampling errors Sampling errors arise because only a part of the population is used to estimate population parameters and to draw inferences about the population. Sampling error can be measured for a given sample design and size. Increasing the sample size improves precision, but increasing the sample size has its own limitations.
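The link between sample size and precision can be illustrated with the standard error of a sample mean, sigma / sqrt(n). This is a minimal sketch that assumes simple random sampling from a large population; the function name `standard_error` and the value sigma = 10 are hypothetical and chosen only for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean, assuming simple random sampling."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    # Quadrupling the sample size only halves the standard error.
    print(n, standard_error(sigma, n))
```

The diminishing return (four times the sample for half the error) is one of the limitations of simply increasing the sample size.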
Non-sampling errors Non-sampling errors arise at the stage of collection and preparation of data and thus are present in both the sample survey as well as the census survey. Non-sampling errors can be reduced by defining the sampling units, frame and the population correctly and by employing efficient people in the investigations.
Census Vs Sample Studies A sample survey studies only a part of the whole population, so it requires less money and less time. When non-sampling errors are taken into account, the results of a sample survey can be more accurate than those of a census survey. In a census survey sampling errors are absent, but it consumes more time and money. If the population is not very large, a census survey may provide better results than any sample survey.
Sampling from a process • It may be difficult or impossible to obtain or construct a frame. • Large or potentially infinite populations: fish, trees, manufacturing processes. • Continuous processes: production of milk or other liquids, transporting commodities to a warehouse. • A random sample is one in which every element selected: • Is selected independently of any other element. • Follows the same probability distribution as the elements in the population.
Sampling from a process –Cont… • Careful design for sample is very important. • Sample production of milk at random times. • Forest products – randomly select clusters from maps or previous surveys of tree types, size, etc.
Types of Sampling • Non-Probability sampling • Probability Sampling
Non-probability Sampling Non-probability sampling is a sampling procedure that affords no basis for estimating the probability that each item in the population will be included in the sample. Non-probability sampling is also known as 1. Deliberate sampling 2. Purposive sampling 3. Judgment sampling
Non-probability Sampling In this type of sampling, items for the sample are selected deliberately by the researcher. Under non-probability sampling the organizers of the inquiry purposively choose particular units of the universe to constitute a sample, on the basis that the small mass they select out of a huge one will be typical or representative of the whole. Quota sampling is an example of non-probability sampling.
Probability Sampling Probability sampling is also known as ‘random sampling’ or ‘chance sampling’. Under simple random sampling, every item of the universe has an equal chance of inclusion in the sample. The implications of random sampling (simple random sampling) are 1. It gives each element in the population an equal probability of getting into the sample. 2. It gives each possible sample combination an equal probability of being chosen.
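The two properties listed above can be sketched with Python's standard `random.sample`, which draws without replacement so that every subset of the requested size is equally likely. The helper name `simple_random_sample`, the population of 100 numbered units, and the seed are assumptions made for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; each unit and each n-subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


population = list(range(1, 101))  # hypothetical units labelled 1..100
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```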
Complex Random Sampling Designs Some complex random sampling designs are mixtures of probability and non-probability sampling methods: 1. Systematic sampling 2. Stratified sampling 3. Cluster sampling 4. Multi-stage sampling 5. Sampling with probability proportional to size 6. Sequential sampling
Systematic Sampling In systematic sampling only the first unit is selected randomly; the remaining units of the sample are selected at fixed intervals. Systematic sampling is often the most practical way of sampling: select every i-th item on a list.
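A minimal sketch of the procedure described above: a random start within the first interval, then every k-th item. The function name `systematic_sample` and the choice of interval k = N // n are assumptions; other conventions for the interval exist.

```python
import random


def systematic_sample(items, n, seed=None):
    """Pick a random start in the first interval, then every k-th item."""
    k = len(items) // n  # sampling interval (assumed convention: N // n)
    if k < 1:
        raise ValueError("sample size exceeds list length")
    rng = random.Random(seed)
    start = rng.randrange(k)  # only this first position is random
    return [items[start + j * k] for j in range(n)]


frame = list(range(100))  # hypothetical list of 100 units
chosen = systematic_sample(frame, 10, seed=1)
print(chosen)
```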
Stratified Sampling The stratified sampling technique is applied when the population does not constitute a homogeneous group. In this technique the population is divided into sub-populations (strata) that are more homogeneous than the total population. Important considerations in stratified sampling: 1. Forming the strata 2. Selecting items from each stratum 3. Allocating the sample size to each stratum.
Stratified Sampling –Cont…. • Proportionate stratified sample – The size of the sample selected from each subgroup is proportional to the size of that subgroup in the entire population. (Self weighting) • Disproportionate stratified sample – The size of the sample selected from each subgroup is disproportional to the size of that subgroup in the population. (needs weights)
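Stratified Sampling – Cont…. Proportional allocation example: total population N = 8000, sample size n = 30, strata sizes N1 = 4000, N2 = 2400, N3 = 1600. The sample size for stratum i is ni = n × Pi, where Pi = Ni/N. So n1 = 30 × (4000/8000) = 15, n2 = 30 × (2400/8000) = 9, n3 = 30 × (1600/8000) = 6. More generally, when the strata also differ in variability (standard deviation σi), the allocation satisfies n1/(N1 σ1) = n2/(N2 σ2) = … = nk/(Nk σk).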
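The allocation arithmetic above can be checked with a short sketch. The stratum sizes and n = 30 come from the slide; the function names, the stratum labels, and the equal standard deviations passed to the second function are assumptions for illustration only. The second function allocates in proportion to Ni × σi, which is one way to satisfy the general rule that ni/(Ni σi) is the same for every stratum.

```python
def proportional_allocation(strata_sizes, n):
    """n_i = n * N_i / N for each stratum."""
    total = sum(strata_sizes.values())
    return {name: n * size / total for name, size in strata_sizes.items()}


def variability_weighted_allocation(strata_sizes, strata_sigmas, n):
    """n_i proportional to N_i * sigma_i, so n_i / (N_i * sigma_i) is constant."""
    weights = {s: strata_sizes[s] * strata_sigmas[s] for s in strata_sizes}
    total = sum(weights.values())
    return {s: n * w / total for s, w in weights.items()}


sizes = {"stratum1": 4000, "stratum2": 2400, "stratum3": 1600}
prop = proportional_allocation(sizes, 30)
print(prop)

# Hypothetical sigmas: with equal variability the result matches
# proportional allocation.
sigmas = {"stratum1": 1.0, "stratum2": 1.0, "stratum3": 1.0}
weighted = variability_weighted_allocation(sizes, sigmas, 30)
print(weighted)
```

In general the computed ni are not whole numbers and need rounding, which can make the rounded sizes differ slightly from n in total.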
Cluster Sampling If the total area of interest is large, a convenient way to take a sample is to divide the total area into a number of smaller non-overlapping areas and then randomly select some of these smaller areas (clusters). Cluster sampling reduces cost because only the selected clusters are surveyed, but it is generally less precise than simple random sampling.
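As a rough sketch of the idea above, the population is represented as a mapping from cluster name to the units it contains; a few clusters are drawn at random and every unit in a drawn cluster is surveyed. The cluster names, unit labels, and the function name `cluster_sample` are hypothetical.

```python
import random


def cluster_sample(clusters, m, seed=None):
    """Randomly select m whole clusters and keep all of their units."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical non-overlapping areas, each with its own households.
areas = {
    "north": ["n1", "n2", "n3"],
    "south": ["s1", "s2"],
    "east": ["e1", "e2", "e3", "e4"],
    "west": ["w1", "w2"],
}
selected = cluster_sample(areas, 2, seed=7)
print(selected)
```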
Multi-stage sampling Multi-stage sampling is a further development of the principle of cluster sampling. In the first stage the total population is divided into primary sampling units, and a sample of these is selected. Each selected primary sampling unit is then divided again and sampled. With only two stages of division, the design is known as two-stage sampling.
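A two-stage design can be sketched as follows: first draw primary sampling units at random, then draw units at random within each selected one. The names `two_stage_sample`, the district labels, and the use of simple random sampling at both stages are assumptions for illustration; real designs may use different methods at each stage.

```python
import random


def two_stage_sample(psus, n_psus, n_units, seed=None):
    """Stage 1: sample primary units. Stage 2: sample within each of them."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(psus), n_psus)
    return {
        name: rng.sample(psus[name], min(n_units, len(psus[name])))
        for name in stage_one
    }


# Hypothetical primary sampling units (districts) and their households.
districts = {
    "d1": list(range(0, 50)),
    "d2": list(range(50, 80)),
    "d3": list(range(80, 200)),
    "d4": list(range(200, 210)),
}
result = two_stage_sample(districts, n_psus=2, n_units=5, seed=3)
print(result)
```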
Sampling with probability proportional to size When the cluster sampling units do not have the same, or approximately the same, number of elements, it is considered appropriate to use a random selection process in which the probability of each cluster being included in the sample is proportional to the size of the cluster. The numbers selected in this way do not refer to individual elements; they indicate which clusters, and how many elements from each cluster, are to be selected by simple random sampling or by systematic sampling.
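One common way to carry this out is the cumulative-total method with systematic selection, sketched below under stated assumptions: cluster sizes are accumulated, numbers are picked at a fixed interval after a random start, and each picked number is mapped to the cluster whose cumulative range contains it. The function name `pps_systematic` and the cluster sizes are hypothetical.

```python
import random
from bisect import bisect_left
from collections import Counter
from itertools import accumulate


def pps_systematic(cluster_sizes, n_draws, seed=None):
    """Return how many selections fall in each cluster (size-proportional)."""
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes[c] for c in names))
    total = cumulative[-1]
    interval = total // n_draws
    if interval < 1:
        raise ValueError("more draws than elements")
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # random start in 1..interval
    picks = [start + j * interval for j in range(n_draws)]
    # A pick p belongs to the first cluster whose cumulative total is >= p.
    return Counter(names[bisect_left(cumulative, p)] for p in picks)


sizes = {"A": 120, "B": 60, "C": 300, "D": 20}  # hypothetical cluster sizes
hits = pps_systematic(sizes, 5, seed=11)
print(hits)
```

The counts say which clusters enter the sample and how many elements to draw from each; the elements themselves would then be chosen within each cluster by simple random or systematic sampling.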
Sequential Sampling This is a somewhat complex sample design. The ultimate size of the sample under this technique is not fixed in advance, but is determined according to mathematical decision rules on the basis of information yielded as the survey progresses. It is usually adopted for acceptance sampling plans in the context of statistical quality control. When a particular lot is to be accepted or rejected on the basis of a single sample, it is known as single sampling.
Large random sample from any population A sample size n of greater than 100 is generally considered sufficiently large to use
Terms used in sampling • Sampled population: the population from which the sample is drawn. The researcher should define it clearly. • Frame: the list of elements from which the sample is selected. • Parameter: a characteristic of a population, e.g. a total (annual GDP or exports), or the proportion p of the population that votes Liberal in a federal election. Also, µ or σ of a probability distribution are termed parameters. • Statistic: a numerical characteristic of a sample, e.g. monthly unemployment rate, pre-election polls. • The sampling distribution of a statistic is the probability distribution of that statistic.
Descriptive Statistics and Inferential statistics • Descriptive statistics describe a large amount of data with summary charts and tables, but do not attempt to draw conclusions about the population from which the sample was taken. • In inferential statistics, you test a hypothesis and draw conclusions about a population based on your sample. Here you will run into concepts such as ANOVA, t-test, chi-squared test, confidence interval, regression, etc.
Inferential statistics • Inferential statistics generally takes two forms • 1. Estimation statistics • “Estimation statistics” means estimating population values based on your sample data. • 2. Hypothesis testing • Hypothesis testing is another way of drawing conclusions about a population parameter (a “parameter” is simply a number, such as a mean, that describes the full population and not just a sample). With hypothesis testing, one uses a test such as a t-test, chi-square test, or ANOVA.
Sampling Distribution Suppose that we draw all possible samples of size n from a given population and we compute a statistic (e.g., a mean, proportion, standard deviation) for each sample. The probability distribution of this statistic is called a sampling distribution.
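For a very small population the definition above can be carried out literally by enumerating every possible sample. The sketch below assumes sampling without replacement and uses a hypothetical population of four values; `sampling_distribution_of_mean` is an illustrative name, and exact fractions are used so the probabilities are not affected by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def sampling_distribution_of_mean(population, n):
    """Map each possible sample mean to its probability over all n-subsets."""
    samples = list(combinations(population, n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}


population = [2, 4, 6, 8]  # hypothetical population, mean = 5
dist = sampling_distribution_of_mean(population, 2)
for mean_value, prob in dist.items():
    print(mean_value, prob)

# The mean of the sampling distribution equals the population mean.
expected = sum(m * p for m, p in dist.items())
print(expected)
```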
Sampling Distribution Sampling distribution of the mean – A theoretical probability distribution of sample means that would be obtained by drawing from the population all possible samples of the same size.
Central Limit Theorem • No matter how the variable we are measuring is distributed in the population, the distribution of the sample mean across all possible samples approximates a normal distribution, as long as the number of cases in each sample is about 30 or larger. • If we repeatedly drew samples from a population and calculated the mean of a variable or a percentage, those sample means or percentages would be approximately normally distributed.
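A small simulation can make this concrete. The sketch below assumes a strongly skewed population (exponential with mean 1 and standard deviation 1), draws many samples of size 30, and summarises the resulting sample means; the function name, the seed, and the number of replicates are arbitrary choices for illustration. If the theorem's approximation holds, the means should centre near 1 with a spread near 1 / sqrt(30).

```python
import random
import statistics


def simulate_sample_means(n, replicates, seed=0):
    """Means of repeated samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(replicates)
    ]


means = simulate_sample_means(n=30, replicates=5000)
center = statistics.mean(means)
spread = statistics.stdev(means)
print(center, spread, 30 ** -0.5)
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the underlying population is far from normal.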