TAIBAH UNIVERSITY, Faculty of Science, Department of Mathematics • Introduction to Statistics, STAT 101, First Semester 1435/1436 • Teacher :
Lesson 2-2 Source of Data and sampling methods
Data Sources
• There are two sources of data:
• 1. Primary data: data that you collect firsthand, directly from the original source. Primary data sources include information collected and processed directly by the researcher, such as observations, surveys, interviews, and focus groups.
• There are a variety of techniques for gathering primary data. Some of the most common are:
• Interviews
• Questionnaires
• Observations
• Focus Groups
• Ethnographies, Oral History, and Case Studies
• Documents and Records
• 2. Secondary data: data that is retrieved from pre-existing sources, such as research articles and Internet or library searches. Pre-existing data may also come from examining existing records and data within the program, such as publications and training materials, financial records, student or client data, and performance reviews of staff.
Methods of Data Collection
• If the information you need is not already available from a previous study, you might acquire it by:
• A census
• A sample survey
• Experimentation
• An observational study
So there are four main methods of data collection.
• 1. Census. A census is a study that obtains data from every member of a population. In most studies, conducting a census is time-consuming, costly, impractical, or even impossible.
• 2. Sample survey. A sample survey is a study that obtains data from a subset of a population in order to estimate population attributes.
• 3. Experiment. An experiment is a controlled study in which the researcher attempts to understand cause-and-effect relationships. The study is "controlled" in the sense that the researcher controls (1) how subjects are assigned to groups and (2) which treatments each group receives.
• In the analysis phase, the researcher compares group scores on some dependent variable. Based on the analysis, the researcher draws a conclusion about whether the treatment (the independent variable) had a causal effect on the dependent variable.
• 4. Observational study. Like experiments, observational studies attempt to understand cause-and-effect relationships. Unlike experiments, however, the researcher is not able to control (1) how subjects are assigned to groups and/or (2) which treatments each group receives.
Each method of data collection has advantages and disadvantages. For example, when the population is large, a sample survey has a big resource advantage over a census. A well-designed sample survey can provide very precise estimates of population parameters more quickly, more cheaply, and with less manpower than a census.
Sampling Concepts
Sampling method refers to the process by which members of a population are selected for a sample.
Examples: choosing every fifth voter who leaves a polling place to interview, drawing playing cards randomly from a deck, polling every tenth visitor who views a certain Web site today.
As a group, sampling methods fall into one of two categories. • Non-probability samples. • Probability samples. • 1. Non-Probability Sampling Methods • With non-probability sampling methods, we do not know the probability that each population element will be chosen, and/or we cannot be sure that each population element has a nonzero chance of being chosen. • Two of the main types of non-probability sampling methods are voluntary samples and convenience samples.
(a) Voluntary sample.
• A voluntary sample is made up of people who self-select into the survey. Often, these people have a strong interest in the main topic of the survey.
• Suppose, for example, that a news show asks viewers to participate in an online poll. This would be a voluntary sample. The sample is chosen by the viewers, not by the survey administrator.
(b) Convenience sample.
• A convenience sample is made up of people who are easy to reach.
• Non-probability sampling methods offer two potential advantages: convenience and cost. The main disadvantage is that they do not allow you to estimate the extent to which sample statistics are likely to differ from population parameters. Only probability sampling methods permit that kind of analysis.
2. Probability Sampling Methods
With probability sampling methods, each population element has a known (non-zero) chance of being chosen for the sample. In probability sampling, a random device, such as tossing a coin, consulting a table of random numbers, or employing a random-number generator, is used to decide which members of the population will constitute the sample, instead of leaving such decisions to human judgment.
The main types of probability sampling methods are simple random sampling, systematic random sampling, stratified sampling, cluster sampling, and multistage sampling.
The key benefit of probability sampling methods is that they remove human judgment from the selection and make it possible to estimate how far sample statistics are likely to differ from population parameters. They do not guarantee that any particular sample is representative of the population, but they make a representative sample likely and give statistical conclusions a sound basis.
Simple random sampling
Simple random sampling: a sampling procedure for which each possible sample of a given size is equally likely to be the one obtained.
Simple random sample: a sample obtained by simple random sampling.
Simple random sampling refers to any sampling method that has the following properties.
1. The population consists of N objects.
2. The sample consists of n objects.
3. All possible samples of n objects are equally likely to occur.
There are many ways to obtain a simple random sample. The most important are the following.
Lottery method
Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blindfolded researcher selects n numbers. Population members having the selected numbers are included in the sample.
Random-Number Tables
Obtaining a simple random sample by picking slips of paper out of a box (the lottery method) is usually impractical, especially when the population is large. Fortunately, several practical procedures are available for getting simple random samples. One common method uses a table of random numbers, that is, a table of randomly chosen digits.
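The lottery method can be mimicked in software. The sketch below is only an illustration, not part of the lesson: the function name `lottery_sample` and the population of 20 students are hypothetical, and Python's standard `random` module stands in for the bowl of numbered slips.

```python
import random

def lottery_sample(population, n, seed=None):
    """Draw a simple random sample of n members, without replacement."""
    rng = random.Random(seed)
    # Assign each of the N members a unique number 1..N (the slips in the bowl).
    numbered = {number: member for number, member in enumerate(population, start=1)}
    # Draw n distinct numbers; every set of n numbers is equally likely.
    drawn = rng.sample(range(1, len(numbered) + 1), n)
    # Members holding the drawn numbers form the sample.
    return [numbered[number] for number in drawn]

# Hypothetical population of N = 20 students.
students = [f"student_{i}" for i in range(1, 21)]
print(lottery_sample(students, 5, seed=1))
```

Passing a `seed` only makes the draw repeatable for checking; leaving it out gives a fresh draw each time.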
Random-Number Generators
Nowadays, statisticians prefer statistical software packages or graphing calculators, rather than random-number tables, to obtain simple random samples. The built-in programs for doing so are called random-number generators. When using random-number generators, be aware of whether they provide samples with replacement or samples without replacement.
Systematic Random Sampling
One method that takes less effort to implement than simple random sampling is systematic random sampling. The following procedure presents a step-by-step method for implementing systematic random sampling.
Step 1. Divide the population size by the sample size and round the result down to the nearest whole number, m.
Step 2. Use a random-number table or a similar device to obtain a number, k, between 1 and m.
Step 3. Select for the sample those members of the population that are numbered k, k + m, k + 2m, . . . .
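The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `systematic_sample` is hypothetical, population members are taken to be numbered 1 to N in list order, and selection stops once n members have been chosen (the steps leave the stopping point implicit).

```python
import random

def systematic_sample(population, n, seed=None):
    """Systematic random sample of n members from a list numbered 1..N."""
    N = len(population)
    if not 1 <= n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    rng = random.Random(seed)
    m = N // n                  # Step 1: N / n rounded down
    k = rng.randint(1, m)       # Step 2: random start between 1 and m (inclusive)
    # Step 3: members numbered k, k + m, k + 2m, ...
    # Assumption: stop after n members so the sample has the intended size.
    positions = [k + i * m for i in range(n)]
    return [population[p - 1] for p in positions]   # numbering starts at 1

# Hypothetical population numbered 1..100, sample of 10 (so m = 10).
population = list(range(1, 101))
print(systematic_sample(population, 10, seed=3))
```

Because k is at most m, the last selected position k + (n - 1)m never exceeds N, so the indexing is always valid.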
Stratified sampling
In stratified sampling the population is first divided into subpopulations, called strata, and then sampling is done from each stratum. Ideally, the members of each stratum should be homogeneous relative to the characteristic under consideration.
In stratified sampling, the strata are often sampled in proportion to their size, which is called proportional allocation. The following procedure presents a step-by-step method for implementing stratified (random) sampling with proportional allocation.
Step 1. Divide the population into subpopulations (strata).
Step 2. From each stratum, obtain a simple random sample of size proportional to the size of the stratum; that is, the sample size for a stratum equals the total sample size times the stratum size divided by the population size.
Step 3. Use all the members obtained in Step 2 as the sample.
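The procedure above can be sketched in Python. The function name `stratified_sample`, the four regional strata, and their sizes are hypothetical and chosen so the allocation comes out in whole numbers; when it does not, the per-stratum sizes must be rounded (as done here with `round`), and the rounded sizes may not add up exactly to the total sample size.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Stratified random sample with proportional allocation.

    strata maps a stratum name to the list of its members (Step 1 is assumed done).
    """
    rng = random.Random(seed)
    N = sum(len(members) for members in strata.values())
    sample = []
    for name, members in strata.items():
        # Step 2: stratum sample size = total sample size * stratum size / population size.
        size = round(n * len(members) / N)
        sample.extend(rng.sample(members, size))   # simple random sample within the stratum
    # Step 3: all members drawn in Step 2 form the sample.
    return sample

# Hypothetical strata of sizes 50, 20, 20, and 10 (N = 100).
strata = {
    "north": [f"north_{i}" for i in range(50)],
    "east": [f"east_{i}" for i in range(20)],
    "south": [f"south_{i}" for i in range(20)],
    "west": [f"west_{i}" for i in range(10)],
}
sample = stratified_sample(strata, 20, seed=7)
print(len(sample))
```

With a total sample size of 20, proportional allocation takes 10, 4, 4, and 2 members from the four strata.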
As an example, suppose we conduct a national survey. We might divide the population into groups, or strata, based on geography: north, east, south, and west. Then, within each stratum, we might randomly select survey respondents.
Cluster sampling
Another sampling method is cluster sampling, which is particularly useful when the members of the population are widely scattered geographically. With cluster sampling, every member of the population is assigned to one, and only one, group. Each group is called a cluster. A sample of clusters is chosen using a probability method (often simple random sampling).
The following procedure provides a step-by-step method for implementing cluster sampling.
Step 1. Divide the population into groups (clusters).
Step 2. Obtain a simple random sample of the clusters.
Step 3. Use all the members of the clusters obtained in Step 2 as the sample.
Note the difference between cluster sampling and stratified sampling. With stratified sampling, the sample includes elements from each stratum. With cluster sampling, in contrast, the sample includes elements only from sampled clusters.
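The cluster sampling steps above can be sketched in Python. The function name `cluster_sample` and the five city blocks used as clusters are hypothetical; the point is only that whole clusters are drawn at random and every member of a drawn cluster enters the sample.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Cluster sample: draw whole clusters, then keep all of their members.

    clusters maps a cluster name to the list of its members (Step 1 is assumed done).
    """
    rng = random.Random(seed)
    # Step 2: simple random sample of the cluster names.
    chosen = rng.sample(sorted(clusters), num_clusters)
    # Step 3: every member of each chosen cluster is in the sample.
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical population: 5 city blocks with 4 households each.
clusters = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(1, 5)]
    for b in range(1, 6)
}
chosen, sample = cluster_sample(clusters, 2, seed=11)
print(chosen, len(sample))
```

Unlike the stratified case, members of the unchosen blocks have no representation in the sample at all.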
Multistage sampling
Most large-scale surveys combine two or more of simple random sampling, systematic random sampling, cluster sampling, and stratified sampling. With multistage sampling, we select a sample by using combinations of different sampling methods. For example, in Stage 1, we might use cluster sampling to choose clusters from a population. Then, in Stage 2, we might use simple random sampling to select a subset of elements from each chosen cluster for the final sample. Multistage sampling is used frequently by pollsters and government agencies.
Frame
The frame is the list of all items in the population from which samples will be selected.
Examples: voter registration lists, municipal real estate records, customer or human resource databases, directories.
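In software, a frame can be held as a plain list, and the two-stage example described above (clusters first, then a simple random sample within each chosen cluster) can be run against it. The sketch below is illustrative only: the function name `two_stage_sample`, the frame of six cities with ten residents each, and the choice of equal sample sizes per cluster are all assumptions.

```python
import random
from collections import defaultdict

def two_stage_sample(frame, num_clusters, per_cluster, seed=None):
    """Two-stage sample from a frame of (cluster_name, member) pairs."""
    rng = random.Random(seed)
    # Group the frame by cluster.
    clusters = defaultdict(list)
    for cluster_name, member in frame:
        clusters[cluster_name].append(member)
    # Stage 1: simple random sample of clusters.
    chosen = rng.sample(sorted(clusters), num_clusters)
    # Stage 2: simple random sample of members within each chosen cluster.
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], per_cluster))
    return sample

# Hypothetical frame: 6 cities, 10 residents each.
frame = [
    (f"city_{c}", f"city_{c}_resident_{r}")
    for c in range(1, 7)
    for r in range(1, 11)
]
print(two_stage_sample(frame, num_clusters=2, per_cluster=3, seed=0))
```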
Sampling Selection Methods
Sampling can be done with or without replacement.
Sampling With Replacement: a sampling method in which each selected item is returned to the frame from which it was selected, so that it has the same probability of being selected again.
Example: selecting entries from a fishbowl and returning each entry to the fishbowl after it is drawn.
Sampling Without Replacement: a sampling method in which each selected item is not returned to the frame from which it was selected. Using this technique, an item can be selected no more than one time.
Examples: selecting numbers in state lottery games, selecting cards from a deck during games of chance such as Blackjack.
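The difference between the two selection methods shows up directly in Python's standard `random` module: `choices` draws with replacement and `sample` draws without replacement. The frame of ten numbered items below is hypothetical, and the fixed seed is there only to make the run repeatable.

```python
import random

rng = random.Random(42)          # fixed seed, for repeatability only
frame = list(range(1, 11))       # hypothetical frame of 10 numbered items

# With replacement: each drawn item goes back into the frame,
# so the same item may appear more than once.
with_replacement = rng.choices(frame, k=5)

# Without replacement: a drawn item is not returned,
# so every item appears at most once.
without_replacement = rng.sample(frame, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

This is the same distinction to check for when using any random-number generator: a simple random sample of distinct population members calls for drawing without replacement.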