Our lesson Survey and Random Samples Confidential
Warm Up • 1. Describe what information you can get about a data set by looking at a box-and-whisker plot. • Find the median, the LQ and the UQ for the given data: • 2. 593, 588, 540, 434, 420, 398, 390, 375
Warm Up 3. 40, 45, 50, 60, 65, 70, 75, 80, 85 From the data of Q3: 4. Find the interquartile range. 5. What are the limits on outliers? Are there any outliers?
Let's recap what we learned in the last lesson • A box-and-whisker plot is used to display a set of data. • To create this plot, we first find the median, the lower quartile and the upper quartile. • Plot the given data set on a number line. • Mark the highest and lowest data points with connected black circles, draw a box between the quartiles, and draw a line through the median.
Review • To find the median, first write all the numbers in ascending order. • If the number of data points is odd, the middle number is the median. • If the number of data points is even, the median is the average of the two middle numbers. • The Lower Quartile (LQ) of a data set is the median of the lower half of the ordered data. • The Upper Quartile (UQ) of a data set is the median of the upper half of the ordered data.
Review • The Interquartile Range (IQR) is the difference between the Upper Quartile and the Lower Quartile. • Fences are the limits within which we accept values as correct; any data points outside these fences are treated as ‘outliers’ and discarded. • A circle graph is an efficient way to present certain types of data. • The graph shows data as percents or fractions of a whole. • The total should be 100% or 1. • This graph is used to show the parts of a whole.
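The review rules above can be sketched in a few lines of Python. This is only an illustration: the function name quartile_summary and the data set are made up, the middle value is left out of both halves when the count is odd, and the fences are assumed to sit 1.5 × IQR beyond the quartiles (a common convention that the slides do not state).

```python
from statistics import median

def quartile_summary(values):
    """Median, LQ, UQ, IQR and fences using the median-of-halves rule."""
    data = sorted(values)              # ascending order first
    n = len(data)
    half = n // 2
    lower = data[:half]                # lower half (middle value excluded if n is odd)
    upper = data[half + n % 2:]        # upper half
    lq, uq = median(lower), median(upper)
    iqr = uq - lq
    # Assumption: fences are placed 1.5 * IQR beyond the quartiles.
    return {
        "median": median(data),
        "LQ": lq,
        "UQ": uq,
        "IQR": iqr,
        "lower_fence": lq - 1.5 * iqr,
        "upper_fence": uq + 1.5 * iqr,
    }

# Hypothetical data set, not taken from the warm-up questions.
summary = quartile_summary([15, 1, 9, 5, 13, 3, 11, 7])
print(summary)
```

Any value below lower_fence or above upper_fence would be flagged as an outlier under this assumption.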
Basic Concepts in Statistics: Let's get started • In a statistical study, the objects whose characteristics are studied are called Individuals or Units. • In any statistical study, the collection of all individuals under consideration is called the Population or Universe. Ex: In a study of the financial condition of families of a particular tribe, the set of all families belonging to the tribe is the population. The families are the individuals. • A Parameter is a summary description of a particular aspect of the entire population. Example: The mean age of citizens in the country.
Basic Concepts in Statistics • The study of characteristics of individuals of a population by using statistical devices and techniques is called Statistical Investigation or Statistical Enquiry. The person who conducts the statistical investigation is called the Investigator. • For a statistical study, the investigator collects information about the individuals of the population. The persons who supply information are Informants. • The investigator may collect the information directly from the informants or through agents. The agents who collect the information and hand it over to the investigator are Enumerators.
Statistical Survey • A Statistical Survey is the process of collecting information from the informants. • For a statistical investigation of a population, an investigator may collect information from each and every individual belonging to the population, or only from selected representative individuals. The group of representative individuals from whom information is collected is called a Sample. Thus, a Sample is a representative portion of the population. • The number of individuals in a sample is called the Sample Size.
The purpose of dealing with a sample is that it enables us to study a large population and draw important inferences about it without having to collect data from every member of the population. • Example: In a study of the health of all infants born in the UK in the 1990s, all babies born on 10th October in any of those years form a sample.
Sample Survey and Census Enumeration • A statistical survey in which a sample is used is called a Sample Survey. • A survey in which the whole population is used is called Census Enumeration. • The census method is costly and consumes much time and labor as it includes all the individuals. But the results arrived at would be accurate and free of sampling errors.
A sample survey is scientific in nature. A well-planned sample survey can give results as valid as those of the census method. It is cheaper than the census method and consumes less time and labor. • The census method is preferred when the population is very small, and a sample survey is preferred when the population is very large. • Ex: The 10-yearly population census of China is a census enumeration. A study of the average height of girls in grade 8 in different schools of the city is a sample survey.
Sampling One important point in working with samples is the selection of a truly representative sample. The selection of individual items for observation so that they accurately represent the larger population is called Sampling. Since the validity of the results of an investigation depends mainly on the selection of the sample, the sample should be obtained with the utmost care. Sampling frame: It is the list of units comprising a population from which a sample is to be selected. If the sample is to be representative of the population, the sampling frame should include all members of the population.
Example of a sampling frame: a telephone book. Statistic: It is a summary description of a particular aspect of a sample. Example: Mean age of the people in a sample. Sampling Error: Whatever sampling method is adopted, the sample selected is likely to differ slightly in structure from the population. This difference leads to an error in the estimate made for the population. This error is called sampling error.
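To make the difference between a parameter, a statistic and the sampling error concrete, here is a minimal sketch. The population of ages is invented for illustration, and the seed is fixed only so that the run is repeatable.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: the ages of 1,000 people.
population_ages = [random.randint(1, 90) for _ in range(1000)]
parameter = mean(population_ages)       # mean age of the whole population

# A simple random sample of 50 people from that population.
sample_ages = random.sample(population_ages, 50)
statistic = mean(sample_ages)           # mean age of the sample

# The sampling error is how far the statistic falls from the parameter.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f}  statistic={statistic:.2f}  error={sampling_error:.2f}")
```

Drawing a different sample generally gives a different statistic, so the error changes from sample to sample.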
Random Sampling Sampling that selects a sample from a population on the principle of randomization, or chance, is called Probability Sampling or Random Sampling. Sampling that focuses on volunteers, easily available units, or those that just happen to be present when the research is done is called Non-Probability Sampling or Non-Random Sampling. Non-probability samples are useful for quick and cheap studies, for case studies, for qualitative research, for pilot studies, and for developing hypotheses for future research.
There are several different ways in which a probability sample can be selected. The most common probability sampling methods are: 1. Simple Random Sampling 2. Systematic Sampling 3. Stratified Sampling 4. Cluster Sampling
Simple Random Sampling In simple random sampling, each member of a population has an equal chance of being included in the sample. Also, each combination of members of the population has an equal chance of composing the sample. Those two properties are what define simple random sampling. To select a simple random sample, you need to list all of the units in the survey population.
Generally, a simple random sample is obtained either by the Lottery method or by the use of the Table of Random Numbers. Lottery Method: Consider a population of 1000 units. Let a sample of size 100 be required. First, let us assign the 1000 units numbers from 1 to 1000 and write each number on one of 1000 identical chits. Let us put these chits in a box. Then, let us shake the box and, without looking at the numbers, draw 100 chits from the box. The 100 units with these picked numbers form the sample. Example: A lottery draw is a good example of simple random sampling.
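The lottery method can be imitated on a computer. The sketch below is one possible way to do it, using Python's standard random.sample, which draws without replacement just like picking chits from a box; the function name lottery_sample is made up for this illustration.

```python
import random

def lottery_sample(population_size, sample_size):
    """Draw unit numbers without replacement, like chits from a box."""
    chits = range(1, population_size + 1)    # units numbered 1..N
    return random.sample(chits, sample_size)

# Population of 1000 units, sample of size 100, as in the lottery example.
sample = lottery_sample(1000, 100)
print(sorted(sample)[:10])   # a peek at the ten smallest numbers drawn
```

Because the draw is random, each run produces a different sample, but always 100 distinct unit numbers between 1 and 1000.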
Using table of random numbers for Simple Random Sampling A Table of Random Numbers is a tabular arrangement of randomly selected digits. The digit in each position in the table was originally chosen from the digits 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 by a random process in which each digit is equally likely to be chosen. To select a sample by this method, first of all, the units are numbered. Then the required number of random numbers is read from the table in an orderly way (unwanted numbers in the selection may be dropped). The units with these numbers are selected to form the sample.
Using table of random numbers for Simple Random Sampling - Examples First, let us assume a random number table with only 10 five-digit numbers, as shown below.
Table 1: Random Digits
12429 63527 74608 01549 00793 28354 61218 95782 63940 58128
Table 2: Frequency of Occurrence of Each Digit in Table 1
Digit:     1 2 3 4 5 6 7 8 9 0
Frequency: 5 7 4 5 5 4 4 6 5 5
Let us now learn to use the tables to solve an example.
Example 1: Obtain a random sample of 4 out of 8 units using the random digits in Table 1. Let us simply read random digits, ignoring those that are out of range or recur, until we get four of them. Going from left to right across the top row of Table 1 we get: 1 2 4 [2] [9] 6 3 5 (Numbers within square brackets are either repeats of previously appearing numbers or out of range.) Taking the first four usable numbers we get the random sample: 1, 2, 4, 6. Probability of being selected = (sample size, n ÷ total population, N) × 100% = (4 ÷ 8) × 100% = 50%
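The reading rule in Example 1 can be written as a short routine. This is a sketch only: the function name sample_from_digits is invented, and because it reads one digit at a time it only suits populations of at most 9 units.

```python
# Table 1 from the slides, read left to right.
TABLE_1 = "12429 63527 74608 01549 00793 28354 61218 95782 63940 58128"

def sample_from_digits(digits, population_size, sample_size):
    """Read digits in order, skipping repeats and out-of-range values."""
    chosen = []
    for ch in digits:
        if not ch.isdigit():
            continue                      # skip the spaces between groups
        unit = int(ch)
        if 1 <= unit <= population_size and unit not in chosen:
            chosen.append(unit)
        if len(chosen) == sample_size:
            break                         # stop once the sample is full
    return chosen

sample = sample_from_digits(TABLE_1, population_size=8, sample_size=4)
print(sample)   # [1, 2, 4, 6]
```

The second 2 is skipped as a repeat and the 9 is skipped as out of range, which matches the bracketed digits in the example.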
Using table of random numbers for Simple Random Sampling - Examples Example 2: To draw a simple random sample from a telephone book, each entry would need to be numbered sequentially. If there were 10,000 entries in the telephone book and the sample size were 2,000, then 2,000 different numbers between 1 and 10,000 would need to be randomly generated by a computer. Each number has the same chance of being generated by the computer. The 2,000 telephone entries corresponding to the 2,000 computer-generated random numbers would make up the sample.
Systematic Sampling • Systematic sampling means that there is a gap, or interval, between each selected unit in the sample. Therefore, it is sometimes also called interval sampling. • Example: Selection of a systematic sample of size 100 from a population having 400 units. • In order to select a systematic sample, you need to follow these steps. • Number the units on your frame from 1 to N (where N is the total population size). Here N = 400. • Determine the sampling interval (K) by dividing the number of units in the population by the desired sample size (n). Here n = 100. • Sampling interval, K = N ÷ n = 400 ÷ 100 = 4. • Since K = 4, you will need to select one unit out of every four units to end up with a total of 100 units in your sample.
Systematic Sampling • Select a number between 1 and K at random. This number is called the random start (a) and is the first number in the sample. • Let a = 3. Then the systematic sample is formed by selecting the units numbered a, a+K, a+2K, ..., a+99K, i.e. the units with the numbers 3, 7, 11, 15, ..., 399. • In the same way, only four possible samples can be selected, corresponding to the four possible random starts. They are as follows. • 1, 5, 9, 13, ..., 393, 397 • 2, 6, 10, 14, ..., 394, 398 • 3, 7, 11, 15, ..., 395, 399 • 4, 8, 12, 16, ..., 396, 400
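The steps for systematic sampling can be sketched as a small function. The name systematic_sample is made up for this illustration, and the sketch assumes that the population size divides exactly by the sample size, as it does in the 400 ÷ 100 example.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Select every K-th unit, beginning at a random start between 1 and K."""
    interval = population_size // sample_size     # K = N ÷ n (assumes exact division)
    if start is None:
        start = random.randint(1, interval)       # random start a, 1 <= a <= K
    # Units a, a+K, a+2K, ..., a+(n-1)K
    return list(range(start, start + interval * sample_size, interval))

sample = systematic_sample(400, 100, start=3)
print(sample[:4], "...", sample[-1])   # [3, 7, 11, 15] ... 399
```

Leaving out the start argument lets the function pick the random start itself, so it returns one of the K possible samples.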
Systematic Sampling - example Example: Obtain a systematic sample of 500 students for a survey on student housing at a college that has an enrolment of 10,000 students. First determine the sampling interval (K). Sampling interval, K = total population ÷ sample size, so K = 10,000 ÷ 500 = 20.
To begin systematic sampling: 1. Assign sequential numbers to all the students. 2. Choose a starting point by selecting a random number between 1 and 20; for example, let the random start be 9. Then the 9th student on the list is the first member of the sample, followed by every 20th student thereafter. 3. One of the systematic samples of students would be those corresponding to student numbers 9, 29, 49, 69, ..., 9,929, 9,949, 9,969 and 9,989.
Stratified Sampling • In this method of sampling, the population is split into homogeneous groups called Strata. Then, from each stratum, an appropriate number of units is randomly selected to form the sample. • The sampling method can vary from one stratum to another. When simple random sampling is used to select the sample within each stratum, the sample design is called stratified simple random sampling. • This method of sampling is adopted when the population can be split into groups of units which are homogeneous with regard to some characteristics.
The most important merit of this method is that the sample has representation from all the strata, and hence all the categories are represented. Example: For obtaining a sample from the population of students in a college, the groups of students studying in the various classes may be treated as strata, and from each stratum (class) some students may be randomly selected to form the sample.
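The college example can be sketched as follows. The class names, class sizes and the function name stratified_sample are all hypothetical, and taking the same fraction from every stratum (proportional allocation) is an assumption; the slides only say that an appropriate number of units is taken from each stratum.

```python
import random

def stratified_sample(strata, fraction):
    """Take a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, units in strata.items():
        k = max(1, round(len(units) * fraction))   # at least one unit per stratum
        sample[name] = random.sample(units, k)
    return sample

# Hypothetical college: three classes (strata) of different sizes.
strata = {
    "Class A": [f"A{i}" for i in range(1, 41)],   # 40 students
    "Class B": [f"B{i}" for i in range(1, 31)],   # 30 students
    "Class C": [f"C{i}" for i in range(1, 21)],   # 20 students
}

sample = stratified_sample(strata, fraction=0.10)
for name, students in sample.items():
    print(name, students)
```

Every class contributes students to the sample, which is the merit described above.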
Cluster Sampling • Cluster sampling divides the population into groups or clusters. A number of clusters are selected randomly to represent the total population, and then all units within the selected clusters are included in the sample. No units from non-selected clusters are included in the sample. • This method is adopted when it is too expensive to spread a sample across the population as a whole. Travel costs can become high if interviewers have to survey people from one end of the country to the other. To reduce costs, statisticians may choose a cluster sampling technique. Another reason is that sometimes a list of all units in the population is not available, while a list of all clusters is either available or easy to create. Example: For a study of the standard of living of the bank employees in a city, the bank offices in the city may be treated as clusters of employees. So, some offices are selected and all the employees in those selected offices are included in the sample.
Drawbacks of Cluster Sampling 1. Loss of efficiency when compared with simple random sampling. Surveying a large number of small clusters is better than surveying a small number of large clusters. This is because neighbouring units tend to be more alike, resulting in a sample that does not represent the whole spectrum of opinions or situations present in the overall population. 2. One does not have total control over the final sample size. In the bank example, not all the bank offices in the city have the same number of employees, yet one must interview all the employees in each selected office. So the final sample size may be smaller or larger than expected.
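The bank example, and the second drawback in particular, can be illustrated with a short sketch. The office names, staff counts and the function name cluster_sample are invented for this illustration.

```python
import random

def cluster_sample(clusters, number_of_clusters):
    """Pick whole clusters at random and include every unit inside them."""
    chosen = random.sample(sorted(clusters), number_of_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical bank offices (clusters) with different numbers of employees.
offices = {
    "Office A": [f"A{i}" for i in range(1, 13)],   # 12 employees
    "Office B": [f"B{i}" for i in range(1, 6)],    # 5 employees
    "Office C": [f"C{i}" for i in range(1, 21)],   # 20 employees
    "Office D": [f"D{i}" for i in range(1, 9)],    # 8 employees
}

chosen, employees = cluster_sample(offices, number_of_clusters=2)
print(chosen, len(employees))
```

Depending on which two offices are drawn, the sample may hold anywhere from 13 to 32 employees, so the final sample size is not under the investigator's control.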
Your Turn Explain the following and give examples. 1. Population 2. Sample 3. Sample Size 4. Parameter
Your Turn 5. Distinguish between Census Enumeration and Sample Survey. 6. What do you mean by Random Sampling? What are its relative merits? 7. Do any of the following use simple random sampling? Provide a brief explanation of how each example uses the sampling method. a) Census b) Bingo Game
Your Turn 8. Imagine that a local clothing manufacturer has 2,700 employees. The personnel manager decides to ask the employees for suggestions on how to improve their workplace. It would take too long to survey everyone, so the manager chooses to systematically sample 300 of the employees. a) What would be the sampling interval? b) If the number 6 were your first randomly drawn number, what would be the first 8 numbers of your sample?
Your Turn 9. Explain Systematic Sampling with an example. 10. What are the disadvantages of non-probability samples?
Refreshment time
1. New Horizon Academy has been given a sizeable grant: enough to build either a new playground or a swimming pool. But, as there is only enough money to build one facility, the principal wants to ask her students which one they feel the school needs more. The table below indicates the number of students by sex, per grade from Kindergarten to Grade 7.
a) What is the total population of the academy? b) The principal wants to sample 50% of the students. How many students would this be?
2. Which sampling method can be adopted in the following case and what are its benefits? Suppose a farmer wishes to work out the average milk yield of each cow type in his herd which consists of Ayrshire, Friesian, Galloway and Jersey cows.
3. When is cluster sampling preferred?
Let’s summarize what we have learnt today 1) Sampling is the art of learning about a very large group of people by getting information from a small set of people. 2) Population is the entire set of individuals, events or units with specified characteristics. 3) Parameter is a summary description of a particular aspect of the entire population. 4) Sample is the subset of the population from which data is collected and used as a basis for making statements about the entire population.
Let’s summarize what we have learnt today 5) Statistic is a summary description of a particular aspect of a sample. Statistics are used to describe samples and to estimate population parameters. 6) Census is a survey that includes the entire population; it is very expensive and time-consuming. 7) Sampling frame is a list of units comprising a population from which a sample is to be selected.
8) In Probability or Random sampling, every unit of the population of interest must be identified, and all units must have a known, non-zero chance of being selected into the sample. 9) Non-probability samples focus on volunteers, easily available units, or those that just happen to be present when the research is done. 10) In Simple Random Sampling, each individual is chosen entirely by chance and each member of the population has an equal chance of being included in the sample.
Let’s summarize what we have learnt today 11) In Systematic Sampling there is a gap, or interval, between each selected unit in the sample. 12) A Stratified Sample is obtained by taking samples from each stratum or sub-group of a population. 13) Cluster sampling is a sampling technique where the entire population is divided into groups, or clusters, and a random sample of these clusters is selected.
You did a great job! Always aim high.
3. When is cluster sampling preferred? Answer: Cluster sampling is typically used when the researcher cannot get a complete list of the members of a population they wish to study but can get a complete list of groups or 'clusters' of the population. It is also used when a random sample would produce a list of subjects so widely scattered that surveying them would prove to be far too expensive.