Chapter 2 – Data Collection and Presentation
In Chapter 1, we briefly discussed the importance of samples. When we select a sample from a population, the sample must be representative of the population. Let’s consider an example:
Sampling Designs
Sampling designs are methods by which a representative sample can be chosen from a population. Four sampling designs are in common use:
1. Simple random sampling
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling
Simple Random Sampling
Putting all the students’ names together, mixing them thoroughly, and then drawing the names one at a time is an example of simple random sampling.
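The name-drawing example can be sketched in a few lines of Python. This is only an illustration: the student names, the function name `simple_random_sample`, and the `seed` parameter are assumptions made for the example, not part of the chapter.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct units; every unit is equally likely to be drawn."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # random.sample draws without replacement, like pulling names one at a time.
    return rng.sample(list(population), sample_size)


# Hypothetical class list used only for illustration.
students = ["Ana", "Ben", "Chen", "Dee", "Eli", "Fay", "Gus", "Hal"]
sample = simple_random_sample(students, 3, seed=1)
print(sample)
```

Drawing without replacement mirrors the physical process: once a name has been drawn, it cannot be drawn again.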
Systematic Sampling
In this sampling design, every kth unit (or item) is selected from a population until the sample size is reached, where
k = (size of population) / (size of sample)
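A minimal sketch of systematic sampling follows. Two details are assumptions not stated in the chapter: k is rounded down when the population size is not a multiple of the sample size, and the first unit is chosen at random among the first k units.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Select every kth unit, where k = population size // sample size."""
    units = list(population)
    if not 0 < sample_size <= len(units):
        raise ValueError("sample_size must be between 1 and the population size")
    k = len(units) // sample_size             # sampling interval (assumed: rounded down)
    start = random.Random(seed).randrange(k)  # assumed: random start within the first k units
    # Take every kth unit from the start, stopping once the sample size is reached.
    return units[start::k][:sample_size]


# Hypothetical population of 100 numbered units, sample of 10, so k = 10.
selected = systematic_sample(range(1, 101), 10, seed=0)
print(selected)
```

With 100 units and a sample of 10, consecutive selected units are always exactly 10 apart.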
Stratified Sampling
In this sampling design, the entire population is divided into several groups, called strata, and a subsample is selected from each group. All subsamples are then combined to form the sample. This sampling design is used when a population is not homogeneous.
Stratified sampling can be either proportionate or disproportionate, depending on the number of units selected from each group. It is proportionate when the number of units drawn from each stratum is proportional to the size of that stratum.
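The proportionate case can be sketched as follows. The strata names and sizes are hypothetical, and rounding each stratum's share to the nearest whole unit is an assumption of this sketch; with awkward proportions the rounded subsamples may not add up exactly to the requested sample size.

```python
import random


def proportionate_stratified_sample(strata, sample_size, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = []
    for units in strata.values():
        # Assumed rule: round the stratum's proportional share to a whole number.
        n_stratum = round(sample_size * len(units) / total)
        sample.extend(rng.sample(units, n_stratum))  # simple random subsample
    return sample


# Hypothetical population of 1,000 unit IDs split into three strata.
strata = {
    "freshmen": list(range(0, 500)),
    "seniors": list(range(500, 800)),
    "graduates": list(range(800, 1000)),
}
sample = proportionate_stratified_sample(strata, 20, seed=2)
print(len(sample))
```

Here the strata hold 50%, 30%, and 20% of the population, so a sample of 20 takes 10, 6, and 4 units from them respectively.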
Cluster Sampling
This sampling design involves selecting at random a few groups, called clusters, from a population, and then selecting units from each cluster. Cluster sampling is used when a population is large, fairly homogeneous, and scattered over a large geographical area.
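The two stages of cluster sampling, choosing clusters and then choosing units within them, might look like this in Python. The town and household names are hypothetical, and using simple random sampling inside each chosen cluster is an assumption; the chapter does not say how units within a cluster are selected.

```python
import random


def cluster_sample(clusters, n_clusters, units_per_cluster, seed=None):
    """Stage 1: pick a few clusters at random. Stage 2: pick units inside each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1
    sample = []
    for name in chosen:
        units = clusters[name]
        # Stage 2 (assumed): a simple random sample within the chosen cluster.
        sample.extend(rng.sample(units, min(units_per_cluster, len(units))))
    return chosen, sample


# Hypothetical population: 5 towns with 20 households each.
clusters = {
    f"town_{i}": [f"town_{i}_household_{j}" for j in range(20)] for i in range(5)
}
chosen, sample = cluster_sample(clusters, n_clusters=2, units_per_cluster=5, seed=3)
print(chosen)
print(sample)
```

Only the chosen towns need to be visited, which is what makes this design practical for a geographically scattered population.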
Data Organization
The process of selecting a sample from a population amounts to data collection. Once the data has been collected, it must be organized to make it meaningful. Unorganized data does not convey any meaningful information.
Raw Data
A set of unorganized data.
Data organization requires two major steps:
1. Forming an array
2. Creating a frequency distribution table
Array and Frequency Distribution
Array
If a set of data is organized in either ascending or descending order, an array is formed. From the array, one can get some useful information, such as the lowest and the highest data value.
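Forming an array is simply sorting the raw data. A small sketch, using made-up values:

```python
# Hypothetical raw data (unorganized values), used only for illustration.
raw_data = [72, 55, 88, 61, 93, 66, 52, 79]

array = sorted(raw_data)              # ascending order
lowest, highest = array[0], array[-1]

print(array)            # [52, 55, 61, 66, 72, 79, 88, 93]
print(lowest, highest)  # 52 93
```

Passing `reverse=True` to `sorted` would give the descending array instead.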
Frequency Distribution
A table that arranges data into several classes. All classes have:
• A lower limit
• An upper limit
Two questions:
• How many classes should be selected?
• What are the class limits?
Number of Classes
Generally, the number of classes should be no fewer than six and no more than 20. A simple formula can be used to find the total number of classes: the total number of classes is the smallest k such that 2^k is at least equal to the total number of observations in the data set.
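The 2^k rule can be checked with a short loop. The function name is hypothetical, and this sketch applies only the formula; it does not enforce the guideline of six to 20 classes.

```python
def number_of_classes(n_observations):
    """Return the smallest k such that 2**k >= n_observations."""
    k = 1
    while 2 ** k < n_observations:
        k += 1
    return k


print(number_of_classes(50))   # 6, because 2**5 = 32 < 50 <= 64 = 2**6
print(number_of_classes(100))  # 7, because 2**6 = 64 < 100 <= 128 = 2**7
```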
Class Limits
Once we know the number of classes, we can find the lower and upper limits of each class. Certain guidelines should be followed.
For the first class:
1. If the data values are integers, the lower limit of the first class should be 0.5 less than the lowest data value.
2. The midpoint of the class should be an integer.
For the other classes:
1. The lower limit is the same as the upper limit of the preceding class.
2. The interval length is the same for all classes.
Frequency Distribution Table
Look at Table 2-2 on page 21.
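The guidelines above can be turned into a rough sketch that builds class limits and counts frequencies for integer data. The chapter does not say how to choose the interval length, so the rule used here is an assumption: take the smallest whole-number width that covers the data in k classes, then make it odd so that every midpoint is an integer. The scores are made up for illustration.

```python
import math


def class_limits(data, k):
    """Build k equal-width classes for integer data, following the guidelines."""
    lowest, highest = min(data), max(data)
    # Assumed rule: smallest integer width that covers all values in k classes.
    width = math.ceil((highest - lowest + 1) / k)
    if width % 2 == 0:
        width += 1  # with limits ending in .5, an odd width gives integer midpoints
    first_lower = lowest - 0.5  # 0.5 less than the lowest data value
    # Each lower limit equals the upper limit of the preceding class.
    return [(first_lower + i * width, first_lower + (i + 1) * width) for i in range(k)]


def frequency_table(data, limits):
    """Return rows of (lower limit, upper limit, midpoint, frequency)."""
    rows = []
    for lower, upper in limits:
        # Integer values never fall exactly on a limit that ends in .5.
        frequency = sum(1 for x in data if lower < x < upper)
        rows.append((lower, upper, (lower + upper) / 2, frequency))
    return rows


# Hypothetical data set of 40 integer scores; the 2^k rule gives k = 6.
scores = [
    52, 55, 57, 58, 60, 61, 61, 63, 64, 65,
    66, 66, 67, 68, 69, 70, 70, 71, 72, 72,
    73, 74, 74, 75, 76, 77, 78, 78, 79, 80,
    81, 82, 83, 84, 85, 86, 88, 90, 92, 93,
]
table = frequency_table(scores, class_limits(scores, 6))
for lower, upper, midpoint, frequency in table:
    print(f"{lower:5.1f} to {upper:5.1f}  midpoint {midpoint:4.0f}  frequency {frequency}")
```

For this data the classes run from 51.5 to 93.5 with a width of 7. With other data, forcing an odd width can leave the last class empty, in which case a smaller k may be preferable.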
Relative Frequency Distribution
• A frequency distribution can be converted into a relative frequency distribution by dividing each class frequency by the total number of observations. Look at Table 2-3 on page 22.
• The relative cumulative frequency column is obtained by adding the relative frequencies cumulatively.
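As a small illustration, the relative and relative cumulative frequency columns can be computed from a list of class frequencies. The frequencies below are hypothetical and are not taken from Table 2-3.

```python
from itertools import accumulate


def relative_frequencies(frequencies):
    """Divide each class frequency by the total number of observations."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


# Hypothetical class frequencies for 40 observations.
frequencies = [4, 6, 10, 9, 7, 4]
relative = relative_frequencies(frequencies)
cumulative_relative = list(accumulate(relative))  # running total of relative frequencies

for f, r, c in zip(frequencies, relative, cumulative_relative):
    print(f"{f:>3}  {r:.3f}  {c:.3f}")
```

The relative frequencies sum to 1, so the last entry of the relative cumulative column is 1 (up to floating-point rounding).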
Data Presentation
• Data can be presented in several ways:
1. Histogram
2. Relative frequency histogram
3. Polygon
4. Ogive
• Histogram
A type of bar chart in which class limits are shown on the x-axis and frequencies on the y-axis. See Figure 2-1 on page 25.
• Relative Frequency Histogram
If relative frequencies are shown on the y-axis, the histogram is called a relative frequency histogram. See Figure 2-2 on page 25.
• The Polygon
If the midpoints of the tops of all the bars of a histogram are connected, a frequency polygon is formed. Figure 2-3 (page 26) is a frequency polygon. A relative frequency polygon is created in the same way from a relative frequency histogram, by connecting the midpoints of its classes. See Figure 2-4 on page 26.
The Ogive
• On an ogive, the x-axis represents the upper limit of each class and the y-axis represents cumulative frequencies. The points are connected. The lower limit of the first class is the starting point, with zero frequency. Figure 2-5 on page 27 is an ogive. A relative cumulative frequency ogive can be formed by replacing the cumulative frequencies of an ogive with relative cumulative frequencies. Look at Figure 2-6 on page 27. Other tools for data presentation are the pie charts and bar charts shown on pages 28 and 29.
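The points that make up an ogive can be computed before any plotting is done. In this sketch the class limits and frequencies are hypothetical, and the function name `ogive_points` is an assumption made for the example.

```python
from itertools import accumulate


def ogive_points(first_lower_limit, upper_limits, frequencies):
    """Return (x, y) points: class upper limits against cumulative frequencies."""
    cumulative = list(accumulate(frequencies))
    # The ogive starts at the lower limit of the first class with zero frequency.
    return [(first_lower_limit, 0)] + list(zip(upper_limits, cumulative))


# Hypothetical classes of width 7 starting at 51.5, with their frequencies.
upper_limits = [58.5, 65.5, 72.5, 79.5, 86.5, 93.5]
frequencies = [4, 6, 10, 9, 7, 4]
points = ogive_points(51.5, upper_limits, frequencies)
print(points)
```

Dividing each y value by the total number of observations (40 here) would give the points of the relative cumulative frequency ogive. Connecting the points with straight lines in any plotting tool draws the curve.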