FREQUENCIES

FREQUENCIES. Running the Analysis. The Frequencies procedure provides statistics and graphical displays that are useful for describing many types of variables. The Frequencies procedure is a good place to start looking at your data.

pcapozzi

Presentation Transcript


  1. FREQUENCIES

  2. Running the Analysis • The Frequencies procedure provides statistics and graphical displays that are useful for describing many types of variables. The Frequencies procedure is a good place to start looking at your data. • For a frequency report and bar chart, you can arrange the distinct values in ascending or descending order, or you can order the categories by their frequencies. The frequencies report can be suppressed when a variable has many distinct values. You can label charts with frequencies (the default) or percentages. • To run a Frequencies analysis, from the menus choose: Analyze > Descriptive Statistics > Frequencies...

  3. Running the Analysis ► Select the staff group variable and the count variable as analysis variables. ► Click Charts. ► Select Pie charts. ► Click Continue. ► Click OK in the Frequencies dialog box. These selections produce a frequency table and pie chart of the staff group to which your contacts belong.

  4. Running the Analysis To summarize the variables in your data, from the menus choose: Analyze > Descriptive Statistics > Frequencies... ► Click Statistics in the Frequencies dialog box. ► Select Std. deviation, Minimum, Maximum, Mean, Median, Skewness, and Kurtosis. ► Click Continue. ► Click OK in the Frequencies dialog box.

  5. Frequency Table The frequency table shows the precise frequencies for each category.

  6. Frequency Table The Frequency column reports that 22 of your contacts come from the Sr Managers group.

  7. Frequency Table This is equivalent to 4.5% of the total number of contacts and 4.5% of the contacts whose staff groups are known.
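
The counts and the two percentage columns described above can be sketched in a few lines of Python. This is only an illustration: the `staff` list is hypothetical data (not the contacts file from the slides), `None` stands in for a missing staff group, and `frequency_table` is a made-up helper name.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (category, count, percent, valid percent).

    None marks a missing value: it counts toward the total but not
    toward the valid cases.
    """
    counts = Counter(v for v in values if v is not None)
    total = len(values)           # all cases, missing included
    valid = sum(counts.values())  # cases whose category is known
    return [
        (category, count, 100.0 * count / total, 100.0 * count / valid)
        for category, count in sorted(counts.items())
    ]

# Hypothetical staff-group data with two missing entries.
staff = ["Sr Managers", "Sales", "Sales", "Support", None,
         "Sales", "Support", "Sr Managers", "Sales", None]

for category, count, pct, valid_pct in frequency_table(staff):
    print(f"{category:<12} {count:>3} {pct:6.1f} {valid_pct:6.1f}")
```

The percent and valid percent columns differ only when some cases are missing. When every staff group is known, the two columns agree.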

  8. Statistics table The statistics table tells you several interesting things about the distribution of smoking count, starting with the summary.

  9. Statistics table The statistics table tells you several interesting things about the distribution of smoking count, starting with the summary.

  10. Statistics table • Skewness. A measure of the asymmetry of a distribution. The normal distribution is symmetric and has a skewness value of 0. A distribution with a significant positive skewness has a long right tail. A distribution with a significant negative skewness has a long left tail. As a guideline, a skewness value more than twice its standard error is taken to indicate a departure from symmetry. • Kurtosis. A measure of the extent to which observations cluster around a central point. For a normal distribution, the value of the kurtosis statistic is zero. Positive kurtosis indicates that the observations cluster more and have longer tails than those in the normal distribution, and negative kurtosis indicates that the observations cluster less and have shorter tails.
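
To make the "twice its standard error" guideline concrete, here is a minimal Python sketch. It uses the common bias-adjusted sample formulas for skewness and excess kurtosis and their standard errors. That these are exactly the formulas behind the statistics table is an assumption, and the data lists are hypothetical.

```python
import math

def skewness_and_kurtosis(xs):
    """Bias-adjusted sample skewness and excess kurtosis with standard errors.

    Requires at least 4 observations. Kurtosis is reported as excess
    kurtosis, so a normal distribution scores 0.
    """
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    z = [(x - mean) / sd for x in xs]

    skew = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    se_skew = math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    se_kurt = 2.0 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew, se_skew, kurt, se_kurt

# Hypothetical data: one symmetric sample, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

for name, data in [("symmetric", symmetric), ("right_tailed", right_tailed)]:
    skew, se_skew, kurt, se_kurt = skewness_and_kurtosis(data)
    flagged = abs(skew) > 2 * se_skew  # the guideline from the slide
    print(name, round(skew, 3), round(se_skew, 3), flagged)
```

With only five values the standard errors are large, so the guideline flags only clearly lopsided samples.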

  11. The statistics table tells you several interesting things about the distribution of smoking count, starting with the summary.

  12. A pie chart is a good visual tool for assessing the relative frequencies of each category.

  13. DESCRIPTIVES

  14. Running the Analysis The Descriptives procedure displays univariate summary statistics for several variables in a single table and calculates standardized values (z scores). Variables can be ordered by the size of their means (in ascending or descending order), alphabetically, or by the order in which you select the variables (the default). The Descriptives procedure is useful for obtaining summary comparisons of approximately normally distributed scale variables and for easily identifying unusual cases across those variables by computing z scores. ► To run a Descriptives analysis, from the menus choose: Analyze > Descriptive Statistics > Descriptives...
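
The z scores mentioned above are each value's distance from the mean in standard deviation units. A minimal Python sketch follows, using hypothetical spending figures and an illustrative cutoff of |z| > 2 for "unusual" (the cutoff is an assumption, not something the slides specify).

```python
import statistics

def z_scores(xs):
    """Standardize values using the sample mean and sample SD (n - 1 divisor)."""
    mean = statistics.mean(xs)
    sd = statistics.stdev(xs)
    return [(x - mean) / sd for x in xs]

# Hypothetical monthly spending for seven customers.
spend = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0, 40.0]

z = z_scores(spend)
# Illustrative rule of thumb: flag cases more than 2 SDs from the mean.
unusual = [x for x, score in zip(spend, z) if abs(score) > 2]
print(unusual)
```

Because a single extreme value also inflates the standard deviation, z scores from small samples tend to understate how unusual that value is.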

  15. Running the Analysis A telecommunications company maintains a customer database that includes, among other things, information on how much each customer spent on long distance, toll-free, equipment rental, calling card, and wireless services in the previous month. Use Descriptives to study customer spending to determine which services are most profitable. ► Select Long distance last month, Toll free last month, Equipment last month, Calling card last month, and Wireless last month as analysis variables. ► Click OK. These selections produce a descriptive statistics table which can be used to compare the amounts spent on each service.

  16. Descriptive statistics table From the table, it's difficult to tell which service is most profitable. On average, customers spend the most on equipment rental, but there is a lot of variation in the amount spent.

  17. Descriptive statistics table Customers with calling card service spend only slightly less, on average, than equipment rental customers, and there is much less variation in the values.

  18. Descriptive statistics table The real problem here is that most customers don't have every service, so a lot of 0's are being counted. One solution to this problem is to treat 0's as missing values so that the analysis for each service becomes conditional on having that service.

  19. Recoding the Variables ► To recode 0's as missing values, from the menus choose: Transform > Recode Into Same Variables... ► Select Long distance last month, Toll free last month, Equipment last month, Calling card last month, and Wireless last month as numeric variables. ► Click Old and New Values.

  20. Recoding the Variables ► Type 0 as the Old Value. ► Select System-missing New Value. ► Click Add.

  21. Recoding the Variables ► Click Continue. ► Click OK in the Recode into Same Variables dialog box. Further analysis of each variable should now be considered conditional upon the customer's having the service.
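
The effect of recoding 0's as missing can be seen on a small made-up example. In this Python sketch, `None` plays the role of the system-missing value, the `wireless` list is hypothetical, and both helper names are illustrative.

```python
import statistics

def recode_zero_as_missing(xs):
    """Replace 0 with None so that non-subscribers drop out of the analysis."""
    return [None if x == 0 else x for x in xs]

def mean_of_valid(xs):
    """Mean over the non-missing values only."""
    return statistics.mean(x for x in xs if x is not None)

# Hypothetical wireless spending: most customers do not have the service.
wireless = [0, 0, 30.0, 0, 50.0, 0, 40.0, 0]

overall = statistics.mean(wireless)  # averages in the 0's
conditional = mean_of_valid(recode_zero_as_missing(wireless))
print(overall, conditional)
```

The overall mean answers "how much does a typical customer spend on wireless", while the conditional mean answers "how much does a wireless customer spend". The recoding switches the analysis from the first question to the second.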

  22. Running the Analysis ► To run a Descriptives analysis on the recoded variables, recall the Descriptives dialog box. ► Click Options in the Descriptives dialog box. ► Deselect Minimum and Maximum. ► Select Skewness and Kurtosis. ► Click Continue. ► Click OK in the Descriptives dialog box.

  23. Descriptive Statistics When the analysis is conditional upon the customer's actually having the service, the results are dramatically different.

  24. Descriptive Statistics Wireless and equipment rental services bring in far more revenue per customer than other services.

  25. Descriptive Statistics Moreover, while wireless service remains a highly variable prospect, equipment rental has one of the lowest standard deviations.

  26. Descriptive Statistics This hasn't solved the problem of who purchases these services, but it does point you in the direction of which services deserve greater marketing.

  27. CROSSTABS

  28. Running the Analysis The Crosstabs procedure forms two-way and multiway tables and provides a variety of tests and measures of association for two-way tables. The structure of the table and whether categories are ordered determine what test or measure to use. The crosstabulation table is the basic technique for examining the relationship between two categorical (nominal or ordinal) variables. The Crosstabs procedure offers tests of independence and measures of association and agreement for nominal and ordinal data. ► To run a Crosstabs analysis, from the menus choose: Analyze > Descriptive Statistics > Crosstabs...

  29. Running the Analysis In order to determine customer satisfaction rates, a retail company conducted surveys of 582 customers at 4 store locations. From the survey results, you found that the quality of customer service was the most important factor to a customer's overall satisfaction. Given this information, you want to test whether each of the store locations provides a similar and adequate level of customer service. Use the Crosstabs procedure to test the hypothesis that the levels of service satisfaction are constant across stores. ► Select Store as the row variable. ► Select Service satisfaction as the column variable. ► Click Statistics.

  30. Running the Analysis ► Select Chi-square, Contingency Coefficient, Phi and Cramer's V, Lambda, and Uncertainty coefficient. ► Click Continue. ► Click OK in the Crosstabs dialog box. These selections produce a crosstabulation table, a chi-square test, and nominal-by-nominal measures of association for Store by Service satisfaction.

  31. Crosstabulation Table The crosstabulation shows the frequency of each response at each store location. If each store location provides a similar level of service, the pattern of responses should be similar across stores.

  32. Crosstabulation Table At each store, the majority of responses occur in the middle.

  33. Crosstabulation Table Store 2 appears to have fewer satisfied customers.

  34. Crosstabulation Table Store 3 appears to have fewer dissatisfied customers.

  35. Crosstabulation Table From the crosstabulation alone, it's impossible to tell whether these differences are real or due to chance variation. Check the chi-square test to be sure.

  36. Chi-square Tests The chi-square test measures the discrepancy between the observed cell counts and what you would expect if the rows and columns were unrelated.

  37. Chi-square Tests The two-sided asymptotic significance of the chi-square statistic is greater than 0.05, so the differences are consistent with chance variation. The test provides no evidence that the level of customer service differs across stores, although this is not proof that the levels are identical.
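
The discrepancy that the chi-square test measures can be computed by hand from a table of counts. The Python sketch below uses the standard Pearson chi-square formula and Cramér's V. The two small tables are hypothetical, not the store survey, and the significance value is left out because it needs a chi-square distribution function that the standard library does not provide.

```python
import math

def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def cramers_v(observed):
    """Cramér's V: chi-square rescaled to run from 0 (none) to 1 (perfect)."""
    stat, _ = chi_square(observed)
    n = sum(sum(row) for row in observed)
    k = min(len(observed), len(observed[0]))
    return math.sqrt(stat / (n * (k - 1)))

# Hypothetical tables: rows are stores, columns are satisfaction levels.
related = [[10, 20], [20, 10]]
unrelated = [[10, 20], [20, 40]]  # same response pattern in both rows

print(chi_square(related), cramers_v(related))
print(chi_square(unrelated))
```

When every row has the same pattern of responses, the observed and expected counts coincide and the statistic is 0. The more the rows diverge, the larger it grows.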

  38. MEANS

  39. The Means Procedure The Means procedure calculates subgroup means and related univariate statistics for dependent variables within categories of one or more independent variables. Optionally, you can obtain a one-way analysis of variance, eta, and tests for linearity. The Means procedure is useful for both description and analysis of scale variables. Using its descriptive features, you can request a variety of statistics to characterize the central tendency and dispersion of your test variables. Any number of grouping variables can be layered, or stratified into cells that precisely define your comparison groups. Using its hypothesis testing features, you can test for differences between group means using one-way ANOVA. The one-way ANOVA in Means provides you with linearity tests and association measures to help you understand the structure and strength of the relationship between the groups and their means.
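
As a rough illustration of subgroup means and the eta association measure, here is a Python sketch with a single grouping variable. The data and function names are hypothetical. Eta squared is computed as the between-groups sum of squares over the total sum of squares, and eta is its square root.

```python
from collections import defaultdict
import math
import statistics

def split_by_group(groups, values):
    """Collect the dependent values that fall in each category."""
    cells = defaultdict(list)
    for g, v in zip(groups, values):
        cells[g].append(v)
    return dict(cells)

def subgroup_means(groups, values):
    """Mean of the dependent variable within each category."""
    return {g: statistics.mean(vs) for g, vs in split_by_group(groups, values).items()}

def eta_squared(groups, values):
    """Share of total variation explained by group membership."""
    grand = statistics.mean(values)
    cells = split_by_group(groups, values)
    ss_between = sum(len(vs) * (statistics.mean(vs) - grand) ** 2 for vs in cells.values())
    ss_total = sum((v - grand) ** 2 for v in values)
    return ss_between / ss_total

# Hypothetical data: two groups of three cases each.
group = ["A", "A", "A", "B", "B", "B"]
score = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

means = subgroup_means(group, score)
eta2 = eta_squared(group, score)
print(means, round(eta2, 4), round(math.sqrt(eta2), 4))
```

Layering more grouping variables amounts to using a tuple of category values as the key instead of a single label.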

  40. One-Sample T Test The One-Sample T Test procedure: • Tests the difference between a sample mean and a known or hypothesized value. • Allows you to specify the level of confidence for the difference. • Produces a table of descriptive statistics for each test variable. A manufacturer of high-performance automobiles produces disc brakes that must measure 322 millimeters in diameter. Quality control randomly draws 16 discs made by each of eight production machines and measures their diameters. Use One-Sample T Test to determine whether or not the mean diameters of the brakes in each sample significantly differ from 322 millimeters. A nominal variable, Machine Number, identifies the production machine used to make the disc brake. Because the data from each machine must be tested as a separate sample, the file must first be split into groups by Machine Number.

  41. Treating Each Machine as a Separate Sample To split the file, from the Data Editor menus choose: Data > Split File... ► Select Compare groups. ► Select Machine Number. ► Click OK.

  42. Testing Sample Means against a Known Value To begin the one-sample t test, from the menus choose: Analyze > Compare Means > One-Sample T Test... ► Select Disc Brake Diameter (mm) as the test variable. ► Type 322 as the test value. ► Click Options.

  43. Testing Sample Means against a Known Value ► Type 90 as the confidence interval percentage. ► Click Continue. ► Click OK in the One-Sample T Test dialog box.

  44. Descriptive Statistics The Descriptives table displays the sample size, mean, standard deviation, and standard error for each of the eight samples. The sample means disperse around the 322mm standard by what appears to be a small amount of variation.

  45. Test Results The test statistic table shows the results of the one-sample t test.

  46. Test Results The t column displays the observed t statistic for each sample, calculated as the mean difference divided by the standard error of the sample mean.

  47. Test Results The df column displays degrees of freedom. In this case, this equals the number of cases in each group minus 1.

  48. Test Results The column labeled Sig. (2-tailed) displays a probability from the t distribution with 15 degrees of freedom. The value listed is the probability of obtaining an absolute value greater than or equal to the observed t statistic, if the difference between the sample mean and the test value is purely random.

  49. Test Results The Mean Difference is obtained by subtracting the test value (322 in this example) from each sample mean.

  50. Test Results The 90% Confidence Interval of the Difference gives the lower and upper bounds of an interval for the true mean difference, constructed so that 90% of all possible random samples of 16 disc brakes produced by this machine yield an interval that contains it. Because their confidence intervals lie entirely above 0.0, you can conclude that machines 2, 5, and 7 are producing discs that are, on average, significantly wider than 322mm. Similarly, because its confidence interval lies entirely below 0.0, machine 4 is producing discs that are not wide enough.
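
The columns of the test results table can be reproduced for one machine with a short Python sketch. The 16 diameters are hypothetical, not the measurements from the slides, and the critical value 1.753 (two-sided 90%, 15 degrees of freedom) is taken from a standard t table and hard-coded because the standard library has no t distribution.

```python
import math
import statistics

# Two-sided 90% critical value of t with 15 degrees of freedom (from a t table).
T_CRIT_90_DF15 = 1.753

def one_sample_t(sample, test_value, t_crit):
    """t statistic, df, mean difference, and confidence interval of the difference."""
    n = len(sample)
    mean_diff = statistics.mean(sample) - test_value
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return {
        "t": mean_diff / se,
        "df": n - 1,
        "mean_diff": mean_diff,
        "ci": (mean_diff - t_crit * se, mean_diff + t_crit * se),
    }

# Hypothetical diameters (mm) for 16 discs from a single machine.
diameters = [322.01, 322.03] * 8

result = one_sample_t(diameters, 322.0, T_CRIT_90_DF15)
print(result)
# An interval entirely above 0 suggests discs wider than the 322 mm target.
print(result["ci"][0] > 0)
```

Running the function once per machine mirrors what splitting the file by Machine Number does in the dialog-based workflow.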
