
Week 3 Lecture Notes PSYC2021: Winter 2019

Learn about frequency distribution, descriptive statistics, and graphical displays of categorical data with practical examples and explanations.

dleopold


Presentation Transcript


  1. Week 3 Lecture Notes PSYC2021: Winter 2019

  2. The following set of slides is rated low intensity in terms of introducing or reviewing statistical symbols. However, viewer discretion is advised.

  3. Frequency Distribution • The pattern of variation of a variable is called its “distribution”. • A frequency distribution is an organized tabulation of the number of individuals (number of observations) located in each category on the scale of measurement. • A frequency distribution takes a disorganized set of scores and places them in order from highest to lowest, grouping together individuals who all have the same score. • A frequency distribution can be structured either as a table or as a graph, but in either case, the distribution presents the same two elements: • The set of categories that make up the original measurement scale. • A record of the frequency, or number of individuals, in each category. • Thus, a frequency distribution presents a picture of how the individual scores (observations) are distributed on the measurement scale.
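
As a small illustration of the idea on this slide, the sketch below builds a frequency distribution in R. The scores are made up for illustration (they are not from the lecture data), and the object names are hypothetical.

```r
# Hypothetical scores (made up for illustration; not from the lecture data)
scores <- c(8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8)

# table() groups identical scores together and counts them
freq <- table(scores)

# List the categories from highest to lowest score
freq_desc <- rev(freq)
print(freq_desc)

# The frequencies add up to the number of individuals
sum(freq_desc) == length(scores)
```

The printed table has the two elements named on the slide: the categories (the distinct scores) and the frequency in each category.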

  4. Describing Categorical Data Descriptive Statistics: • Frequency: counts • Relative Frequency (Proportion): frequency (count) of a category divided by the total count. Note: a proportion is a number between 0 and 1 • Percentage: proportion (relative frequency) x 100 Table: Frequency Distribution Table (e.g., frequency; relative frequency (proportion); percentage) *Additionally, we can obtain cumulative percentages (see an example in upcoming slides) Graphical Display: • Bar Chart (separate bars to display distinct categories) • Pie Chart (displays categories as slices of a pie; fractions of a whole)

  5. Describing Categorical Data: Canadian Data Set. Attitudes Toward Learning (ASETS, 2008) • The Access and Support to Education and Training Survey (ASETS) 2008 asked a random sample of Canadians aged 18 to 64 years old: “To what extent do you agree or disagree with the statement: Learning New Things is Fun.” The responses ranged from 1 to 5: (1 = Strongly Agree, 2 = Somewhat Agree, 3 = Somewhat Disagree, 4 = Strongly Disagree, 5 = Neither Agree nor Disagree) • We can access this data from: Computing in the Humanities and Social Sciences - U of T: http://sda.chass.utoronto.ca/sdaweb/sda.htm • Click on: • Access and Support to Education and Training Survey, 2008 (ASETS)

  6. Categorical Data: Attitudes Toward Learning (ASETS, 2008) • Click on Data

  7. Categorical Data: Attitudes Toward Learning (ASETS, 2008) • Click on Codebooks > SDA codebooks

  8. Categorical Data: Attitudes Toward Learning (ASETS, 2008) • Click on Sequential Variable List

  9. Attitudes Toward Learning New Things is Fun (ASETS, 2008) • Click on Attitudes Towards Learning

  10. Attitudes Toward Learning New Things is Fun (ASETS, 2008) • Click on item: al_g02 Learning new things is fun • What do you think Canadians’ responses to this item were? • Do you think they all agreed? Or only some of them?

  11. Attitudes Toward Learning New Things is Fun (ASETS, 2008) • Do you expect to obtain the same answers (responses) from a different selection of Canadians in 2008? • Do you expect to obtain the same responses from the same selected Canadians in 2018?

  12. Summarizing and Describing a Categorical Variable: Attitudes Toward Learning New Things is Fun (ASETS, 2008) • For Variable Selection: Click on “Attitudes toward learning” and select “al_g02” and then “Copy to Row” • On the right-side menu: Change Weight to “No Weight” and for Chart Option, select “Bar chart”. • You can also select the “Question text”. Click on “Run the Table”.

  13. Frequency Distribution Table: Attitudes Toward Learning New Things is Fun (ASETS, 2008) The majority (66.8%) of the respondents strongly agreed with the statement.

  14. Frequency Distribution Table: Attitudes Toward Learning New Things is Fun (ASETS, 2008) Frequency Table: • Count the number of cases corresponding to each category and put them into a table. • A frequency table records the totals and uses the category names to label each row. The table on the right describes the distribution of Canadian responses to the statement “Learning new things is fun”, because it names the possible categories and tells how frequently each occurs (how cases are distributed across the categories). Example: 15,712 participants strongly agreed with the statement. Relative Frequency: • Divide the count by the total number of cases. This gives the fraction (proportion) of the whole. Example: 15712/23519 = 0.668 • Multiply the proportions by 100 to obtain the percentages. Example: 0.668 x 100 = 66.8% The majority (66.8%) of the respondents strongly agreed with the statement.
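
The arithmetic on this slide can be checked directly in R. This is a minimal sketch that uses only the two counts quoted above; the variable names are made up for illustration.

```r
# Counts quoted on the slide: 15,712 "Strongly Agree" out of 23,519 cases
count_strongly_agree <- 15712
total_cases <- 23519

# Relative frequency (proportion): count divided by the total number of cases
proportion <- count_strongly_agree / total_cases

# Percentage: proportion multiplied by 100
percentage <- proportion * 100

round(proportion, 3)   # 0.668
round(percentage, 1)   # 66.8
```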

  15. Visualizing a Categorical Variable with a Bar Chart: Bar Chart of Attitudes Toward Learning New Things is Fun (ASETS, 2008) Bar Chart: • Displays the distribution of a categorical variable. • Shows the frequency (count) for each category side by side for easy comparison. • The height of each bar shows the count for its category. • It is better to have spaces between bars to indicate that these are freestanding bars that could be arranged in any order. • The bars are the same width, so their heights determine their areas. • These areas are proportional to the counts in each category. Note: A bar chart stays true to the Area Principle. Area Principle: The area occupied by a part of the graph should correspond to the magnitude of the value it represents.

  16. Visualizing a Categorical Variable with a Pie Chart: Attitudes Toward Learning New Things is Fun (ASETS, 2008) Pie Chart: • Displays the whole group of cases as a circle. • Slices the circle into pieces whose sizes are proportional to each category's fraction of the whole. The majority (66.8%) of the respondents strongly agreed with the statement.

  17. What could be a possible visual problem (or a confusion) with this bar graph?

  18. Important Consideration Regarding Bar Graphs • There should be spaces between adjacent bars. • For a nominal scale, separate bars emphasize that the scale consists of separate, distinct categories. • For an ordinal scale, separate bars are used because you cannot assume that the categories are all the same size.

  19. Export Learning New Things is Fun Data in a Text File The link to SDA at CHASS to access this data: http://sda.chass.utoronto.ca/cgi-bin/sda/hsda?harcsda+asets08

  20. Export Learning New Things is Fun Data in a Text File • In data file, choose: “csv file” • At the bottom of this page, select variables from two components: • Demographic variables • Attitudes towards learning • Click on continue at the bottom of this page.

  21. Export Learning New Things is Fun Data in a Text File • For demographic variables, choose: • Sexes: sex of the main respondents • For Attitudes toward learning, choose: • al_g02: learning new things is fun. • Click on continue at the bottom of this page.

  22. Export Learning New Things is Fun Data in a Text File • Click on Create the Files. • Note the “Files to create” and the “individual variables specified (including partial groups)”. • You should see the variables of interest to export in CSV file format.

  23. Export Learning New Things is Fun Data in a Text File • Right click on Data file, and save link as … • Note this step might be different on a Mac • I saved the file as “Learning_New_Fun”

  24. Export Learning New Things is Fun Data in a Text File Open the saved CSV file in Excel.

  25. Export Learning New Things is Fun Data in a Text File Save the CSV file as a text file (I believe this works better for Mac users when reading data in R).

  26. Export Learning New Things is Fun Data in a Text File Put the text file “learning_new_fun.txt” into your Rdata folder so that R on your computer can locate it. Note: I created a sub-folder in my Rdata folder and named it PSYC. I put the data sets for our course in my Rdata/PSYC path. You don’t need to do this; but, if you do, just make sure that RStudio is pointing to the PSYC folder. In other words, set your working directory by going to Tools > Global Options and browsing to the folder where you store the data sets.

  27. Read (Import) Learning New Things is Fun Data into R
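
The slide's own code is not part of this transcript. The sketch below shows one way the import step might look, assuming the text file is comma-separated with a header row. To keep the example self-contained it first writes a tiny made-up file; the column names sexs and al_g02 follow the SDA variable names mentioned in these slides, but the rows are invented and the exported file may name or order its columns differently.

```r
# Stand-in for "learning_new_fun.txt": a tiny made-up file so the sketch runs on its own.
# The rows below are invented; they are not ASETS data.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("sexs,al_g02", "1,1", "2,1", "2,2", "1,3", "2,5"), demo_file)

# Assumption: the file is comma-separated with a header row.
# For a tab-delimited file, read.delim() would be the analogous call.
learning <- read.csv(demo_file, header = TRUE)

str(learning)    # variable names and types
head(learning)   # first few rows
```

With the real file in the working directory, the call would be read.csv("learning_new_fun.txt") instead of reading demo_file.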

  28. Frequency Distribution Table in R Responses to Learning New Things is Fun

  29. Clean Up the Frequency Distribution Table in R: Valid Responses to Learning New Things is Fun

  30. Change a Variable Name in R Change Name from al_g02 to Learning.New.Fun

  31. Change Category Names in R: Relabel Categories in Learning.New.Fun
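
The three steps above (keep valid responses, rename the variable, relabel the categories) might be sketched as follows. The data are made up, and the code 9 is a hypothetical stand-in for a non-response code; the actual missing-value codes should be taken from the SDA codebook. The category labels come from the survey question quoted earlier in these notes.

```r
# Made-up response codes; 9 stands in for a hypothetical "not stated" code
learning <- data.frame(al_g02 = c(1, 1, 2, 1, 3, 9, 2, 5, 1, 4))

# Clean up: keep only the valid codes 1 to 5
# (assumption: anything else is a non-response)
learning <- learning[learning$al_g02 %in% 1:5, , drop = FALSE]

# Rename the variable from al_g02 to Learning.New.Fun
names(learning)[names(learning) == "al_g02"] <- "Learning.New.Fun"

# Relabel the numeric codes with the category names from the survey question
learning$Learning.New.Fun <- factor(
  learning$Learning.New.Fun,
  levels = 1:5,
  labels = c("Strongly Agree", "Somewhat Agree", "Somewhat Disagree",
             "Strongly Disagree", "Neither Agree nor Disagree")
)

# Frequency table of the cleaned, relabelled variable
response_table <- table(learning$Learning.New.Fun)
print(response_table)
```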

  32. Obtain Counts: Add Margins to a Table in R

  33. Relative Frequency Distribution in R: Proportion of Participants Responding • 66.81% of respondents (about 67%) strongly agreed with the statement that “learning new things is fun”. • 31.11% of respondents (about 31%) somewhat agreed with the statement that “learning new things is fun”. • Cumulative Percentage: • 97.92% (about 98%) either strongly agreed or somewhat agreed with the statement: “learning new things is fun”.
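
The cumulative percentage is a running total of the category percentages. A minimal sketch using only the two percentages reported on this slide (the object names are made up for illustration):

```r
# Percentages reported on the slide for the two "agree" categories
pct <- c("Strongly Agree" = 66.81, "Somewhat Agree" = 31.11)

# cumsum() adds each percentage to the running total of those before it
cum_pct <- cumsum(pct)
print(cum_pct)
```

The second entry, 66.81 + 31.11 = 97.92, is the cumulative percentage quoted above. For a full frequency table built with table(), the same idea would be cumsum(prop.table(tab)) * 100.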

  34. Obtain Bar Plot in R: Bar Plot of Responses to Learning New Things is Fun

  35. Bar Plot of Responses to Learning New Things is Fun in R using Colors, Labels
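
The plotting code from the slide is not included in this transcript. As a hedged sketch, a bar plot with colours and labels can be drawn with base R's barplot(). The counts below are made up for illustration only (they are not the ASETS results), and the colours and titles are arbitrary choices.

```r
# Made-up counts for illustration only (not the ASETS results)
counts <- c("Strongly Agree" = 12, "Somewhat Agree" = 6, "Somewhat Disagree" = 2,
            "Strongly Disagree" = 1, "Neither" = 1)

# barplot() leaves gaps between bars by default, which suits categorical data
bar_mids <- barplot(
  counts,
  col = c("steelblue", "skyblue", "orange", "tomato", "grey70"),
  main = "Learning New Things is Fun",
  ylab = "Frequency",
  cex.names = 0.6   # shrink the category labels so they fit under the bars
)
```

barplot() returns the horizontal midpoints of the bars, which is handy if you later want to add text above each bar.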

  36. Obtain Pie Chart in R: Pie Chart of Responses to Learning New Things is Fun

  37. Summarizing and Describing a Categorical Variable (ASETS, 2008): Sex of the Respondents The link to SDA at CHASS to access this data: http://sda.chass.utoronto.ca/cgi-bin/sda/hsda?harcsda+asets08 • For Variable Selection: Click on “Demographic Variables” and select “sexs” and then “Copy to Row” • On the right-side menu: Change Weight to “No Weight” and for Chart Option, select “Bar chart”.

  38. Summarizing and Describing a Categorical Variable (ASETS, 2008): SDA output: Frequency Distribution of Sex of the Respondents

  39. Frequency Distribution Table in R: Sex of the Respondents

  40. Bar Plot of Frequency of Sex of the Respondents in R

  41. Exploring the Relationship Between Two Categorical Variables • Use either the row or the column percentages to compare the distributions. • That is, find the conditional distribution of one variable within each level of the other variable. • When the distribution of one variable differs across the categories of another variable, we say that the variables are dependent (the variables are associated; the variables are related). • When the distribution of one variable is the same for all categories of another variable, we say that the variables are independent (the variables are not associated; the variables are not related). Note: The points made above are an informal method of comparing distributions. We will see a formal way of testing for independence in chapter 12 (Significance test regarding the independence of two categorical variables: Chi-square test of independence).
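
To make the informal comparison concrete, the sketch below computes row percentages (conditional distributions) for two small tables. Both tables are made up for illustration: in the first, the two groups have identical conditional distributions (the pattern expected under independence); in the second, they differ (the pattern expected under association).

```r
# Two made-up 2 x 2 tables of counts (rows: group, columns: response)
labels <- list(Group = c("A", "B"), Response = c("Agree", "Disagree"))

same <- matrix(c(30, 20,
                 60, 40), nrow = 2, byrow = TRUE, dimnames = labels)

different <- matrix(c(40, 10,
                      50, 50), nrow = 2, byrow = TRUE, dimnames = labels)

# margin = 1 divides each count by its row total,
# giving the conditional distribution of Response within each Group
p_same <- prop.table(same, margin = 1)
p_diff <- prop.table(different, margin = 1)

print(round(p_same * 100, 1))   # both rows: 60% Agree, 40% Disagree
print(round(p_diff * 100, 1))   # row A: 80% / 20%, row B: 50% / 50%
```

Using margin = 2 instead would give column percentages, the conditional distribution of Group within each Response.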

  42. Joint Frequency Distribution of Responses and Sex of Respondents. Contingency Table: Cross-tabulations. Contingency Table: Classification with respect to two categorical variables. Cross-tabulations (crosstabs) are joint frequency distributions of two categorical variables. • One can be considered an explanatory variable, the other a response variable if you like. The data are summarized in the two-way table below. • This table is called a 2 x 5 (read as “2-by-5”) contingency table (two rows and five columns). • It presents count data classified on two scales, or dimensions, of classification: Sex of respondents, and Attitudes Toward Learning New Things is Fun.

  43. Association Between Opinion Regarding Attitudes Toward Learning New Things is Fun and the Sex of the Respondents Research Question (the following questions address the same investigation): • Is there an association between attitudes toward learning new things is fun and the sex of the respondents? • Is there a relationship between attitudes toward learning new things is fun and the sex of the respondents? • Do opinions regarding attitudes toward learning new things is fun depend on the sex of the respondents? • Do opinions regarding attitudes toward learning new things is fun differ between males and females? • Do males and females differ in their conditional distribution on attitudes toward learning new things is fun? • Response variable: attitudes toward learning new things is fun Type: Categorical (Strongly Agree, Somewhat Agree, Somewhat Disagree, Strongly Disagree, Neither Agree Nor Disagree) • Explanatory variable: Sex of the respondents Type: Categorical (Male, Female)

  44. Examine Association Between Two Categorical Variables: Example of Conditional Distribution of Responses Consider the conditional distribution of responses regarding learning new things is fun on sex of the respondents. The link to SDA at CHASS to access this data: http://sda.chass.utoronto.ca/cgi-bin/sda/hsda?harcsda+asets08

  45. Examine Association Between Two Categorical Variables: Compare Distribution of Responses for Learning New Things is Fun for Males and Females The SDA output for this Analysis: The conditional distribution of responses regarding learning new things is fun on gender • How do the response percentages regarding the statement learning new things is fun differ between males and females? • Compare the row percentages.

  46. Examine Association Between Two Categorical Variables: Compare Distribution of Responses for Learning New Things is Fun for Males and Females The SDA output for this Analysis: The conditional distribution of responses regarding learning new things is fun on gender • How do the response percentages regarding the statement learning new things is fun differ between males and females? • Compare the row percentages. • Women (70.7%) are more likely than men (62.2%) to strongly agree with the statement. • However, men (35.2%) are more likely than women (27.6%) to somewhat agree with the statement. • There is not much of a difference between the sexes in the likelihood of somewhat disagreeing, strongly disagreeing, or neither agreeing nor disagreeing.

  47. Construct a Contingency Table in R: Conditional Distribution of Responses to Learning New Things is Fun on Sex

  48. Add Margins to a Contingency Table in R The third row shows: Unconditional Distribution of Responses to Learning New Things (Regardless of the sex of the respondents) The last column shows: Unconditional Distribution of Sex – that is the count only for males and females (Regardless of the responses)
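
The R output from the slide is not part of this transcript. The sketch below shows how a contingency table with margins might be built using table() and addmargins(). The raw data are made up and use only two response categories to keep the example short, so the table is 2 x 2 rather than 2 x 5.

```r
# Made-up raw data for illustration only (not the ASETS responses)
sex <- c("Male", "Female", "Female", "Male", "Female", "Male", "Female", "Female")
response <- c("Agree", "Agree", "Agree", "Disagree", "Agree", "Agree", "Disagree", "Agree")

# Contingency table: sex in the rows, response in the columns
tab <- table(Sex = sex, Response = response)

# addmargins() appends a "Sum" row and a "Sum" column
tab_margins <- addmargins(tab)
print(tab_margins)

# Row percentages: the conditional distribution of response within each sex
print(round(prop.table(tab, margin = 1) * 100, 1))
```

In the printed table, the added third row ("Sum") is the unconditional distribution of the responses, and the last column ("Sum") is the unconditional distribution of sex, matching the description on the slide.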

  49. Construct a Clustered Bar Chart in R: Conditional Distribution of Responses to Learning New Things is Fun on Sex
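
A clustered (side-by-side) bar chart can be sketched with barplot() and beside = TRUE. The counts below are made up for illustration (not the ASETS results), and the colours and labels are arbitrary choices; the slide's own chart may have been produced differently.

```r
# Made-up counts for illustration only (rows: sex, columns: response)
tab <- matrix(c(4, 1,
                2, 1), nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("Female", "Male"),
                              Response = c("Agree", "Disagree")))

# Row percentages: conditional distribution of response within each sex
row_pct <- prop.table(tab, margin = 1) * 100

# beside = TRUE draws one bar per sex within each response category
mids <- barplot(
  row_pct,
  beside = TRUE,
  col = c("tomato", "steelblue"),
  legend.text = rownames(row_pct),
  ylim = c(0, 100),
  ylab = "Percent within sex",
  main = "Responses by Sex"
)
```

Plotting row percentages rather than raw counts makes the two sexes comparable even when the groups differ in size.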
