This chapter explores various methods of displaying and describing categorical data, including frequency tables, relative frequency tables, contingency tables, and graphs such as bar graphs, pie graphs, and segmented bar graphs.
Chapter 3 Displaying and Describing Categorical Data Addie Molique, Ash Nair
Key Terms • Frequency Table- lists the categories and gives the count of cases in each • Relative Frequency Table- lists the categories and gives the percentage (or proportion) of cases in each • Area Principle- the area occupied by a part of the graph should correspond to the magnitude of the value it represents • Independent Variables- two variables in a contingency table are independent when the distribution of one variable is the same for all categories of the other variable
Key Terms: Tables • Contingency Table- shows how individuals are distributed along each variable, contingent on the value of the other variable • Marginal Distribution- the frequency distribution of one variable alone, read from the row or column totals in the margins of the table • Cell- the intersection of a row and a column, giving the count for that combination of values of the two variables • Conditional Distribution- shows the distribution of one variable for just the individuals who satisfy some condition on another variable
Key Terms: Graphs • Bar Graph • Pie Graph • Relative Frequency Bar Graph • Segmented Bar Graph
Bar Graph • Displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison • Follows the area principle • Used for categorical data • Bars do NOT touch • Categorical variable is typically on the x-axis • Can be used for bivariate categorical data sets
Relative Frequency Bar Graph • Displays the proportion (or percentage) of cases in each category instead of the counts • Also stays true to the area principle • Similar to a regular bar graph in layout
Pie Graph • Shows the whole group of cases as a circle; slices the circle into pieces whose size is proportional to the fraction of the whole in each category • Each value in the graph is represented as a part of the whole
Segmented Bar Graph • Displays the same information as a pie graph, but as a bar divided into segments instead of a circle divided into slices • Each segment is also shown as a part of the whole • Can be used for bivariate categorical data sets
Practice Problem: pg. 38 #17 A survey of athletic trainers asked what complications were most commonly associated with various treatment options for injuries. Of those identifying cryotherapy (ice), 86 respondents reported allergic reactions, 23 reported burns, 16 reported pain intolerance, and 6 reported frostbite.
Practice Problem: pg. 38 #17 • Make an appropriate display of these data.
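One way to approach this display is to build the frequency and relative frequency tables first, then draw a bar for each category. The sketch below is only an illustration: it uses the four counts from the problem statement, and the helper name `text_bar` and the scale of one `#` per two responses are assumptions made for this example, not part of the original slides.

```python
# Counts from the practice problem (complications reported for cryotherapy).
counts = {
    "Allergic reactions": 86,
    "Burns": 23,
    "Pain intolerance": 16,
    "Frostbite": 6,
}

total = sum(counts.values())

# Relative frequency = count / total, expressed here as a percentage.
relative = {category: 100 * n / total for category, n in counts.items()}


def text_bar(n, scale=2):
    """Hypothetical helper: a bar whose length is proportional to the count,
    so the display follows the area principle."""
    return "#" * round(n / scale)


# One row per category: name, count, percentage, and a text bar.
for category, n in counts.items():
    print(f"{category:<20}{n:>4}{relative[category]:>7.1f}%  {text_bar(n)}")
print(f"{'Total':<20}{total:>4}{100.0:>7.1f}%")
```

Because the categories are shown side by side with bars that do not touch, this plays the role of a bar graph; plotting the percentage column instead of the counts would give the relative frequency bar graph.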
Practice Problem: pg. 38 #17 • Specify the Who for these data. Would the data provide the most useful information about the risks of cryotherapy? The Who is athletic trainers. The data could be misleading, as a trainer who has used cryotherapy multiple times would be more likely to see problems than a trainer who is new to cryotherapy, or a trainer who doesn’t commonly use this treatment method.
Practice Problem: pg. 39 #19 An article in the Winter 2003 issue of Chance magazine examined the impact of an applicant’s ethnicity on the likelihood of admission to the Houston Independent School District’s magnet schools programs. Those data are summarized in a contingency table of Admission Decision by Ethnicity (table not reproduced here).
Practice Problem: pg. 39 #19 • What percent of all applicants were Asian? # of applicants: 1755 # of Asian applicants: 292 292/1755 = 0.16638, or 16.6% • What percent of the students accepted were Asian? # of students accepted: 931 # of Asian students accepted: 110 110/931 = 0.11815, or 11.8%
Practice Problem: pg. 39 #19 • What percent of Asians were accepted? # of Asians: 292 # of Asians accepted: 110 110/292 = 0.37671, or 37.7% • What percent of all students were accepted? # of students: 1755 # of students accepted: 931 931/1755 = 0.53048, or 53.0%
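The four answers to this problem differ only in which total goes in the denominator: the grand total gives a marginal percentage, while a row or column total gives a conditional one. The sketch below recomputes them from the four counts quoted in the answers. The 2x2 table is an assumption made for illustration: it is derived by subtraction and lumps every non-Asian applicant into "Other" and every decision other than acceptance into "not accepted", so it is a collapsed version of the original table, not the table itself.

```python
# Counts quoted in the worked answers.
total_applicants = 1755
asian_applicants = 292
total_accepted = 931
asian_accepted = 110


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Collapsed 2x2 table, derived by subtraction (an assumption for this sketch).
other_applicants = total_applicants - asian_applicants
other_accepted = total_accepted - asian_accepted
table = {
    "Asian": {
        "accepted": asian_accepted,
        "not accepted": asian_applicants - asian_accepted,
    },
    "Other": {
        "accepted": other_accepted,
        "not accepted": other_applicants - other_accepted,
    },
}

# Marginal percentages: the denominator is the grand total.
pct_asian = percent(asian_applicants, total_applicants)
pct_accepted = percent(total_accepted, total_applicants)

# Conditional percentages: the denominator is one row or column total.
pct_asian_given_accepted = percent(asian_accepted, total_accepted)
accept_rate = {
    group: percent(row["accepted"], sum(row.values()))
    for group, row in table.items()
}

print(f"Asian among all applicants: {pct_asian:.1f}%")
print(f"Accepted among all applicants: {pct_accepted:.1f}%")
print(f"Asian among accepted: {pct_asian_given_accepted:.1f}%")
for group, value in accept_rate.items():
    print(f"Accepted among {group} applicants: {value:.1f}%")
```

If admission decision and ethnicity were independent in the sense of the key term, the acceptance rate would be the same in both rows; in this collapsed table the two rates differ.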
Review • Four different types of graphs meant for displaying categorical data: bar, pie, relative frequency bar, segmented bar • The area of each part of a graph must match the value it represents (according to the area principle) • Simpson's Paradox- when percentages are computed within separate groups and then the groups are combined, the direction of a comparison can reverse, so overall percentages taken across different groups can be misleading
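Simpson's paradox is easier to see with numbers: a comparison that holds within every group can reverse once the groups are combined, because the groups contribute very different numbers of cases. The sketch below uses entirely made-up counts for two hypothetical options, A and B, each tried on "easy" and "hard" cases; none of these figures come from the chapter.

```python
# Hypothetical (successes, attempts) for two made-up options, A and B.
results = {
    "A": {"easy": (9, 10), "hard": (30, 90)},
    "B": {"easy": (80, 90), "hard": (3, 10)},
}


def rate(successes, attempts):
    """Success rate as a percentage."""
    return 100 * successes / attempts


def overall(groups):
    """Pool the counts across groups first, then compute a single rate."""
    successes = sum(s for s, _ in groups.values())
    attempts = sum(a for _, a in groups.values())
    return rate(successes, attempts)


for name, groups in results.items():
    by_group = ", ".join(
        f"{group}: {rate(*counts):.1f}%" for group, counts in groups.items()
    )
    print(f"{name}  {by_group}, overall: {overall(groups):.1f}%")
```

With these counts, A has the higher rate on easy cases and on hard cases, yet B has the higher overall rate, because most of A's attempts were hard cases and most of B's were easy ones.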