Chapter 4: More on Two Variable Data Section 4.3 – Relations in Categorical Data
Analyzing Categorical Data • To analyze categorical data, we use counts or percents of individuals that fall into various categories. • Two-way tables describe two categorical variables: the row variable is represented in the rows and the column variable in the columns.
Definitions • Marginal distributions – The totals of each row and column, which appear in the margins of the table, are referred to as the marginal distributions. • Roundoff error – The difference between a listed total and the sum of the table entries, caused by rounding each value. • In the table, the sum of the entries for the “35-54” age group doesn’t match the total listed. This is because the table is in thousands of persons and each entry is rounded to the nearest thousand.
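To see how roundoff error can arise, here is a minimal Python sketch. The counts are made up for illustration and are not taken from the table in the slides; the helper name to_thousands is also just an assumption for this example.

```python
# Hypothetical exact counts of persons in one age group (made-up numbers).
exact_counts = [1_400, 2_400, 3_400]

def to_thousands(n):
    """Round a count of persons to the nearest thousand."""
    return round(n / 1000)

# Each table entry is rounded on its own...
rounded_entries = [to_thousands(n) for n in exact_counts]
sum_of_rounded = sum(rounded_entries)

# ...while the listed total is the exact total, rounded once.
rounded_total = to_thousands(sum(exact_counts))

# The gap between the two is the roundoff error.
roundoff_error = rounded_total - sum_of_rounded

print(rounded_entries, sum_of_rounded, rounded_total, roundoff_error)
```

Each entry loses a little when rounded, so the entries add up to less than the listed total even though nothing in the table is wrong.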
Marginal Distribution • Percents are often more informative than counts. • You can represent a marginal distribution using percents and a bar graph. • Example • Suppose you want to display the distribution of years of schooling completed among people aged 25 years or older.
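As a rough sketch of turning a marginal distribution into percents, the Python below uses a small two-way table. The counts are hypothetical (they are not the table from the slides), and the variable names are assumptions chosen for this example.

```python
# Hypothetical two-way table of counts (made-up numbers, NOT the table from the slides).
# Outer keys: row variable (years of schooling); inner keys: column variable (age group).
table = {
    "Did not finish high school": {"25-34": 10, "35-54": 10, "55+": 20},
    "Completed high school":      {"25-34": 20, "35-54": 30, "55+": 30},
    "1-3 years of college":       {"25-34": 15, "35-54": 20, "55+": 5},
    "4+ years of college":        {"25-34": 15, "35-54": 20, "55+": 5},
}

# Row totals: the marginal distribution of schooling, as counts.
row_totals = {level: sum(by_age.values()) for level, by_age in table.items()}
grand_total = sum(row_totals.values())

# Convert each row total to a percent of the grand total.
marginal_percents = {
    level: 100 * total / grand_total for level, total in row_totals.items()
}

# Crude text bar graph: one '#' per 2 percentage points.
for level, pct in marginal_percents.items():
    print(f"{level:<28}{'#' * round(pct / 2)} {pct:.1f}%")
```

Because every person falls into exactly one schooling category, these percents add up to 100%.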
Example 1 – How Common is College? • Create a bar graph that compares the percents of the three age groups who have completed 4 or more years of college.
Conditional Distributions • In the last example we compared only the percents of people who finished college (≥4 years) in each age group. • If you look back, you can see that this is simply a set of percents, not a distribution. The percents don’t add up to 100% because each one refers to a different age group. • If we were to take only the 25- to 34-year-olds, the percents for the educational categories would add up to 100%, because every 25- to 34-year-old falls into exactly one of the educational categories. These four percents together form a conditional distribution. • Conditional distribution – The distribution of one categorical variable among only those individuals who share a given value of the other variable.
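The contrast between a conditional distribution (percents within one age group, adding up to 100%) and the college percents compared across age groups (which do not) can be sketched in Python. The counts are again hypothetical, not the table from the slides, and the function names are assumptions for this example.

```python
# Hypothetical two-way table of counts (made-up numbers, NOT the table from the slides).
table = {
    "Did not finish high school": {"25-34": 10, "35-54": 10, "55+": 20},
    "Completed high school":      {"25-34": 20, "35-54": 30, "55+": 30},
    "1-3 years of college":       {"25-34": 15, "35-54": 20, "55+": 5},
    "4+ years of college":        {"25-34": 15, "35-54": 20, "55+": 5},
}

def column_total(table, age_group):
    """Total count of people in one age group (one column of the table)."""
    return sum(by_age[age_group] for by_age in table.values())

def conditional_distribution(table, age_group):
    """Percent at each schooling level among people in one age group only."""
    total = column_total(table, age_group)
    return {
        level: 100 * by_age[age_group] / total for level, by_age in table.items()
    }

# Conditional distribution of schooling among 25- to 34-year-olds: sums to 100%.
cond_25_34 = conditional_distribution(table, "25-34")

# Percent with 4+ years of college in each age group: one number per group.
# These come from different groups, so they need not sum to 100%.
college_by_age = {
    age: conditional_distribution(table, age)["4+ years of college"]
    for age in ("25-34", "35-54", "55+")
}

print(cond_25_34)
print(college_by_age)
```

Each age group has its own conditional distribution; the college percents pick one entry out of each of them.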
Example 2 – Conditional Distribution • Find the conditional distribution of years of school completed among people aged 55 and over.