4.3 Relations in Categorical Data
Categorical data
• Some variables are inherently categorical, for example:
  • Sex
  • Race
  • Occupation
• Other categorical variables are created by grouping values of a quantitative variable into classes.
• To analyze categorical data, we use counts or percents.
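As an illustration of grouping a quantitative variable into classes and then summarizing it with counts and percents, here is a minimal Python sketch. The ages, the class cutoffs, and the helper name `age_class` are all made up for this example and are not part of the original slides.

```python
from collections import Counter

# Hypothetical ages (invented values, for illustration only).
ages = [19, 23, 31, 35, 42, 47, 58, 61, 64, 70]


def age_class(age):
    """Map a quantitative age to a categorical class (cutoffs are arbitrary)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30 to 59"
    return "60 and over"


# Count how many observations fall in each class.
counts = Counter(age_class(a) for a in ages)

# Convert each count to a percent of all observations.
percents = {label: 100 * n / len(ages) for label, n in counts.items()}

for label in ("under 30", "30 to 59", "60 and over"):
    print(f"{label}: count={counts[label]}, percent={percents[label]:.1f}")
```

Once the values are grouped, the variable is treated like any other categorical variable: only the counts or percents per class are analyzed.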
Two-way tables
• Two-way tables describe two categorical variables.
• Each table has a row variable and a column variable.
Distribution of a categorical variable
• The distribution of a categorical variable says how often each outcome occurred.
• If the row and column totals are missing from a two-way table, the first thing to do is calculate them.
• Marginal distributions appear in the right and bottom margins of a two-way table, as the row and column totals.
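The row and column totals, and the marginal distributions they give, can be computed as in the following sketch. The table is hypothetical: the counts and the column labels "Office" and "Field" are invented for illustration.

```python
# Hypothetical two-way table of counts (invented data).
# Row variable: sex. Column variable: type of job.
table = {
    "Female": {"Office": 30, "Field": 10},
    "Male": {"Office": 20, "Field": 40},
}

# Row totals: sum across the columns of each row.
row_totals = {row: sum(cols.values()) for row, cols in table.items()}

# Column totals: sum down the rows of each column.
col_totals = {}
for cols in table.values():
    for col, n in cols.items():
        col_totals[col] = col_totals.get(col, 0) + n

grand_total = sum(row_totals.values())

# Marginal distributions: each total as a percent of the grand total.
row_marginal = {r: 100 * t / grand_total for r, t in row_totals.items()}
col_marginal = {c: 100 * t / grand_total for c, t in col_totals.items()}

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Marginal distribution of the row variable (%):", row_marginal)
print("Marginal distribution of the column variable (%):", col_marginal)
```

The row totals and the column totals must each add up to the same grand total, which is a quick check on the arithmetic.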
How can we describe the relationship?
• No single graph (such as a scatterplot) portrays the form of the relationship between categorical variables, and no single numerical measure (such as the correlation) summarizes the strength of an association.
• To describe relationships among categorical variables, calculate appropriate percents from the counts given.
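One common choice of "appropriate percents" is to express each cell as a percent of its row total, which gives the distribution of the column variable within each row. The sketch below assumes a hypothetical table; the counts and labels are invented for illustration.

```python
# Hypothetical two-way table of counts (invented data).
# Row variable: sex. Column variable: type of job.
table = {
    "Female": {"Office": 30, "Field": 10},
    "Male": {"Office": 20, "Field": 40},
}

# For each row, express every cell as a percent of that row's total.
row_percents = {}
for row, cols in table.items():
    row_total = sum(cols.values())
    row_percents[row] = {col: 100 * n / row_total for col, n in cols.items()}

for row, cols in row_percents.items():
    summary = ", ".join(f"{col}: {pct:.1f}%" for col, pct in cols.items())
    print(f"{row} -> {summary}")
```

Comparing the rows of percents side by side is what describes the relationship: if the percents differ noticeably from row to row, the two variables are associated.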
Simpson’s paradox
• Simpson’s paradox refers to the reversal of the direction of a comparison or an association when data from several groups are combined to form a single group.
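The reversal can be seen in a small numerical sketch. The data here are entirely hypothetical: two options, "A" and "B", each tried in two groups, with invented success counts chosen so that the paradox appears.

```python
# Hypothetical data (invented): (successes, trials) for each option in each group.
data = {
    "A": {"group 1": (9, 10), "group 2": (30, 100)},
    "B": {"group 1": (80, 100), "group 2": (2, 10)},
}


def rate(successes, trials):
    """Success rate as a percent."""
    return 100 * successes / trials


# Success rate of each option within each group.
within = {
    option: {group: rate(s, t) for group, (s, t) in groups.items()}
    for option, groups in data.items()
}

# Success rate of each option after combining the groups.
combined = {
    option: rate(
        sum(s for s, _ in groups.values()),
        sum(t for _, t in groups.values()),
    )
    for option, groups in data.items()
}

for group in ("group 1", "group 2"):
    print(f"{group}: A = {within['A'][group]:.1f}%, B = {within['B'][group]:.1f}%")
print(f"combined: A = {combined['A']:.1f}%, B = {combined['B']:.1f}%")
```

In this made-up table, option A has the higher success rate in each group taken separately, yet option B has the higher rate once the groups are combined. The reversal happens because most of A's trials fall in the group where success is rare, while most of B's trials fall in the group where success is common.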