An important measure of the performance of a locomotive is its "adhesion," which is the locomotive's pulling force as a multiple of its weight. The adhesion of one 4400-horsepower diesel locomotive model varies in actual use according to a Normal distribution with mean μ = 0.34 and standard deviation σ = 0.049.

What proportion of adhesions (± 0.001) measured in use are higher than 0.47?
z = (0.47 - 0.34) / 0.049 = 2.653
Area to the right of z = 2.65 is 0.0040.

What proportion of adhesions (± 0.001) are between 0.47 and 0.49?
The new z-score is (0.49 - 0.34) / 0.049 = 3.06.
To find the area between the two z-scores, we find the difference in the areas to the left of each.
Area left of 3.06 = 0.9989
Area left of 2.65 = 0.9960
Area between = 0.9989 - 0.9960 = 0.0029

What do you do if your z-value is bigger than the table values?
Use the last value on the table. We know that the area to the right of z = 3.939 is smaller than the area to the right of z = 3.49 (or whatever the last value on the table is). For a z-score of 6.2978, you can safely put down 0 or 1 (whichever side is appropriate) and be correct to within rounding.
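The table lookups above can be checked in software. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative and not part of the original problem. It works from unrounded z-scores, so its results should agree with the table-based answers only to within the stated ± 0.001.

```python
from statistics import NormalDist

# Adhesion model from the problem: Normal with mean 0.34 and sd 0.049.
adhesion = NormalDist(mu=0.34, sigma=0.049)

# z-scores for the two cutoffs.
z_low = (0.47 - adhesion.mean) / adhesion.stdev
z_high = (0.49 - adhesion.mean) / adhesion.stdev

# Proportion above 0.47: area to the right of z_low.
p_above = 1 - adhesion.cdf(0.47)

# Proportion between 0.47 and 0.49: difference of the two left-hand areas.
p_between = adhesion.cdf(0.49) - adhesion.cdf(0.47)

# Tail areas for z-values at or beyond the end of a typical table.
standard = NormalDist()  # mean 0, sd 1
tail_349 = 1 - standard.cdf(3.49)
tail_3939 = 1 - standard.cdf(3.939)
tail_far = 1 - standard.cdf(6.2978)

print(f"z = {z_low:.3f}, area to the right = {p_above:.4f}")
print(f"z = {z_high:.3f}, area between = {p_between:.4f}")
print(f"right tails: {tail_349:.6f}, {tail_3939:.6f}, {tail_far:.2e}")
```

The last line illustrates the point about large z-values: the right-hand area keeps shrinking past the end of the table, so rounding it to 0 (or the left-hand area to 1) loses nothing at this precision.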
Chapter 6: Two-Way Tables
Categorical Variables
• In this chapter we will study the relationship between two categorical variables (variables whose values fall in groups or categories).
• To analyze categorical data, use the counts or percents of individuals that fall into various categories.
Two-Way Table
• When there are two categorical variables, the data are summarized in a two-way table.
  • Each row in the table represents a value of the row variable.
  • Each column of the table represents a value of the column variable.
• The number of observations falling into each combination of categories is entered into each cell of the table.
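As a rough illustration of how the cells of a two-way table are filled, the sketch below counts made-up observations with `collections.Counter`. The category labels and the observations are invented for the example and are not the Census data used in the case study.

```python
from collections import Counter

# Made-up observations for illustration only: one (education, age group)
# pair per individual. These are not the Census figures.
observations = [
    ("college", "25-34"),
    ("college", "25-34"),
    ("college", "35-54"),
    ("high school", "25-34"),
    ("high school", "35-54"),
    ("high school", "35-54"),
    ("high school", "35-54"),
]

# Count how many individuals fall into each combination of categories.
cell_counts = Counter(observations)

# Row variable: education. Column variable: age group.
row_values = sorted({row for row, _ in observations})
col_values = sorted({col for _, col in observations})

# The two-way table: one row per education value, one column per age group.
table = {
    row: {col: cell_counts[(row, col)] for col in col_values}
    for row in row_values
}

for row in row_values:
    print(row, [table[row][col] for col in col_values])
```

Each inner dictionary is one row of the table, and each key inside it is one column, so `table[row][col]` is the count in a single cell.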
Marginal Distributions
• A distribution for a categorical variable tells how often each outcome occurred.
  • Totaling the values in each row of the table gives the marginal distribution of the row variable (totals are written in the right margin).
  • Totaling the values in each column of the table gives the marginal distribution of the column variable (totals are written in the bottom margin).
Marginal Distributions
• It is usually more informative to display each marginal distribution in terms of percents rather than counts.
  • Each marginal total is divided by the table total to give the percents.
• A bar graph could be used to graphically display marginal distributions for categorical variables.
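The marginal totals and percents described above can be sketched in a few lines of Python. The counts and labels below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical two-way table of counts (not data from the case study).
# Outer keys are values of the row variable, inner keys of the column variable.
table = {
    "row A": {"col X": 20, "col Y": 30},
    "row B": {"col X": 10, "col Y": 40},
}

# Row totals (right margin): sum across each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column totals (bottom margin): sum down each column.
col_totals = {}
for cells in table.values():
    for col, count in cells.items():
        col_totals[col] = col_totals.get(col, 0) + count

# Table total: the sum of either margin.
table_total = sum(row_totals.values())

# Marginal distributions in percents: each marginal total / table total.
row_marginal = {row: 100 * t / table_total for row, t in row_totals.items()}
col_marginal = {col: 100 * t / table_total for col, t in col_totals.items()}

print("row marginal (%):", row_marginal)
print("column marginal (%):", col_marginal)
```

Each marginal distribution should sum to 100% apart from rounding, which is a quick check on the totals.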
Case Study: Age and Education
(Statistical Abstract of the United States, 2001)
Data from the U.S. Census Bureau for the year 2000 on the level of education reached by Americans of different ages.
Case Study: Age and Education
[Two-way table, with the variables and the marginal distributions labeled]
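Case Study: Age and Education
[Two-way table, with the variables and the marginal distributions labeled. Marginal percents shown: 15.9%, 33.1%, 25.4%, 25.6% for one variable and 21.6%, 46.5%, 32.0% for the other]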
Case Study: Age and Education
[Figure: Marginal Distribution for Education Level]
Conditional Distributions
• Relationships between categorical variables are described by calculating appropriate percents from the counts given in the table.
  • This prevents misleading comparisons due to unequal sample sizes for different groups.
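To show why percents give a fairer comparison than counts, here is a small sketch that computes a conditional distribution for each column of a two-way table. The counts and group labels are hypothetical, with deliberately unequal group sizes. They are not the Census figures.

```python
# Hypothetical counts with unequal group sizes (not the Census data).
# Outer keys: values of the row variable; inner keys: the two groups.
table = {
    "4+ years of college": {"group 1": 110, "group 2": 230},
    "less than 4 years": {"group 1": 270, "group 2": 580},
}

groups = ["group 1", "group 2"]

# Group totals: sum down each column.
group_totals = {g: sum(table[row][g] for row in table) for g in groups}

# Conditional distribution of the row variable within each group:
# each cell count divided by its own group total, as a percent.
conditional = {
    g: {row: 100 * table[row][g] / group_totals[g] for row in table}
    for g in groups
}

for g in groups:
    count = table["4+ years of college"][g]
    percent = conditional[g]["4+ years of college"]
    print(f"{g}: count = {count}, percent of group = {percent:.1f}%")
```

In this made-up table the raw counts differ by more than a factor of two, yet the percents within each group are nearly equal, because the second group is simply larger.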
Case Study: Age and Education
Compare the 25-34 age group to the 35-54 age group in terms of success in completing at least 4 years of college:
Data are in thousands, so 11,071,000 persons in the 25-34 age group have completed at least 4 years of college, compared to 23,160,000 persons in the 35-54 age group. The groups appear greatly different, but look at the group totals.
Case Study: Age and Education
Compare the 25-34 age group to the 35-54 age group in terms of success in completing at least 4 years of college:
Change the counts to percents:
Now, with a fairer comparison using percents, the groups appear very similar.
Case Study: Age and Education
If we compute the percent completing at least four years of college for all of the age groups, this would give us the conditional distribution of age, given that the education level is "completed at least 4 years of college":